Durability Simulator Design for OpenStack Swift

Post on 12-Jul-2015

Copyright©2014 NTT Corp. All Rights Reserved.

Durability Simulator Design for OpenStack Swift (Interactive Durability Calculation Tools)

Kota Tsuyuzaki [IRC: kota_]
tsuyuzaki.kota@lab.ntt.co.jp
NTT Software Innovation Center

Copyright(c)2009-2014 NTT CORPORATION. All Rights Reserved.

NTT Confidential

Outline

• Goal & Benefits
• How to calculate?
• Demo

Etherpad: https://etherpad.openstack.org/p/kilo-swift-durability-simulator

Issue

User: I want to build a durable object storage system using OpenStack Swift. I also want to know its durability, to confirm that it will be enough for our SLA.

Issue

User: Hey, guys. Could you tell me your Swift system architecture and the storage durability you support?

OpenStack Providers: Provider A, Provider B, Provider C

Issue

OpenStack Providers:

• Provider A: 7-9s durability with 3 copies
• Provider B: 9-9s durability with 3 copies
• Provider C: 11-9s durability with 3 copies

User: WHAT’S HAPPENING!? WHICH IS CORRECT?

Goal & Benefits

• Goal
  • Build durability calculation tools supported (or recommended) by the Swift community
  • Make it easy to get the calculation result from both the specs of the system's hardware components and the Swift configuration (e.g. # of disks, size of each disk, # of partitions)
• Benefits
  • Swift administrators (even near-beginners) can easily find the durability of their own system
  • The calculation definition can be standardized among Swift providers
  • Swift users can choose the policy that fits their use case (Replica? EC? Which # of parities is best for you?)

How to calculate the durability?

For Replica Case

How to Calculate EC Durability? [1]

• Calculation using a Markov model (Markov process)
  • 2 replicas -> k = 1, m = 1, i.e. data is lost when 2 fragments are lost
  • 3 replicas -> k = 1, m = 2, i.e. data is lost when 3 fragments are lost
• Reference:
  • [1]: "Reliability Mechanisms for Very Large Storage Systems"
  • http://www.ssrc.ucsc.edu/Papers/xin-mss03.pdf

Consideration for Swift’s Partition

• Redundancy Set [1]:
  • Definition (from [1])
    • A block group composed of data blocks or object and their associated replicas or parity blocks. A single redundancy set will typically contain 1MB to 1TB, though we expect that redundancy sets will be at least 1GB to minimize bookkeeping overhead and reduce the likelihood that two redundancy sets will be stored on the same set of object storage system.
  • Assumption: a redundancy set corresponds to a Swift partition

Ring: MD5*(URL) = index into the partitions

Partition table from part to device id:

idx   Copy 1   Copy 2   Copy 3
0     1        5        7
…     …        …        …
8     3        2        6
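The lookup on this slide (hash the object URL, turn the hash into a partition index, then read one device id per copy from the partition table) can be sketched in a few lines of Python. This is a simplified illustration, not Swift's ring code: `PART_POWER`, the contents of `TABLE`, and the helpers `partition_of` and `devices_for` are hypothetical, and a real Swift ring also mixes a cluster-specific hash path prefix/suffix into the MD5 input.

```python
import hashlib
import struct

PART_POWER = 3  # hypothetical: 2**3 = 8 partitions

# Hypothetical partition table: row = partition index, columns = device id
# holding copy 1, copy 2, copy 3 of that partition.
TABLE = [
    (1, 5, 7),
    (0, 2, 4),
    (3, 6, 1),
    (2, 7, 5),
    (4, 0, 6),
    (5, 3, 0),
    (6, 1, 2),
    (7, 4, 3),
]


def partition_of(path, part_power=PART_POWER):
    """Map an object path to a partition index using the top bits of its MD5."""
    digest = hashlib.md5(path.encode("utf-8")).digest()
    top32 = struct.unpack_from(">I", digest)[0]  # first 4 bytes, big-endian
    return top32 >> (32 - part_power)


def devices_for(path):
    """Return the device ids that hold the copies of the given object path."""
    return TABLE[partition_of(path)]


if __name__ == "__main__":
    path = "/AUTH_test/container/object"
    print(partition_of(path), devices_for(path))
```

Because every object in a partition shares the same row of the table, the objects of one partition are lost together, which is why the partition is a natural stand-in for a redundancy set.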

Markov Process (1)

• Definition:
  • Absorbing State: the end state in the state transition model
  • P: Transition Probability Matrix

$$
P = \begin{pmatrix} Q & U \\ O & I \end{pmatrix}
  = \begin{pmatrix}
      1 - 2\mu & 2\mu & 0 \\
      v & 1 - (\mu + v) & \mu \\
      0 & 0 & 1
    \end{pmatrix}
$$

• Q: transition probability matrix among temporary states
• U: probability matrix from temporary states into the absorbing state
• O: zero matrix, I: identity matrix

[Figure: state transition diagram with State0 and State1 (temporary states) and State2 (absorbing state)]
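As a small illustration of the block structure above, the following Python sketch builds the 2-replica matrix P and splits it into Q and U. The function names and the sample values of `mu` and `v` are hypothetical, and μ and v are treated as per-time-step probabilities so that each row of P sums to 1.

```python
def two_replica_matrix(mu, v):
    """Transition probability matrix P for 2 replicas (State0, State1, State2)."""
    return [
        [1 - 2 * mu, 2 * mu, 0.0],   # State0: both replicas present
        [v, 1 - (mu + v), mu],       # State1: one replica lost
        [0.0, 0.0, 1.0],             # State2: data lost (absorbing)
    ]


def split_blocks(p, n_absorbing=1):
    """Split P into Q (temporary -> temporary) and U (temporary -> absorbing)."""
    t = len(p) - n_absorbing
    q = [row[:t] for row in p[:t]]
    u = [row[t:] for row in p[:t]]
    return q, u


# Hypothetical per-step failure and repair probabilities.
P = two_replica_matrix(mu=0.01, v=0.5)
Q, U = split_blocks(P)

if __name__ == "__main__":
    print("Q =", Q)
    print("U =", U)
```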

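Markov Process (2)

• The limit of the state transition matrix P^t as t → ∞ gives M, the average # of state transitions from each initial state until the absorbing state is reached

$$
\lim_{t \to \infty} P^t = \begin{pmatrix} 0 & MU \\ 0 & I \end{pmatrix},
\qquad M = (I - Q)^{-1}
$$

• MTTDL (the time until the absorbing state is reached) is calculated from the sum of each row of M

$$
\mathrm{MTTDL}_{rs} = M \begin{pmatrix} 1 \\ \vdots \\ 1 \end{pmatrix}
$$

State transition matrix for 2 replicas:

$$
P = \begin{pmatrix}
      1 - 2\mu & 2\mu & 0 \\
      v & 1 - (\mu + v) & \mu \\
      0 & 0 & 1
    \end{pmatrix}
$$

$$
M = \frac{1}{2\mu} \begin{pmatrix} \dfrac{\mu + v}{\mu} & 2 \\[1ex] \dfrac{v}{\mu} & 2 \end{pmatrix},
\qquad
\mathrm{MTTDL}_{rs} = \frac{1}{2\mu^{2}} \begin{pmatrix} 3\mu + v \\ 2\mu + v \end{pmatrix}
$$

Probability of data loss: N / MTTDL_rs

Durability = 1 - N / MTTDL_rs:

$$
\text{Durability} = 1 - 2N\mu^{2} \begin{pmatrix} \dfrac{1}{3\mu + v} \\[1ex] \dfrac{1}{2\mu + v} \end{pmatrix}
$$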
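The 2-replica result can be checked numerically. The sketch below computes M = (I - Q)^-1 with the closed-form 2x2 inverse, sums the rows to get MTTDL_rs, and applies the durability formula from the slide. The function names and sample values are hypothetical, and `n` is simply the factor N from the slide's formula passed through as a parameter, since its meaning is not defined in this transcript. `Fraction` is used only so the comparison with the closed form is exact.

```python
from fractions import Fraction


def fundamental_matrix_2x2(q):
    """Return M = (I - Q)^-1 for a 2x2 matrix Q via the closed-form inverse."""
    a, b = 1 - q[0][0], -q[0][1]
    c, d = -q[1][0], 1 - q[1][1]
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def mttdl_rs(mu, v):
    """MTTDL per redundancy set, starting from State0 and from State1."""
    q = [[1 - 2 * mu, 2 * mu], [v, 1 - (mu + v)]]
    m = fundamental_matrix_2x2(q)
    return [sum(row) for row in m]  # row sums of M


def durability(mu, v, n):
    """Durability = 1 - N / MTTDL_rs, evaluated for each starting state."""
    return [1 - n / t for t in mttdl_rs(mu, v)]


if __name__ == "__main__":
    mu, v = Fraction(1, 100), Fraction(1, 2)  # hypothetical per-step values
    print(mttdl_rs(mu, v))       # matches (3mu + v) / (2 mu^2) and (2mu + v) / (2 mu^2)
    print(durability(mu, v, 1))
```

With these sample values the row sums agree with the closed forms (3μ + v) / 2μ² and (2μ + v) / 2μ² term by term, which is a quick way to validate any hand-derived matrix before plugging in real failure and repair rates.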

For EC Case

Erasure Code Definition

• Object size (bytes): n
• # of sliced raw objects (data fragments): k
• # of parities (parity fragments): m
• Total # of fragments: k + m
• Fragment size (bytes): n / k (+ checksum)
• Total stored size (bytes): fragment size * (k + m)

[Figure: encode turns an object into k data fragments and m parity fragments; decode restores the object from the fragments]

Terminology reference: http://specs.openstack.org/openstack/swift-specs/specs/swift/erasure_coding.html
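The size definitions above amount to simple arithmetic, sketched here in Python. The helper `ec_sizes` and the sample numbers are hypothetical, and rounding the fragment size up to a whole byte is an assumption; the slide only gives n / k (+ checksum).

```python
def ec_sizes(n, k, m, checksum=0):
    """Return (fragment size, total stored size) in bytes for a k + m scheme.

    n: object size, k: # of data fragments, m: # of parity fragments.
    The fragment size is n / k rounded up (assumption), plus an optional checksum.
    """
    fragment = -(-n // k) + checksum  # integer ceiling of n / k
    total = fragment * (k + m)
    return fragment, total


if __name__ == "__main__":
    # Hypothetical 1000-byte object.
    print(ec_sizes(1000, k=10, m=4))  # EC with 10 data + 4 parity fragments
    print(ec_sizes(1000, k=1, m=2))   # 3 replicas expressed as k = 1, m = 2
```

Expressing replication as k = 1 makes the storage cost directly comparable: for the same object, the 10 + 4 scheme stores 1.4 times the object size while 3 replicas store 3 times.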

How to Calculate EC Durability? [1]

• Basic idea
  • Extend the durability calculation used for the replica model
  • Calculation using a Markov model (Markov process)
• Replica model based on a Markov process:
  • 2 replicas -> k = 1, m = 1, i.e. data is lost when 2 fragments are lost
  • 3 replicas -> k = 1, m = 2, i.e. data is lost when 3 fragments are lost

※ A Markov process lets us calculate the durability by matrix calculation. [3]

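Durability Calculation Algorithms

• Algorithms
  • State: status (exists or lost) of all fragments
  • Each state transition occurs with a constant probability
    • μ = disk failure rate, v = fragment repair rate
  • Each rate depends on the # of fragments
    • E.g. in RAID, it depends on the # of devices
  • Extend the states up to m + 1 (i.e. data lost)

State transitions for “m” parities EC (states 0, 1, …, m-1, m, m+1), with D = # of devices (RAID5) and N = k + m (N fragments located in the system). State i means that i fragments are lost, so a failure moves i -> i+1 at rate (N-i)μ and a repair moves i -> i-1 at rate iv:

transition     rate
0 -> 1         Nμ
1 -> 0         v
m-1 -> m       (N-(m-1))μ
m -> m-1       mv
m -> m+1       (N-m)μ

Diagonal entries (total rate of leaving each state, negated):

state 0        -Nμ
state 1        -(N-1)μ-v
state m        -(N-m)μ-mv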
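The state transitions above generalize the 2-replica calculation, and can be sketched as follows. This is an illustration under stated assumptions rather than the simulator's actual code: the function names are hypothetical, μ and v are treated as per-time-step probabilities as in the replica matrix, repair from state i is taken as i·v (matching the v and mv labels), and the linear system (I - Q) t = 1 is solved with a small Gauss-Jordan routine instead of an explicit inverse.

```python
from fractions import Fraction


def ec_q_matrix(k, m, mu, v):
    """Q for temporary states 0..m (state i = i fragments lost, N = k + m)."""
    n = k + m
    size = m + 1  # state m + 1 (data lost) is absorbing and not part of Q
    q = [[0] * size for _ in range(size)]
    for i in range(size):
        fail = (n - i) * mu   # i -> i + 1
        repair = i * v        # i -> i - 1
        q[i][i] = 1 - fail - repair
        if i > 0:
            q[i][i - 1] = repair
        if i < m:
            q[i][i + 1] = fail
    return q


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with row pivoting."""
    size = len(a)
    aug = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(size):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][size] / aug[i][i] for i in range(size)]


def ec_mttdl(k, m, mu, v):
    """Expected # of steps until data loss, for each starting state 0..m."""
    q = ec_q_matrix(k, m, mu, v)
    size = len(q)
    i_minus_q = [[(1 if r == c else 0) - q[r][c] for c in range(size)]
                 for r in range(size)]
    return solve(i_minus_q, [1] * size)  # equals the row sums of (I - Q)^-1


if __name__ == "__main__":
    mu, v = Fraction(1, 1000), Fraction(1, 10)  # hypothetical values
    for k, m in [(1, 1), (1, 2), (10, 4)]:
        print(k, m, float(ec_mttdl(k, m, mu, v)[0]))
```

Setting k = 1, m = 1 reproduces the 2-replica MTTDL, so the replica case is just one point in the same family of models.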

Demo

Questions?

Kota Tsuyuzaki [IRC: kota_]
tsuyuzaki.kota@lab.ntt.co.jp
NTT Software Innovation Center

Etherpad: https://etherpad.openstack.org/p/kilo-swift-durability-simulator