Luc Bouganim Björn Þór Jónsson Philippe Bonnet
SRBIAU, Kurdistan Campus
Danesh Zandi, Afshin Rahmany & Mohamad Kavosi
Spring 12
Assistant Professor: Kyumars Sheykh Esmaili
uFLIP: Understanding Flash IO Patterns
Overview
1. Introduction
   1. Motivation
   2. Contributions
   3. How Flash Devices Work
   4. Why the State Matters
2. Content
   1. Definitions
   2. The Benchmark
   3. Benchmarking Methodology
   4. Device Evaluations
3. Review
   1. Problems
   2. Evaluation
   3. Conclusion
Introduction: Motivation
Flash devices (vs. HDDs)
Faster
• Lower latency
• Higher throughput
More robust
Soon as big
More complex to handle
Read/Write → Program/Erase
New/adapted algorithms?
We need to understand the devices!
Introduction: Contributions
The uFLIP Benchmark
Consisting of 9 micro-benchmarks
Benchmarking Methodology
How to apply the benchmark
Device Evaluations
Example evaluations of a set of devices
Introduction: How Flash Devices Work
Units
Page: ~2 KB
Block: 64 * 2 KB ≈ 128 KB
Read
Program
Default state: 1
Program → 0
Erase
Back to default
Only possible 10⁵ to 10⁶ times
Per block → slow
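The read/program/erase rules above can be sketched as a toy model. This is only an illustration under stated assumptions: the class name FlashBlock, its methods and the constants are hypothetical, the page and block sizes are taken from the slide, and the erase limit uses the lower bound of the quoted range.

```python
PAGE_SIZE = 2048          # bytes per page (~2 KB, as on the slide)
PAGES_PER_BLOCK = 64      # 64 pages * 2 KB = 128 KB per block
MAX_ERASES = 100_000      # assumed: lower bound of the 10^5 to 10^6 range


class FlashBlock:
    """Toy model of one flash block (hypothetical, not a device API)."""

    def __init__(self):
        # Erased flash reads as all ones.
        self.pages = [bytearray(b"\xff" * PAGE_SIZE)
                      for _ in range(PAGES_PER_BLOCK)]
        self.erase_count = 0

    def read(self, page):
        return bytes(self.pages[page])

    def program(self, page, data):
        # Programming can only flip bits from 1 to 0, so the stored
        # result is the bitwise AND of the old and the new contents.
        if len(data) != PAGE_SIZE:
            raise ValueError("data must be exactly one page")
        old = self.pages[page]
        self.pages[page] = bytearray(o & n for o, n in zip(old, data))

    def erase(self):
        # Erase resets the whole block to ones and wears it out.
        if self.erase_count >= MAX_ERASES:
            raise RuntimeError("block worn out")
        self.pages = [bytearray(b"\xff" * PAGE_SIZE)
                      for _ in range(PAGES_PER_BLOCK)]
        self.erase_count += 1


block = FlashBlock()
block.program(0, b"\x0f" * PAGE_SIZE)
block.program(0, b"\xf0" * PAGE_SIZE)      # cannot set bits back to 1
after_programs = block.read(0)[0]          # 0x0f & 0xf0
block.erase()
after_erase = block.read(0)[0]             # all ones again
```

Programming the same page twice without an erase leaves only the bits that both writes kept at 1, which is why an in-place update normally needs a block erase first.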
Introduction: How Flash Devices Work
Flash Chips
Block Manager
Wear leveling
Maps LBA to flash page
Possibly trades in-place updates for writes into free pages
Possibly asynchronous page reclamation
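A minimal sketch of what such a block manager might do, assuming a simple page-level mapping. The class ToyBlockManager and its fields are hypothetical; real devices hide their mapping and reclaim whole blocks, not single pages as this simplification does.

```python
class ToyBlockManager:
    """Hypothetical page-mapping layer; real devices keep theirs hidden."""

    def __init__(self, num_pages):
        self.mapping = {}                    # LBA -> physical flash page
        self.free = list(range(num_pages))   # pages ready to be programmed
        self.invalid = set()                 # stale pages awaiting reclamation
        self.reclaims = 0

    def write(self, lba):
        # Out-of-place update: the new version goes to a free page and
        # the previous page for this LBA is only marked invalid.
        if not self.free:
            self._reclaim()
        if not self.free:
            raise RuntimeError("device full")
        new_page = self.free.pop(0)
        old_page = self.mapping.get(lba)
        if old_page is not None:
            self.invalid.add(old_page)
        self.mapping[lba] = new_page
        return new_page

    def _reclaim(self):
        # Stands in for the erase step; on a real device this is the
        # slow operation, and it works on whole blocks (simplified here).
        self.free.extend(sorted(self.invalid))
        self.invalid.clear()
        self.reclaims += 1


manager = ToyBlockManager(num_pages=4)
pages_used = [manager.write(0) for _ in range(5)]   # same LBA, five times
```

In this toy run the first four writes are cheap and the fifth triggers reclamation, a small picture of why the cost of a write depends on the hidden state of the device.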
Introduction: Why the State Matters
General principles are well known
Details not.
Flash Devices are black-boxes.
# free pages unknown
Time of next erase unknown
Cost of I/O operation is non-uniform in time
Depends on state of device
Content: Definitions
I/O operation
Time, size, LBA, read/write
Baseline patterns
Sequential/random read, sequential/random write
Time
Consecutive, pause, burst
Logical Block Address
Sequential, random, ordered, partitioned
Target offset/size, shift
Others
IOIgnore, IOCount
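The sequential and random baseline patterns can be sketched as address generators. This is one plausible reading of the definitions above, not the benchmark's own code: the function names are hypothetical, addresses are treated as byte offsets for simplicity, and the parameters mirror the slide's I/O size, target offset, target size and IOCount.

```python
import random


def sequential_lbas(io_count, io_size, target_offset=0):
    # Each I/O starts where the previous one ended.
    return [target_offset + i * io_size for i in range(io_count)]


def random_lbas(io_count, io_size, target_offset, target_size, seed=0):
    # Each I/O hits a random io_size-aligned slot inside the target space.
    rng = random.Random(seed)            # seeded so a run can be repeated
    slots = target_size // io_size
    return [target_offset + rng.randrange(slots) * io_size
            for _ in range(io_count)]


seq = sequential_lbas(io_count=4, io_size=16)
rnd = random_lbas(io_count=4, io_size=16, target_offset=1024, target_size=4096)
```

Combining either generator with reads or writes gives the four baseline patterns named above.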
Content: The Benchmark(s)
Granularity
I/O size
Locality
Target size
Content: The Benchmark(s)
Partitioning
Target space divided into partitions
Operations within partition are sequential
Order
Linear increase/decrease, in-place
Parallelism
Target space divided into subsets
Each accessed by different process
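The partitioning and order patterns can be sketched in the same way. The function names are hypothetical, and two details are assumptions not stated on the slide: partitions are visited round-robin, and the order pattern uses a signed increment in units of the I/O size (positive for increasing, negative for decreasing, zero for in-place).

```python
def partitioned_lbas(io_count, io_size, target_size, partitions,
                     target_offset=0):
    # Split the target space into equal partitions; consecutive I/Os
    # rotate over the partitions (assumed round-robin), while the I/Os
    # that land in the same partition are sequential.
    part_size = target_size // partitions
    lbas = []
    for i in range(io_count):
        part = i % partitions
        step = i // partitions
        inside = (step * io_size) % part_size   # wrap at partition end
        lbas.append(target_offset + part * part_size + inside)
    return lbas


def ordered_lbas(io_count, io_size, increment, target_offset=0):
    # increment > 0: linear increase, < 0: linear decrease, 0: in-place.
    return [target_offset + i * increment * io_size
            for i in range(io_count)]


two_partitions = partitioned_lbas(io_count=6, io_size=16,
                                  target_size=128, partitions=2)
in_place = ordered_lbas(io_count=3, io_size=16, increment=0)
decreasing = ordered_lbas(io_count=3, io_size=16, increment=-1,
                          target_offset=64)
```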
Content: Benchmarking Methodology
Device state
Out of the box 16KB write: 1 msec
After writing whole device: 8 msec
Well defined initial state
„Writing the whole flash device completely yields a well-defined state.“
Start-up Phase
Defined by IOIgnore
Running Phase
Defined by IOCount – IOIgnore
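The split into start-up and running phase can be sketched as follows. The helper name and the response times are made up purely for illustration; only the roles of IOIgnore and IOCount come from the slide.

```python
from statistics import mean


def running_phase(response_times, io_ignore):
    # Drop the first io_ignore measurements (start-up phase) and keep
    # the remaining IOCount - IOIgnore measurements (running phase).
    return response_times[io_ignore:]


# Made-up response times in msec, for illustration only.
times_ms = [1.0, 1.1, 0.9, 8.2, 7.9, 8.1, 8.0]
io_count = len(times_ms)
io_ignore = 3

running = running_phase(times_ms, io_ignore)
running_mean = mean(running)
overall_mean = mean(times_ms)
```

With these illustrative numbers the overall mean would understate the steady-state cost, which is the reason for ignoring the start-up phase.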
Content: Device Evaluations
Devices are from 2009
Range from USB sticks through IDE modules to SSDs
From $12 to $943 and 2GB to 32GB
More expensive → faster
Parallelism has no effect
Review: Problems
They vary only one parameter at a time
„Writing the whole flash device completely yields a well-defined state.“
Interactions between parameters not captured
Multidimensional graphs can be analyzed
Full factorial design is not feasible
•e.g. what if locality and partitioning work well together?
•Why not 2^k factorial design?
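A 2^k factorial design gives every factor just two levels and runs all combinations, so interactions between parameters become visible at a cost of 2^k runs. The sketch below only enumerates such a design; the factor names and their two levels are hypothetical choices, not values from the paper.

```python
from itertools import product

# Hypothetical factors with a low and a high level each (k = 3).
factors = {
    "io_size_kb": (2, 128),
    "target_size_mb": (1, 1024),
    "partitions": (1, 16),
}

# One run per combination of levels: 2^k runs in total.
runs = [dict(zip(factors, levels)) for levels in product(*factors.values())]

for run in runs:
    print(run)
```

Three factors already need 8 runs and nine factors would need 512, still far fewer than a full factorial design over every level of every parameter.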
Next paragraph:
„...by performing random IOs of random size (ranging from 0.5KB to the flash block size, 128KB) on the whole device.“
„writes“ or „random IOs“?
What does „the whole device“ mean?
• All LBAs? All flash pages (not possible)? Total size?
Review: Evaluation
The paper was interesting to read.
Of the 3 contributions:
The results are obsolete (but interesting).
The methodology is (mostly) well-known benchmarking best practice.
The benchmark is still valid and useful.
More explanations for the results
Review: Conclusion
Many areas for improvement
Automation
Capturing interaction
SSDs are getting more and more important
Evaluation with today's devices
Parallelism?
No alternative offers as much information
End of presentation