
Enterprise Application of SSD

曹庆玲 qingling1220@sina.com

• Towards SSD-Ready Enterprise Platforms

• Building Large Storage Based On Flash Disks


Outline

• Motivation

• Platform and methodology

• Platform bottleneck analysis
  - Platform latency bottlenecks
  - I/O processing bottlenecks
  - Performance scaling bottlenecks

• Conclusion

Motivation

• SSDs deliver a 2-3 orders-of-magnitude increase in IOPS over HDDs

• Platforms have long been optimized for HDDs

• Are they ready for SSDs?

Platform and methodology


• Use Linux* as the reference OS for the experiments

• Focus on fixed-size 4KB random reads.

Random reads avoid I/O merging policies, and if the platform is ready for reads, it must also be ready for writes.
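A rough sketch of that access pattern (not the authors' measurement harness; the device path and request count are placeholders, and a real measurement would use O_DIRECT and a proper benchmarking tool):

```python
# Fixed-size 4KB reads at random 4KB-aligned offsets, so the block layer has
# no adjacent requests to merge. Illustration only.
import os
import random

DEV = "/dev/sdb"        # hypothetical SSD block device
IO_SIZE = 4096          # fixed 4KB requests
NUM_IOS = 100_000

fd = os.open(DEV, os.O_RDONLY)
num_blocks = os.lseek(fd, 0, os.SEEK_END) // IO_SIZE
for _ in range(NUM_IOS):
    offset = random.randrange(num_blocks) * IO_SIZE   # random, 4KB-aligned offset
    os.pread(fd, IO_SIZE, offset)                     # one 4KB random read
os.close(fd)
```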

Platform bottleneck analysis

• Platform latency bottlenecks—determine which component dominates I/O latency

• I/O processing bottlenecks—determine which software layers contribute the most CPU overhead to I/O processing

• Performance scaling bottlenecks—determine which component limits performance scaling

Platform bottleneck analysis —Platform latency

Total I/O latency is the time from when the application issues an I/O to when it receives the completion. It splits into time due to the media and time due to the platform.

Platform bottleneck analysis —Platform latency

The platform contributes only 26% of the total latency.

Optimizing the media is therefore still necessary.
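To make the split concrete, a toy calculation (the total latency below is a made-up placeholder; only the 26% share comes from the slide):

```python
# Toy illustration of the 26% platform share reported above.
total_latency_us = 200.0     # hypothetical total I/O latency, for illustration only
platform_share = 0.26        # platform contribution per the slide

platform_us = platform_share * total_latency_us    # 52 us spent in the platform
media_us = total_latency_us - platform_us           # 148 us spent in the media
print(f"platform: {platform_us:.0f} us, media: {media_us:.0f} us")
```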

Platform bottleneck analysis —I/O processing cost

[Chart: I/O processing cost in CPU clocks per I/O; the baseline AHCI path is roughly 35,000 clocks/IO]

Platform bottleneck analysis —I/O processing cost

• ahci_interrupt() and ahci_scr_read() execute uncacheable (UC) reads, which averaged about 2,100 clocks per UC read.

Device interfaces that adopt message-signaled interrupts (MSI), plus added intelligence to push status to the driver, can eliminate such UC reads.

This reduces overhead by about 8,400 clocks/IO.
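As a quick consistency check on those figures (a sketch using only the numbers quoted above), the savings imply roughly four UC reads per I/O:

```python
# ~2,100 clocks per uncacheable read and ~8,400 clocks saved per I/O imply
# about four UC reads on the AHCI interrupt path for each I/O.
clocks_per_uc_read = 2_100
savings_per_io = 8_400
print(savings_per_io / clocks_per_uc_read)   # -> 4.0 UC reads per I/O
```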

Platform bottleneck analysis —I/O processing cost

• I/O processing through an MSI-based interface such as LSI's incurred about 25,000 clocks/IO.

Platform bottleneck analysis —I/O processing cost

• The LSI driver's return path (5,250 clocks/IO) is still substantial.

It can be reduced by interrupt coalescing: only about 650 clocks then remain in the return path, bringing the total to about 20,000 clocks/IO.
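The arithmetic behind the "about 20,000 clocks/IO" figure, using the values quoted above:

```python
# Interrupt-coalescing arithmetic with the slide's values.
msi_path_clocks = 25_000       # MSI-based path, clocks per I/O
return_path_before = 5_250     # driver return path without coalescing
return_path_after = 650        # driver return path with coalescing

print(msi_path_clocks - return_path_before + return_path_after)   # -> 20400
```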

Platform bottleneck analysis —Performance scaling

Ensure that I/O processing scales with cores and SSDs.

A single core with 3 SSDs is fully saturated; more cores are required.

One adapter enables 177K IOPS.

With more adapters, throughput scales up to 445K IOPS.
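A rough sketch of why a single core saturates (the core frequency below is an assumption, not a number from the slides):

```python
# At roughly 20,000 clocks of processing per I/O, one core tops out at about
# frequency / clocks_per_io IOPS; the 3 GHz figure is assumed for illustration.
core_hz = 3.0e9
clocks_per_io = 20_000
print(f"{core_hz / clocks_per_io:,.0f} IOPS per core")   # ~150,000 IOPS
```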


Conclusion

• Existing platforms can be made ready for SSDs.

• Scalability of the file system

• I/O behavior of real applications

• Implementation of RAID

• Towards SSD-Ready Enterprise Platforms

• Building Large Storage Based On Flash Disks

Outline

• Introduction

• SSD RAID configuration

• Scalability

• Solution alternatives

• Conclusion

RAID 0

[Diagram: input data stream -> RAID controller -> data striped in parallel across SSD1, SSD2, SSD3, SSD4, SSD5]
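A minimal sketch of the striping idea (illustration only, not a controller implementation):

```python
# RAID 0 sketch: consecutive fixed-size chunks of the input stream are placed
# round-robin across the SSDs, so they can be serviced in parallel.
def stripe_raid0(data: bytes, num_ssds: int, chunk: int = 4096):
    ssds = [bytearray() for _ in range(num_ssds)]
    for i in range(0, len(data), chunk):
        ssds[(i // chunk) % num_ssds] += data[i:i + chunk]   # round-robin placement
    return ssds

parts = stripe_raid0(b"x" * 40960, num_ssds=5)   # 10 chunks over 5 SSDs
print([len(p) for p in parts])                   # -> [8192, 8192, 8192, 8192, 8192]
```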

RAID 1

[Diagram: input data stream -> RAID controller -> mirrored in parallel; Group 1 = SSD1 (work disk) + SSD2 (mirror disk), Group 2 = SSD3 (work disk) + SSD4 (mirror disk)]

RAID Levels —RAID 10

Two RAID 1 groups, striped.
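A matching sketch for RAID 10 (again illustration only): chunks are striped across the groups and mirrored within each group:

```python
# RAID 10 sketch: stripe across groups, write each chunk to both the work disk
# and the mirror disk of its group.
def write_raid10(data: bytes, num_groups: int = 2, chunk: int = 4096):
    groups = [[bytearray(), bytearray()] for _ in range(num_groups)]  # [work, mirror]
    for i in range(0, len(data), chunk):
        work, mirror = groups[(i // chunk) % num_groups]   # stripe across groups
        work += data[i:i + chunk]                          # primary copy
        mirror += data[i:i + chunk]                        # mirrored copy
    return groups

groups = write_raid10(b"x" * 16384)                        # 4 chunks over 2 groups
print([[len(d) for d in g] for g in groups])               # -> [[8192, 8192], [8192, 8192]]
```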

RAID 5

[Diagram: input data stream -> RAID controller -> data striped across the SSDs with parity blocks rotated across the drives]
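The parity in RAID 5 is the XOR of the data blocks in a stripe, which is enough to rebuild any single lost block; a small sketch:

```python
# RAID 5 parity sketch: parity = XOR of all data blocks in a stripe, so any one
# missing block can be reconstructed from the surviving blocks plus parity.
def xor_parity(blocks):
    parity = bytearray(len(blocks[0]))
    for blk in blocks:
        for i, b in enumerate(blk):
            parity[i] ^= b
    return bytes(parity)

stripe = [b"\x01" * 4096, b"\x02" * 4096, b"\x04" * 4096]  # three 4KB data blocks
parity = xor_parity(stripe)
rebuilt = xor_parity([stripe[0], stripe[2], parity])       # recover the lost block 1
assert rebuilt == stripe[1]
```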

Introduction

SSD RAID configurations show a performance loss.

Test setup and workload

Test setup:
• 16-core server with 64GB RAM
• 3 RAID controllers with 512MB cache
• Intel 64GB SSDs

Workloads:
• Workload light: one worker, queue depth 32
• Workload heavy: ten workers, queue depth 16
• Workload latency: single request, one worker, queue depth 1

SSD RAID Configurations —throughput (workload heavy)

RAID 0, 5, and 10 with 8 SSDs on a single controller


SSD RAID Configurations —throughput (workload light)

Volume = 240GB; single-SSD data shown for comparison


[Chart annotation: throughput saturates]

Scalability

The experimental data above indicate:

A bottleneck exists along the I/O chain.

Is it the RAID controller or the PCIe bus?

Even at the best throughput, PCIe bus utilization is less than 50%.

The RAID controller is the bottleneck.
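A rough version of that utilization argument (the link speed and IOPS figure below are assumptions for the sketch, not numbers from the slides):

```python
# A few hundred thousand 4KB IOPS uses well under half of an 8-lane PCIe 1.x
# link, so the link is not the limiter. Both numbers are assumed for illustration.
iops = 180_000                  # assumed per-controller throughput
io_bytes = 4096                 # 4KB transfers
pcie_x8_bw = 2.0e9              # ~2 GB/s on an 8-lane PCIe 1.x link (approximate)

payload_bw = iops * io_bytes    # ~0.74 GB/s
print(f"utilization ~ {payload_bw / pcie_x8_bw:.0%}")      # roughly 37%
```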

Scalability

Two SSDs are enough to saturate the controller!

[Charts: throughput with read-ahead and with write cache]

Scalability

[Chart: throughput without write cache]


Solution alternatives

Combinations of hardware and software:

A. No controller: devices connect directly, with software RAID on top.

B. Use the controller only as a simple device aggregator, with software RAID on top.

C. Use a simple RAID level on multiple RAID controllers, with software RAID on top.

Solution alternatives

Comparison of options A and B: RAID with 2 SSDs.

A second controller has a profound effect on performance.

Solution alternatives

Comparison of options B and C.

Conclusions

• Software RAID approaches
• Multiple block sizes
• RAID controllers are not designed for the characteristics of SSDs

Thank you~