
Page 1

Yonsei University

Data Processing Systems for Solid State Drive

Mincheol Shin, Yonsei University

2015.11.23

Page 2

Overview

• Main Target : Data Processing Systems with SSD

• Purpose : Improving I/O Performance

• Data Processing System
  – Relational Database Management System
    • e.g. Oracle, MySQL, PostgreSQL, SQLite
  – Distributed Data Processing System
    • e.g. Hadoop Distributed File System, MapReduce, Hive, HBase, Tajo, Spark
  – Key-value Store
    • e.g. Redis

Page 3

Outline

• Solid State Drive (SSD)
• RDBMS on Solid State Drive
• Big Data Processing for Solid State Drive

Page 4

Solid State Drive: Flash Memory [VLDB2011Tut2]

• Great Performance!
  – High I/O Performance: 41 MB/s Read, 7.5 MB/s Program [Micron 2014]
  – Fast Random Access: Under 0.1 ms (HDD: 2.9 to 12 ms)
  – Low Energy Consumption
• Four Constraints of NAND Flash Memory
  – C1: Programs are done at page granularity (2 KB ~ 16 KB)
  – C2: A block (256 KB ~ 1 MB) must be erased before a page in it can be updated
  – C3: Pages must be programmed sequentially within a block
  – C4: Limited lifetime (10^4 ~ 10^5 erase cycles)

[Figure: an erase block (1 MB) made up of 4 KB pages]
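Constraints C2 to C4 can be illustrated with a small simulation. This is a minimal sketch, not a model of any real chip: the class name `NandBlock`, the page count, and the erase budget are all hypothetical values chosen for illustration.

```python
class NandBlock:
    """Toy model of one NAND erase block (sizes and limits are illustrative)."""

    def __init__(self, pages_per_block=4, max_erases=10_000):
        self.pages = [None] * pages_per_block  # None means erased (free)
        self.next_page = 0                     # C3: next page allowed to be programmed
        self.erase_count = 0
        self.max_erases = max_erases           # C4: endurance budget

    def program(self, page_no, data):
        if not 0 <= page_no < len(self.pages):
            raise IndexError("no such page in this block")
        # C2: a programmed page cannot be updated until the block is erased.
        if self.pages[page_no] is not None:
            raise ValueError("page already programmed; erase the block first")
        # C3: pages must be programmed in order within the block.
        if page_no != self.next_page:
            raise ValueError("pages must be programmed sequentially")
        self.pages[page_no] = data
        self.next_page += 1

    def erase(self):
        # C4: every erase consumes part of the block's limited lifetime.
        if self.erase_count >= self.max_erases:
            raise RuntimeError("block is worn out")
        self.pages = [None] * len(self.pages)
        self.next_page = 0
        self.erase_count += 1
```

Updating a single page in place therefore costs an erase of the whole block, which is why SSD firmware writes updates elsewhere instead.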

[VLDB2011Tut2] P. Bonnet, L. Bouganim, I. Koltsidas, S. D. Viglas, VLDB 2011 Tutorial: System Co-Design and Data management for Flash Devices

Page 5

Solid State Drive

• Solid State Drive (SSD)
  – Definition: Persistent data storage with no spinning disks or drive motor
  – Supports traditional block I/O
• Characteristics of SSD
  – Fast Random Access (inherited from flash memory)
  – Read/Write Imbalance (inherited from flash memory)
  – Exploiting Internal Parallelism (SSD internal structure)
  – In-Storage Processing

[Figure: SSD architecture. The host interface (SATA, SAS, PCIe) accepts Read(addr) and Write(addr, data). The internal algorithm (FTL) performs mapping, wear leveling, and garbage collection. The physical storage consists of flash chips that support Read, Program, and Erase.]

Page 6

Solid State Drive: Flash Translation Layer (FTL)

• Flash Translation Layer
  – Converts block I/O operations to internal flash operations
  – Three Major Components
    • Mapping
      – Maps a Logical Block Address (LBA) to a physical page
    • Garbage Collection
    • Wear Leveling
      – Extends the lifetime of the SSD

[Figure: logical-to-physical mapping over Blocks 1 to 4. Pages are marked v or I (presumably valid and invalid) as an update arrives, followed by a block erase.]
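The mapping and garbage collection components can be sketched as a toy page-level FTL. This is an illustrative sketch under stated assumptions, not how any particular SSD firmware works: the class `PageMappedFTL`, the greedy victim choice (fewest valid pages), and the single spare block are all hypothetical. Wear leveling is not implemented; the sketch only records per-block erase counts that a wear-leveling policy could consult.

```python
class PageMappedFTL:
    """Toy page-level FTL: out-of-place updates plus greedy garbage collection."""

    def __init__(self, num_blocks=4, pages_per_block=4):
        self.ppb = pages_per_block
        self.mapping = {}                   # LBA -> (block, page)
        self.owner = {}                     # (block, page) -> LBA, valid pages only
        self.data = {}                      # (block, page) -> payload
        self.written = [0] * num_blocks     # pages programmed so far in each block
        self.erase_counts = [0] * num_blocks
        self.free_blocks = list(range(1, num_blocks))
        self.active = 0                     # block currently being filled

    def read(self, lba):
        return self.data[self.mapping[lba]]

    def write(self, lba, payload):
        if self.written[self.active] == self.ppb:
            self._advance()
        self._program(lba, payload)

    def _program(self, lba, payload):
        old = self.mapping.get(lba)
        if old is not None:
            del self.owner[old]             # invalidate the previous version
        loc = (self.active, self.written[self.active])
        self.written[self.active] += 1      # pages fill sequentially in the block
        self.mapping[lba] = loc
        self.owner[loc] = lba
        self.data[loc] = payload

    def _valid_in(self, block):
        return [loc for loc in self.owner if loc[0] == block]

    def _advance(self):
        """Pick the next block to fill, reclaiming space when only the spare is left."""
        if len(self.free_blocks) > 1:
            self.active = self.free_blocks.pop(0)
            return
        full = [b for b in range(len(self.written)) if b not in self.free_blocks]
        victim = min(full, key=lambda b: len(self._valid_in(b)))
        survivors = [(self.owner[loc], self.data[loc]) for loc in self._valid_in(victim)]
        if len(survivors) == self.ppb:
            raise RuntimeError("no invalid pages to reclaim")
        self.active = self.free_blocks.pop(0)
        for lba, payload in survivors:
            self._program(lba, payload)     # relocate pages that are still valid
        for page in range(self.ppb):        # erase the victim block
            self.data.pop((victim, page), None)
        self.written[victim] = 0
        self.erase_counts[victim] += 1
        self.free_blocks.append(victim)
```

Repeatedly overwriting a few LBAs forces the sketch to relocate valid pages and erase blocks, which is the extra internal work behind the read/write imbalance mentioned earlier.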

Page 7

Solid State Drive: Internal Parallelism

• An SSD can read and write data in parallel

[Figure: SSD internals. The host interface (SATA, SAS, PCIe) and memory connect to eight flash packages over N parallel channels (channel-level parallelism); packages that share a channel are interleaved (package-level parallelism). Timing diagram: Packages 1 and 2 are on Channel 1 and Packages 3 and 4 are on Channel 2. Reads 1 to 8 overlap across the packages, while Channel 1 transfers Data 1, 3, 5, 7 and Channel 2 transfers Data 2, 4, 6, 8.]
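The effect of channel-level and package-level parallelism can be estimated with a small timing model. This is a rough sketch: the function `completion_time`, the read and transfer latencies, and the rule that a package stays busy until its transfer finishes are all assumptions made for illustration, not measured SSD behavior.

```python
def completion_time(page_to_package, package_to_channel, t_read=50, t_xfer=25):
    """Return the time (in microseconds) at which the last page has been transferred.

    page_to_package lists, in request order, the package that holds each page.
    """
    package_free = {}   # package -> time at which it can start another read
    channel_free = {}   # channel -> time at which its bus becomes idle
    finish = 0
    for package in page_to_package:
        channel = package_to_channel[package]
        read_done = package_free.get(package, 0) + t_read
        # The transfer needs both the finished read and an idle channel.
        xfer_start = max(read_done, channel_free.get(channel, 0))
        xfer_done = xfer_start + t_xfer
        # Assumption: the package is busy until its data has left the chip.
        package_free[package] = xfer_done
        channel_free[channel] = xfer_done
        finish = max(finish, xfer_done)
    return finish


# Two channels with two packages each, as in the timing diagram.
channels = {1: 1, 2: 1, 3: 2, 4: 2}
striped = [1, 3, 2, 4, 1, 3, 2, 4]      # eight pages spread over all packages
single = [1] * 8                        # eight pages on one package
print(completion_time(striped, channels), completion_time(single, channels))
```

Under these assumed latencies the striped layout finishes well before the single-package layout, because reads on one package overlap with transfers from another.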

Page 8

Solid State Drive: Internal Parallelism

• Using internal parallelism, an SSD achieves
  – High performance for sequential I/O
    • Similar to striping (RAID 0)
    • Sequential bandwidth of a SATA SSD
      – Write: 450 MB/s
      – Read: 500 MB/s
  – High performance for concurrent I/O

[VLDB2012Roh] H. Roh, S. Park, S. Kim, M. Shin, S-W. Lee, B+-tree index optimization by exploiting internal parallelism of flash-based Solid State Drives

Page 9

Solid State Drive: In-Storage Processing

• An SSD has its own CPU and memory for the FTL
• The host interface is the bottleneck!
  – The host interface has lower bandwidth than the internal bandwidth of the SSD
• Two approaches
  – Light-weight filter in the SSD
    • Transfer less data through the host interface
    • Filter tuples using predicates
  – Sub-modules in the SSD
    • e.g. Transaction management with COW
• A special SSD is needed to implement ISP
  – OpenSSD, SmartSSD, and so on
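The light-weight filter approach can be illustrated by counting how many tuples cross the host interface in each design. This is a conceptual sketch in plain Python: the functions `host_side_filter` and `in_storage_filter` are hypothetical names, and no real OpenSSD or SmartSSD API is used.

```python
def host_side_filter(pages, predicate):
    """Conventional scan: ship every page to the host, then filter there."""
    transferred = sum(len(page) for page in pages)   # tuples crossing the interface
    result = [t for page in pages for t in page if predicate(t)]
    return result, transferred


def in_storage_filter(pages, predicate):
    """In-storage scan: evaluate the predicate in the device, ship only matches."""
    result = [t for page in pages for t in page if predicate(t)]
    return result, len(result)                       # only matches cross the interface


# Ten pages of ten (id, value) tuples; the predicate keeps one tuple in ten.
pages = [[(p * 10 + i, i) for i in range(10)] for p in range(10)]
selective = lambda t: t[1] == 0
print(host_side_filter(pages, selective)[1], in_storage_filter(pages, selective)[1])
```

Both functions return the same result set; only the amount of data moved over the host interface differs, and the gap grows as the predicate becomes more selective.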

Page 10

DBMS on Solid State Drive

• Main research areas:
  – Buffer Management
  – Index Management
  – Query Processing
  – Transaction Management
• Most research on SSDs has focused on storage I/O

Page 11

DBMS on Solid State Drive: Index Management

• FD-tree
  – Exploits the sequential bandwidth of SSDs
  – B-tree + sorted runs
• PIO B-tree
  – Exploits the internal parallelism of SSDs
  – Accesses multiple B-tree nodes along multiple paths

Page 12

DBMS on Solid State Drive: Query Processing

• FlashJoin: PAX-based query processing
  – NSM layout
    • Most typical page layout
    • Tuples are stored in a contiguous region
  – PAX layout
    • Values of each column are stored in a contiguous region (minipage)
    • PAX was originally designed to reduce CPU cache misses
  – FlashScan reads only the needed minipages
  – FlashJoin joins the minipages read by FlashScan
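The difference between the two page layouts can be sketched by counting how many values a projection has to read. This only illustrates the layouts; it is not the FlashScan or FlashJoin implementation, and the helper names `to_nsm`, `to_pax`, `scan_nsm`, and `scan_pax` are hypothetical.

```python
def to_nsm(rows):
    """NSM: each tuple is stored contiguously, one after another."""
    return [tuple(row) for row in rows]


def to_pax(rows, columns):
    """PAX: within a page, the values of each column form one minipage."""
    return {name: [row[i] for row in rows] for i, name in enumerate(columns)}


def scan_nsm(page, columns, wanted):
    idx = [columns.index(c) for c in wanted]
    values_read = len(page) * len(columns)       # whole tuples are fetched
    return [tuple(row[i] for i in idx) for row in page], values_read


def scan_pax(page, wanted):
    minipages = [page[c] for c in wanted]        # only the needed minipages
    values_read = sum(len(m) for m in minipages)
    return list(zip(*minipages)), values_read


columns = ["id", "name", "price"]
rows = [(1, "a", 10.0), (2, "b", 20.0)]
print(scan_nsm(to_nsm(rows), columns, ["id", "price"]))
print(scan_pax(to_pax(rows, columns), ["id", "price"]))
```

Both scans return the same projected tuples, but the PAX scan touches only the two requested minipages. On an SSD, where random reads are cheap, skipping the unneeded minipages reduces I/O rather than only CPU cache misses.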

Page 13

DBMS on Solid State Drive: Query Processing

• FMSort
  – Exploits the internal parallelism of SSDs
  – During merge phase,

Page 14

DBMS on Solid State Drive: Transaction Mgmt.

• X-FTL: Shadow Paging in SSD
  – Write operations in an SSD are similar to copy-on-write
    • When a page is updated, the modified page is written to an empty page
    • The old page is then invalidated
  – X-FTL keeps the old pages until the transaction is committed
  – The original pages do not need to be copied
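The idea of keeping old pages until commit can be sketched as a mapping table with a per-transaction staging area. This is a toy illustration of the concept described on this slide, not the X-FTL design or interface: the class `TxMappingTable` and its methods are hypothetical.

```python
class TxMappingTable:
    """Toy transactional page mapping: old versions survive until commit."""

    def __init__(self):
        self.flash = []        # append-only list standing in for physical pages
        self.committed = {}    # LBA -> physical page number (durable view)
        self.pending = {}      # txid -> {LBA: physical page number}

    def write(self, txid, lba, payload):
        self.flash.append(payload)                       # out-of-place write
        self.pending.setdefault(txid, {})[lba] = len(self.flash) - 1

    def read(self, lba, txid=None):
        own = self.pending.get(txid, {})                 # a transaction sees its own writes
        ppn = own.get(lba, self.committed.get(lba))
        return None if ppn is None else self.flash[ppn]

    def commit(self, txid):
        # Switch the mapping; only now do the old pages become invalid.
        self.committed.update(self.pending.pop(txid, {}))

    def abort(self, txid):
        # Drop the new mapping entries; the old pages were never touched.
        self.pending.pop(txid, None)
```

Because the old page stays in place until commit, rollback is just discarding mapping entries, and the host does not need to copy the original pages itself.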

Page 15

Big Data on Solid State Drive

• 3 approaches to improve performance using SSDs
  – Complete replacement
    • Higher cost per capacity
  – Selective replacement
    • e.g. intermediate results on SSDs, HDFS data on HDDs
  – SSD as a cache
    • Commercial and noncommercial cache software exists
    • Open source: bcache, flashcache, EnhanceIO, dm-cache
    • Project with SK Telecom
• Archival Storage of HDFS
  – Stores replicas across 4 storage types
    • ARCHIVE: slowest storage with the largest capacity (petabytes of storage)
    • DISK, SSD, RAM_DISK
    • https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/ArchivalStorage.html#Storage_Types:_ARCHIVE_DISK_SSD_and_RAM_DISK
• Issues
  – Industry leads the Big Data processing platform area
  – There is no standard model
  – CPU overhead is too high