An In-depth Study of High Bandwidth Memory (on-demand.gputechconf.com/gtc/2018/presentation/s...)
![Page 1: An In-depth Study of High Bandwidth Memoryon-demand.gputechconf.com/gtc/2018/presentation/s... · DDR4-3200 Module 4ea HBM2 in a single 50mm x 50mm Sip Note: Advil is ... the current](https://reader033.fdocument.pub/reader033/viewer/2022050201/5f54d153fe96756432462895/html5/thumbnails/1.jpg)
Expanding the Boundaries of the AI Revolution:
An In-depth Study of
High Bandwidth Memory
Nayoung Lee & Sung Lee | March 2018
![Page 2: An In-depth Study of High Bandwidth Memoryon-demand.gputechconf.com/gtc/2018/presentation/s... · DDR4-3200 Module 4ea HBM2 in a single 50mm x 50mm Sip Note: Advil is ... the current](https://reader033.fdocument.pub/reader033/viewer/2022050201/5f54d153fe96756432462895/html5/thumbnails/2.jpg)
1
Table of Contents
1
2
3
![Page 3: An In-depth Study of High Bandwidth Memoryon-demand.gputechconf.com/gtc/2018/presentation/s... · DDR4-3200 Module 4ea HBM2 in a single 50mm x 50mm Sip Note: Advil is ... the current](https://reader033.fdocument.pub/reader033/viewer/2022050201/5f54d153fe96756432462895/html5/thumbnails/3.jpg)
1
![Page 4: An In-depth Study of High Bandwidth Memoryon-demand.gputechconf.com/gtc/2018/presentation/s... · DDR4-3200 Module 4ea HBM2 in a single 50mm x 50mm Sip Note: Advil is ... the current](https://reader033.fdocument.pub/reader033/viewer/2022050201/5f54d153fe96756432462895/html5/thumbnails/4.jpg)
3
Source: Stanford
Deep Neural Network Fundamental Concepts
[Figure: simple view of a deep neural network. Each layer reads weights and inputs from memory (MEM Read), computes Σ(Weights x Input) as a multiply & accumulate sum, applies the activation function, and writes its output back (MEM Write).]
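As a minimal sketch of the multiply & accumulate step described on this slide, the following computes one fully connected layer and counts the memory operations it implies. The ReLU activation, the sample weights, and the no-reuse traffic model are assumptions for illustration and do not come from the slides.

```python
def relu(value):
    """Activation function (assumed; the slide does not name one)."""
    return max(0.0, value)


def dense_layer(weights, inputs, activation=relu):
    """Compute activation(sum(weight * input)) for every output neuron."""
    outputs = []
    for row in weights:                  # MEM read: weights of one output neuron
        acc = 0.0
        for w, x in zip(row, inputs):    # MEM read: input operands
            acc += w * x                 # multiply & accumulate
        outputs.append(activation(acc))  # MEM write: one output value
    return outputs


def memory_traffic(num_outputs, num_inputs):
    """Operand reads and result writes, assuming no on-chip data reuse."""
    macs = num_outputs * num_inputs
    return {"macs": macs, "reads": 2 * macs, "writes": num_outputs}


# Hypothetical 3-input, 2-output layer.
weights = [[1.0, -2.0, 0.5], [0.0, 1.0, 1.0]]
inputs = [2.0, 1.0, 4.0]

layer_output = dense_layer(weights, inputs)
traffic = memory_traffic(len(weights), len(inputs))
print(layer_output, traffic)
```

Under this simplified model every multiply & accumulate needs two operand reads, which is one way to see why memory bandwidth becomes the limiting factor as layers grow.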
![Page 5: An In-depth Study of High Bandwidth Memoryon-demand.gputechconf.com/gtc/2018/presentation/s... · DDR4-3200 Module 4ea HBM2 in a single 50mm x 50mm Sip Note: Advil is ... the current](https://reader033.fdocument.pub/reader033/viewer/2022050201/5f54d153fe96756432462895/html5/thumbnails/5.jpg)
GPU Computing Performance bottleneck
4
Δ2x Bandwidth = Δ1.7x performance
1) In-Datacenter Performance Analysis of a Tensor Processing Unit, Norman P. Jouppi et al. (Google)
The Need for High Bandwidth Memory
![Page 6: An In-depth Study of High Bandwidth Memoryon-demand.gputechconf.com/gtc/2018/presentation/s... · DDR4-3200 Module 4ea HBM2 in a single 50mm x 50mm Sip Note: Advil is ... the current](https://reader033.fdocument.pub/reader033/viewer/2022050201/5f54d153fe96756432462895/html5/thumbnails/6.jpg)
5
![Page 7: An In-depth Study of High Bandwidth Memoryon-demand.gputechconf.com/gtc/2018/presentation/s... · DDR4-3200 Module 4ea HBM2 in a single 50mm x 50mm Sip Note: Advil is ... the current](https://reader033.fdocument.pub/reader033/viewer/2022050201/5f54d153fe96756432462895/html5/thumbnails/7.jpg)
2
![Page 8: An In-depth Study of High Bandwidth Memoryon-demand.gputechconf.com/gtc/2018/presentation/s... · DDR4-3200 Module 4ea HBM2 in a single 50mm x 50mm Sip Note: Advil is ... the current](https://reader033.fdocument.pub/reader033/viewer/2022050201/5f54d153fe96756432462895/html5/thumbnails/8.jpg)
GDDR/DDR/LPDDR HBM
7
GDDR/DDR/LPDDR use an FBGA package. [Figure: DRAM dies under mold on a PCB substrate.] Soldered on PCB directly, or used as DIMM type.
HBM uses a KGSD in a 2.5D SiP. [Figure: DRAM slices with side molding, TSVs, PHY and DA balls, placed next to the SoC PHY on an interposer over the substrate.]
HBM, What’s the difference?
![Page 9: An In-depth Study of High Bandwidth Memoryon-demand.gputechconf.com/gtc/2018/presentation/s... · DDR4-3200 Module 4ea HBM2 in a single 50mm x 50mm Sip Note: Advil is ... the current](https://reader033.fdocument.pub/reader033/viewer/2022050201/5f54d153fe96756432462895/html5/thumbnails/9.jpg)
8
To Achieve 1TB/s Bandwidth …
160 DDR4-3200 chips
40 DDR4-3200 modules
4 HBM2 stacks in a single 50mm x 50mm SiP
Note: Advil is a registered trademark
High Bandwidth Memory Delivers Small Form Factor
HBM provides the highest bandwidth per unit area compared to other DRAM memories
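The device counts on this slide can be checked with a short calculation. The sketch below assumes x16 DDR4-3200 chips, 64-bit DDR4-3200 modules, and HBM2 stacks with 1024 I/Os at 2Gbps per pin; the x16 and 64-bit widths are assumptions that reproduce the slide's numbers, not values stated on it. The target is taken as 1,024,000 MB/s, which the slide rounds to 1TB/s.

```python
TARGET_MB_PER_S = 1_024_000  # about 1 TB/s, as rounded on the slide


def peak_mb_per_s(io_bits, mbps_per_pin):
    """Peak bandwidth in MB/s: I/O width times per-pin rate, divided by 8 bits per byte."""
    return io_bits * mbps_per_pin // 8


def units_needed(per_unit_mb_per_s, target=TARGET_MB_PER_S):
    """Smallest number of units whose combined peak bandwidth reaches the target."""
    return -(-target // per_unit_mb_per_s)  # ceiling division on integers


# (I/O width in bits, per-pin rate in Mbps); widths for DDR4 are assumed.
options = {
    "DDR4-3200 chip (x16)": (16, 3200),
    "DDR4-3200 module (64-bit)": (64, 3200),
    "HBM2 stack (1024 I/O, 2Gbps)": (1024, 2000),
}

counts = {
    name: units_needed(peak_mb_per_s(bits, rate))
    for name, (bits, rate) in options.items()
}

for name, count in counts.items():
    print(f"{name}: {count}")
```

These are peak figures only; sustained bandwidth depends on access patterns and controller efficiency.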
![Page 10: An In-depth Study of High Bandwidth Memoryon-demand.gputechconf.com/gtc/2018/presentation/s... · DDR4-3200 Module 4ea HBM2 in a single 50mm x 50mm Sip Note: Advil is ... the current](https://reader033.fdocument.pub/reader033/viewer/2022050201/5f54d153fe96756432462895/html5/thumbnails/10.jpg)
9
High Bandwidth Memory Delivers Small Form Factor

| | GDDR5(X) | HBM2 |
|---|---|---|
| Density | 8Gb x 12 = 12GB | 8GB x 4 = 32GB |
| IO speed | 8Gbps - 11Gbps | 2Gbps |
| # of IO | 384 bits | 1024 x 4 = 4096 bits |
| Bandwidth | 384 to 528GB/s | 1TB/s |
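The bandwidth and density rows of this comparison follow from the I/O width and per-pin speed. A minimal sketch of that arithmetic is shown below; the function names are illustrative and only the input values come from the slide.

```python
def peak_bandwidth_gb_s(io_bits, gbps_per_pin):
    """Peak bandwidth in GB/s = number of I/Os x per-pin rate (Gbps) / 8."""
    return io_bits * gbps_per_pin / 8


def capacity_gb(gbit_per_device, devices):
    """Total capacity in GB for devices sized in Gbit."""
    return gbit_per_device * devices / 8


# GDDR5(X): twelve 8Gb devices on a 384-bit interface at 8 to 11 Gbps.
gddr5x_density = capacity_gb(8, 12)
gddr5x_bw_low = peak_bandwidth_gb_s(384, 8)
gddr5x_bw_high = peak_bandwidth_gb_s(384, 11)

# HBM2: four 8GB stacks, each with 1024 I/Os at 2 Gbps.
hbm2_density = 8 * 4
hbm2_io = 1024 * 4
hbm2_bw = peak_bandwidth_gb_s(hbm2_io, 2)

print(gddr5x_density, gddr5x_bw_low, gddr5x_bw_high)
print(hbm2_density, hbm2_io, hbm2_bw)
```

The HBM2 result is 1024GB/s, which the slide rounds to 1TB/s.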
![Page 11: An In-depth Study of High Bandwidth Memoryon-demand.gputechconf.com/gtc/2018/presentation/s... · DDR4-3200 Module 4ea HBM2 in a single 50mm x 50mm Sip Note: Advil is ... the current](https://reader033.fdocument.pub/reader033/viewer/2022050201/5f54d153fe96756432462895/html5/thumbnails/11.jpg)
10
High Bandwidth Memory Delivers Unprecedented Bandwidth
HBM overcomes all DRAM bandwidth challenges
Bandwidth Challenges High Bandwidth + High I/O
![Page 12: An In-depth Study of High Bandwidth Memoryon-demand.gputechconf.com/gtc/2018/presentation/s... · DDR4-3200 Module 4ea HBM2 in a single 50mm x 50mm Sip Note: Advil is ... the current](https://reader033.fdocument.pub/reader033/viewer/2022050201/5f54d153fe96756432462895/html5/thumbnails/12.jpg)
11
High Bandwidth Memory Delivers Power Efficiency
HBM's low per-pin speed and low Cio reduce power consumption and increase power efficiency
[Chart: Power Efficiency (%) and Power Consumption (mW/Gbps/pin)]
![Page 13: An In-depth Study of High Bandwidth Memoryon-demand.gputechconf.com/gtc/2018/presentation/s... · DDR4-3200 Module 4ea HBM2 in a single 50mm x 50mm Sip Note: Advil is ... the current](https://reader033.fdocument.pub/reader033/viewer/2022050201/5f54d153fe96756432462895/html5/thumbnails/13.jpg)
12
Next Generation System Architectures Leveraging HBM
HBM and 2.5D integration unlock new system architectures
HPC & Server (B/W & Capacity)
Network & Graphics (B/W)
Client-DT & NB (B/W & Cost)
[Figure: per-segment memory configurations. HBM serves as the bandwidth solution, paired with a capacity solution (Post-DDR4) or a cost solution where the segment calls for it.]
![Page 14: An In-depth Study of High Bandwidth Memoryon-demand.gputechconf.com/gtc/2018/presentation/s... · DDR4-3200 Module 4ea HBM2 in a single 50mm x 50mm Sip Note: Advil is ... the current](https://reader033.fdocument.pub/reader033/viewer/2022050201/5f54d153fe96756432462895/html5/thumbnails/14.jpg)
3
1) Innovative Design
2) Revolutionary Technological Features
3) Next Generation Line-up Considerations
![Page 15: An In-depth Study of High Bandwidth Memoryon-demand.gputechconf.com/gtc/2018/presentation/s... · DDR4-3200 Module 4ea HBM2 in a single 50mm x 50mm Sip Note: Advil is ... the current](https://reader033.fdocument.pub/reader033/viewer/2022050201/5f54d153fe96756432462895/html5/thumbnails/15.jpg)
14
The HBM standard was adopted by the Joint Electron Device Engineering Council (JEDEC) in 2013, and the current 2nd generation HBM in 2016.
The total HBM (+HMC) market is expected to increase from $922.7M in 2018 to $3,842.5M by 2023, a CAGR of 33%. (Source: RESEARCH AND MARKETS)
High bandwidth, high power efficiency and compact form factors have propelled HBM
collaboration engagements covering all IT sectors.
e.g. Graphics, AI/Deep Learning, HPC, SVR, NTW Router/Switches etc.
Did You Know? | Introduction
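The 33% growth figure on this slide is consistent with the standard compound annual growth rate formula applied to the two market sizes over the five years from 2018 to 2023. The sketch below only checks that arithmetic; treating the span as exactly five compounding periods is an assumption.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate: (end / start) ** (1 / years) - 1."""
    return (end_value / start_value) ** (1 / years) - 1


market_2018_musd = 922.7
market_2023_musd = 3842.5

rate = cagr(market_2018_musd, market_2023_musd, years=5)
print(f"CAGR: {rate:.1%}")
```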
![Page 16: An In-depth Study of High Bandwidth Memoryon-demand.gputechconf.com/gtc/2018/presentation/s... · DDR4-3200 Module 4ea HBM2 in a single 50mm x 50mm Sip Note: Advil is ... the current](https://reader033.fdocument.pub/reader033/viewer/2022050201/5f54d153fe96756432462895/html5/thumbnails/16.jpg)
15
HBM KGSD Architecture | Innovative Design
[Figure: cross-section of the KGSD stack. Core dies carrying CH0 to CH7 are grouped into SID0 and SID1 above the base die and connected by TSVs; the package is 11.87mm wide and 0.72mm high.]
11.87x7.75x0.72mm PKG dimension
9Gb per cell array (Optional 1Gb ECC cell)
4/8GB density per mKGSD stack
Max 2.4Gbps data transmission speed enabling 307GB/s B/W performance
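The 307GB/s figure follows from the per-pin speed and the stack's I/O width. The sketch below assumes 1024 I/Os per stack split evenly over the 8 channels (CH0 to CH7) shown for this part; the variable names are illustrative.

```python
PIN_RATE_MBPS = 2400      # max per-pin data rate from the slide (2.4 Gbps)
IO_PER_STACK = 1024       # I/O width of one HBM2 stack
CHANNELS_PER_STACK = 8    # CH0 to CH7

# Peak bandwidth in MB/s: width x per-pin rate / 8 bits per byte.
stack_mb_per_s = IO_PER_STACK * PIN_RATE_MBPS // 8
channel_mb_per_s = stack_mb_per_s // CHANNELS_PER_STACK

print(f"per stack:   {stack_mb_per_s / 1000:.1f} GB/s")
print(f"per channel: {channel_mb_per_s / 1000:.1f} GB/s")
```

The exact result is 307.2GB/s per stack, which the slide rounds down to 307GB/s.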
![Page 17: An In-depth Study of High Bandwidth Memoryon-demand.gputechconf.com/gtc/2018/presentation/s... · DDR4-3200 Module 4ea HBM2 in a single 50mm x 50mm Sip Note: Advil is ... the current](https://reader033.fdocument.pub/reader033/viewer/2022050201/5f54d153fe96756432462895/html5/thumbnails/17.jpg)
16
HBM Gen2 Core Die | Innovative Design
[Figure: core die floorplan, 10.63mm x 6.65mm, with pseudo channels PC0 and PC1 for CH0/1/4/5 and CH2/3/6/7.]
Supports Pseudo CH mode
2 individual sub-CHs of 64-bit I/O, 16 banks
Two seamless array accesses w/ Burst Length 4
256b prefetch per PCH
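The pseudo channel numbers on this slide are related by simple arithmetic: a 64-bit pseudo channel with burst length 4 moves 256 bits per access. The sketch below works through this, assuming 8 channels per stack with 2 pseudo channels each; the constant names are illustrative.

```python
CHANNELS = 8                 # CH0 to CH7 (assumed per-stack total)
PSEUDO_CHANNELS_PER_CH = 2   # 2 individual sub-channels per channel
PC_IO_BITS = 64              # I/O width of one pseudo channel
BURST_LENGTH = 4

# Bits fetched from the array per access on one pseudo channel.
prefetch_bits = PC_IO_BITS * BURST_LENGTH
bytes_per_access = prefetch_bits // 8

# Total I/O width of the stack across all pseudo channels.
total_pseudo_channels = CHANNELS * PSEUDO_CHANNELS_PER_CH
total_io_bits = total_pseudo_channels * PC_IO_BITS

print(prefetch_bits, bytes_per_access, total_pseudo_channels, total_io_bits)
```

The 32-byte access granularity is what a memory controller would see per pseudo channel under these assumptions.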
![Page 18: An In-depth Study of High Bandwidth Memoryon-demand.gputechconf.com/gtc/2018/presentation/s... · DDR4-3200 Module 4ea HBM2 in a single 50mm x 50mm Sip Note: Advil is ... the current](https://reader033.fdocument.pub/reader033/viewer/2022050201/5f54d153fe96756432462895/html5/thumbnails/18.jpg)
11.87mm x 8.87mm
Programmable Memory
Built-In Self Test
Direct Access
IEEE1500
PHY
17
HBM Gen2 Base Die | Innovative Design
![Page 19: An In-depth Study of High Bandwidth Memoryon-demand.gputechconf.com/gtc/2018/presentation/s... · DDR4-3200 Module 4ea HBM2 in a single 50mm x 50mm Sip Note: Advil is ... the current](https://reader033.fdocument.pub/reader033/viewer/2022050201/5f54d153fe96756432462895/html5/thumbnails/19.jpg)
18
Underfill
TSV Formation
Temporary Bond/Debonding
Vertical Chip Stacking
Wafer Molding
PKG Stacking & Interconnection | Revolutionary Technical Features
![Page 20: An In-depth Study of High Bandwidth Memoryon-demand.gputechconf.com/gtc/2018/presentation/s... · DDR4-3200 Module 4ea HBM2 in a single 50mm x 50mm Sip Note: Advil is ... the current](https://reader033.fdocument.pub/reader033/viewer/2022050201/5f54d153fe96756432462895/html5/thumbnails/20.jpg)
19
Wire Bonding vs. Through Silicon Via
PKG Stacking & Interconnection | Revolutionary Technical Features
![Page 21: An In-depth Study of High Bandwidth Memoryon-demand.gputechconf.com/gtc/2018/presentation/s... · DDR4-3200 Module 4ea HBM2 in a single 50mm x 50mm Sip Note: Advil is ... the current](https://reader033.fdocument.pub/reader033/viewer/2022050201/5f54d153fe96756432462895/html5/thumbnails/21.jpg)
20
Wafer & KGSD PKG Level Reliability | Revolutionary Technical Features

| Wafer-level Process Qualification | PKG-level Product Qualification |
|---|---|
| Time Dependent Dielectric Breakdown | EFR, HTOL, LTOL (Lifetime) |
| Hot Carrier Injection | TC, THB, HAST, uHAST, HTS w/ Preconditioning (Environmental) |
| Negative Bias Temp Instability | Electrostatic Discharge |
| Electro Migration | Latch-up |
| Stress Migration | Package Construction Analysis |
| TSV, uBump Electromigration | Electrical Characterization |
![Page 22: An In-depth Study of High Bandwidth Memoryon-demand.gputechconf.com/gtc/2018/presentation/s... · DDR4-3200 Module 4ea HBM2 in a single 50mm x 50mm Sip Note: Advil is ... the current](https://reader033.fdocument.pub/reader033/viewer/2022050201/5f54d153fe96756432462895/html5/thumbnails/22.jpg)
21
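Wafer & KGSD PKG Level Reliability | Revolutionary Technical Features

| Type | Direction | T0.1% Lifetime | Criteria |
|---|---|---|---|
| Core Die | VDD | >> 10 years | ΔR/R0 x 100 > 20%; F(10yrs) < 0.1% @ use condition |
| Core Die | VSS | | |
| Base Die | VDD | | |
| Base Die | VSS | | |
| TSV | VDD | | |
| TSV | VSS | | |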
![Page 23: An In-depth Study of High Bandwidth Memoryon-demand.gputechconf.com/gtc/2018/presentation/s... · DDR4-3200 Module 4ea HBM2 in a single 50mm x 50mm Sip Note: Advil is ... the current](https://reader033.fdocument.pub/reader033/viewer/2022050201/5f54d153fe96756432462895/html5/thumbnails/23.jpg)
22
Wafer & KGSD PKG Level Reliability | Revolutionary Technical Features

Direct Access Bump

| Method | Target |
|---|---|
| Human Body Model | ≥ 2,000V |
| Charged Device Model | ≥ 500V |

PHY Bump

| Method | Target |
|---|---|
| VF-TLP (CDM-like) | It2 ≥ ~ 1.xA |

VF-TLP (CDM-like): 1.25ns
* VF-TLP: Very Fast Transmission Line Pulse
![Page 24: An In-depth Study of High Bandwidth Memoryon-demand.gputechconf.com/gtc/2018/presentation/s... · DDR4-3200 Module 4ea HBM2 in a single 50mm x 50mm Sip Note: Advil is ... the current](https://reader033.fdocument.pub/reader033/viewer/2022050201/5f54d153fe96756432462895/html5/thumbnails/24.jpg)
23
Wafer & KGSD PKG Level Reliability | Revolutionary Technical Features
Core Die Base Die
WFBI
Logic Test
Hot & Cold Test
Repair
KGSD
TSV Scan
Built-In Stress
Hot & Cold Test
Speed Test
KGSD HBM Test Flow
![Page 25: An In-depth Study of High Bandwidth Memoryon-demand.gputechconf.com/gtc/2018/presentation/s... · DDR4-3200 Module 4ea HBM2 in a single 50mm x 50mm Sip Note: Advil is ... the current](https://reader033.fdocument.pub/reader033/viewer/2022050201/5f54d153fe96756432462895/html5/thumbnails/25.jpg)
24
Wafer & KGSD PKG Level Reliability | Revolutionary Technical Features

| Area | Type | Comment |
|---|---|---|
| PHY | Function Test | RD/WT, CL, BL |
| PHY | Margin Test | Speed, VDD, Setup/Hold Timing |
| TSV | Function Test | RD/WT, CL, BL, TSV interface |
| TSV | OS Check | TSV Open/Short Check |
| Logic | Function Test | IEEE1500, Function, BIST, Repair |
| Logic | Margin Test | VDD, Speed, Setup/Hold |
| Core | Function Test | RD/WT, Self Ref, Power Down |
| Core | Margin Test | Speed, VDD, Async, Refresh |
| Core | Repair | Cell Repair |

KGSD HBM Test Coverage
![Page 26: An In-depth Study of High Bandwidth Memoryon-demand.gputechconf.com/gtc/2018/presentation/s... · DDR4-3200 Module 4ea HBM2 in a single 50mm x 50mm Sip Note: Advil is ... the current](https://reader033.fdocument.pub/reader033/viewer/2022050201/5f54d153fe96756432462895/html5/thumbnails/26.jpg)
25
Key Performance Considerations | Next Generation Line-up
Transistor performance differs between the DRAM process and the logic process
(2.8Gbps~3Gbps may be the realistic max speed on a DRAM process)
TSV lines to be doubled to secure a valid data window
Increasing speed worsens power consumption
All possible solutions should be considered for power reduction
Additional HBM cubes
DRAM density and process are limited by SiP size
A higher DRAM stack has to be considered to increase density
Speed
Power
Density
Scaling
![Page 27: An In-depth Study of High Bandwidth Memoryon-demand.gputechconf.com/gtc/2018/presentation/s... · DDR4-3200 Module 4ea HBM2 in a single 50mm x 50mm Sip Note: Advil is ... the current](https://reader033.fdocument.pub/reader033/viewer/2022050201/5f54d153fe96756432462895/html5/thumbnails/27.jpg)
26
Key Performance Considerations | Next Generation Line-up
Cost Effective Solutions
TSVless Si-Interposer: Removing Si to expose BEoL layer (as RDL)
2.1D SiP: Fine pitch organic substrate allows direct interconnection w/o interposer
Fan Out SiP on Sub.: Removing Si-interposer thanks to fine pitch RDL trace of Fan Out Package
[Figures: Logic and HBM side by side on a TSVless Si interposer, on a fine pitch organic substrate, and in a fan out package on an organic substrate.]
High Speed Signal Transmission
Si Photonics in 2.5D SiP: Chip to chip optical signal transmission through embedded waveguide in Si-interposer
Source: CEA-Leti
Low Power and Small Form Factor
Heterogeneous 3D Stack: More chips in a package with TSV stack
[Figure: CPU, ROM, DRAM, SRAM, FLASH, Analog, DSP, RF Chip, MEMS and CMOS Image Sensor dies stacked on a substrate.]
![Page 28: An In-depth Study of High Bandwidth Memoryon-demand.gputechconf.com/gtc/2018/presentation/s... · DDR4-3200 Module 4ea HBM2 in a single 50mm x 50mm Sip Note: Advil is ... the current](https://reader033.fdocument.pub/reader033/viewer/2022050201/5f54d153fe96756432462895/html5/thumbnails/28.jpg)
Thank you
Come visit us at Booth #711 and learn more about SK hynix memory solutions