Ch6 DT Interconnect
8/4/2019 Ch6 DT Interconnect
http://slidepdf.com/reader/full/ch6-dt-interconnect 1/52
Jan M. Rabaey
Low Power Design Essentials ©2008 Chapter 6
Optimizing Power @ Design Time
Interconnect and Clocks
Chapter Outline
Trends and bounds
An OSI approach to interconnect optimization
– Physical layer
– Data link and MAC
– Network
– Application
Clock distribution
ITRS Projections
Calendar Year                        | 2012      | 2018      | 2020
Interconnect One Half Pitch          | 35 nm     | 18 nm     | 14 nm
MOSFET Physical Gate Length          | 14 nm     | 7 nm      | 6 nm
Number of Interconnect Levels        | 12-16     | 14-18     | 14-18
On-Chip Local Clock                  | 20 GHz    | 53 GHz    | 73 GHz
Chip-to-Board Clock                  | 15 GHz    | 56 GHz    | 89 GHz
# of Hi Perf. ASIC Signal I/O Pads   | 2500      | 3100      | 3100
# of Hi Perf. ASIC Power/Ground Pads | 2500      | 3100      | 3100
Supply Voltage                       | 0.7-0.9 V | 0.5-0.7 V | 0.5-0.7 V
Supply Current                       | 283-220 A | 396-283 A | 396-283 A
[Source: ITRS Roadmap, 2004, 2005]
Increasing Impact of Interconnect
Interconnect is now exceeding transistors in
– Latency
– Power dissipation
– Manufacturing complexity
Direct consequence of scaling
Communication Dominant Part of Power Budget
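[Figure: pie charts of the power budget for three chip types. FPGA: Interconnect 65%, Clock 21%, I/O 9%, CLB 5%. μProcessor: Clocks, Caches, Execution Units, Control, I/O Drivers. Signal processor: Clock, Logic, Memory, I/O. Remaining slice values shown: 40%, 20%, 15%, 15%, 10%]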
Idealized Wire Scaling Model
Parameter | Relation | Local Wire | Constant Length | Global Wire
W, H, t   |          | 1/S        | 1/S             | 1/S
L         |          | 1/S        | 1               | 1/S_C
C         | LW/t     | 1/S        | 1               | 1/S_C
R         | L/WH     | S          | S^2             | S^2/S_C
t_p ~ CR  | L^2/Ht   | 1          | S^2             | S^2/S_C^2
E         | CV^2     | 1/(S U^2)  | 1/U^2           | 1/(S_C U^2)
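The scaling table can be turned into a small calculator. The sketch below assumes the usual textbook reading of the symbols, which the slide itself does not spell out: S is the technology (feature-size) scaling factor, S_C the chip-dimension scaling factor, and U the voltage scaling factor. The function name `wire_scaling` is made up for illustration.

```python
def wire_scaling(S, S_C, U):
    """Scaling factors from the idealized wire scaling table.

    S   : technology scaling factor (assumed meaning)
    S_C : chip-dimension scaling factor (assumed meaning)
    U   : voltage scaling factor (assumed meaning)
    """
    return {
        "local": {
            "L": 1 / S, "C": 1 / S, "R": S,
            "t_p": 1.0, "E": 1 / (S * U**2),
        },
        "constant_length": {
            "L": 1.0, "C": 1.0, "R": S**2,
            "t_p": S**2, "E": 1 / U**2,
        },
        "global": {
            "L": 1 / S_C, "C": 1 / S_C, "R": S**2 / S_C,
            "t_p": S**2 / S_C**2, "E": 1 / (S_C * U**2),
        },
    }


if __name__ == "__main__":
    # Illustrative numbers only: S = U = 1.4 and a die whose side grows by 10%.
    for kind, row in wire_scaling(S=1.4, S_C=1 / 1.1, U=1.4).items():
        print(kind, {name: round(value, 3) for name, value in row.items()})
```

In every column the delay entry equals the product of the R and C entries, and the energy entry equals the C entry divided by U^2, which gives a quick consistency check on the table.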
Distribution of Wire Lengths on Chip
[Ref: J. Davis, C&S’98]
© IEEE 1998
Technology Innovations
Reduce dielectric permittivity (e.g. Aerogels or air)
Reduce resistivity (e.g. Copper)
Reduce wire lengths through 3D-integration
Novel interconnect media (carbon nanotubes, optical)
(Pictures courtesy of IBM and IFC FCRP)
© IEEE 1998
Logic Scaling
[Figure: log-log plot of power P [W] versus delay t_p [s], with diagonals of constant energy from 10^-18 J to 10^-6 J. Annotation: P·t_p ~ 1/S^3]
[Ref: J. Davis, Proc’01]
Lower Bounds on Interconnect Energy
Shannon's theorem on the maximum capacity of a communication channel (Claude Shannon):
$$C \le B \log_2\left(1 + \frac{P_S}{kTB}\right)$$
C: capacity in bits/sec
B: bandwidth
$P_S$: average signal power
$$E_{bit} = P_S / C$$
$$E_{bit}(\min) = E_{bit}(C/B \to 0) = kT \ln(2)$$
Valid for an "infinitely long" bit transition (C/B → 0)
Equals about 2.9 × 10^-21 J/bit at room temperature (300 K); kT itself is about 4 × 10^-21 J
[Ref: J. Davis, Proc’01]
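The bound can be checked numerically. The sketch below solves the capacity formula for P_S at capacity, which gives E_bit = kT (2^(C/B) - 1) / (C/B), and compares it with the limit kT ln(2). The function names and the 300 K default are illustrative choices, not taken from the slide.

```python
import math

K_BOLTZMANN = 1.380649e-23  # Boltzmann constant in J/K


def energy_per_bit(c_over_b, temperature=300.0):
    """E_bit = P_S / C for a channel running at capacity.

    From C = B log2(1 + P_S / (kTB)) it follows that
    P_S = kTB (2^(C/B) - 1), hence E_bit = kT (2^(C/B) - 1) / (C/B).
    """
    kt = K_BOLTZMANN * temperature
    return kt * (2.0 ** c_over_b - 1.0) / c_over_b


def min_energy_per_bit(temperature=300.0):
    """Limit of E_bit for C/B -> 0, i.e. kT ln(2)."""
    return K_BOLTZMANN * temperature * math.log(2)


if __name__ == "__main__":
    print(f"kT ln(2) at 300 K: {min_energy_per_bit():.3e} J/bit")
    for ratio in (4.0, 1.0, 0.1, 0.001):
        print(f"C/B = {ratio:<6} E_bit = {energy_per_bit(ratio):.3e} J/bit")
```

The printed values fall toward kT ln(2) as C/B shrinks, which is the sense in which the bound holds only for an "infinitely long" bit transition.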
Reducing Interconnect Power/Energy
Same philosophy as with logic: reduce capacitance, voltage (or voltage swing) and/or activity
A major difference: sending a bit (or bits) from one point to another is fundamentally a communications/networking problem, and it helps to consider it as such.
Abstraction layers are different:
– For computation: device, gate, logic, micro-architecture
– For communication: wire, link, network, transport
Helps to organize along abstraction layers, well understood in the networking world: the OSI protocol stack
OSI Protocol Stack
Reference model for wired and wireless protocol design; also a useful guide for conception and optimization of on-chip communication
Layered approach allows for orthogonalization of concerns and decomposition of constraints
[Figure: OSI stack, top to bottom: Presentation/Application, Session, Transport, Network, Data Link, Physical]
No requirement to implement all layers of the stack
Layered structure need not be maintained in the final implementation
[Ref: M. Sgroi, DAC’01]
The Physical Layer
Transmit bits over physical interconnect medium (wire)
Physical medium
– Material choice, repeater insertion
Signal waveform
– Discrete levels, pulses, modulated sinusoids
Voltages
– Reduced swing
Timing, synchronization
[Figure: OSI stack; this slide covers the Physical layer]
So far, on-chip communication has been almost exclusively "level-based"
Repeater Insertion
Optimal repeater insertion results in wire delay linear with L
$$t_p \propto L \sqrt{(R_d C_d)(r_w c_w)}$$
with $R_d C_d$ and $r_w c_w$ the intrinsic delays of inverter and wire, respectively
But: at a major energy cost!
Repeater Insertion ─ Example
1 cm Cu wire in 90 nm technology (on intermediate layers)
– $r_w$ = 250 Ω/mm; $c_w$ = 200 fF/mm
– $t_p = 0.69\, r_w c_w L^2$ = 3.45 nsec
Optimal driver insertion:
– $t_{p,opt}$ = 0.5 nsec
– Requires insertion of 13 repeaters
– Energy per transition 8 times larger than just charging the wire (6 pJ versus 0.75 pJ)!
It pays to back off!
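The unrepeated delay in this example can be reproduced in a few lines. The sketch below uses the slide's wire parameters and its 0.69 r_w c_w L^2 expression. The second function is an illustration only: it assumes 13 repeaters cut the wire into 14 equal segments and it ignores the repeaters' own delay, so it gives the wire-only part of the repeated delay, not the 0.5 nsec total quoted above.

```python
R_W = 250.0     # wire resistance in ohm/mm (from the slide)
C_W = 200e-15   # wire capacitance in F/mm (from the slide)


def unrepeated_delay(length_mm):
    """Delay of a plain wire, t_p = 0.69 * r_w * c_w * L^2, in seconds."""
    return 0.69 * R_W * C_W * length_mm ** 2


def segmented_wire_delay(length_mm, n_segments):
    """Wire-only delay of n equal segments in series (repeater delay ignored)."""
    return n_segments * unrepeated_delay(length_mm / n_segments)


if __name__ == "__main__":
    print(f"plain 1 cm wire:      {unrepeated_delay(10.0) * 1e9:.2f} ns")
    # Assumption: 13 repeaters split the wire into 14 segments.
    print(f"wire part, 14 pieces: {segmented_wire_delay(10.0, 14) * 1e9:.3f} ns")
```

Because the segment delay is quadratic in segment length, the wire-only delay drops by the number of segments, which is why the total delay becomes linear in L once repeaters are added.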
Wire Energy-Delay Trade-off
[Figure: normalized energy eNorm (0.1 to 1) versus normalized delay dNorm (1 to 8) for an L = 1 cm Cu wire in 90 nm CMOS. Marked on the plot: the point (dMin, eMax), the "wire energy only" level, and the repeater overhead]
Multi-dimensional Optimization
Design parameters: voltage, number of stages, buffer sizes
Voltage scaling has largest impact, followed by selection of number of repeaters
Transistor sizing secondary.
[Figure: number of stages (0 to 12) and V_DD (0.5 V to 1.2 V) plotted against normalized delay dNorm (1 to 8)]
Reduced Swing
$$E_{bit} = C\, V_{DD}\, V_{swing}$$
Concerns:
– Overhead (area, delay)
– Robustness (supply noise, crosstalk, process variations)
– Repeaters?
[Figure: transmitter (TX) and receiver (RX) at the two ends of the wire]
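A quick calculation shows what the expression E_bit = C V_DD V_swing implies. This is a minimal sketch: the 2 pF load corresponds to 1 cm of wire at 200 fF/mm as in the repeater example, while the 1.0 V supply and the list of swings are assumed values chosen only for illustration.

```python
def energy_per_bit(c_wire, v_dd, v_swing):
    """E_bit = C * V_DD * V_swing: charge C * V_swing drawn from a V_DD supply."""
    return c_wire * v_dd * v_swing


if __name__ == "__main__":
    C_WIRE = 2e-12  # farads, 1 cm of wire at 200 fF/mm
    V_DD = 1.0      # volts, assumed supply (not given on the slide)
    full = energy_per_bit(C_WIRE, V_DD, V_DD)
    for swing in (1.0, 0.5, 0.2, 0.1):
        e = energy_per_bit(C_WIRE, V_DD, swing)
        print(f"swing {swing:.1f} V: {e * 1e12:.2f} pJ ({e / full:.0%} of full swing)")
```

With this expression the saving is linear in the swing. It becomes quadratic only if the reduced level is itself supplied from a lower-voltage source, which is one reason the circuit choice matters.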
Traditional Level Converter
Requires two discrete voltage levels
Asynchronous level conversion adds extra delay
[Figure: level-converter schematic; labels: in, OUT, VDDH, VDDL, CL]
[Ref: H. Zhang, TVLSI’00]
Avoiding Extra References
[Figure: driver/receiver schematic using a single supply VDD (signals in, in2, out; load CL; transistors N1-N3 and P1-P3), with VTC and transient plots]
[Ref: H. Zhang, TVLSI’00]
Differential (Clocked) Signaling
Allows for very low swings (200 mV)
Robust
Quadratic energy savings
But: doubling the wiring, extra clock signal, complexity
[Figure: clocked differential driver and receiver; labels: in, REF, VDD, clk, d, d_b, out, out_b, CL]
[Ref: T. Burd, UCB’01]
Lower Bound on Signal Swing?
Reduction of signal swing translates into higher power dissipation in receiver: trade-off between wire and receiver energy dissipation
Reduced SNR impacts reliability: current on-chip interconnect strategies require Bit Error Rate (BER) of zero (in contrast to communication and network links)
– Noise source: power supply noise, crosstalk
Swings as low as 200 mV have been reported [Ref: Burd’00], 100 mV definitely possible
Further reduction requires crosstalk suppression
[Figure: crosstalk suppression by shielding (GND wires) and by folding]
Quasi-Adiabatic Charging
[Figure: staircase waveform of V versus t, rising to V_DD in steps of V_DD/N; tank capacitors CT1, CT2, …, CTN-1]
[Ref: L. Svensson, ISLPED’96]
• Uses stepwise approximation of adiabatic (dis)charging
• Capacitors acting as "charge reservoir"
• Energy drawn from supply reduced by factor N
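The factor-N claim follows from adding up the loss of each small step. The sketch below assumes ideal tank capacitors that hold their levels at multiples of V_DD/N and steady-state operation, so that over a full cycle the tanks exchange no net charge and everything dissipated is ultimately drawn from the supply. The function name and the 1 pF, 1 V values are illustrative.

```python
def stepwise_cycle_energy(c_load, v_dd, n_steps):
    """Energy dissipated in one charge/discharge cycle using N equal steps.

    Each step moves the load by V_DD/N through a resistive switch and
    dissipates 0.5 * C * (V_DD/N)^2; a cycle has N steps up and N steps down.
    """
    step = v_dd / n_steps
    per_step = 0.5 * c_load * step ** 2
    return 2 * n_steps * per_step


if __name__ == "__main__":
    C_LOAD, V_DD = 1e-12, 1.0  # illustrative values
    conventional = stepwise_cycle_energy(C_LOAD, V_DD, 1)  # equals C * V_DD^2
    for n in (1, 2, 4, 8):
        e = stepwise_cycle_energy(C_LOAD, V_DD, n)
        print(f"N = {n}: {e * 1e12:.3f} pJ per cycle, reduction {conventional / e:.0f}x")
```

The result is C V_DD^2 / N per cycle, compared with C V_DD^2 for conventional single-step charging.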
Charge Redistribution Schemes
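[Figure: stacked differential bus pairs (B0 and B1 with their complements) between V_DD and GND, with intermediate levels 3V_DD/4, V_DD/2 and V_DD/4; precharge (P) and evaluate (E) switches; receivers RX0 and RX1; waveforms over Precharge, Eval, Precharge phases for B0 = 0, B1 = 1]
Charge recycled from top to bottom
Precharge phase equalizes differential lines
$$\text{Energy/bit} = 2C\,(V_{DD}/N)^2$$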
Challenges: Receiver design, noise margins
[Ref: H. Yamauchi, JSSC’95]
Alternative Communication Schemes
Example: Capacitively-driven wires
Offers some compelling advantages
– Reduced swing: swing is V_DD/(n+1) without extra supply
– Reduced load: allows for smaller driver
– Reduced delay: capacitor pre-emphasizes edges
Pitchfork capacitors exploit sidewall capacitance
[Ref: D. Hopkins, ISSCC’07]
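The V_DD/(n+1) swing and the reduced load both follow from a capacitive divider. The sketch below assumes, since the slide does not define n, that the wire capacitance is n times the series coupling capacitance. Function names and component values are hypothetical.

```python
def wire_swing(v_dd, c_couple, c_wire):
    """Swing on the wire when a full-swing driver couples in through c_couple."""
    return v_dd * c_couple / (c_couple + c_wire)


def driver_load(c_couple, c_wire):
    """Capacitance seen by the driver: c_couple in series with c_wire."""
    return c_couple * c_wire / (c_couple + c_wire)


if __name__ == "__main__":
    V_DD = 1.0        # volts, assumed
    C_COUPLE = 50e-15  # farads, assumed coupling capacitor
    for n in (1, 4, 9):
        c_wire = n * C_COUPLE  # assumption: wire capacitance is n times the coupling cap
        print(f"n = {n}: swing {wire_swing(V_DD, C_COUPLE, c_wire):.2f} V, "
              f"driver load {driver_load(C_COUPLE, c_wire) * 1e15:.1f} fF "
              f"(wire alone {c_wire * 1e15:.0f} fF)")
```

Under that assumption the driver sees the wire capacitance divided by (n+1), which is consistent with the "smaller driver" point above.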
Signaling Protocols
Globally asynchronous: self-timed handshaking protocol
Allows individual modules to dynamically trade off performance for energy efficiency
[Figure: processor module (μProc, ALU, MPY, SRAM…) attached to the network through din/reqin/ackin and dout/reqout/ackout; internal signals Din, REQin, done]
Signaling Protocols
Globally asynchronous, locally synchronous
[Figure: a physical layer interface module sits between the network (din, reqin, ackin, dout, reqout, ackout) and the processor module (μProc, ALU, MPY, SRAM…), which it serves through din, dout and clk; internal signals Din, REQin, Clk, done]
The Data Link /Media Access Layer
Reliable transmission over physical link and sharing interconnect medium between multiple sources and destinations (MAC)
Bundling, serialization, packetizing
Error detection and correction
Coding
Multiple-access schemes
[Figure: OSI stack; this slide covers the Data Link layer]
Coding
[Figure: TX → Encoder → link of N + k wires → Decoder → RX; N bits in, N bits out]
Adding redundancy to communication link (extra bits) to:
– Reduce transitions (activity encoding)
– Reduce energy/bit (error-correcting coding)
Activity Reduction Through Coding
[Ref: M. Stan, TVLSI’95]
[Figure: D (N bits) → Encoder → bus of N + 1 wires (Denc plus invert bit p) → Decoder → D (N bits)]
Example: Bus-Invert Coding
Invert bit p
Data word D inverted if Hamming distance from previous is larger than N/2.

D        | #T | Denc     | p | #T
00101010 | -  | 00101010 | 0 | -
00111011 | 2  | 00111011 | 0 | 2
11010100 | 7  | 00101011 | 1 | 1+1
00001101 | 5  | 00001101 | 0 | 3+1
01110110 | 6  | 10001001 | 1 | 2+1
…
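The example words can be run through a short encoder to confirm the table. This is a minimal sketch of the rule as stated on the slide (invert when the Hamming distance to the previous bus value exceeds N/2); the function names are illustrative and the comparison is made on the data wires only.

```python
def hamming_distance(a, b):
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


def bus_invert_encode(words, n_bits):
    """Return a list of (encoded_word, invert_bit) pairs."""
    mask = (1 << n_bits) - 1
    encoded = []
    prev = None  # value currently on the data wires
    for d in words:
        if prev is not None and hamming_distance(d, prev) > n_bits / 2:
            enc, p = (~d) & mask, 1  # send the complement and raise p
        else:
            enc, p = d, 0
        encoded.append((enc, p))
        prev = enc
    return encoded


def bus_invert_decode(encoded, n_bits):
    """Undo the inversion wherever the invert bit is set."""
    mask = (1 << n_bits) - 1
    return [(~enc) & mask if p else enc for enc, p in encoded]


def count_transitions(encoded):
    """Total wire toggles, counting the data wires and the invert wire."""
    return sum(hamming_distance(e1, e0) + (p1 != p0)
               for (e0, p0), (e1, p1) in zip(encoded, encoded[1:]))


if __name__ == "__main__":
    data = [0b00101010, 0b00111011, 0b11010100, 0b00001101, 0b01110110]
    enc = bus_invert_encode(data, 8)
    for d, (e, p) in zip(data, enc):
        print(f"{d:08b} -> {e:08b} p={p}")
    plain = sum(hamming_distance(a, b) for a, b in zip(data, data[1:]))
    print("transitions without coding:", plain)
    print("transitions with coding:   ", count_transitions(enc))
```

For these five words the transition count drops from 20 to 11, including the toggles of the invert wire itself.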
Bus-Invert Coding
Gain: 25% (at best, for random data)
Overhead:
– Extra wire (and activity)
– Encoder, decoder
Not effective for correlated data
[Figure: encoder and decoder around the bus; labels: D, Reg, LP, Encode, Denc, p, Bus, Decode, D]
[Ref: M. Stan, TVLSI’95]
Other Transition Coding Schemes
Advanced bus-invert coding (e.g. partition bus into sub-components) (e.g. [M. Stan, TVLSI’97])
Coding for address busses (which often display sequentiality) (e.g. [L. Benini, DATE’98])
Full-fledged channel coding, borrowed from communication links (e.g. [S. Ramprasad, TVLSI’99])
Coding to reduce impact of Miller capacitance between neighboring wires [Ref: Sotiriadis, ASPDAC’01]
Maximum capacitance transition can be avoided by coding

bit k-1 | bit k | bit k+1 | Delay factor g
↑       | ↑     | ↑       | 1
↑       | ↑     | −       | 1 + r
↑       | ↑     | ↓       | 1 + 2r
−       | ↑     | −       | 1 + 2r
−       | ↑     | ↓       | 1 + 3r
↓       | ↑     | ↓       | 1 + 4r
(↑ rising, ↓ falling, − no transition)
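The delay-factor table follows one pattern: each neighbor adds nothing when it switches in the same direction as bit k, r when it is quiet, and 2r when it switches in the opposite direction. The sketch below encodes that reading; the +1/0/-1 encoding of transitions is an illustrative choice, and r is taken, as an assumption, to be the ratio of inter-wire to ground capacitance.

```python
def delay_factor(left, mid, right, r):
    """Delay factor g for a switching wire with two neighbors.

    left, mid, right: +1 for rising, -1 for falling, 0 for no transition.
    r: ratio of inter-wire to ground capacitance (assumed meaning).
    """
    if mid == 0:
        raise ValueError("the middle wire must be switching")
    g = 1.0
    for neighbor in (left, right):
        # |mid - neighbor| is 0 (same direction), 1 (quiet) or 2 (opposite)
        g += r * abs(mid - neighbor)
    return g


if __name__ == "__main__":
    r = 0.5  # illustrative value
    patterns = [(1, 1, 1), (1, 1, 0), (1, 1, -1), (0, 1, 0), (0, 1, -1), (-1, 1, -1)]
    for left, mid, right in patterns:
        print((left, mid, right), delay_factor(left, mid, right, r))
```

A code that forbids both neighbors switching against bit k caps the factor at 1 + 3r instead of 1 + 4r.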
Error-Correcting Codes
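[Figure: D (N bits) → Encoder → Denc on N + k wires → Decoder → D (N bits)]
Adding redundancy allows for more aggressive scaling of signal swings and/or timing
Simpler codes such as Hamming prove most effective
Example: (4,3,1) Hamming Code
Code word: $P_1 P_2 B_3 P_4 B_5 B_6 B_7$, with
$$P_1 \oplus B_3 \oplus B_5 \oplus B_7 = 0$$
$$P_2 \oplus B_3 \oplus B_6 \oplus B_7 = 0$$
$$P_4 \oplus B_5 \oplus B_6 \oplus B_7 = 0$$
e.g. if the three checks evaluate to 1, 1, 0 (first check is the least-significant bit), the syndrome is 3 and $B_3$ is wrong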
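The parity checks above are enough to build a working single-error corrector. The following is a minimal sketch with illustrative function names; it uses the standard form of the third check, over positions 4, 5, 6 and 7.

```python
def hamming_encode(b3, b5, b6, b7):
    """Return the 7-bit code word [P1, P2, B3, P4, B5, B6, B7]."""
    p1 = b3 ^ b5 ^ b7
    p2 = b3 ^ b6 ^ b7
    p4 = b5 ^ b6 ^ b7
    return [p1, p2, b3, p4, b5, b6, b7]


def hamming_syndrome(word):
    """Position (1..7) of a single flipped bit, or 0 if all checks pass."""
    w = [0] + list(word)  # pad so that w[i] is the bit at position i
    s1 = w[1] ^ w[3] ^ w[5] ^ w[7]
    s2 = w[2] ^ w[3] ^ w[6] ^ w[7]
    s4 = w[4] ^ w[5] ^ w[6] ^ w[7]
    return s1 + 2 * s2 + 4 * s4


def hamming_correct(word):
    """Return (corrected_word, error_position)."""
    position = hamming_syndrome(word)
    fixed = list(word)
    if position:
        fixed[position - 1] ^= 1
    return fixed, position


if __name__ == "__main__":
    sent = hamming_encode(1, 0, 1, 1)
    received = list(sent)
    received[2] ^= 1  # corrupt B3 (position 3)
    fixed, position = hamming_correct(received)
    print("sent     ", sent)
    print("received ", received)
    print("error at position", position, "-> corrected", fixed)
```

Flipping B3 makes the first two checks fail and the third pass, which is the 1, 1, 0 pattern of the slide's example.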
Media Access
Sharing of physical media over multiple data streams increases capacitance and activity (see Chapter 5), but reduces area
Many multi-access schemes known from communications
– Time domain: Time-Division Multiple Access (TDMA)
– Frequency domain: narrow band, code division multiplexing
Buses based on arbitration-based TDMA most common in today’s ICs
Bus Protocols and Energy
Some lessons from the communications world:
– When utilization is low, simple schemes are more effective
– When traffic is intense, reservation of resources minimizes overhead and latency (collisions, resends)
Combining the two leads to energy efficiency
Example: SiliconBackplane MicroNetwork [Courtesy: Sonics, Inc]
Independent arbitration for every cycle includes two phases:
- Distributed TDMA for guaranteed latency/bandwidth
- Round robin for random access
[Figure: bus timing showing the Current Slot, with Arbitration followed by Command]
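The two-phase idea can be illustrated with a toy arbiter. This is a hypothetical sketch of the general scheme described above, not the Sonics implementation: it assumes that each cycle has an optional TDMA slot owner who wins if it is requesting, and that an unused slot is handed to the other requesters in round-robin order. All names are invented for illustration.

```python
def arbitrate(slot_owner, requests, rr_pointer):
    """Pick the bus winner for one cycle.

    slot_owner: index of the module that owns this TDMA slot, or None
    requests:   list of booleans, requests[i] is True if module i wants the bus
    rr_pointer: index at which the round-robin search starts
    Returns (winner or None, updated rr_pointer).
    """
    # Phase 1: the slot owner gets its reserved bandwidth if it asks for it.
    if slot_owner is not None and requests[slot_owner]:
        return slot_owner, rr_pointer
    # Phase 2: otherwise the slot is given away in round-robin order.
    n = len(requests)
    for offset in range(n):
        candidate = (rr_pointer + offset) % n
        if requests[candidate]:
            return candidate, (candidate + 1) % n
    return None, rr_pointer


if __name__ == "__main__":
    slot_table = [0, 1, None, 0]  # hypothetical TDMA schedule
    pointer = 0
    traffic = [[True, True, False], [False, True, True],
               [False, False, True], [False, True, True]]
    for cycle, requests in enumerate(traffic):
        owner = slot_table[cycle % len(slot_table)]
        winner, pointer = arbitrate(owner, requests, pointer)
        print(f"cycle {cycle}: slot owner {owner}, winner {winner}")
```

Reserved slots bound the latency of critical traffic, while the round-robin fallback keeps lightly used slots from going to waste.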
The Network Layer
Topology-independent end-to-end communication over multiple data links (routing, bridging, repeaters)
Topology
Static versus dynamic configuration / routing
[Figure: OSI stack; this slide covers the Network layer]
Becoming more important in today’s complex multi-processor designs: "The Network-on-a-Chip (NoC)"
[Ref: G. De Micheli, Morgan Kaufmann’06]
Network-on-a-Chip (NoC)
Dedicated networks with reserved links preferable for high-traffic channels, but: limited connectivity, area overhead
Flexibility an increasing requirement in multi-core (many-core) chip implementations
The Network Trade-offs
Interconnect-oriented architecture trades off flexibility, latency, energy and area-efficiency through the following concepts
Locality - eliminate global structures
Hierarchy - expose locality in communication requirements
Concurrency/Multiplexing
Very similar to architectural space trade-offs
[Figure: dedicated wiring versus Network-on-a-Chip; labels: Proc, Local Logic, Router, Network Wires]
[Courtesy: B. Dally, Stanford]
Networking Topology
Homogeneous
– Crossbar, Butterfly, Torus, Mesh, Tree, …
Heterogeneous
– Hierarchy
[Figure: example topologies: Crossbar, Mesh (FPGA), Tree]
Network Topology Exploration
[Figure: two plots of Energy x Delay versus Manhattan distance; the first compares Mesh and Binary Tree, the second adds Mesh + Inverse]
Short connections in tree are redundant
Inverse clustering complements mesh
[Ref: V. George, Springer’01]
Circuit-Switched versus Packet Based
On-chip reality: wires (bandwidth) are relatively cheap, buffering and routing expensive
Packet-switched approach versatile
– Preferred approach in large networks
– But … routers come with large overhead
– Case study Intel: 18% of power in link, 82% in router
Circuit-switched approach attractive for high-data-rate quasi-static links
Hierarchical combination often preferred choice
[Figure: groups of components (C) on local buses, with routers (R) between groups. Captions: bus to connect over short distances; hierarchical circuit- and packet-switched networks for longer connections]
Example: The Pleiades Network-on-a-Chip
• Configurable platform for low-energy communication and signal-processing applications (see Chapter 5)
• Allows for dynamic task-level reconfiguration of process network
Energy-efficient flexible network essential to the concept
[Figure: Pleiades block diagram; labels: μP, Configuration Bus, Configuration, Arithmetic Module (three), Configurable Logic (two), Dedicated Arithmetic, Network Interface, Configurable Interconnect]
[Ref: H. Zhang, JSSC’00]
Pleiades Network Layer
Hierarchical reconfigurable mesh network
[Figure: clusters connected by a Level-1 mesh and a Level-2 mesh; labels: Cluster, Universal Switchbox, Hierarchical Switchbox]
• Network statically configured at start of session and ripped up at end
• Structured approach reduces interconnect energy by a factor of 7 compared with a straightforward crossbar
Top Layers of the OSI Stack
Abstracts communication architecture to system and performs data formatting and conversion
Establishes and maintains end-to-end communications
– flow control, message reordering, packet segmentation and reassembly
[Figure: OSI stack; this slide covers the layers above Network]
Example: establish, maintain and rip up connections in dynamically reconfigurable Systems-on-a-Chip
– Important in power management
What About Clock Distribution?
Clock easily the most energy-consuming signal of a chip
– Largest length
– Largest fanout
– Most activity (α = 1)
Skew control adding major overhead
– Intermediate clock repeaters
– De-skewing elements
Opportunities
– Reduced swing
– Alternative clock distribution schemes
– Avoiding a global clock altogether
Reduced-Swing Clock Distribution
Similar to reduced-swing interconnect
Relatively easy to implement
But: extra delay in flip-flops adds directly to clock period
Example: half-swing clock distribution scheme
[Figure: clock waveforms between GND and VDD for the NMOS clock and PMOS clock, shown for a regular 2-phase clock and for the half-swing clock]
[Ref: H. Kojima, JSSC’95]
© IEEE 1995
Alternative Clock Distribution Schemes
Canceling skew in perfect transmission line scenario
Example: Transmission-Line Based Clock Distribution
[Ref: V. Prodanov, CICC’06]
© IEEE 2006
Summary
Interconnect important component of overall power dissipation
Structured approach with exploration at different abstraction layers most effective
Lot to be learned from communications and networking community, yet techniques must be applied judiciously
– Cost relationship between active and passive components different
Some exciting possibilities for the future: 3D-integration, novel interconnect materials, optical or wireless I/O
References
Books and Book Chapters
T. Burd, "Energy-Efficient Processor System Design," http://bwrc.eecs.berkeley.edu/Publications/2001/THESES/energ_eff_process-sys_des/index.htm, UCB, 2001.
G. De Micheli and L. Benini, "Networks on Chips: Technology and Tools," Morgan Kaufmann, 2006.
V. George and J. Rabaey, "Low-energy FPGAs: Architecture and Design," Springer, 2001.
J. Rabaey, A. Chandrakasan, B. Nikolic, "Digital Integrated Circuits: A Design Perspective," 2nd ed., Prentice Hall, 2003.
C. Svensson, "Low-Power and Low-Voltage Communication for SoC’s," in C. Piguet, Low-Power Electronics Design, Ch. 14, CRC Press, 2005.
L. Svensson, "Adiabatic and Clock-Powered Circuits," in C. Piguet, Low-Power Electronics Design, Ch. 15, CRC Press, 2005.
G. Yeap, "Special Techniques," in Practical Low Power Digital VLSI Design, Ch. 6, Kluwer Academic Publishers, 1998.
Articles
L. Benini et al., "Address bus encoding techniques for system-level power optimization," Proceedings DATE’98, pp. 861-867, Paris, February 1998.
T. Burd et al., "A Dynamic Voltage Scaled Microprocessor System," IEEE ISSCC Digest of Technical Papers, pp. 294-295, Feb. 2000.
M. Chang et al., "CMP Network-on-Chip Overlaid with Multi-Band RF Interconnect," International Symposium on High-Performance Computer Architecture, Feb. 2008.
D.M. Chapiro, "Globally Asynchronous Locally Synchronous Systems," PhD thesis, Stanford University, 1984.
W. Dally, "Route Packets, Not Wires: On-Chip Interconnect Networks," Proceedings DAC 2001, pp. 684-689, Las Vegas, June 2001.
J. Davis and J. Meindl, "Is Interconnect the Weak Link?," IEEE Circuits and Systems Magazine, pp. 30-36, March 1998.
J. Davis et al., "Interconnect Limits on Gigascale Integration (GSI) in the 21st Century," Proceedings of the IEEE, Vol. 89, No. 3, pp. 305-324, March 2001.
D. Hopkins et al., "Circuit techniques to enable 430Gb/s/mm2 proximity communication," IEEE International Solid-State Circuits Conference, vol. XL, pp. 368-369, February 2007.
H. Kojima et al., "Half-Swing Clocking Scheme for 75% Power Saving in Clocking Circuitry," Journal of Solid-State Circuits, vol. 30, no. 4, pp. 432-435, April 1995.
E. Kusse and J. Rabaey, "Low-energy embedded FPGA structures," Proceedings ISLPED’98, pp. 155-160, Monterey, Aug. 1998.
V. Prodanov and M. Banu, "GHz Serial Passive Clock Distribution in VLSI using Bidirectional Signaling," Proceedings CICC 06.
S. Ramprasad et al., "A coding framework for low-power address and data busses," IEEE Transactions on VLSI Signal Processing, Vol. 7, No. 2, pp. 212-221, June 1999.
M. Sgroi et al., "Addressing the System-on-a-Chip Woes Through Communication-Based Design," Proceedings DAC 2001, pp. 678-683, Las Vegas, June 2001.
P. Sotiriadis and A. Chandrakasan, "Reducing Bus Delay in Submicron Technology Using Coding," Proceedings ASPDAC Conference, Yokohama, January 2001.
M. Stan and W. Burleson, "Bus-Invert Coding for Low-Power I/O," IEEE Transactions on VLSI, pp. 48-58, March 1995.
M. Stan and W. Burleson, "Low-Power Encodings for Global Communication in CMOS VLSI," IEEE Transactions on VLSI Systems, pp. 444-455, Dec. 1997.
V. Sathe, J.-Y. Chueh, and M. C. Papaefthymiou, "Energy-Efficient GHz-Class Charge-Recovery Logic," IEEE JSSC, vol. 42, no. 1, pp. 38-47, January 2007.
L. Svensson et al., "A sub-CV2 pad Driver with 10 ns Transition Time," Proc. ISLPED 96, Monterey, Aug. 12-14, 1996.
D. Wingard, "Micronetwork-Based Integration for SOCs," Proceedings DAC 01, pp. 673-677, Las Vegas, June 2001.
H. Yamauchi et al., "An Asymptotically Zero Power Charge Recycling Bus," IEEE Journal of Solid-State Circuits, vol. 30, no. 4, pp. 423-431, April 1995.
H. Zhang, V. George and J. Rabaey, "Low-Swing on-chip Signaling Techniques: Effectiveness and Robustness," IEEE Transactions on VLSI Systems, Vol. 8, No. 3, pp. 264-272, June 2000.
H. Zhang et al., "A 1V Heterogeneous Reconfigurable Processor IC for Baseband Wireless Applications," IEEE Journal of Solid-State Circuits, vol. 35, no. 11, pp. 1697-1704, Nov. 2000.