Digital System Clocking Compact
High-Speed Digital CMOS Circuits, Stephan Henzler, Technische Universität München, Summer Term 2012
Digital System Clocking
Outline: Digital System Clocking
• Timing classes in digital systems
  – Synchronous timing
  – Mesochronous timing
  – Plesiochronous timing
  – Asynchronous timing
  – Timing constraints in synchronous systems
• Clock manipulation
  – Clock dividers
  – Clock splitter, edge shifters, etc.
• Clock distribution networks
• Clock generation → next chapter on phase-locked loops
Timing Classification of Digital Systems: Synchronously Timed Systems
• A signal x is synchronous to the clock
  – if the rate of potential signal transitions is equal to the clock frequency
  – and if the time relation between the clock and the potential signal transitions, i.e. the phase, is fixed (this does not necessarily mean that the frequency is the same)
Clock Uncertainty
• Clock skew: spatial but time-invariant variation of the clock arrival times
  – Positive clock skew: the clock arrives at the receiving register first
  – Negative clock skew: the clock arrives at the sending register first
  – Caused by design asymmetries, loading asymmetry, device and interconnect variations, temperature gradients, …
• Clock jitter: temporal variation of the clock arrival times, with a spatially correlated dependency
  – Mean-free random process (with pseudorandom contributors)
  – Caused by physical noise, supply noise, crosstalk, PLL transients, …
Setup Time Constraint in Sync Logic
• Maximum delay requirement (setup-time check)
  – Data arrives early enough at the receiving flip-flop (at least one setup time before the sampling edge)
  – Checking with respect to two subsequent clock edges
• Worst-case timing check
  – Assume that everything along the green path is slow
  – Assume that everything along the red path is fast
• Green clock edge launches data
• Red clock edge samples data
[Figure annotation: branching point; timing check starts here]
Hold Time Constraint in Sync Logic
• Minimum delay requirement (hold-time check)
  – New data must not reach the receiving flip-flop before the current data is sampled
  – Checking with respect to the same clock edge
• Worst-case timing check
  – Assume that everything along the green path is fast
  – Assume that everything along the red path is slow
• Red clock edge samples data
• Green clock edge launches NEW data
[Figure annotation: branching point; timing check starts here]
Maximum Logic Delay in Synchronous Logic
[Figure annotation: sequencing overhead]
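The equation on this slide did not survive extraction. As a rough sketch, the usual textbook form of the setup constraint is assumed here: the clock period must cover clock-to-Q delay, logic delay, setup time, and clock uncertainty. The function name, parameter names and numbers are illustrative assumptions, not values from the lecture. The skew term follows the slide set's sign convention, where positive skew means the clock reaches the receiving register first and therefore shortens the usable cycle.

```python
def max_logic_delay(t_clk, t_cq, t_setup, skew_recv_early=0.0, jitter=0.0):
    """Largest combinational delay that still meets setup (all times in ps).

    t_cq + t_setup is the sequencing overhead; skew_recv_early is the time by
    which the sampling clock edge arrives early at the receiving register.
    """
    return t_clk - t_cq - t_setup - skew_recv_early - jitter


# Illustrative numbers only (assumed, not taken from the slides)
budget = max_logic_delay(t_clk=1000.0, t_cq=60.0, t_setup=40.0,
                         skew_recv_early=30.0, jitter=20.0)
print(budget)  # 850.0
```

With skew_recv_early negative (receiver clocked later), the budget grows, which is the time-borrowing effect discussed under intentional skew.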
Minimum Logic Delay in Synchronous Logic
[Figure annotation: this constraint makes hold-time fixing necessary]
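The formula for this slide is also missing from the transcript. The sketch below assumes the common textbook form of the hold constraint: the fastest clock-to-Q delay plus the fastest logic delay must exceed the hold time plus any lateness of the receiving clock. All names and numbers are hypothetical. It also estimates how many buffers a hold fix would insert into a short path.

```python
import math


def min_logic_delay(t_hold, t_cq_min, skew_recv_late=0.0, uncertainty=0.0):
    """Smallest combinational delay that avoids a race (all times in ps)."""
    return max(0.0, t_hold + skew_recv_late + uncertainty - t_cq_min)


def buffers_needed(path_delay, required_delay, t_buf):
    """Number of buffers of delay t_buf to pad a too-short path."""
    deficit = required_delay - path_delay
    return max(0, math.ceil(deficit / t_buf))


# Illustrative numbers only (assumed, not taken from the slides)
required = min_logic_delay(t_hold=30.0, t_cq_min=40.0,
                           skew_recv_late=50.0, uncertainty=10.0)
print(required)                                # 50.0
print(buffers_needed(10.0, required, 15.0))    # 3
```

Note that the clock period does not appear anywhere in the check, which is why lowering the frequency cannot repair a hold violation.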
Discussion of Synchronous Timing
• Maximum delay constraint sets the upper frequency limit
  – If setup violations occur, the circuit is still functional (in principle) at a reduced clock frequency
  – Jitter is mean-free but increases or decreases the clock period temporarily → jitter always reduces the maximum frequency
  – Skew is time-invariant, so it may increase or decrease the max. frequency
Discussion of Synchronous Timing
• Minimum delay constraint sets the lower limit for the combinational delay between two flip-flops
  – Describes a race between the clock and the data signals
  – Hold time violations are caused by combinational delay, jitter and skew, so they cannot be fixed by altering the clock frequency → catastrophic defect, to be avoided under all circumstances
  – Skew can intensify (negative skew) or relax (positive skew) the hold time problem
  – Hold time fixing: insert buffers in very short paths
Intentional Skew
• Minimize clock uncertainty in terms of unidentified skew to minimize the risk of races and to maximize circuit speed
• Clock skew can be inserted intentionally by design. Attention: verify the circuit carefully
• Intentional negative skew:
  – Receiving flip-flop samples data later → clock cycle prolonged (time borrowing)
  – Increased risk of races, i.e. hold time violations
  – Resynchronization required to enter original clock domain
• Intentional positive skew:
  – Receiving flip-flop samples data before sending flip-flop issues new value → clock cycle reduced
  – Decreased risk of races, i.e. hold time violations (good for robustness of highly pipelined paths)
Clock Routing in High-Performance Pipelines
• Same direction (clock routed with the data flow): maximum speed, risky w.r.t. races
• Opposite direction (clock routed against the data flow): robust w.r.t. races, reduced speed
Non-Synchronous Timing and Synchronization
Need for Synchronization
• Consider a signal transmitted by a sending flip-flop FFsend and received by a receiving flip-flop FFrec
• If the clocks of FFsend and FFrec are different, or if the delay between the two flip-flops is unknown or varying, metastability will occur in the receiving flip-flop (maybe not in every cycle, but certainly with some probability)
  → wrong data sampled by FFrec
[Figure: axis label "clock-to-output"]
Synchronization
• Required if signals cross clock domains
• Clock domain crossing may have different reasons: clocks of sending and receiving flip-flop are not synchronous
• Problem is metastability of latches:
  – Not synchronous → setup/hold time constraint will be violated sooner or later
  – Violation of setup/hold constraint means
    • Propagation delay through receiver flip-flop becomes unpredictable
    • Sampled value becomes unpredictable
    • For busses even worse: some flip-flops may sample the correct value, others the incorrect value → result is complete nonsense
Basic Waiting Synchronizer
• Simple circuit, especially for asynchronous inputs: the second flip-flop waits for the first flip-flop to settle
• Additional latency to resolve metastability (still not perfectly safe, but the error probability is dramatically reduced; choose the wait time according to safety requirements)
• Suitable especially for slowly varying signals
• Don't use for synchronization of busses
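To make "choose the wait time according to safety requirements" concrete, the sketch below uses the widely quoted exponential metastability model, MTBF = exp(t_resolve / tau) / (T_w · f_clk · f_data). The model is a standard approximation and not taken from the slides; tau, the metastability window T_w and all numbers are hypothetical.

```python
import math


def synchronizer_mtbf(t_resolve, tau, t_window, f_clk, f_data):
    """Mean time between synchronization failures in seconds.

    t_resolve: time the first flip-flop is given to settle
    tau:       regeneration time constant of the flip-flop (assumed)
    t_window:  metastability window around the clock edge (assumed)
    """
    return math.exp(t_resolve / tau) / (t_window * f_clk * f_data)


# Hypothetical parameters, chosen only for illustration
params = dict(tau=20e-12, t_window=20e-12, f_clk=1e9, f_data=100e6)
one_cycle = synchronizer_mtbf(t_resolve=0.9e-9, **params)
two_cycles = synchronizer_mtbf(t_resolve=1.9e-9, **params)
print(f"{one_cycle:.3g} s -> {two_cycles:.3g} s")
```

Each additional clock period of waiting multiplies the MTBF by exp(T_clk / tau), which is why one or two extra synchronizer stages are usually enough.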
Clock Domain Crossing of Busses
• Handshaking: safe sampling of busses when data is valid
• Various protocols
• Handshaking reduces data rate
Synchronization with Gray Coding
• Bus crossing the clock domain is coded such that only one bit changes at a time
  – Only one out of N signals may suffer from metastability
  – Result is either new or old bus value, but not completely random
• Especially for slowly varying signals
• Feasible for counters or simple states
• Interfacing via simple waiting synchronizers possible
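A minimal sketch of the coding itself, using the standard reflected binary Gray code (the slides do not say which Gray code variant is meant, so this is an assumption). The check at the end confirms the property the slide relies on: consecutive counter values differ in exactly one bit, including the wrap-around.

```python
def binary_to_gray(n: int) -> int:
    """Reflected binary Gray code of a non-negative integer."""
    return n ^ (n >> 1)


def gray_to_binary(g: int) -> int:
    """Inverse of binary_to_gray: XOR of all right-shifted copies of g."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


def bits_changed(a: int, b: int) -> int:
    """Number of bit positions in which two code words differ."""
    return bin(a ^ b).count("1")


# A 4-bit counter crossing the domain: every step flips exactly one wire
steps = [bits_changed(binary_to_gray(i), binary_to_gray((i + 1) % 16))
         for i in range(16)]
print(steps)
```

Because only one wire changes per step, a metastable sample resolves to either the old or the new counter value, never to an unrelated one.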
Synchronization via Shared Memory
• For (bidirectional) exchange of large data volumes, shared memory accessible from both clock domains is advantageous
• Dual-port SRAMs are especially advantageous (moderate speed)
• FIFO (first-in-first-out buffer) is a special implementation
Timing Classification of Digital Systems: Mesochronous Systems
• A signal x is mesochronous to the clock
  – if the rate of potential signal transitions is equal to the clock frequency
  – and if the time relation between the clock and the potential signal transitions, i.e. the phase, is constant but unknown
• Synchronization required to sample signal x properly
• Example: wire/cable with unknown latency
Timing Classification of Digital Systems: Plesiochronous Systems
• A signal x is plesiochronous to the clock
  – if the rate of possible signal transitions is nominally equal to the clock frequency, but in reality slightly different
    → phase relation between data transitions and clock shifts slowly
• Clock-data recovery required to sample signal x properly
• Occurs usually in distributed systems with independent clock generators
Timing Classification of Digital Systems: Asynchronous Systems
• Two signals are asynchronous
  – if the related clock frequencies are completely different
  – or if there are no periodic clocks at all
• Handshaking and buffering schemes used to communicate
• Four-phase synchronization (RTZ):
Timing Classification of Digital Systems: Asynchronous Systems 2
• Two signals are asynchronous
  – if the related clock frequencies are completely different
  – or if there are no periodic clocks at all
• Handshaking and buffering schemes used to communicate
• Two-phase synchronization (NRZ):
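The waveform figures for the two handshake styles are not part of the transcript. The following behavioural sketch shows the difference in terms of signal events, assuming the conventional request/acknowledge protocol; the signal names req and ack and the event tuple format are illustrative assumptions.

```python
def four_phase(words):
    """Return-to-zero handshake: four signal events per transferred word."""
    events = []
    for word in words:
        events.append(("req", 1, word))   # sender raises req: data valid
        events.append(("ack", 1, word))   # receiver raises ack: data taken
        events.append(("req", 0, None))   # sender returns req to zero
        events.append(("ack", 0, None))   # receiver returns ack to zero
    return events


def two_phase(words):
    """Non-return-to-zero handshake: every transition carries meaning."""
    events, req, ack = [], 0, 0
    for word in words:
        req ^= 1                          # any req toggle means: data valid
        events.append(("req", req, word))
        ack ^= 1                          # any ack toggle means: data taken
        events.append(("ack", ack, word))
    return events


print(len(four_phase([10, 11, 12])), len(two_phase([10, 11, 12])))  # 12 6
```

The two-phase scheme needs half as many transitions per word, at the cost of edge-sensitive (rather than level-sensitive) control logic.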
Clock Manipulation and Distribution
Clock Manipulation
• Clock splitter: to generate an aligned differential clock from a single-ended clock
• Non-overlapping two-phase clock pair: e.g. for race-free clocking and switched-capacitor circuits
• Edge shifters for generation of overlapping clocks: e.g. for latch-free interfacing of domino circuits
• Clock dividers: e.g. to reduce the clock frequency delivered by the PLL, or used within the PLL itself (→ PLL chapter)
Clock Splitter
• Clock splitters are used to derive a well-aligned complementary clock pair from a single-ended clock signal
• Alignment strongly depends on sizing
  – Critical circuit block, strongly susceptible to variations
  – Use differential clock sources whenever possible (to avoid splitters)
• More complex and more accurate circuits possible, e.g. coupling of differential outputs, 4/5 element paths, …
[Figure annotation: size both paths for equal delay!]
Non-Overlapping Two-Phase Clock
• The high phases of the two clock phases do not overlap
  → refer to Mixed-Signal-Electronics @ LTE
Rising-Edge Shifter
• Rising edge can be shifted to reduce the output duty cycle
• Purpose: overlapping clocks, duty cycle correction
Falling-Edge Shifter
• Falling edge can be shifted to reduce the output duty cycle
• Purpose: overlapping clocks, duty cycle correction
Digital Divider Circuits
• Dividers are cyclical counters: 0, 1, …, N-2, N-1, 0, 1, …
• Binary dividers require at least ⌈log2 N⌉ storage elements
  → possible but undesirable (unused) states
• High activity, often high frequency → minimize logic and control
• Avoid async. reset (would require control / sync.)
• Implement counter with auto-initialization / auto-recovery
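A small sketch of the counting argument: a divide-by-N binary counter needs at least ceil(log2 N) storage elements, which leaves 2^k - N unused states whenever N is not a power of two. The next-state rule shown is one hypothetical way to obtain auto-recovery without a reset; the divider circuits in the lecture may use a different state coding.

```python
def min_storage_elements(n: int) -> int:
    """Smallest k with 2**k >= n, computed with integers only."""
    return (n - 1).bit_length()


def unused_states(n: int) -> int:
    """States that exist in hardware but are not part of the N-step cycle."""
    return 2 ** min_storage_elements(n) - n


def next_state(state: int, n: int) -> int:
    """Cyclical counter 0..n-1 with auto-recovery.

    Wrapping on 'state >= n - 1' instead of 'state == n - 1' sends every
    unused state back to 0, so no reset is needed after power-up.
    """
    return 0 if state >= n - 1 else state + 1


for n in (2, 3, 5, 15):
    print(n, min_storage_elements(n), unused_states(n))
```

For N = 5 the counter needs three flip-flops and has three unused states (5, 6, 7), all of which return to 0 on the next clock edge.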
Digital Divide-by-N Circuits
• Counter logic forces an N-step limit cycle
• Multi-modulus dividers enable multiple divider factors by reconfiguring the counter logic
• Divide-by-2 often used as prescaler
  – Trivial counter logic
  – Approximately 2-3x faster than all other dividers
Digital Divide-by-2 Stage
• Flip-flop with input connected to the inverting output, or with explicit feedback inverter
• Check hold time carefully! NO pulsed flip-flops
• N stages can be cascaded to implement a 2^N divider (asynchronous)
• Strip off all flip-flop functions except latching functionality (no test, no reset; possibly use differential output of VCO)
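The cascading claim can be checked with a short behavioural model. It assumes ideal toggle stages in which each stage is clocked by the rising edge of the previous stage's output (a ripple, i.e. asynchronous, divider); function names are illustrative.

```python
def ripple_divider(n_stages, n_input_edges):
    """Last stage's output after each rising edge of the input clock."""
    q = [0] * n_stages
    trace = []
    for _ in range(n_input_edges):
        stage = 0
        while stage < n_stages:
            q[stage] ^= 1          # stage toggles on a rising edge at its clock
            if q[stage] == 0:      # output fell: no rising edge for next stage
                break
            stage += 1             # output rose: next stage is clocked too
        trace.append(q[-1])
    return trace


def rising_edges(trace, initial=0):
    """Count 0 -> 1 transitions in a sampled waveform."""
    count, prev = 0, initial
    for value in trace:
        if prev == 0 and value == 1:
            count += 1
        prev = value
    return count


print(rising_edges(ripple_divider(3, 64)))  # 8
```

Three stages turn 64 input edges into 8 output edges, i.e. a division by 2^3.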
Divide-by-2 IQ-Signal Generation
• Duty cycle dependence
• Phase accuracy of crucial interest
• Amplitude not an issue in full-swing CMOS
• Skewless differential signals desirable for subsequent circuit blocks (phase rotators, mixers, …)
Divide-by-4 IQ-Signal Generation
• No duty cycle dependence, but 4:1 frequency ratio
• Phase accuracy of crucial interest
Example: 2/3 Dual Modulus Divider
• sel = high: divide-by-three
• sel = low: divide-by-two
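The schematic of the slide is not in the transcript, so the sketch below simulates one hypothetical two-flip-flop realization with the same behaviour: D_A = NAND(A, B) and D_B = A OR NOT sel, with A as output. It is not claimed to be the circuit shown in the lecture.

```python
def divider_2_3(sel, n_clocks, a=0, b=0):
    """Output A of an assumed 2/3 dual-modulus divider for n_clocks edges."""
    out = []
    for _ in range(n_clocks):
        d_a = 1 - (a & b)        # NAND(A, B)
        d_b = a | (1 - sel)      # A OR NOT sel; sel = 0 pins B to 1
        a, b = d_a, d_b          # both flip-flops update on the same edge
        out.append(a)
    return out


def period(trace):
    """Distance in input clocks between the last two rising output edges."""
    rises = [i for i in range(1, len(trace))
             if trace[i - 1] == 0 and trace[i] == 1]
    return rises[-1] - rises[-2]


print(period(divider_2_3(sel=1, n_clocks=12)))  # 3
print(period(divider_2_3(sel=0, n_clocks=12)))  # 2
```

With sel low, B is forced high and A simply toggles; with sel high, the state sequence (A, B) = 10, 11, 01 repeats. Every start state enters the cycle within a few clocks, so this model needs no reset.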
Design Example: Low Power Div-15 Circuit
• Reduce frequency as soon as possible → div 15 = div 3 x div 5
• Use ultra-high-speed registers only where really required → sense amplifiers for div 3, master-slave registers for div 5
• Minimize interface overhead → differential output signals of sense amps used as complementary clocks of master-slave registers
• Minimize combinational logic → optimal state coding → only one logic function required
• Selection of register is a key issue for power optimization of high-speed building blocks
CMOS Divider 1/3 – Integrated Sense-Amp Approach
[Figure annotations: logic integrated in slave stage of sense amps; auto-initialization]
CMOS Divider 1/5 – Static Master-Slave Approach
[Figure annotations: master-slave flip-flops controlled by differential output signals of sense amplifiers in div 3 stage; auto-initialization required]
Pulse Swallowing Divider
• Start analysis with reset swallow counter
• Dual-modulus divider divides by (N+1) for (N+1)S cycles
• Value in program counter is S
• Dual-modulus divider divides by N for N(P-S) cycles
• Period, i.e. effective (average) divider factor: (NP + S)
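The counting argument above can be replayed in a few lines. The sketch assumes the usual arrangement (an N/(N+1) dual-modulus prescaler, a swallow counter that selects N+1 until it has counted S prescaler pulses, and a program counter that ends the period after P prescaler pulses); the function name is illustrative.

```python
def pulse_swallow_period(n, p, s):
    """Input clock cycles in one output period of a pulse-swallow divider."""
    assert 0 <= s <= p
    input_cycles = 0
    swallowed = 0
    for _ in range(p):                   # program counter: P prescaler pulses
        if swallowed < s:
            input_cycles += n + 1        # prescaler divides by N+1
            swallowed += 1               # swallow counter advances
        else:
            input_cycles += n            # prescaler divides by N
    return input_cycles


print(pulse_swallow_period(n=8, p=10, s=3))   # 83
```

The result equals (N+1)·S + N·(P-S) = N·P + S, so changing S by one changes the overall division ratio by exactly one input cycle.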
Pulse Removing from Periodic Signals
• A pulse swallower removes one of N pulses
  → average frequency f_avg = (N-1)/N · f_in
• This corresponds to a multiplication, which causes spurs
Fractional Frequency Divider
[Figure annotation: fractional!]
Miller Divider (Regenerative Divider)
• Analog multiplier mixes input and output signal → two frequency components fin + fout and fin - fout
• Low-pass filter eliminates the high-frequency component, i.e. fout = fin - fout → fout = 0.5 x fin
• Loop must fulfill Barkhausen criterion
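Written out as a short derivation, assuming ideal sinusoidal signals and an ideal multiplier (the symbols ω_in and ω_out are introduced here for illustration):

```latex
\begin{align}
\cos(\omega_{in} t)\,\cos(\omega_{out} t)
  &= \tfrac{1}{2}\cos\bigl((\omega_{in}-\omega_{out})\,t\bigr)
   + \tfrac{1}{2}\cos\bigl((\omega_{in}+\omega_{out})\,t\bigr) \\
\text{after low-pass filtering:}\quad
\omega_{out} &= \omega_{in}-\omega_{out} \\
\Rightarrow\quad \omega_{out} &= \tfrac{1}{2}\,\omega_{in}
\end{align}
```

The second line is the steady-state condition: the only component that survives the filter must itself be the output signal fed back to the multiplier.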
Design Example: Timing Budget of IQ Divide-by-2 Circuit
Example (SS 2008):
  7 GHz input clock (period)      142 ps
  Half input clock period          71 ps
  +/- 2% clock jitter               6 ps
  Maximum available D-Q delay      65 ps
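The budget can be recomputed as follows. The sketch assumes that the 6 ps entry is the peak-to-peak width of a ±2% jitter band relative to the full input period; that reading reproduces the slide's rounded numbers, but it is an interpretation, and the function name is hypothetical.

```python
def dq_budget_ps(f_in_hz, jitter_fraction):
    """Timing budget of a divide-by-2 stage, all results in picoseconds."""
    period_ps = 1e12 / f_in_hz
    half_period_ps = period_ps / 2
    jitter_pp_ps = 2 * jitter_fraction * period_ps   # +/- band, peak to peak
    return period_ps, half_period_ps, jitter_pp_ps, half_period_ps - jitter_pp_ps


period_ps, half_ps, jitter_ps, budget_ps = dq_budget_ps(7e9, 0.02)
print(f"{period_ps:.1f} {half_ps:.1f} {jitter_ps:.1f} {budget_ps:.1f}")
# 142.9 71.4 5.7 65.7
```

The unrounded values (142.9 ps, 71.4 ps, 5.7 ps, 65.7 ps) agree with the table, which rounds the intermediate results before subtracting.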
Sense-Amplifier Based Divider Core I
• Two internally coupled sense amplifiers as pulse generator
• Pulse shape may change with varying process and operating conditions
• Intrinsic symmetry results in good phase relations
• Footer devices can be used as power switch
Sense-Amplifier Based Divider Core II
• Internal nodes allow for coupling with minimum latency
• Differential pair allows for reliable decision even without rail-to-rail signals
• Latch increases reliability and noise immunity
• Precharged logic → fast
Output Signals of SA Pulse Generator
• SA-based pulse generator can be pushed to very high frequencies
• Output signals
  – π/2 pulses
  – varying shape & amplitude
  – strong symmetry
• Pulses cannot be used to drive static slaves → coupling stage which imposes no requirements on the pulse signals except the symmetry of the four phases
[Figure legend: VDD,high; VDD,low; VDD,nom]
Self-Precharging Dynamic Coupling Stage I
• Purpose: recreate full-swing signals
• Four circularly pre-charging dyn. coupling elements
  – Fast & reliable pull-down
  – Phase purity requires only symmetry of pulse signals, no serious waveform dependence
• Intrinsic power-gating capability
• Duty cycle correction possible by subsequent RS latches
Self-Precharging Dynamic Coupling Stage II
• Purpose: recreate full-swing signals
• Four circularly pre-charging dyn. coupling elements
  – Fast & reliable pull-down
  – Phase purity requires only symmetry of pulse signals, no serious waveform dependence
• Intrinsic power-gating capability
• Duty cycle correction possible by subsequent RS latches
Self-Precharging Dynamic Coupling Stage III
Self-pre-charging dynamic coupling stage regenerates pulse signals efficiently
[Figure legend: pulse; reconstructed signal at output of coupling stage]
Power & Max. Frequency of Design Example
Target application: VDD = 1.0 V @ T = -40 … 125 °C, frequency 5 … 7 GHz
[Figure: power and max. frequency (including clock drivers and load)]
Phase Rotation in Coupling Stage
[Figure labels: EN1, EN2, EN3, EN4]
Glitch-free by construction!
Phase Transition in Coupling Stage
Glitch-free phase transition by construction
Application of Phase Rotator in 128/129 Dual Modulus Prescaler
The synchronization element between divider and phase rotator, which is required in conventional prescalers, may be omitted.
Clock Distribution
• Purpose of clock tree
  – Distribute clock from PLL to all synchronous circuit elements → huge load, clock network is a high-fanout net
  – Provide sufficient drive strength
  – Contributes up to 30% to overall power dissipation!
• Although the functionality is trivial, jitter and skew constraints make the design of the clock tree a critical task.
• Next: overview of common clock tree architectures
H-Tree
• FO4-like tree
• Very popular
• Good compromise between power consumption and skew/jitter
• Symmetrical design
• Adjacent flip-flops may be connected to different paths along the tree → skew, jitter
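The symmetry argument can be illustrated with an idealized geometric model: each level of the tree is an 'H' whose four tips become the centres of the next, half-sized level. The construction below is a sketch under that assumption (wire length only, no buffers or loading); the function name and dimensions are illustrative.

```python
def h_tree_leaves(x, y, half, levels, length=0.0):
    """Return (x, y, wire_length_from_root) for every leaf of an H-tree."""
    if levels == 0:
        return [(x, y, length)]
    leaves = []
    for dx in (-half, half):              # along the horizontal bar of the 'H'
        for dy in (-half, half):          # along the vertical bars of the 'H'
            leaves += h_tree_leaves(x + dx, y + dy, half / 2, levels - 1,
                                    length + abs(dx) + abs(dy))
    return leaves


leaves = h_tree_leaves(0.0, 0.0, half=8.0, levels=3)
lengths = {wire for _, _, wire in leaves}
print(len(leaves), lengths)   # 64 {28.0}
```

All 64 leaves see the same wire length from the root, so the nominal skew is zero. Two physically adjacent leaves can nevertheless share only the root segment, which is why variations along the separate branches show up as skew and jitter between neighbouring flip-flops.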
Balanced Clock Tree
• The H-tree concept can be generalized
  – Optimized for equal delay along all paths (interconnect and gate delay)
  – No regular physical structure necessary
  – Advantageous for irregular circuits and synthesized logic
  – Good tool support available
• Remark:
  – Skew is only relevant for interacting blocks, not for independent modules or modules with slow interfaces, e.g. shared memory
  – Basically the latency along the tree is irrelevant. This no longer holds when history effects caused by supply noise etc. occur, or for fast clock-gating schemes.
Overlapping Rings / Interleaved Rings
• Similar to H-tree, all edges of the 'H' connected, symmetrically supplied from center
• Parallelized distribution
  – averaging of variations
  – increased power consumption
  – more complex routing
Clock Mesh / Clock Grid
• Mesh/grid of clock wires supplied by drivers (usually at the edges)
• Very good averaging of variations
• Very low skew
• High power consumption
• Usually combined with other tree concepts, e.g. global H-tree and local mesh structure
Mesh: outputs of all buffers of a certain level of hierarchy connected together, not necessarily regular layout
Grid: mesh with regular layout (→ figure)
Trunk Tree
• Clock skew increases with distance from trunk
• Common in full-custom pipelines