Digital System Clocking Compact


High-Speed Digital CMOS Circuits, Stephan Henzler, Technische Universität München, Summer Term 2012

Digital System Clocking


Outline: Digital System Clocking

• Timing classes in digital systems
  – Synchronous timing
  – Mesochronous timing
  – Plesiochronous timing
  – Asynchronous timing
  – Timing constraints in synchronous systems
• Clock manipulation
  – Clock dividers
  – Clock splitters, edge shifters, etc.
• Clock distribution networks
• Clock generation → next chapter on phase-locked loops


Timing Classification of Digital Systems: Synchronously Timed Systems

• A signal x is synchronous to the clock
  – if the rate of potential signal transitions is equal to the clock frequency
  – and if the time relation between the clock and the potential signal transitions, i.e. the phase, is fixed (this does not necessarily mean that the frequency is the same)


Clock Uncertainty

• Clock skew: spatial but time-invariant variation of the clock arrival times
  – Positive clock skew: the clock arrives at the receiving register first
  – Negative clock skew: the clock arrives at the sending register first
  – Caused by design asymmetries, loading asymmetry, device and interconnect variations, temperature gradients, …
• Clock jitter: temporal variation of the clock arrival times, with a correlated spatial dependency
  – Zero-mean random process (with pseudorandom contributors)
  – Caused by physical noise, supply noise, crosstalk, PLL transients, …


Setup Time Constraint in Sync Logic


• Maximum delay requirement (setup-time check)
  – Data must arrive early enough at the receiving flip-flop (at least one setup time before the sampling edge)
  – Checked with respect to two subsequent clock edges
• Worst-case timing check
  – Assume that everything along the green path is slow
  – Assume that everything along the red path is fast
• The green clock edge launches the data
• The red clock edge samples the data

[Figure annotation: branching point, timing check starts here]


Hold Time Constraint in Sync Logic


• Minimum delay requirement (hold-time check)
  – New data must not reach the receiving flip-flop before the current data has been sampled
  – Checked with respect to the same clock edge
• Worst-case timing check
  – Assume that everything along the green path is fast
  – Assume that everything along the red path is slow
• The red clock edge samples the data
• The green clock edge launches NEW data

[Figure annotation: branching point, timing check starts here]


Maximum Logic Delay in Synchronous Logic

[Figure annotation: sequencing overhead]


Minimum Logic Delay in Synchronous Logic

[Figure annotation: this constraint makes hold-time fixing necessary]
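The constraint equations on these two slides were figures and did not survive in the transcript. As a rough sketch of the standard flip-flop timing relations, written with the sign convention of the Clock Uncertainty slide (positive skew means the clock reaches the receiving register first), the two limits might be computed as below. The function and parameter names are hypothetical, and lumping jitter into a single worst-case term is an assumption made for illustration, not the author's formulation.

```python
def min_clock_period(t_cq, t_logic_max, t_setup, skew=0, jitter=0):
    """Smallest clock period that still meets the setup check.

    All times share one unit (e.g. ps). `skew` > 0 means the receiving
    register is clocked earlier than the sending one (assumed convention),
    which shortens the usable cycle. `jitter` is an assumed worst-case
    shortening of the effective period.
    """
    return t_cq + t_logic_max + t_setup + skew + jitter


def min_logic_delay(t_cq_min, t_hold, skew=0, jitter=0):
    """Smallest combinational delay that still meets the hold check.

    Positive skew relaxes the hold requirement, negative skew tightens it.
    The result is clamped at zero because a negative requirement is
    trivially satisfied.
    """
    return max(0, t_hold - t_cq_min - skew + jitter)


if __name__ == "__main__":
    # Illustrative numbers in picoseconds (not taken from the slides).
    print(min_clock_period(t_cq=100, t_logic_max=500, t_setup=50, skew=20, jitter=10))
    print(min_logic_delay(t_cq_min=60, t_hold=40, skew=-30, jitter=10))
```

The terms other than `t_logic_max` in the first function are what the slide calls sequencing overhead. A non-zero result from the second function is the delay that hold-time fixing has to add to a short path.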


Discussion of Synchronous Timing

• The maximum delay constraint sets the upper frequency limit
  – If setup violations occur, the circuit is still functional (in principle) at a reduced clock frequency
  – Jitter is zero-mean, but it temporarily increases or decreases the clock period → jitter always reduces the maximum frequency
  – Skew is time-invariant, so it may increase or decrease the maximum frequency


Discussion of Synchronous Timing

• The minimum delay constraint sets a lower limit for the combinational delay between two flip-flops
  – Describes a race between the clock and the data signals
  – Hold-time violations are caused by combinational delay, jitter and skew, so they cannot be fixed by altering the clock frequency → catastrophic defect, to be avoided under all circumstances
  – Skew can intensify (negative skew) or relax (positive skew) the hold-time problem
  – Hold-time fixing: insert buffers in very short paths


Intentional Skew

• Minimize clock uncertainty in terms of unidentified skew to minimize the risk of races and to maximize circuit speed
• Clock skew can be inserted intentionally by design. Attention: verify the circuit carefully
• Intentional negative skew:
  – Receiving flip-flop samples data later → clock cycle prolonged (time borrowing)
  – Increased risk of races, i.e. hold-time violations
  – Resynchronization required to enter the original clock domain
• Intentional positive skew:
  – Receiving flip-flop samples data before the sending flip-flop issues a new value → clock cycle reduced
  – Decreased risk of races, i.e. hold-time violations (good for robustness of highly pipelined paths)


Clock Routing in High-Performance Pipelines

• Same direction: maximum speed, risky w.r.t. races
• Opposite direction: robust w.r.t. races, reduced speed


Non-Synchronous Timing and Synchronization


Need for Synchronization

• Consider a signal transmitted by a sending flip-flop FFsend and received by a receiving flip-flop FFrec
• If the clocks of FFsend and FFrec are different, or if the delay between the two flip-flops is unknown or varying, metastability will occur in the receiving flip-flop (maybe not in every cycle, but certainly with some probability)
  → wrong data sampled by FFrec


[Figure axis label: clock-to-output]


Synchronization

• Required if signals cross clock domains
• Clock domain crossings may have different reasons: the clocks of the sending and receiving flip-flops are not synchronous
• The problem is metastability of latches:
  – Not synchronous → the setup/hold time constraint will be violated sooner or later
  – A violation of the setup/hold constraint means:
    – Propagation delay through the receiver flip-flop becomes unpredictable
    – Sampled value becomes unpredictable
    – For busses it is even worse: some flip-flops may sample the correct value, others the incorrect one → the result is complete nonsense


Basic Waiting Synchronizer

• Simple circuit, especially for asynchronous inputs: the second flip-flop waits for the first flip-flop to settle
• Additional latency to resolve metastability (still not perfectly safe, but the error probability is dramatically reduced; choose the wait time according to the safety requirements)
• Especially suitable for slowly varying signals
• Do not use for synchronization of busses
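To make "choose the wait time according to the safety requirements" concrete, a commonly used first-order model estimates the mean time between synchronizer failures as MTBF = exp(t_wait / tau) / (T_w * f_clk * f_data). This model is not given on the slide; the sketch below assumes it, and the parameter names (`tau` for the regeneration time constant of the flip-flop, `t_window` for its metastability window) are illustrative.

```python
import math


def synchronizer_mtbf(t_wait, tau, t_window, f_clk, f_data):
    """Estimated mean time between synchronization failures, in seconds.

    Assumed first-order model: the failure rate is the rate at which data
    edges fall into the metastability window, scaled by the probability
    that metastability has not resolved after t_wait.
    All arguments are in SI units (seconds, hertz).
    """
    failure_rate = t_window * f_clk * f_data * math.exp(-t_wait / tau)
    return 1.0 / failure_rate


if __name__ == "__main__":
    # Illustrative values only: 1 GHz clock, 100 MHz data rate,
    # tau = 20 ps, metastability window = 100 ps.
    for cycles in (0.5, 1.0, 2.0):
        mtbf = synchronizer_mtbf(cycles * 1e-9, 20e-12, 100e-12, 1e9, 100e6)
        print(f"wait {cycles} cycle(s): MTBF ~ {mtbf:.3e} s")
```

Because the wait time enters exponentially, each additional waiting flip-flop stage improves the estimate by many orders of magnitude, which is why a short extra latency is usually acceptable.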


Clock Domain Crossing of Busses

• Handshaking: safe sampling of busses when data is valid
• Various protocols
• Handshaking reduces the data rate


Synchronization with Gray Coding

• The bus crossing the clock domain is coded such that only one bit changes at a time
  – Only one out of N signals may suffer from metastability
  – The result is either the new or the old bus value, but not completely random
• Especially for slowly varying signals
• Feasible for counters or simple states
• Interfacing via simple waiting synchronizers is possible
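The single-bit-change property can be checked with the standard reflected binary Gray code. The following is a minimal sketch; the helper names are hypothetical and the slide does not prescribe a particular code.

```python
def binary_to_gray(value: int) -> int:
    """Convert a non-negative integer to its reflected binary Gray code."""
    return value ^ (value >> 1)


def gray_to_binary(gray: int) -> int:
    """Invert binary_to_gray by XOR-ing all right-shifted copies together."""
    value = 0
    while gray:
        value ^= gray
        gray >>= 1
    return value


def bits_changed(a: int, b: int) -> int:
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


if __name__ == "__main__":
    width = 4
    codes = [binary_to_gray(i) for i in range(2 ** width)]
    for i, code in enumerate(codes):
        nxt = codes[(i + 1) % len(codes)]  # includes the wrap-around step
        print(f"{i:2d} -> {code:0{width}b} (bits changed to next: {bits_changed(code, nxt)})")
```

Since consecutive counter values differ in exactly one bit, a receiver that samples during a transition sees either the old or the new count, which is the property the slide relies on.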


Synchronization via Shared Memory

• For (bidirectional) exchange of large data volumes, shared memory accessible from both clock domains is advantageous
• Dual-port SRAMs are especially advantageous (moderate speed)
• A FIFO (first-in-first-out buffer) is a special implementation


Timing Classification of Digital Systems: Mesochronous Systems

• A signal x is mesochronous to the clock
  – if the rate of potential signal transitions is equal to the clock frequency
  – and if the time relation between the clock and the potential signal transitions, i.e. the phase, is constant but unknown
• Synchronization required to sample signal x properly
• Example: wire/cable with unknown latency


Timing Classification of Digital Systems: Plesiochronous Systems

• A signal x is plesiochronous to the clock
  – if the rate of possible signal transitions is nominally equal to the clock frequency, but in reality slightly different → the phase relation between data transitions and clock shifts slowly
• Clock-data recovery required to sample signal x properly
• Usually occurs in distributed systems with independent clock generators
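How slowly the phase shifts follows directly from the relative frequency offset: each cycle the phase moves by that fraction of a period, so a full-cycle slip accumulates after 1/offset cycles. A small sketch of this arithmetic, with hypothetical function names and an offset expressed in parts per million:

```python
def cycles_until_slip(offset_ppm: float) -> float:
    """Clock cycles until the data phase has drifted by one full period."""
    if offset_ppm <= 0:
        raise ValueError("offset_ppm must be positive")
    return 1e6 / offset_ppm


def time_until_slip(f_nominal_hz: float, offset_ppm: float) -> float:
    """Time in seconds until a full-cycle slip at the nominal frequency."""
    return cycles_until_slip(offset_ppm) / f_nominal_hz


if __name__ == "__main__":
    # Illustrative: two nominally 1 GHz oscillators that differ by 100 ppm.
    print(cycles_until_slip(100.0))
    print(time_until_slip(1e9, 100.0))
```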


Timing Classification of Digital Systems: Asynchronous Systems

• Two signals are asynchronous
  – if the related clock frequencies are completely different
  – or if there are no periodic clocks at all
• Handshaking and buffering schemes are used to communicate
• Four-phase synchronization (RTZ, return-to-zero)


Timing Classification of Digital Systems: Asynchronous Systems 2

• Two signals are asynchronous
  – if the related clock frequencies are completely different
  – or if there are no periodic clocks at all
• Handshaking and buffering schemes are used to communicate
• Two-phase synchronization (NRZ, non-return-to-zero)
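The timing diagrams for the two handshake variants are not reproduced in this transcript. As a sketch of the usual request/acknowledge sequences (the signal names `req` and `ack` are assumptions), the four-phase protocol returns both wires to zero after every transfer, while the two-phase protocol treats every transition as an event:

```python
def four_phase(n_transfers):
    """(req, ack) wire states after each event for a return-to-zero handshake.

    Per transfer: req rises (data valid), ack rises (data taken),
    req falls, ack falls. Both wires start and end at 0.
    """
    states = []
    for _ in range(n_transfers):
        states += [(1, 0), (1, 1), (0, 1), (0, 0)]
    return states


def two_phase(n_transfers):
    """(req, ack) wire states after each event for a non-return-to-zero handshake.

    Per transfer: req toggles (data valid), then ack toggles (data taken).
    """
    req = ack = 0
    states = []
    for _ in range(n_transfers):
        req ^= 1
        states.append((req, ack))
        ack ^= 1
        states.append((req, ack))
    return states


if __name__ == "__main__":
    print("four-phase:", four_phase(2))
    print("two-phase: ", two_phase(2))
```

The two-phase variant needs half as many signal transitions per transfer, at the cost of edge-sensitive rather than level-sensitive control logic.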


Clock Manipulation and Distribution


Clock Manipulation

• Clock splitter: generates an aligned differential clock from a single-ended clock
• Non-overlapping two-phase clock pair: e.g. for race-free clocking and switched-capacitor circuits
• Edge shifters: for generation of overlapping clocks, e.g. for latch-free interfacing of domino circuits
• Clock dividers: e.g. to reduce the clock delivered by the PLL, or used within the PLL itself (→ PLL chapter)


Clock Splitter

• Clock splitters are used to derive a well-aligned complementary clock pair from a single-ended clock signal
• Alignment strongly depends on sizing
  – Critical circuit block, strongly susceptible to variations
  – Use differential clock sources whenever possible (to avoid splitters)
• More complex and more accurate circuits are possible, e.g. coupling of differential outputs, 4/5-element paths, …

[Figure annotation: size both paths for equal delay!]


Non-Overlapping Two-Phase Clock

• The high phases of the two clock phases do not overlap
  → refer to Mixed-Signal-Electronics @ LTE


Rising-Edge Shifter

• The rising edge can be shifted to reduce the output duty cycle
• Purpose: overlapping clocks, duty cycle correction


Falling-Edge Shifter

• The falling edge can be shifted to reduce the output duty cycle
• Purpose: overlapping clocks, duty cycle correction


Digital Divider Circuits

• Dividers are cyclical counters: 0, 1, …, N-2, N-1, 0, 1, …
• Binary dividers require at least ⌈log2 N⌉ storage elements
  → 2^⌈log2 N⌉ - N possible but undesirable states
• High activity, often high frequency → minimize logic and control
• Avoid asynchronous reset (would require control / synchronization)
• Implement the counter with auto-initialization / auto-recovery
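The number of storage elements and of unused states for a binary-coded divide-by-N counter can be tabulated with a few lines. This is a sketch under the assumption of plain binary state coding; the function name is hypothetical.

```python
def divider_storage(n: int):
    """Return (storage elements, unused states) for a binary divide-by-n counter."""
    if n < 2:
        raise ValueError("a divider needs at least two states")
    bits = (n - 1).bit_length()  # integer form of ceil(log2(n)) for n >= 2
    unused = 2 ** bits - n       # states the counter must recover from
    return bits, unused


if __name__ == "__main__":
    for n in (2, 3, 5, 15, 16):
        bits, unused = divider_storage(n)
        print(f"divide-by-{n}: {bits} flip-flops, {unused} unused state(s)")
```

Every unused state is one the counter could power up in, which is why auto-initialization or auto-recovery is needed when an asynchronous reset is avoided.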


Digital Divide-by-N Circuits

• Counter logic forces an N-step limit cycle
• Multi-modulus dividers enable multiple divider factors by reconfiguring the counter logic
• Divide-by-2 is often used as prescaler
  – Trivial counter logic
  – Approximately 2-3x faster than all other dividers


Digital Divide-by-2 Stage

• Flip-flop with its input connected to the inverting output, or with an explicit feedback inverter
• Check hold time carefully! NO pulsed flip-flops
• N stages can be cascaded to implement a 2^N divider (asynchronous)
• Strip off all flip-flop functions except the latching functionality (no test, no reset; possibly use the differential output of the VCO)


Divide-by-2 IQ-Signal Generation

• Duty cycle dependence
• Phase accuracy of crucial interest
• Amplitude not an issue in full-swing CMOS
• Skewless differential signals desirable for subsequent circuit blocks (phase rotators, mixers, …)
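The duty cycle dependence can be illustrated with a simple idealized model. It assumes (this is not stated on the slide) that the I output toggles on rising input edges and the Q output on falling input edges. With input duty cycle d and input period T, Q then lags I by d·T, while the output period is 2T, so the I/Q phase is 180°·d.

```python
def iq_phase_deg(duty_cycle: float) -> float:
    """I/Q phase difference in degrees for an idealized divide-by-2 IQ generator.

    Assumed model: I toggles on rising, Q on falling input edges, so the
    I-to-Q delay is duty_cycle * T out of an output period of 2 * T.
    """
    if not 0.0 < duty_cycle < 1.0:
        raise ValueError("duty_cycle must be strictly between 0 and 1")
    return 180.0 * duty_cycle


def iq_phase_error_deg(duty_cycle: float) -> float:
    """Deviation from the ideal 90 degree quadrature relation."""
    return iq_phase_deg(duty_cycle) - 90.0


if __name__ == "__main__":
    for d in (0.45, 0.48, 0.50, 0.52):
        print(f"duty {d:.2f}: phase error {iq_phase_error_deg(d):+.1f} deg")
```

Under this model every percent of input duty cycle error costs 1.8° of quadrature accuracy.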


Divide-by-4 IQ-Signal Generation

• No duty cycle dependence, but 4:1 frequency ratio
• Phase accuracy of crucial interest


Example: 2/3 Dual Modulus Divider

• sel = high: divide-by-three
• sel = low: divide-by-two
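The schematic of the slide is not available in this transcript. One possible two-flip-flop realization with the same `sel` behavior is sketched below as a cycle-accurate model; the gate choice (D1 = NOR(Q1, Q2), D2 = Q1 AND sel) is an assumption for illustration and is not necessarily the circuit shown in the lecture.

```python
def divide_2_3(sel, n_cycles, q1=0, q2=0):
    """Simulate a 2/3 dual-modulus divider and return Q1 after each input clock.

    Assumed next-state logic: D1 = NOR(Q1, Q2), D2 = Q1 AND sel.
    sel = 1 gives a three-state cycle, sel = 0 a two-state cycle.
    """
    outputs = []
    for _ in range(n_cycles):
        d1 = int(not (q1 or q2))
        d2 = int(bool(q1) and bool(sel))
        q1, q2 = d1, d2  # both flip-flops update on the same clock edge
        outputs.append(q1)
    return outputs


if __name__ == "__main__":
    print("sel=1:", divide_2_3(1, 12))
    print("sel=0:", divide_2_3(0, 12))
    # Starting in the unused state (1, 1) the model falls back into the cycle.
    print("recovery:", divide_2_3(1, 6, q1=1, q2=1))
```

The unused state (Q1, Q2) = (1, 1) leads back into the regular cycle within one clock, so this sketch is self-initializing in the sense of the divider guidelines above.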


Design Example: Low Power Div-15 Circuit

• Reduce frequency as soon as possible → div 15 = div 3 x div 5
• Use ultra-high-speed registers only where really required → sense amplifiers for div 3, master-slave registers for div 5
• Minimize interface overhead → differential output signals of the sense amps are used as complementary clocks of the master-slave registers
• Minimize combinational logic → optimal state coding → only one logic function required
• Selection of the register is a key issue for power optimization of high-speed building blocks


CMOS Divider 1/3 – Integrated Sense-Amp Approach

[Figure annotations: logic integrated in slave stage of sense amps; auto-initialization]


CMOS Divider 1/5 – Static Master-Slave Approach

[Figure annotations: master-slave flip-flops controlled by differential output signals of the sense amplifiers in the div 3 stage; auto-initialization required]


Pulse Swallowing Divider

• Start the analysis with the swallow counter reset
• The dual-modulus divider divides by (N+1) for (N+1)S cycles
• The value in the program counter is then S
• The dual-modulus divider divides by N for N(P-S) cycles
• Period, i.e. effective (average) divider factor: (N+1)S + N(P-S) = NP + S
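The effective divider factor can be verified by counting input cycles over one full output period. The sketch below is a behavioral model only; the function name is hypothetical, and N, P, S follow the naming on the slide.

```python
def pulse_swallow_period(n: int, p: int, s: int) -> int:
    """Input clock cycles per output period of a pulse-swallowing divider.

    The dual-modulus prescaler divides by n + 1 while the swallow counter
    counts its first s prescaler pulses, and by n for the remaining p - s
    pulses of the program counter.
    """
    if not 0 <= s <= p:
        raise ValueError("swallow count s must satisfy 0 <= s <= p")
    input_cycles = 0
    for prescaler_pulse in range(p):
        modulus = n + 1 if prescaler_pulse < s else n
        input_cycles += modulus
    return input_cycles


if __name__ == "__main__":
    n, p = 8, 10
    for s in range(0, 4):
        print(f"N={n}, P={p}, S={s}: divide by {pulse_swallow_period(n, p, s)}")
```

Stepping S by one changes the total division ratio by exactly one, which is what makes the architecture attractive for channel selection in a PLL.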


Pulse Removing from Periodic Signals

• A pulse swallower removes one of N pulses
  → average frequency f_avg = f_in · (N-1)/N
• This corresponds to a multiplication, which causes spurs


Fractional Frequency Divider

fractional !


Miller Divider (Regenerative Divider)

• An analog multiplier mixes the input and output signals → two frequency components, fin + fout and fin - fout
• A low-pass filter eliminates the high-frequency component, i.e. fout = fin - fout → fout = 0.5 x fin
• The loop must fulfill the Barkhausen criterion


Design Example: Timing Budget of IQ Divide-by-2 Circuit

Example:

Quantity                       Value
7 GHz input clock period       142 ps
Half input clock period        71 ps
+/- 2% clock jitter            6 ps
Maximum available D-Q delay    65 ps
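The budget can be reproduced with a few lines of arithmetic. The sketch assumes that "+/- 2%" refers to the full input clock period and is budgeted peak-to-peak (4% in total); this interpretation is inferred from the numbers on the slide rather than stated there, and the slide rounds the intermediate values.

```python
def dq_budget_ps(f_in_hz: float, jitter_fraction: float):
    """Return (period, half period, jitter allowance, D-Q budget) in picoseconds.

    jitter_fraction is the one-sided jitter relative to the input period;
    the allowance subtracted from the half period is assumed to be the
    peak-to-peak value, i.e. twice that fraction.
    """
    period = 1e12 / f_in_hz
    half_period = period / 2.0
    jitter = 2.0 * jitter_fraction * period
    return period, half_period, jitter, half_period - jitter


if __name__ == "__main__":
    period, half, jitter, budget = dq_budget_ps(7e9, 0.02)
    print(f"period {period:.1f} ps, half {half:.1f} ps, "
          f"jitter {jitter:.1f} ps, D-Q budget {budget:.1f} ps")
```

The unrounded values land within a picosecond of the figures in the table above.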


Sense-Amplifier Based Divider Core I

• Two internally coupled sense amplifiers as pulse generator
• Pulse shape may change with varying process and operating conditions
• Intrinsic symmetry results in good phase relations
• Footer devices can be used as power switch


Sense-Amplifier Based Divider Core II

• Internal nodes allow for coupling with minimum latency
• Differential pair allows for a reliable decision even without rail-to-rail signals
• Latch increases reliability and noise immunity
• Precharged logic → fast


Output Signals of SA Pulse Generator

• The SA-based pulse generator can be pushed to very high frequencies
• Output signals
  – π/2 pulses
  – varying shape & amplitude
  – strong symmetry
• Pulses cannot be used to drive static slaves → coupling stage which imposes no requirements on the pulse signals except the symmetry of the four phases

[Figure legend: VDD,high; VDD,low; VDD,nom]


Self-Precharging Dynamic Coupling Stage I

• Purpose: recreate full-swing signals
• Four circularly pre-charging dynamic coupling elements
  – Fast & reliable pull-down
  – Phase purity requires only symmetry of the pulse signals, no serious waveform dependence
• Intrinsic power-gating capability
• Duty cycle correction possible by subsequent RS latches


Self-Precharging Dynamic Coupling Stage II

• Purpose: recreate full-swing signals
• Four circularly pre-charging dynamic coupling elements
  – Fast & reliable pull-down
  – Phase purity requires only symmetry of the pulse signals, no serious waveform dependence
• Intrinsic power-gating capability
• Duty cycle correction possible by subsequent RS latches


Self-Precharging Dynamic Coupling Stage III

Self-pre-charging dynamic coupling stage regenerates pulse signals efficiently

[Figure legend: pulse; reconstructed signal at output of coupling stage]


Power & Max. Frequency of Design Example

Target application: VDD = 1.0 V at T = -40 … 125 °C, frequency 5 … 7 GHz

[Figure axis annotation: (including clock drivers and load)]


Phase Rotation in Coupling Stage

EN1 EN2 EN3 EN4

Glitch-free by construction!


Phase Transition in Coupling Stage

Glitch-free phase transition by construction


Application of Phase Rotator in 128/129 Dual Modulus Prescaler

The synchronization element between divider and phase rotator, which is required in conventional prescalers, may be omitted.


Clock Distribution

• Purpose of the clock tree
  – Distribute the clock from the PLL to all synchronous circuit elements → huge load, the clock network is a high-fanout net
  – Provide sufficient drive strength
  – Contributes up to 30% of the overall power dissipation!
• Although the functionality is trivial, jitter and skew constraints make the design of the clock tree a critical task.
• Next: overview of common clock tree architectures


H-Tree

• FO4-like tree
• Very popular
• Good compromise between power consumption and skew/jitter
• Symmetrical design
• Adjacent flip-flops may be connected to different paths along the tree → skew, jitter
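The symmetry argument can be made tangible with a small geometric model: every level of an ideal H-tree branches into four sub-trees of half the size, so all leaves end up at the same wire distance from the root. The sketch below is purely geometric (no buffers, no RC delay), and its function name and coordinate conventions are assumptions for illustration.

```python
def h_tree_leaves(levels, x=0.0, y=0.0, half=1.0, path=0.0):
    """Return a list of (x, y, wire_length_from_root) for all leaves of an H-tree.

    Each level routes a horizontal segment of length `half` followed by a
    vertical segment of length `half` to each of four children, then
    recurses with half the dimension.
    """
    if levels == 0:
        return [(x, y, path)]
    leaves = []
    for sx in (-1, 1):
        for sy in (-1, 1):
            leaves += h_tree_leaves(
                levels - 1,
                x + sx * half,
                y + sy * half,
                half / 2.0,
                path + 2.0 * half,
            )
    return leaves


if __name__ == "__main__":
    leaves = h_tree_leaves(levels=2)
    lengths = {length for _, _, length in leaves}
    print(f"{len(leaves)} leaves, distinct wire lengths: {sorted(lengths)}")
```

In this ideal model the nominal skew is zero. Real skew and jitter arise because neighboring leaves, such as the two on either side of the root, share almost none of their path and therefore see uncorrelated variations.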


Balanced Clock Tree

• The H-tree concept can be generalized
  – Optimized for equal delay along all paths (interconnect and gate delay)
  – No regular physical structure necessary
  – Advantageous for irregular circuits and synthesized logic
  – Good tool support available
• Remark:
  – Skew is only relevant for interacting blocks, not for independent modules or modules with slow interfaces, e.g. shared memory
  – Basically, the latency along the tree is irrelevant. This no longer holds when history effects caused by supply noise etc. occur, and for fast clock-gating schemes.


Overlapping Rings / Interleaved Rings

• Similar to the H-tree; all edges of the ‘H’ are connected and symmetrically supplied from the center
• Parallelized distribution
  – averaging of variations
  – increased power consumption
  – more complex routing


Clock Mesh / Clock Grid

• Mesh/grid of clock wires supplied by drivers (usually at the edges)
• Very good averaging of variations
• Very low skew
• High power consumption
• Usually combined with other tree concepts, e.g. global H-tree and local mesh structure

Mesh: outputs of all buffers of a certain level of hierarchy are connected together, not necessarily with a regular layout

Grid: mesh with regular layout (→ figure)


Trunk Tree

• Clock skew increases with distance from the trunk
• Common in full-custom pipelines
