ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ ([email protected]) · Static...

198
ΗΜΥ 664 ΨΗΦΙΑΚΟΣ ΣΧΕΔΙΑΣΜΟΣ ΜΕ FPGAs Χειμερινό Εξάμηνο 2010 ΔΙΑΛΕΞΗ 3: Design Flow ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ ([email protected]) Some slides adopted from Digital Integrated Circuits, Rabbey et. al.

Transcript of ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ ([email protected]) · Static...

Page 1: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ 664ΨΗΦΙΑΚΟΣ ΣΧΕΔΙΑΣΜΟΣ ΜΕ FPGAs

Χειμερινό Εξάμηνο 2010

ΔΙΑΛΕΞΗ 3: Design Flow

ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣΛέκτορας ΗΜΜΥ

([email protected]) Some slides adopted from Digital Integrated Circuits, Rabbey et. al.

Presenter
Presentation Notes
Other handouts To handout next time
Page 2: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ408 L3 Design Flow.2 © Θεοχαρίδης, ΗΜΥ, 2010

Design Process Steps (Review)

Definition of system requirements. Example: ISA (instruction set architecture) for CPU. Includes software and hardware interfaces including

timing. May also include cost, speed, reliability and

maintainability specifications.

Definition of system architecture. Example: high-level HDL (hardware description

language) representation - this is not required in ECE 408/664 but is done in the real world).

Useful for system validation and verification and as a basis for lower level design execution and validation or verification.

Page 3: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ408 L3 Design Flow.3 © Θεοχαρίδης, ΗΜΥ, 2010

Design Process Steps (Review)

Refinement of system architecture In manual design, descent in hierarchy, designing

increasingly lower-level components In synthesized design, transformation of high-level HDL to

“synthesizable” register transfer level (RTL) HDL

Logic design or synthesis In manual or synthesized design, development of logic

design in terms of library components Result is logic level schematic or netlist representation or

combinations of both. Both manual design or synthesis typically involve

optimization of cost, area, or delay.

Page 4: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ408 L3 Design Flow.4 © Θεοχαρίδης, ΗΜΥ, 2010

Design Process Steps (Review)

Implementation Conversion of the logic design to physical implementation Involves the processes of:

Mapping of logic to physical elements, Placing of resulting physical elements, And routing of interconnections between the elements.

In case of SRAM-based FPGAs, represented by the programming bitstream which generates the physical implementation in the form of CLBs, IOBs and the interconnections between them

Page 5: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ408 L3 Design Flow.5 © Θεοχαρίδης, ΗΜΥ, 2010

Design Process Steps (Review)

Validation (used at number of steps in the process) At architecture level - functional simulation of HDL At RTL level- functional simulation of RTL HDL At logic design or synthesis - functional simulation of gate-

level circuit - not usually done in ECE 408/664 At implementation - timing simulation of schematic, netlist or

HDL with implemention based timing information (functional simulation can also be useful here)

At programmed FPGA level - in-circuit test of function and timing

Page 6: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ408 L3 Design Flow.6 © Θεοχαρίδης, ΗΜΥ, 2010

Hardware design in general

Logic (RTL) designLogic simulationLogic debugging

RTL code(Verilog)

Placement & routing

Timing simulation

Timing analysis

Netlist &Gate delay

(SDF)GDSII

Semiconductor fabricationGDSII &Test-vector

Logic synthesis

Gate-level simulation

Gate-level debugging

Netlist(EDIF)

RTL code &Targetlibrary

Logic synthesis

Placement & routingFPGA

bit-stream

RTL code &FPGAlibrary

FPGABOARD

FPGADebugging

Logic synthesisRTL compilation

Placement & routing

H/W Platform

General Hardware Design Flow / Methodologies.

Page 7: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ408 L3 Design Flow.7 © Θεοχαρίδης, ΗΜΥ, 2010

Xilinx HDL/Core Design Flow

DESIGN ENTRY

CORE GENERATIONRTL HDL EDITING

RTL HDL-CORESIMULATION

SYNTHESIS

IMPLEMENTATION

TIMINGSIMULATION

FPGA PROGRAMMING& IN-CIRCUIT TEST

Page 8: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ408 L3 Design Flow.8 © Θεοχαρίδης, ΗΜΥ, 2010

Xilinx HDL/Core Design Flow - HDL Editing

Language Construct Templates

HDL EDITOR

DESIGN WIZARD LANGUAGE ASSISTANTAccessed within HDL Editor

RTL HDL Files

HDL Module Frameworks

Page 9: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ408 L3 Design Flow.9 © Θεοχαρίδης, ΗΜΥ, 2010

Xilinx HDL/core Design Flow – Core Generation

CORE GENERATOR

Select core and specify input parameters

HDL instantiation module for core_name

EDIF netlist for core_name

Other core_name files

Page 10: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ408 L3 Design Flow.10 © Θεοχαρίδης, ΗΜΥ, 2010

Xilinx HDL/core Design Flow - HDL Functional Simulation

Compile HDL Files

Waveforms or List Files

Set Up and Map work library RTL HDL Files

Test Inputs or Force Files

HDL instantiation module for core_name

EDIF netlists for core_names

Functional Simulate

Testbench HDL Files

HDLSIMULATOR

Page 11: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ408 L3 Design Flow.11 © Θεοχαρίδης, ΗΜΥ, 2010

All HDL Files

Gate/Primitive Netlist Files (EDIF or XNF)

Xilinx HDL Design Flow - Synthesis

Select Top Level

Select Target Device

Edit XST Synthesis Constraints

Synthesize

Synthesis/Implement-ation Constraints

Synthesis Report Files

EDIF netlists for core_names

XST

Page 12: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ408 L3 Design Flow.12 © Θεοχαρίδης, ΗΜΥ, 2010

Model Extraction

Xilinx HDL/core Design Flow - Implementation

Netlist Translation

Map

Place & Route

BIT File

Create Bitstream

Timing Model Gen

Gate/Primitive Netlist Files (XNF or EDN)

Standard Delay Format File

HDL or EDIF for Implemented Design

XILINX DESIGN MANAGER

Page 13: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ408 L3 Design Flow.13 © Θεοχαρίδης, ΗΜΥ, 2010

Xilinx HDL/core Design Flow- Timing Simulation

Test Inputs, Force Files

MODELSIM

Compile HDL Files

Waveforms or List Files

Set Up and Map work Directory

Compiled HDL

HDL Simulate

Standard Delay Format FileHDL or EDIF for Implemented Design

Testbench HDL Files

Page 14: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ408 L3 Design Flow.14 © Θεοχαρίδης, ΗΜΥ, 2010

Xilinx HDL Design Flow - Programming and In-circuit Verification

Bit File

FPGA Board

iMPACT

I/O Port

Input Byte

Human Inputs

Outputs

Page 15: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.15 © Θεοχαρίδης, ΗΜΥ, 2010

A Few Notes on Programming: Start up Sequence

° During start-up, the device performs four operations:1. The assertion of DONE. The failure of DONE to go

High may indicate the unsuccessful loading of configuration data.

2. The release of the Global Three State (GTS). This activates all the I/Os.

3. The release of the Global Set Reset (GSR). This allows all flip-flops to change state.

4. The assertion of Global Write Enable (GWE). This allows all RAMs and flip-flops to change state.

° By default, these operations are synchronized to CCLK.

° The entire start-up sequence lasts eight cycles, called C0-C7, after which the loaded design is fully functional.

Page 16: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.16 © Θεοχαρίδης, ΗΜΥ, 2010

Serial Load Configuration

° There are two serial configuration modes. ° Master Serial mode

• the FPGA controls the configuration process by driving CCLK as an output.

° Slave Serial mode• the FPGA passively receives CCLK as an input from an external

agent (e.g., a microprocessor, CPLD, or second FPGA in master mode) that is controlling the configuration process.

° In both modes, the FPGA is configured by loading one bit per CCLK cycle.

° The MSB of each configuration data byte is always written to the DIN pin first.

Page 17: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.17 © Θεοχαρίδης, ΗΜΥ, 2010

ASIC Design Flow

°ASIC• Application Specific Integrated Circuits• Custom design, usually from scratch or from pre-built components

• Chip performs a particular function• Typically NOT general purpose

°Front End Back End• Front End – Synthesis / Gate Level • Back End – Layout / Mask Generation

Page 18: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.18 © Θεοχαρίδης, ΗΜΥ, 2010

ASIC Design Flow – Typical flow

° ASIC Design Flow Steps

• Specifications• Early Planning• Architecture• Design• Synthesis• Pre-Layout Static Timing Analysis• Layout• Post-Layout Static Timing Analysis• Pads Placement• Sent for Manufacturing

VERIFICATION

Page 19: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.19 © Θεοχαρίδης, ΗΜΥ, 2010

Design Flow – Commercial (Example Tools)

Synopsys Design Compiler

Modelsim

Prime Time

Cadence Silicon Ensemble

Silicon Ensemble/Virtuoso

HDL Model

Verilog Gate Level / Netlist

Verilog Gate Level / Netlist

DEF File

DEF File

DEF File

VHDL / Verilog

Verilog Simulation

Static Timing Analysis

Standard Cell Placement and Routing

Post-Layout Static Timing Analysis

Pads Placement

Prime Time

RTL (Register Transfer Language)

Page 20: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.20 © Θεοχαρίδης, ΗΜΥ, 2010

Other Commercial Tools

° RTL Verification with Specman e ° Gate-level simulation with ModelSim ° Logic Synthesis with Synopsys Design Compiler ° Static Timing Analysis with Synopsys PrimeTime ° Placement and Routing with Cadence Silicon

Ensemble ° Running Silicon Ensemble in the GUI mode ° Clock Tree Generation with Cadence CTGen ° Integrating IP Block, DesignWare and Virage SRAM ° Power Estimation with Synopsys Power Compiler ° Code Revision Control with CVS

Page 21: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.21 © Θεοχαρίδης, ΗΜΥ, 2010

° Must understand specifications first° Start by looking it as black box

° e.g. Adder• F(X,Y) = X+Y• Takes two inputs, produces Sum of Inputs

Starting A Design

XY F(X,Y)

Page 22: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.22 © Θεοχαρίδης, ΗΜΥ, 2010

Starting A Design

° SPECS Architecture• Block Diagram• Brainstorming (if collaborating)• Feedback• I/O Specs• Architectural Decisions

- Frequency?- Latency?- Power/Performance?- Reliability?

• Architectural Optimizations• Finalizing the initial Design

Page 23: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.23 © Θεοχαρίδης, ΗΜΥ, 2010

From “Architecture” to RTL

° Create Block Diagram of Design - with sub-blocks if necessary

° Create I/O Specs for each block• e.g adder

- Sum generator– Takes three inputs, produces one output

- Carry generator– Takes three inputs, produces one output

- Interconnected?• Place box in functional order

- i.e. can’t generate sum after carry-in arrives!!!• Create pipeline flow

- i.e. IF IDIXICWB• Clocked signals/registers/latches

° Proceed then to code module by module

Page 24: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.24 © Θεοχαρίδης, ΗΜΥ, 2010

Hierarchical Design

•Multiple modules•Multiple instances

•Top-Level Design•Contains all sub-modules and connection information

•Sub-Modules can be hierarchically built themselves

Page 25: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.25 © Θεοχαρίδης, ΗΜΥ, 2010

HDL

° Hardware Description Language• Verilog, VHDL, SystemC, etc.

° High Level of Design Abstraction• ex:

- Input A, B- Output C- Architecture entity of adder is

C A + B

° Not going to talk in depth about HDL• Refer to multiple online resources

- www.deeps.org• Behavioral vs. Structural• Code Simulate or Code Synthesize (Compile) Simulate

Page 26: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.26 © Θεοχαρίδης, ΗΜΥ, 2010

HDL - Tools

° Code programming• Just a text editor!• Today, fancy text editors with syntax highlighting are available

for free (emacs, nedit, etc.)

° Simulation• Multiple free HDL Simulators for simple designs• State-of-the-art Simulators available at CSE

- Modelsim- NCVHDL- NCVerilog

• Not necessary synthesized code

° Synthesis (Compilation)• Neet a target library of “standard” cells (i.e. AND, XOR, ADDER,

etc.)• Synopsys Design Compiler

Page 27: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.27 © Θεοχαρίδης, ΗΜΥ, 2010

HDL Simulation / Verification

° Upon coding each block / module, we can then simulate its functionality

° Use an HDL / RTL Simulator• Event Driven • Cycle Driven

° Simulator reads code and models code functionality based on clock cycles or events, e.g.

• CA+B @posedge clk• CA+B after 10 ns

° Tools Available:• Modelsim, NCVHDL, NCVerilog, etc.

Page 28: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.28 © Θεοχαρίδης, ΗΜΥ, 2010

HDL Synthesis H/W

Timing Analysis

Routing

Placement

Synthesis

Page 29: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.29 © Θεοχαρίδης, ΗΜΥ, 2010

Why learning about Logic Synthesis?

° Logic synthesis is the core of today's CAD flows for IC and system design

• course covers many algorithms that are used in a broad range of CAD tools

• basis for other optimization techniques, e.g. embedded software• basis for functional verification techniques

° Most algorithms are computationally hard• covered algorithms and flows are good example for approaching

hard algorithmic problems• course covers theory as well as implementation details• demonstrates an engineering approaches based on theoretical

solid but also practical solutions- very few research areas can offer this combination

Page 30: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.30 © Θεοχαρίδης, ΗΜΥ, 2010

Design of Integrated Systems

System Level

Register Transfer Level

Gate Level

Transistor Level

Layout Level

Mask Level

Page 31: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.31 © Θεοχαρίδης, ΗΜΥ, 2010

System Level

° Abstract algorithmic description of high-level behavior

• e.g. C-Programming language

• abstract because it does not contain any implementation details for timing or data

• efficient to get a compact execution model as first design draft• difficult to maintain throughout project because no link to

implementation

Port*compute_optimal_route_for_packet(Packet_t *packet,

Channel_t *channel){static Queue_t *packet_queue;

packet_queue = add_packet(packet_queue, packet);...

}

Page 32: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.32 © Θεοχαρίδης, ΗΜΥ, 2010

RTL Level

° Cycle accurate model “close” to the hardware implementation

• bit-vector data types and operations as abstraction from bit-level implementation

• sequential constructs (e.g. if - then - else, while loops) to support modeling of complex control flow

module mark1;reg [31:0] m[0:8192];reg [12:0] pc;reg [31:0] acc;reg[15:0] ir;

alwaysbeginir = m[pc];if(ir[15:13] == 3b’000)

pc = m[ir[12:0]];else if (ir[15:13] == 3’b010)

acc = -m[ir[12:0]];...

endendmodule

Page 33: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.33 © Θεοχαρίδης, ΗΜΥ, 2010

Gate Level

° Model on finite-state machine level• models function in Boolean logic using registers and gates• various delay models for gates and wires

• in this lecture we will mostly deal with gate level

1ns

4ns3ns

5ns

Page 34: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.34 © Θεοχαρίδης, ΗΜΥ, 2010

Transistor Level

° Model on CMOS transistor level• depending on application function modeled as resistive

switches- used in functional equivalence checking

• or full differential equations for circuit simulation- used in detailed timing analysis

Page 35: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.35 © Θεοχαρίδης, ΗΜΥ, 2010

Layout Level

° Transistors and wires are laid out as polygons in different technology layers such as diffusion, poly-silicon, metal, etc.

Page 36: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.36 © Θεοχαρίδης, ΗΜΥ, 2010

Design of Integrated SystemsR

elat

ive

Effo

rt

Project Time

System

RTL

Logic

- Design phases overlap to large degrees- Parallel changes on multiple levels, multiple teams- Tight scheduling constraints for product

Transistor

Page 37: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.37 © Θεοχαρίδης, ΗΜΥ, 2010

Design Challenges

° Systems are becoming huge, design schedules are getting tighter

• > 100 Mio gates becoming common for ASICs• > 0.4 Mio lines of C-code to describe system behavior• > 5 Mio lines of RLT code

° Design teams are getting very large for big projects• several hundred people• differences in skills• concurrent work on multiple levels• management of design complexity and communication very difficult

° Design tools are becoming more complex but still inadequate• typical designer has to run ~50 tools on each component• tools have lots of bugs, interfaces do not line up etc.

Page 38: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.38 © Θεοχαρίδης, ΗΜΥ, 2010

Design Challenges

° Decision about design point very difficult• compromise between performance / costs / time-to-market• decision has to be made 2-3 years before design finished• design points are difficult to predict without actually doing the

design• scheduling of product cycles

° Functional verification • simulation still main vehicle for functional verification but

inadequate because of size of design space• results in bugs in released hardware that is very expensive to

recover from (different in software ;-)

Page 39: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.39 © Θεοχαρίδης, ΗΜΥ, 2010

Design Challenges

° Fundamental tradeoffs between different modeling levels:

• modeling detail and team size to maintain model- high-level models can be maintained by one or two people- detailed models need to be partitioned which results in a

significant communication overhead• modeling accuracy versus modeling compactness

- compact models omit details and give only crude estimations for implementation

- detailed models are lengthy and difficult to adopt for major changes in design points

• simulation speed versus hardware performance- high-level models can be simulated fast but cannot be

implemented efficiently with automatic means- low-level models can be made to have a fast

implementation but cannot be simulated very fast

Page 40: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.40 © Θεοχαρίδης, ΗΜΥ, 2010

General Design Approach

° How do engineers build a bridge?

° Divide and conquer !!!!• partition design problem into many sub-problems which are

manageable• define mathematical model for sub-problem and find an

algorithmic solution- beware of model limitations and check them !!!!!!!

• implement algorithm in individual design tools, define and implement general interfaces between the tools

• implement checking tools for boundary conditions• concatenate design tools to general design flows which can be

managed• see what doesn’t work and start over

Page 41: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.41 © Θεοχαρίδης, ΗΜΥ, 2010

Design Automation

° Design Automation is one of the most advanced areas in practical computer science

• many problems require sophisticated mathematical modeling• many algorithms are computationally hard and require advanced and

fine-tuned heuristics to work on realistic problem sizes• boundary conditions need to be well declared and synchronized

between different tools (patchwork to cover all wholes)

° Two common pitfalls in CAD research• problem is looking for a solution:

- problem scope is too big, makes modeling difficult or algorithms don’t scale

- problem scope is too small, solutions are not good enough• solution is looking for a problem:

- model was oversimplified because real problem was too complex with too many boundary conditions

Page 42: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.42 © Θεοχαρίδης, ΗΜΥ, 2010

Key to Success

° Fine-tuned combination of Design Methodology and Tools

• addresses algorithmic complexity by requiring- manual partitioning of the problem- manual input of hints/suggestions- manual iterations to drive tool application to best solution

• makes CAD systems and design flows very complex and difficult to manage

Problem space Tools applicable

Practical combination through design methodology

Page 43: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.43 © Θεοχαρίδης, ΗΜΥ, 2010

Examples of Divide and Conquer

° RLT cycle simulation does only evaluate the next state logic of the circuits, timing is assumed to be correct

• combination of static timing analysis, formal equivalence checking, and cycle simulation allows separation of issues

• cycle simulation avoids expensive event scheduling and processing and performs significantly faster

° However:• timing analysis is conservative with respect to the achievable

clock cycle time

Page 44: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.44 © Θεοχαρίδης, ΗΜΥ, 2010

Examples of Divide and Conquer

° Static timing analysis assumed simple gate delay models

• complexity of static timing analysis becomes linear (simple longest and shortest paths analysis in circuit implementation)

• very efficient implementation of incremental static timing analysis which is needed in the inner loop of the technology dependent part of logic synthesis

° However:• actual gate delay varies a lot in reality

- models often assume average fan-out rather than actual gate load

• delay model assumes ideal signals- slew dependency ignored

Page 45: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.45 © Θεοχαρίδης, ΗΜΥ, 2010

Examples of Divide and Conquer

° Logic synthesis assumes ideal gates which are independent of physical environment

• standard cell place and route technology has made logic synthesis possible

- gates are heavily over-designed to be functional in a wide variety of combinations (e.g. range of fan-out gates possible, different wire loads

- layout placement and route done in standard rows that minimize latch-up effects and optimize power and clock wiring

° However:- layout implementation remains sub-optimal because cells

are designed for worst case application and with large safety margins with respect to environment

Page 46: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.46 © Θεοχαρίδης, ΗΜΥ, 2010

Examples of Divide and Conquer

° Logic synthesis uses crude model to estimate circuit area

- literal count or simple table-lookup for gates sizes allows fast comparison of different implementation choices

° However:- actual gate size can vary to a very large degree depending

on load and timing requirement- area for wiring completely ignored

Page 47: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.47 © Θεοχαρίδης, ΗΜΥ, 2010

Examples of Divide and Conquer

° Formal equivalence checking assumes identical state encoding of the two designs to be compared

• reduces the general equivalence checking problem to combinational equivalence checking which is computationally less complex

• exploitation of structural similarities between designs to be compared makes tools applicable for huge (multi-million gate) designs

• automatic algorithms for identifying register correspondence compensate to some extent for limited model

° However:• combinational verification model cannot handle sequential

verification problems

Page 48: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.48 © Θεοχαρίδης, ΗΜΥ, 2010

Full Custom Design Flow

° Application: ultra-high performance designs • general-purpose processors, DSPs, graphic chips, internet

routers, games processors etc.

° Target: very large markets with high profit margins• e.g. PC business

° Complexity: very complex and labor intense• involving large teams• high up-front investments and relatively high risks

° Role of Logic Synthesis:• limited to components that are not performance critical or that

might change late in design cycle (due to designs bugs found late)

- control logic- non-critical data paths logic

• bulk of data-path components and fast control logic are manually crafted for optimal performance

Page 49: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.49 © Θεοχαρίδης, ΗΜΥ, 2010

Full Custom Design Flow

ISA Specification

RTL Spec

Gate Level Netlist

Transistor Level Circuit

Layout

Circuit Simulation

Simulation

Design Rule Checker

FormalEquivalence

Checking

Simulation

Logic Synthesis

Manual or semi-automatic

Design

Extract&Compare

° Incomplete picture:

Page 50: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.50 © Θεοχαρίδης, ΗΜΥ, 2010

ASIC Design Flow

° Application: general IC market• peripheral chips in PCs, toys, handheld devices etc.

° Target: small to medium markets, tight design schedules

• e.g. consumer electronics

° Complexity of design: standard design style, quite predictable

• standard flows, standard off-the-shelf tools

° Role of Logic Synthesis:• used on large fraction of design except for special blocks such

as RAM’s, ROM’s, analog components

Page 51: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.51 © Θεοχαρίδης, ΗΜΥ, 2010

ASIC Design Flow

Informal Specification

RTL Spec

Gate Level Netlist

Modifies Gate Level Netlist Static Timing Analysis

FormalEquivalence

Checking

Simulation

Logic Synthesis

Manual Changesto fix timing

° Incomplete picture:

ASIC FoundryTest Logic Insertion

Page 52: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.52 © Θεοχαρίδης, ΗΜΥ, 2010

What is Logic Synthesis?

D

X Yλδ

Given: Finite-State Machine F(X,Y,Z, , ) where:λ δX: Input alphabetY: Output alphabetZ: Set of internal states

: X x Z Z (next state function): X x Z Y (output function)

λδ

Target: Circuit C(G, W) where:

G: set of circuit components g {Boolean gates,flip-flops, etc}

W: set of wires connecting G

Page 53: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.53 © Θεοχαρίδης, ΗΜΥ, 2010

Objective Function for Synthesis

° Minimize area• in terms of literal count, cell count, register count, etc.

° Minimize power• in terms of switching activity in individual gates, deactivated

circuit blocks, etc.

° Maximize performance• in terms of maximal clock frequency of synchronous systems,

throughput for asynchronous systems

° Any combination of the above• combined with different weights• formulated as a constraint problem

- “minimize area for a clock speed > 300MHz”

° More global objectives• feedback from layout

- actual physical sizes, delays, placement and routing

Page 54: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.54 © Θεοχαρίδης, ΗΜΥ, 2010

Constraints on Synthesis

° Given implementation style:• two-level implementation (PLA, CAMs)• multi-level logic• FPGAs

° Given performance requirements• minimal clock speed requirement• minimal latency, throughput

° Given cell library• set of cells in standard cell library• fan-out constraints (maximum number of gates connected to

another gate)• cell generators

Page 55: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.55 © Θεοχαρίδης, ΗΜΥ, 2010

Why learn HDL coding styles for FPGAs?° HDLs contain many complex constructs that are

difficult to understand at first.° Methods and examples included in HDL manuals do

not always apply to the design of FPGA devices. ° If you currently use HDLs to design ASICs, your

established coding style may unnecessarily increase the number of gates or CLB levels in FPGA designs

° HDL synthesis tools implement logic based on the coding style of your design.

Page 56: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.56 © Θεοχαρίδης, ΗΜΥ, 2010

Naming Convention - Restrictions° The following FPGA resource names are reserved

and should not be used to name nets or components. • Components (Comps), Configurable Logic Blocks (CLBs),

Input/Output Blocks (IOBs), Slices, basic elements (bels), clock buffers (BUFGs), tristate buffers (BUFTs), oscillators (OSC), CCLK, DP, GND, VCC, and RST

• CLB names such as AA, AB, SLICE_R1C2, SLICE_X1Y2, X1Y2, and R1C2

• Primitive names such as TD0, BSCAN, M0, M1, M2, or STARTUP• Do not use pin names such as P1 and A4 for component names• Do not use pad names such as PAD1 for component names

Page 57: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.57 © Θεοχαρίδης, ΗΜΥ, 2010

Use optional labels on flow control constructs

° Make the code structure more obvious ° Can slow execution in some simulators

/* Changing Latch into a D-Register * D_REGISTER.V */

module d_register (CLK, DATA, Q);

input CLK;

input DATA;

output Q;

reg Q;

always @ (posedge CLK)

begin: My_D_Reg

Q <= DATA;

end

endmodule

Page 58: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.58 © Θεοχαρίδης, ΗΜΥ, 2010

Coding for Synthesis

° Omit the Wait for XX ns Statement• XX specifies the number of nanoseconds that must pass before a

condition is executed. • VHDL: wait for XX ns;• Verilog: #XX;

° Omit the ...After XX ns or Delay Statement• VHDL

(Q <=0 after XX ns)• Verilog

assign #XX Q=0;• This statement is usually ignored by the synthesis tool. In this

case, the functionality of the simulated design does not match the functionality of the synthesized design.

Page 59: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.59 © Θεοχαρίδης, ΗΜΥ, 2010

Coding for Synthesis

° Omit Initial Values• VHDLsignal sum : integer := 0;• Veriloginitial sum = 1’b0;

° Order and Group Arithmetic Functions• ADD = A1 + A2 + A3 + A4;

cascades three adders in series.• ADD = (A1 + A2) + (A3 + A4);

two additions are evaluated in parallel and the results are combined with a third adder.

• RTL simulation results are the same for both statements,• however, the second statement results in a faster circuit after

synthesis (depending on the bit width of the input signals).• When is second construct preferred ?

Page 60: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.60 © Θεοχαρίδης, ΗΜΥ, 2010

Coding for Synthesis

For example, if the A4 signal reaches the adder later than the other signals, the first statement produces a faster implementation because the cascaded structure creates fewer logic levels for A4.

This structure allows A4 to catch up to the other signals. In this case, A1 is the fastest signal followed by A2 and A3; A4 is the slowest signal.

° Most synthesis tools can balance or restructure the arithmetic operator tree if timing constraints require it.

° However, Xilinx® recommends that you code your design for your selected structure.

Page 61: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.61 © Θεοχαρίδης, ΗΜΥ, 2010

Comparing If Statement vs.Case Statement° If statement generally produces priority-encoded logic ° Case statement generally creates balanced logic.° Use the Case statement for complex decoding and use the If

statement for speed critical paths.° Make sure that all outputs are defined in all branches of an if

statement. • If not, it can create latches or long equations on the CE signal. • Have default values for all outputs before the if statements.

° Limiting the number of input signals into an if statement can reduce the number of logic levels.

° If there are a large number of input signals, see if some of them can be pre-decoded and registered before the if statement.

° Avoid bringing the dataflow into a complex if statement. ° Only control signals should be generated in complex if-else

statements.

Page 62: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.64 © Θεοχαρίδης, ΗΜΥ, 2010

Implementation

Page 63: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.65 © Θεοχαρίδης, ΗΜΥ, 2010

Case vs IF

° Case implementation requires only one Virtex™ slice while the If construct requires two slices in some synthesis tools.

° In this case, design the multiplexer using the Case construct because fewer resources are used and the delay path is shorter.

Page 64: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.66 © Θεοχαρίδης, ΗΜΥ, 2010

Example – From XCELL

° Verilog designs that use the CASE construct with the NESTED IF to more effectively describe the same function.

° The CASE construct reduces the delay by approximately 3 ns (using an XC4005E-2 part)

Source:http://www.xilinx.com/xcell/xl30/xl30_21.pdf

Page 65: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.67 © Θεοχαρίδης, ΗΜΥ, 2010

From IF construct

Page 66: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.68 © Θεοχαρίδης, ΗΜΥ, 2010

From Case construct

Page 67: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.69 © Θεοχαρίδης, ΗΜΥ, 2010

Implementing Latches and Registers

° Synthesizers infer latches from incomplete conditional expressions, such as an If statement without an Else clause.

° This can be problematic for FPGA designs because not all FPGA devices have latches available in the CLBs.

° In addition, you may think that a register is created, and the synthesis tool actually created a latch.

° The Spartan-II™, Spartan-3™ and Virtex™, Virtex-E™, Virtex-II™, Virtex-II Pro™ and Virtex-II Pro X™ FPGA devices do have registers that can be configured to act as latches.

° For these devices, synthesizers infer a dedicated latch from incomplete conditional expressions.

Page 68: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.70 © Θεοχαρίδης, ΗΜΥ, 2010

D Latch

module d_latch (GATE, DATA, Q);

input GATE;

input DATA;

output Q;

reg Q;

always @ (GATE or DATA)

begin

if (GATE == 1'b1)

Q <= DATA;

end // End Latch

endmodule

Page 69: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.71 © Θεοχαρίδης, ΗΜΥ, 2010

D register

module d_register (CLK, DATA, Q);

input CLK;

input DATA;

output Q;

reg Q;

always @ (posedge CLK)

begin: My_D_Reg

Q <= DATA;

end

endmodule

Page 70: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.72 © Θεοχαρίδης, ΗΜΥ, 2010

How to handle latches?

° With some synthesis tools you can determine the number of latches that are implemented in your design.

° You should convert all If statements without corresponding Else statements and without a clock edge to registers.

° Use the recommended register coding styles in the synthesis tool documentation to complete this conversion.

Page 71: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.73 © Θεοχαρίδης, ΗΜΥ, 2010

Resource Sharing

° Resource sharing is an optimization technique that uses a single functional block (such as an adder or comparator) to implement several operators in the HDL code.

° Use resource sharing to improve design performance by reducing the gate count and the routing congestion.

° If you do not use resource sharing, each HDL operation is built with separate circuitry.

° However, you may want to disable resource sharing for speed critical paths in your design.

Page 72: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.74 © Θεοχαρίδης, ΗΜΥ, 2010

Resource Sharing

module res_sharing (A1, B1, C1, D1, COND_1, Z1);input COND_1;

input [7:0] A1, B1, C1, D1;

output [7:0] Z1;

reg [7:0] Z1;

always @(A1 or B1 or C1 or D1 or COND_1)

begin

if (COND_1)

Z1 <= A1 + B1;

else

Z1 <= C1 + D1;end

endmodule

Page 73: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.75 © Θεοχαρίδης, ΗΜΥ, 2010

With and Without Resource Sharing

Page 74: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.76 © Θεοχαρίδης, ΗΜΥ, 2010

Resource Sharing

° The following operators can be shared either with instances of the same operator or with an operator on the same line.

• *• + –• > >= < <=

° For example, a + operator can be shared with instances of other + operators or with – operators.

° A * operator can be shared only with other * operators.

Page 75: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.77 © Θεοχαρίδης, ΗΜΥ, 2010

Resource Sharing

° You can implement arithmetic functions (+, –, magnitude comparators) with gates or with your synthesis tool’s module library.

° The library functions use modules that take advantage of the carry logic in CLBs/slices.

° Resource sharing of the module library automatically occurs in most synthesis tools if the arithmetic functions are in the same process.

° Resource sharing adds additional logic levels to multiplex the inputs to implement more than one function.

• Do not use it for arithmetic functions that are part of your design’s time critical path.

Page 76: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.78 © Θεοχαρίδης, ΗΜΥ, 2010

Using Preset Pin or Clear Pin

° Xilinx® FPGA devices consist of CLBs that contain function generators and flip-flops. Spartan-II™, Spartan-3™ , Virtex™, Virtex-E™, Virtex-II™, Virtex-II Pro™ and Virtex-II Pro X™ registers can be configured to have either or both preset and clear pins.

Page 77: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.79 © Θεοχαρίδης, ΗΜΥ, 2010

FlipFlopmodule ff_example( RESET, SET, CLOCK, ENABLE;D_IN;

A_Q_OUT; B_Q_OUT; C_Q_OUT; D_Q_OUT; E_Q_OUT);input RESET;input SET;input CLOCK;input ENABLE;input [7:0] D_IN;output [7:0] A_Q_OUT;output [7:0] B_Q_OUT;output [7:0] C_Q_OUT;output [7:0] D_Q_OUT;output [7:0] E_Q_OUT;

// D flip-flopalways @(posedge CLOCK)beginA_Q_OUT <= D_IN;

end // End FF

Page 78: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.80 © Θεοχαρίδης, ΗΜΥ, 2010

Asynchronous Reset

always @(posedge CLOCK || posedge RESET)beginif (RESET == 1'b1)B_Q_OUT <= “00000000”;else if (CLOCK == 1'b1)B_Q_OUT <= D_IN;end

Page 79: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.81 © Θεοχαρίδης, ΗΜΥ, 2010

Asynchronous Set

always @(posedge CLOCK || posedge SET)beginif (SET == 1'b1)C_Q_OUT <= “11111111”;else if (CLOCK == 1'b1)C_Q_OUT <= D_IN;end

Page 80: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.82 © Θεοχαρίδης, ΗΜΥ, 2010

What is this ?

always @(posedge CLOCK || posedge RESET)beginif (RESET == 1'b1)D_Q_OUT <= “00000000”;else if (CLOCK == 1'b1)beginif (ENABLE == 1'b1)D_Q_OUT <= D_IN;endend

Page 81: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.83 © Θεοχαρίδης, ΗΜΥ, 2010

Answer

° Flip-flop with asynchronous reset and clock enable

Page 82: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.84 © Θεοχαρίδης, ΗΜΥ, 2010

Flip-flop with asynchronous reset; asynchronous set and clo

always @(posedge CLOCK || posedge RESET || posedge SET)begin

if (RESET == 1'b1)E_Q_OUT <= "00000000";else if (SET == 1'b1)E_Q_OUT <= "11111111";else if (CLOCK == 1'b1)beginif (ENABLE == 1'b1)E_Q_OUT <= D_IN;end

end

Page 83: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.85 © Θεοχαρίδης, ΗΜΥ, 2010

Using Clock Enable Pin Instead of Gated Clocks

° Use the CLB clock enable pin instead of gated clocks in your designs. Gated clocks can introduce glitches, increased clock delay, clock skew, and other undesirable effects

Page 84: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.86 © Θεοχαρίδης, ΗΜΥ, 2010

Gated Clockmodule gate_clock(IN1, IN2, DATA, CLK,LOAD,OUT1);input IN1;input IN2;input DATA;input CLK;input LOAD;output OUT1;reg OUT1;wire GATECLK;assign GATECLK = (IN1 & IN2 & CLK);always @(posedge GATECLK)beginif (LOAD == 1'b1)OUT1 <= DATA;endendmodule

Page 85: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.87 © Θεοχαρίδης, ΗΜΥ, 2010

Gated Clock

BAD IDEA !!

Page 86: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.88 © Θεοχαρίδης, ΗΜΥ, 2010

Clock Enable

module clock_enable (IN1, IN2, DATA, CLK, LOAD, DOUT);input IN1, IN2, DATA;input CLK, LOAD;output DOUT;wire ENABLE;reg DOUT;assign ENABLE = IN1 & IN2 & LOAD;always @(posedge CLK)beginif (ENABLE)DOUT <= DATA;endendmodule

Page 87: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.89 © Θεοχαρίδης, ΗΜΥ, 2010

Clock Enable

Page 88: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.90 © Θεοχαρίδης, ΗΜΥ, 2010

HDL Gate Level (Synthesis)

Synopsys Design Compiler

This tool will compile a HDL model to a netlist (Gate Level) model. It requires the use of the following user-provided files:- Library file, in .db format.- Script file (file extension .script)- The HDL file to be synthesized.

At this stage, we will be using Design Compiler to generate a gate level model of an HDL file

Page 89: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.91 © Θεοχαρίδης, ΗΜΥ, 2010

Overview

Synopsys Design Compiler (DC) shell was derived from one of the public domain shell programs. It is possible that the shell is Kornshell, which is a public domain program with source code available.

Like any other shells (such as C-shell), a user can issue both simplecommands and complex command sequences (a script).

DC shell commands can be categorized into the following: OS level commands and interface (ls, cd, sh and etc) Synthesis commands (read, write, link and compile) User defined commands using existing DC shell commands

DC shell commands can also be used to make minor design modifications either before or after synthesis, such as TAP (Test Access Port) controller insertion.

Page 90: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.92 © Θεοχαρίδης, ΗΜΥ, 2010

DC Objects

° Design objects are divided into the following categories:

• clock – user defined clock group name (using “create_clock –name ck” command). Not the clock pins in the design.

• port - This refers to the top level pins on the design.• pin - Refers to the ports on all the instances in the current design.• reference – Lists all the sub-designs referenced by the current design.• cell – Lists all the instances under the current design• net - Lists all the nets under the current design• library - Lists all the libraries in memory.• file – Lists all the .db files associated with library or designs in

memory.

° The find command can search objects by category. ° A wild card character “*” can be used:

find (cell, “*”)find (port, “*”)

Page 91: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.93 © Θεοχαρίδης, ΗΜΥ, 2010

DC Commands

° Design compiler contains many commands. The most commonly used ones are: read, link, compile, write and etc.

° To get a list of all dc_shell commands:dc_shell> list –commands

° To get a list of options for a dc_shell command, enter the command with an unknown option to get a short help line:dc_shell> compile –abc

° To get detailed help on an option, use the help command:dc_shell> help compile

° The help command list usage examples as well.

Page 92: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.94 © Θεοχαρίδης, ΗΜΥ, 2010

DC Attributes° Design Compiler uses attributes to keep object

property information. ° Attributes can be attached to the following objects:

• cell• clock• design• net• pin• port• reference

° Use the command “help attributes” to list all the attributes available under dc_shell.

° There is a very large number of attributes available for design compiler.

Page 93: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.95 © Θεοχαρίδης, ΗΜΥ, 2010

Commonly Used DC Variable Categories

• The DC predefined variables are divided into the following categories:– atpg : variables that affect test vector generation– bc : behavioral compiler variables– compile : compile command related variables– eco : variables affect ECO compiler– hdl : variables affecting hdl read/write/hdl synthesis– io : read, read_lib and write commands variables– jtag: jtag related variables– timing : timing analysis related variables– system : global system variables– test : DFT compiler variables– power : power compiler related variables– schematic: schematic generation related variables

– Use “list –variables category_name” to get a list of variables for the category.

Page 94: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.96 © Θεοχαρίδης, ΗΜΥ, 2010

Design Analyzer

° GUI Version of Design Compiler

° Can Start by: % design_analyzer

Page 95: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.97 © Θεοχαρίδης, ΗΜΥ, 2010

Back to Design Compiler

° What we need:• Environment Setup

° Again, see me after class for environment setups

#Cadence tools for primetime and silicon ensembleset path=($path /home/software/cadence/dsmse/tools/dsm/bin /home/software/cadence/ic/tools/dfII/bin /home/software/cadence/dsmse/tools/pbs/bin /home/software/cadence/tools/bin)

if ( $?LD_LIBRARY_PATH ) thensetenv LD_LIBRARY_PATH /home/software/cadence/ic/tools/lib:${LD_LIBRARY_PATH}

elsesetenv LD_LIBRARY_PATH /home/software/cadence/ic/tools/lib

endif

# Synopsys CAD Toolsif (-f $confdir/cshrc.synopsys) source $confdir/cshrc.synopsys

Page 96: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.98 © Θεοχαρίδης, ΗΜΥ, 2010

Synopsys Design Compiler

•The HDL file that we will be using is top_module.v•The file describes the module top_module•Using a script file with Design Compiler, we will synthesize this file to a gate level netlist.•The next page shows the script file used to compile top_module.v•The script file is called synthesize.script

Page 97: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.99 © Θεοχαρίδης, ΗΜΥ, 2010

read -format verilog "../behavioral/top_module.v"create_clock -period 4.0 clkreal_inputs = all_inputs() - clk - rstset_input_delay -clock clk -max 2.0 real_inputsset_output_delay -clock clk -max 2.0 all_outputs()set_clock_latency -rise .1 {clk}set_clock_latency -fall .1 {clk}set_clock_uncertainty -hold 0.2 {clk}set_clock_uncertainty -setup 0.2 {clk}linkuniquifyset_dont_use { mg75m_typical/com4 }set_max_fanout 12 top_modulecompile -map_effort high -area_effort high report_area report_timing -path full -delay max -max_paths 1 -nworst 1 -true set_input_delay -clock clk -max 0 all_inputs()set_output_delay -clock clk -max 0 all_outputs()link\nuniquify\nset_max_area $2true_delay_prove_false_backtrack_limit = 2000true_delay_prove_true_backtrack_limit = 2000write -hierarchy -output $1.dbreport_powerwrite -format verilog -hierarchy -output "../gate/top_module_gate.v"exit

Page 98: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.100 © Θεοχαρίδης, ΗΜΥ, 2010

Running Design Compiler

Why use a script file? Using a script file with dc_shell is equivalent to typing the

exact commands in dc_shell interactively. A script file automates the process of typing in all the

commands manually.

What is the .db file for?

° The database (.db) file holds information about the standard cell library used to implement the HDL design.

° It provides information about the standard cells: the names of the standard cells, input/output ports, as well as timing characteristics and functionality.

Page 99: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.101 © Θεοχαρίδης, ΗΜΥ, 2010

Design Compiler

1. Copy your design to a synthesis (/behavioral) directory (one for each stage, better managed that way)

2. Ensure that your environment is setup3. Type dc_shell –f synthesize.script4. The ‘-f’ option tells design compiler to use a

script file, and not run in interactive mode.5. After design compiler finishes (it should take

some time to finish the compilation), a verilog gate netlist file called top_module_gate.vshould be created in the directory /gate.

° Note: If a verilog gate level netlist file with the same name exists in the target directory, design compiler will overwrite it.

Page 100: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.102 © Θεοχαρίδης, ΗΜΥ, 2010

Post Synthesis

° Synthesis also gives us Power, Area and Speed• Optimize Code for Power, Area, Speed!!!

° DC Reports critical paths° Optimize code for reducing length of critical

path° Re-Synthesize until optimal results.° Now that we have a verilog gate file, our next

step would be to simulate the verilog gate level netlist to check for errors.

° We use ModelSim again!° If functional design is verified, we move on.

Page 101: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.103 © Θεοχαρίδης, ΗΜΥ, 2010

Static Timing Analysis(STA)

° STA is a method for determining if a design meets timing constraints w/o having to simulate the design with vectors.

° STA consists of three major steps:• Break down the design into timing paths (R-R, PI-R,PI-PO & R-PO).• Delay of each path is calculated• All path delays are checked against timing constraints to see if it is met.

° STA advantage• Speed (orders of magnitude faster than dynamic simulation)• Capacity to handling full chip• Exhaustive timing coverage• Vectors are not required

° STA disadvantage• It is pessimistic (too conservative)• Reports false paths

Page 102: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.104 © Θεοχαρίδης, ΗΜΥ, 2010

Synopsys PrimeTime (PT)° PT is a stand-alone full chip, gate-level static timing analyzer. ° The tool was intended to replace the popular motive timing

verification from it obtained from Viewlogic.° PT was jointly developed by IBM and Synopsys. ° The major advantage of PT is speed and the ability to handle much

larger designs since it does not have the overhead of Synthesis models (compare to DesignTime under design compiler).

° The major disadvantage of PT is the additional step it takes to setup the environment again after synthesis.

° PT gives similar results as DesignTime & is considered to be sign-off quality STA.

Page 103: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.105 © Θεοχαρίδης, ΗΜΥ, 2010

STA in a Typical ASIC Design Flow

HDL HDL Simulation

Pass?

HDL Synthesis

Func. Sim. & STA

Physical Implementation

Netlist

Pass?

Pass?

Tim Sim & STA

no

no

no

physical Verification& fabrication

Setup & import design

Set timing Environment

Perform Timing Analysis

Timing model generation

Static Timing Analysis(STA)

UpdatedNetlist R & C

SDF

Page 104: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.106 © Θεοχαρίδης, ΗΜΥ, 2010

Design Compiler (DC) to PrimeTime(PT) interface

° DesignTime (DT) under DC for pre-layout STA.

° Use PT for post-layout STA.° The utility “transcript” can be used to

convert DC script to PT script:

% transcript dc.scr dc.scr.pt

° Not all DC commands are applicable or convertible to PT commands, but it provides a good starting point.

° DC & PT share the same libraries.

.DBlibs

DC

P & R

PT

Netlist

Designconstraints

UpdatedNetlist R & C SDF

TimingReport

Page 105: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.107 © Θεοχαρίδης, ΗΜΥ, 2010

PrimeTime

° You can either run PrimeTime under pt_shell or under the GUI. To start the pt_shell mode, use the following command line option:

% pt_shell (for ASCII interface) or% primetime (for GUI interface, or pt_shell –gui)

° PT can be initialized with an INI file: “.synopsys_pt.setup”. Its precedence: $SYNOPSYS/admin/setup, ~user & current project.

Page 106: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.108 © Θεοχαρίδης, ΗΜΥ, 2010

Setting up Timing Environment

° Timing environment refers to the condition the chip under analysis has to operate under. It includes the following:

• Clock frequency and waveform• Input rise/fall times (also referred to as slew rate)• Output loading• input arrival times• output departure times

° All the above five types of timing environment constraints are required for verification mode.

° Only Input rise/fall times and output loading are required for timing model extraction or delay calculation.

Page 107: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.109 © Θεοχαρίδης, ΗΜΥ, 2010

PLACEMENT AND ROUTING

Post-Synthesis Implementation

Page 108: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

FPGA-Based System Design: Chapter 4 Copyright 2004 Prentice Hall PTR

ΗΜΥ664 Δ03 Design Flow.110 © Θεοχαρίδης, ΗΜΥ, 2010

Placement and routing

Two critical phases of layout design:– placement of components on the chip;– routing of wires between components.

Placement and routing interact, but separating layout design into phases helps us understand the problem and find good solutions.

Page 109: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.111 © Θεοχαρίδης, ΗΜΥ, 2010

Placement metrics

° Quality metrics for layout:• Area• Delay• Energy consumption

° Ideally placement and routing would be performed together • Both problems are NP-hard• For practical considerations placement and routing must be

performed separately

° Design time may be important for FPGAs

Page 110: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

FPGA-Based System Design: Chapter 4 Copyright 2004 Prentice Hall PTR

ΗΜΥ664 Δ03 Design Flow.112 © Θεοχαρίδης, ΗΜΥ, 2010

Wire length as a quality metric

bad placement good placement

Page 111: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

FPGA-Based System Design: Chapter 4 Copyright 2004 Prentice Hall PTR

ΗΜΥ664 Δ03 Design Flow.113 © Θεοχαρίδης, ΗΜΥ, 2010

Wire length measures

Estimate wire length by distance between components.

Possible distance measures:– Euclidean distance (sqrt(x2 +

y2));– Manhattan distance (x + y).

Multi-point nets must be broken up into trees for good estimates.

Euclidean

Manhattan

Page 112: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

FPGA-Based System Design: Chapter 4 Copyright 2004 Prentice Hall PTR

ΗΜΥ664 Δ03 Design Flow.114 © Θεοχαρίδης, ΗΜΥ, 2010

Wiring trees

Steiner point

Page 113: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

FPGA-Based System Design: Chapter 4 Copyright 2004 Prentice Hall PTR

ΗΜΥ664 Δ03 Design Flow.115 © Θεοχαρίδης, ΗΜΥ, 2010

Placement techniques

Can construct an initial solution, improve an existing solution.Pairwise interchange is a simple

improvement metric:– Interchange a pair, keep the swap if it helps

wire length.– Heuristic determines which two components to

swap.

Page 114: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

FPGA-Based System Design: Chapter 4 Copyright 2004 Prentice Hall PTR

ΗΜΥ664 Δ03 Design Flow.116 © Θεοχαρίδης, ΗΜΥ, 2010

Placement by partitioning

Works well for components of fairly uniform size.Partition netlist to minimize total wire

length using min-cut criterion.Partitioning may be interpreted as 1-D or 2-

D layout.

Page 115: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

FPGA-Based System Design: Chapter 4 Copyright 2004 Prentice Hall PTR

ΗΜΥ664 Δ03 Design Flow.117 © Θεοχαρίδης, ΗΜΥ, 2010

Recursive partitioning

Page 116: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

FPGA-Based System Design: Chapter 4 Copyright 2004 Prentice Hall PTR

ΗΜΥ664 Δ03 Design Flow.118 © Θεοχαρίδης, ΗΜΥ, 2010

Min-cut bisecting partitioning

partition 1 partition 2

AB

C D

3 nets

1 net

Page 117: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

FPGA-Based System Design: Chapter 4 Copyright 2004 Prentice Hall PTR

ΗΜΥ664 Δ03 Design Flow.119 © Θεοχαρίδης, ΗΜΥ, 2010

Min-cut bisecting partitioning, cont’d

Swapping A and B:– B drags 1 net;– A drags 3 nets;– total cut increase: 3 nets.

Conclusion: probably not a good swap, but must be compared with other pairs.

Page 118: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.120 © Θεοχαρίδης, ΗΜΥ, 2010

Before Placement: Clustering° Need to group BLEs into

groups° Goals:

• Minimize number of clusters

• Minimize inter-cluster wiring

• Minimize critical path (timing-driven)

° How do we do this• Take advantage of cluster

architecture

Page 119: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.121 © Θεοχαρίδης, ΗΜΥ, 2010

PO1

PO2

PO3

PI1

PI2

PI3

1

3

1

4

6

44

6

6

5

5

7

4

netlist with delay for each gate

Timing Analysis

PO1

PO2

PO3

PI1

PI2

PI3

1

3

1

4

6

44

6

6

5

5

7

4

arrival times

0

0

0

1

3

1

7

9

7

7

13

15

14

18

22

18

Source: David Pan

Page 120: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.122 © Θεοχαρίδης, ΗΜΥ, 2010

PO1

PO2

PO3

PI1

PI2

PI3

1

3

1

4

6

44

6

6

5

5

7

4

arrival time/required time

0/4

0/0

0/8

1/5

3/3

1/9

7/9

9/9

7/15

7/13

13/15

15/15

14/18

18/22

22/22

18/22

PO1

PO2

PO3

PI1

PI2

PI3

1

3

1

4

6

44

6

6

5

5

7

4

slack = required time -arrival time

4

0

8

4

0

8

2

0

8

6

2

0

4

4

0

4

Timing Analysis

Page 121: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.123 © Θεοχαρίδης, ΗΜΥ, 2010

Example with interconnect delay

5 5 5

4 4 4

2

FF

FF

3 2 1 1

2 1 3 2

1

22

19

Page 122: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.124 © Θεοχαρίδης, ΗΜΥ, 2010

Placement• Placement has a set of competing goals.• Can’t optimize locally and globally simultaneously.• Use heuristic approaches to evaluate quality.

C D F

AB

E1 2

LUT1 LUT2ABCD E

Page 123: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.125 © Θεοχαρίδης, ΗΜΥ, 2010

Placement Algorithms

• Constructive methods: begin from netlist and generate an initial placement.

- Partitioning methods: mincut and Kernighan-Lin methods

- Clustering• Iterative improvement

- Begin with random or constructive placement.- Iterate to improve it.- Hill-climbing

Page 124: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.126 © Θεοχαρίδης, ΗΜΥ, 2010

Iterative Placement Algorithms• Pairwise interchange methods• Force-directed methods

- FD relaxation- FD pairwise exchange

• Simulated annealing- Generates best results- Can be time consuming

• Macro-based approaches- Genetic algorithms- Quad swaps

Page 125: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.127 © Θεοχαρίδης, ΗΜΥ, 2010

Iterative Improvement Algorithms

Force-directed: (classical mechanics)- Force vector computed on each module corresponding to

all nets

- Solve set of non-linear differential equations.

Simulated annealing: (statistical mechanics)- Model a physical annealing process which optimizes

energy.

- Similar to “quenching” metal.

°

Page 126: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.128 © Θεοχαρίδης, ΗΜΥ, 2010

Timing-driven Placement

• Take both wire length and critical path into account• Problem

- Critical path changes as I move blocks- How do I balance the two objectives

• How do we go about modeling routing delay during placement?

Page 127: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.129 © Θεοχαρίδης, ΗΜΥ, 2010

Determining Criticality

• Same basic approach as used for clustering criticality• For each (i, j) connection from source i and sink j

- Determine arrival times (pre-order BFS)- Determine required arrival times (post-order BFS)- Determine slack -> required_arrival_time –

arrival_time- Criticality(i, j) = [1- slack(i, j)]/ (Max slack)

What is the purpose of the criticality exponent?

Page 128: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.130 © Θεοχαρίδης, ΗΜΥ, 2010

Balancing Wiring and Timing Cost

• Need to determine relative changes in timing and wiring based on moves

• Idea: Use relative changes from previous calculation- Both values less than 1- Helps balance effect based on scaling parameter

This still doesn’t help address changes in delay

Page 129: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ07 Design Flow.131 © Θεοχαρίδης, ΗΜΥ, 2010

Routing• Problem Given a placement, and a fixed number of metal

layers, find a valid pattern of horizontal and vertical wires that connect the terminals of the nets

Levels of abstraction:o Global routingo Detailed routing

• Objectives Cost components:

o Area (channel width) – min congestion in prev levels helpedo Wire delays – timing minimization in previous levelso Number of layers (less layers less expensive)o Additional cost components: number of bends, vias

Page 130: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ07 Design Flow.132 © Θεοχαρίδης, ΗΜΥ, 2010

Metal layer 1

Via

Routing Anatomy

Topview

3Dview

Metal layer 2

Metal layer 3

Symbolic

Layout

Note: Colors usedin this slide are notstandard

Presenter
Presentation Notes
Top layers have more spacing between wires Top layers higher aspect ratio (like walls)
Page 131: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ07 Design Flow.133 © Θεοχαρίδης, ΗΜΥ, 2010

Global vs. Detailed Routing• Global routing

Input: detailed placement, with exact terminal locations

Determine “channel” (routing region) for each net

Objective: minimize area (congestion), and timing (approximate)

• Detailed routing Input: channels and approximate routing from

the global routing phase Determine the exact route and layers for each

net Objective: valid routing, minimize area

(congestion), meet timing constraints Additional objectives: min via, power

Figs. [©Sherwani]

Page 132: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

FPGA-Based System Design: Chapter 4 Copyright 2004 Prentice Hall PTR

ΗΜΥ664 Δ03 Design Flow.134 © Θεοχαρίδης, ΗΜΥ, 2010

Channel graph

LE LE

LE LE

channel channel

channel

channel

channel channel

channel

channelchannel

channel

channel channelswitchbox

switchbox

switchbox

switchbox

switchbox

switchbox

switchbox

switchbox

switchbox

Page 133: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ07 Design Flow.135 © Θεοχαρίδης, ΗΜΥ, 2010

Routing Environment• Routing regions Channel

o Fixed height ?( fixed number of tracks)

o Fixed terminals on top and bottomo More constrained problem: switchbox.

Terminals on four sides fixed

Area routingo Wires can pass through any region not occupied by cells

(exception: over-the-cell routing)

• Routing layers Could be pre-assigned (e.g., M1 horizontal, M2 vert.) Different weights might be assigned to layers

1 1 4 5 4

3 2 32 5

1,3 4,5

Page 134: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ07 Design Flow.136 © Θεοχαρίδης, ΗΜΥ, 2010

Routing Environment• Chip architecture Full-custom:

o No constraint on routing regions

Standard cell:o Variable channel height?o Feed-through cells connect

channels

FPGA: o Fixed channel heighto Limited switchbox connectionso Prefabricated wire segments

have different weights

Failed netChannel

Feedthroughs

Figs. [©Sherwani]

Tracks

Failed connection

Page 135: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ07 Design Flow.137 © Θεοχαρίδης, ΗΜΥ, 2010

FPGA Programmable Switch Elements• Used in connecting: The I/O of functional units

to the wires

A horizontal wire to a vertical wire

Two wire segments to form a longer wire segment

Page 136: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ07 Design Flow.138 © Θεοχαρίδης, ΗΜΥ, 2010

FPGA Routing Channels Architecture• Note: fixed channel widths (tracks)• Should “predict” all possible connectivity

requirements when designing the FPGA chip• Channel -> track -> segment

• Segment length? Long: carry the signal longer,

less “concatenation” switches, but might waste track Short: local connections, slow for longer connections

channeltrack

segment

Page 137: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ07 Design Flow.139 © Θεοχαρίδης, ΗΜΥ, 2010

FPGA Switch Boxes• Ideally, provide switches

for all possible connections

• Trade-off: Too many switches:

o Large areao Complex to program

Too few switches:o Cannot route signals

Xilinx 4000One possible

solution

Page 138: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.140 © Θεοχαρίδης, ΗΜΥ, 2010

FPGA Routing Architecture

°Island – Style FPGA°Row – Based FPGA°Sea – Gates FPGA°Hierarchical FPGA

Commercial FPGAs can be classified into the four groups, based on their routing architecture.

Page 139: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ07 Design Flow.141 © Θεοχαρίδης, ΗΜΥ, 2010

FPGA Architecture - Layout• Island FPGAs Array of functional units Horizontal and vertical routing

channels connecting the functional units

Versatile switch boxes Example: Xilinx, Altera

• Row-based FPGAs Like standard cell design Rows of logic blocks Routing channels (fixed width)

between rows of logic Example: Actel FPGAs

Page 140: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.142 © Θεοχαρίδης, ΗΜΥ, 2010

The Four Classes of FPGA

Page 141: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.143 © Θεοχαρίδης, ΗΜΥ, 2010

An Island – Based FPGA

Page 142: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.144 © Θεοχαρίδης, ΗΜΥ, 2010

Island-Style Devices

• Two dimensional problem• (X+Y)!/(X!Y!) possible paths• Restricted within bounding box

Page 143: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.145 © Θεοχαρίδης, ΗΜΥ, 2010

Example channel segmentation distribution

Page 144: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.146 © Θεοχαρίδης, ΗΜΥ, 2010

Virtex Routing Architecture

Page 145: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.147 © Θεοχαρίδης, ΗΜΥ, 2010

18Kb BRAM

CAM

MultiplierBLVDS

Backplane

PCI-X

DDR

DDR

DDR

CAM

QDRSRAM

DDRSDRAMDistri

RAM

LVDS

Shift Registers

DCM

FIFOPCI

SONET / SDH

Virtex II Architecture

Page 146: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.148 © Θεοχαρίδης, ΗΜΥ, 2010

Virtex II Routing Hierarchy

Page 147: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.149 © Θεοχαρίδης, ΗΜΥ, 2010

Virtex II Clock Distribution

Page 148: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ07 Design Flow.150 © Θεοχαρίδης, ΗΜΥ, 2010

FPGA Routing• Routing resources pre-fabricated

100% routability using existing channels If fail to route all nets, redo placement

• FPGA architectural issues Careful balance between number of logic blocks and routing

resources (100% logic area utilization?) Designing flexible switchboxes and channels

(conflicts with high clock speeds)

• FPGA routing algorithms Graph search algorithms

o Convert the wire segments to graph nodes, and switch elements to edges

Bin packing heuristics (nets as objects, tracks as bins) Combination of maze routing and graph search algorithms

Page 149: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

FPGA-Based System Design: Chapter 4 Copyright 2004 Prentice Hall PTR

ΗΜΥ664 Δ03 Design Flow.151 © Θεοχαρίδης, ΗΜΥ, 2010

FPGA issues

Often want a fast answer. May be willing to accept lower quality result for less place/route time.May be interested in knowing wirability

without needing the final configuration.Fast placement: constructive placement,

iterative improvement through simulated annealing.

Page 150: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

FPGA-Based System Design: Chapter 4 Copyright 2004 Prentice Hall PTR

ΗΜΥ664 Δ03 Design Flow.152 © Θεοχαρίδης, ΗΜΥ, 2010

FPGA routing

Finding a route into given interconnection network.Global routing assigns to channels.Local routing selects the programming

points used to make the connections.

Page 151: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

FPGA-Based System Design: Chapter 4 Copyright 2004 Prentice Hall PTR

ΗΜΥ664 Δ03 Design Flow.153 © Θεοχαρίδης, ΗΜΥ, 2010

FPGA routing techniques

Nair: route based on congestion, not distance. Route in two passes:– Estimate congestion.– Final routing.

Triptych: more gradual penalty for congestion.

Page 152: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.154 © Θεοχαρίδης, ΗΜΥ, 2010

Xilinx XC4000 Routing

25

Page 153: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.155 © Θεοχαρίδης, ΗΜΥ, 2010

Altera Stratix Logic Array Blocks (Clusters)

Page 154: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.156 © Θεοχαρίδης, ΗΜΥ, 2010

Routing Connections

Based on the switch and wire parasitic, interconnect routes can be modeled as RC networks.

S S

Other issues:Power

Routability

Page 155: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.157 © Θεοχαρίδης, ΗΜΥ, 2010

Timing-Driven Routing

• Add delay cost component to routing.• Represent delay along path as RC chain. Buffering

important here.• Note that timing driven routing selects most distant

point for first route.- Sets upper bound on delay.

• Need for combined breadth-first congestion and timing-driven route.

Page 156: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.158 © Θεοχαρίδης, ΗΜΥ, 2010

Timing-Driven Routing

• Difficult to estimate remaining timing along a path

• Difficult to balance costs for each critical net

• Some routers attempt to “look-ahead” to anticipate congested or time-critical areas

• Optimal approaches have generally failed.

Page 157: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.159 © Θεοχαρίδης, ΗΜΥ, 2010

Combined Placement and Routing

• Used depth-first route to select initial connections• Swap blocks and rip up attached nets• Bias nets that span the bulk of device onto long-line

resources.• Took 16X longer than place and route

- 8% to 15% improvement.

Page 158: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.160 © Θεοχαρίδης, ΗΜΥ, 2010

Optimizing your FPGA design

° Pinout and Area Constraints Editor (PACE)° Implementation (Mapping, Placing, Routing)

• Constraints Editor• Text Editor (HDL source)• Floorplanner -- Placement• FPGA Editor – Routing

° Timing Constraints• Xilinx Constraints Editor

Page 159: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.161 © Θεοχαρίδης, ΗΜΥ, 2010

VHDL based synthesis

Page 160: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.162 © Θεοχαρίδης, ΗΜΥ, 2010

VHDL code

architecture RTL1 of RESOURCE isbegin

seq : process (RSTn, CLOCK)begin

if (RSTn = '0') thenDOUT <= (others => '0');

elsif (CLOCK'event and CLOCK = '1') thencase SEL is

when "00" => DOUT <= unsigned(A) - 1;when "01" => DOUT <= unsigned(B) - 1;when "10" => DOUT <= unsigned(C) - 1;when others => DOUT <= unsigned(D) - 1;

end case;end if;

end process;end RTL1;

Page 161: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.163 © Θεοχαρίδης, ΗΜΥ, 2010

Synthesized schematic

for RTL1 of resource

delay 57 nsarea 65

number of flip-flops 16

Page 162: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.164 © Θεοχαρίδης, ΗΜΥ, 2010

4-bit Shift Register

Page 163: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.165 © Θεοχαρίδης, ΗΜΥ, 2010

4-bit Shift Register

Page 164: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.166 © Θεοχαρίδης, ΗΜΥ, 2010

HDL: Design Verification

HDL

Synthesis

Implementation

Download

HDLImplement your design using VHDL or Verilog

Functional Simulation

TimingSimulation

In-Circuit Verification

BehavioralSimulation

Page 165: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.167 © Θεοχαρίδης, ΗΜΥ, 2010

BehavioralSimulation

Synthesis: Design Verification

HDL

Synthesis

Implementation

Download

HDL

Synthesize the design to create an FPGA netlist

Functional Simulation

TimingSimulation

In-Circuit Verification

Page 166: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.168 © Θεοχαρίδης, ΗΜΥ, 2010

Implementation: Design Verification

BehavioralSimulationHDL

Synthesis

Implementation

Download

HDL

Translate, place and route and generate a bitstream to download in the FPGA

Functional Simulation

TimingSimulation

In-Circuit Verification

Page 167: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.169 © Θεοχαρίδης, ΗΜΥ, 2010

HDL: Summary

° Full VHDL/Verilog (RTL code)• Advantages:

- Portability- Complete control of the design implementation and tradeoffs- Easier to debug and understand a code that you own

• Disadvantages:

- Can be time consuming - Don’t always have control over the Synthesis tool- Need to be familiar with algorithm and how to write it

Page 168: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.170 © Θεοχαρίδης, ΗΜΥ, 2010

But…

What about the custom ASIC case?

Page 169: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.171 © Θεοχαρίδης, ΗΜΥ, 2010

Layout – Back End Tools

° Once our design meets all static timing requirements, we move on

° Next step: Layout° Objective: Receive an HDL gate level netist° Create a custom cell (or a chip) using that netlist° We use Cadence Silicon Ensemble now

Page 170: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.172 © Θεοχαρίδης, ΗΜΥ, 2010

Cadence Silicon Ensemble (SE) or SoC Encounter

° Silicon Ensemble is typically used to perform standard cell placement and routing.

° Silicon Ensemble was originally created by Tangent Systems named “tancell”. The product was renamed to “cell3” by Cadence before Silicon Ensemble.

° SE (Silicon Ensemble) can be used to place and route a standard-cell based design with embedded blocks such as RAM, ROM or any other IP blocks.

Page 171: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.173 © Θεοχαρίδης, ΗΜΥ, 2010

A typical ASIC Design Flow

HDL HDL Simulation

Pass?

HDL Synthesis

Func. Sim.

Physical Implementation

Netlist

Pass?

Pass?

Tim Sim & STA & DRC/ERC/LVS

no

no

no

fabrication

Floor Planning

Placement

Routing

DRC/LVS

Page 172: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.174 © Θεοχαρίδης, ΗΜΥ, 2010

A typical Layout Flow

Import Files(Design files & Libraries)

Floor Planning(Create cell rows)

Placement(Place IO & cells)

Routing• Power Ring Generation• Global Routing• Detailed Routing

Timing Data Generation(RC Extraction & Delay Calculation)

Design Rules Check(Antenna, connectivity, geometry)

Output Generation(GDSII, DEF, LEF, and SDF)

Clock Tree Synthesis

Page 173: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.175 © Θεοχαρίδης, ΗΜΥ, 2010

Environment Setup

° The Cadence Place & Route tool (Silicon Ensemble) can run either in the ANSI (ASCII) mode or the X-windows GUI mode.

° Beginners typically use the GUI mode which means a user must come to the workstation console physically.

° The following files a typically needed for a Place & Route session:

• A technology library (tech.lef, can be one or two files: one for the technology file and one for macro cells)

• A Verilog Stub library which contains all the macro cells (verilog_stub.v)

• A gate level Verilog netlist for the synthesized design (top_module_gate.v)

• A SE setup file (se.ini)

Page 174: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.176 © Θεοχαρίδης, ΗΜΥ, 2010

Read in the Technology Files° The technology file (LEF) contains layer definition, via

definition, via generation rules some design rules and cell definitions.

° Some vendors separate the technology file from cell definitions. The *.lef contains both the technology definition and cell definitions.

° Use the following command to read in the npu018.lef technology file:

File=>Import=>LEF …

Page 175: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.177 © Θεοχαρίδης, ΗΜΥ, 2010

Imported LEF File

Page 176: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.178 © Θεοχαρίδης, ΗΜΥ, 2010

The Environment

Variables Form

Environment Variables

Page 177: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.179 © Θεοχαρίδης, ΗΜΥ, 2010

Init Floorplan

The Init Floorplan Form

Page 178: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.180 © Θεοχαρίδης, ΗΜΥ, 2010

Floorplan Initialization

After Floorplan Initialization

Page 179: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.181 © Θεοχαρίδης, ΗΜΥ, 2010

Place IO ° In a complete design, IO cells would be used. ° If no IO cells are used, the design can be treated as a

macro. Hierarchical place & route can be done in this fashion.

° Use “Place IOs …” command to place IO pins.° It can be seen that the pins are placed at the boundary

of the chip.

Page 180: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.182 © Θεοχαρίδης, ΗΜΥ, 2010

Place Cells° The standard cells in the design can be placed

after the IO cells (pins) are placed.° Use “Place=>Cells …” to place the standard cells.° Note that the standard cells are placed on the cell

rows.

Page 181: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.183 © Θεοχαρίδης, ΗΜΥ, 2010After Cell Placement

Cell Placement

Page 182: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.184 © Θεοχαρίδης, ΗΜΥ, 2010

Viewing Layers (Silicon Ensemble)° To view nets, special wires,

pins, cell boundaries etc. while you are working on your design, make sure all the appropriate Vs (visible) fields are checked.

Page 183: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.185 © Θεοχαρίδης, ΗΜΥ, 2010

After Adding Filler Cells

Page 184: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.186 © Θεοχαρίδης, ΗΜΥ, 2010

Power Routing° All cells must connect to power supply and ground

lines.° Power routing must be done before general routing.° Use “Route=>Connect Rings…” to bring up the

power routing menu. Click on ok to start the routing.

Page 185: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.187 © Θεοχαρίδης, ΗΜΥ, 2010

Global and Detailed Routing° Global routing performs the task of creating routing

channels and divide large nets that run across multiple channels into sub-nets.

° Detailed routing performs the task of routing the chip channel by channel.

° Use “Route=>Wroute…” to start global and detailed routing.

° Make sure the options “Global and Final Route” and “Auto Search and Repair” are selected.

Page 186: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.188 © Θεοχαρίδης, ΗΜΥ, 2010

Search and Repair

° After the first round of global and detailed routing, there might be some unrouted nets.

° Sometimes, the first round of routing allows design rules to be violated. These nets must be re-routed.

° The search and repair steps typically use the so-called maze routers.

° Use “Route=>Wroute…” to bring up the routing menu.

° Make sure to select “Incremental Final Route” option.

Page 187: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.189 © Θεοχαρίδης, ΗΜΥ, 2010

Verify° After routing the design successfully, the following

commands can be issued to verify the routed design.

• Verify=>Antenna… (to verify that there are no antenna violations on the long nets.)

• Verify=>Connectivity … (to verify that there are no shorts and opens)

• Verify=>Geometry … (to verify that there are no design rules violations.)

Page 188: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.190 © Θεοχαρίδης, ΗΜΥ, 2010

GDS Export° SE uses object names (metal1, via2 and etc) to

represent design objects.° GDS uses number and shapes to represent design

objects.° A mapping file is needed to convert SE data into GDS

data. This file is called a GDS mapping file.° Use “File=>export=>GDS …” to start the GDS export

menu.° The GDS file name can be arbitrary. ° The Mapping file needs to be gds2_tech.map file.° The structure name can be “top_module”° The library name can be “top_module_lib”

Page 189: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.191 © Θεοχαρίδης, ΗΜΥ, 2010

Running SE in ASCII Mode or Batch° Silicon Ensemble can be executed in batch mode:

% seultra -b “execute batch.cmds; quit;” &° Or it can be started graphically first and then execute

the command file using “File=>Execute …”.° In either case, the settings in se.ini file are executed

first before any other commands.° Silicon Ensemble can be invoked in the ASCII mode

instead of the X-windows GUI mode with the “-gd=ansi” option. This mode is typically used by expert users

° Faster and allows off-line (i.e. at night) P&R

Page 190: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.192 © Θεοχαρίδης, ΗΜΥ, 2010

Extracting a Verilog Netlist

° We need to extract a Verilog netlist out of our placed-and-routed design to verify that the place-and-route tools did their jobs without errors.

° This Verilog netlist will be simulated using Modelsim to verify for correct functionality and Prime Time to verify Static Timing analysis

Page 191: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.193 © Θεοχαρίδης, ΗΜΥ, 2010

Extracting a Verilog Netlist

1. First, we need to create an extracted view of our design. Open the layout view of top_module.

2. Change the editing tool to layout editing by clicking on Tools -> Layout.

3. Click on Verify -> Extract…

4. Make sure the macro cell box is checked. Click on OK.

5. The extraction process will take a few minutes to complete.

Page 192: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.194 © Θεοχαρίδης, ΗΜΥ, 2010

Extracting a Verilog Netlist

7. Click on Tools -> Verilog-XL. You should see a new form called Setup Environment pop up on your screen.

8. Enter “topchip_nopads.verilog” for the simulation run directory.

9. Simulate the design in:

Library: tutorial

Cell: topchip_nopads

View: extracted

10. Click on OK. (sample form is on next slide)

Page 193: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.195 © Θεοχαρίδης, ΗΜΥ, 2010

Extracting a Verilog Netlist

12. The Verilog-XL Integration window will now pop up. Click on Setup -> Netlist…

13. The Verilog Netlisting Options form will pop up. Click on the More >> button. This will enable you to see all the options for this form.

14. For the Netlist These Views field, enter: “behavioral functional symbol verilog”.

15. For the Stop Netlisting at Views field, enter: “behavioral functional symbol”.

16. Enter “vdd” and “gnd” for Global Power Nets and Global Ground Nets, respectively.

17. Make sure the netlist explicitly box is checked. Then, Click on OK.

Page 194: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.196 © Θεοχαρίδης, ΗΜΥ, 2010

Verilog or Hspice?

° For our case, Verilog is a more practical choice. ° Verilog is a switch-level language, which means it does not

model any parasitics of the design. This makes simulation much faster than Hspice, which models the parasitics of the system.

° Since we started out with a HDL file, we can assume that most of our designs will be relatively complex (e.g. having more than a few thousand transistors). Hspice simulation for designs of this scale is too time consuming.

Page 195: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.197 © Θεοχαρίδης, ΗΜΥ, 2010

Verilog or Hspice?

° For example, if we were to simulate our design using Hspice for 150 us, it would take more than 24 hours to simulate. Verilog simulation using Modelsim takes less than 1 second.

° Conclusion: Hspice is great for detailed simulations (especially for analog systems), but for complex, purely digital systems, Verilog simulation is much more practical.

° Other simulators such as IRSim fall somewhere in between Verilog and Hspice simulators.

Page 196: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.198 © Θεοχαρίδης, ΗΜΥ, 2010

Back Annotation

° Use the extracted netlist now with Prime Time (Same way we did earlier, only this time the parasitic capacitances will be included)

° Prime Time reports final timing analysis° If goal is met, then design ready for

manufacturing!!!

° Of course, we’d never get to this stage without a large amount of frustration, endless hours in front of a computer, computer failing to comply with our demands, all sorts of trouble

° However, we’ve (hopefully) accomplished our goals!

Page 197: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.199 © Θεοχαρίδης, ΗΜΥ, 2010

ASIC Design Flow (Synopsys)

The starting point of the flow is a synthesized Verilog netlist from Design Compiler.

Physical implementation was done using Silicon Ensemble, with all the files needed by PrimeTime generated from Silicon Enseble.

VerilogNetlist

.SDC

.LEF

.LIB

Silic

on E

nsem

ble

.V

.setload(C)

.setres(R)

.SDF

PrimeTime

.db

Des

ign

Com

pile

r

.SDC

TimingReport

Page 198: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ Λέκτορας ΗΜΜΥ (ttheocharides@ucy.ac.cy) · Static Timing Analysis. Standard Cell Placement and Routing. Post-Layout Static Timing Analysis.

ΗΜΥ664 Δ03 Design Flow.200 © Θεοχαρίδης, ΗΜΥ, 2010

Sample Design