IL2207 SoC Architecture Course Jan – March 2011, KTH

Post on 03-Jan-2016



IL2207 SoC Architecture Course

Jan – March 2011, KTH

Dr. Zhonghai Lu (鲁中海), zhonghai@kth.se

April 20, 2023 SoC Architecture 2

Course Information

Course staff
Responsible: Dr. Zhonghai Lu, zhonghai@kth.se
Examiner: Prof. Axel Jantsch, axel@kth.se
Assistants: Huimin She, huimin@kth.se; Abbas Eslami Kiasari, kiasari@kth.se

12 Lectures, 4 Tutorials, 3 Labs
Home page: www.ict.kth.se/courses/IL2207/1101

Course material
Dally, Towles: Principles and Practices of Interconnection Networks
Distributed materials and slides

Advanced-level course, 7.5 credits, 40x5=200 hours


Lecture Overview

L1: Introduction
L2: Buses and Arbitration (Dally: 22, 18)
L3: Shared Memory Multiprocessors
L4: Cache Coherency Protocols
L5: Memory Consistency
L6: Introduction to Network-on-Chip, Topologies (Dally: 1, 2, 3, 4, 5)
L7: Routing Algorithms and Mechanics (Dally: 8, 9, 10, 11)
L8: Flow Control (Dally: 12, 13)
L9: Deadlock and Livelock (Dally: 12, 13, 14)
L10: Router Architecture and Network Interface (Dally: 16, 17, 20)
L11: Network Performance Analysis and Quality of Service (Dally: 23, 15)
L12: Course Summary


Tutorial Overview

T1: Bus, arbitration and cache coherency
T2: Memory consistency and network topology
T3: Interconnection networks (routing, flow control, deadlock etc.)
T4: Router architecture, QoS and performance analysis

Tutorials will be given by Abbas.

For each tutorial, 2 questions should be answered and handed in to Abbas before the tutorial session. This counts for 10% of the final grade.


Lab Overview

Laboratory 1: Uniprocessor SoC Design on FPGA (Assistant: Huimin)

Laboratory 2: Multiprocessor SoC Design with Altera FPGA (Assistant: Huimin)

Laboratory 3: Wormhole Networks (Assistant: Abbas)

Each lab has 2 sessions: a, b.
Students work in groups of max. 2 students.
Good preparation is required.
Take good care of the FPGA boards.


Course Requirements

To pass the course, the student has to fulfill the following requirements:

Pass the final exam. The exam grade counts for 90% of the course grade: A, B, C, D, E, Fx, F.
Final exam: March 16, 2011, 9:00-13:00. Register for the exam in Daisy 2 weeks before the exam date in order to guarantee a seat!
Complete all labs: Pass | Fail
Attend lectures, tutorials and labs

Labs

3 labs in total
Lab 1 and 2: FPGA board. (Assistant: Huimin She, huimin@kth.se)
Lab 3: Network simulator. (Assistant: Abbas Eslami Kiasari, kiasari@kth.se)
Only 2 (NOT 3) lab sessions for each lab
Possible cancelled sessions: 13:00-16:00 (Jan. 24, Feb. 7, Feb. 21). Please note the final changes in TimeEdit.
Sessions are evenly distributed to avoid long waiting times (approx. 20 persons in each session)

Lab partners

Two persons in a group
If you also take IL2212 Embedded Software, please choose the same partner as you have for IL2212
The FPGA boards must be returned after lab 1

Note schedule changes

We are resolving schedule conflicts with the IL2201 course: Digital Integrated Circuit Design – VLSI
For each lab, we have 3 sessions booked; two will remain and one will be cancelled.
Tomorrow's IL2207 lecture will be from 10:00 to 12:00 in Ka-C21 (Electrum)


Observations in System Design

Observations

Good news
Chip capacity increases following Moore’s law
Functionality increases accordingly to exploit these transistors

Bad news
Difficult to design, productivity decreases
Cost increases

Platform-based design can reduce cost
Architecture is key!


Advances in Integration

If automobile speed had increased similarly over the same period, we could now drive from Stockholm to Shanghai in about 23 seconds.

Intel 4004 (1971): 108 kHz, 2,300 transistors

Intel Pentium 4 (2000): 1.5 GHz, 42 million transistors

Intel chips with Moore’s law
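The two data points above are enough for a rough check of the growth rate behind Moore's law. The sketch below assumes steady exponential growth between 1971 and 2000; the function names are illustrative and not part of the course material.

```python
import math

# Transistor counts and years quoted on the slide.
T_4004, YEAR_4004 = 2_300, 1971
T_P4, YEAR_P4 = 42_000_000, 2000


def doubling_time_years(t0, t1, years):
    """Years per doubling, assuming steady exponential growth from t0 to t1."""
    return years * math.log(2) / math.log(t1 / t0)


def annual_growth(t0, t1, years):
    """Compound annual growth rate as a fraction (0.5 means 50% per year)."""
    return (t1 / t0) ** (1 / years) - 1


if __name__ == "__main__":
    span = YEAR_P4 - YEAR_4004
    print(f"growth factor: {T_P4 / T_4004:,.0f}x over {span} years")
    print(f"doubling time: {doubling_time_years(T_4004, T_P4, span):.2f} years")
    print(f"annual growth: {annual_growth(T_4004, T_P4, span):.1%}")
```

With these two chips the transistor count doubles roughly every two years, which matches the usual statement of Moore's law.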

October 2008, Seminar at National Institute of Informatics, Tokyo

Scaling ARM9

ARM9 core area by process node:
180 nm, 11.8 mm2
130 nm, 5.2 mm2
90 nm, 2.6 mm2
65 nm, 1.4 mm2
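As a quick sanity check on the ARM9 figures above, the sketch below compares each quoted area with ideal scaling, under the assumption that area shrinks with the square of the feature size. That quadratic rule is a textbook simplification, not something the slide states.

```python
# (process node in nm, ARM9 core area in mm^2), as quoted on the slide.
ARM9_AREA = [(180, 11.8), (130, 5.2), (90, 2.6), (65, 1.4)]


def ideal_area(ref_nm, ref_area, node_nm):
    """Area predicted if every linear dimension shrinks with the node."""
    return ref_area * (node_nm / ref_nm) ** 2


def compare_to_ideal(data):
    """Return (node, quoted area, ideal area) using the first entry as reference."""
    ref_nm, ref_area = data[0]
    return [(nm, area, ideal_area(ref_nm, ref_area, nm)) for nm, area in data]


if __name__ == "__main__":
    for nm, area, ideal in compare_to_ideal(ARM9_AREA):
        print(f"{nm:>3} nm: quoted {area:5.1f} mm2, ideal {ideal:5.2f} mm2")
```

The quoted areas track the quadratic estimate closely and come out slightly smaller at every node.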


Growing Design-Productivity Gap: Design Productivity Crisis

[Figure: Potential design complexity and designer productivity, 1981 to 2009. Left axis: logic transistors per chip (M). Right axis: productivity (K transistors per staff-month). Complexity growth rate: 58% per year compounded. Productivity growth rate: 21% per year compounded. The area between the two curves is labelled "Equivalent Added Complexity".]

Designs do not only get more complex, but also much more expensive!
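The design-productivity gap follows directly from the two compounded growth rates quoted for this chart (58% per year for complexity, 21% per year for productivity). The sketch below assumes both rates stay constant; the function names are illustrative.

```python
import math

COMPLEXITY_GROWTH = 0.58    # logic transistors per chip, per year (from the chart)
PRODUCTIVITY_GROWTH = 0.21  # transistors per staff-month, per year (from the chart)

# Factor by which complexity outgrows productivity each year.
YEARLY_RATIO = (1 + COMPLEXITY_GROWTH) / (1 + PRODUCTIVITY_GROWTH)


def gap_after(years):
    """Ratio of complexity to productivity after `years`, starting from parity."""
    return YEARLY_RATIO ** years


def years_until_gap(factor):
    """Years needed for the gap to reach `factor`."""
    return math.log(factor) / math.log(YEARLY_RATIO)


if __name__ == "__main__":
    for n in (1, 5, 10, 20):
        print(f"after {n:>2} years the gap is {gap_after(n):8.1f}x")
    print(f"10x gap reached after {years_until_gap(10):.1f} years")
```

Under these rates the gap widens by roughly 30% per year and grows tenfold in under nine years.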


The Role of the Market!

Time-to-Market pressure!

Source: Smith 1997


Verification Costs

Verification accounts for a continuously increasing share of total design costs (at present 50-70% for large designs)


Moore’s Law drives the development of System-in-Chip Architectures

[Figure: Yesterday’s SOC contains a processor, memory, RTL functions 1 to 3 and RTL I/O. Today’s SOC contains a control processor (Ctl Proc), a DSP, memories (Mem), I/O and a large number of RTL blocks.]

The growing number of transistors on an SOC drives the trend towards more RTL blocks on the chip

Source: Leibson (DAC2004)

From ASIC to SoC, MPSoC

We get more and more cores on a single chip

ASIP: Application Specific Instruction Set Processors

SoC: both hardware and software (processor plus memory)


Platforms reduce Costs

SOC Flexibility = Per-Unit Cost Reduction (Model: 100K and 1M system volumes)

[Figure: total per-unit cost (0 to 120) versus system designs per chip design (1 to 7), for system volumes of 100 000 and 1 000 000. Assumptions: $10M design cost, $15 manufacturing cost, 5% premium for programmability.]

One Chip, Many System Designs: low-end still camera, high-end still camera, video camcorder

Source: Leibson 2004
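The cost argument can be sketched numerically from the three parameters given for this chart ($10M design cost, $15 manufacturing cost, 5% programmability premium). The formula below is an assumption made for illustration, not necessarily the exact model behind Leibson's figure: the one-time design cost is spread over every unit of every system design that reuses the chip, and the premium is applied to the manufacturing cost.

```python
DESIGN_COST = 10_000_000  # one-time chip design cost in dollars (from the slide)
MANUF_COST = 15.0         # manufacturing cost per chip in dollars (from the slide)
PREMIUM = 0.05            # programmability premium (from the slide)


def per_unit_cost(system_designs, volume_per_system):
    """Assumed model: amortized design cost plus marked-up manufacturing cost."""
    units = system_designs * volume_per_system
    return MANUF_COST * (1 + PREMIUM) + DESIGN_COST / units


if __name__ == "__main__":
    for volume in (100_000, 1_000_000):
        for n in range(1, 8):
            cost = per_unit_cost(n, volume)
            print(f"{volume:>9} units per system, {n} designs: ${cost:7.2f}")
```

Under this assumed model the benefit of reuse is largest at low volume, where the amortized design cost dominates the per-unit cost.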


Platform Example: Nexperia


Nexperia Instance: Viper


ARM-based MPSoC Platform

OMAP from Texas Instruments

TI’s OMAP (Open Multimedia Application Platform) is a family of proprietary systems-on-chip for portable and mobile multimedia applications.

A number of mobile phones use OMAP SoCs.


OMAP: Hierarchy of Platforms

OMAP uses platforms on different levels
This is a precondition for reuse

[Figure: platform levels: Silicon Technology, ASIC Library & Tools, SoC Platform, Appl. Platform, RefDesign. Other labels: Reuse, OMAP Infrastructure, OMAP Products, Application Specific.]


SoC Platform

The SoC platform consists of
A library of hardware components
An architecture for their interconnection

The Application Platform
Processor and Peripherals
Low-Level Software (Drivers)
Development Environment

The System Platform
OS and Middleware
Includes the code that controls all aspects of the system from device driver to system interface
Compilers and tools


OMAP 1510

OMAP 1510 is based on
Enhanced ARM 925 core (RISC processor)
TI C55x core
DMA, SRAM, Buses, Peripherals


Current OMAP platform for Wireless Handset & PDA

OMAP™ 3 architecture combines mobile entertainment with high performance productivity applications (Source: Texas Instruments)

Evolving SoC Architectures


System-on-Chip Architectures

A system-on-chip architecture integrates several heterogeneous components on a single chip

[Figure: Micro-controller, FPGA, DSP, Custom Hardware, Analog-Digital, Digital-Analog and Memory blocks connected by a Communication Structure]

A key challenge is to design the communication between the different entities of a SoC in order to minimize the communication overhead

Questions on Interconnects

1. To interconnect 2 IP hardware blocks, how would you like to let them communicate with each other?

2. What if 5 to 10 IP modules?

3. What if 20 IP blocks?

4. What if 200 IP blocks?
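One way to approach these questions is to count what a naive solution costs. The sketch below assumes dedicated point-to-point links between every pair of IP blocks, which is only one of the possible answers to question 1, and shows how quickly that choice stops scaling. The function names are illustrative.

```python
def point_to_point_links(n):
    """Dedicated links needed so that every pair of n blocks is directly connected."""
    return n * (n - 1) // 2


def ports_per_block(n):
    """Interfaces each block needs when it is wired directly to all the others."""
    return n - 1


if __name__ == "__main__":
    for n in (2, 5, 10, 20, 200):
        links = point_to_point_links(n)
        ports = ports_per_block(n)
        print(f"{n:>3} blocks: {links:>6} links, {ports:>3} ports per block")
```

The link count grows quadratically with the number of blocks, which is why larger systems move to a shared bus and then to a network.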


System on a chip

System-on-Chip Architecture: A bus-based SoC

[Figure: Memory, DSP, Microprocessor, Custom Logic and I/O attached to a shared bus]

Technology Impact on Communication

Chip
Computation, storage by transistors
Communication by wires

How does technology scaling affect communication delay?


Scaling and Delays


Transistors are “free”; wires are “expensive”, slowing down performance.

Long wires should be avoided: the whole chip can no longer be treated as a monolithic piece and is preferably segmented into communicating regions.

Number of Cores on Chip

Source: ITRS (International Technology Roadmap for Semiconductors), June 2009.

Communication architectures

Evolving from buses to networks
Buses are not scalable in bandwidth, power and performance
Network-on-Chip provides
Scalable architectures
Concurrent pipelined communication


System-on-Chip Architecture: Network-on-Chip

The resources are connected to the network via network interfaces

The topology of the network and the capability of the switches and communication channels determine the capacity of the network

[Figure: resources PE1, PE2, PE3 and MEM, each attached through a network interface (NI) to a switch; switches are linked by channels]
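To make the link between topology and capacity concrete, the sketch below computes three basic figures for a k x k 2D mesh: the channel count, the number of channels crossing the bisection, and the average hop count under uniform random traffic. The choice of a mesh and the function names are assumptions for illustration; the slide does not fix a topology.

```python
from itertools import product


def mesh_links(k):
    """Bidirectional channels between neighbouring switches in a k x k mesh."""
    return 2 * k * (k - 1)


def bisection_channels(k):
    """Channels cut when a k x k mesh with even k is split into two equal halves."""
    if k % 2:
        raise ValueError("this sketch only handles even k")
    return k


def average_hops(k):
    """Mean Manhattan distance over all (source, destination) pairs, self included."""
    nodes = list(product(range(k), repeat=2))
    total = sum(abs(sx - dx) + abs(sy - dy)
                for (sx, sy) in nodes for (dx, dy) in nodes)
    return total / len(nodes) ** 2


if __name__ == "__main__":
    for k in (2, 4, 8):
        print(f"{k}x{k} mesh: {mesh_links(k)} channels, "
              f"{bisection_channels(k)} across the bisection, "
              f"{average_hops(k):.2f} hops on average")
```

Multiplying the bisection channel count by the bandwidth of one channel gives the bisection bandwidth, a common first-order measure of network capacity.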

Intel Teraflop Chip - 2007

80 cores
100 million transistors
65 nm process
3.16 GHz
0.95 V
62 W
1.62 Terabit/s aggregate bandwidth
91 Gb/s bisection bandwidth
1.01 Teraflops
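The headline figures above can be cross-checked with simple arithmetic. The sketch below uses only the numbers on the slide and derives the peak floating-point operations per core per cycle and the energy efficiency; the function names are illustrative.

```python
CORES = 80
CLOCK_HZ = 3.16e9    # 3.16 GHz
PEAK_FLOPS = 1.01e12  # 1.01 Teraflops
POWER_W = 62.0


def flops_per_core_per_cycle():
    """Peak floating-point operations each core completes per clock cycle."""
    return PEAK_FLOPS / (CORES * CLOCK_HZ)


def gflops_per_watt():
    """Peak energy efficiency at the quoted operating point."""
    return PEAK_FLOPS / 1e9 / POWER_W


if __name__ == "__main__":
    print(f"{flops_per_core_per_cycle():.2f} FLOPs per core per cycle")
    print(f"{gflops_per_watt():.1f} GFLOPS per watt")
```

The quoted numbers are consistent with about four floating-point operations per core per cycle and about 16 GFLOPS per watt.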

Tilera Gx Family


4x4, 6x6, 8x8, 10x10 chips
3 instructions per cycle per core
32 MB on-chip cache
750 GOPS (32-bit operations)
200 Tbps on-chip interconnect bandwidth
500 Gbps memory bandwidth
~1 GHz operating frequency
10 W to 55 W power consumption

5 mesh networks: 32 bit; dimension-order routing; 1-2 cycle traversal
Static Network (STN)
User Dynamic Network (UDN)
I/O Dynamic Network (IDN)
Tile Dynamic Network (TDN)
Memory Dynamic Network (MDN)
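Dimension-order routing, mentioned above for the Tilera mesh networks, can be sketched in a few lines. The version below is the generic XY variant on a 2D mesh: a packet first travels along X until its column matches the destination, then along Y. Whether the Tilera networks resolve X before Y is not stated on the slide, so the ordering here is an assumption, and `xy_route` is an illustrative name.

```python
def xy_route(src, dst):
    """Return the list of (x, y) switches visited from src to dst, X dimension first."""
    x, y = src
    path = [(x, y)]
    # Resolve the X offset completely before touching Y.
    while x != dst[0]:
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    # Then resolve the Y offset.
    while y != dst[1]:
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path


if __name__ == "__main__":
    route = xy_route((0, 0), (2, 1))
    print(" -> ".join(str(hop) for hop in route))
    print(f"{len(route) - 1} hops")
```

Because every packet resolves the dimensions in the same fixed order, the route depends only on source and destination, and the hop count equals the Manhattan distance between them.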

Questions on Network Design

A network supports
1 to 1 communication: unicast
1 to N communication: multicast
N to 1 communication: gather

1. What problems need to be solved in order to realize unicast?

2. What performance metrics do you envision?

3. What factors influence the network performance?


In the Course

Bus-based architectures
Buses and arbitration
Shared memory multiprocessors
Cache coherency
Memory consistency

Network-on-Chip (NoC) architectures
Topology
Routing
Flow control
Performance analysis
