Fall 2000 4 Processor Micro-architecture
Transcript of Fall 2000 4 Processor Micro-architecture
![Page 1: Fall 2000 4 Processor Micro-architecture](https://reader033.fdocument.pub/reader033/viewer/2022050415/6270cdecd5b8dc68f337d1c8/html5/thumbnails/1.jpg)
IntelIntelLabsLabs
Copyright © 2000 Intel Corporation.
Fall 2000Inside the Inside the Inside the Inside the PentiumPentiumPentiumPentium®®®® 4 Processor 4 Processor 4 Processor 4 Processor
MicroMicroMicroMicro----architecturearchitecturearchitecturearchitectureNext Generation IANext Generation IANext Generation IANext Generation IA----32 Micro32 Micro32 Micro32 Micro----architecturearchitecturearchitecturearchitecture
Doug CarmeanDoug CarmeanPrincipal ArchitectPrincipal Architect
Intel Architecture GroupIntel Architecture Group
August 24, 2000August 24, 2000
![Page 2: Fall 2000 4 Processor Micro-architecture](https://reader033.fdocument.pub/reader033/viewer/2022050415/6270cdecd5b8dc68f337d1c8/html5/thumbnails/2.jpg)
IntelIntelPDXPDX
Copyright © 2000 Intel Corporation.
Fall 2000AgendaAgenda
ll IAIA--32 Processor Roadmap32 Processor Roadmap
llDesign GoalsDesign Goals
llFrequencyFrequency
ll Instructions Per Cycle Instructions Per Cycle
llSummarySummary
![Page 3: Fall 2000 4 Processor Micro-architecture](https://reader033.fdocument.pub/reader033/viewer/2022050415/6270cdecd5b8dc68f337d1c8/html5/thumbnails/3.jpg)
IntelIntelPDXPDX
Copyright © 2000 Intel Corporation.
Fall 2000IntelIntel®® PentiumPentium®® 4 Processor4 ProcessorP
erfo
rman
ceP
erfo
rman
ce
486 Micro486 Micro--architecturearchitecture
TimeTime
P5 MicroP5 Micro--ArchitectureArchitecture
P6 MicroP6 Micro--ArchitectureArchitecture
TodayToday
IntelIntel®® NetBurst™NetBurst™MicroMicro--ArchitectureArchitecture
![Page 4: Fall 2000 4 Processor Micro-architecture](https://reader033.fdocument.pub/reader033/viewer/2022050415/6270cdecd5b8dc68f337d1c8/html5/thumbnails/4.jpg)
IntelIntelPDXPDX
Copyright © 2000 Intel Corporation.
Fall 2000
Intel® Pentium® 4 Processor Intel® Pentium® 4 Processor Design GoalsDesign Goals
llDeliver world class performance Deliver world class performance across both existing and emerging across both existing and emerging applicationsapplicationsllDeliver performance headroom and Deliver performance headroom and
scalability for the futurescalability for the future
Micro-architecture that will Drive PerformanceLeadership for the Next Several Years
MicroMicro--architecture that will Drive Performancearchitecture that will Drive PerformanceLeadership for the Next Several YearsLeadership for the Next Several Years
![Page 5: Fall 2000 4 Processor Micro-architecture](https://reader033.fdocument.pub/reader033/viewer/2022050415/6270cdecd5b8dc68f337d1c8/html5/thumbnails/5.jpg)
IntelIntelPDXPDX
Copyright © 2000 Intel Corporation.
Fall 2000Intel® NetBurst™ MicroIntel® NetBurst™ Micro--architecturearchitecture
400 MHz 400 MHz SystemSystem
BusBus
RapidRapidExecutionExecution
EngineEngine
ExecutionExecutionTrace CacheTrace Cache
HyperHyperPipelinedPipelined
TechnologyTechnology
AdvancedAdvancedTransfer Transfer
CacheCacheAdvanced Advanced DynamicDynamic
ExecutionExecution
StreamingStreamingSIMDSIMD
Extensions 2Extensions 2
Enhanced FloatingEnhanced FloatingPoint / MultiPoint / Multi--MediaMedia
![Page 6: Fall 2000 4 Processor Micro-architecture](https://reader033.fdocument.pub/reader033/viewer/2022050415/6270cdecd5b8dc68f337d1c8/html5/thumbnails/6.jpg)
Pentium® 4 Processor Pentium® 4 Processor Block DiagramBlock Diagram
FP
RF
FP
RF
FMulFMulFAddFAddMMXMMXSSESSE
FP moveFP moveFP storeFP store
3.2
GB
/s S
yste
m In
terf
ace
3.2
GB
/s S
yste
m In
terf
ace L2 Cache and ControlL2 Cache and Control
L1
DL
1 D
-- Cac
he
and
DC
ach
e an
d D
-- TL
BT
LB
StoreStoreAGUAGULoadLoadAGUAGU
Sch
edu
lers
Sch
edu
lers
Inte
ger
RF
Inte
ger
RF
ALUALU
ALUALU
ALUALU
ALUALU
Tra
ce C
ach
eT
race
Cac
he
Ren
ame/
Allo
cR
enam
e/A
lloc
uo
pu
op
Qu
eues
Qu
eues
BTBBTB
uCodeuCodeROMROM
33 33
Dec
od
erD
eco
der
BT
B &
IB
TB
& I --
TL
BT
LB
L2 Cache and ControlL2 Cache and Control
![Page 7: Fall 2000 4 Processor Micro-architecture](https://reader033.fdocument.pub/reader033/viewer/2022050415/6270cdecd5b8dc68f337d1c8/html5/thumbnails/7.jpg)
L2 Cache and ControlL2 Cache and Control
FP
RF
FP
RF
FMulFMulFAddFAddMMXMMXSSESSE
FP moveFP moveFP storeFP store
3.2
GB
/s S
yste
m In
terf
ace
3.2
GB
/s S
yste
m In
terf
ace L2 Cache and ControlL2 Cache and Control
L1
DL
1 D
-- Cac
he
and
DC
ach
e an
d D
-- TL
BT
LB
StoreStoreAGUAGULoadLoadAGUAGU
Sch
edu
lers
Sch
edu
lers
Inte
ger
RF
Inte
ger
RF
ALUALU
ALUALU
ALUALU
ALUALU
Tra
ce C
ach
eT
race
Cac
he
Ren
ame/
Allo
cR
enam
e/A
lloc
uo
pu
op
Qu
eues
Qu
eues
BTBBTB
uCodeuCodeROMROM
33 33
Dec
od
erD
eco
der
BT
B &
IB
TB
& I --
TL
BT
LB
Pentium® 4 Processor Pentium® 4 Processor Block DiagramBlock Diagram
![Page 8: Fall 2000 4 Processor Micro-architecture](https://reader033.fdocument.pub/reader033/viewer/2022050415/6270cdecd5b8dc68f337d1c8/html5/thumbnails/8.jpg)
IntelIntelPDXPDX
Copyright © 2000 Intel Corporation.
Fall 2000CPU Architecture 101CPU Architecture 101
Delivered Performance = Frequency * Instructions Per Cycle
Delivered Performance = Delivered Performance = Frequency * Instructions Per CycleFrequency * Instructions Per Cycle
FrequencyFrequencyFrequency
![Page 9: Fall 2000 4 Processor Micro-architecture](https://reader033.fdocument.pub/reader033/viewer/2022050415/6270cdecd5b8dc68f337d1c8/html5/thumbnails/9.jpg)
IntelIntelPDXPDX
Copyright © 2000 Intel Corporation.
Fall 2000FrequencyFrequency
llWhat limits frequency?What limits frequency?––Process technology Process technology
––MicroarchitectureMicroarchitecture
llOn a given process technologyOn a given process technology––Fewer gates per pipeline stage will deliver Fewer gates per pipeline stage will deliver
higher frequencyhigher frequency
Frequency is driven by Micro-architectureFrequency is driven by MicroFrequency is driven by Micro--architecturearchitecture
![Page 10: Fall 2000 4 Processor Micro-architecture](https://reader033.fdocument.pub/reader033/viewer/2022050415/6270cdecd5b8dc68f337d1c8/html5/thumbnails/10.jpg)
IntelIntelPDXPDX
Copyright © 2000 Intel Corporation.
Fall 2000
NetburstNetburstTMTM MicroMicro--architecture architecture Pipeline vs P6Pipeline vs P6
11 22 33 44 55 66 77 88 99 1010
FetchFetch FetchFetch DecodeDecode DecodeDecode DecodeDecode RenameRename ROB RdROB Rd RdyRdy/Sch/Sch DispatchDispatch ExecExec
Basic P6 PipelineBasic P6 Pipeline
Hyper pipelined Technology enables industry leading performance and clock rate
Hyper pipelined Technology enables industry Hyper pipelined Technology enables industry leading performance and clock rateleading performance and clock rate
Basic PentiumBasic Pentium®® 4 Processor Pipeline4 Processor Pipeline
11 22 33 44 55 66 77 88 99 1010 1111 1212TC TC Nxt Nxt IPIP TC FetchTC Fetch DriveDrive AllocAlloc RenameRename QueQue SchSch SchSch SchSch
1313 1414DispDisp DispDisp
1515 1616 1717 1818 1919 2020RFRF ExEx FlgsFlgs Br CkBr Ck DriveDriveRF RF
Intro at Intro at ≥≥≥≥ 1.4GHz1.4GHz
.18µ.18µ
Intro at Intro at 733MHz733MHz
.18µ.18µ
![Page 11: Fall 2000 4 Processor Micro-architecture](https://reader033.fdocument.pub/reader033/viewer/2022050415/6270cdecd5b8dc68f337d1c8/html5/thumbnails/11.jpg)
IntelIntelPDXPDX
Copyright © 2000 Intel Corporation.
Fall 2000
Fre
qu
ency
Fre
qu
ency
TimeTimeIntroductionIntroduction
233MHz233MHz
60MHz60MHz P5 MicroP5 Micro--ArchitectureArchitecture
55
Hyper Pipelined Hyper Pipelined TechnologyTechnology
TodayToday
≥≥≥≥1.4GHz1.4GHz 1.13GHz1.13GHz
166MHz166MHz
P6 MicroP6 Micro--ArchitectureArchitecture
1010
2020NetburstNetburst™ Micro™ Micro--ArchitectureArchitecture
![Page 12: Fall 2000 4 Processor Micro-architecture](https://reader033.fdocument.pub/reader033/viewer/2022050415/6270cdecd5b8dc68f337d1c8/html5/thumbnails/12.jpg)
IntelIntelPDXPDX
Copyright © 2000 Intel Corporation.
Fall 2000Hyper pipelined TechnologyHyper pipelined Technology
FP
RF
FP
RF
FopFop
FmsFms
Sys
tem
Inte
rfac
eS
yste
m In
terf
ace L2 Cache and ControlL2 Cache and Control
L1
DL
1 D
--Cac
he
and
DC
ach
e an
d D
--TL
BT
LB
AGUAGU
AGUAGU
Sch
edu
lers
Sch
edu
lers
Inte
ger
RF
Inte
ger
RF
ALUALUALUALU
ALUALUALUALU
Tra
ce C
ach
eT
race
Cac
he
Ren
ame/
Allo
cR
enam
e/A
lloc
uo
pu
op
Qu
eues
Qu
eues
BTBBTB
ROMROM
33 33
Dec
od
erD
eco
der
BT
B &
IB
TB
& I--
TL
BT
LB
L2 Cache and ControlL2 Cache and Control
33 44 55 66 77 88 99 1010 1111 1212TC FetchTC Fetch DriveDrive AllocAlloc RenameRename QueQue SchSch SchSch SchSch
1313 1414DispDisp DispDisp
1515 1616 1717 1818 1919 2020RFRF ExEx FlgsFlgs Br CkBr Ck DriveDriveRF RF
TC Nxt IP: Trace cache next instruction pointerPointer from the BTB, indicating location ofnext instruction.
22TC TC Nxt Nxt IPIP
11
![Page 13: Fall 2000 4 Processor Micro-architecture](https://reader033.fdocument.pub/reader033/viewer/2022050415/6270cdecd5b8dc68f337d1c8/html5/thumbnails/13.jpg)
IntelIntelPDXPDX
Copyright © 2000 Intel Corporation.
Fall 2000Hyper Pipelined TechnologyHyper Pipelined Technology
FP
RF
FP
RF
FopFop
FmsFms
Sys
tem
Inte
rfac
eS
yste
m In
terf
ace L2 Cache and ControlL2 Cache and Control
L1
DL
1 D
--Cac
he
and
DC
ach
e an
d D
--TL
BT
LB
AGUAGU
AGUAGU
Sch
edu
lers
Sch
edu
lers
Inte
ger
RF
Inte
ger
RF
ALUALUALUALU
ALUALUALUALU
Tra
ce C
ach
eT
race
Cac
he
Ren
ame/
Allo
cR
enam
e/A
lloc
uo
pu
op
Qu
eues
Qu
eues
BTBBTB
ROMROM
33 33
Dec
od
erD
eco
der
BT
B &
IB
TB
& I--
TL
BT
LB
L2 Cache and ControlL2 Cache and Control
22 33 44 55 66 77 88 99 1010 1111 1212TC FetchTC Fetch DriveDrive AllocAlloc RenameRename QueQue SchSch SchSch SchSch
1313 1414DispDisp DispDisp
1515 1616 1717 1818 1919 2020RFRF ExEx FlgsFlgs Br CkBr Ck DriveDriveRF RF TC TC Nxt Nxt IPIP
11
TC Fetch: Trace cache fetchRead the decoded instructions (uOPs) out of the Execution Trace Cache
Tra
ce C
ach
eT
race
Cac
he
![Page 14: Fall 2000 4 Processor Micro-architecture](https://reader033.fdocument.pub/reader033/viewer/2022050415/6270cdecd5b8dc68f337d1c8/html5/thumbnails/14.jpg)
IntelIntelPDXPDX
Copyright © 2000 Intel Corporation.
Fall 2000Hyper Pipelined TechnologyHyper Pipelined Technology
FP
RF
FP
RF
FopFop
FmsFms
Sys
tem
Inte
rfac
eS
yste
m In
terf
ace L2 Cache and ControlL2 Cache and Control
L1
DL
1 D
--Cac
he
and
DC
ach
e an
d D
--TL
BT
LB
AGUAGU
AGUAGU
Sch
edu
lers
Sch
edu
lers
Inte
ger
RF
Inte
ger
RF
ALUALUALUALU
ALUALUALUALU
Tra
ce C
ach
eT
race
Cac
he
Ren
ame/
Allo
cR
enam
e/A
lloc
uo
pu
op
Qu
eues
Qu
eues
BTBBTB
ROMROM
33 33
Dec
od
erD
eco
der
BT
B &
IB
TB
& I--
TL
BT
LB
L2 Cache and ControlL2 Cache and Control
22 33 44 55 66 77 88 99 1010 1111 1212TC FetchTC Fetch DriveDrive AllocAlloc RenameRename QueQue SchSch SchSch SchSch
1313 1414DispDisp DispDisp
1515 1616 1717 1818 1919 2020RFRF ExEx FlgsFlgs Br CkBr Ck DriveDriveRF RF TC TC Nxt Nxt IPIP
11
Drive: Wire delayDrive the uOPs to the allocator
![Page 15: Fall 2000 4 Processor Micro-architecture](https://reader033.fdocument.pub/reader033/viewer/2022050415/6270cdecd5b8dc68f337d1c8/html5/thumbnails/15.jpg)
IntelIntelPDXPDX
Copyright © 2000 Intel Corporation.
Fall 2000Hyper Pipelined TechnologyHyper Pipelined Technology
FP
RF
FP
RF
FopFop
FmsFms
Sys
tem
Inte
rfac
eS
yste
m In
terf
ace L2 Cache and ControlL2 Cache and Control
L1
DL
1 D
--Cac
he
and
DC
ach
e an
d D
--TL
BT
LB
AGUAGU
AGUAGU
Sch
edu
lers
Sch
edu
lers
Inte
ger
RF
Inte
ger
RF
ALUALUALUALU
ALUALUALUALU
Tra
ce C
ach
eT
race
Cac
he
Ren
ame/
Allo
cR
enam
e/A
lloc
uo
pu
op
Qu
eues
Qu
eues
BTBBTB
ROMROM
33 33
Dec
od
erD
eco
der
BT
B &
IB
TB
& I--
TL
BT
LB
L2 Cache and ControlL2 Cache and Control
22 33 44 55 66 77 88 99 1010 1111 1212TC FetchTC Fetch DriveDrive AllocAlloc RenameRename QueQue SchSch SchSch SchSch
1313 1414DispDisp DispDisp
1515 1616 1717 1818 1919 2020RFRF ExEx FlgsFlgs Br CkBr Ck DriveDriveRF RF TC TC Nxt Nxt IPIP
11
Alloc: AllocateAllocate resources required for execution. Theresources include Load buffers, Store buffers, etc..
Ren
ame/
Allo
cR
enam
e/A
lloc
![Page 16: Fall 2000 4 Processor Micro-architecture](https://reader033.fdocument.pub/reader033/viewer/2022050415/6270cdecd5b8dc68f337d1c8/html5/thumbnails/16.jpg)
IntelIntelPDXPDX
Copyright © 2000 Intel Corporation.
Fall 2000Hyper Pipelined TechnologyHyper Pipelined Technology
FP
RF
FP
RF
FopFop
FmsFms
Sys
tem
Inte
rfac
eS
yste
m In
terf
ace L2 Cache and ControlL2 Cache and Control
L1
DL
1 D
--Cac
he
and
DC
ach
e an
d D
--TL
BT
LB
AGUAGU
AGUAGU
Sch
edu
lers
Sch
edu
lers
Inte
ger
RF
Inte
ger
RF
ALUALUALUALU
ALUALUALUALU
Tra
ce C
ach
eT
race
Cac
he
Ren
ame/
Allo
cR
enam
e/A
lloc
uo
pu
op
Qu
eues
Qu
eues
BTBBTB
ROMROM
33 33
Dec
od
erD
eco
der
BT
B &
IB
TB
& I--
TL
BT
LB
L2 Cache and ControlL2 Cache and Control
22 33 44 55 66 77 88 99 1010 1111 1212TC FetchTC Fetch DriveDrive AllocAlloc RenameRename QueQue SchSch SchSch SchSch
1313 1414DispDisp DispDisp
1515 1616 1717 1818 1919 2020RFRF ExEx FlgsFlgs Br CkBr Ck DriveDriveRF RF TC TC Nxt Nxt IPIP
11
Rename: Register renamingRename the logical registers (EAX) to the physicalregister space (128 are implemented).
Ren
ame/
Allo
cR
enam
e/A
lloc
![Page 17: Fall 2000 4 Processor Micro-architecture](https://reader033.fdocument.pub/reader033/viewer/2022050415/6270cdecd5b8dc68f337d1c8/html5/thumbnails/17.jpg)
IntelIntelPDXPDX
Copyright © 2000 Intel Corporation.
Fall 2000Hyper Pipelined TechnologyHyper Pipelined Technology
FP
RF
FP
RF
FopFop
FmsFms
Sys
tem
Inte
rfac
eS
yste
m In
terf
ace L2 Cache and ControlL2 Cache and Control
L1
DL
1 D
--Cac
he
and
DC
ach
e an
d D
--TL
BT
LB
AGUAGU
AGUAGU
Sch
edu
lers
Sch
edu
lers
Inte
ger
RF
Inte
ger
RF
ALUALUALUALU
ALUALUALUALU
Tra
ce C
ach
eT
race
Cac
he
Ren
ame/
Allo
cR
enam
e/A
lloc
uo
pu
op
Qu
eues
Qu
eues
BTBBTB
ROMROM
33 33
Dec
od
erD
eco
der
BT
B &
IB
TB
& I--
TL
BT
LB
L2 Cache and ControlL2 Cache and Control
22 33 44 55 66 77 88 99 1010 1111 1212TC FetchTC Fetch DriveDrive AllocAlloc RenameRename QueQue SchSch SchSch SchSch
1313 1414DispDisp DispDisp
1515 1616 1717 1818 1919 2020RFRF ExEx FlgsFlgs Br CkBr Ck DriveDriveRF RF TC TC Nxt Nxt IPIP
11
Que: Write into the uOP QueueuOPs are placed into the queues, where theyare held until there is room in the schedulers
uop
uop
Que
ues
Que
ues
![Page 18: Fall 2000 4 Processor Micro-architecture](https://reader033.fdocument.pub/reader033/viewer/2022050415/6270cdecd5b8dc68f337d1c8/html5/thumbnails/18.jpg)
IntelIntelPDXPDX
Copyright © 2000 Intel Corporation.
Fall 2000Hyper Pipelined TechnologyHyper Pipelined Technology
FP
RF
FP
RF
FopFop
FmsFms
Sys
tem
Inte
rfac
eS
yste
m In
terf
ace L2 Cache and ControlL2 Cache and Control
L1
DL
1 D
--Cac
he
and
DC
ach
e an
d D
--TL
BT
LB
AGUAGU
AGUAGU
Sch
edu
lers
Sch
edu
lers
Inte
ger
RF
Inte
ger
RF
ALUALUALUALU
ALUALUALUALU
Tra
ce C
ach
eT
race
Cac
he
Ren
ame/
Allo
cR
enam
e/A
lloc
uo
pu
op
Qu
eues
Qu
eues
BTBBTB
ROMROM
33 33
Dec
od
erD
eco
der
BT
B &
IB
TB
& I--
TL
BT
LB
L2 Cache and ControlL2 Cache and Control
22 33 44 55 66 77 88 99 1010 1111 1212TC FetchTC Fetch DriveDrive AllocAlloc RenameRename QueQue SchSch SchSch SchSch
1313 1414DispDisp DispDisp
1515 1616 1717 1818 1919 2020RFRF ExEx FlgsFlgs Br CkBr Ck DriveDriveRF RF TC TC Nxt Nxt IPIP
11
Sch: ScheduleWrite into the schedulers and compute dependencies. Watch for dependency to resolve.
Sch
edul
ers
Sch
edul
ers
![Page 19: Fall 2000 4 Processor Micro-architecture](https://reader033.fdocument.pub/reader033/viewer/2022050415/6270cdecd5b8dc68f337d1c8/html5/thumbnails/19.jpg)
IntelIntelPDXPDX
Copyright © 2000 Intel Corporation.
Fall 2000Hyper Pipelined TechnologyHyper Pipelined Technology
FP
RF
FP
RF
FopFop
FmsFms
Sys
tem
Inte
rfac
eS
yste
m In
terf
ace L2 Cache and ControlL2 Cache and Control
L1
DL
1 D
--Cac
he
and
DC
ach
e an
d D
--TL
BT
LB
AGUAGU
AGUAGU
Sch
edu
lers
Sch
edu
lers
Inte
ger
RF
Inte
ger
RF
ALUALUALUALU
ALUALUALUALU
Tra
ce C
ach
eT
race
Cac
he
Ren
ame/
Allo
cR
enam
e/A
lloc
uo
pu
op
Qu
eues
Qu
eues
BTBBTB
ROMROM
33 33
Dec
od
erD
eco
der
BT
B &
IB
TB
& I--
TL
BT
LB
L2 Cache and ControlL2 Cache and Control
22 33 44 55 66 77 88 99 1010 1111 1212TC FetchTC Fetch DriveDrive AllocAlloc RenameRename QueQue SchSch SchSch SchSch
1313 1414DispDisp DispDisp
1515 1616 1717 1818 1919 2020RFRF ExEx FlgsFlgs Br CkBr Ck DriveDriveRF RF TC TC Nxt Nxt IPIP
11
Disp: DispatchSend the uOPs to the appropriate executionunit.
![Page 20: Fall 2000 4 Processor Micro-architecture](https://reader033.fdocument.pub/reader033/viewer/2022050415/6270cdecd5b8dc68f337d1c8/html5/thumbnails/20.jpg)
IntelIntelPDXPDX
Copyright © 2000 Intel Corporation.
Fall 2000Hyper Pipelined TechnologyHyper Pipelined Technology
FP
RF
FP
RF
FopFop
FmsFms
Sys
tem
Inte
rfac
eS
yste
m In
terf
ace L2 Cache and ControlL2 Cache and Control
L1
DL
1 D
--Cac
he
and
DC
ach
e an
d D
--TL
BT
LB
AGUAGU
AGUAGU
Sch
edu
lers
Sch
edu
lers
Inte
ger
RF
Inte
ger
RF
ALUALUALUALU
ALUALUALUALU
Tra
ce C
ach
eT
race
Cac
he
Ren
ame/
Allo
cR
enam
e/A
lloc
uo
pu
op
Qu
eues
Qu
eues
BTBBTB
ROMROM
33 33
Dec
od
erD
eco
der
BT
B &
IB
TB
& I--
TL
BT
LB
L2 Cache and ControlL2 Cache and Control
22 33 44 55 66 77 88 99 1010 1111 1212TC FetchTC Fetch DriveDrive AllocAlloc RenameRename QueQue SchSch SchSch SchSch
1313 1414DispDisp DispDisp
1515 1616 1717 1818 1919 2020RFRF ExEx FlgsFlgs Br CkBr Ck DriveDriveRF RF TC TC Nxt Nxt IPIP
11
RF: Register FileRead the register file. These are the source(s)for the pending operation (ALU or other).
Inte
ger
RF
Inte
ger
RF
FP
RF
FP
RF
![Page 21: Fall 2000 4 Processor Micro-architecture](https://reader033.fdocument.pub/reader033/viewer/2022050415/6270cdecd5b8dc68f337d1c8/html5/thumbnails/21.jpg)
IntelIntelPDXPDX
Copyright © 2000 Intel Corporation.
Fall 2000Hyper Pipelined TechnologyHyper Pipelined Technology
FP
RF
FP
RF
FopFop
FmsFms
Sys
tem
Inte
rfac
eS
yste
m In
terf
ace L2 Cache and ControlL2 Cache and Control
L1
DL
1 D
--Cac
he
and
DC
ach
e an
d D
--TL
BT
LB
AGUAGU
AGUAGU
Sch
edu
lers
Sch
edu
lers
Inte
ger
RF
Inte
ger
RF
ALUALUALUALU
ALUALUALUALU
Tra
ce C
ach
eT
race
Cac
he
Ren
ame/
Allo
cR
enam
e/A
lloc
uo
pu
op
Qu
eues
Qu
eues
BTBBTB
ROMROM
33 33
Dec
od
erD
eco
der
BT
B &
IB
TB
& I--
TL
BT
LB
L2 Cache and ControlL2 Cache and Control
22 33 44 55 66 77 88 99 1010 1111 1212TC FetchTC Fetch DriveDrive AllocAlloc RenameRename QueQue SchSch SchSch SchSch
1313 1414DispDisp DispDisp
1515 1616 1717 1818 1919 2020RFRF ExEx FlgsFlgs Br CkBr Ck DriveDriveRF RF TC TC Nxt Nxt IPIP
11
Ex: ExecuteExecute the uOPs on the appropriate executionport.
FopFop
FmsFms
AGUAGU
AGUAGU
ALUALUALUALU
ALUALUALUALU
![Page 22: Fall 2000 4 Processor Micro-architecture](https://reader033.fdocument.pub/reader033/viewer/2022050415/6270cdecd5b8dc68f337d1c8/html5/thumbnails/22.jpg)
IntelIntelPDXPDX
Copyright © 2000 Intel Corporation.
Fall 2000Hyper Pipelined TechnologyHyper Pipelined Technology
FP
RF
FP
RF
FopFop
FmsFms
Sys
tem
Inte
rfac
eS
yste
m In
terf
ace L2 Cache and ControlL2 Cache and Control
L1
DL
1 D
--Cac
he
and
DC
ach
e an
d D
--TL
BT
LB
AGUAGU
AGUAGU
Sch
edu
lers
Sch
edu
lers
Inte
ger
RF
Inte
ger
RF
ALUALUALUALU
ALUALUALUALU
Tra
ce C
ach
eT
race
Cac
he
Ren
ame/
Allo
cR
enam
e/A
lloc
uo
pu
op
Qu
eues
Qu
eues
BTBBTB
ROMROM
33 33
Dec
od
erD
eco
der
BT
B &
IB
TB
& I--
TL
BT
LB
L2 Cache and ControlL2 Cache and Control
22 33 44 55 66 77 88 99 1010 1111 1212TC FetchTC Fetch DriveDrive AllocAlloc RenameRename QueQue SchSch SchSch SchSch
1313 1414DispDisp DispDisp
1515 1616 1717 1818 1919 2020RFRF ExEx FlgsFlgs Br CkBr Ck DriveDriveRF RF TC TC Nxt Nxt IPIP
11
Flgs: FlagsCompute flags (zero, negative, etc..). Theseare typically the input to a branch instruction.
FopFop
FmsFms
AGUAGU
AGUAGU
ALUALUALUALU
ALUALUALUALU
![Page 23: Fall 2000 4 Processor Micro-architecture](https://reader033.fdocument.pub/reader033/viewer/2022050415/6270cdecd5b8dc68f337d1c8/html5/thumbnails/23.jpg)
IntelIntelPDXPDX
Copyright © 2000 Intel Corporation.
Fall 2000Hyper Pipelined TechnologyHyper Pipelined Technology
FP
RF
FP
RF
FopFop
FmsFms
Sys
tem
Inte
rfac
eS
yste
m In
terf
ace L2 Cache and ControlL2 Cache and Control
L1
DL
1 D
--Cac
he
and
DC
ach
e an
d D
--TL
BT
LB
AGUAGU
AGUAGU
Sch
edu
lers
Sch
edu
lers
Inte
ger
RF
Inte
ger
RF
ALUALUALUALU
ALUALUALUALU
Tra
ce C
ach
eT
race
Cac
he
Ren
ame/
Allo
cR
enam
e/A
lloc
uo
pu
op
Qu
eues
Qu
eues
BTBBTB
ROMROM
33 33
Dec
od
erD
eco
der
BT
B &
IB
TB
& I--
TL
BT
LB
L2 Cache and ControlL2 Cache and Control
22 33 44 55 66 77 88 99 1010 1111 1212TC FetchTC Fetch DriveDrive AllocAlloc RenameRename QueQue SchSch SchSch SchSch
1313 1414DispDisp DispDisp
1515 1616 1717 1818 1919 2020RFRF ExEx FlgsFlgs Br CkBr Ck DriveDriveRF RF TC TC Nxt Nxt IPIP
11
Br Ck: Branch CheckThe branch operation compares the result of theactual branch direction with the prediction.
ALUALUALUALU
ALUALUALUALU
![Page 24: Fall 2000 4 Processor Micro-architecture](https://reader033.fdocument.pub/reader033/viewer/2022050415/6270cdecd5b8dc68f337d1c8/html5/thumbnails/24.jpg)
IntelIntelPDXPDX
Copyright © 2000 Intel Corporation.
Fall 2000Hyper Pipelined TechnologyHyper Pipelined Technology
FP
RF
FP
RF
FopFop
FmsFms
Sys
tem
Inte
rfac
eS
yste
m In
terf
ace L2 Cache and ControlL2 Cache and Control
L1
DL
1 D
--Cac
he
and
DC
ach
e an
d D
--TL
BT
LB
AGUAGU
AGUAGU
Sch
edu
lers
Sch
edu
lers
Inte
ger
RF
Inte
ger
RF
ALUALUALUALU
ALUALUALUALU
Tra
ce C
ach
eT
race
Cac
he
Ren
ame/
Allo
cR
enam
e/A
lloc
uo
pu
op
Qu
eues
Qu
eues
BTBBTB
ROMROM
33 33
Dec
od
erD
eco
der
BT
B &
IB
TB
& I--
TL
BT
LB
L2 Cache and ControlL2 Cache and Control
22 33 44 55 66 77 88 99 1010 1111 1212TC FetchTC Fetch DriveDrive AllocAlloc RenameRename QueQue SchSch SchSch SchSch
1313 1414DispDisp DispDisp
1515 1616 1717 1818 1919 2020RFRF ExEx FlgsFlgs Br CkBr Ck DriveDriveRF RF TC TC Nxt Nxt IPIP
11
Drive: Wire delayDrive the result of the branch check to the frontend of the machine.
![Page 25: Fall 2000 4 Processor Micro-architecture](https://reader033.fdocument.pub/reader033/viewer/2022050415/6270cdecd5b8dc68f337d1c8/html5/thumbnails/25.jpg)
IntelIntelPDXPDX
Copyright © 2000 Intel Corporation.
Fall 2000CPU Architecture 101CPU Architecture 101
Delivered Performance = Frequency * Instructions Per Cycle
Delivered Performance = Delivered Performance = Frequency * Instructions Per CycleFrequency * Instructions Per Cycle
Instructions Per CycleInstructions Per CycleInstructions Per Cycle
![Page 26: Fall 2000 4 Processor Micro-architecture](https://reader033.fdocument.pub/reader033/viewer/2022050415/6270cdecd5b8dc68f337d1c8/html5/thumbnails/26.jpg)
IntelIntelPDXPDX
Copyright © 2000 Intel Corporation.
Fall 2000Improving Improving Instructions Per CycleInstructions Per Cycle
ll Improve efficiencyImprove efficiency––Do more things in a clockDo more things in a clock
––Branch predictionBranch prediction
llReduce time it takes to do somethingReduce time it takes to do something––Reducing latencyReducing latency
![Page 27: Fall 2000 4 Processor Micro-architecture](https://reader033.fdocument.pub/reader033/viewer/2022050415/6270cdecd5b8dc68f337d1c8/html5/thumbnails/27.jpg)
IntelIntelPDXPDX
Copyright © 2000 Intel Corporation.
Fall 2000Improving Improving Instructions Per CycleInstructions Per Cycle
ll Improve efficiencyImprove efficiency––Branch predictionBranch prediction
––Do more things in a clockDo more things in a clock
llReduce time it takes to do somethingReduce time it takes to do something––Reducing latencyReducing latency
![Page 28: Fall 2000 4 Processor Micro-architecture](https://reader033.fdocument.pub/reader033/viewer/2022050415/6270cdecd5b8dc68f337d1c8/html5/thumbnails/28.jpg)
IntelIntelPDXPDX
Copyright © 2000 Intel Corporation.
Fall 2000Branch PredictionBranch Prediction
llAccurate branch prediction is key to Accurate branch prediction is key to enabling longer pipelinesenabling longer pipelinesllDramatic improvement over P6 branch Dramatic improvement over P6 branch
predictor:predictor:––8x the size (4K)8x the size (4K)––Eliminated 1/3 of the Eliminated 1/3 of the mispredictionsmispredictions
llProven to be better than Proven to be better than allall other other publicly disclosed predictors publicly disclosed predictors –– (g(g--share, hybrid, etc)share, hybrid, etc)
![Page 29: Fall 2000 4 Processor Micro-architecture](https://reader033.fdocument.pub/reader033/viewer/2022050415/6270cdecd5b8dc68f337d1c8/html5/thumbnails/29.jpg)
IntelIntelPDXPDX
Copyright © 2000 Intel Corporation.
Fall 2000
The Execution Trace CacheThe Execution Trace CacheL2 Cache and ControlL2 Cache and Control
L1
DL
1 D
-- Cac
he
and
DC
ach
e an
d D
-- TL
BT
LB
Tra
ce C
ach
eT
race
Cac
he
33 33
FP
RF
FP
RF
FMulFMulFAddFAddMMXMMXSSESSE
FP moveFP moveFP storeFP store
3.2
GB
/s S
yste
m In
terf
ace
3.2
GB
/s S
yste
m In
terf
ace
StoreStoreAGUAGULoadLoadAGUAGU
Sch
edu
lers
Sch
edu
lers
Inte
ger
RF
Inte
ger
RF
ALUALU
ALUALU
ALUALU
ALUALU
Ren
ame/
Allo
cR
enam
e/A
lloc
uo
pu
op
Qu
eues
Qu
eues
BTBBTB
uCodeuCodeROMROM
Dec
od
erD
eco
der
BT
B &
IB
TB
& I --
TL
BT
LB
Tra
ce C
ach
eT
race
Cac
he
BTBBTB
![Page 30: Fall 2000 4 Processor Micro-architecture](https://reader033.fdocument.pub/reader033/viewer/2022050415/6270cdecd5b8dc68f337d1c8/html5/thumbnails/30.jpg)
IntelIntelPDXPDX
Copyright © 2000 Intel Corporation.
Fall 2000Execution Trace CacheExecution Trace Cache
llAdvanced L1 instruction cacheAdvanced L1 instruction cache––Caches “decoded” IACaches “decoded” IA--32 instructions (uops)32 instructions (uops)
llRemoves decoder pipeline latencyRemoves decoder pipeline latency
llCapacity is ~12K Capacity is ~12K uOps uOps
ll Integrates branches into single lineIntegrates branches into single line––Follows predicted path of program executionFollows predicted path of program execution
Execution Trace Cache feeds fast engineExecution Trace Cache feeds fast engineExecution Trace Cache feeds fast engine
![Page 31: Fall 2000 4 Processor Micro-architecture](https://reader033.fdocument.pub/reader033/viewer/2022050415/6270cdecd5b8dc68f337d1c8/html5/thumbnails/31.jpg)
IntelIntelPDXPDX
Copyright © 2000 Intel Corporation.
Fall 2000
1 1 cmpcmp2 br 2 br --> T1 > T1
....
... (unused code)... (unused code)
T1:T1: 3 sub3 sub4 br 4 br --> T2> T2....... (unused code)... (unused code)
T2:T2: 5 5 movmov6 sub6 sub7 br 7 br --> T3> T3
....
... (unused code)... (unused code)
T3:T3: 8 add8 add9 sub9 sub
10 10 mulmul11 11 cmpcmp12 br 12 br --> T4> T4
Execution Trace CacheExecution Trace Cache
Trace Cache DeliveryTrace Cache Delivery
10 mul 11 cmp 12 br T4
7 br T3 8 T3:add 9 sub
4 br T2 5 mov 6 sub
1 cmp 2 br T1 3 T1: sub
![Page 32: Fall 2000 4 Processor Micro-architecture](https://reader033.fdocument.pub/reader033/viewer/2022050415/6270cdecd5b8dc68f337d1c8/html5/thumbnails/32.jpg)
IntelIntelPDXPDX
Copyright © 2000 Intel Corporation.
Fall 2000
Advanced Advanced Dynamic ExecutionDynamic Execution
llExtends basic features found in P6 coreExtends basic features found in P6 core
llVery deep speculative executionVery deep speculative execution––126 instructions in flight (3x P6)126 instructions in flight (3x P6)
––48 loads (3x P6) and 24 stores (2x P6)48 loads (3x P6) and 24 stores (2x P6)
llProvides larger window of visibilityProvides larger window of visibility––Better use of execution resourcesBetter use of execution resources
Deep Speculation Improves ParallelismDeep Speculation Improves ParallelismDeep Speculation Improves Parallelism
![Page 33: Fall 2000 4 Processor Micro-architecture](https://reader033.fdocument.pub/reader033/viewer/2022050415/6270cdecd5b8dc68f337d1c8/html5/thumbnails/33.jpg)
IntelIntelPDXPDX
Copyright © 2000 Intel Corporation.
Fall 2000Improving Improving Instructions Per CycleInstructions Per Cycle
ll Improve efficiencyImprove efficiency––Do more things in a clockDo more things in a clock
––Branch predictionBranch prediction
llReduce time it takes to do somethingReduce time it takes to do something––Reducing latencyReducing latency
![Page 34: Fall 2000 4 Processor Micro-architecture](https://reader033.fdocument.pub/reader033/viewer/2022050415/6270cdecd5b8dc68f337d1c8/html5/thumbnails/34.jpg)
IntelIntelPDXPDX
Copyright © 2000 Intel Corporation.
Fall 2000Rapid Execution EngineRapid Execution Engine
ll Intel® NetBurst™ microIntel® NetBurst™ micro--architecture:architecture:––½ clock @ >1.4GHz½ clock @ >1.4GHz
<0.35ns<0.35ns
1ns1ns
llDramatically lower ALU latencyDramatically lower ALU latency
llP6: P6: ––1 clock @ 1GHz1 clock @ 1GHz
![Page 35: Fall 2000 4 Processor Micro-architecture](https://reader033.fdocument.pub/reader033/viewer/2022050415/6270cdecd5b8dc68f337d1c8/html5/thumbnails/35.jpg)
IntelIntelPDXPDX
Copyright © 2000 Intel Corporation.
Fall 2000
L1 Data CacheL1 Data CacheL2 Cache and ControlL2 Cache and Control
L1
DL
1 D
-- Cac
he
and
DC
ach
e an
d D
-- TL
BT
LB
Tra
ce C
ach
eT
race
Cac
he
33 33
FP
RF
FP
RF
FMulFMulFAddFAddMMXMMXSSESSE
FP moveFP moveFP storeFP store
3.2
GB
/s S
yste
m In
terf
ace
3.2
GB
/s S
yste
m In
terf
ace
StoreStoreAGUAGULoadLoadAGUAGU
Sch
edu
lers
Sch
edu
lers
Inte
ger
RF
Inte
ger
RF
ALUALU
ALUALU
ALUALU
ALUALU
Ren
ame/
Allo
cR
enam
e/A
lloc
uo
pu
op
Qu
eues
Qu
eues
BTBBTB
uCodeuCodeROMROM
Dec
od
erD
eco
der
BT
B &
IB
TB
& I --
TL
BT
LB
L1
DL
1 D
-- Cac
he
and
DC
ach
e an
d D
-- TL
BT
LB
![Page 36: Fall 2000 4 Processor Micro-architecture](https://reader033.fdocument.pub/reader033/viewer/2022050415/6270cdecd5b8dc68f337d1c8/html5/thumbnails/36.jpg)
IntelIntelPDXPDX
Copyright © 2000 Intel Corporation.
Fall 2000
High Performance High Performance L1 Data CacheL1 Data Cache
ll8KB, 48KB, 4--way set associative, 64way set associative, 64--byte linesbyte lines
llVery high bandwidthVery high bandwidth––1 Ld and 1 St per clock 1 Ld and 1 St per clock
llNew access algorithmsNew access algorithms
llVery low latencyVery low latency––2 clock read2 clock read
New algorithm enables faster cacheNew algorithm enables faster cacheNew algorithm enables faster cache
![Page 37: Fall 2000 4 Processor Micro-architecture](https://reader033.fdocument.pub/reader033/viewer/2022050415/6270cdecd5b8dc68f337d1c8/html5/thumbnails/37.jpg)
IntelIntelPDXPDX
Copyright © 2000 Intel Corporation.
Fall 2000
llObservation: Almost all memory Observation: Almost all memory accesses hit in the cacheaccesses hit in the cache
llOptimize for the common caseOptimize for the common case––Assume that the access will hit the cacheAssume that the access will hit the cache
––Use a low cost mechanism to fix the rare Use a low cost mechanism to fix the rare cases that misscases that miss
llBenefit:Benefit:––Reduces latencyReduces latency
––Significantly higher performance Significantly higher performance
Data SpeculationData Speculation
![Page 38: Fall 2000 4 Processor Micro-architecture](https://reader033.fdocument.pub/reader033/viewer/2022050415/6270cdecd5b8dc68f337d1c8/html5/thumbnails/38.jpg)
IntelIntelPDXPDX
Copyright © 2000 Intel Corporation.
Fall 2000ReplayReplay
llRepairs incorrect speculationRepairs incorrect speculation––ReRe--execute until correctexecute until correct
llReplay is uOP specificReplay is uOP specific––Replay the uOP that Replay the uOP that mismis--speculatedspeculated
––Replay dependent Replay dependent uOPsuOPs
–– Independent Independent uOPs uOPs are not replayedare not replayed
Efficient mechanism to reduce latencyEfficient mechanism to reduce latencyEfficient mechanism to reduce latency
![Page 39: Fall 2000 4 Processor Micro-architecture](https://reader033.fdocument.pub/reader033/viewer/2022050415/6270cdecd5b8dc68f337d1c8/html5/thumbnails/39.jpg)
IntelIntelPDXPDX
Copyright © 2000 Intel Corporation.
Fall 2000L1 Cache is >2x FasterL1 Cache is >2x Faster
ll Intel® NetBurst™ microIntel® NetBurst™ micro--architecture:architecture:––2 clocks @ 2 clocks @ ≥≥≥≥1.4GHz1.4GHz
Lower Latency is Higher PerformanceLower Latency is Higher PerformanceLower Latency is Higher Performance
3ns3ns
<1.4ns<1.4ns
llP6:P6:––3 clocks @ 1GHz3 clocks @ 1GHz
![Page 40: Fall 2000 4 Processor Micro-architecture](https://reader033.fdocument.pub/reader033/viewer/2022050415/6270cdecd5b8dc68f337d1c8/html5/thumbnails/40.jpg)
IntelIntelPDXPDX
Copyright © 2000 Intel Corporation.
Fall 2000
Example with higher IPC and Example with higher IPC and Faster Clock!Faster Clock!
CodeCodeSequenceSequence
LdLd
AddAdd
AddAdd
LdLd
AddAdd
AddAdd
10 clocks10 clocks10ns10nsIPC = 0.6IPC = 0.6
6 clocks6 clocks4.3ns4.3nsIPC = 1.0IPC = 1.0
P6P6@1GHz@1GHz
Intel® NetBurst™Intel® NetBurst™[email protected]@1.4GHz
![Page 41: Fall 2000 4 Processor Micro-architecture](https://reader033.fdocument.pub/reader033/viewer/2022050415/6270cdecd5b8dc68f337d1c8/html5/thumbnails/41.jpg)
IntelIntelPDXPDX
Copyright © 2000 Intel Corporation.
Fall 2000
L2 Advanced Transfer CacheL2 Advanced Transfer CacheL2 Cache and ControlL2 Cache and Control
L1
DL
1 D
-- Cac
he
and
DC
ach
e an
d D
-- TL
BT
LB
Tra
ce C
ach
eT
race
Cac
he
33 33
FP
RF
FP
RF
FMulFMulFAddFAddMMXMMXSSESSE
FP moveFP moveFP storeFP store
3.2
GB
/s S
yste
m In
terf
ace
3.2
GB
/s S
yste
m In
terf
ace
StoreStoreAGUAGULoadLoadAGUAGU
Sch
edu
lers
Sch
edu
lers
Inte
ger
RF
Inte
ger
RF
ALUALU
ALUALU
ALUALU
ALUALU
Ren
ame/
Allo
cR
enam
e/A
lloc
uo
pu
op
Qu
eues
Qu
eues
BTBBTB
uCodeuCodeROMROM
Dec
od
erD
eco
der
BT
B &
IB
TB
& I --
TL
BT
LB
L2 Cache and ControlL2 Cache and Control
![Page 42: Fall 2000 4 Processor Micro-architecture](https://reader033.fdocument.pub/reader033/viewer/2022050415/6270cdecd5b8dc68f337d1c8/html5/thumbnails/42.jpg)
IntelIntelPDXPDX
Copyright © 2000 Intel Corporation.
Fall 2000L2 ATC OrganizationL2 ATC Organization
ll256KB, 8256KB, 8--way set associativeway set associative––128128--byte linesbyte lines
––Two 64Two 64--byte pieces per linebyte pieces per line
llHolds both data and instructionsHolds both data and instructions
llHigh bandwidth: 45 GB/Sec @ 1.4GHzHigh bandwidth: 45 GB/Sec @ 1.4GHz––2.8x P6 @1GHz2.8x P6 @1GHz
![Page 43: Fall 2000 4 Processor Micro-architecture](https://reader033.fdocument.pub/reader033/viewer/2022050415/6270cdecd5b8dc68f337d1c8/html5/thumbnails/43.jpg)
IntelIntelPDXPDX
Copyright © 2000 Intel Corporation.
Fall 2000Aggregate Cache LatencyAggregate Cache Latency
llFunction of all caches in a processorFunction of all caches in a processor
llOverall Effective LatencyOverall Effective LatencyL1 latency +L1 latency +
L1 Miss Rate * L2 latency +L1 Miss Rate * L2 latency +
L2 Miss Rate * DRAM LatencyL2 Miss Rate * DRAM Latency
Average cache speed is >1.8x better than the Pentium® III Processor
Average cache speed is >1.8x better Average cache speed is >1.8x better than the Pentium® III Processorthan the Pentium® III Processor
Average on desktop applications, Average on desktop applications, Intel® Pentium® III processor @ 1GHz, Intel® Pentium® 4 processIntel® Pentium® III processor @ 1GHz, Intel® Pentium® 4 processor @ 1.4GHzor @ 1.4GHz
![Page 44: Fall 2000 4 Processor Micro-architecture](https://reader033.fdocument.pub/reader033/viewer/2022050415/6270cdecd5b8dc68f337d1c8/html5/thumbnails/44.jpg)
IntelIntelPDXPDX
Copyright © 2000 Intel Corporation.
Fall 2000ComparisonComparison
≥≥≥≥ 1.41.4≥≥≥≥ 4.2 billion/sec4.2 billion/sec3 billion/sec3 billion/secUopUop Fetch BandwidthFetch Bandwidth
8840924092512512Branch targetsBranch targets
2224241212Stores in flightStores in flight
3348481616Loads in flightLoads in flight
3.153.151261264040Instructions In flightInstructions In flight
≥≥≥≥ 2.82.8≥≥≥≥ 44.8 GB/sec44.8 GB/sec16 GB/sec16 GB/secL2 Cache BandwidthL2 Cache Bandwidth
≥≥≥≥ 2.82.8≥≥≥≥ 44.8 GB/sec44.8 GB/sec16 GB/sec16 GB/secL1 Cache BandwidthL1 Cache Bandwidth
0.50.58 KB8 KB16 KB16 KBL1 Cache SizeL1 Cache Size
≥≥≥≥ 2.12.1< 1.42 ns< 1.42 ns3 ns3 nsL1 Cache SpeedL1 Cache Speed
> 2.8> 2.8≥≥≥≥ 5.6 billion/sec5.6 billion/sec2 billion/sec2 billion/secAdder BandwidthAdder Bandwidth
≥≥≥≥ 2.82.8< .36 ns< .36 ns1 ns1 nsAdder SpeedAdder Speed
≥≥≥≥ 1.41.4≥≥≥≥ 1.4 1.4 GhzGhz1 GHz1 GHzFrequencyFrequency
RelativeRelative
ImprovementImprovement
Pentium 4 Pentium 4 ProcessorProcessor
Pentium® III Pentium® III ProcessorProcessor
![Page 45: Fall 2000 4 Processor Micro-architecture](https://reader033.fdocument.pub/reader033/viewer/2022050415/6270cdecd5b8dc68f337d1c8/html5/thumbnails/45.jpg)
IntelIntelPDXPDX
Copyright © 2000 Intel Corporation.
Fall 2000
Intel® Pentium® 4 ProcessorIntel® Pentium® 4 ProcessorSummarySummaryllRevolutionary, new microRevolutionary, new micro--
architecture from Intel designed for architecture from Intel designed for the evolving Internetthe evolving Internet
llDesign features for balanced, high Design features for balanced, high performance platform scalability and performance platform scalability and headroom headroom
llThe world’s The world’s highest performancehighest performancedesktop processordesktop processor