Scoreboarding
CIS 662 – Computer Architecture – Fall 2004 - Class 11 – 10/12/04
Scoreboarding
The following four steps replace the ID, EX and WB steps:
● ID: Issue – if a functional unit for the instruction is free and no other active instruction has the same destination register (WAW), the instruction can proceed; otherwise it stalls
● ID: Read operands – a source operand is available if no earlier instruction is going to write it
● EX: Execute – once execution is complete, this stage notifies the scoreboard
● WB: Write results – the scoreboard checks for WAR hazards and may stall the write-back
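The issue, read-operand and write-result conditions above can be sketched as three predicates over a functional unit status table. This is only an illustration under stated assumptions: the dictionary layout and the function names (`can_issue`, `can_read_operands`, `can_write_result`) are hypothetical, and the field names (Fi, Fj, Fk, Rj, Rk) are borrowed from the status table used in the example slides.

```python
# Hypothetical layout: one record per busy functional unit.
# Fi = destination, Fj/Fk = sources, Rj/Rk = True while a source is
# ready but not yet read (the "Yes" flags in the status table).

def can_issue(units, unit_name, dest):
    """Issue: the unit must be free and no active instruction writes dest (WAW)."""
    unit_free = unit_name not in units
    no_waw = all(u["Fi"] != dest for u in units.values())
    return unit_free and no_waw

def can_read_operands(unit):
    """Read operands: both sources must be available."""
    return unit["Rj"] and unit["Rk"]

def can_write_result(units, unit_name):
    """Write result: stall while another unit still has to read our destination (WAR)."""
    dest = units[unit_name]["Fi"]
    return all(
        (u["Fj"] != dest or not u["Rj"]) and (u["Fk"] != dest or not u["Rk"])
        for name, u in units.items() if name != unit_name
    )

# Assumed state: Divide has not yet read F6, while Add wants to write F6.
units = {
    "Divide": {"Fi": "F10", "Fj": "F0", "Fk": "F6", "Rj": False, "Rk": True},
    "Add":    {"Fi": "F6",  "Fj": "F8", "Fk": "F2", "Rj": False, "Rk": False},
}
print(can_write_result(units, "Add"))      # False: WAR hazard on F6
print(can_issue(units, "Add", "F12"))      # False: the adder is busy
print(can_issue(units, "Mult1", "F6"))     # False: WAW hazard on F6
print(can_read_operands(units["Divide"]))  # False: F0 is not ready
```

Keeping the checks as pure functions over the table mirrors the way the scoreboard makes every decision centrally from its status tables.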
Scoreboarding
● Operands are always read from the register file – no advantage is taken of forwarding
● This is no large penalty, as the write occurs immediately after execution and not after the MEM stage
● The read operand and write result stages cannot overlap, so we have 1 cycle latency
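Instruction status (columns: Issue, Read operands, Execution complete, Write result), for the instructions:
L.D F6, 34(R2)
L.D F2, 45(R3)
MUL.D F0, F2, F4
SUB.D F8, F6, F2
DIV.D F10, F0, F6
ADD.D F6, F8, F2
Functional unit status (columns: Busy, Op, Fi, Fj, Fk, Qj, Qk, Rj, Rk), for the units: Integer ALU, FP Mult1, FP Mult2, FP Add, FP Div
Register result status (columns: F0 … F2 … F4 … F6 … F8 … F10 … F12), giving the functional unit that will write each register
Time = 1
● Issue first load
Functional unit status:
Integer: Busy=Yes, Op=Load, Fi=F6, Fk=R2, Rk=Yes
Register result status: F6=Integer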
Time = 2
● First load reads operands
● Second load cannot be issued due to structural hazard
Functional unit status:
Integer: Busy=Yes, Op=Load, Fi=F6, Fk=R2, Rk=Yes (No after the read)
Register result status: F6=Integer
Time = 3
● First load completes execution
● Second load cannot be issued due to structural hazard
Functional unit status:
Integer: Busy=Yes, Op=Load, Fi=F6, Fk=R2, Rk=No
Register result status: F6=Integer
Time = 4
● First load writes the result and frees the ALU
● Second load cannot be issued due to structural hazard
Functional unit status:
Integer: Busy=Yes, Op=Load, Fi=F6, Fk=R2, Rk=No (entry cleared when the result is written)
Register result status: F6=Integer (cleared after the write)
Time = 5
● Second load is issued
Functional unit status:
Integer: Busy=Yes, Op=Load, Fi=F2, Fk=R3, Rk=Yes
Register result status: F2=Integer
Time = 6
● Second load reads operands
● Mult is issued
Functional unit status:
Integer: Busy=Yes, Op=Load, Fi=F2, Fk=R3, Rk=Yes (No after the read)
Mult1: Busy=Yes, Op=Mult, Fi=F0, Fj=F2, Fk=F4, Qj=Integer, Rj=No, Rk=Yes
Register result status: F0=Mult1, F2=Integer
Time = 7
● Second load completes execution
● Mult is stalled waiting for F2
● Sub is issued
Functional unit status:
Integer: Busy=Yes, Op=Load, Fi=F2, Fk=R3, Rk=No
Mult1: Busy=Yes, Op=Mult, Fi=F0, Fj=F2, Fk=F4, Qj=Integer, Rj=No, Rk=Yes
Add: Busy=Yes, Op=Sub, Fi=F8, Fj=F6, Fk=F2, Qk=Integer, Rj=Yes, Rk=No
Register result status: F0=Mult1, F2=Integer, F8=Add
Time = 8
● Second load writes result
● Mult is stalled waiting for F2
● Sub is stalled waiting for F2
● Div is issued
Functional unit status:
Integer: Busy=Yes, Op=Load, Fi=F2, Fk=R3, Rk=No (entry cleared when the result is written)
Mult1: Busy=Yes, Op=Mult, Fi=F0, Fj=F2, Fk=F4, Qj=Integer, Rj=No, Rk=Yes (Qj cleared and Rj=Yes after the load writes F2)
Add: Busy=Yes, Op=Sub, Fi=F8, Fj=F6, Fk=F2, Qk=Integer, Rj=Yes, Rk=No (Qk cleared and Rk=Yes after the load writes F2)
Divide: Busy=Yes, Op=Div, Fi=F10, Fj=F0, Fk=F6, Qj=Mult1, Rj=No, Rk=Yes
Register result status: F0=Mult1, F2=Integer (cleared after the write), F8=Add, F10=Divide
Time = 9
● Mult reads operands
● Sub reads operands
● Div is stalled waiting for F0
● Add cannot be issued due to structural hazard
Functional unit status:
Mult1: Busy=Yes, Op=Mult, Fi=F0, Fj=F2, Fk=F4, Rj=No, Rk=No
Add: Busy=Yes, Op=Sub, Fi=F8, Fj=F6, Fk=F2, Rj=No, Rk=No
Divide: Busy=Yes, Op=Div, Fi=F10, Fj=F0, Fk=F6, Qj=Mult1, Rj=No, Rk=Yes
Register result status: F0=Mult1, F8=Add, F10=Divide
Time = 10
● Mult in execution (1 out of 10)
● Sub in execution (1 out of 2)
● Div is stalled waiting for F0
● Add cannot be issued due to structural hazard
Functional unit status:
Mult1: Busy=Yes, Op=Mult, Fi=F0, Fj=F2, Fk=F4, Rj=No, Rk=No
Add: Busy=Yes, Op=Sub, Fi=F8, Fj=F6, Fk=F2, Rj=No, Rk=No
Divide: Busy=Yes, Op=Div, Fi=F10, Fj=F0, Fk=F6, Qj=Mult1, Rj=No, Rk=Yes
Register result status: F0=Mult1, F8=Add, F10=Divide
Time = 11
● Mult in execution (2 out of 10)
● Sub completes execution
● Div is stalled waiting for F0
● Add cannot be issued due to structural hazard
Functional unit status:
Mult1: Busy=Yes, Op=Mult, Fi=F0, Fj=F2, Fk=F4, Rj=No, Rk=No
Add: Busy=Yes, Op=Sub, Fi=F8, Fj=F6, Fk=F2, Rj=No, Rk=No
Divide: Busy=Yes, Op=Div, Fi=F10, Fj=F0, Fk=F6, Qj=Mult1, Rj=No, Rk=Yes
Register result status: F0=Mult1, F8=Add, F10=Divide
Time = 12
● Mult in execution (3 out of 10)
● Sub writes result, frees adder
● Div is stalled waiting for F0
● Add cannot be issued due to structural hazard
Functional unit status:
Mult1: Busy=Yes, Op=Mult, Fi=F0, Fj=F2, Fk=F4, Rj=No, Rk=No
Add: Busy=Yes, Op=Sub, Fi=F8, Fj=F6, Fk=F2, Rj=No, Rk=No (entry cleared when the result is written)
Divide: Busy=Yes, Op=Div, Fi=F10, Fj=F0, Fk=F6, Qj=Mult1, Rj=No, Rk=Yes
Register result status: F0=Mult1, F8=Add (cleared after the write), F10=Divide
Time = 13
● Mult in execution (4 out of 10)
● Div is stalled waiting for F0
● Add is issued
Functional unit status:
Mult1: Busy=Yes, Op=Mult, Fi=F0, Fj=F2, Fk=F4, Rj=No, Rk=No
Add: Busy=Yes, Op=Add, Fi=F6, Fj=F8, Fk=F2, Rj=Yes, Rk=Yes
Divide: Busy=Yes, Op=Div, Fi=F10, Fj=F0, Fk=F6, Qj=Mult1, Rj=No, Rk=Yes
Register result status: F0=Mult1, F6=Add, F10=Divide
Time = 14
● Mult in execution (5 out of 10)
● Div is stalled waiting for F0
● Add reads operands
Functional unit status:
Mult1: Busy=Yes, Op=Mult, Fi=F0, Fj=F2, Fk=F4, Rj=No, Rk=No
Add: Busy=Yes, Op=Add, Fi=F6, Fj=F8, Fk=F2, Rj=Yes, Rk=Yes (both No after the read)
Divide: Busy=Yes, Op=Div, Fi=F10, Fj=F0, Fk=F6, Qj=Mult1, Rj=No, Rk=Yes
Register result status: F0=Mult1, F6=Add, F10=Divide
Time = 15
● Mult in execution (6 out of 10)
● Div is stalled waiting for F0
● Add in execution (1 out of 2)
Functional unit status:
Mult1: Busy=Yes, Op=Mult, Fi=F0, Fj=F2, Fk=F4, Rj=No, Rk=No
Add: Busy=Yes, Op=Add, Fi=F6, Fj=F8, Fk=F2, Rj=No, Rk=No
Divide: Busy=Yes, Op=Div, Fi=F10, Fj=F0, Fk=F6, Qj=Mult1, Rj=No, Rk=Yes
Register result status: F0=Mult1, F6=Add, F10=Divide
Time = 16
● Mult in execution (7 out of 10)
● Div is stalled waiting for F0
● Add completes execution
Functional unit status:
Mult1: Busy=Yes, Op=Mult, Fi=F0, Fj=F2, Fk=F4, Rj=No, Rk=No
Add: Busy=Yes, Op=Add, Fi=F6, Fj=F8, Fk=F2, Rj=No, Rk=No
Divide: Busy=Yes, Op=Div, Fi=F10, Fj=F0, Fk=F6, Qj=Mult1, Rj=No, Rk=Yes
Register result status: F0=Mult1, F6=Add, F10=Divide
Time = 17
● Mult in execution (8 out of 10)
● Div is stalled waiting for F0
● Add is stalled, WAR hazard
Functional unit status:
Mult1: Busy=Yes, Op=Mult, Fi=F0, Fj=F2, Fk=F4, Rj=No, Rk=No
Add: Busy=Yes, Op=Add, Fi=F6, Fj=F8, Fk=F2, Rj=No, Rk=No
Divide: Busy=Yes, Op=Div, Fi=F10, Fj=F0, Fk=F6, Qj=Mult1, Rj=No, Rk=Yes
Register result status: F0=Mult1, F6=Add, F10=Divide
Time = 19
● Mult completes execution
● Div is stalled waiting for F0
● Add is stalled, WAR hazard
Functional unit status:
Mult1: Busy=Yes, Op=Mult, Fi=F0, Fj=F2, Fk=F4, Rj=No, Rk=No
Add: Busy=Yes, Op=Add, Fi=F6, Fj=F8, Fk=F2, Rj=No, Rk=No
Divide: Busy=Yes, Op=Div, Fi=F10, Fj=F0, Fk=F6, Qj=Mult1, Rj=No, Rk=Yes
Register result status: F0=Mult1, F6=Add, F10=Divide
Time = 20
● Mult writes result
● Div is stalled waiting for F0
● Add is stalled, WAR hazard
Functional unit status:
Mult1: Busy=Yes, Op=Mult, Fi=F0, Fj=F2, Fk=F4, Rj=No, Rk=No (entry cleared when the result is written)
Add: Busy=Yes, Op=Add, Fi=F6, Fj=F8, Fk=F2, Rj=No, Rk=No
Divide: Busy=Yes, Op=Div, Fi=F10, Fj=F0, Fk=F6, Rj=Yes, Rk=Yes (Qj=Mult1 cleared when Mult writes F0)
Register result status: F0=Mult1 (cleared after the write), F6=Add, F10=Divide
Time = 21
● Div reads operands
● Add is stalled, WAR hazard
Functional unit status:
Add: Busy=Yes, Op=Add, Fi=F6, Fj=F8, Fk=F2, Rj=No, Rk=No
Divide: Busy=Yes, Op=Div, Fi=F10, Fj=F0, Fk=F6, Rj=Yes, Rk=Yes (both No after the read)
Register result status: F6=Add, F10=Divide
Time = 22
● Add writes result
● Div in execution (1 out of 40)
Functional unit status:
Add: Busy=Yes, Op=Add, Fi=F6, Fj=F8, Fk=F2, Rj=No, Rk=No (entry cleared when the result is written)
Divide: Busy=Yes, Op=Div, Fi=F10, Fj=F0, Fk=F6, Rj=No, Rk=No
Register result status: F6=Add (cleared after the write), F10=Divide
Time = 61
● Div completes execution
Functional unit status:
Divide: Busy=Yes, Op=Div, Fi=F10, Fj=F0, Fk=F6, Rj=No, Rk=No
Register result status: F10=Divide
Time = 62
● Div writes result
Functional unit status:
Divide: Busy=Yes, Op=Div, Fi=F10, Fj=F0, Fk=F6, Rj=No, Rk=No (entry cleared when the result is written)
Register result status: F10=Divide (cleared after the write)
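The whole walkthrough can be replayed with a small cycle-by-cycle simulation. The sketch below is an illustration, not the course's reference model. It assumes the latencies stated in the slides (1 cycle for loads, 2 for add/sub, 10 for multiply, 40 for divide), in-order issue of at most one instruction per cycle, and decisions taken from the state at the start of each cycle. The names `Entry`, `war_hazard` and `simulate` are hypothetical.

```python
from dataclasses import dataclass, field

# Execution latencies in cycles, as used in the walkthrough.
LATENCY = {"Integer": 1, "Mult": 10, "Add": 2, "Divide": 40}
# Functional units available for each class of operation.
UNITS = {"Integer": ["Integer"], "Mult": ["Mult1", "Mult2"],
         "Add": ["Add"], "Divide": ["Divide"]}

PROGRAM = [  # (text, unit class, destination, sources)
    ("L.D F6, 34(R2)",    "Integer", "F6",  ["R2"]),
    ("L.D F2, 45(R3)",    "Integer", "F2",  ["R3"]),
    ("MUL.D F0, F2, F4",  "Mult",    "F0",  ["F2", "F4"]),
    ("SUB.D F8, F6, F2",  "Add",     "F8",  ["F6", "F2"]),
    ("DIV.D F10, F0, F6", "Divide",  "F10", ["F0", "F6"]),
    ("ADD.D F6, F8, F2",  "Add",     "F6",  ["F8", "F2"]),
]

@dataclass(eq=False)
class Entry:
    text: str
    kind: str
    dest: str
    srcs: list
    unit: str = None
    q: dict = field(default_factory=dict)  # source -> producing Entry, None if ready
    issue: int = None
    read: int = None
    done: int = None
    write: int = None

def war_hazard(e, active):
    """True if another active instruction still has to read e's destination."""
    return any(o is not e and o.read is None
               and e.dest in o.q and o.q[e.dest] is None
               for o in active)

def simulate(program):
    entries = [Entry(*p) for p in program]
    busy, result = {}, {}  # unit -> Entry; register -> Entry that will write it
    pc, t = 0, 0
    while any(e.write is None for e in entries):
        t += 1
        active = list(busy.values())  # state at the start of the cycle
        writers = [e for e in active if e.done is not None and e.done < t
                   and not war_hazard(e, active)]
        readers = [e for e in active if e.read is None
                   and all(p is None for p in e.q.values())]
        nxt, unit = (entries[pc] if pc < len(entries) else None), None
        if nxt is not None and nxt.dest not in result:  # no WAW hazard
            unit = next((u for u in UNITS[nxt.kind] if u not in busy), None)
        for e in readers:  # read operands, execution ends LATENCY cycles later
            e.read, e.done = t, t + LATENCY[e.kind]
        for e in writers:  # write result, free the unit, wake up consumers
            e.write = t
            del busy[e.unit], result[e.dest]
            for o in active:
                for s in o.q:
                    if o.q[s] is e:
                        o.q[s] = None
        if unit is not None:  # issue the next instruction in program order
            nxt.unit, nxt.issue = unit, t
            nxt.q = {s: result.get(s) for s in nxt.srcs}
            busy[unit], result[nxt.dest] = nxt, nxt
            pc += 1
    return [(e.text, e.issue, e.read, e.done, e.write) for e in entries]

for row in simulate(PROGRAM):
    print(row)
```

Under these assumptions the sketch yields the cycle numbers of the walkthrough: for example, ADD.D issues at 13 once the adder is free, finishes executing at 16, and writes only at 22, after DIV.D has read F6 at 21.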
Tomasulo’s Algorithm
● Use reservation stations that will hold operands for instructions waiting to issue
● A reservation station fetches the operand as soon as it is available
● Pending instructions read operands from reservation stations
● When writes overlap in execution, only the last write actually updates the register
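The last point can be illustrated with a register file that keeps, for each register, the name of the reservation station that will write it. This is a minimal sketch under assumptions: the station names (`Add1`, `Add2`), the dictionaries and the functions `issue` and `broadcast` are hypothetical.

```python
# Hypothetical register file with one "pending writer" tag per register.
reg_value = {"F6": 1.0}
reg_tag = {"F6": None}  # name of the reservation station that will write F6

def issue(dest, station):
    """Issuing an instruction makes its station the latest writer of dest."""
    reg_tag[dest] = station

def broadcast(station, value):
    """A finished station broadcasts; only the latest writer updates the register."""
    for reg, tag in reg_tag.items():
        if tag == station:
            reg_value[reg] = value
            reg_tag[reg] = None

issue("F6", "Add1")      # earlier write to F6
issue("F6", "Add2")      # later write to F6 replaces the tag
broadcast("Add1", 10.0)  # ignored by the register file: the tag is Add2
broadcast("Add2", 20.0)  # updates F6
print(reg_value["F6"])   # 20.0
```

Because the register only accepts the value whose tag it currently holds, the order in which the two stations finish does not matter.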
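Tomasulo’s Algorithm
[Figure: block diagram showing the instruction queue (fed from the instruction unit), the FP registers, the address unit, the load buffers, the store buffers (data and address), the memory unit, the reservation stations in front of the FP adders and FP multipliers, and the common data bus; the two paths out of the instruction queue are labeled load-store operations and FP operations]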
Tomasulo’s Algorithm
● Each reservation station holds the opcode for the pending instruction and either operand values or names of reservation stations that will provide them
● Load and store buffers hold data and addresses for memory access
● Transfer of all data goes over the common data bus
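A reservation station entry as described above can be sketched as a record that holds either a value or a producer name for each operand, and that listens to the common data bus. The class name `Station`, its field names (`vj`, `vk`, `qj`, `qk`), the method `snoop` and the station name `Load2` are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Station:
    op: str
    vj: Optional[float] = None  # operand values, once known
    vk: Optional[float] = None
    qj: Optional[str] = None    # names of the stations that will produce them
    qk: Optional[str] = None

    def ready(self):
        """The instruction can execute once no operand is still pending."""
        return self.qj is None and self.qk is None

    def snoop(self, tag, value):
        """Capture a result from the common data bus if this station waits for it."""
        if self.qj == tag:
            self.vj, self.qj = value, None
        if self.qk == tag:
            self.vk, self.qk = value, None

# A multiply that already holds one operand and waits for a load buffer.
mult1 = Station(op="MUL.D", vk=4.0, qj="Load2")
print(mult1.ready())                       # False
mult1.snoop("Load2", 2.5)                  # the result arrives on the bus
print(mult1.ready(), mult1.vj * mult1.vk)  # True 10.0
```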
Homework
● Due Tuesday, October 19 by the end of the class
● Submit either in class (paper) or by e-mail (PS or PDF only), or bring a paper copy to my office
● Show the scheduling of the following code using a scoreboard (assume one integer ALU, two FP multipliers, one FP adder and one FP divider)
LD F2, 0(R2)
LD F4, 100(R3)
ADD F8, F2, F2
MUL F6, F4, F8
SUB F6, F2, F4