Scoreboarding


CIS 662 – Computer Architecture – Fall 2004 - Class 11 – 10/12/04

Scoreboarding

The following four steps replace the ID, EX and WB steps:

● ID: Issue – if a functional unit for the instruction is free and no other active instruction has the same destination register (WAW), the instruction can proceed; otherwise it stalls
● ID: Read operands – a source operand is available if no earlier instruction is going to write it
● EX: Execute – once execution is complete, this stage notifies the scoreboard
● WB: Write results – the scoreboard checks for WAR hazards and may stall the write-back
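The issue and read-operands checks above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `FunctionalUnit` record, the `pending_writes` map (register name to producing unit) and both function names are hypothetical and do not come from the slides.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class FunctionalUnit:
    """Hypothetical scoreboard entry for one functional unit."""
    busy: bool = False
    dest: Optional[str] = None    # destination register
    src_j: Optional[str] = None   # first source register (None if unused)
    src_k: Optional[str] = None   # second source register (None if unused)
    ready_j: bool = False         # first source is ready to be read
    ready_k: bool = False         # second source is ready to be read


def can_issue(unit: FunctionalUnit, dest: str,
              pending_writes: Dict[str, str]) -> bool:
    """Issue: the unit is free (no structural hazard) and no active
    instruction already targets `dest` (no WAW hazard)."""
    return not unit.busy and dest not in pending_writes


def can_read_operands(unit: FunctionalUnit) -> bool:
    """Read operands: every source the instruction uses is ready,
    i.e. no earlier active instruction still has to write it."""
    j_ok = unit.src_j is None or unit.ready_j
    k_ok = unit.src_k is None or unit.ready_k
    return j_ok and k_ok


# Assumed example state: the adder holds SUB.D F8, F6, F2 while F2 is
# still being produced by a load in the integer unit.
adder = FunctionalUnit(busy=True, dest="F8", src_j="F6", src_k="F2",
                       ready_j=True, ready_k=False)
pending = {"F0": "Mult1", "F2": "Integer", "F8": "Add"}

print(can_issue(adder, "F6", pending))             # adder busy: cannot issue
print(can_issue(FunctionalUnit(), "F0", pending))  # WAW on F0: cannot issue
print(can_read_operands(adder))                    # F2 not ready: stall
```

Keeping the two checks separate mirrors the split of the ID stage into issue and read operands: an instruction may issue and then wait in its functional unit until its sources become ready.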

Scoreboarding

● Operands are always read from the register file – no advantage is taken of forwarding
● This is no large penalty, as the write occurs immediately after execution and not after the MEM stage
● The read operands and write result stages cannot overlap, so we have a 1-cycle latency

Instruction status (columns: Issue, Read operands, Execution complete, Write result) is tracked for the following sequence:

L.D   F6, 34(R2)
L.D   F2, 45(R3)
MUL.D F0, F2, F4
SUB.D F8, F6, F2
DIV.D F10, F0, F6
ADD.D F6, F8, F2

Time = 1

Issue first load.

Functional unit status
Unit     Busy  Op    Fi   Fj   Fk   Qj       Qk       Rj   Rk
Integer  Yes   Load  F6   -    R2   -        -        -    Yes
Mult1    No
Mult2    No
Add      No
Divide   No

Register result status: F6 = Integer

Time = 2

First load reads operands (Rk changes from Yes to No).
Second load cannot be issued due to a structural hazard.

Functional unit status
Unit     Busy  Op    Fi   Fj   Fk   Qj       Qk       Rj   Rk
Integer  Yes   Load  F6   -    R2   -        -        -    No
Mult1    No
Mult2    No
Add      No
Divide   No

Register result status: F6 = Integer

Time = 3

First load completes execution.
Second load cannot be issued due to a structural hazard.

Functional unit status
Unit     Busy  Op    Fi   Fj   Fk   Qj       Qk       Rj   Rk
Integer  Yes   Load  F6   -    R2   -        -        -    No
Mult1    No
Mult2    No
Add      No
Divide   No

Register result status: F6 = Integer

Time = 4

First load writes the result and frees the ALU. The Integer entry (Load, Fi = F6, Fk = R2, Rk = No) and the F6 entry in register result status are cleared.
Second load cannot be issued due to a structural hazard.

Functional unit status (after the write)
Unit     Busy  Op    Fi   Fj   Fk   Qj       Qk       Rj   Rk
Integer  No
Mult1    No
Mult2    No
Add      No
Divide   No

Register result status: no pending writes

Time = 5

Second load is issued.

Functional unit status
Unit     Busy  Op    Fi   Fj   Fk   Qj       Qk       Rj   Rk
Integer  Yes   Load  F2   -    R3   -        -        -    Yes
Mult1    No
Mult2    No
Add      No
Divide   No

Register result status: F2 = Integer

Time = 6

Second load reads operands (Rk changes from Yes to No).
Mult is issued.

Functional unit status
Unit     Busy  Op    Fi   Fj   Fk   Qj       Qk       Rj   Rk
Integer  Yes   Load  F2   -    R3   -        -        -    No
Mult1    Yes   Mult  F0   F2   F4   Integer  -        No   Yes
Mult2    No
Add      No
Divide   No

Register result status: F0 = Mult1, F2 = Integer

Time = 7

Sub is issued.
Second load completes execution.
Mult is stalled waiting for F2.

Functional unit status
Unit     Busy  Op    Fi   Fj   Fk   Qj       Qk       Rj   Rk
Integer  Yes   Load  F2   -    R3   -        -        -    No
Mult1    Yes   Mult  F0   F2   F4   Integer  -        No   Yes
Mult2    No
Add      Yes   Sub   F8   F6   F2   -        Integer  Yes  No
Divide   No

Register result status: F0 = Mult1, F2 = Integer, F8 = Add

Time = 8

Div is issued.
Second load writes result. The Integer entry (Load, Fi = F2, Fk = R3, Rk = No) and the F2 entry in register result status are cleared; the Integer tags held by Mult (Qj) and Sub (Qk) are removed and the corresponding ready flags become Yes.
Mult is stalled waiting for F2.
Sub is stalled waiting for F2.

Functional unit status (after the write)
Unit     Busy  Op    Fi   Fj   Fk   Qj       Qk       Rj   Rk
Integer  No
Mult1    Yes   Mult  F0   F2   F4   -        -        Yes  Yes
Mult2    No
Add      Yes   Sub   F8   F6   F2   -        -        Yes  Yes
Divide   Yes   Div   F10  F0   F6   Mult1    -        No   Yes

Register result status: F0 = Mult1, F8 = Add, F10 = Divide

Time = 9

Mult reads operands.
Sub reads operands.
Div is stalled waiting for F0.
Add cannot be issued due to a structural hazard.

Functional unit status (Rj and Rk of Mult and Sub change from Yes to No)
Unit     Busy  Op    Fi   Fj   Fk   Qj       Qk       Rj   Rk
Integer  No
Mult1    Yes   Mult  F0   F2   F4   -        -        No   No
Mult2    No
Add      Yes   Sub   F8   F6   F2   -        -        No   No
Divide   Yes   Div   F10  F0   F6   Mult1    -        No   Yes

Register result status: F0 = Mult1, F8 = Add, F10 = Divide

Time = 10

Mult in execution (1 out of 10).
Sub in execution (1 out of 2).
Div is stalled waiting for F0.
Add cannot be issued due to a structural hazard.

Functional unit status
Unit     Busy  Op    Fi   Fj   Fk   Qj       Qk       Rj   Rk
Integer  No
Mult1    Yes   Mult  F0   F2   F4   -        -        No   No
Mult2    No
Add      Yes   Sub   F8   F6   F2   -        -        No   No
Divide   Yes   Div   F10  F0   F6   Mult1    -        No   Yes

Register result status: F0 = Mult1, F8 = Add, F10 = Divide

Time = 11

Mult in execution (2 out of 10).
Sub completes execution.
Div is stalled waiting for F0.
Add cannot be issued due to a structural hazard.

Functional unit status
Unit     Busy  Op    Fi   Fj   Fk   Qj       Qk       Rj   Rk
Integer  No
Mult1    Yes   Mult  F0   F2   F4   -        -        No   No
Mult2    No
Add      Yes   Sub   F8   F6   F2   -        -        No   No
Divide   Yes   Div   F10  F0   F6   Mult1    -        No   Yes

Register result status: F0 = Mult1, F8 = Add, F10 = Divide

Time = 12

Mult in execution (3 out of 10).
Sub writes result and frees the adder. The Add entry (Sub, Fi = F8, Fj = F6, Fk = F2) and the F8 entry in register result status are cleared.
Div is stalled waiting for F0.
Add cannot be issued due to a structural hazard.

Functional unit status (after the write)
Unit     Busy  Op    Fi   Fj   Fk   Qj       Qk       Rj   Rk
Integer  No
Mult1    Yes   Mult  F0   F2   F4   -        -        No   No
Mult2    No
Add      No
Divide   Yes   Div   F10  F0   F6   Mult1    -        No   Yes

Register result status: F0 = Mult1, F10 = Divide

Time = 13

Add is issued.
Mult in execution (4 out of 10).
Div is stalled waiting for F0.

Functional unit status
Unit     Busy  Op    Fi   Fj   Fk   Qj       Qk       Rj   Rk
Integer  No
Mult1    Yes   Mult  F0   F2   F4   -        -        No   No
Mult2    No
Add      Yes   Add   F6   F8   F2   -        -        Yes  Yes
Divide   Yes   Div   F10  F0   F6   Mult1    -        No   Yes

Register result status: F0 = Mult1, F6 = Add, F10 = Divide

Time = 14

Add reads operands (Rj and Rk change from Yes to No).
Mult in execution (5 out of 10).
Div is stalled waiting for F0.

Functional unit status
Unit     Busy  Op    Fi   Fj   Fk   Qj       Qk       Rj   Rk
Integer  No
Mult1    Yes   Mult  F0   F2   F4   -        -        No   No
Mult2    No
Add      Yes   Add   F6   F8   F2   -        -        No   No
Divide   Yes   Div   F10  F0   F6   Mult1    -        No   Yes

Register result status: F0 = Mult1, F6 = Add, F10 = Divide

Time = 15

Add in execution (1 out of 2).
Mult in execution (6 out of 10).
Div is stalled waiting for F0.

Functional unit status
Unit     Busy  Op    Fi   Fj   Fk   Qj       Qk       Rj   Rk
Integer  No
Mult1    Yes   Mult  F0   F2   F4   -        -        No   No
Mult2    No
Add      Yes   Add   F6   F8   F2   -        -        No   No
Divide   Yes   Div   F10  F0   F6   Mult1    -        No   Yes

Register result status: F0 = Mult1, F6 = Add, F10 = Divide

Time = 16

Add completes execution.
Mult in execution (7 out of 10).
Div is stalled waiting for F0.

Functional unit status
Unit     Busy  Op    Fi   Fj   Fk   Qj       Qk       Rj   Rk
Integer  No
Mult1    Yes   Mult  F0   F2   F4   -        -        No   No
Mult2    No
Add      Yes   Add   F6   F8   F2   -        -        No   No
Divide   Yes   Div   F10  F0   F6   Mult1    -        No   Yes

Register result status: F0 = Mult1, F6 = Add, F10 = Divide

Time = 17

Add is stalled by a WAR hazard (Div has not yet read F6, which Add wants to overwrite).
Mult in execution (8 out of 10).
Div is stalled waiting for F0.

Functional unit status
Unit     Busy  Op    Fi   Fj   Fk   Qj       Qk       Rj   Rk
Integer  No
Mult1    Yes   Mult  F0   F2   F4   -        -        No   No
Mult2    No
Add      Yes   Add   F6   F8   F2   -        -        No   No
Divide   Yes   Div   F10  F0   F6   Mult1    -        No   Yes

Register result status: F0 = Mult1, F6 = Add, F10 = Divide
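The WAR stall at this point can be checked mechanically. The sketch below uses the usual textbook-style condition (a unit may write its result only if no other unit still has that register as a ready but unread source); the `UnitEntry` record and the function name are assumptions made for this example, and the state is copied from the Time = 17 functional unit status.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class UnitEntry:
    """Hypothetical subset of a functional unit status row."""
    fi: str             # destination register
    fj: Optional[str]   # first source register
    fk: Optional[str]   # second source register
    rj: bool            # Fj ready and not yet read
    rk: bool            # Fk ready and not yet read


def can_write_result(name: str, units: Dict[str, UnitEntry]) -> bool:
    """Return False while another unit still has to read the register
    that `name` wants to overwrite (WAR hazard)."""
    dest = units[name].fi
    for other_name, other in units.items():
        if other_name == name:
            continue
        if (other.fj == dest and other.rj) or (other.fk == dest and other.rk):
            return False
    return True


# Busy units at Time = 17 in the walkthrough.
units = {
    "Mult1": UnitEntry("F0", "F2", "F4", rj=False, rk=False),
    "Add": UnitEntry("F6", "F8", "F2", rj=False, rk=False),
    "Divide": UnitEntry("F10", "F0", "F6", rj=False, rk=True),
}

print(can_write_result("Add", units))    # False: Div has not read F6 yet
units["Divide"].rk = False               # Div has read its operands
print(can_write_result("Add", units))    # True: Add may now write F6
```

Note that Div's Rj flag is No because F0 is not available yet, so the multiplier is never blocked from writing F0 by this check.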

Time = 19

Add is stalled, WAR hazard.
Mult completes execution.
Div is stalled waiting for F0.

Functional unit status
Unit     Busy  Op    Fi   Fj   Fk   Qj       Qk       Rj   Rk
Integer  No
Mult1    Yes   Mult  F0   F2   F4   -        -        No   No
Mult2    No
Add      Yes   Add   F6   F8   F2   -        -        No   No
Divide   Yes   Div   F10  F0   F6   Mult1    -        No   Yes

Register result status: F0 = Mult1, F6 = Add, F10 = Divide

Time = 20

Mult writes result. The Mult1 entry (Mult, Fi = F0, Fj = F2, Fk = F4) and the F0 entry in register result status are cleared; Div's Qj tag is removed and its Rj flag becomes Yes.
Add is stalled, WAR hazard.
Div is stalled waiting for F0.

Functional unit status (after the write)
Unit     Busy  Op    Fi   Fj   Fk   Qj       Qk       Rj   Rk
Integer  No
Mult1    No
Mult2    No
Add      Yes   Add   F6   F8   F2   -        -        No   No
Divide   Yes   Div   F10  F0   F6   -        -        Yes  Yes

Register result status: F6 = Add, F10 = Divide

Time = 21

Div reads operands (Rj and Rk change from Yes to No).
Add is stalled, WAR hazard.

Functional unit status
Unit     Busy  Op    Fi   Fj   Fk   Qj       Qk       Rj   Rk
Integer  No
Mult1    No
Mult2    No
Add      Yes   Add   F6   F8   F2   -        -        No   No
Divide   Yes   Div   F10  F0   F6   -        -        No   No

Register result status: F6 = Add, F10 = Divide

Time = 22

Add writes result. The Add entry (Add, Fi = F6, Fj = F8, Fk = F2) and the F6 entry in register result status are cleared.
Div in execution (1 out of 40).

Functional unit status (after the write)
Unit     Busy  Op    Fi   Fj   Fk   Qj       Qk       Rj   Rk
Integer  No
Mult1    No
Mult2    No
Add      No
Divide   Yes   Div   F10  F0   F6   -        -        No   No

Register result status: F10 = Divide

Time = 61

Div completes execution.

Functional unit status
Unit     Busy  Op    Fi   Fj   Fk   Qj       Qk       Rj   Rk
Integer  No
Mult1    No
Mult2    No
Add      No
Divide   Yes   Div   F10  F0   F6   -        -        No   No

Register result status: F10 = Divide

Time = 62

Div writes result. The Divide entry (Div, Fi = F10, Fj = F0, Fk = F6) and the F10 entry in register result status are cleared.

Functional unit status (after the write)
Unit     Busy  Op    Fi   Fj   Fk   Qj       Qk       Rj   Rk
Integer  No
Mult1    No
Mult2    No
Add      No
Divide   No

Register result status: no pending writes
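The whole walkthrough can be replayed with a small cycle-by-cycle model. The following is a simplified sketch rather than a full scoreboard: it tracks only the four instruction status times, takes the latencies from the walkthrough (load 1 cycle, add/sub 2, multiply 10, divide 40), and assumes in-order issue of at most one instruction per cycle. The `Instr` record, `simulate` and the way hazards are expressed through recorded cycle numbers are choices made for this example.

```python
from dataclasses import dataclass, field
from typing import List, Optional

LATENCY = {"Integer": 1, "Add": 2, "Mult": 10, "Divide": 40}
UNITS = {"Integer": 1, "Add": 1, "Mult": 2, "Divide": 1}  # units per class


@dataclass
class Instr:
    text: str
    unit: str                 # functional unit class
    dest: str
    srcs: List[str]
    issue: Optional[int] = None
    read: Optional[int] = None
    done: Optional[int] = None
    write: Optional[int] = None
    producers: List["Instr"] = field(default_factory=list)


def simulate(program: List[Instr], max_cycles: int = 1000) -> List[Instr]:
    next_idx = 0  # index of the next instruction to issue (in order)
    for t in range(1, max_cycles + 1):
        # Advance every issued instruction by at most one stage.
        for k, ins in enumerate(program[:next_idx]):
            if ins.read is None:
                # RAW: wait until every producer wrote in an earlier cycle.
                if all(p.write is not None and p.write < t
                       for p in ins.producers):
                    ins.read = t
            elif ins.done is None:
                if t >= ins.read + LATENCY[ins.unit]:
                    ins.done = t
            elif ins.write is None:
                # WAR: an earlier instruction must read dest before this cycle.
                war = any(ins.dest in e.srcs and (e.read is None or e.read >= t)
                          for e in program[:k])
                if not war:
                    ins.write = t
        # Try to issue the next instruction.
        if next_idx < len(program):
            cand = program[next_idx]
            active = [i for i in program[:next_idx]
                      if i.write is None or i.write >= t]
            unit_free = sum(i.unit == cand.unit for i in active) < UNITS[cand.unit]
            waw = any(i.dest == cand.dest for i in active)
            if unit_free and not waw:
                cand.issue = t
                cand.producers = [i for i in active if i.dest in cand.srcs]
                next_idx += 1
        if all(i.write is not None for i in program):
            return program
    raise RuntimeError("simulation did not finish")


program = [
    Instr("L.D F6, 34(R2)", "Integer", "F6", ["R2"]),
    Instr("L.D F2, 45(R3)", "Integer", "F2", ["R3"]),
    Instr("MUL.D F0, F2, F4", "Mult", "F0", ["F2", "F4"]),
    Instr("SUB.D F8, F6, F2", "Add", "F8", ["F6", "F2"]),
    Instr("DIV.D F10, F0, F6", "Divide", "F10", ["F0", "F6"]),
    Instr("ADD.D F6, F8, F2", "Add", "F6", ["F8", "F2"]),
]
for ins in simulate(program):
    print(f"{ins.text:20} {ins.issue:3} {ins.read:3} {ins.done:3} {ins.write:3}")
```

Under these assumptions the printed issue, read, completion and write cycles agree with the events noted in the walkthrough, for example ADD.D issuing at cycle 13 and writing at cycle 22, and DIV.D writing at cycle 62.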

Tomasulo’s Algorithm

● Use reservation stations that will hold operands for instructions waiting to issue
● A reservation station fetches an operand as soon as it is available
● Pending instructions read operands from reservation stations
● When writes overlap in execution, only the last write actually updates the register
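The last point can be illustrated with a register file that remembers, for each register, which reservation station will produce its value. This is a minimal sketch under assumed names (`RegisterStatus`, `issue`, `broadcast`, and the station names `Add1` and `Add2` are all hypothetical): a later instruction writing the same register overwrites the tag, so the earlier result is ignored when it arrives.

```python
from typing import Dict


class RegisterStatus:
    """Hypothetical register file with one producer tag per register."""

    def __init__(self) -> None:
        self.value: Dict[str, float] = {}
        self.tag: Dict[str, str] = {}   # register -> producing station

    def issue(self, dest: str, station: str) -> None:
        # The most recently issued writer of `dest` owns the register.
        self.tag[dest] = station

    def broadcast(self, station: str, result: float) -> None:
        # Only registers still tagged with this station take the result.
        for reg in [r for r, s in self.tag.items() if s == station]:
            self.value[reg] = result
            del self.tag[reg]


regs = RegisterStatus()
regs.issue("F6", "Add1")      # first write to F6
regs.issue("F6", "Add2")      # later write to F6 replaces the tag
regs.broadcast("Add1", 1.5)   # ignored: F6 now waits for Add2
regs.broadcast("Add2", 2.5)   # the last write updates F6
print(regs.value)             # {'F6': 2.5}
```

Because the earlier result never reaches the register, overlapping writes need no issue stall here, unlike the WAW check in the scoreboard.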

Tomasulo’s Algorithm

[Figure: block diagram of the FP unit using Tomasulo's algorithm. Labels: instruction queue (from instruction unit), FP registers, address unit, load buffers, store buffers (data, address), reservation stations (slots numbered 1 to 4), FP adders, FP multipliers, memory unit, common data bus. Load-store operations and FP operations take separate paths.]

Tomasulo’s Algorithm

● Each reservation station holds the opcode for the pending instruction and either operand values or names of reservation stations that will provide them
● Load and store buffers hold data and addresses for memory access
● Transfer of all data goes over the common data bus
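A reservation station as described above can be sketched as a small record plus a broadcast routine. The field names `vj`, `vk` (operand values) and `qj`, `qk` (names of producing stations) follow common textbook notation and, like the station names and `broadcast` itself, are assumptions for this example rather than something the slides define.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ReservationStation:
    name: str
    op: Optional[str] = None      # opcode of the pending instruction
    vj: Optional[float] = None    # value of the first operand, if known
    vk: Optional[float] = None    # value of the second operand, if known
    qj: Optional[str] = None      # station that will produce operand j
    qk: Optional[str] = None      # station that will produce operand k

    def ready(self) -> bool:
        """An instruction can start once both operands hold values."""
        return self.op is not None and self.qj is None and self.qk is None


def broadcast(stations: List[ReservationStation],
              producer: str, result: float) -> None:
    """Model one transfer on the common data bus: every station waiting
    on `producer` captures the value at once."""
    for rs in stations:
        if rs.qj == producer:
            rs.vj, rs.qj = result, None
        if rs.qk == producer:
            rs.vk, rs.qk = result, None


# A divide waiting for a multiply result; the second operand is known.
div = ReservationStation("Div1", op="DIV.D", qj="Mult1", vk=3.0)
stations = [div]
print(div.ready())                  # False: still waiting for Mult1
broadcast(stations, "Mult1", 12.0)
print(div.ready(), div.vj, div.vk)  # True 12.0 3.0
```

Because the station keeps its own copy of each operand value, a later write to the source register cannot disturb it, which is how this organization avoids the WAR stalls seen with the scoreboard.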

Homework

● Due Tuesday, October 19, by the end of the class
● Submit either in class (paper) or by e-mail (PS or PDF only), or bring the paper copy to my office
● Show the scheduling of the following code using a scoreboard (assume one integer ALU, two FP multipliers, one FP adder and one FP divider)

LD F2, 0(R2)

LD F4, 100(R3)

ADD F8, F2, F2

MUL F6, F4, F8

SUB F6, F2, F4