Lect. 3: Superscalar Processors
▪ Pipelining: several instructions are simultaneously at different
stages of their execution
▪ Superscalar: several instructions are simultaneously at the same
stage of their execution
▪ Out-of-order execution: instructions can be executed in an order
different from that specified in the program
▪ Dependences between instructions:
– Data dependence (a.k.a. Read After Write, RAW)
– Control dependence
▪ Speculative execution: tentative execution despite dependences
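
The RAW dependences listed above can be illustrated with a minimal sketch. It assumes a hypothetical instruction format (one destination register and a tuple of source registers) that is not taken from the lecture. Each instruction that reads a register is linked to the most recent earlier instruction that wrote it.

```python
from typing import NamedTuple, Optional, Tuple, List, Dict


class Instr(NamedTuple):
    # Hypothetical instruction record, for illustration only.
    name: str
    dest: Optional[str]      # register written, or None
    srcs: Tuple[str, ...]    # registers read


def raw_dependences(program: List[Instr]) -> List[Tuple[str, str]]:
    """Return (producer, consumer) pairs: consumer reads a register
    whose most recent writer in program order is producer."""
    last_writer: Dict[str, str] = {}
    deps = []
    for ins in program:
        # Check the sources first, so an instruction never depends on itself.
        for reg in ins.srcs:
            if reg in last_writer:
                deps.append((last_writer[reg], ins.name))
        if ins.dest is not None:
            last_writer[ins.dest] = ins.name
    return deps


program = [
    Instr("I1", "r1", ("r2", "r3")),
    Instr("I2", "r4", ("r1", "r5")),  # reads r1, written by I1
    Instr("I3", "r6", ("r7", "r8")),  # independent of I1 and I2
    Instr("I4", "r1", ("r4", "r6")),  # reads r4 (I2) and r6 (I3)
]
deps = raw_dependences(program)
print(deps)
```

In this example I3 has no RAW dependence on I1 or I2, so an out-of-order processor could execute it before I2 without changing the result.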
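Figure: five-stage pipeline IF, ID, EXE, MEM, WB (blocks labelled Memory, Memory, General registers)

Pipelined execution (one instruction per stage per cycle):

cycle   1      2      3      4      5      6
EXE                   I1     I2     I3     I4
MEM                          I1     I2     I3
WB                                  I1     I2

2-way superscalar execution (two instructions per stage per cycle):

cycle   1      2      3      4      5      6
EXE                   I1,I2  I3,I4  I5,I6  I7,I8
MEM                          I1,I2  I3,I4  I5,I6
WB                                  I1,I2  I3,I4

CPI → 0.5; IPC → 2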
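
The limits CPI → 0.5 and IPC → 2 can be checked with a small calculation. This is a sketch that assumes an ideal pipeline with no hazards: the first group of instructions needs `depth` cycles, and each later group of `width` instructions finishes one cycle after the previous one. The function names are illustrative.

```python
import math


def ideal_cycles(n_instr: int, depth: int = 5, width: int = 1) -> int:
    """Cycles to complete n_instr instructions in an ideal (hazard-free)
    pipeline with `depth` stages that handles `width` instructions per stage."""
    return depth + math.ceil(n_instr / width) - 1


def cpi(n_instr: int, depth: int = 5, width: int = 1) -> float:
    """Average cycles per instruction for the ideal pipeline above."""
    return ideal_cycles(n_instr, depth, width) / n_instr


for n in (8, 1000, 1_000_000):
    scalar = cpi(n, width=1)
    two_way = cpi(n, width=2)
    print(f"n={n}: scalar CPI={scalar:.4f}, 2-way CPI={two_way:.4f}, "
          f"2-way IPC={1 / two_way:.4f}")
```

As the number of instructions grows, the pipeline fill time is amortized: the scalar CPI approaches 1 and the 2-way CPI approaches 0.5 (IPC approaches 2).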
CS4/MSc Parallel Architectures - 2017-2018
Advanced Superscalar Execution
▪ Ideally: in an n-issue superscalar, n instructions are fetched,
decoded, executed, and committed per cycle
▪ In practice:
– Data, control, and structural hazards spoil issue flow
– Multi-cycle instructions spoil commit flow
▪ Buffers at issue (issue queue) and commit (reorder buffer)
decouple these stages from the rest of the pipeline and smooth out,
to some extent, breaks in the flow
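
The effect of multi-cycle instructions on commit flow can be sketched with a toy reorder buffer. This is an illustration under simplifying assumptions that are not from the lecture: `width` instructions issue per cycle in program order, there are no dependences, instruction i finishes `latencies[i]` cycles after it issues, and at most `width` finished instructions commit per cycle, oldest first.

```python
from collections import deque
from typing import List


def simulate_commit(latencies: List[int], width: int = 2) -> List[int]:
    """Return the cycle in which each instruction commits (toy model)."""
    n = len(latencies)
    # Instruction i issues in cycle i // width and is finished
    # latencies[i] cycles later.
    done = [i // width + latencies[i] for i in range(n)]

    # For simplicity the buffer holds all instructions in program order;
    # one that has not issued yet can never count as finished.
    rob = deque(range(n))
    commit_cycle = [0] * n
    cycle = 0
    while rob:
        committed = 0
        # Commit in order: stop at the first unfinished instruction.
        while rob and committed < width and done[rob[0]] <= cycle:
            commit_cycle[rob.popleft()] = cycle
            committed += 1
        cycle += 1
    return commit_cycle


# The second instruction takes 3 cycles; the others take 1.
print(simulate_commit([1, 3, 1, 1], width=2))  # [1, 3, 3, 4]
print(simulate_commit([1, 1, 1, 1], width=2))  # [1, 1, 2, 2]
```

In the first run, the third and fourth instructions finish in cycle 2 but wait in the buffer behind the 3-cycle instruction. This is the break in commit flow that the reorder buffer absorbs.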
Figure: pipeline with a fetch engine followed by ID, EXE, MEM, WB (blocks labelled Memory, Memory, General registers; instructions buffered between stages)
Case 2: instructions spread over more lines and no branch
Figure from Rotenberg et al.
Figure: bar chart of Instructions Per Clock for the benchmarks gcc, espresso, li, fpppp, doduc, and tomcatv
Figure from Rotenberg et al.
...
     I18 [Cond. Br. to L5]
L4:  I19 [ALU]                  B4
     ...                        (I19-I24)
     I24 [Cond. Br. to L1]
L5:
Common path: B4 (I19-I24)