CO Unit 2
The Processor
§4.1 Introduction
CPU performance factors
Instruction count
Determined by ISA and compiler
CPI (clock cycles per instruction) and cycle time
Determined by CPU hardware
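The factors above combine into the usual performance equation: CPU time = instruction count × CPI × clock cycle time. A minimal sketch follows; the workload numbers are made up purely for illustration and do not come from the slides.

```python
def cpu_time(instruction_count, cpi, cycle_time_s):
    """CPU time = instruction count x CPI x clock cycle time (in seconds)."""
    return instruction_count * cpi * cycle_time_s


# Hypothetical workload: 1 million instructions, CPI of 2, 1 ns cycle (1 GHz clock).
example_time = cpu_time(1_000_000, 2.0, 1e-9)
print(example_time)
```

Halving any one of the three factors halves the CPU time, which is why both the ISA/compiler side and the hardware side matter.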
We will examine two MIPS implementations
A simplified version
A more realistic pipelined version
Simple subset, shows most aspects
Memory reference: lw, sw
Arithmetic/logical: add, sub, and, or, slt
Control transfer: beq, j
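Arithmetic/Logic Unit: Y = F(A, B)
Multiplexer: Y = S ? I1 : I0
[Figure: multiplexer with inputs I0 and I1, select S, output Y; ALU with inputs A and B, function select F, output Y]
[Figure: register with input D, clock Clk, output Q; register with write control, inputs D, Write, and Clk, output Q]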
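The behaviour of these elements can be sketched in software. This is only a behavioural model under simple assumptions: the names `mux2`, `alu`, and `Register` are illustrative, the ALU function is passed in as an ordinary Python callable, and one call to `clock` stands for one clock edge.

```python
def mux2(s, i0, i1):
    """Two-input multiplexer: Y = S ? I1 : I0."""
    return i1 if s else i0


def alu(f, a, b):
    """ALU: Y = F(A, B), where f is the two-operand function selected by control."""
    return f(a, b) & 0xFFFFFFFF  # keep the result to 32 bits


class Register:
    """State element: Q takes the value of D on a clock edge, only if Write is set."""

    def __init__(self):
        self.q = 0

    def clock(self, d, write=True):
        if write:
            self.q = d
        return self.q


reg = Register()
reg.clock(7)               # Q becomes 7
reg.clock(9, write=False)  # Write not asserted, Q stays 7
```

The combinational elements are plain functions of their inputs, while the register is the only piece that carries state from one call to the next.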
Required elements
Two state elements
Instruction memory (read access)
PC (32-bit)
Adder (adds two 32-bit inputs and outputs the sum)
Instruction Fetch
Increment by 4 for next instruction
32-bit register
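The fetch step described above (read the instruction at the PC, then add 4 to the PC) can be sketched as follows. The dictionary used as instruction memory and the instruction words in it are hypothetical placeholders, not real encodings from the slides.

```python
def fetch(instruction_memory, pc):
    """Read the 32-bit instruction at pc and compute the next sequential PC."""
    instruction = instruction_memory[pc]
    next_pc = (pc + 4) & 0xFFFFFFFF  # PC is a 32-bit register
    return instruction, next_pc


# Hypothetical instruction memory: byte address -> 32-bit instruction word.
imem = {0: 0x11111111, 4: 0x22222222, 8: 0x33333333}

pc = 0
first, pc = fetch(imem, pc)
second, pc = fetch(imem, pc)
```

The increment is 4 because each instruction is 32 bits (4 bytes) wide and memory is byte addressed.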
Sign extension: increases the size by replicating the high-order sign bit
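Sign extension can be illustrated with a small function. This is a sketch that assumes the common case of widening a 16-bit field to 32 bits; the function name and default widths are illustrative choices.

```python
def sign_extend(value, from_bits=16, to_bits=32):
    """Widen value from from_bits to to_bits by replicating the sign bit."""
    value &= (1 << from_bits) - 1          # keep only the low from_bits bits
    sign_bit = 1 << (from_bits - 1)
    if value & sign_bit:
        # Fill every bit above the original field with 1s.
        value |= ((1 << to_bits) - 1) & ~((1 << from_bits) - 1)
    return value


positive = sign_extend(0x0004)  # 0x00000004
negative = sign_extend(0xFFFC)  # 0xFFFFFFFC, i.e. -4 in two's complement
```

A non-negative value gains leading zeros and a negative value gains leading ones, so the two's-complement value is unchanged.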
Just re-routes wires
0000 AND
0001 OR
Instruction [5–0]
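The ALU control codes and the funct field (instruction bits 5 to 0) fit together roughly as sketched below. Only the AND (0000) and OR (0001) codes appear in these notes; the remaining codes, the 2-bit `alu_op` input, and the funct values are taken from the standard MIPS single-cycle design and should be read as assumptions here.

```python
# funct field (instruction bits 5-0) -> 4-bit ALU control code
FUNCT_TO_ALU = {
    0b100000: 0b0010,  # add
    0b100010: 0b0110,  # sub
    0b100100: 0b0000,  # and
    0b100101: 0b0001,  # or
    0b101010: 0b0111,  # slt
}


def alu_control(alu_op, funct):
    """Derive the ALU control code from the 2-bit ALUOp and the funct field."""
    if alu_op == 0b00:   # lw/sw: add to form the address
        return 0b0010
    if alu_op == 0b01:   # beq: subtract to compare
        return 0b0110
    return FUNCT_TO_ALU[funct & 0x3F]  # R-type: operation chosen by funct
```

For example, `alu_control(0b10, 0b100101)` yields `0b0001`, the OR code listed above.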
Branch/jump target
In pipeline
Need to compare registers and compute target early in the pipeline
Add hardware to do it in ID stage
Predict not taken as solution
Right-to-left flow leads to hazards
[Figure labels: MEM, WB]
2 stages: PC increment and register write
Chapter 4 — The Processor
Pipeline registers
Need registers between stages (IF/ID, ID/EX, EX/MEM, MEM/WB: 64, 128, 97, 64 bits)
To hold information produced in previous cycle
Wrong register number
In the above example
sub-and hazard (type 1a):
EX/MEM.RegisterRd = ID/EX.RegisterRs = X2
Similarly
sub-or hazard (type 2b):
MEM/WB.RegisterRd = ID/EX.RegisterRt = X2
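The hazard conditions can be written as simple register-number comparisons. The sketch below covers types 1a and 2b from the example and adds 1b and 2a by analogy; the function name and the register numbers other than X2 are hypothetical, and `None` stands for "no destination register in that stage".

```python
def data_hazards(ex_mem_rd, mem_wb_rd, id_ex_rs, id_ex_rt):
    """Return the hazard types seen by the instruction currently in EX."""
    hazards = []
    if ex_mem_rd is not None and ex_mem_rd == id_ex_rs:
        hazards.append("1a")
    if ex_mem_rd is not None and ex_mem_rd == id_ex_rt:
        hazards.append("1b")
    if mem_wb_rd is not None and mem_wb_rd == id_ex_rs:
        hazards.append("2a")
    if mem_wb_rd is not None and mem_wb_rd == id_ex_rt:
        hazards.append("2b")
    return hazards


# sub (writes X2) is in MEM while 'and' reads X2 as its first source.
sub_and = data_hazards(ex_mem_rd=2, mem_wb_rd=None, id_ex_rs=2, id_ex_rt=5)
# sub (writes X2) is in WB while 'or' reads X2 as its second source.
sub_or = data_hazards(ex_mem_rd=None, mem_wb_rd=2, id_ex_rs=6, id_ex_rt=2)
```

Here `sub_and` is `["1a"]` and `sub_or` is `["2b"]`, matching the two cases listed above.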
Dependencies & Forwarding
ForwardA = 00: source ID/EX. The first ALU operand comes from the register file.
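A forwarding unit chooses the ForwardA value from the hazard conditions. Only the 00 case is given in these notes; the sketch below assumes the usual textbook encodings of 10 (forward from EX/MEM) and 01 (forward from MEM/WB), and assumes register 31 is the always-zero register as in the X-register naming used above (a MIPS design would use register 0 instead).

```python
def forward_a(ex_mem_reg_write, ex_mem_rd,
              mem_wb_reg_write, mem_wb_rd,
              id_ex_rs, zero_reg=31):
    """Select the source of the first ALU operand (ForwardA)."""
    # EX hazard: the most recent result is in EX/MEM, so it takes priority.
    if ex_mem_reg_write and ex_mem_rd != zero_reg and ex_mem_rd == id_ex_rs:
        return 0b10
    # MEM hazard: an older result is waiting in MEM/WB.
    if mem_wb_reg_write and mem_wb_rd != zero_reg and mem_wb_rd == id_ex_rs:
        return 0b01
    # No hazard: use the value read from the register file (ID/EX).
    return 0b00
```

Checking EX/MEM first matters when both pipeline registers hold a result for the same register: the newer value must win.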
Flush these instructions (set control values to 0)
PC
Address n: ALU/branch instruction, pipeline stages IF ID EX MEM WB
Schedule row (cycle 2): ALU/branch slot SUBI X20, X20, #4; load/store slot nop
IPC = 7/6 = 1.17 (c.f. peak IPC = 2)
Loop Unrolling
Replicate loop body to expose more parallelism
Reduces loop-control overhead
Use different registers per replication
Called “register renaming”
Avoid loop-carried “anti-dependencies”
Store followed by a load of the same register
Aka “name dependence”
Reuse of a register name
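The transformation can be shown at source level. This is an illustrative sketch only: Python will not run the unrolled version faster, and the four accumulator variables stand in for the distinct registers a compiler would assign to each replicated copy of the loop body.

```python
def sum_rolled(values):
    """Original loop: one add plus one loop-control step per element."""
    total = 0
    for v in values:
        total += v
    return total


def sum_unrolled_by_4(values):
    """Loop body replicated 4 times, each copy using its own accumulator."""
    t0 = t1 = t2 = t3 = 0
    n = len(values) - len(values) % 4  # largest multiple of 4 that fits
    for i in range(0, n, 4):
        t0 += values[i]
        t1 += values[i + 1]
        t2 += values[i + 2]
        t3 += values[i + 3]
    for v in values[n:]:               # leftover elements
        t0 += v
    return t0 + t1 + t2 + t3
```

Because the four adds in one iteration write different names, none of them has to wait for another, and the loop-control work is paid once per four elements instead of once per element.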
Schedule row (cycle 2): ALU/branch slot nop; load/store slot LDUR X1, [X20,#24]
IPC = 15/8 = 1.875
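The IPC figures are simply instructions divided by cycles. A small check of the two values quoted in these notes (7 instructions in 6 cycles before unrolling, 15 instructions in 8 cycles after), with an illustrative helper name:

```python
def ipc(instructions, cycles):
    """Instructions per cycle for one pass through the scheduled loop."""
    return instructions / cycles


scheduled = ipc(7, 6)    # about 1.17
unrolled = ipc(15, 8)    # 1.875
PEAK_IPC = 2             # two issue slots per cycle
```

Unrolling moves the loop closer to the peak of 2 because fewer issue slots are left holding nops.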
Hold pending operands