Computer Organization & Assembly Language: CS/COE0447
Chapter 5 Part 3
1
Single-cycle Implementation of
MIPS
• Our first implementation of MIPS used a single long clock
cycle for every instruction
• Every instruction began on one rising (or falling) clock edge
and ended on the next rising (or falling) clock edge
• This approach is not practical as it is much slower than a
multicycle implementation where different instruction
classes can take different numbers of cycles
– in a single-cycle implementation every instruction must take the
same amount of time as the slowest instruction
– in a multicycle implementation this problem is avoided by
allowing quicker instructions to use fewer cycles
• Even though the single-cycle approach is not practical it was
simpler and useful to understand first
4
Execution: single-cycle (reminder)
• add
– Fetch instruction and add 4 to PC (example: add $t2,$t1,$t0)
– Read two source registers ($t1 and $t0)
– Add two values ($t1 + $t0)
– Store result to the destination register ($t1 + $t0 → $t2)
5
A Multi-cycle Datapath
•For add:
•Instruction is stored in the instruction register (IR)
•Values read from rs and rt are stored in A and B
•Result of ALU is stored in ALUOut
6
Multi-Cycle Execution: R-type
• Instruction fetch
– IR <= Memory[PC]; sub $t0,$t1,$t2
– PC <= PC + 4;
• Decode instruction/register read
– A <= Reg[IR[25:21]]; rs
– B <= Reg[IR[20:16]]; rt
– ALUOut <= PC + (sign-extend(IR[15:0])<<2); later
• Execution
– ALUOut <= A op B; op = add, sub, and, or,…
• Completion
– Reg[IR[15:11]] <= ALUOut; $t0 <= ALU result
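The four steps above can be traced in software. The following is a minimal Python sketch, not the course's reference model: the helper names (field, run_r_type), the starting address 0x00400000, and the register values are illustrative assumptions, and only add and sub are modeled. The speculative branch-target computation in the decode step is left out because R-type instructions never use it.

```python
MASK32 = 0xFFFFFFFF

def field(word, hi, lo):
    """Return bits hi..lo (inclusive) of a word, like IR[hi:lo] on the slide."""
    return (word >> lo) & ((1 << (hi - lo + 1)) - 1)

def run_r_type(memory, reg, pc):
    """Run one add/sub through the four multi-cycle steps; return the new PC."""
    # Cycle 1: instruction fetch
    ir = memory[pc]
    pc = (pc + 4) & MASK32
    # Cycle 2: decode / register read
    a = reg[field(ir, 25, 21)]   # rs
    b = reg[field(ir, 20, 16)]   # rt
    # Cycle 3: execution (the funct field selects the ALU operation)
    funct = field(ir, 5, 0)
    if funct == 0x20:            # add
        alu_out = (a + b) & MASK32
    elif funct == 0x22:          # sub
        alu_out = (a - b) & MASK32
    else:
        raise ValueError("only add and sub are modeled in this sketch")
    # Cycle 4: completion, write ALUOut to rd
    reg[field(ir, 15, 11)] = alu_out
    return pc

reg = [0] * 32
reg[9], reg[10] = 7, 5                 # assumed values: $t1 = 7, $t2 = 5
memory = {0x00400000: 0x012A4022}      # encoding of sub $t0,$t1,$t2
new_pc = run_r_type(memory, reg, 0x00400000)
```

After the call, reg[8] ($t0) holds 7 - 5 and new_pc is the fetch address plus 4.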
7
Execution: single-cycle (reminder)
• lw (load word)
– Fetch instruction and add 4 to PC lw $t0,-12($t1)
– Read the base register $t1
– Sign-extend the immediate offset 0xfff4 → 0xfffffff4
– Add two values to get address X = 0xfffffff4 + $t1
– Access data memory with the computed address M[X]
– Store the memory data to the destination register $t0
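The sign extension and address arithmetic in this lw example can be checked with a short Python sketch. The function names and the base address 0x10010020 are assumptions chosen for illustration; 32-bit wraparound is modeled with a mask.

```python
MASK32 = 0xFFFFFFFF

def sign_extend16(imm):
    """Extend a 16-bit immediate to 32 bits by copying bit 15 upward."""
    imm &= 0xFFFF
    return (imm | 0xFFFF0000) if (imm & 0x8000) else imm

def effective_address(base, imm16):
    """Base register plus sign-extended offset, wrapped to 32 bits."""
    return (base + sign_extend16(imm16)) & MASK32

offset = sign_extend16(0xFFF4)                 # the -12 from lw $t0,-12($t1)
addr = effective_address(0x10010020, 0xFFF4)   # assumed $t1 = 0x10010020
```

Here offset is 0xfffffff4 and addr is 12 bytes below the assumed base.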
8
A Multi-cycle Datapath
10
Execution: single-cycle (reminder)
• sw
– Fetch instruction and add 4 to PC sw $t0,-4($t1)
– Read the base register $t1
– Read the source register $t0
– Sign-extend the immediate offset 0xfffc → 0xfffffffc
– Add two values to get address X = 0xfffffffc + $t1
– Store the contents of the source register to the computed
address $t0 → Memory[X]
11
A Multi-cycle Datapath
12
Multi-cycle Execution: sw
• Instruction fetch
– IR <= Memory[PC]; sw $t0,-12($t1)
– PC <= PC + 4;
• Decode/register read
– A <= Reg[IR[25:21]]; rs
– B <= Reg[IR[20:16]]; rt
– ALUOut <= PC + (sign-extend(IR[15:0])<<2);
• Execution
– ALUOut <= A + sign-extend(IR[15:0]); $t1 + -12 (sign extended)
• Memory Access
– Memory[ALUOut] <= B; M[$t1 + -12] <= $t0
13
Execution: single-cycle (reminder)
• beq
– Fetch instruction and add 4 to PC beq $t0,$t1,L
• Assume that L is +3 instructions away
– Read two source registers $t0,$t1
– Sign Extend the immediate, and shift it left by 2
• 0x0003 → 0x0000000c
– Perform the test, and update the PC if it is true
• If $t0 == $t1, then PC = PC + 0x0000000c
• [We follow what MARS does here, so this is not the alternative
convention: immediate == 0x0002; PC = PC + 4 + 0x00000008]
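The offset arithmetic above can be sketched in Python. The function names and the example PC value are assumptions; which PC value serves as the base (before or after the +4 increment) is left to the caller, since the slide notes that conventions differ.

```python
MASK32 = 0xFFFFFFFF

def sign_extend16(imm):
    """Extend a 16-bit immediate to 32 bits by copying bit 15 upward."""
    imm &= 0xFFFF
    return (imm | 0xFFFF0000) if (imm & 0x8000) else imm

def branch_target(pc_base, imm16):
    """pc_base + (sign-extended immediate << 2), wrapped to 32 bits."""
    byte_offset = (sign_extend16(imm16) << 2) & MASK32
    return (pc_base + byte_offset) & MASK32

forward = branch_target(0x00400000, 0x0003)    # 3 instructions forward
backward = branch_target(0x00400000, 0xFFFE)   # 2 instructions backward
```

An immediate of 0x0003 adds 0xc bytes to the base, matching the slide; a negative immediate moves the target backward.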
14
A Multi-cycle Datapath
15
Multi-cycle execution: beq
• Instruction fetch
– IR <= Memory[PC]; beq $t0,$t1,label
– PC <= PC + 4;
• Decode/register read
– A <= Reg[IR[25:21]]; rs
– B <= Reg[IR[20:16]]; rt
– ALUOut <= PC + (sign-extend(IR[15:0])<<2);
• PC + the byte offset to the label (negative for backward branches,
positive for forward branches)
• Execution
– if (A == B) then PC <= ALUOut;
• if $t0 == $t1 perform branch
• Note: the ALU is used to evaluate A == B; we’ll see that this does
not clash with the use of the ALU above.
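One common way for an ALU to evaluate A == B is to subtract and inspect a Zero output. The Python sketch below assumes that approach; the function name and the example addresses are illustrative. The target in ALUOut was already computed during decode, so the ALU is free for the comparison in this step.

```python
MASK32 = 0xFFFFFFFF

def beq_execute(a, b, pc, alu_out):
    """Execution step of beq: return the PC value after this cycle."""
    diff = (a - b) & MASK32    # ALU performs a subtraction
    zero = (diff == 0)         # Zero output is set when A == B
    return alu_out if zero else pc

taken = beq_execute(5, 5, 0x00400004, 0x00400010)
not_taken = beq_execute(5, 6, 0x00400004, 0x00400010)
```

When the operands match, the PC takes the previously computed target; otherwise it keeps the incremented value from the fetch step.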
16
Execution: single-cycle (reminder)
• j
– Fetch instruction and add 4 to PC
– Take the 26-bit immediate field
– Shift left by 2 (to make 28-bit immediate)
– Get 4 bits from the current PC and attach to
the left of the immediate
– Assign the value to PC
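The jump-address construction above can be written out in Python. The function name and the example values are assumptions for illustration.

```python
def jump_target(pc, imm26):
    """Concatenate PC[31:28] with the 26-bit immediate shifted left by 2."""
    upper4 = pc & 0xF0000000               # top 4 bits of the current PC
    lower28 = (imm26 & 0x03FFFFFF) << 2    # 26-bit field becomes 28 bits
    return upper4 | lower28

target = jump_target(0x00400004, 0x00100000)
high_region = jump_target(0x90000004, 0x0000001)
```

Because the upper 4 bits come from the PC, a jump stays within the current 256 MB region of the address space.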
17
A Multi-cycle Datapath
•For j
•No accesses to registers or memory; no need for ALU
18
Multi-cycle execution: j
• Instruction fetch
– IR <= Memory[PC]; j label
– PC <= PC + 4;
• Decode/register read
– A <= Reg[IR[25:21]];
– B <= Reg[IR[20:16]];
– ALUOut <= PC + (sign-extend(IR[15:0])<<2);
• Execution
– PC <= {PC[31:28],IR[25:0],”00”};
19
Multi-Cycle Control
What we need to cover
• Adding registers after every functional unit
– Need to modify the “instruction execution” slides to reflect this
• Breaking instruction execution down into cycles
– What can be done during the same cycle? What requires a
cycle?
– Need to modify the “instruction execution” slides again
– Timing
• Control signal values
– What they are per cycle, per instruction
– Finite state machine which determines signals based on
instruction type + which cycle it is
• Putting it all together
20
Multicycle Approach
21
Operations
•Before: we had separate memories for instructions and data, and we had
extra adders for incrementing the PC and calculating the branch address. Now
we have just one memory and just one ALU.
Five Execution Steps
• Each takes one cycle
• In one cycle, there can be at most one memory
access, at most one register access, and at most
one ALU operation
• But you can have a memory access, an ALU op,
and/or a register access in the same cycle, as long
as there is no contention for resources
• Changes to registers are made at the end of the
clock cycle
– PC, ALUOut, A, B, etc. save information for the next
clock cycle
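The rule that register changes take effect at the end of the clock cycle can be illustrated with a two-phase update in Python: every right-hand side is evaluated against the old state, then all results are committed together. The helper clock_edge, the dictionary layout, and the example address are assumptions for this sketch.

```python
def clock_edge(state, updates):
    """Evaluate all right-hand sides on the OLD state, then commit them at once."""
    new_values = {name: rhs(state) for name, rhs in updates.items()}
    return {**state, **new_values}

memory = {0x00400000: 0x012A4022}      # assumed instruction word at PC

# Fetch step: IR <= Memory[PC]; PC <= PC + 4
fetch = {
    "IR": lambda s: memory[s["PC"]],
    "PC": lambda s: s["PC"] + 4,
}

state = clock_edge({"PC": 0x00400000, "IR": 0}, fetch)
```

Both assignments read the old PC, so the order in which they are listed does not matter, which is the behavior the <= notation on the slides denotes.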
24
Step 1: Instruction Fetch
• Access memory w/ PC to fetch instruction
and store it in Instruction Register (IR)
• Increment PC by 4
– We can do this because the ALU is not being
used for something else this cycle
25
Step 2: Decode and Reg. Read
26
Step 3: Various Actions
• ALU performs one of three functions based on instruction
type
• Memory reference
– ALUOut <= A + sign-extend(IR[15:0]);
• R-type
– ALUOut <= A op B;
• Branch:
– if (A==B) PC <= ALUOut;
• Jump:
– PC <= {PC[31:28],IR[25:0],2’b00};
27
Step 4: Memory Access…
• If the instruction is memory reference
– MDR <= Memory[ALUOut]; // if it is a load
– Memory[ALUOut] <= B; // if it is a store
• Store is complete!
• If the instruction is R-type
– Reg[IR[15:11]] <= ALUOut;
• Now the instruction is complete!
28
Step 5: Register Write Back
• Only the lw instruction reaches this step
– Reg[IR[20:16]] <= MDR;
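Steps 3 through 5 for loads and stores can be traced with a small Python sketch. The function name, the dictionary used as memory, and the register contents are assumptions; the instruction fields are passed in directly rather than decoded from an instruction word.

```python
MASK32 = 0xFFFFFFFF

def sign_extend16(imm):
    """Extend a 16-bit immediate to 32 bits by copying bit 15 upward."""
    imm &= 0xFFFF
    return (imm | 0xFFFF0000) if (imm & 0x8000) else imm

def run_memory_instruction(op, rs, rt, imm16, reg, mem):
    """Run lw or sw from step 2 onward; return the total number of steps used."""
    a, b = reg[rs], reg[rt]                        # step 2: register read
    alu_out = (a + sign_extend16(imm16)) & MASK32  # step 3: address computation
    if op == "sw":
        mem[alu_out] = b                           # step 4: store completes
        return 4
    mdr = mem[alu_out]                             # step 4: load reads memory
    reg[rt] = mdr                                  # step 5: register write-back
    return 5

reg = [0] * 32
reg[9] = 0x10010020                                # assumed base in $t1
mem = {0x10010014: 99}
lw_steps = run_memory_instruction("lw", 9, 8, 0xFFF4, reg, mem)   # lw $t0,-12($t1)
reg[8] += 1
sw_steps = run_memory_instruction("sw", 9, 8, 0xFFFC, reg, mem)   # sw $t0,-4($t1)
```

The load needs the extra step because the memory data must first be latched in MDR before it can be written to the register file.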
29
Summary of Instruction
Execution
• Step 1: IF (Instruction fetch), all instructions
– IR = Memory[PC]
– PC = PC + 4
• Step 2: ID (Instruction decode/register fetch), all instructions
– A = Reg[IR[25-21]]
– B = Reg[IR[20-16]]
– ALUOut = PC + (sign-extend(IR[15-0]) << 2)
• Step 3: EX (Execution, address computation, branch/jump completion)
– R-type: ALUOut = A op B
– Memory reference: ALUOut = A + sign-extend(IR[15-0])
– Branch: if (A == B) then PC = ALUOut
– Jump: PC = PC[31-28] || (IR[25-0] << 2)
• Step 4: MEM (Memory access or R-type completion)
– R-type: Reg[IR[15-11]] = ALUOut
– Load: MDR = Memory[ALUOut]
– Store: Memory[ALUOut] = B
• Step 5: WB (Memory read completion)
– Load: Reg[IR[20-16]] = MDR
30
Multicycle Execution Step (1):
Instruction Fetch
IR = Memory[PC];
PC = PC + 4;
[Figure: multicycle datapath for instruction fetch; labeled value: PC + 4]
31
Multicycle Execution Step (2):
Instruction Decode & Register Fetch
A = Reg[IR[25-21]]; (A = Reg[rs])
B = Reg[IR[20-16]]; (B = Reg[rt])
ALUOut = PC + (sign-extend(IR[15-0]) << 2)
[Figure: multicycle datapath for decode and register fetch; labeled values: Reg[rs], Reg[rt], PC + 4, Branch Target Address]
32
Multicycle Execution Step (3):
Memory Reference Instructions
ALUOut = A + sign-extend(IR[15-0]);
[Figure: multicycle datapath for memory-reference execution; labeled values: Reg[rs], Reg[rt], PC + 4, Mem. Address]
33
Multicycle Execution Step (4):
Memory Access - Write (sw)
Memory[ALUOut] = B;
[Figure: multicycle datapath for the sw memory write; labeled values: Reg[rs], Reg[rt], PC + 4]
34
Multicycle Execution Step (4):
Memory Access - Read (lw)
MDR = Memory[ALUOut];
[Figure: multicycle datapath for the lw memory read; labeled values: Reg[rs], Reg[rt], PC + 4, Mem. Address, Mem. Data]
35
Multicycle Execution Step (5):
Memory Read Completion (lw)
Reg[IR[20-16]] = MDR;
[Figure: multicycle datapath for lw memory read completion; labeled values: Reg[rs], Reg[rt], PC + 4, Mem. Address, Mem. Data]
36
Multicycle Execution Step (3):
ALU Instruction (R-Type)
ALUOut = A op B
[Figure: multicycle datapath for R-type execution; labeled values: Reg[rs], Reg[rt], PC + 4, R-Type Result]
37
Multicycle Execution Step (4):
ALU Instruction (R-Type)
Reg[IR[15:11]] = ALUOut;
[Figure: multicycle datapath for R-type completion; labeled values: Reg[rs], Reg[rt], PC + 4, R-Type Result]
38
Multicycle Execution Step (3):
Branch Instructions
if (A == B) PC = ALUOut;
[Figure: multicycle datapath for branch execution; labeled values: Reg[rs], Reg[rt], Branch Target Address]
39
Multicycle Execution Step (3):
Jump Instruction
PC = PC[31-28] concat (IR[25-0] << 2)
[Figure: multicycle datapath for jump execution; labeled values: Reg[rs], Reg[rt], Branch Target Address, Jump Address]
40
For Reference
• The next 5 slides give the steps, one slide
per instruction
41
Multi-Cycle Execution: R-type
• Instruction fetch
– IR <= Memory[PC]; sub $t0,$t1,$t2
– PC <= PC + 4;
• Decode instruction/register read
– A <= Reg[IR[25:21]]; rs
– B <= Reg[IR[20:16]]; rt
– ALUOut <= PC + (sign-extend(IR[15:0])<<2);
• Execution
– ALUOut <= A op B; op = add, sub, and, or,…
• Completion
– Reg[IR[15:11]] <= ALUOut; $t0 <= ALU result
42
Multi-cycle Execution: lw
• Instruction fetch
– IR <= Memory[PC]; lw $t0,-12($t1)
– PC <= PC + 4;
• Instruction Decode/register read
– A <= Reg[IR[25:21]]; rs
– B <= Reg[IR[20:16]];
– ALUOut <= PC + (sign-extend(IR[15:0])<<2);
• Execution
– ALUOut <= A + sign-extend(IR[15:0]); $t1 + -12 (sign extended)
• Memory Access
– MDR <= Memory[ALUOut]; M[$t1 + -12]
• Write-back
– Load: Reg[IR[20:16]] <= MDR; $t0 <= M[$t1 + -12]
43
Multi-cycle Execution: sw
• Instruction fetch
– IR <= Memory[PC]; sw $t0,-12($t1)
– PC <= PC + 4;
• Decode/register read
– A <= Reg[IR[25:21]]; rs
– B <= Reg[IR[20:16]]; rt
– ALUOut <= PC + (sign-extend(IR[15:0])<<2);
• Execution
– ALUOut <= A + sign-extend(IR[15:0]); $t1 + -12 (sign extended)
• Memory Access
– Memory[ALUOut] <= B; M[$t1 + -12] <= $t0
44
Multi-cycle execution: beq
• Instruction fetch
– IR <= Memory[PC]; beq $t0,$t1,label
– PC <= PC + 4;
• Decode/register read
– A <= Reg[IR[25:21]]; rs
– B <= Reg[IR[20:16]]; rt
– ALUOut <= PC + (sign-extend(IR[15:0])<<2);
• Execution
– if (A == B) then PC <= ALUOut;
• if $t0 == $t1 perform branch
45
Multi-cycle execution: j
• Instruction fetch
– IR <= Memory[PC]; j label
– PC <= PC + 4;
• Decode/register read
– A <= Reg[IR[25:21]];
– B <= Reg[IR[20:16]];
– ALUOut <= PC + (sign-extend(IR[15:0])<<2);
• Execution
– PC <= {PC[31:28],IR[25:0],”00”};
46
Example: CPI in a multicycle
CPU
• Assume
– the control design of the previous slides
– An instruction mix of 22% loads, 11% stores, 49% R-type operations, 16%
branches, and 2% jumps
• What is the CPI assuming each step requires 1 clock cycle?
• Solution:
– Number of clock cycles from previous slide for each instruction class:
• loads 5, stores 4, R-type instructions 4, branches 3, jumps 3
– CPI = CPU clock cycles / instruction count
= Σ (instruction count_class i × CPI_class i) / instruction count
= Σ (instruction count_class i / instruction count) × CPI_class i
= 0.22 × 5 + 0.11 × 4 + 0.49 × 4 + 0.16 × 3 + 0.02 × 3
= 4.04
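The weighted sum can be reproduced with a few lines of Python. The instruction mix and cycle counts are taken from this slide; the dictionary keys and the function name are assumptions.

```python
# Fraction of executed instructions in each class (from the slide)
mix = {"load": 0.22, "store": 0.11, "r_type": 0.49, "branch": 0.16, "jump": 0.02}
# Clock cycles needed by each class in the multicycle design (from the slide)
cycles = {"load": 5, "store": 4, "r_type": 4, "branch": 3, "jump": 3}

def average_cpi(mix, cycles):
    """Weighted average of per-class cycle counts."""
    return sum(mix[cls] * cycles[cls] for cls in mix)

cpi = average_cpi(mix, cycles)
print(f"CPI = {cpi:.2f}")
```

Changing the mix shows how sensitive the average is to the share of 5-cycle loads.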
47
Multi-Cycle Control
What we need to cover
• Adding registers after every functional unit
– Need to modify the “instruction execution” slides to reflect this
• Breaking instruction execution down into cycles
– What can be done during the same cycle? What requires a
cycle?
– Need to modify the “instruction execution” slides again
– Timing
• Control signal values
– What they are per cycle, per instruction
– Finite state machine which determines signals based on
instruction type + which cycle it is
• Putting it all together
48
A (Refined) Datapath fig 5.26
49
Datapath w/ Control Signals Fig 5.27
50
Final Version w/ Control Fig 5.28
51
Multicycle Control Step (1):
Fetch
IR = Memory[PC];
PC = PC + 4;
[Figure: datapath with control signal values for the fetch step; signals shown include IRWrite, PCWr*, IorD, RegDst, ALUSrcA, ALUSrcB, Operation, RegWrite]
52
Multicycle Control Step (2):
Instruction Decode & Register Fetch
A = Reg[IR[25-21]]; (A = Reg[rs])
B = Reg[IR[20-16]]; (B = Reg[rt])
ALUOut = PC + (sign-extend(IR[15-0]) << 2);
[Figure: datapath with control signal values for decode and register fetch; signals shown include IRWrite, PCWr*, IorD, RegDst, ALUSrcA, ALUSrcB, Operation, MemRead, MemWrite, MemtoReg, RegWrite, PCSource]
53
Multicycle Control Step (3):
Memory Reference Instructions
ALUOut = A + sign-extend(IR[15-0]);
[Figure: datapath with control signal values for memory-reference execution; signals shown include IRWrite, PCWr*, IorD, RegDst, ALUSrcA, Operation]
54
Multicycle Control Step (3):
ALU Instruction (R-Type)
ALUOut = A op B;
[Figure: datapath with control signal values for R-type execution; signals shown include IRWrite, PCWr*, IorD, RegDst, ALUSrcA, ALUSrcB, Operation, MemRead, MemWrite, MemtoReg, RegWrite, PCSource]
55
Multicycle Control Step (3):
Branch Instructions
if (A == B) PC = ALUOut;
[Figure: datapath with control signal values for branch execution; PCWr* is 1 if Zero = 1; other signals shown include IRWrite, IorD, RegDst, ALUSrcA, ALUSrcB, Operation, MemRead, MemWrite, MemtoReg, RegWrite, PCSource]
56
Multicycle Execution Step (3):
Jump Instruction
PC = PC[31-28] concat (IR[25-0] << 2);
[Figure: datapath with control signal values for jump execution; signals shown include IRWrite, PCWr*, IorD, RegDst, ALUSrcA, Operation]
57
Multicycle Control Step (4):
Memory Access - Read (lw)
MDR = Memory[ALUOut];
[Figure: datapath with control signal values for the lw memory read; signals shown include IRWrite, PCWr*, IorD, RegDst, ALUSrcA, ALUSrcB, Operation, MemRead, MemWrite, MemtoReg, RegWrite, PCSource]
58
Multicycle Execution Steps (4)
Memory Access - Write (sw)
Memory[ALUOut] = B;
[Figure: datapath with control signal values for the sw memory write; signals shown include IRWrite, PCWr*, IorD, RegDst, ALUSrcA, Operation]
59
Multicycle Control Step (4):
ALU Instruction (R-Type)
Reg[IR[15:11]] = ALUOut; (Reg[rd] = ALUOut)
[Figure: datapath with control signal values for R-type completion; signals shown include IRWrite, PCWr*, IorD, RegDst, ALUSrcA, ALUSrcB, Operation, MemRead, MemWrite, MemtoReg, RegWrite, PCSource]
60
Multicycle Execution Steps (5)
Memory Read Completion (lw)
Reg[IR[20-16]] = MDR;
[Figure: datapath with control signal values for lw memory read completion; signals shown include IRWrite, PCWr*, IorD, RegDst, ALUSrcA, ALUSrcB, Operation, MemRead, MemWrite, MemtoReg, RegWrite, PCSource]
61
Multi-Cycle Control
What we need to cover
• Adding registers after every functional unit
– Need to modify the “instruction execution” slides to reflect this
• Breaking instruction execution down into cycles
– What can be done during the same cycle? What requires a
cycle?
– Need to modify the “instruction execution” slides again
– Timing: Registers/memory updated at the beginning of the next
clock cycle
• Control signal values
– What they are per cycle, per instruction
– Finite state machine which determines signals based on
instruction type + which cycle it is
• Putting it all together
62
Fig 5.28 For reference
• Note: In the previous diagrams, the values for the MemtoReg MUX are
backward. The values shown in those slides match the pictures.
63
An FSM State Diagram
65
Handling Memory Instructions
66
R-type Instruction
67
Branch and Jump
68
FSM Implementation
69
Example: Load (1)
70
Example: Load (2)
71
Example: Load (3)
72
Example: Load (4)
73
Example: Load (5)
74
Example: Jump (1)
75
Example: Jump (2)
76
Example: Jump (3)
77
To Summarize…
• From several building blocks, we constructed
a datapath for a subset of the MIPS
instruction set
• First, we analyzed instructions for functional
requirements
• Second, we connected building blocks in a
way to accommodate instructions
• Third, we refined the datapath and added
controls
78
To Summarize…
• We looked at how an instruction is executed
on the datapath in a pictorial way
• We looked at control signals connected to
functional blocks in our datapath
• We analyzed how execution steps of an
instruction change the control signals
79
To Summarize…
• We compared a single-cycle implementation
and a multi-cycle implementation of our
datapath
• We analyzed multi-cycle execution of
instructions
• We refined multi-cycle datapath
• We designed multi-cycle control
80
To Summarize…
• We looked at the multi-cycle control
scheme in detail
• Multi-cycle control can be implemented
using an FSM
• An FSM is composed of some combinational
logic and memory elements
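The split between combinational logic and a memory element can be sketched in Python: a pure next-state function plays the role of the combinational logic, and a single state variable plays the role of the state register. The state names below are assumptions made for this sketch and are not taken from the slides' state diagram.

```python
def next_state(state, cls):
    """Combinational part: next state from current state and instruction class."""
    if state == "FETCH":
        return "DECODE"
    if state == "DECODE":
        return {"load": "MEM_ADDR", "store": "MEM_ADDR", "r_type": "EXECUTE",
                "branch": "BRANCH", "jump": "JUMP"}[cls]
    if state == "MEM_ADDR":
        return "MEM_READ" if cls == "load" else "MEM_WRITE"
    if state == "MEM_READ":
        return "WRITE_BACK"
    if state == "EXECUTE":
        return "R_COMPLETE"
    return "FETCH"   # final state of every instruction returns to fetch

def cycles_for(cls):
    """Clock the state register until the FSM returns to FETCH; count the cycles."""
    state, count = "FETCH", 0      # 'state' is the memory element
    while True:
        count += 1
        state = next_state(state, cls)
        if state == "FETCH":
            return count

counts = {cls: cycles_for(cls) for cls in ("load", "store", "r_type", "branch", "jump")}
```

The cycle counts produced by walking the FSM agree with the per-class counts used in the CPI example: 5 for loads, 4 for stores and R-type, 3 for branches and jumps.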
81
Summary
• Techniques described in this chapter to design datapaths and
control are at the core of all modern computer architecture
• Multicycle datapaths offer two great advantages over single-cycle
– functional units can be reused within a single instruction if they are
accessed in different cycles – reducing the need to replicate expensive
logic
– instructions with shorter execution paths can complete more quickly by
consuming fewer cycles
• Modern computers, in fact, take the multicycle paradigm to a higher
level to achieve greater instruction throughput:
– pipelining (later class) where multiple instructions execute
simultaneously by having cycles of different instructions overlap in the
datapath
– the MIPS architecture was designed to be pipelined
82