05SingleCycleCPU_1410693631
05SingleCycleCPU_1410693631
05SingleCycleCPU_1410693631
Single-Cycle Microarchitecture
Jaewoong Sim
Electrical and Computer Engineering
Seoul National University
• CPU performance factors
• Instruction Count
} Impacted by ISA and compiler
• CPI (cycles per instruction) and cycle time
} Impacted by CPU hardware (microarchitecture)
Reading
• P&H Ch 4.1 – 4.4
2
• Formats: Four core instruction formats + two small variations
• Simple Decoding
• 4 bytes per instruction, regardless of format
• rs1, rs2, rd in the same position across different formats
3
• Combinational Circuits
• The outputs depend only on the combination of the current values
• Memoryless
• Also called combinatorial circuits
• Sequential Circuits
• The outputs depend on the input sequence
(i.e., both current and previous values of the inputs)
• Thus, a sequential circuit has memory
5
• Now that we have an ISA, which defines
• Architectural States + Instructions + Others...
6
• What does processing an instruction mean?
Process an instruction
8
I AS’ AS O
Next-State Logic Programmer
(Combinational Logic) Visible State
(PC, Registers,
MEM, ...)
Instruction
address
PC
Instruction Add Sum
Instruction
memory
10
**Based on figures from [P&H CO&D, COPYRIGHT 2004 Elsevier. ALL RIGHTS RESERVED.]
“Magic” Memory and Register File
• Combinational Read
• The output of the read data port is a combinational function of the register file
contents and the corresponding read select port
• Synchronous write
• The selected register is updated on the positive-edge clock transition when
write enable is asserted
• i.e., cannot affect read output in between clock edge
11
• 5 Generic Steps
• IF: Instruction fetch
• ID: Instruction decode and register operand fetch
• EX: ALU (execute operation or calculate memory address)
• MEM: Memory operand fetch (only for load & store instructions)
• WB: Write result back to register
WB
IF Data
Register #
PC Address Instruction Registers ALU Address
Register #
Instruction
memory Data
Register # memory
EX
ID Data
MEM 12
**Based on figures from [P&H CO&D, COPYRIGHT 2004 Elsevier. ALL RIGHTS RESERVED.]
We will build a RISC-V datapath incrementally 13
Single-Cycle Datapath for
ALU Instructions
R-Type (Register-Register Instructions)
I-Type (Register-Immediate Instructions)
14
Increment by
4 for next
instruction
15
• Assembly (e.g., register-register signed addition)
ADD rd, rs1, rs2
• Machine Encoding
• Semantics
if MEM[PC] == ADD rd rs1 rs2
GPR[rd] ¬ GPR[rs1] + GPR[rs2]
PC ¬ PC + 4
• Exceptions: none (ignore carry and overflow)
• Variations
• Arithmetic: {ADD, SUB}
• Logical: {AND, OR, XOR}
• Shift: {Left, Right-Logical, Right-Arithmetic}
• Compare: {signed, unsigned} x {Set if Less Than}
16
• Two more elements needed to implement R-format ALU operations
• Register File (RF)
• Arithmetic and Logic Unit (ALU)
IF ID EX MEM WB
if MEM[PC] == ADD rd rs1 rs2
GPR[rd] ¬ GPR[rs1] + GPR[rs2] Combinational
state update logic
PC ¬ PC + 4 17
4 ALU
ALUoperation
operation
[19:15] Read 3
register 1 Read
Read data 1
[24:20] Zero
Instruction register 2
Registers ALU ALU
[11:7] Write result
register
Read
Write data 2
data
1 RegWrite
• Semantics
if MEM[PC] == ADDI rd rs1 immediate
GPR[rd] ¬ GPR[rs1] + sign-extend (immediate)
PC ¬ PC + 4
• Exceptions: none (ignore carry and overflow)
• Variations
• Arithmetic: {ADDI, SUBI}
• Logical: {ANDI, ORI, XORI}
• **Shifts by unsigned imm[4:0]: {SLLI, SRLI, SRAI}
• Compare: {signed, unsigned} x {Set if Less Than Imm}
19
• Input: 32-bit instruction
20
IF ID EX MEM WB
if MEM[PC] == ADDI rd rs1 immediate Combinational
GPR[rd] ¬ GPR[rs1] + sign-extend(immediate) state update logic
PC ¬ PC + 4 21
Add
Read [19:15]
PC address
[24:20]
Instruction
[11:7]
Instruction
memory
Q: How do we efficiently build a datapath for both R-type and I-type ALU?
22
Add
Read
PC address
Instruction
Instruction
memory
ALUSrc
0: R-Type
1: I-Type
23
Single-Cycle Datapath for
Data Movement Instructions
I-Type (Load Instructions)
S-Type (Store Instructions)
24
• Assembly (e.g., load 4-byte word)
LW rd, offset12 (base)
• Machine Encoding
immediate rs1 funct3 rd opcode I-type
12 bits 5 bits 3 bits 5 bits 7 bits
• Semantics
if MEM[PC]==LW rd offset12(base)
EA32 = sign-extend(offset12) + GPR[base]
GPR[rd] ¬ MEM32[ translate(EA) ]
PC ¬ PC + 4
• Exceptions: none for now
• Variations: {LW, LH, LHU, LB, LBU}
• e.g., LB: GPR[rd] ç sign-extend(MEM8[EA])
LBU: GPR[rd] ç zero-extend(MEM8[EA]) EA: Effective Address
25
0
ALUSrc
Add 0: R-Type
1: I-Type
Read
PC address
1
Instruction
Instruction
memory 1
• Semantics
if MEM[PC]==SW rs2 offset12(base)
EA32 = sign-extend(offset12) + GPR[base]
MEM32[ translate(EA) ] ¬ GPR[rs2]
PC ¬ PC + 4
• Exceptions: none for now
• Variations: {SW, SH, SB}
• e.g., SB: MEM8[EA] ç (GPR[rs2])[7:0]
27
• e.g., Load Datapath
• RegWrite: 1, ALUSrc: 1, MemRead: 1, MemWrite: 0
• e.g., Store Datapath
• RegWrite: 0, ALUSrc: 1, MemRead: 0, MemWrite: 1
extend
LoadExtend 28
{W, H, HU, B, BU}
• e.g., Load Datapath
• RegWrite: 1, ALUSrc: 1, MemRead: 1, MemWrite: 0, MemtoReg: 1
extend
29
Single-Cycle Datapath for
Control Flow Instructions
SB-Type (Conditional Branch)
UJ-Type (Unconditional Branch)
30
• Assembly (e.g., branch if equal)
BEQ rs1, rs2, immediate13 (note: implicit imm[0]=0)
• Machine Encoding
imm imm
[10:5] rs2 rs1 funct3 [4:1] opcode SB-type
imm[12] imm[11]
• Semantics
if MEM[PC]==BEQ rs1 rs2 immediate13
target = PC + sign-extend(immediate13)
if GPR[rs1] == GPR[rs2] then PC ç target
else PC ç PC + 4
How far can you jump?
• Exceptions: misaligned target (4-byte) if taken
• Variations: BEQ, BNE, BLT{U}, BGE{U}
31
• Read register operands (rs1, rs2)
32
Just
re-routes
wires
Sign-bit wire
replicated
33
• Assembly
JAL rd, immediate21 (note: implicit imm[0]=0)
• Machine encoding
imm[10:1] imm[19:12] rd opcode UJ-type
5 bits 7 bits
imm[20] imm[11]
• Semantics
if MEM[PC]==JAL rd, immediate21
target = PC + sign-extend(immediate21)
GPR[rd] = PC+4
PC ¬ target How far can you jump?
• Exceptions: misaligned target (4-byte)
34
35
Control
36
• We need to add a control function to make the datapath work for a variety
of instructions!
37
• Control signals
• A combinational function of Inst=MEM[PC]
ALUOp
RegWrite
Instruction ALUSrc
MEM[PC] Decode PCSrc
Logic MemRead
MemWrite
MemtoReg
• Consider
• All R-type and I-type ALU instructions
• LW and SW
• BEQ, BNE, BLE, BGT
• JAL, JALR 38
• A combinational unit to control the ALU
• Input: 32-bit inst. è funct7, funct3, ALUOp (2-bit)
• Output: 4-bit ALU control signals (ALU operation)
39
• Truth table to derive the “ALU control” combinational function
• Q. Do we need all 10 bits from funct7 and funct3 for the example?
40
41
42
1
0
43
1
0
44
PCSrc
0
0
45
Now you know how to design
a single-cycle RISC-V CPU!!
Jaewoong Sim
Electrical and Computer Engineering
Seoul National University
• Lab1 will be released on Apr.04 (Tue)
• Implementing a single-cycle RISC-V CPU using Verilog
• Due: Two weeks from Tue!
48