How Computers Work Assembly Tools Calling Conventions: Processor State Instruction Memory Fetch/Execute Loop

Download as pdf or txt
Download as pdf or txt
You are on page 1of 16

How Computers Work

Lecture 2

Assembly Tools
Calling Conventions

How Computers Work Lecture 2 Page 1

Review: Model of
Computation Fetch/Execute
Processor State Instruction Memory Loop:
PC Fetch <PC>
PC <pc> + 1
r0 Execute fetched
instruction
r1 32 bits
r2 (4 bytes) Repeat!

32 bits
next instr

r31 always 0

How Computers Work Lecture 2 Page 2

1
Review: BETA Instructions
Two 32-bit Instruction Formats:

OPCODE Ra Unused Rb Rc

OPCODE Ra 16 bit Constant Rc

How Computers Work Lecture 2 Page 3

Review: ALU Operations SIMILARLY FOR:


What the machine sees (32-bit instruction word):
SUB, SUBC
(optional)
OPCODE Ra Unused Rb Rc MUL, MULC
DIV, DIVC
What we prefer to see: symbolic ASSEMBLY LANGUAGE
BITWISE LOGIC:
ADD(ra, rb, rc) rc <ra> + <rb> AND, ANDC
Add the contents of ra to the contents of OR, ORC
rb; store the result in rc
XOR, XORC

Alternative instruction format: SHIFTS:


SHL, SHR, SAR
(shift left, right;
OPCODE Ra 16 bit Constant shift arith right)
Rc
COMPARES
CMPEQ, CMPLT,
ADDC(ra, const, rc) rc <ra> + sext(const) CMPLE

Add the contents of ra to const; store the result in rc


How Computers Work Lecture 2 Page 4

2
Review: Loads & Stores
LD(ra, C, rc) rc < Mem[<ra> + sext(C)] >

Fetch into rc the contents of the


data memory location whose address is
the contents of ra plus C

ST(rc, C, ra) Mem[<ra> + sext(C)] <rc>

Store the contents of rc into the


data memory location whose address is
the contents of ra plus C

NO BYTE ADDRESSES: only 32-bit word accesses are supported.


This is similar to how Digital Signal Processors work
It is somewhat unusual for general purpose processors,
which are usual byte (8 bit) addressed
How Computers Work Lecture 2 Page 5

Review: Branches
Conditional: rc = <PC>+1; then
BRNZ(ra, label, rc) if <ra> nonzero then
PC <- <PC> + displacement
BRZ(ra, label, rc) if <ra> zero then
PC <- <PC> + displacement

Unconditional: rc = <PC>+1; then


BRZ(r31, label, rc) PC <- <PC> + displacement
Note:
Indirect: rc = <PC>+1; then displacement
is coded as a
JMP(ra, rc) PC <- <ra> CONSTANT in
a field of the
instruction!

How Computers Work Lecture 2 Page 6

3
Review: Iterative Optimized Factorial
;assume n = 20, val = 1 n: 20
val: 1

(define (fact-iter n val) LD(n, r1) ; n in r1


(if (= n 0) LD(val, r3) ; val in r3
val BRZ(r1, done)
loop:
(fact-iter (- n 1) (* n val)) MUL(r1, r3, r3)
) ) SUBC(r1, 1, r1)
BRNZ(r1, loop)
done:
ST(r1, n) ; new n
ST(r3, val) ; new val

How Computers Work Lecture 2 Page 7

Language Tools
The Assembler

01101101
11000110 STREAM of Words
Assembler 00101111
to be loaded
10110001
.....
into memory

Symbolic Translator Binary


SOURCE program Machine
text file Language

Symbol
Textual Memory +
Macro Expression
Pre-Processor Evaluator

How Computers Work Lecture 2 Page 8

4
Macros
Macros are parameterized abbreviations that when invoked cause
TEXTUAL SUBSTITION

| Macro to generate 4 consecutive numbers:


.macro consec4(n) n n+1 n+2 n+3

| Invocation of above macro:


consec4(37)

Is translated into:

37 37+1 37+2 37+3

How Computers Work Lecture 2 Page 9

Some Handy Macros


| BETA Instructions:
ADD(ra, rb, rc) | rc <ra> + <rb>
ADDC(ra, const, rc) | rc <ra> + const
LD(ra, C, rc) | rc <C + <ra>>
ST(rc, C, ra) | C + <ra> <rc>
LD(C, rc) | rc <C>
ST(rc, C) | C <ra>

How Computers Work Lecture 2 Page 10

5
Constant Expression Evaluation
37 -3 255 decimal (default);
0b100101
binary (0b prefix);
0x25
hexadecimal (0x prefix);

Values can also be expressions; eg:

37+0b10-0x10 24-0x1 4*0b110-1 0xF7&0x1F

generates 4 words of binary output, each with the value 23

How Computers Work Lecture 2 Page 11

Symbolic Memory
We can define SYMBOLS:

x = 1 | 1
y = x + 1 | 2

Which get remembered by the assembler. We can


later use them instead of their values:

ADDC(x, 37, y) | R2 <R1> + 37

How Computers Work Lecture 2 Page 12

6
How Are Symbols Different
Than Macros?
Answer:
A macros value at any point in a file is the last
previous value it was assigned.
Macro evaluation is purely textual substitution.
A symbols value throughout a file is the very last
value it is assigned in the file.
Repercussion: we can make forward references to symbols
not yet defined.
Implementation: the assembler must first look at the entire
input file to define all symbols, then make another pass
substituting in the symbol values into expressions.

How Computers Work Lecture 2 Page 13

Dot, Addresses, and Branches


Special symbol . (period) changes to indicate the address of the next output byte.

We can use . to define branches to compute RELATIVE address field:


.macro BRNZ(ra,loc) betaopc(0x1E,ra,(loc-.)-1,r31)

loop = . | loop is here...


ADDC(r0, 1, r0)
...
BRNZ(r3, loop) | Back to addc instr.

How Computers Work Lecture 2 Page 14

7
Address Tags
x: is an abbreviation for x =. -- leading to programs like

x: 0

buzz: LD(x, r0) do { x = x-1; }


ADDC(r0, -1, r0)
ST(r0, x)
BRNZ(r0, buzz) while (x > 0);
...

How Computers Work Lecture 2 Page 15

Macros Are Also Distinguished by Their


Number of Arguments
We can extend our assembly language with new macros. For example,
we can define an UNCONDITIONAL BRANCH:

BR(label, rc) rc <PC>+4; then


PC <PC> + displacement
___________________________________________
BR(label) PC <PC> + displacement

by the definitions
.macro BR(lab, rc) BRZ (r31,lab, rc)

.macro BR(lab) BR(lab,r31)


How Computers Work Lecture 2 Page 16

8
OK, How about recursive fact?
(define (fact n)
(if (= n 0)
Suppose caller:
1
1. Stores nin agreed-on
(* n (fact (- n 1))) location (say, r1)
) )
2. Calls fact using, say,
int fact(int n) BR(fact, r28)
{ Then fact:
if (n == 0)
1. Computes value, leaves
return (1); (say) in r0.
else
2. Returns via
return (n * fact(n-1));
} JMP(r28, r31)

R28 Convention: We call it the Linkage Pointer


(just like scheme RMLs continue register)
How Computers Work Lecture 2 Page 17

Does this really work?


fact:
int fact(int n)
...
{
if (n == 0)
return (1);
BR(fact, LP)
else
...
return (n * fact(n-1)); (put result in r0)
}

OOPS!

We need a STACK ! ! !

How Computers Work Lecture 2 Page 18

9
fact(3) ...
Recursion...
Caller
n=3
fact(3)

n=2
fact(2)

n=1
fact(1)

n=0
fact(0)

How Computers Work Lecture 2 Page 19

Stack Implementation
one of several possible implementations
Lower addresses
Old Stuff
Conventions:
Builds UP on push (stacked data)
Memory: (stacked data)
SP points to first (stacked data)
UNUSED location. (stacked data)
<sp>

To push <x>: unused


sp <sp> + 1; space
Mem[<sp> - 1] <x>
PUSH
To pop() a value into x:
Higher addresses
x <Mem[<sp> - 1]>
New Stuff
sp <sp> - 1;
How Computers Work Lecture 2 Page 20

10
Support Macros for Stacks
sp = r29
push(rx) - pushes 32-bit value onto stack.
ADDC(sp, 1, sp)
ST(rx, -1, sp)

pop(rx) - pops 32-bit value into rx.


LD(rx, -1, sp)
ADDC(SP, -1, sp)

allocate(k) - reserve k WORDS of stack


ADDC(SP, k, sp)

deallocate(k) - give back k WORDS


SUBC(SP, k, sp)
How Computers Work Lecture 2 Page 21

The Stack
STACK as central storage-management
mechanism...
argi (arg)
SIMPLE, EFFICIENT to implement using rtn PC
rtn
contiguous memory and a "stack pointer" locj (Local)
ORIGINAL USE: subroutine return points - sp
push/pop STACK DISCIPLINE follows
natural order of call/return nesting.
EXPANDED USE: "automatic" or STACK DISCIPLINE:
"dynamic" allocation of local variables. most recently
REVOLUTIONARY IMPACT: allocated location
is next location to
ALL modern machines, to varying extents;
be deallocated.
ALL modern languages
IMPACT:
BLOCK
STRUCTURE.
How Computers Work Lecture 2 Page 22

11
Call / Return Linkage
lp = r28
sp = r29
Using these macros and r28 as a linkage pointer, we can call f by:
BR(f, lp)

And code procedure f like:


f: PUSH(lp) | SAVE lp
<perform computation> | (may trash lp)
POP(lp) | RESTORE lp
JMP(lp) | Return to caller

How Computers Work Lecture 2 Page 23

Recursion with Register-Passed Arguments


| Compute Fact(n)
| n passed in r1, result returned in r0

fact: PUSH(lp) | Save linkage pointer

BRZ(r1,fact1) | terminal case?


PUSH(r1) | Save n,
ADDC(r1,-1,r1) | compute fact(n-1).
BR(fact, lp) | recursive call to fact.
POP(r1) | restore arg,
MUL(r1,r0,r0) | return n*fact(n-1)

factx: POP(lp) | restore linkage pointer


JMP(lp) | and return to caller.

fact1: MOVC(1,r0) | fact(0) => 1


BR(factx)

.macro MOV(rsrc,rdest) ADD (rsrc, r31, rdest)


.macro MOVC(csrc,rdest) ADDC (r31, csrc, rdest)
How Computers Work Lecture 2 Page 24

12
A Generic Stack Frame Structure:
The 6.004 Stack Discipline
RESERVED REGISTERS:
bp = r27. Base ptr, points to 1st local.
lp = r28. Linkage pointer, saved <PC>.
sp = r29. Stack ptr, points to 1st unused word.
xp = r30. Exception pointer, saved <PC>

STACK FRAME
(a.k.a. Function Activation Record):
arguments P
U
old <lp>
old <bp>
S
bp: H
Local variables
...
Temporaries
sp:
(as yet unused)

How Computers Work Lecture 2 Page 25

6.004 Stack Discipline Procedure Linkage


Calling Sequence:
PUSH(argn) | push args, in
... | RIGHT-TO-LEFT
PUSH(arg1) | order!
BR(f, lp) | Call f.
DEALLOCATE(n) | Clean up!
... | (returned value now in r0)

Entry Sequence:
f: PUSH(lp) | Save <LP>, <BP>
PUSH(bp) | for new calls.
MOV(sp,bp) | set bp=frame base
ALLOCATE(locals) | allocate locals
(push other regs) | preserve regs used
...

Return Sequence:
(pop other regs) | restore regs
MOV(val, r0) | set return value
MOV(bp,sp) | strip locals, etc
POP(bp) | restore linkage
POP(lp) | (the return <PC>)
JMP(lp) | return.
How Computers Work Lecture 2 Page 26

13
Stack Frame Detail

old old <lp>


old old <bp> CALLERS
callers local 0 FRAME

callers local n-1
arg n-1
(callers
return
PC) arg 0
old <lp> CALLEES
FRAME
old <bp>
bp: local 0

local n-1
sp: free
space
How Computers Work Lecture 2 Page 27

Access to Arguments & Local Variables


To access jth local variable (j >= 0)
LD(BP, (j, rx)
or
ST(rx, j, bp) arg n-1

arg 0
old <lp>
old <bp>
To access jth argument (j >= 0): bp: local 0

LD(BP, 3-j, rx) local n-1
sp: free
or space
ST(rx, 3-j, bp)

QUESTION: Why push args in REVERSE order???

How Computers Work Lecture 2 Page 28

14
Procedure Linkage: The Contract

The caller will:


Push args onto stack, in reverse order.
Branch to callee, putting return address into lp.
Remove args from stack on return.

The callee will:


Perform promised computation, leaving result
in r0.
Branch to return address.
Leave all regs (except lp, r0) unchanged.

How Computers Work Lecture 2 Page 29

Recursive factorial
with stack-passed arguments
| (define (fact n)
| (if (= n 0) 1 (* n (fact (- n 1))))
| )

fact: PUSH(lp) | save linkages


PUSH(bp)
MOV(sp,bp) | new frame base
PUSH(r1) | preserve regs

LD(bp,-3,r1) | r0 n
BRNZ(r1, big) | if n>0
MOVC(1,r0) | else return 1;
BR(rtn)

big: SUBC(r1,1,r1) | r1 (n-1)


PUSH(r1) | arg1
BR(fact,lp) | fact(n-1)
DEALLOCATE(1) | (pop arg)
LD(bp, -3, r1) | r0 n
MUL(r1,r0,r0) | r0 n*fact(n-1)

rtn: POP(r1) | restore regs


MOV(bp,sp) | Why?
POP(bp) | restore links
POP(lp)
JMP(lp) | return.

How Computers Work Lecture 2 Page 30

15
What did we Learn Today?
How to call functions
How to do recursive factorial
The 6.004 Stack Discipline
How to retrieve arguments and local variables

Next In Section
Practice with Stack Discipline

C Tutorial
http://www.csc.lsu.edu/tutorial/ten-commandments/bwk-tutor.html

How Computers Work Lecture 2 Page 31

16

You might also like