Intel I


Computer Architectures

Intel Assembly - I
Prof. Kasım Sinan Yıldırım, University of Trento
The Intel x86 Evolution
● An architecture evolved over almost 40 years:
○ E.g., several new features added to the original instruction set

● x86 milestones (evolution with backward compatibility)


○ 8080 (1974): 8-bit microprocessor
○ 8086 (1978): 16-bit extension to 8080
○ 8087 (1980): floating-point coprocessor
○ 80286 (1982): 24-bit addresses, MMU
○ 80386 (1985): 32-bit extension (now IA-32)

The Intel x86 Evolution
● Further evolution…
○ i486 (1989): pipelined, on-chip caches and FPU
○ Pentium (1993): superscalar, 64-bit datapath
○ Pentium Pro (1995), Pentium II (1997)
○ Pentium III (1999): Added SSE (Streaming SIMD Extensions)
○ Pentium 4 (2001): New microarchitecture, Added SSE2 instructions
○ AMD64 (2003): extended architecture to 64 bits
○ Intel Core (2006)
● The x86 instruction set largely drove the PC generation of computers
○ It still dominates the Cloud portion of the post-PC era.
○ About 350M x86 chips are manufactured per year

The Intel x86 ISA
● CISC (Complex Instruction Set Computer) architecture
● Provides more powerful instructions than those in RISC-V:
○ Reduces the number of instructions executed by a program.

● CISC processors:
○ Are larger, as they contain more transistors
○ May take multiple cycles per instruction, decreasing efficiency
○ Tend to run at a lower clock speed
○ Make pipelining more complex
○ Are more complex than RISC processors, which makes them more expensive

Intel Assembly
● CISC: different instructions and addressing modes
○ we will not see everything
○ we will mainly focus on differences from RISC-V

● Compatibility:
○ various modes of operation
■ Modern 64-bit CPUs are still capable of running old 16-bit code
○ we will consider only the 64-bit mode

● There are several Assemblers, each with different syntax


○ we will consider GNU Assembler (GAS)

General Purpose Registers
● They are all preceded by the prefix % (percent)
● 16 64-bit general purpose registers
● They have names that reflect backward compatibility:
○ %rax, %rbx, %rcx, %rdx, %rsi, %rdi, %rbp, %rsp
○ %r8, ..., %r15
○ %rsp: stack pointer
○ %rbp: base pointer
■ pointer to the stack frame
○ %rsi and %rdi: used for copying arrays
■ source index and destination index

General Purpose Registers
● %rax-%rbp
○ extend the 32-bit registers %eax-%ebp, which in turn
■ extend the 16-bit registers %ax-%bp
● %ax, %bx, %cx, and %dx are each made of two 8-bit registers, named by replacing the x with h (high byte) or l (low byte):
○ %ax is the concatenation of %ah (bits 15-8) and %al (bits 7-0)
● %r8-%r15
○ extend the 32-bit registers %r8d-%r15d, which in turn
■ extend the 16-bit registers %r8w-%r15w, which in turn
● include the 8-bit registers %r8b-%r15b (least significant byte)
Special Registers
● Instruction Pointer: %rip
● Flags register: %rflags
○ extends %eflags, which extends %flags
○ List of bits (flags) set by logic/arithmetic instructions
■ CF : Carry Flag → Set to 1 if the operation produces a carry-out (unsigned overflow)
■ ZF : Zero Flag → Set to 1 if the result is zero
■ SF : Sign Flag → Set to 1 if the result is negative
■ OF : Overflow Flag → Set to 1 if the result has overflowed as a signed value
○ These flags are used by conditional jump instructions
○ Other flags control CPU operation (e.g. IF) or contain other information about the CPU
● There are other special registers as well (check other documents)
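
As a rough illustration of the four status flags, the following C sketch computes them for a 32-bit addition (what addl would produce). The Flags struct and the function name flags_after_add32 are hypothetical names chosen for this example, not part of any library.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the four status flags discussed above. */
typedef struct {
    bool cf, zf, sf, of;
} Flags;

/* Flags produced by a 32-bit addition a + b (as addl would set them). */
Flags flags_after_add32(uint32_t a, uint32_t b)
{
    uint32_t r = a + b;                 /* wraps modulo 2^32 */
    Flags f;
    f.cf = r < a;                       /* carry-out: unsigned overflow */
    f.zf = (r == 0);                    /* result is zero */
    f.sf = (r >> 31) != 0;              /* sign bit of the result */
    /* signed overflow: both operands have the same sign, the result differs */
    f.of = ((~(a ^ b) & (a ^ r)) >> 31) != 0;
    return f;
}
```

For instance, adding 1 to 0xffffffff sets CF and ZF but not OF, while adding 1 to 0x7fffffff sets OF and SF but not CF.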

Intel instructions
● Too many (several online tutorials)
● For a more or less detailed list of instructions:
http://en.wikipedia.org/wiki/X86_instruction_listings
○ Many instructions have been kept for compatibility!

Intel instructions
● Typical syntax: <opcode> <source>, <destination>
○ Final character of opcode to indicate "width" (in bits) of the operands

b: 8 bits, w: 16 bits, l: 32 bits, q: 64 bits

○ Register names begin with "%"


○ Immediate (constant) values start with "$" (e.g. $231)
○ Direct addresses (constants) are simply numbers (e.g. "439")

Addressing Modes
● Mainly 2-operand instructions: second operand is the destination!
○ Limitation compared to RISC-V:
■ Impossible to specify two source operands and a different destination
● Source (first operand)
○ Immediate Operand (a constant with $ as prefix, e.g. $20)
○ Operand in Register (value of a register, e.g. %rax)
○ Operand in Memory (value in a memory location)
■ e.g., value at the address 0x0100A8
● Destination (second operand)
○ Operand in Register (one register as destination, e.g. %rdx)
○ Operand in Memory (a memory location specified by its address)
■ e.g., location at the address 0x0AA0E2

Addressing Modes
● Access to memory location: <displ>(<base reg>, <index reg>, <scale>)
■ <displacement>: constant (immediate value) at 8, 16 or 32 bits
● similar to RISC-V - but RISC-V only has 12-bit displacement/offset
■ <base>: value in register (as for RISC-V)
■ <index>: value in register (simplifies iteration on arrays)
■ <scale>: constant value: 1, 2, 4, or 8 (simplifies access to arrays with elements of size
1, 2, 4 or 8 bytes)
○ Address is formed as: displacement + base + (index*scale)
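
The address formula above can be sketched in C as follows. This is only an illustration: the function name effective_address and its parameter types are assumptions made for this example.

```c
#include <stdint.h>

/* Address formed by <displ>(<base>, <index>, <scale>).
 * scale is expected to be 1, 2, 4 or 8; the displacement is a signed constant. */
uint64_t effective_address(int32_t displ, uint64_t base, uint64_t index, unsigned scale)
{
    /* displacement + base + (index * scale), with the displacement sign-extended */
    return base + index * scale + (uint64_t)(int64_t)displ;
}
```

With base = 0x1000, index = 3, scale = 4 and displacement 35, the address is 0x1000 + 12 + 35 = 0x102f, i.e. element 3 of an array of 4-byte items starting 35 bytes after the base.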

Addressing - examples
Name                          Format           Example                         Description
Immediate                     $Num             movq $-500, %rax                rax = Num
Direct access                 Num              movq 500, %rax                  rax = Mem[Num]
Register                      ri               movq %rdx, %rax                 rax = rdx
Indirect access               (ri)             movq (%rdx), %rax               rax = Mem[rdx]
Base and displacement         Num(ri)          movq 31(%rdx), %rax             rax = Mem[rdx+31]
Scaled index                  (rb, ri, s)      movq (%rdx, %rcx, 4), %rax      rax = Mem[rdx+rcx*4]
Scaled index + displacement   Num(rb, ri, s)   movq 35(%rdx, %rcx, 4), %rax    rax = Mem[rdx+rcx*4+35]

● s is the scale factor and can take the values: 1, 2, 4, or 8
Addressing - Special Cases
● Scale = 1 (no scale): <displ> (<base reg>, <index reg>)
● No scale and index: <displ> (<base reg>)
○ Same as memory access in RISC-V; the only difference is the size in bits of the offset/displacement

● No scale, index and displacement (displacement = 0):


○ (<base reg>)

● No displacement (displacement = 0):


○ (<base reg>, <index reg>, <scale>)

Addressing Modes
● Instructions allow you to write to both registers and memory!
○ RISC-V: memory access only for load and store
● Constraint: the two operands cannot both be in memory!
○ Valid combinations are:
■ Imm → Reg
■ Imm → Mem
■ Reg → Reg
■ Mem → Reg
■ Reg → Mem

Addressing Modes
● Both operands cannot be in memory
● E.g., movl 345, (%eax) is not allowed!
○ Because both operands are in memory: Mem[eax] = Mem[345]

● Writing from memory to memory requires two instructions:

movl 345, %eax

movl %eax, (%ebx)

Most Common Instructions
● mov: copy data from source to destination
○ movsx: copy data with sign extension
○ movzx: copy data with zero extension
● add/adc: add and add with carry
● inc/dec: add/subtract 1
● sub/sbb: subtract and subtract with borrow
● and/or/xor/not: bitwise boolean operations
○ Arithmetic/logical instructions modify flags (carry, zero, sign, ...)
○ Other instructions for flags modification: clc/stc, cld, cmc …
● neg: two’s complement (negation)

Most Common Instructions
● mul/imul unsigned/signed multiplication
● div/idiv unsigned/signed division
● rcl, rcr, rol, ror various forms of "rotate"
● sal, sar, shl, shr shift (arithmetic and logical)
● nop

Most Common Instructions
● push push data onto the stack: pushq %reg
○ Decreases %rsp by 8 and writes %reg to the memory specified in %rsp
● Example: pushq %rax
○ Equivalent to:
subq $8, %rsp
movq %rax, (%rsp)
● pop remove data from the stack: popq %reg
○ Stores the value at the address indicated by %rsp in %reg and increases %rsp by 8
● Example: popq %rdx
○ Equivalent to:
movq (%rsp), %rdx
addq $8, %rsp
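
The two equivalences above can be mimicked with a small C model of the stack. This is a sketch under stated assumptions: the Machine struct (a tiny byte-addressed memory plus %rsp stored as an offset into it) and the function names pushq and popq are hypothetical and exist only for this illustration.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical machine state: a small memory and %rsp as an offset into it. */
typedef struct {
    uint8_t mem[256];
    uint64_t rsp;
} Machine;

/* pushq: decrease %rsp by 8, then store the value at the new top of the stack. */
void pushq(Machine *m, uint64_t value)
{
    m->rsp -= 8;                            /* subq $8, %rsp      */
    memcpy(&m->mem[m->rsp], &value, 8);     /* movq value, (%rsp) */
}

/* popq: load the value at the top of the stack, then increase %rsp by 8. */
uint64_t popq(Machine *m)
{
    uint64_t value;
    memcpy(&value, &m->mem[m->rsp], 8);     /* movq (%rsp), dest  */
    m->rsp += 8;                            /* addq $8, %rsp      */
    return value;
}
```

Starting with rsp at the end of the memory, pushing 1 and then 2 lowers rsp by 16; the two following pops return 2 and then 1 and bring rsp back to its starting value.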

Most Common Instructions
● cmp and test
○ cmp: cmp arg1, arg2
■ Compare arg2 with arg1 (e.g. arg2 < arg1, arg2 = arg1, ...)
■ Do arg2 - arg1 and set flags of the flag register
■ arg1 and arg2 are not modified (subtraction result is not stored)
○ test: test arg1, arg2
■ Checks the bits of arg2 against arg1 (e.g. test %rax, %rax checks whether %rax is zero or negative)
■ Do arg2 & arg1 and set flags of the flag register
■ arg1 and arg2 are not modified (result of bitwise and is not stored)

● Used before conditional jumps

Most Common Instructions
● jmp: unconditional jump
● je/jnz/jc/jnc: conditional jumps, of the form j<condition>
● The jump condition is established based on the values in the flag register
○ jump if equal, jump if not zero, jump if carry, ...

Instruction   Synonym      Cond. flags         Description
je label      jz label     ZF                  Equal or zero
jne label     jnz label    ~ZF                 Different or non-zero
js label                   SF                  Negative
jns label                  ~SF                 Non-negative
jg label      jnle label   ~(SF^OF) & ~ZF      Signed >
jge label     jnl label    ~(SF^OF)            Signed >=
jl label      jnge label   (SF^OF)             Signed <
jle label     jng label    (SF^OF) | ZF        Signed <=
ja label      jnbe label   ~CF & ~ZF           Unsigned >
jae label     jnb label    ~CF                 Unsigned >=
jb label      jnae label   CF                  Unsigned <
jbe label     jna label    CF | ZF             Unsigned <=
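
To see how a comparison followed by a conditional jump fits together, here is a C sketch that computes the flags left by cmpl arg1, arg2 (i.e. from arg2 - arg1) and evaluates a few of the jump conditions. The CmpFlags struct and the names cmp32, cond_l, cond_g, cond_b and cond_a are hypothetical, introduced only for this example.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct { bool cf, zf, sf, of; } CmpFlags;

/* Flags set by "cmpl arg1, arg2": computed from arg2 - arg1, result discarded. */
CmpFlags cmp32(uint32_t arg1, uint32_t arg2)
{
    uint32_t r = arg2 - arg1;
    CmpFlags f;
    f.cf = arg2 < arg1;                                  /* borrow: unsigned arg2 < arg1 */
    f.zf = (r == 0);
    f.sf = (r >> 31) != 0;
    /* signed overflow: operands differ in sign and the result's sign differs from arg2 */
    f.of = (((arg2 ^ arg1) & (arg2 ^ r)) >> 31) != 0;
    return f;
}

bool cond_l(CmpFlags f) { return f.sf != f.of; }              /* jl: SF^OF           */
bool cond_g(CmpFlags f) { return (f.sf == f.of) && !f.zf; }   /* jg: ~(SF^OF) & ~ZF  */
bool cond_b(CmpFlags f) { return f.cf; }                      /* jb: CF              */
bool cond_a(CmpFlags f) { return !f.cf && !f.zf; }            /* ja: ~CF & ~ZF       */
```

Comparing arg2 = 0xffffffff with arg1 = 1 illustrates the signed/unsigned split: as a signed value arg2 is -1, so the jl condition holds, while as an unsigned value it is the largest 32-bit number, so the ja condition holds too.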

Most Common Instructions
● call procedure call: call label
○ Pushes the address of the next instruction onto the stack
○ Conceptually executes (%rip cannot actually be used as a mov operand):
subq $8, %rsp
movq %rip, (%rsp)
○ Modifies the program counter to go to the beginning of the desired procedure
● ret return from procedure: ret
○ Pops the return address from the stack and stores it in %rip
○ The program counter thus moves to the instruction that follows the call in the caller
○ Conceptually executes:
movq (%rsp), %rip
addq $8, %rsp

move
● Syntax: mov[b,w,l,q] src, dst
● Initial conditions: Mem[0x00204] = 7654 3210
                      Mem[0x00200] = fedc ba98
                      rax = ffff ffff 1234 5678

● movl 0x204, %eax     rax = 0000 0000 7654 3210
● movw 0x202, %ax      rax = 0000 0000 7654 fedc
● movb 0x207, %al      rax = 0000 0000 7654 fe76
● movq 0x200, %rax     rax = 7654 3210 fedc ba98

● 32-bit instructions automatically zero the top 32 bits of the respective 64-bit register, while 16-bit and 8-bit instructions leave the upper bits unchanged.

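
The register results of these loads can be reproduced with a small C model, given here as a sketch rather than a description of the hardware: load_le reads a little-endian value of 1, 2, 4 or 8 bytes from a byte array, and write_reg models what happens to the full 64-bit register when only its low part is written. Both function names are hypothetical.

```c
#include <stdint.h>

/* Little-endian read of `size` bytes (1, 2, 4 or 8) starting at mem[addr]. */
uint64_t load_le(const uint8_t *mem, uint64_t addr, unsigned size)
{
    uint64_t v = 0;
    for (unsigned i = 0; i < size; i++)
        v |= (uint64_t)mem[addr + i] << (8 * i);
    return v;
}

/* New value of a 64-bit register after writing `value` into its low `size` bytes. */
uint64_t write_reg(uint64_t reg, uint64_t value, unsigned size)
{
    switch (size) {
    case 1:  return (reg & ~0xffull)   | (value & 0xff);    /* movb: upper 56 bits kept   */
    case 2:  return (reg & ~0xffffull) | (value & 0xffff);  /* movw: upper 48 bits kept   */
    case 4:  return value & 0xffffffffull;                  /* movl: upper 32 bits zeroed */
    default: return value;                                  /* movq: whole register       */
    }
}
```

With the bytes 98 ba dc fe 10 32 54 76 stored from address 0x200 upward and rax = ffff ffff 1234 5678, applying the four moves in order (4 bytes from 0x204, 2 bytes from 0x202, 1 byte from 0x207, 8 bytes from 0x200) yields the four values of rax listed in the slide.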
move
● Syntax: mov[b,w,l,q] src, dst
● Initial conditions: Mem[0x004e4] = 0000 0000
                      Mem[0x004e0] = 0000 0000
                      rax = 7654 3210 fedc ba98

● movb %al, 0x4e5      Mem[0x004e4] = 0000 9800
                       Mem[0x004e0] = 0000 0000
● movl %eax, 0x4e0     Mem[0x004e4] = 0000 9800
                       Mem[0x004e0] = fedc ba98

move
● Syntax: mov[b,w,l,q] src, dst
● Initial conditions: Mem[0x00204] = 7654 3210
                      Mem[0x00200] = fedc ba98
                      rax = ffff ffff 1234 5678

● movl $0xfe1234, %eax     rax = 0000 0000 00fe 1234
● movw $0xaa55, %ax        rax = 0000 0000 00fe aa55
● movb $20, %al            rax = 0000 0000 00fe aa14
● movq $-1, %rax           rax = ffff ffff ffff ffff

● 32-bit instructions automatically zero the top 32 bits of the respective 64-bit register, while 16-bit and 8-bit instructions leave the upper bits unchanged.

move
● Syntax: mov[b,w,l,q] src, dst
● Initial conditions: Mem[0x004e4] = 7654 3210
                      Mem[0x004e0] = fedc ba98
                      rax = ffff ffff 1234 5678

● movabsq $0x123456789ab, %rax     rax = 0000 0123 4567 89ab
○ movabsq: moves a full 64-bit immediate value to a register
● movq $-1, 0x4e0                  Mem[0x004e4] = ffff ffff
                                   Mem[0x004e0] = ffff ffff

move (Zero, Signed)
● Syntax: movs[bw,bl,bq,wl,wq,lq] src, dst (sign extension) and movz[bw,bl,bq,wl,wq] src, dst (zero extension)
● Initial conditions:
Mem[0x00204] = 7654 3210
Mem[0x00200] = fedc ba98
rdx = 0123 4567 89ab cdef
● movslq 0x200, %rax     rax = ffff ffff fedc ba98     Move sign-extended doubleword to quadword
● movzwl 0x202, %eax     rax = 0000 0000 0000 fedc     Move zero-extended word to doubleword
● movsbw 0x201, %ax      rax = 0000 0000 0000 ffba     Move sign-extended byte to word
● movsbl 0x206, %eax     rax = 0000 0000 0000 0054     Move sign-extended byte to doubleword
● movzbq %dl, %rax       rax = 0000 0000 0000 00ef     Move zero-extended byte to quadword

● The instructions are executed in sequence. A write to %eax zeroes the top 32 bits of %rax, while a write to %ax leaves the upper 48 bits unchanged.

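
The extension rules can be approximated in C with integer casts. The sketch below is illustrative only: the functions are named after the mnemonics for readability, and the casts from unsigned to signed types assume the usual two's-complement behaviour of mainstream compilers. It also shows why a 32-bit destination always ends up with zeros in bits 63..32.

```c
#include <stdint.h>

/* movslq: sign-extend a 32-bit value to 64 bits. */
uint64_t movslq(uint32_t src) { return (uint64_t)(int64_t)(int32_t)src; }

/* movzwl: zero-extend 16 -> 32 bits; the 32-bit write also clears bits 63..32. */
uint64_t movzwl(uint16_t src) { return (uint64_t)src; }

/* movsbl: sign-extend 8 -> 32 bits; the 32-bit write then clears bits 63..32. */
uint64_t movsbl(uint8_t src) { return (uint64_t)(uint32_t)(int32_t)(int8_t)src; }

/* movsbw: sign-extend 8 -> 16 bits; only the low 16 bits of the old register change. */
uint64_t movsbw(uint64_t old, uint8_t src)
{
    return (old & ~0xffffull) | (uint16_t)(int16_t)(int8_t)src;
}

/* movzbq: zero-extend 8 -> 64 bits. */
uint64_t movzbq(uint8_t src) { return (uint64_t)src; }
```

For example, movsbl applied to the byte 0xba gives 0000 0000 ffff ffba: the sign is extended only up to bit 31, and the upper half of the register is cleared.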
Examples: add, and, or, sub
● Syntax: add/and/or/sub[b,w,l,q] src, dst
● Initial conditions: Mem[0x00204] = 7654 3210
                      Mem[0x00200] = 0f0f ff00
                      rdx = ffff ffff 1234 5678
                      rax = 0000 0000 cc33 aa55

● addl $0x12300, %eax      rax = 0000 0000 cc34 cd55
● addq %rdx, %rax          rax = ffff ffff de69 23cd
● andw 0x200, %ax          rax = ffff ffff de69 2300
● orb 0x203, %al           rax = ffff ffff de69 230f
● subw $14, %ax            rax = ffff ffff de69 2301
● addl $0x12345, 0x204     Mem[0x00204] = 7655 5555     Mem[0x00200] = 0f0f ff00

Load Effective Address
● lea load effective address: lea src, dest
● Copies source address (calculated by displacement, base, index and scale)
into the destination register
○ Calculates the address and stores it in the destination register without loading anything
from the memory.

● Example: lea 80(%rdx, %rcx, 2), %rax → %rax = %rdx + 2*%rcx + 80


● It is often used as an arithmetic instruction
○ that performs two additions at once (an immediate value and two registers), scaling one of the register addends by 1, 2, 4 or 8.

Load Effective Address
● Example:
○ add the content of two registers and save the result in a third register.

● On RISC-V if we want to add x1 and x2 saving the result in x3


○ add x3, x1, x2

● On Intel how can we add %rbx and %rcx while saving the result in %rax?
○ lea (%rbx, %rcx), %rax

Load Effective Address
● Syntax: lea src, dst
● Initial conditions:
rcx = 0000 0000 0000 0020
rdx = 0000 0089 1234 4000
rbx = ffff ffff ff00 0300

● leal (%edx,%ecx),%eax         rax = 0000 0000 1234 4020
● leaq -8(%rbx),%rax            rax = ffff ffff ff00 02f8
● leaq 12(%rdx,%rcx,2),%rax     rax = 0000 0089 1234 404c

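
Since lea only performs arithmetic, its results can be checked with plain integer operations. The following C sketch uses hypothetical function names (leaq and leal) and models the 64-bit form and the 32-bit form, whose result is truncated to 32 bits and then zero-extended into the 64-bit destination.

```c
#include <stdint.h>

/* leaq displ(base, index, scale), dst: 64-bit arithmetic, no memory access. */
uint64_t leaq(int32_t displ, uint64_t base, uint64_t index, unsigned scale)
{
    return base + index * scale + (uint64_t)(int64_t)displ;
}

/* leal with 32-bit registers: 32-bit arithmetic, result zero-extended to 64 bits. */
uint64_t leal(int32_t displ, uint32_t base, uint32_t index, unsigned scale)
{
    return (uint32_t)(base + index * scale + (uint32_t)displ);
}
```

Using rcx = 0x20, rdx = 0x8912344000 and rbx = 0xffffffffff000300, the calls leal(0, edx, ecx, 1), leaq(-8, rbx, 0, 1) and leaq(12, rdx, rcx, 2) give 0x12344020, 0xffffffffff0002f8 and 0x891234404c respectively, where edx and ecx are the low 32 bits of rdx and rcx.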
Load Effective Address
int fl(int x) { // x = %edi
    return 9 * x + 1;
}

Not optimized code:

fl:
    movl %edi, %eax     # tmp = x
    sall $3, %eax       # tmp = 8*x
    addl %edi, %eax     # tmp += x
    addl $1, %eax       # tmp += 1
    ret

Optimized code:

fl:
    leal 1(%edi, %edi, 8), %eax     # eax = 1 + edi + edi*8
    ret

Other Instructions
● inc <register>/dec <register>
○ add / subtract 1 to/from a register
○ Why not just add $1, <register>?

● In RISC-V each instruction is encoded in a fixed 32 bits

● Intel uses a variable-length encoding
○ add $1, <register> would require you to encode:
■ The opcode of the add
■ The immediate value 1 (8 bits in the shortest encoding, 32 bits otherwise)
■ The register <register>

● inc does not require you to encode the immediate value: saves at least 8 bits

cmov (Conditional Move)
● cmovx <source>, <destination>
● copy a value (from source to destination) depending on the values of the condition codes

# x in register %edi, y in %esi
max:
    cmpl %esi, %edi       # Compare x:y
    cmovge %edi, %esi     # if >=, then y=x
    movl %esi, %eax       # set y as return value
    ret

Instruction   Condition           Description
cmove         ZF                  Equal
cmovne        ~ZF                 Not equal
cmovs         SF                  Negative
cmovns        ~SF                 Nonnegative
cmovg         ~(SF ^ OF) & ~ZF    Greater (signed >)
cmovge        ~(SF ^ OF)          Greater or equal (signed >=)
cmovl         SF ^ OF             Less (signed <)
cmovle        (SF ^ OF) | ZF      Less or equal (signed <=)
cmova         ~CF & ~ZF           Above (unsigned >)
cmovae        ~CF                 Above or equal (unsigned >=)
cmovb         CF                  Below (unsigned <)
cmovbe        CF | ZF             Below or equal (unsigned <=)

Example
● C string: a character array terminated by 0
○ ASCII: characters encoded in bytes

● Copy a string:
void copy_string(char *d, const char *s){
int i = 0;
while ((d[i] = s[i]) != 0) {
i += 1;
}
}

● How do we do this in Intel Assembly (64 bit)?

Recap:
● Calling Conventions are not really part of the architecture
○ Given a CPU/architecture, many different calling conventions are possible
○ They are used to make different compilers/libraries and other parts of the Operating System compatible

● They are specified by the ABI, not by the ISA!

○ How/where to pass parameters? On the stack or in registers?
○ Which registers to preserve? When a program invokes a subroutine, which registers are expected to contain the same value on return?

x64 Calling Conventions
● First 6 arguments:
○ %rdi, %rsi, %rdx, %rcx, %r8 and %r9
● Other arguments (7 → n): on the stack
● Return values:
○ %rax and %rdx
● Preserved registers:
○ %rbp, %rbx, %r12, %r13, %r14 and %r15
● Not preserved registers:
○ %rax, %r10, %r11, in addition to the registers for parameter passing: %rdi, %rsi, %rdx, %rcx,
%r8 and %r9

Example
void copy_string(char *d, const char *s){
    int i = 0;
    while ((d[i] = s[i]) != 0) {
        i += 1;
    }
}

● The two parameters d and s are contained in %rdi and %rsi
● Suppose we use %rax for the counter i
○ It is not a preserved register, so it is not necessary to save it on the stack
○ No prologue is needed: we can start with the C code
● We begin: i = 0; → %rax = 0
    movq $0, %rax
● First of all, let's load s[i] in %cl
○ %rcx is not a preserved register either; %bl would not do, because %rbx is preserved and would have to be saved first
● Use the base + index addressing mode:
    L1: movb (%rsi, %rax), %cl     # loop start
○ No need to load the address of the i-th element in a register (as in RISC-V)
● Now we store %cl in d[i]
    movb %cl, (%rdi, %rax)
● Now we need to check if s[i] == 0. If yes, end of the loop!
    cmpb $0, %cl     # compares %cl with 0
    je L2            # if they are equal, jump to L2 (exit the loop!)
● If not, increment i (which is in %rax) and loop …
    addq $1, %rax
    jmp L1           # L1: loop start label
● The L2 label will implement the return to the caller
● We have not saved anything on the stack
○ there is no need for an epilogue!
● You can directly return to the caller
    L2: ret

● Complete code:
    .text
    .globl copy_string
copy_string:
    movq $0, %rax
L1:
    movb (%rsi, %rax), %cl
    movb %cl, (%rdi, %rax)
    cmpb $0, %cl
    je L2
    addq $1, %rax
    jmp L1
L2:
    ret

Code examples - 1

● Assumptions/Notes: &x = 0x10000004, &y = 0x10000008, &z = 0x1000000c
Code:
    int x, y, z;
    ...
    z = x + y;
Assembler code:
    movl $0x10000004, %ecx
    movl (%ecx), %eax
    addl 4(%ecx), %eax
    movl %eax, 8(%ecx)

● Assumptions/Notes: &a = 0x1000000c
Code:
    char a[100];
    ...
    a[1]--;
Assembler code:
    movl $0x1000000c, %ecx
    decb 1(%ecx)

● Assumptions/Notes: &d = 0x10000010, &x = 0x10000020
Code:
    int d[4], x;
    ...
    x = d[0];
    x += d[1];
Assembler code:
    movl $0x10000010, %ecx
    movl (%ecx), %eax
    movl %eax, 16(%ecx)
    movl 16(%ecx), %eax
    addl 4(%ecx), %eax
    movl %eax, 16(%ecx)

● Assumptions/Notes: &y = 0x10000010, &z = 0x10000014
Code:
    unsigned int y;
    short z;
    y = y/4;
    z = z << 3;
Assembler code:
    movl $0x10000010, %ecx
    movl (%ecx), %eax
    shrl $2, %eax
    movl %eax, (%ecx)
    movw 4(%ecx), %ax
    salw $3, %ax
    movw %ax, 4(%ecx)

Code examples - 2

// data = %rdi
// val = %rsi
// i = %edx
int func_1(int data[], int *val, int i){
    int sum = *val;
    sum += data[i];
    return sum;
}

func_1:
    movl (%rsi), %eax              # sum = *val
    movslq %edx, %rdx              # sign-extend i to 64 bits
    addl (%rdi, %rdx, 4), %eax     # sum += data[i]
    ret

● The convention for x86-64 requires that the return value of a function be stored in %eax / %rax

struct Data{
    char c; int d;
};
// ptr = %rdi
// x = %esi
void func_2(struct Data *ptr, int x){
    ptr->c++;
    ptr->d -= x;
}

func_2:
    addb $1, (%rdi)        # ptr->c++
    subl %esi, 4(%rdi)     # ptr->d -= x (d is at offset 4)
    ret

Code examples - 3

// x = %edi, res = %rsi
void abs_value(int x, int *res){
    if (x < 0)
        *res = -x;
    else
        *res = x;
}

abs_value:
    test %edi, %edi       # edi & edi
    jns Lab1              # jump if SF = 0
    negl %edi             # 2's complement
    movl %edi, (%rsi)
    ret
Lab1:
    movl %edi, (%rsi)
    ret

// x = %edi, y = %esi, res = %rdx
void func_3(int x, int y, int *res){
    if (x < y)
        *res = x;
    else
        *res = y;
}

func_3:
    cmpl %esi, %edi
    jge Lab2
    movl %edi, (%rdx)
    ret
Lab2:
    movl %esi, (%rdx)
    ret

Code examples - 4

// x = %edi, y = %esi, res = %rdx
void func_4(int x, int y, int *res){
    if ((x == -1) || (y == -1))
        *res = y - 1;
    else if ((x > 0) && (y < x))
        *res = x + 1;
    else
        *res = 0;
}

func_4:
    cmpl $-1, %edi
    je .L6
    cmpl $-1, %esi
    je .L6
    testl %edi, %edi
    jle .L5
    cmpl %esi, %edi
    jle .L5
    addl $1, %edi
    movl %edi, (%rdx)
    ret
.L5:
    movl $0, (%rdx)
    ret
.L6:
    subl $1, %esi
    movl %esi, (%rdx)
    ret

// a = %edi, b = %esi
int avg(int a, int b){
    return (a+b)/2;
}

avg:
    movl %edi, %eax
    addl %esi, %eax      # eax = a + b
    movl %eax, %edx
    shrl $31, %edx       # edx = 1 if the sum is negative, else 0
    addl %edx, %eax      # bias so that the shift rounds toward zero, like / in C
    sarl $1, %eax        # arithmetic shift right by 1
    ret

Code examples - 5

// str = %rdi
int func_5 (char str[]){
    int i = 0;
    while (str[i] != 0){
        i++;
    }
    return i;
}

func_5:
    movl $0, %eax
    jmp .L2
.L3:
    addl $1, %eax
.L2:
    movslq %eax, %rdx
    cmpb $0, (%rdi, %rdx)
    jne .L3
    ret

// dat = %rdi, len = %esi
int func_6(int dat[], int len){
    int min = dat[0];
    for (int i = 1; i < len; i++){
        if (dat[i] < min){
            min = dat[i];
        }
    }
    return min;
}

func_6:
    movl (%rdi), %eax
    movl $1, %edx
    jmp .L2
.L4:
    movslq %edx, %rcx
    movl (%rdi, %rcx, 4), %ecx
    cmpl %ecx, %eax
    jle .L3
    movl %ecx, %eax
.L3:
    addl $1, %edx
.L2:
    cmpl %esi, %edx
    jl .L4
    ret

Code examples - 6

int caller(){
    int sum = f1(1, 2, 3, 4, 5, 6, 7, 8);
    return sum;
}
int f1(int a1, int a2, int a3, int a4,
       int a5, int a6, int a7, int a8){
    return a1+a2+a3+a4+a5+a6+a7+a8;
}

caller:
    pushq $8                 # a8: on the stack
    pushq $7                 # a7: on the stack
    movl $6, %r9d
    movl $5, %r8d
    movl $4, %ecx
    movl $3, %edx
    movl $2, %esi
    movl $1, %edi
    call f1
    addq $16, %rsp           # remove a7 and a8 from the stack
    ret
f1:
    addl %edi, %esi
    addl %esi, %edx
    addl %edx, %ecx
    addl %ecx, %r8d
    addl %r8d, %r9d
    movl %r9d, %eax
    addl 8(%rsp), %eax       # a7 (just above the return address)
    addl 16(%rsp), %eax      # a8
    ret