01 Lecture02
20200123
Outline
Memory segments
Format strings
A primer on x86 assembly
Introduction
It’s a trap!
• ≈ 1000 instructions . . .
• No time to know them all :-)
This overview is meant as a starting point.
Multiple syntaxes
• AT&T
• Intel
Basics
In general
Mnemonics take from 0 to 3 arguments.
Two-argument mnemonics have the form (Intel syntax):
m dst, src
Endianness
x = 0xdeadbeef
byte address                   0x00  0x01  0x02  0x03
byte content (big-endian)      0xde  0xad  0xbe  0xef
byte content (little-endian)   0xef  0xbe  0xad  0xde
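To make the two byte orders concrete, here is a minimal C sketch that stores a 32-bit value byte by byte in each order. The helper names store_be32 and store_le32 are hypothetical, chosen only for this illustration.

```c
#include <stdint.h>

/* Big-endian: most significant byte at the lowest address. */
void store_be32(uint8_t out[4], uint32_t v) {
    out[0] = (uint8_t)(v >> 24);
    out[1] = (uint8_t)(v >> 16);
    out[2] = (uint8_t)(v >> 8);
    out[3] = (uint8_t)v;
}

/* Little-endian: least significant byte at the lowest address. */
void store_le32(uint8_t out[4], uint32_t v) {
    out[0] = (uint8_t)v;
    out[1] = (uint8_t)(v >> 8);
    out[2] = (uint8_t)(v >> 16);
    out[3] = (uint8_t)(v >> 24);
}
```

Because the bytes are written explicitly, the result does not depend on the endianness of the machine running the code.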
Architectures
Big endian
PowerPC, Sparc, 68000
Little endian
Intel, AMD
Bi-endian
ARM, RISC-V
These usually default to little endian.
Resources
• Cheat sheet
• Opcode and Instruction Reference
• Intel full instruction set reference
Basic registers (16/32/64 bits)
The full story
Register flags (partial)
of overflow flag
cf carry flag
zf zero flag
sf sign flag
df direction flag
pf parity flag
af adjust flag
Signed vs unsigned
• unsigned value
• signed value
• float (not covered here)
Transfer
Move
Arithmetic
Arithmetic
Sign preservation
mov ax, 0xff00    # unsigned: 65280, signed: -256
                  # ax = 1111.1111.0000.0000
sal ax, 2         # unsigned: 64512, signed: -1024
                  # ax = 1111.1100.0000.0000
sar ax, 5         # unsigned: 65504, signed: -32
                  # ax = 1111.1111.1110.0000
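The same arithmetic can be replayed in C. This is a minimal sketch under the assumption of a 16-bit register modelled as uint16_t; the function names sal16, sar16 and as_signed16 are hypothetical. The arithmetic right shift is written out by hand because >> on negative signed values is implementation-defined in C.

```c
#include <stdint.h>

/* Shift left on a 16-bit register; bits shifted past bit 15 are lost. */
uint16_t sal16(uint16_t x, unsigned n) {
    return (uint16_t)((uint32_t)x << n);
}

/* Arithmetic shift right: the vacated high bits are copies of the
   sign bit. Valid for 0 <= n <= 15. */
uint16_t sar16(uint16_t x, unsigned n) {
    uint16_t r = (uint16_t)(x >> n);
    if (x & 0x8000u)
        r |= (uint16_t)~(0xFFFFu >> n);
    return r;
}

/* Read the 16 bits as a two's-complement signed value. */
int32_t as_signed16(uint16_t x) {
    return (x & 0x8000u) ? (int32_t)x - 65536 : (int32_t)x;
}
```

With these helpers, sal16(0xff00, 2) gives 0xfc00 and sar16(0xfc00, 5) gives 0xffe0, which read as -1024 and -32 when interpreted as signed values.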
Basic logical operators
Basic semantics
Examples
xor ax, ax        # ax = 0x0000
not ax            # ax = 0xffff
mov bx, 0x5500    # bx = 0x5500
xor ax, bx        # ax = 0xaaff
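The sequence can be checked with a few lines of C. This is a sketch that models ax and bx as uint16_t variables; the function name logic_demo is hypothetical.

```c
#include <stdint.h>

uint16_t logic_demo(void) {
    uint16_t ax = 0x1234;        /* arbitrary starting value */
    uint16_t bx;

    ax = (uint16_t)(ax ^ ax);    /* xor ax, ax     -> 0x0000 */
    ax = (uint16_t)~ax;          /* not ax         -> 0xffff */
    bx = 0x5500;                 /* mov bx, 0x5500           */
    ax = (uint16_t)(ax ^ bx);    /* xor ax, bx     -> 0xaaff */
    return ax;
}
```

The last step flips exactly the bits of ax that are set in bx: 0xff ^ 0x55 = 0xaa in the high byte, while the low byte 0xff is unchanged.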
Logical shifts
Shift
Example
mov ax, 0xff00    # unsigned: 65280, signed: -256
                  # ax = 1111.1111.0000.0000
shl ax, 2         # unsigned: 64512, signed: -1024
                  # ax = 1111.1100.0000.0000
shr ax, 5         # unsigned: 2016, signed: 2016
                  # ax = 0000.0111.1110.0000
Comparison and test instructions
Comparison
cmp dst, src: sets the flags according to dst − src (the result itself is discarded)
Test
test dst, src: sets the flags according to dst & src (the result itself is discarded)
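As an illustration, the flag updates of both instructions can be modelled in C. This is a sketch for 32-bit operands covering only zf, cf, sf and of; the struct and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

struct flags {
    bool zf, cf, sf, of;
};

/* cmp dst, src: flags of dst - src. */
struct flags cmp32(uint32_t dst, uint32_t src) {
    uint32_t res = dst - src;
    struct flags f;
    f.zf = (res == 0);
    f.cf = (dst < src);                              /* unsigned borrow */
    f.sf = (res >> 31) != 0;                         /* sign of result */
    f.of = (((dst ^ src) & (dst ^ res)) >> 31) != 0; /* signed overflow */
    return f;
}

/* test dst, src: flags of dst & src; cf and of are always cleared. */
struct flags test32(uint32_t dst, uint32_t src) {
    uint32_t res = dst & src;
    struct flags f;
    f.zf = (res == 0);
    f.cf = false;
    f.sf = (res >> 31) != 0;
    f.of = false;
    return f;
}
```

For instance, cmp32(1, 2) sets cf and sf (1 is below 2, and the difference is negative), while test32(0xf0, 0x0f) sets zf because the operands share no bits.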
Stack manipulation
Push
Pop
Nops
Misc
Int
Unconditional jump instructions
Call
call address
call *op
Jmp
jmp *op
jmp address
Extra jumps
Leave
esp := ebp; ebp := pop();
Ret
esp := esp + 4; eip := @[esp - 4];
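The two definitions can be traced on a toy machine. The sketch below assumes a 32-bit stack of 64 words addressed in bytes; the type toy_cpu and the toy_ functions are hypothetical names for this illustration, not a model of a real CPU.

```c
#include <stdint.h>

#define STACK_WORDS 64

struct toy_cpu {
    uint32_t esp, ebp, eip;
    uint32_t mem[STACK_WORDS];   /* byte address a lives in mem[a / 4] */
};

void toy_push(struct toy_cpu *c, uint32_t v) {
    c->esp -= 4;                 /* the stack grows towards lower addresses */
    c->mem[c->esp / 4] = v;
}

uint32_t toy_pop(struct toy_cpu *c) {
    uint32_t v = c->mem[c->esp / 4];
    c->esp += 4;
    return v;
}

/* leave: esp := ebp; ebp := pop(); */
void toy_leave(struct toy_cpu *c) {
    c->esp = c->ebp;
    c->ebp = toy_pop(c);
}

/* ret: esp := esp + 4; eip := @[esp - 4]; */
void toy_ret(struct toy_cpu *c) {
    c->esp += 4;
    c->eip = c->mem[(c->esp - 4) / 4];
}
```

A typical sequence is: push the return address (as call does), push ebp, set ebp to esp, reserve locals by lowering esp, then run toy_leave followed by toy_ret. Afterwards ebp and esp hold their values from before the call and eip holds the pushed return address.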
Unsigned jumps
Reading
ja has n and e versions, which means that the jna, jae and jnae mnemonics exist as well
Signed jumps
Addressing modes
Mode                           Intel
Immediate                      mov ax, 16h
Direct                         mov ax, [1000h]
Register direct                mov bx, ax
Register indirect (indexed)    mov ax, [di]
Based indexed                  mov ax, [bx + di]
Based indexed with disp.       mov eax, [ebx + edi + 2]
The semantics of instructions may seem intuitive but is complex
Instructions do have side effects
Real behavior of conditions
mnemonic   flags        cmp x y   sub x y   test x y
ja         ¬CF ∧ ¬ZF    x >u y    x' ≠ 0    x & y ≠ 0
jnae       CF           x <u y    x' ≠ 0    ⊥
je         ZF           x = y     x' = 0    x & y = 0
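The cmp and test cases can be checked mechanically. The sketch below evaluates each jump's flag condition after the two instructions, for 32-bit unsigned operands; all function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flags after `cmp x, y`: CF is the unsigned borrow, ZF means x - y == 0. */
static bool cf_after_cmp(uint32_t x, uint32_t y) { return x < y; }
static bool zf_after_cmp(uint32_t x, uint32_t y) { return (uint32_t)(x - y) == 0; }

/* Flags after `test x, y`: CF is always cleared, ZF means (x & y) == 0. */
static bool cf_after_test(uint32_t x, uint32_t y) { (void)x; (void)y; return false; }
static bool zf_after_test(uint32_t x, uint32_t y) { return (x & y) == 0; }

/* ja is taken when CF and ZF are both clear. */
bool ja_after_cmp(uint32_t x, uint32_t y) {
    return !cf_after_cmp(x, y) && !zf_after_cmp(x, y);
}
bool ja_after_test(uint32_t x, uint32_t y) {
    return !cf_after_test(x, y) && !zf_after_test(x, y);
}

/* jnae is taken when CF is set. */
bool jnae_after_cmp(uint32_t x, uint32_t y)  { return cf_after_cmp(x, y); }
bool jnae_after_test(uint32_t x, uint32_t y) { return cf_after_test(x, y); }

/* je is taken when ZF is set. */
bool je_after_cmp(uint32_t x, uint32_t y)  { return zf_after_cmp(x, y); }
bool je_after_test(uint32_t x, uint32_t y) { return zf_after_test(x, y); }
```

After cmp, ja behaves like the unsigned comparison x > y. After test, the same jump only asks whether x and y share a set bit, and jnae can never be taken.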
Shift left
Memory segments
General overview
1. code (text)
2. stack
3. data segments
3.1 data
3.2 bss
3.3 heap
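A small C program can show one object per segment. This is a sketch: the variable and function names are hypothetical, the printed addresses differ between systems and runs, and casting a function pointer to void * is a common extension rather than strict ISO C.

```c
#include <stdio.h>
#include <stdlib.h>

int initialized_global = 42;   /* data segment */
int uninitialized_global;      /* bss: zero-initialized at program start */

/* Print one address per segment. */
void print_segments(void) {
    int local = 0;                            /* stack */
    int *dynamic = malloc(sizeof *dynamic);   /* heap */

    printf("text : %p\n", (void *)print_segments);
    printf("data : %p\n", (void *)&initialized_global);
    printf("bss  : %p\n", (void *)&uninitialized_global);
    printf("heap : %p\n", (void *)dynamic);
    printf("stack: %p\n", (void *)&local);

    free(dynamic);
}
```

On a typical Linux process the text, data and bss addresses are low and close together, the heap follows them, and the stack address is much higher.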
Execution
Graphically speaking
[Figure: memory layout; visible labels: the hole, the break, bss (globals), data, text]
Text segment
[Figure: memory layout; visible labels: stack, data, text]
Data & bss segments
[Figure: memory layout; visible labels: stack, data, text]
Heap segment
[Figure: memory layout; visible label: stack]
Stack segment
• The stack segment is a temporary scratch pad for functions.
• Since eip changes on function calls, the stack is used to remember the previous state (return address, calling function base, arguments, . . . ).
• It is writable.
• It can grow larger, towards lower memory addresses, with respect to function calls.
[Figure: memory layout; labels from top to bottom: stack, the hole, heap, bss, data, text]
In C
Stack-based buffer overflows
C low-level responsibility
Reminder : stack layout for x86
Code:
  f: ...
     call g
     ...
Data (stack frames of f, then g):
  return address f
  saved frame pointer f
  locals f
  arguments g
  return address g
  saved frame pointer g
  locals g
  buffer
Vulnerability reason
A rich history
Basic exploitation
Code:
  f: ...
     call g
     ...
Data (stack frames of f, then g):
  return address f
  saved frame pointer f
  locals f
  arguments g
  return address g
  saved frame pointer g
  injected code
  locals g
Frame pointer overwriting
Code:
  f: ...
     call g
     ...
Data (stack frames of f, then g):
  return address f
  saved frame pointer f
  locals f
  arguments g
  return address g
  saved frame pointer g
  locals g
  injected code
Indirect pointer overwriting
Code:
  f: ...
     call g
     ...
Data (stack frames of f, then g):
  return address f
  saved frame pointer f
  locals f
  arguments g
  return address g
  saved frame pointer g
  injected code
  locals g
Example 1
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int check_authentication(char *password) {
    int auth_flag = 0;
    char password_buffer[16];
    /* No bounds check: more than 15 characters overflow the buffer. */
    strcpy(password_buffer, password);
    if (strcmp(password_buffer, "brillig") == 0)
        auth_flag = 1;
    if (strcmp(password_buffer, "outgrabe") == 0)
        auth_flag = 1;
    return auth_flag;
}

int main(int argc, char *argv[]) {
    if (argc < 2) { printf("Usage: %s <password>\n", argv[0]); exit(0); }
    if (check_authentication(argv[1])) printf("\nAccess Granted.\n");
    else printf("\nAccess Denied.\n");
}
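The effect of the unchecked copy can be modelled without undefined behavior. The sketch below assumes a layout in which auth_flag sits directly above the 16-byte buffer, which is an assumption: the real placement is chosen by the compiler. The struct frame_model and the function auth_flag_after_copy are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical model of the frame: the flag directly follows the buffer. */
struct frame_model {
    char password_buffer[16];
    int auth_flag;
};

/* Copy input as strcpy would, clamped to the model so that every write
   stays inside one object. Returns the resulting auth_flag. */
int auth_flag_after_copy(const char *input) {
    struct frame_model frame;
    size_t n = strlen(input) + 1;       /* strcpy also copies the '\0' */

    memset(&frame, 0, sizeof frame);    /* auth_flag starts at 0 */
    if (n > sizeof frame)
        n = sizeof frame;
    memcpy(&frame, input, n);           /* bytes past index 15 reach auth_flag */
    return frame.auth_flag;
}
```

A short input leaves the flag at 0, while an input of 20 characters fills the flag with non-zero bytes, which check_authentication would then return as a true value even though no password matched.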
Example 2
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int check_authentication(char *password) {
    char password_buffer[16]; /* Putting buffers before variables to impede ... */
    int auth_flag = 0;
    strcpy(password_buffer, password);
    if (strcmp(password_buffer, "brillig") == 0)
        auth_flag = 1;
    if (strcmp(password_buffer, "outgrabe") == 0)
        auth_flag = 1;
    return auth_flag;
}

int main(int argc, char *argv[]) {
    if (argc < 2) { printf("Usage: %s <password>\n", argv[0]); exit(0); }
    if (check_authentication(argv[1])) printf("\nAccess Granted.\n");
    else printf("\nAccess Denied.\n");
}
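Reordering declarations does not remove the overflow itself, since strcpy still copies without a bound. For comparison, here is a sketch of a variant that checks the length first; the name check_authentication_bounded is hypothetical and this is one possible fix, not the one from the lecture.

```c
#include <string.h>

int check_authentication_bounded(const char *password) {
    char password_buffer[16];
    int auth_flag = 0;

    /* Reject anything that does not fit, terminator included. */
    if (strlen(password) >= sizeof password_buffer)
        return 0;
    strcpy(password_buffer, password);   /* fits: length checked above */

    if (strcmp(password_buffer, "brillig") == 0)
        auth_flag = 1;
    if (strcmp(password_buffer, "outgrabe") == 0)
        auth_flag = 1;
    return auth_flag;
}
```

Over-long inputs are refused before any byte is copied, so the neighbouring stack contents are never touched.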
Constraints
Needs
• Hardware willing to execute data as code
• No null bytes
Variants
• Frame pointer corruption
• Causing an exception to execute a specific function pointer
Statistics # (https://nvd.nist.gov/vuln)
Statistics % (https://nvd.nist.gov/vuln)
Heap-based buffer overflows
Vulnerability
Basic exploitation
Overwriting heap-based function pointers
Constraints
Statistics # (https://nvd.nist.gov/vuln)
Statistics % (https://nvd.nist.gov/vuln)
Format strings
About format strings vulnerabilities
Vulnerability
How it works
• The format string is copied to the output unless '%' is encountered.
• Then the format specifier will manipulate the output.
• When an argument is required, it is expected to be on the stack.
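The three points above can be illustrated with a toy formatter. This is a sketch, not how printf is implemented: it understands only %d, reads its arguments from an array instead of the stack, and the name toy_format is hypothetical. It returns how many arguments the format string asked for, a number that depends on the format string alone.

```c
#include <stddef.h>
#include <stdio.h>

/* Copy fmt to out, replacing each %d with the next entry of args.
   Returns the number of arguments requested by fmt, which may be
   larger than nargs. */
size_t toy_format(char *out, size_t cap, const char *fmt,
                  const int *args, size_t nargs) {
    size_t used = 0, requested = 0;

    if (cap == 0)
        return 0;
    for (const char *p = fmt; *p != '\0'; p++) {
        if (p[0] == '%' && p[1] == 'd') {
            int n;
            /* A real printf would read whatever sits in the next
               argument slot; the toy prints '?' instead. */
            if (requested < nargs)
                n = snprintf(out + used, cap - used, "%d", args[requested]);
            else
                n = snprintf(out + used, cap - used, "?");
            requested++;
            p++;                          /* skip the 'd' */
            if (n < 0 || (size_t)n >= cap - used)
                break;                    /* output buffer is full */
            used += (size_t)n;
        } else if (used + 1 < cap) {
            out[used++] = *p;             /* ordinary character: copied as is */
        }
    }
    out[used] = '\0';
    return requested;
}
```

Calling it with the format "x=%d y=%d" and a single argument 7 produces "x=7 y=?" and returns 2: the second %d asked for an argument nobody supplied.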
Caveat
And so...
If an attacker is able to specify the format string, they can control what the function pops from the stack and can make the program write to arbitrary memory locations.
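Writing is possible because of the %n conversion, which stores the number of characters produced so far into the int its argument points to. The sketch below uses it legitimately with snprintf; the function name chars_counted_by_n is hypothetical, and some C libraries restrict or disable %n, so this assumes one that supports it (such as glibc with a constant format string).

```c
#include <stdio.h>

/* Returns the value stored by %n. */
int chars_counted_by_n(void) {
    char buf[128];
    int count = -1;

    /* %100d pads the integer to 100 characters, so %n stores 100. */
    snprintf(buf, sizeof buf, "%100d%n", 0, &count);
    return count;
}
```

The field width controls the value that gets written. When the attacker also controls which pointer %n receives, the written value lands at an address of their choosing.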
CVEs
Software                                     CVE
Zend                                         2015-8617
latex2rtf                                    2015-8106
VmWare 8x                                    2012-3569
WuFTPD (providing remote root since 1994)    2000
Good & Bad
Good
  int f (char *user) {
    printf("%s", user);
  }
Bad
  int f (char *user) {
    printf(user);
  }
Exploitation
Example
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char *argv[]) {
    char text[1024];
    static int test_val = 65;
    if (argc < 2) {
        printf("Usage: %s <text to print>\n", argv[0]);
        exit(0);
    }
    strcpy(text, argv[1]);
    printf("The right way to print user-controlled input:\n");
    printf("%s", text);
    printf("\nThe wrong way to print user-controlled input:\n");
    printf(text); /* deliberately vulnerable: user input used as format string */
    // Debug output (%p is the portable conversion for an address)
    printf("\n[*] test_val @ %p = %d 0x%08x\n",
           (void *)&test_val, test_val, test_val);
    exit(0);
}
Stack situation
fmt
...
argn
...
arg1
&fmt
Reading from arbitrary addresses
Printing local variable
Writing to arbitrary memory
It may be unintentional
Statistics # (https://nvd.nist.gov/vuln)
Statistics % (https://nvd.nist.gov/vuln)
Looking back
Play (exploitation) games
https://microcorruption.com
Questions ?
https://rbonichon.github.io/teaching/2020/asi36/