OS Lecture 04
Static Linking
Programs are translated and linked using a compiler driver:
linux> gcc -Og -o prog main.c sum.c
linux> ./prog
[Figure: the source files main.c and sum.c are each passed through the translators (cpp, cc1, as).]
Why Linkers? (cont)
Reason 2: Efficiency
Time: Separate compilation
Change one source file, compile, and then relink.
No need to recompile other source files.
Space: Libraries
Common functions can be aggregated into a single
file...
Yet executable files and running memory images
contain only code for the functions they actually use.
What Do Linkers Do?
Step 1: Symbol resolution
Programs define and reference symbols (global variables
and functions):
void swap() {…} /* define symbol swap */
swap(); /* reference symbol swap */
int *xp = &x; /* define symbol xp, reference x */
Symbol definitions are stored (by the assembler) in the object
file's symbol table.
The symbol table is an array of structs.
Each entry includes the name, size, and location of a symbol.
During the symbol resolution step, the linker associates each
symbol reference with exactly one symbol definition.
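A minimal sketch of the idea above, assuming a simplified, hypothetical entry layout (struct sym_entry and resolve() are illustrative names, not the real ELF symbol table format): each entry records name, size, and location, and resolution looks for exactly one definition of a referenced name.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified symbol-table entry (not the real ELF layout). */
struct sym_entry {
    const char *name;    /* symbol name */
    size_t      size;    /* size of the function or object in bytes */
    size_t      offset;  /* location: offset within its section */
    int         defined; /* 1 = this entry defines the symbol, 0 = reference only */
};

/* Return the one definition of `name`, or NULL if there is none
   or if more than one definition exists. */
const struct sym_entry *resolve(const struct sym_entry *tab, size_t n,
                                const char *name)
{
    const struct sym_entry *found = NULL;
    for (size_t i = 0; i < n; i++) {
        if (tab[i].defined && strcmp(tab[i].name, name) == 0) {
            if (found != NULL)
                return NULL;      /* duplicate definition */
            found = &tab[i];
        }
    }
    return found;
}
```

Calling resolve(tab, n, "sum") on the combined entries of all input files returns the single defining entry; a NULL result corresponds to an undefined or multiply defined symbol.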
What Do Linkers Do? (cont)
Step 2: Relocation
Merges separate code and data sections into single
sections
Relocates symbols from their relative locations in the .o
files to their final absolute memory locations in the
executable.
Updates all references to these symbols to reflect their
new positions.
Let’s look at these two steps in more detail.
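A rough sketch of the relocation step, assuming sections are simply placed back to back with no alignment padding (struct section, layout(), and relocate_symbol() are hypothetical names for illustration): merging assigns each input section a final base address, and a symbol's absolute address is that base plus its offset in the .o file.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one input module's section. */
struct section {
    uint64_t size;  /* bytes of code or data contributed by this module */
    uint64_t base;  /* filled in by layout(): final absolute start address */
};

/* Place the sections back to back starting at start_addr.
   Returns the first address past the merged section. */
uint64_t layout(struct section *secs, size_t n, uint64_t start_addr)
{
    uint64_t addr = start_addr;
    for (size_t i = 0; i < n; i++) {
        secs[i].base = addr;
        addr += secs[i].size;
    }
    return addr;
}

/* Final address of a symbol = final base of its section + offset in the .o file. */
uint64_t relocate_symbol(const struct section *sec, uint64_t offset)
{
    return sec->base + offset;
}
```

For example, with an assumed start address of 0x4004d0 and an assumed 24-byte .text section from main.o followed by the .text section of sum.o, a symbol at offset 0 in sum.o ends up at 0x4004e8.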
Three Kinds of Object Files (Modules)
Relocatable object file (.o file)
Contains code and data in a form that can be combined
with other relocatable object files to form executable
object file.
Each .o file is produced from exactly one source (.c) file
Executable object file (a.out file)
Contains code and data in a form that can be copied
directly into memory and then executed.
Shared object file (.so file)
Special type of relocatable object file that can be loaded
into memory and linked dynamically, at either load time
or run-time.
Called Dynamic Link Libraries (DLLs) on Windows
Executable and Linkable Format (ELF)
Standard binary format for object files
One unified format for
Relocatable object files (.o),
Executable object files (a.out)
Shared object files (.so)
Generic name: ELF binaries
ELF Object File Format
ELF header
Word size, byte ordering, file type (.o, exec, .so), machine
type, etc.
Segment header table
Page size, virtual addresses of memory segments (sections),
segment sizes.
Required for executables.
[Figure: file layout starting at offset 0: ELF header, segment header table (required for executables), .text section, .rodata section]
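As an illustration of what the ELF header encodes, here is a small sketch that reads word size, byte ordering, and file type from the first bytes of a file image. The byte positions follow the ELF specification (magic bytes 0x7f 'E' 'L' 'F', class at byte 4, data encoding at byte 5, 16-bit type field at offset 16); struct elf_info and parse_elf_header() are hypothetical names, and the sketch ignores every other header field.

```c
#include <stddef.h>

/* Hypothetical summary of a few ELF header fields. */
struct elf_info {
    int bits;           /* word size: 32 or 64 */
    int little_endian;  /* 1 = little-endian, 0 = big-endian */
    int type;           /* 1 = relocatable, 2 = executable, 3 = shared object */
};

/* Returns 0 and fills *out if buf starts with a plausible ELF header, else -1. */
int parse_elf_header(const unsigned char *buf, size_t len, struct elf_info *out)
{
    if (len < 18)
        return -1;
    if (buf[0] != 0x7f || buf[1] != 'E' || buf[2] != 'L' || buf[3] != 'F')
        return -1;                        /* missing ELF magic number */
    if (buf[4] != 1 && buf[4] != 2)
        return -1;                        /* unknown word-size class */
    if (buf[5] != 1 && buf[5] != 2)
        return -1;                        /* unknown byte ordering */

    out->bits = (buf[4] == 1) ? 32 : 64;
    out->little_endian = (buf[5] == 1);
    /* The type field is stored in the file's own byte order. */
    out->type = out->little_endian ? (buf[16] | (buf[17] << 8))
                                   : ((buf[16] << 8) | buf[17]);
    return 0;
}
```

Passing the first 18 bytes of a .o file should report type 1 (relocatable), which is how tools can tell the three kinds of object files apart.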
main.c
int sum(int *a, int n);

int array[2] = {1, 2};

sum.c
int sum(int *a, int n)
{
    int i, s = 0;

Defining a global: array (in main.c)
Referencing a global that's defined here: sum (referenced in main.c, defined in sum.c)
Linker knows nothing of val
Linker knows nothing of i or s
Local Symbols
Local non-static C variables vs. local static C variables
local non-static C variables: stored on the stack
local static C variables: stored in either .bss, or .data
int f()
{
    static int x = 0;
    return x;
}
Compiler allocates space in .data (or in .bss, for zero-initialized
variables) for each definition of x
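A short sketch of the difference in behavior (the function names are made up for illustration): each local static variable is a separate object that keeps its value between calls, while a non-static local is recreated on the stack on every call.

```c
/* Local static: one private copy per function, kept between calls. */
int next_id(void)
{
    static int x = 0;
    return ++x;
}

/* A second static named x in another function is a different object. */
int next_ticket(void)
{
    static int x = 100;
    return ++x;
}

/* Local non-static: lives on the stack, re-initialized on every call. */
int fresh(void)
{
    int x = 0;
    return ++x;
}
```

Calling next_id() twice yields 1 and then 2, next_ticket() starts from its own x, and fresh() returns 1 every time.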
Linker’s Symbol Rules
Rule 1: Multiple strong symbols are not allowed
Each item can be defined only once
Otherwise: Linker error
Rule 2: Given a strong symbol and multiple weak
symbols, choose the strong symbol
References to the weak symbol resolve to the strong
symbol
Rule 3: If there are multiple weak symbols, pick an
arbitrary one
Can override this with gcc -fno-common
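The three rules can be modeled as a small decision function. This is only a sketch: choose_definition() is a hypothetical name, "arbitrary" is modeled as "first one seen", and it assumes the usual convention that functions and initialized globals are strong while uninitialized globals are weak.

```c
#include <stddef.h>

enum strength { WEAK, STRONG };

/* Given the strengths of all definitions of one symbol name, return the
   index of the definition the linker would use, or -1 for an error
   (multiple strong definitions, or no definition at all). */
int choose_definition(const enum strength *defs, size_t n)
{
    int chosen = -1;
    for (size_t i = 0; i < n; i++) {
        if (defs[i] == STRONG) {
            if (chosen >= 0 && defs[chosen] == STRONG)
                return -1;        /* Rule 1: two strong symbols */
            chosen = (int)i;      /* Rule 2: strong beats weak */
        } else if (chosen < 0) {
            chosen = (int)i;      /* Rule 3: pick one weak symbol */
        }
    }
    return chosen;
}
```

For the inputs {WEAK, STRONG, WEAK} the function picks index 1, and for {STRONG, STRONG} it reports an error.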
Linker Puzzles
File 1: int x; p1() {}
File 2: p1() {}
Link time error: two strong symbols (p1)
If you must use global variables:
Use static if you can
Initialize if you define a global variable
Use extern if you reference an external global variable
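A small sketch of the three guidelines, with made-up variable and function names. To keep the example in one file, the variable declared extern is also defined at the bottom; in a real program that definition would live in another source file.

```c
/* static: private to this file, so it cannot clash with other modules. */
static int call_count = 0;

/* Initialized global: a strong symbol, so an accidental second
   definition elsewhere becomes a link-time error instead of a silent merge. */
int max_calls = 3;

/* extern: states explicitly that the definition lives elsewhere. */
extern int log_level;

/* Returns 1 while fewer than max_calls calls have been made, else 0. */
int try_call(void)
{
    if (call_count >= max_calls)
        return 0;
    call_count++;
    return 1;
}

int verbose(void)
{
    return log_level > 0;
}

/* Would normally be defined in a different .c file. */
int log_level = 1;
```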
Step 2: Relocation
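[Figure: Relocatable Object Files (system data; main.o with main() in .text and int array[2]={1,2} in .data; sum.o with sum() in .text) are merged into the Executable Object File, which has a single .text section, a single .data section, and then .symtab and .debug.]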
Relocation Entries
main.c
int array[2] = {1, 2};

int main()
{
    int val = sum(array, 2);
    return val;
}
00000000004004e8 <sum>:
  4004e8:  b8 00 00 00 00    mov    $0x0,%eax
  4004ed:  ba 00 00 00 00    mov    $0x0,%edx
  4004f2:  eb 09             jmp    4004fd <sum+0x15>
  4004f4:  48 63 ca          movslq %edx,%rcx
  4004f7:  03 04 8f          add    (%rdi,%rcx,4),%eax
  4004fa:  83 c2 01          add    $0x1,%edx
  4004fd:  39 f2             cmp    %esi,%edx
  4004ff:  7c f3             jl     4004f4 <sum+0xc>
  400501:  f3 c3             repz retq
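A sketch of how a linker could patch the call to sum in main using a relocation entry for a 32-bit PC-relative reference. The formula is the standard one (stored value = symbol address + addend - address of the reference). The address of sum, 0x4004e8, comes from the listing above; the start of main's code (0x4004d0), the operand offset (0xf), and the addend (-4) are assumptions chosen for illustration, and struct reloc and apply_pc32() are hypothetical names.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified relocation entry. */
struct reloc {
    uint64_t offset;  /* offset of the 4-byte reference within the section */
    int64_t  addend;  /* constant adjustment, e.g. -4 for a call operand */
};

/* Patch one 32-bit PC-relative reference inside `section`, which will be
   loaded at absolute address section_addr. Returns the value stored. */
int32_t apply_pc32(uint8_t *section, uint64_t section_addr,
                   const struct reloc *r, uint64_t symbol_addr)
{
    uint64_t refaddr = section_addr + r->offset;   /* run-time address of the operand */
    int32_t value = (int32_t)(symbol_addr + (uint64_t)r->addend - refaddr);
    memcpy(section + r->offset, &value, sizeof value);  /* stored in host byte order */
    return value;
}
```

With these assumed numbers the stored displacement is 0x4004e8 - 4 - 0x4004df = 0x5: at run time the instruction after the call sits at 0x4004e3, and 0x4004e3 + 0x5 is the address of sum.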
Creating Static Libraries
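[Figure: atoi.c, printf.c, …, random.c are translated into atoi.o, printf.o, …, random.o, which the archiver (ar) collects into libc.a for use by the linker (ld).]
unix> ar rs libc.a \
      atoi.o printf.o … random.o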
int main()
{
    void *handle;
    void (*addvec)(int *, int *, int *, int);
    char *error;
Some Interpositioning Applications
Security
Confinement (sandboxing)
Behind the scenes encryption
Debugging
In 2014, two Facebook engineers debugged a treacherous
1-year-old bug in their iPhone app using interpositioning
Code in the SPDY networking stack was writing to the wrong
location
Solved by intercepting calls to POSIX write functions (write,
writev, pwrite)
Source: Facebook engineering blog post at
https://code.facebook.com/posts/313033472212144/debugging-file-corruption-on-ios/
Some Interpositioning Applications
Monitoring and Profiling
Count number of calls to functions
Characterize call sites and arguments to functions
Malloc tracing
Detecting memory leaks
Generating address traces
Example program
int.c
#include <stdio.h>
#include <malloc.h>

int main()
{
    int *p = malloc(32);
    free(p);
    return(0);
}

Goal: trace the addresses and sizes of the allocated and freed blocks,
without breaking the program, and without modifying the source code.
Three solutions: interpose on the lib malloc and free functions
at compile time, link time, and load/run time.
Compile-time Interpositioning
mymalloc.c
#ifdef COMPILETIME
#include <stdio.h>
#include <malloc.h>
Link-time Interpositioning
mymalloc.c
#ifdef LINKTIME
#include <stdio.h>
void *__real_malloc(size_t size);
void __real_free(void *ptr);
if (!ptr)
return;
Load/Run-time Interpositioning
linux> make intr
gcc -Wall -DRUNTIME -shared -fpic -o mymalloc.so mymalloc.c -ldl
gcc -Wall -o intr int.c
linux> make runr
(LD_PRELOAD="./mymalloc.so" ./intr)
malloc(32) = 0xe60010
free(0xe60010)
linux>
Interpositioning Recap
Compile Time
Apparent calls to malloc/free get macro-expanded into
calls to mymalloc/myfree
Link Time
Use linker trick to have special name resolutions
malloc → __wrap_malloc
__real_malloc → malloc
Load/Run Time
Implement custom versions of malloc/free that use
dynamic linking to load the library malloc/free under different
names
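The compile-time variant from the recap can be sketched in a single file. This is an illustration under assumptions, not the lecture's mymalloc.c: the wrappers and the call counters are made up, and the two macros are defined inline here, whereas in a multi-file setup they would normally live in a local malloc.h that the compiler finds ahead of the system header (for example via gcc's -I. option).

```c
#include <stdio.h>
#include <stdlib.h>

/* Counters added only so the effect can be observed (hypothetical). */
int malloc_calls = 0;
int free_calls = 0;

/* Wrappers: the macros below are not defined yet, so these
   calls still reach the library malloc and free. */
void *mymalloc(size_t size)
{
    void *ptr = malloc(size);
    malloc_calls++;
    printf("malloc(%zu) = %p\n", size, ptr);
    return ptr;
}

void myfree(void *ptr)
{
    free_calls++;
    printf("free(%p)\n", ptr);
    free(ptr);
}

/* From here on, apparent calls to malloc/free expand to the wrappers. */
#define malloc(size) mymalloc(size)
#define free(ptr)    myfree(ptr)

/* Unmodified client code, same body as main() in the example program. */
int demo(void)
{
    int *p = malloc(32);
    free(p);
    return(0);
}
```

Calling demo() prints one malloc line and one free line. The client code itself is unchanged, only the preprocessor view of malloc and free differs.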
Any Questions?