05 Compilation Linking Loading
Abhijit A M
Review of last few lectures
● Boot sequence: BIOS, boot-loader, kernel
● Boot sequence: Process world
– kernel -> init -> many fork()s + exec()s -> ...
● Hardware interrupts, system calls, exceptions
● Event driven kernel
● System calls
– fork, exec, ... open, read, ...
What are compiler, assembler, linker and loader, and C library
System Programs/Utilities
Most essential to make a kernel really usable
Standard C Library
● A collection of some of the most frequently needed functions for C programs
– scanf, printf, getchar, system-call wrappers (open, read, fork, exec, etc.), ...
● A machine/object code file containing the machine code of all these functions
– Not source code, and not a header file either. More later.
● Where is the C library on your computer?
– E.g. /usr/lib/x86_64-linux-gnu/libc-2.31.so (the exact path and version vary by distribution)
Compiler
● An application program which converts one (programming) language to another
– Most typically, compilers convert a high-level language like C, C++, etc. to machine code
● E.g. GCC: /usr/bin/gcc
– Usage: e.g.
– $ gcc main.c -o main
– Here main.c is the C code, and "main" is the object/machine code file generated
– main.c (source code file) -> gcc -> main (machine code file)
● Input is a file and output is also a file.
● Other examples: g++ (for C++), javac (for Java)
Assembler
● An application program which converts assembly code into machine code
● What is assembly language?
– A human-readable form of machine code.
● E.g. x86 assembly code (AT&T syntax, as accepted by "as")
– mov $50, %eax
– add $10, %eax
– main.s (assembly code file) -> as -> main (machine code file); gcc runs "as" internally
● [Figure from the textbook]
Example
try.c:
    #include <stdio.h>
    #define MAX 30
    int f(int, int);
    int main() {
        int i, j, k;
        scanf("%d%d", &i, &j);
        k = f(i, j) + MAX;
        printf("%d\n", k);
        return 0;
    }
f.c:
    int g(int);
    #define ADD(a, b) (a + b)
    int f(int m, int n) {
        return ADD(m,n) + g(10);
    }
g.c:
    int g(int x) {
        return x + 10;
    }
Try these commands, observe the output/errors/warnings, and try to understand what is happening
    $ gcc try.c
    $ gcc -c try.c
    $ gcc -c f.c
    $ gcc -c g.c
    $ gcc try.o f.o g.o -o try
    $ gcc -E try.c
    $ gcc -E f.c
More about the steps
● Pre-processor
– #define ABC XYZ
    ● cut ABC and paste XYZ
– #include <stdio.h>
    ● copy-paste the file stdio.h
    ● There is no CODE in stdio.h, only function declarations, typedefs, #include, #define, #ifdef, etc.
● Linking
– Normally links with the standard C library by default
– To link with other libraries, use the -l option of gcc
    ● cc main.c -lm -lncurses -o main    # links with libm.so and libncurses.so
Using gcc itself to understand the process
● Run only the preprocessor
– cc -E test.c
– Shows the output on the screen
● Run only till compilation (no linking)
– cc -c test.c
– Generates the "test.o" file; runs the compiler + assembler
● Stop at assembly code
– gcc -S main.c
– One step before machine code generation; generates the "main.s" file
● Combine multiple .o files (only the linking part)
– cc test.o main.o try.o -o something
Linking process
● Linker is an application program
– On Linux, it's the "ld" program
– E.g. you can run commands like $ ld a.o b.o -o c.o
– Normally you have to specify some options to ld to get a proper executable file.
● When you run gcc
– $ cc main.o f.o g.o -o try
– cc will internally invoke "ld"; ld does the job of linking
Linking process
● The resultant file "try" here will contain the code of all the functions, and the linkages as well.
● What is linking?
– "Connecting" the call of a function with the code of the function.
● What happens with the code of printf()?
– The linker (invoked by cc) automatically links with the C library file libc.so.6, which contains the code of such functions.
Executable file format
● An executable file needs to execute in an environment created by the OS and on a particular processor
– Contains machine code + other information for the OS
– Need for a structured way of storing machine code in it
● Different OSes demand different formats
– Windows: PE, Linux: ELF, Old Unixes: a.out, etc.
● ELF: the format on Linux.
● Try this
– $ file /bin/ls
– $ file /usr/lib/x86_64-linux-gnu/libc-2.31.so
Exec() and ELF
● When you run a program
– $ ./try
– Essentially there will be a fork() and exec("./try", ...)
– So the kernel has to read the file "./try" and understand it.
– So each kernel will demand its own object code file format.
– Hence ELF, EXE, etc. formats
● ELF is used not only for executable (complete machine code) programs, but also for partially compiled files, e.g. main.o, and library files like libc.so.6
● What is a.out?
– "a.out" was the name of a format used on earlier Unixes.
– It so happened that the early compiler writers also created executables with the default name "a.out"
Utilities to play with object code files
● objdump
– $ objdump -D -x /bin/ls
– Shows all disassembled machine instructions and "headers"
● hexdump
– $ hexdump /bin/ls
– Just shows the file in hexadecimal
● readelf
– Alternative to objdump
● ar
– To create a static library (archive) file, used for static linking
– $ ar -crs libmine.a one.o two.o
● gcc to create a shared library
– $ gcc hello.o -shared -o libhello.so
● To see how gcc invokes as, ld, etc., do this
– $ gcc -v hello.c -o hello
– https://stackoverflow.com/questions/1170809/how-to-get-gcc-lin
Linker, Loader, Link-Loader
● Linker or linkage-editor or link-editor
– The "ld" program. Does linking.
● Loader
– The exec() system call. It loads an executable into memory.
● Link-Loader
– The linker is often called a link-loader in the literature, because there were days when the jobs of the linker and the loader overlapped considerably.
Static, dynamic / linking, loading
● Both linking and loading can be
– Static or dynamic
– More about this when we learn memory management
● An important fundamental:
– The memory management features of the processor, the memory management architecture of the kernel, the executable/object-code file format, the output of the linker and the job of the loader are all interdependent and inseparable.
– They all should fit into each other to make a system work
– That's why the phrase "system programs"
Cross-compiler
● A compiler that runs on system A, but generates object-code files for system B (the target system)
– E.g. compile on Ubuntu, but create an EXE for Windows
● Normally used when there is no compiler available on the target system
– See the gcc -m options
● See https://wiki.osdev.org/GCC_Cross-Compiler