Ch2 Programming Background
2 Programming Background
2.3 Program Development
2.3.1 Program Development Steps
(1). Create source files
t1.c and t2.c are the source files of a C program
2.3.2 Variables in C
Variables in C programs can be classified as global, local, static,
automatic, register, etc., as shown in Fig. 2.3.
• Global variables are defined outside of any function.
• Local variables are defined inside functions.
• Global variables are unique and have only one copy.
• Static globals are visible only to the file in which they are defined.
• Non-static globals are visible to all the files of the same program.
• Initialized globals are assigned values at compile time.
• Uninitialized globals are cleared to 0 when the program execution starts.
• Local variables are visible only to the function in which they are
defined.
• By default, local variables are automatic; they come into existence when
the function is entered and they logically disappear when the function
exits.
• For register variables, the compiler tries to allocate them in CPU
registers.
• Since automatic local variables do not have any allocated memory
space until the function is entered, they cannot be initialized at
compile time.
• Static local variables are permanent and unique, so they can be
initialized at compile time.
• C also supports volatile variables, which are used as memory-mapped
I/O locations or global variables that are accessed by interrupt
handlers or multiple execution threads.
• The volatile keyword prevents the C compiler from optimizing away
reads and writes of such variables, e.g. by caching their values in
registers.
2.3.3 Compile-Link in GCC
(2). Use gcc to convert the source files into a binary executable, as in
gcc t1.c t2.c
which generates a binary executable file named a.out (assembler output).
(3). What’s gcc? gcc is a program, which consists of three major steps,
as shown in Fig. 2.4.
Step 1. Convert C source files to assembly code files:
The C COMPILER translates .c files into .s files containing assembly
code of the target machine.
If all the cross references can be resolved successfully, the linker writes
the resulting combined file as a.out, which is the binary executable file.
2.3.4 Static vs. Dynamic Linking
• In static linking, which uses a static library, the linker includes all the
needed library function code and data into a.out. This makes a.out
complete and self-contained but usually very large.
• In dynamic linking, which uses a shared library, the library functions
are not included in a.out but calls to such functions are recorded in
a.out as directives.
• When a dynamically linked a.out file is executed, the operating system
loads both a.out and the shared library into memory and makes the
loaded library code accessible to a.out during execution.
The main advantages of dynamic linking are:
• The size of every a.out is reduced.
• Many executing programs may share the same library functions in
memory.
• Modifying library functions does not require re-compiling the source
files.
Note that the bss section, which contains uninitialized global and static
local variables, is not in the a.out file. Only its size is recorded in the
a.out file header. Also, automatic local variables are not in a.out.
• Figure 2.5 shows the layout of an a.out file.
• _brk is a symbolic mark indicating the end of the bss section. The total
loading size of a.out is usually equal to _brk, i.e. equal to
tsize+dsize+bsize. If desired, _brk can be set to a higher value for a
larger loading size. The extra memory space above the bss section is
the HEAP area for dynamic memory allocation during execution.
2.3.7 Program Execution
• Under a Unix-like operating system, the sh command line
a.out one two three
executes a.out with the token strings as command-line parameters.
(2) Every CPU has the following registers or equivalent, where the
entries in parentheses denote registers of the x86 CPU:
PC (IP): points to the next instruction to be executed by the CPU.
SP (SP): points to the top of the stack.
FP (BP): points to the stack frame of the currently active function.
Return Value Register (AX): register for the function return value.
(3) In every C program, main() is called by the C startup code crt0.o.
When crt0.o calls main(), it pushes the return address (the current PC
register) onto stack and replaces PC with the entry address of main(),
causing the CPU to enter main().
When control enters main(), the stack contains the saved return PC on
top, as shown in Fig. 2.8, in which XXX denotes the stack contents
before crt0.o calls main(), and SP points to the saved return PC from
where crt0.o calls main().
(4) Upon entry, the compiled code of every C function does the
following:
• push FP onto stack # this saves the CPU's FP register on stack.
• let FP point at the saved FP # establish stack frame
• shift SP downward to allocate space for automatic local variables on
stack
• the compiled code may shift SP farther down to allocate some
temporary working space on the stack, denoted by temps.
• After entering main(), the stack contents become as shown in Fig.
2.9, in which the spaces of a, b, c are allocated but their contents are
as yet undefined.
(5) Then the CPU starts to execute the code a=1; b=2; c=3; which put
the values 1, 2, 3 into the memory locations of a, b, c, respectively.
Assume that sizeof(int) is 4 bytes. The locations of a, b, c are at -4, -8,
-12 bytes from where FP points. These are expressed as -4(FP),
-8(FP), -12(FP) in assembly code, where FP is the stack frame pointer.
For example, in 32-bit Linux the assembly code for b=2 in C is
movl $2, -8(%ebp) # b=2 in C
where $2 means the value of 2 and %ebp is the ebp register.
(6) main() calls sub() by c = sub(a, b); The compiled code of the
function call consists of
• Push parameters in reverse order, i.e. push the values of b=2 and a=1
onto the stack.
• Call sub, which pushes the current PC onto stack and replaces PC with
the entry address of sub, causing the CPU to enter sub().
• When control first enters sub(), the stack contains a return address at
the top, preceded by the parameters, a, b, of the caller, as shown in
Fig. 2.10.
(7) Since sub() is written in C, its actions are exactly the same as those of
main(), i.e. it
• pushes FP and lets FP point at the saved FP;
• shifts SP downward to allocate space for local variables u, v.
• The compiled code may shift SP farther down for some temp space on
stack.
• The stack contents become as shown in Fig. 2.11.
2.4.2 Stack Frames
While execution is inside a function, such as sub(), it can only access
global variables, parameters passed in by the caller and local variables.
Global and static local variables are in the combined Data section,
which can be referenced by a fixed base register.
So the problem is: how to reference parameters and automatic locals?
• The stack area visible to a function, i.e. parameters and automatic
locals, is called the Stack Frame of the function, and FP is called the Stack
Frame Pointer.
What would happen if we have a sequence of function calls, e.g.
crt0.o --> main() --> A(par_a) --> B(par_b) --> C(par_c)
The function call sequence is maintained in the stack as a linked list, as
shown in Fig. 2.13.
-L. specifies the library path (current directory), and -l specifies the library.
Note that the library (mylib) is specified without the prefix lib and without the
suffix .a
2.7 Makefile
2.7.1 Makefile Format
• A makefile consists of a set of targets, dependencies and rules.
• A target is usually a file to be created or updated. It may also be a
directive to, or a label to be referenced by, the make program.
• A target depends on a set of source files, object files or even other
targets, which are described in a Dependency List.
• Rules are the necessary commands to build the target by using the
Dependency List.
Makefile format
2.7.2 The make Program
• When the make program reads a makefile, it determines which targets
to build by comparing the timestamps of source files in the
Dependency List.
• If any dependency has a timestamp newer than that of the target, i.e. it
has changed since the last build, make will execute the rule associated
with the target.
%.o stands for all .o files and $@ is set to the current target name, i.e.
the current .o file name. This avoids defining separate targets for
individual .o files.
Makefile Example 4: Use make variables and suffix rules
CC = gcc
CFLAGS = -I.
OBJS = t.o mysum.o
AS = as # assume we have .s files in assembly also
DEPS = type.h # list all .h files in DEPS
.s.o: # for each fname.o, assemble fname.s into fname.o
	$(AS) -a $< -o $@ # -o $@ REQUIRED for .s files
.c.o: # for each fname.o, compile fname.c into fname.o
	$(CC) -c $< -o $@ # -o $@ optional for .c files
%.o: %.c $(DEPS) # for all .o files: if its .c or .h file changed
	$(CC) -c -o $@ $< # compile corresponding .c file again
myt: $(OBJS)
	$(CC) $(CFLAGS) -o $@ $^
The lines .s.o: and .c.o: are not ordinary targets but suffix rules, i.e.
directives to the make program.
For each .o file, there should be a corresponding .s or .c file, from which
the .o file is rebuilt if their timestamps differ.
$@ means the current target.
$< means the first file in the dependency list.
$^ means all files in the dependency list.
Open the EMACS Tools menu and select Compile. EMACS will show a prompt
line at the bottom of the edit window
make -k
and wait for user response.
EMACS normally compile-links the source code by a makefile. If the
reader already has a makefile in the same directory, as shown above, press the
Enter key to let EMACS continue.
Instead of using a makefile, the reader may also enter the command line manually.
3. Start up GDB: Open the EMACS Tools menu and select Debugger.
EMACS will show a prompt line at the bottom of the edit window and
wait for user response.
gdb -i=mi t
Press Enter to start up the GDB debugger.
GDB will run in the upper window and display a menu and a
tool bar at the top of the EMACS edit window, as shown in Fig. 2.18.
The user may now enter GDB commands to debug the program. For
example, to set break points, enter the GDB commands
b main # set break point at main
b sub # set break point at sub
b 10 # set break point at line 10 in program
When the user enters the Run (r) command (or chooses Run in the tool
bar), GDB will display the program code in the same GDB window.
Other frames/windows can be activated through the submenu
GDB-Frames or GDB-Windows.
4. GDB in Multi-Windows: From the GDB menu, choose
Gud => GDB-MI => Display Other Windows.
GDB will display GDB buffers in different windows, as shown in Fig.
2.19.
Figure 2.19 shows six (6) GDB windows, each displaying a specific GDB
buffer.
• Gud-t: GDB buffer for user commands and GDB messages
• t.c: Program source code to show progress of execution
• Stack frames: show stack frames of function calling sequence
• Locals/Registers: show local variables in the currently executing function
• Input/output: for program I/O
• Breakpoints: display current break points settings
• It also shows some of the commonly used GDB commands in a tool
bar, e.g. Run, Continue, Next line, Step line.
Figure 2.20 shows that the program execution is now inside sub() and
the execution already passed the statements before
printf("return from sub\n");
(5). Additional GDB Commands:
• At each break point or while executing in single line mode, the user
may enter GDB commands
• either manually,
• by the GDB tool bar or by choosing submenu items in the Gud menu,
which includes all the commands in the GDB tool bar.
• The following lists some additional GDB commands and their
meanings.
2.9 Structures in C
A structure is a composite data type containing a collection of variables
or data objects.