ICS2305 - Systems Programming Lecture Notes
DEPARTMENT OF INFORMATION
TECHNOLOGY
Calvin Otieno
([email protected])
Course description
System Programming overview: Intro and Definitions, Application vs System
Programming, System Software, Operating system, Device Drivers, OS Calls. Windows
System Programming for Intel386 Architecture: 16 bit vs 32 bit Programming, 32
bit Flat memory model, Windows Architecture. Virtual Machine (VM) Basics, System
Virtual Machine, Portable Executable Format, Ring 0 Computer, Linear Executable
format, Virtual Device, Device Driver Development. UNIX Unifying Ideas,
Compilation, C Data types, Pointers, etc. Standard I/O, Dynamic memory management,
System calls & error handling; Building, Debugging and Directories in Unix,
Unix File System Structure; Linux File System Kernel Data Structures; Low-level
I/O (Processes, File Descriptors, Opening/Creating/Closing/Deleting Files, Reading
and Writing); Intro to Processes (Programming with Processes: IPC, Pipes,
FIFO/Named Pipes, Message queues, Sockets, Semaphores and Locks, Signals;
Abnormal Termination, Orphaned Processes). Threads, Network File Systems (CVS,
NFS v2 spec). Subverting an interface: The SFS Server.
Prerequisites:
BIT 2203: Advanced Programming, ICS 2202: Computer Operating Systems
Course aims
This course aims at providing students a basic understanding of the issues involved
in writing system programs, manipulating system processes, system IO, system per-
missions, files, directories, signals, threads, sockets, terminal, etc. The primary op-
erating system discussed will be Unix (Linux) but Windows will also be discussed.
A primary goal of the course then is to reinforce, in a systems programming context,
the skills you have already acquired to develop code that is robust.
Learning outcomes
By the end of this course unit, the learner should be able to:
2. Write, compile, debug, and execute C programs that correctly use system
interfaces provided by UNIX (or a UNIX-like operating system).
3. List UNIX system calls, and invoke them correctly from within C programs.
4. Write, compile, debug, and execute C programs that create, manage and ter-
minate processes and threads on UNIX.
5. Write, compile, debug, and execute C programs with processes and threads
that interact by invoking and catching signals.
6. Write, compile, debug, and execute C programs that use files and I/O on
UNIX.
7. Write, compile, debug, and execute C programs that make use of memory
management functions.
Instruction methodology
• Lectures,
• Tutorials,
• Practical,
• Presentations,
• Discussions,
• Industrial visits.
Assessment information
The module will be assessed as follows;
Contents
2 Interrupt mechanisms
2.1 introduction
2.1.1 The interrupt vector table
2.1.2 Interrupt Invocation
2.1.3 Parameter passing into Software interrupts
2.1.4 Software interrupts invocation
3 Registry service
3.1 introduction
3.2 Function pointers
void myfunc()
3.3 Interrupt pointers and functions
3.3.1 Setting Interrupt Vector
3.3.2 The Keep function
7 System processes
7.1 introduction
7.1.1 Starting a Process
• Foreground Processes
• Background Processes
7.1.2 Listing Running Processes
9 Systems performance
9.1 introduction
9.1.1 Performance Components
9.1.2 Performance Tools
LESSON 1
Intro to systems programming
2. output
While designing software, the programmer determines the inputs the program
requires, the outputs it should produce and the processing needed to turn those
inputs into those outputs. Implementing the processing part is the concern of
application programming; everything that is left is facilitated by systems
programming. Systems programming is the study of techniques that facilitate the
acquisition of data from input devices; these techniques also facilitate the
output of data, which may be the result of processing performed by an application.
• Programmed I/O
In programmed I/O the CPU continuously checks the I/O device to see whether the
I/O operation can be performed. When it can, the CPU performs the computations
required to complete the operation and then goes back to waiting until the device
can perform the next one. The CPU therefore remains tied up: it does nothing
besides waiting for the I/O device to become idle and performing computations for
the slower device.
Interrupt driven I/O rectifies the flaws of programmed I/O. The processor does
not check the device; rather, the I/O device informs the CPU that it is idle and
can perform an I/O operation. As a result the execution of the CPU is interrupted
and an Interrupt Service Routine (ISR) is invoked, which performs the computations
required for the I/O operation. After the ISR has executed, the CPU continues with
whatever it was doing before the interruption. In this way the CPU does not remain
tied up and can perform computations for other processes while the I/O devices are
busy, which makes better use of the CPU.
Usually it takes two bus cycles to transfer data from an I/O port to memory, or
vice versa, if the transfer goes through a processor register. This transfer time
can be reduced by bypassing the CPU, since ports and memory are also
interconnected by the system bus. This is done with the support of a DMA (direct
memory access) controller. The DMA controller can take control of the buses, so
the CPU can be bypassed and a data item can be transferred from memory to a port,
or vice versa, in a single bus cycle.
• Buffering
We shall discuss various such I/O controllers interfaced with CPU and also the
techniques and rules by which they can be programmed to perform the required I/O
operation.
Some of such controllers are
We shall discuss all of them in detail and how they can be used to perform I/O
operations.
drive parameter block) and the FCBs (file control blocks), which collectively form
the directory structure.
To understand the file structure the basic requirement is the understanding of the
disk architecture, the disk formatting process and how this process divides the disk
into sectors and clusters.
• Real Mode
• Protected Mode
In real mode the processor can access only the first 1 MB of memory. To control
the memory within this range the DOS operating system makes use of some data
structures called
We shall discuss how these data structures can be accessed directly and what the
significance of the data in them is. This information can be used to traverse
the memory occupied by the processes and also to calculate the total amount
of free memory available. Certain operating systems operate in protected mode. In
protected mode all of the memory interfaced with the processor can be accessed.
Operating systems in this mode make use of various data structures for memory
management which are
We will discuss the significance of these data structures and the information
stored in them. We will also see how logical addresses can be translated into
physical addresses using the information in these tables.
• Programmed I/O
In programmed I/O the CPU sits in a constant loop checking for an I/O
opportunity, and when one is available it performs the computations required
for the I/O operation. As I/O devices are generally slower than the CPU, the CPU
has to wait for each I/O operation to complete before the next data item can be
sent to the device. The CPU sends data on the data lines. The device needs to be
signaled that the data has been sent; this is done with the help of the STROBE
signal. An electrical pulse is sent to the device by turning this signal to 0 and
then 1. On getting the strobe signal the device receives the data and starts its
output. While the device is performing the output it is busy and cannot accept any
further data. The CPU, on the other hand, is a lot faster and could process many
more bytes during the output of the previously sent data, so it must be
synchronized with the slower I/O device.
This is usually done by another feedback signal, BUSY, which is kept active as
long as the device is busy. The CPU simply waits for the device to become idle by
checking the BUSY signal; when the device becomes idle the CPU computes the next
data item and sends it to the device. Input is similar: the CPU has to check the
DR (Data Ready) signal to see whether data is available for input, and while it
is not, the CPU busy-waits for it.
The main disadvantage of programmed I/O, as can be noticed, is that the CPU
busy-waits for an I/O opportunity and as a result remains tied up for that I/O
operation. This disadvantage can be overcome by means of interrupt driven I/O. In
programmed I/O the CPU itself checks for an I/O opportunity, but in interrupt
driven I/O the I/O controller interrupts the execution of the CPU whenever an I/O
operation is required, so that the CPU can perform the computation needed for that
operation. This way the CPU can perform other computations and is interrupted to
execute an interrupt service routine only when an I/O operation is required, which
is a far more efficient technique.
When data needs to be transferred from main memory to an I/O port, this can be
done using the CPU, which consumes 2 bus cycles for a single word: one bus cycle
from memory to the CPU and another from the CPU to the I/O port in case of output,
and vice versa in case of input. If no computation on the data is required, the
CPU can be bypassed and another device, the DMA (direct memory access) controller,
can be used. With DMA it is possible to transfer a data word directly from memory
to an I/O port, or vice versa, in a single bus cycle, which is faster.
Revision questions
Example. Outline any three methods that you can use to perform I/O.
Solution:
• Programmed I/O
• Interrupt driven I/O
• DMA driven I/O
EXERCISE 1. ...
Course Journals
1. Acta Informatica ISSN 0001-5903
LESSON 2
Interrupt mechanisms
2.1. introduction
Interrupts follow a certain mechanism for their invocation, just like near or
far procedures. To understand this mechanism we need to understand how it differs
from procedure calls.
Difference between interrupt and procedure calls
Procedures, functions or sub-routines are called by different methods in
different languages, as can be seen in the examples.
• Call MyProc
• A= Addition(4,5);
• Printf(“hello world”);
The general concept of a procedure call in most programming languages is that on
invocation of the procedure the parameter list and the return address (the value
of the IP register for a near procedure, or the values of the CS and IP registers
for a far procedure) are pushed onto the stack. Moreover, in most programming
languages the address of a procedure has to be specified by some notation whenever
it is called; in C, for example, a procedure is called by its name, which
effectively serves as its address.
In the case of interrupts, however, a number is used in the call to specify which
interrupt to invoke
• Int 21h
• Int 10h
• Int 3
Moreover, when an interrupt is invoked three registers are pushed as the return
address, i.e. the values of Flags, CS and IP, pushed in that order, which are
restored on return. Also, no parameters are pushed onto the stack on invocation;
parameters can only be passed through registers.
Moreover, it is important to understand the meaning of the four bytes within an
interrupt vector. Each entry within the IVT contains a far address, the first two
bytes (lower word) of which are the offset and the next two bytes (higher word)
the segment address.
Generally there are three kinds of ISR within a system, depending upon the entity
which implements them
• BIOS ISRs
• DOS ISRs
• ISRs provided by third party device drivers
When the system has booted up and applications can be run, all these kinds of
ISRs may be provided by the system. Those provided by the ROM-BIOS are typically
resident at locations after the address F000:0000H, because this is the address in
memory where the ROM-BIOS starts; the ISRs provided by DOS are resident in the DOS
kernel (mainly IO.SYS and MSDOS.SYS loaded in memory); and the ISRs provided by
third party device drivers are resident in the memory occupied by those device
drivers.
• Use lower two bytes of interrupt Vector as offset and move into IP
• Use the higher two bytes of Vector as Segment Address and move it into
CS=0:[offset+2]
• Return to Point of Interruption by Popping the 6 bytes, i.e. IP, CS and Flags.
Moreover, the instruction INT 21H can be assembled and executed in the debug
program; when the instruction is traced through, the result can be monitored. It
can be seen that on execution of this instruction the value of IP changes to 107CH
and the value of CS changes to 00A7H, which causes execution to branch to the ISR
of interrupt 21H in memory. The previous values of the Flags, CS and IP registers
are temporarily saved onto the stack: the value of SP is reduced by 6, and a dump
at the location SS:SP will show these saved values.
register AH, AL, BH, BL, CH, CL, DH, DL. These structures are combined in a union
such that the field ax can be accessed to load a value and its half components al
and ah can also be accessed individually. The declaration of this union is given
below. A programmer who wants to use this union need not write the declaration, as
it is already available through the header file
“dos.h”
struct full
{
unsigned int ax;
unsigned int bx;
unsigned int cx;
unsigned int dx;
};
struct half
{
unsigned char al;
unsigned char ah;
unsigned char bl;
unsigned char bh;
unsigned char cl;
unsigned char ch;
unsigned char dl;
unsigned char dh;
};
union REGS
{
struct full x;
struct half h;
};
This union can be used to refer to any of the full or half general purpose
registers: accessing the field ax in the x struct has the same effect as accessing
the fields al and ah in h, as shown in the example below.
Example:
#include<DOS.H>
union REGS regs;
CX-DX specify the number of bytes to move; a double word is needed to specify
this value as the size of a file in DOS can be up to 2 GB.
On return of the service DX-AX will contain the new position of the file pointer
measured from the beginning of the file, e.g. if the file pointer is moved zero
bytes relative to the EOF, DX-AX on return will contain the size of the file.
Revision questions
EXERCISE 2. ....
LESSON 3
Registry service
3.1. introduction
The above described service can be used to get the size of a file in the described
manner. The following C language program tries to accomplish just that. This
program has been saved as .C file and not as .CPP file and then compiled.
Example 21H/42H:
#include<stdio.h>
#include<fcntl.h>
#include<io.h>
#include<BIOS.H>
#include<DOS.H>
unsigned int handle;
void main()
{
union REGS regs;
unsigned long int size;
handle = open("c:\\abc.txt",O_RDONLY); /* DOS file handle */
regs.x.bx = handle; /* BX = file handle */
regs.h.ah = 0x42; /* service 42H: move file pointer */
regs.h.al = 0x02; /* move relative to EOF */
regs.x.cx = 0; /* CX-DX = 0 bytes to move */
regs.x.dx = 0;
int86(0x21,&regs,&regs);
*((int*)(&size)) = regs.x.ax; /* low word of the size */
*(((int*)(&size))+1) = regs.x.dx; /* high word of the size */
printf ("Size is %lu", size);
}
This program opens a file and saves its handle in the handle variable. This handle
is passed to the ISR 21H/42H along with the move technique, whose value of 2
signifies movement relative to the EOF, and the number of bytes to move is
specified as zero, indicating that the pointer should move to the EOF. As the file
was just opened, the previous location of the file pointer will be BOF.
On return of this service DX-AX will contain the size of the file. The low word of
this size in ax is placed in the low word of size variable and the high word in dx is
placed in the high word of size variable.
Another Example: Let's now illustrate how an ISR can be invoked by means of
another example, this time a BIOS service. Here we are choosing the ISR 10h/01h.
Interrupt 10h is used to perform I/O on the monitor, and its service 01h is used
to change the size of the cursor in text mode. The description of this service is
given below.
Int # 10H Service # 01H
Entry
AH = 01 CH = Beginning Scan Line
CL = Ending Scan Line
On Exit
Unchanged
The size of the cursor depends upon the number of net scan lines used to display
it; if the beginning scan line is greater than the ending scan line, the cursor
will disappear. The following program tries to accomplish just that
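#include<stdio.h>
#include<DOS.H>
void main()
{
char st[80];
union REGS regs;
regs.h.ah = 0x01; /* service 01H: set cursor size */
regs.h.ch = 0x01; /* beginning scan line */
regs.h.cl = 0x00; /* ending scan line */
int86(0x10,&regs,&regs);
gets(st); /* wait for input so the effect can be seen */
}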
The program is quite self explanatory: it sets the starting scan line to 1 and
the ending scan line to 0, so when the service executes the cursor will disappear.
Use of ISRs for C Library functions
There are various library functions that a programmer would typically use in a
program to perform input output operations. These library functions perform
trivial I/O operations like character output (putch()) and character input
(getch(), getc() etc).
All these functions call various ISRs to perform this I/O. In the BIOS and DOS
documentation a number of services can be found that correspond to some C library
function in terms of functionality.
Writing S/W ISRs
Let's now see how a programmer can write an ISR and what needs to be done in
order to make the service work properly. To exhibit this we will make use of an
interrupt which is not used by DOS or BIOS, so that our experiment does not
interfere with the normal functions of DOS and BIOS. One such interrupt is
interrupt # 65H. The vector of int 65H is typically filled with zeros, another
indication that it is not being used.
Getting interrupt vector
As we have discussed earlier, the IVT is a table containing 4 byte entries, each
of which is a far address of an interrupt service routine. All the vectors are
arranged serially such that the interrupt number can be used as an index into the
IVT. Getting an interrupt vector refers to the operation of reading the far
address stored within the vector. The vector is a double word, the lower word of
it being the offset address and the higher word being the segment address.
Moreover, the address read from a vector can be used as a function pointer.
The C library function used to do the exactly
pointed by funcptr by the statement (*funcptr)(); is called and then the original
myfunc() is called. The user will observe that in both cases the same function
myfunc() is invoked.
...
}
setvect(0x08, newint);
C program making use of Int 65H
Here is a listing of a program that makes use of int 65H to exhibit how software
interrupts need to be programmed.
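#include<DOS.H>
void interrupt (*oldint65)( );
char st[80] = {"Hello World$"};
void interrupt newint65(void);
void main()
{
oldint65 = getvect(0x65); /* save the original vector */
setvect(0x65, newint65); /* install our ISR */
geninterrupt (0x65);
geninterrupt (0x65);
geninterrupt (0x65);
setvect(0x65, oldint65); /* restore the original vector */
}
void interrupt newint65( )
{
_AH = 0x09; /* DOS service 09H: display a '$' terminated string */
_DX=(unsigned int)st; /* DX = offset of the string */
geninterrupt (0x21);
}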
The above listing saves the address of the original int 65H in the pointer
oldint65. It then places the address of its own function newint65 at the vector of
interrupt number 65H. From this point onwards, whenever int 65H is invoked the
function newint65 will be invoked. Int 65H is invoked thrice, which will force the
newint65 function to be invoked thrice accordingly. After this the original value
of the vector stored in oldint65 is restored. The newint65 function only displays
the string st. As interrupt 65H is invoked thrice, this string will be printed
thrice.
geninterrupt (0x21);
}
The main() function gets and sets the vector of int 65H such that the address of
newint65 is placed at its vector. In this case the program is made memory resident
using the keep function, and 1000 paragraphs of memory are reserved for the
program (the number of paragraphs is just a rough estimate based upon the size of
the application). Now if any application, as in the following case, invokes int
65H, the string st, which is also now memory resident, will be displayed.
#include<BIOS.H>
#include<DOS.H>
void main()
{
geninterrupt (0x65);
geninterrupt (0x65);
}
This program invokes the interrupt 65H twice which has been made resident.
Revision questions
One deficiency in the above listing is that it is not good enough for other
applications, i.e. after the termination of this program the newint65 function is
de-allocated from memory, so the interrupt vector needs to be restored; otherwise
the vector will be left dangling.
EXERCISE 3. ....
LESSON 4
Port Programming in I/O
4.1. introduction
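#include<stdio.h>
#include<DOS.H>
FILE *fptr;
/* 0040:0008 in the BIOS data area holds the base port address of LPT1 */
unsigned far *base=(unsigned int far *)0x00400008;
void main (void)
{
fptr=fopen("c:\\abc.txt","rb");
while( ! feof (fptr) )
{
if( ! ( inport (*base + 1 ) & 0x80) ) /* status port: busy bit */
{
outport(*base,getc(fptr)); /* data port: next byte of the file */
outport(*base+2,inport(*base+2) | 0x01); /* control port: strobe bit to 1 */
outport(*base+2,inport(*base+2) & 0xFE); /* control port: strobe bit to 0 */
}
}
}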
The above program directly accesses the registers of the PPI to print a file. The
while loop terminates when the file ends. The if statement only checks whether the
printer is busy or not. If the printer is idle the program writes the next byte of
the file to the data port and then turns the strobe bit to 1 and then 0 to
indicate that a byte has been sent to the printer.
The loop then starts checking the busy status of the printer again and the
process continues.
Revision questions
FILE *fptr;
unsigned far *base=(unsigned int far *)0x00400008;
void main (void)
{
fptr=fopen("c:\\abc.txt","rb");
while( ! feof (fptr) )
{
if( ! ( inport (*base + 1 ) & 0x80) )
{
outport(*base,getc(fptr));
outport(*base+2,inport(*base+2) | 0x01);
outport(*base+2,inport(*base+2) & 0xFE);
}
}
}
EXERCISE 4. .....
LESSON 5
System calls in unix
5.1. introduction
A system call is just what its name implies – a request for the operating system
to do something on behalf of the user’s program. The system calls are functions
used in the kernel itself. To the programmer, the system call appears as a normal C
function call.
However since a system call executes code in the kernel, there must be a mechanism
to change the mode of a process from user mode to kernel mode. The C compiler
uses a predefined library of functions (the C library) that have the names of the
system calls. The library functions typically invoke an instruction that changes the
process execution mode to kernel mode and causes the kernel to start executing
code for system calls. The instruction that causes the mode change is often referred
to as an "operating system trap" which is a software generated interrupt.
The library routines execute in user mode, but the system call interface is a
special case of an interrupt handler. The library functions pass the kernel a
unique number per system call in a machine dependent way – either as a parameter
to the operating system trap, in a particular register, or on the stack – and the
kernel thus determines the specific system call the user is invoking. In handling
the operating system trap, the kernel looks up the system call number in a table
to find the address of the appropriate kernel routine that is the entry point for
the system call and to find the number of parameters the system call expects.
The kernel calculates the (user) address of the first parameter to the system call by
adding (or subtracting, depending on the direction of stack growth) an offset to the
user stack pointer, corresponding to the number of the parameters to the system
call.
Finally, it copies the user parameters to the "u area" and calls the appropriate system
call routine. After executing the code for the system call, the kernel determines
whether there was an error. If so, it adjusts register locations in the saved user
register context, typically setting the "carry" bit for the PS (processor status) register
and copying the error number into register 0 location. If there were no errors in the
execution of the system call, the kernel clears the "carry" bit in the PS register
and copies the appropriate return values from the system call into the locations for
registers 0 and 1 in the saved user register context. When the kernel returns from
the operating system trap to user mode, it returns to the library instruction after the
trap instruction. The library interprets the return values from the kernel and returns
a value to the user program.
UNIX system calls are used to manage the file system, control processes, and to
provide interprocess communication. The UNIX system interface consists of about
80 system calls (as UNIX evolves this number will increase).
[NOTE: The system call interface is that aspect of UNIX that has changed the most
since the inception of the UNIX system. Therefore, when you write a software tool,
you should protect that tool by putting system calls in other subroutines within your
program and then calling only those subroutines. Should the next version of the
UNIX system change the syntax and semantics of the system calls you’ve used, you
need only change your interface routines.]
When a system call discovers an error, it returns -1 and stores the reason the
call failed in an external variable named "errno". The "/usr/include/errno.h" file
maps these error numbers to manifest constants, and it is these constants that you
should use in your programs.
When a system call returns successfully, it returns something other than -1, but it
does not clear "errno". "errno" only has meaning directly after a system call that
returns an error.
When you use system calls in your programs, you should check the value returned
by those system calls. Furthermore, when a system call discovers an error, you
should use the "perror()" subroutine to print a diagnostic message on the standard
error file that describes why the system call failed. The syntax for "perror()" is:
void perror(string)
char *string;
"perror()" displays the argument string, a colon, and then the error message, as
directed by "errno", followed by a newline. The output of "perror()" is displayed
on "standard error". Typically, the argument given to "perror()" is the name of the
program that incurred the error, argv[0]. However, when using subroutines and
system calls on files, the related file name might be passed to "perror()".
There are occasions where you the programmer might wish to maintain more con-
trol over the printing of error messages than "perror()" provides – such as with a
formatted screen where the newline printed by "perror()" would destroy the for-
matting. In this case, you can directly access the same system external (global)
variables that "perror()" uses. They are:
Revision questions
A request for the operating system to do something on behalf of the user’s program.
The system calls are functions used in the kernel itself. To the programmer, the
system call appears as a normal C function call.
EXERCISE 5. ....
LESSON 6
Unix file system
6.1. introduction
A file system is a logical method for organising and storing large amounts of
information in a way which makes it easy to manage. The file is the smallest unit
in which information is stored. The UNIX file system has several important
features. To you, the user, it appears as though there is only one type of file in
UNIX - the file which is used to hold your information. In fact, the UNIX
filesystem contains several types of file.
(a) This type of file is used to store your information, such as some text
you have written or an image you have drawn. This is the type of file
that you usually work with. Files which you create belong to you - you
are said to "own" them - and you can set access permissions to control
which other users can have access to them. Any file is always contained
within a directory.
2. Special files
(a) This type of file is used to represent a real physical device such as a
printer, tape drive or terminal. It may seem unusual to think of a physical
device as a file, but it allows you to send the output of a command to a
device in the same way that you send it to a file. For example:
3. Directories
(a) A directory is a file that holds other files and other directories. You can
create directories in your home directory to hold files and other sub-
directories. Having your own directory structure gives you a definable
place to work from and allows you to structure your information in a
way that makes best sense to you. Directories which you create belong
to you - you are said to "own" them - and you can set access permissions
to control which other users can have access to the information they
contain.
4. Pipes
(a) UNIX allows you to link commands together using a pipe. The pipe acts
as a temporary file which only exists to hold data from one command
until it is read by another.
• Home directory
• Pathnames
• Pathnames
Every file and directory in the file system can be identified by a complete list of the
names of the directories that are on the route from the root directory to that file or
directory. Each directory name on the route is separated by a / (forward slash).
For example: /usr/local/bin/ue
This gives the full pathname starting at the root directory and going down through
the directories usr, local and bin to the file ue - the program for the MicroEMACS
editor. You can picture the full pathname as looking like this:
• Directory Structure:
Unix uses a hierarchical file system structure, much like an upside-down tree, with
root (/) at the base of the file system and all other directories spreading from there.
A UNIX filesystem is a collection of files and directories that has the following
properties:
• It has a root directory (/) that contains other files and directories.
• Each file or directory is uniquely identified by its name, the directory in which
it resides, and a unique identifier, typically called an inode.
• By convention, the root directory has an inode number of 2 and the lost+found
directory has an inode number of 3. Inode numbers 0 and 1 are not used. File
inode numbers can be seen by specifying the -i option to ls command.
The directories have specific purposes and generally hold the same types of infor-
mation for easily locating files. Following are the directories that exist on the major
versions of Unix:
additional
• /var Typically contains variable-length files such as log and print files and any
other type of file that may contain a variable amount of data
extras
You can use Manpage Help to check complete syntax for each command mentioned
here.
Some of the directories, such as /devices, show 0 in the kbytes, used, and avail
columns as well as 0% for capacity. These are special (or virtual) file systems,
and although they reside on the disk under /, by themselves they do not take up
disk space.
The df -k output is generally the same on all Unix systems. Here’s what it usually
includes:
1. Filesystem
2. kbytes
3. used
4. avail
5. capacity
6. Mounted on
You can use the -h (human readable) option to display the output in a format that
shows the size in easier-to-understand notation.
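The columns listed above can be inspected for a single filesystem. The sketch below assumes a df that accepts the POSIX -P option (which keeps each filesystem on one line, making the output easier to parse) and, as in the text, the widely available -h option. The variable name root_capacity is illustrative.

```shell
# Sizes in kilobytes for the filesystem that holds /
df -k /

# The same report with human-readable sizes
df -h /

# With -P each filesystem occupies exactly one line, so the fifth field
# of the second line is the capacity column for /.
root_capacity=$(df -Pk / | awk 'NR == 2 {print $5}')
echo "root filesystem capacity: $root_capacity"
```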
Soft Limit: If the user exceeds the limit defined, there is a grace period that allows
the user to free up some space.
Hard Limit: When the hard limit is reached, regardless of the grace period, no
further files or blocks can be allocated.
There are a number of commands to administer quotas:
(Commands and their descriptions)
• quota
• edquota
– This is a quota editor. Users or Groups quota can be edited using this
command.
• quotacheck
– Scan a filesystem for disk usage, create, check and repair quota files
• setquota
• quotaon
– This announces to the system that disk quotas should be enabled on one
or more filesystems.
• quotaoff
– This announces to the system that disk quotas should be disabled on
one or more filesystems.
• repquota
– This prints a summary of the disk usage and quotas for the specified file
systems.
REVISION QUESTIONS
It has a root directory (/) that contains other files and directories.
Each file or directory is uniquely identified by its name, the directory in which it
resides, and a unique identifier, typically called an inode.
By convention, the root directory has an inode number of 2 and the lost+found
directory has an inode number of 3. Inode numbers 0 and 1 are not used. File inode
numbers can be seen by specifying the -i option to ls command.
It is self contained. There are no dependencies between one filesystem and any
other.
EXERCISE 6. ....
LESSON 7
System processes
7.1. Introduction
When you execute a program on your UNIX system, the system creates a special
environment for that program. This environment contains everything needed for the
system to run the program as if no other program were running on the system.
Whenever you issue a command in UNIX, it creates, or starts, a new process. When
you tried out the ls command to list directory contents, you started a process.
A process, in simple terms, is an instance of a running program.
The operating system tracks processes through a numeric ID known as the pid or
process ID (traditionally up to five digits). Each process in the system has a
unique pid.
Pids eventually repeat because all the possible numbers are used up and the next
pid rolls or starts over. At any one time, no two processes with the same pid exist
in the system because it is the pid that UNIX uses to track each process.
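A quick way to see that every process gets its own pid is to compare the shell's pid with that of a child process. This is a minimal sketch assuming a POSIX shell, where the special parameter $$ expands to the pid of the shell itself; the variable names are illustrative.

```shell
# $$ expands to the process ID of the current shell.
shell_pid=$$

# A separate sh process is a new process, so it reports a different pid.
child_pid=$(sh -c 'echo $$')

echo "this shell: $shell_pid"
echo "child sh:   $child_pid"
```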
• Foreground Processes
• Background Processes
• Foreground Processes:
By default, every process that you start runs in the foreground. It gets its input from
the keyboard and sends its output to the screen.
You can see this happen with the ls command. To list all the files in the current
directory whose names start with ch and end with .doc, use the following command:
$ ls ch*.doc
This displays the matching files:
ch01-1.doc ch010.doc ch02.doc ch03-2.doc
ch04-1.doc ch040.doc ch05.doc ch06-2.doc
ch01-2.doc ch02-1.doc
The process runs in the foreground, the output is directed to the screen, and if the
ls command wanted any input (which it does not), it would wait for it from the
keyboard.
While a program is running in the foreground and taking a long time, we cannot run
any other commands (start any other processes), because the prompt is not available
until the program finishes.
• Background Processes:
A background process runs without being connected to your keyboard. If the back-
ground process requires any keyboard input, it waits.
The advantage of running a process in the background is that you can run other
commands; you do not have to wait until it completes to start another!
The simplest way to start a background process is to add an ampersand (&) at the
end of the command.
$ ls ch*.doc &
This also displays all the files whose names start with ch and end with .doc:
ch01-1.doc ch010.doc ch02.doc ch03-2.doc
ch04-1.doc ch040.doc ch05.doc ch06-2.doc
ch01-2.doc ch02-1.doc
Here, if the ls command wanted any input (which it does not), it would go into a
stopped state until it is moved into the foreground and given the data from the
keyboard.
When a job is started in the background, the shell first prints a line containing
information about the background process: the job number and the process ID. You
need to know the job number to move the job between background and foreground.
If you press the Enter key now, you see the following:
[1] + Done ls ch*.doc &
$
The first line tells you that the ls background process finished successfully.
The second is a prompt for another command.
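The same pattern can be followed in a script. The sketch below assumes a POSIX shell: $! holds the pid of the most recent background command, kill -0 only tests whether a process exists (it sends no signal), and wait blocks until the job finishes and returns its exit status. A one-second sleep stands in for a long-running command.

```shell
# Start a stand-in for a long-running command in the background.
sleep 1 &
bg_pid=$!                       # pid of the most recent background command

# The shell is free to do other work while the job runs.
if kill -0 "$bg_pid" 2>/dev/null; then
    still_running=yes
else
    still_running=no
fi
echo "background pid $bg_pid, running: $still_running"

# Block until the background job completes and collect its exit status.
wait "$bg_pid"
bg_status=$?
echo "background job finished with status $bg_status"
```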
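The columns of a full process listing (ps -f) are:
1. UID: User ID that this process belongs to (the person running it).
2. PID: Process ID.
3. PPID: Parent process ID (the ID of the process that started it).
4. C: CPU utilization of the process.
5. STIME: Process start time.
6. TTY: Terminal associated with the process.
7. TIME: CPU time taken by the process.
8. CMD: The command that started this process.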
However, sometimes the parent process is killed before its child is killed. In this
case, the "parent of all processes," the init process, becomes the new parent, and
its pid becomes the child's PPID (parent process ID). Such processes are sometimes
called orphan processes.
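The parent link itself can be observed from the shell. This is a minimal sketch assuming a POSIX shell, which sets the variable PPID to the pid of the process that started it; the temporary file and variable names are illustrative. When the snippet is run from an ordinary script or interactive shell, the two numbers printed match.

```shell
parent_pid=$$                        # pid of this shell

# Start a child sh and have it report its parent's pid via a temporary file.
ppid_file=$(mktemp)
sh -c 'echo $PPID' > "$ppid_file"
reported_ppid=$(cat "$ppid_file")
rm -f "$ppid_file"

echo "this shell's pid:      $parent_pid"
echo "child's reported PPID: $reported_ppid"
```

If the parent were to exit while the child was still running, the child's PPID would change to that of the process that adopts it.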
When a process is killed, a ps listing may still show the process with a Z state. This
is a zombie, or defunct, process. The process is dead and not being used. Zombie
processes are different from orphan processes: a zombie is a process that has
completed execution but still has an entry in the process table.
In addition, a job can consist of multiple processes running in series or at the
same time, in parallel, so using the job ID is easier than tracking the individual
processes.
Revision questions
Daemons are system-related background processes that often run with the permissions
of root and service requests from other processes.
EXERCISE 7. ...
LESSON 8
The network utilities
8.1. Introduction
When you work in a distributed environment, you need to communicate with
remote users and to access remote UNIX machines.
Several Unix utilities are especially useful for users computing in
a networked, distributed environment. This lesson lists a few of them:
Syntax:
The simple syntax of the ping command is:
$ ping hostname or ip-address
The command prints a response every second. To stop it, press CTRL + C.
Example:
The following example checks the availability of a host on the network:
$ping google.com
PING google.com (74.125.67.100) 56(84) bytes of data.
64 bytes from 74.125.67.100: icmp_seq=1 ttl=54 time=39.4 ms
64 bytes from 74.125.67.100: icmp_seq=2 ttl=54 time=39.9 ms
64 bytes from 74.125.67.100: icmp_seq=3 ttl=54 time=39.3 ms
64 bytes from 74.125.67.100: icmp_seq=4 ttl=54 time=39.1 ms
64 bytes from 74.125.67.100: icmp_seq=5 ttl=54 time=38.8 ms
--- google.com ping statistics ---
• Navigate directories.
Syntax:
The simple syntax of the ftp command is:
$ ftp hostname or ip-address
The command prompts you for a login ID and password. Once you are
authenticated, you have access to the home directory of the login account
and can run various commands.
A few of the useful commands are listed below:
3. mput file list: Upload more than one file from the local machine to the remote
machine.
4. mget file list: Download more than one file from the remote machine to the local
machine.
5. prompt off: Turns the prompt off. By default you are prompted to confirm each
file uploaded or downloaded with the mput or mget commands.
7. dir: List all the files available in the current directory of the remote machine.
Note that files are downloaded to, and uploaded from, the current directories. If
you want to upload your files to a particular directory, first change to that
directory and then upload the required files.
•.2. Example:
Revision Questions
EXERCISE 8. ...
LESSON 9
Systems performance
9.1. Introduction
The purpose of this lesson is to introduce the performance analyst to some of the
free tools available to monitor and manage performance on UNIX systems, and
to provide a guideline on how to diagnose and fix performance problems in a Unix
environment.
UNIX has the following major resource types that need to be monitored and tuned:
• CPU
• Memory
• Disk space
• Communications lines
• I/O Time
• Network Time
• Applications programs
1. User state CPU: The actual amount of time the CPU spends running the user's
program in the user state. It includes time spent executing library calls, but
does not include time spent in the kernel on its behalf.
2. System state CPU: This is the amount of time the CPU spends in the system
state on behalf of this program. All I/O routines require kernel services. The
programmer can affect this value by the use of blocking for I/O transfers.
3. I/O Time and Network Time: These are the amounts of time spent moving
data and servicing I/O requests.
5. Application Program: Time spent running other programs, when the system
is not servicing this application because another application currently has the
CPU.
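The split between user state and system state CPU time can be observed with the shell's times builtin, which reports accumulated user and system times for the shell and for its child processes. The sketch below is illustrative only: the loop count and the dd parameters are arbitrary assumptions, chosen so that the loop spends its time mostly in user state while dd, copying one byte per request, spends much of its time in the kernel.

```shell
# Mostly user-state work: arithmetic performed inside the shell itself.
i=0
while [ "$i" -lt 20000 ]; do
    i=$((i + 1))
done

# Mostly system-state work: 20000 one-byte reads and writes, each a kernel call.
dd if=/dev/zero of=/dev/null bs=1 count=20000 2>/dev/null

# times prints two lines: user and system time for the shell,
# then user and system time for its children (here, dd).
times_file=$(mktemp)
times > "$times_file"
cat "$times_file"
times_lines=$(wc -l < "$times_file" | tr -d ' ')
rm -f "$times_file"
```

The output is redirected to a file rather than captured with command substitution, because a substitution runs in a subshell whose own times start from zero.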
Revision questions
Example. Outline the resource types in UNIX that should be managed and
controlled.
Solution:
• CPU
• Memory
• Disk space
• Communications lines
• I/O Time
• Network Time
• Applications programs
EXERCISE 9. ....
LESSON 10
User administration in unix
10.1. Introduction
There are three types of accounts on a Unix system:
1. Root account: This is also called the superuser and has complete and
unfettered control of the system. A superuser can run any command without
any restriction. This user should be regarded as the system administrator.
2. System accounts: System accounts are those needed for the operation of
system-specific components for example mail accounts and the sshd accounts.
These accounts are usually needed for some specific function on your system,
and any modifications to them could adversely affect the system.
3. User accounts: User accounts provide interactive access to the system for
users and groups of users. General users are typically assigned to these ac-
counts and usually have limited access to critical system files and directories.
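1. /etc/passwd: Keeps user account and password information. This file holds
the majority of information about accounts on the Unix system.
2. /etc/shadow: Holds the encrypted password of the corresponding account. Not
all systems support this file.
3. /etc/group: This file contains the group information for each account.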
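Each line of /etc/passwd describes one account as seven colon-separated fields: name, password placeholder, UID, GID, comment, home directory and login shell. The sketch below reads individual fields with awk; the helper name passwd_field is hypothetical, and the example assumes a local root entry exists in /etc/passwd, as it does on typical Unix systems.

```shell
# Hypothetical helper: print field number $2 of the /etc/passwd entry for user $1.
# Fields: 1 name, 2 password, 3 UID, 4 GID, 5 comment, 6 home, 7 shell
passwd_field() {
    awk -F: -v user="$1" -v n="$2" '$1 == user { print $n; exit }' /etc/passwd
}

root_uid=$(passwd_field root 3)
root_home=$(passwd_field root 6)
root_shell=$(passwd_field root 7)

echo "root: uid=$root_uid home=$root_home shell=$root_shell"
```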
You can use Manpage Help to check complete syntax for each command mentioned
here.
• -f: This option causes the command to exit with success status if the specified
group already exists. With -g, if the specified GID already exists, another
(unique) GID is chosen.
If you do not specify any parameter, the system uses default values.
The following example creates a developers group with default values, which is
acceptable for most administrators.
$ groupadd developers
If you do not specify any parameter, the system uses default values. The
useradd command modifies the /etc/passwd, /etc/shadow, and /etc/group files and
creates a home directory.
The following example creates an account mcmohd, setting its home
directory to /home/mcmohd and its group to developers.
This user has the Korn Shell assigned to it.
$ useradd -d /home/mcmohd -g developers -s /bin/ksh mcmohd
Before issuing the above command, make sure the developers group has already
been created using the groupadd command.
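The check for an existing group can be scripted before the account is created. The sketch below is a dry run that only prints the commands it would issue, since groupadd and useradd must be run as root. The helper name group_exists is hypothetical, and the lookup assumes groups are defined locally in /etc/group rather than in a network directory service.

```shell
# Hypothetical helper: succeed if a group named $1 has an entry in /etc/group.
group_exists() {
    grep -q "^$1:" /etc/group
}

new_user=mcmohd
new_group=developers
create_user="useradd -d /home/$new_user -g $new_group -s /bin/ksh $new_user"

# Dry run: build the command line instead of executing it.
if group_exists "$new_group"; then
    plan="$create_user"
else
    plan="groupadd $new_group && $create_user"
fi

echo "would run (as root): $plan"
```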
Once an account is created, you can set its password using the passwd command as
follows:
$ passwd mcmohd
Changing password for user mcmohd.
New UNIX password:
Retype new UNIX password:
passwd: all authentication tokens updated successfully.
When you type passwd accountname, you can change that account's password
provided you are the superuser. Otherwise you can change only your own password,
using the same command without specifying an account name.
If you want to keep the user's home directory for backup purposes, omit the -r
option. You can remove the home directory as needed at a later time.
Revision questions
1. /etc/passwd: Keeps user account and password information. This file holds the
majority of information about accounts on the Unix system.
2. /etc/shadow: Holds the encrypted password of the corresponding account. Not
all systems support this file.
3. /etc/group: This file contains the group information for each account.