22CSE341-Module 2[1]


MODULE 2

Linux File Systems and Commands

File System and Attributes: Introduction to LINUX file system, inode, File Types, File
Attributes, Application program Interface to Files, LINUX kernel support for files.
File Handling Commands: ls, cat, cp, mv, rm, wc, od, printf, pwd, mkdir, rmdir, cd, file and
directory permissions-chmod, file ownership-chown, chgrp, umask, tar, gzip, du, df, find,
file modification and access times and touch command.

Laboratory Component: (minimum 3 experiments / programs)


1. Execute the "ls" command to display comprehensive file attributes with all available
options, view the file's contents, perform file copying and moving operations between
locations, and subsequently remove the file.
2. Execute the following directory-related commands: create a new directory, navigate
between directories, print the current directory path, check disk space usage, compress file
content, and archive files.
3. Identify commands for adjusting user, group, and others' permissions using symbolic
and octal notation, create files using the "touch" command, modify access and
modification timestamps, and alter default permissions for files or directories using
"umask".

Self-study / Case study:


1. File System Overview: Research and understand the structure and organization of the
Linux file system, including its hierarchy and key directories.
2. Inodes Explained: Dive deep into the concept of inodes and how they are used to
manage files and directories in Linux.
3. File Types and Attributes: Explore different file types (e.g., regular files, directories,
symbolic links) and learn how to view and modify file attributes.
4. Application Program Interface: Study the Linux API for file operations,
including how application programs interact with files through system calls.
5. Kernel Support for Files: Research the role of the Linux kernel in managing
files and how it provides support for file operations
LINUX SYSTEM PROGRAMMING MODULE 2

Introduction to LINUX file system


The UNIX and POSIX File Systems

Figure 1: The Unix File System

• Files in UNIX or POSIX systems are stored in a tree-like hierarchical file system.
• The root of a file system is the root (“/”) directory.
• The leaf nodes of a file system tree are either empty directory files or other types of files.
• The absolute path name of a file consists of the names of all the directories, starting from the root,
that lead to the file.
• Ex: /usr/cse/a.out
• A relative path name may contain the “.” and “..” components. These are references to the current and parent
directories respectively.
• Ex: ../../.login denotes the .login file found 2 levels up from the current directory.
• A file name may not exceed NAME_MAX bytes and a path name may not exceed PATH_MAX bytes. Older
System V file systems set these limits to 14 and 1024 bytes; on Linux they are typically 255 and 4096 bytes.
• POSIX.1 defines the minimum values _POSIX_NAME_MAX and _POSIX_PATH_MAX in the <limits.h> header.
• For portability, file names should use only the following characters:
A to Z, a to z, 0 to 9, . (period), _ (underscore) and - (hyphen)
• A path name of a file is called a hard link.
• A file may be referenced by more than one path name if a user creates one or more hard links to the file
using the ln command.
ln /usr/foo/path1 /usr/prog/new/n1
• If the -s option is used, ln creates a symbolic (soft) link instead.

Directory            Content
/bin                 Holds the commonly used UNIX commands (binaries, hence the name bin).
/sbin and /usr/sbin  Hold commands that an ordinary user cannot execute but the system
                     administrator can.
/etc                 Contains the configuration files of the system. You can change a very
                     important aspect of system functioning by editing a text file in this directory.
/dev                 Contains all device files. These files do not occupy space on disk.
/lib and /usr/lib    Contain all library files in binary form. You link your C programs
                     with files in these directories.
/usr/include         Contains the standard header files used by C programs. The statement
                     #include <stdio.h> used in most C programs refers to the file stdio.h
                     in this directory.
/usr/share/man       This is where the man pages are stored.


• Users also work with their own files; they write programs, send and receive mail, and create temporary
files. These files are available in the second group:

Directory   Content
/tmp        The directory where users are allowed to create temporary
            files. These files are wiped away regularly by the system.
/var        The variable part of the file system. Contains all of your
            print jobs and your outgoing and incoming mail.
/home       On many systems, users' home directories are kept here. For example,
            a user named kumar would have the home directory /home/kumar.

The following files and directories are commonly defined in most UNIX systems:

File           Use
/etc           Stores system administrative files and programs
/etc/passwd    Stores all user information
/etc/shadow    Stores user passwords
/etc/group     Stores all group information
/bin           Stores all the system programs like cat, rm, cp, etc.
/dev           Stores all character device and block device files
/usr/include   Stores all standard header files
/usr/lib       Stores standard libraries
/tmp           Stores temporary files created by programs

UNIX and POSIX File Attributes


The general attributes of each file in a file system are:

1) File type - specifies what type of file it is.
2) Access permission - the file access permission for owner, group and others.
3) Hard link count - the number of hard links of the file.
4) Uid - the file owner's user id.
5) Gid - the file's group id.
6) File size - the file size in bytes.
7) Inode no - the system inode number of the file.
8) File system id - the id of the file system where the file is stored.
9) Last access time - the time the file was last accessed.
10) Last modified time - the time the file's data was last modified.
11) Last change time - the time the file's attributes were last changed.

UNIX command   System call   Attributes changed
chmod          chmod         Changes access permission, last change time
chown          chown         Changes UID, last change time
chgrp          chown         Changes GID, last change time
touch          utime         Changes last access time, last modification time
ln             link          Increases hard link count
rm             unlink        Decreases hard link count. If the hard link count reaches zero, the
                             file is removed from the file system
vi, emacs                    Changes the file size, last access time, last modification time
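
The effect of some of these commands can be observed directly. The following is a minimal sketch, assuming a Linux system with GNU stat (the -c format option); the scratch directory and the file names demo.txt and demo_ln are made up for illustration.

```shell
# Sketch: watch the hard link count and mode change (GNU stat assumed).
dir=$(mktemp -d)                        # hypothetical scratch directory
cd "$dir" || exit 1
echo "hello" > demo.txt

links_before=$(stat -c %h demo.txt)     # hard link count right after creation
ln demo.txt demo_ln                     # ln uses link(): count goes up
links_after_ln=$(stat -c %h demo.txt)
rm demo_ln                              # rm uses unlink(): count goes down
links_after_rm=$(stat -c %h demo.txt)

chmod 600 demo.txt                      # chmod changes the access permission
mode=$(stat -c %a demo.txt)

echo "links: $links_before -> $links_after_ln -> $links_after_rm, mode: $mode"
cd / && rm -r "$dir"                    # clean up
```

On such a system the last line should print "links: 1 -> 2 -> 1, mode: 600".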

File system and Attributes


Files are the building blocks of any operating system. When you execute a command in UNIX, the UNIX
kernel fetches the corresponding executable file from a file system, loads its instruction text into memory,
and creates a process to execute the command on your behalf. In the course of execution, a process may
read from or write to files. All these operations involve files. Thus, the design of an operating system always
begins with an efficient file management system.
File Types
A file in a UNIX or POSIX system may be one of the following types:
➢ regular file
➢ directory file
➢ FIFO file
➢ Character device file
➢ Block device file

❖ Regular file
▪ A regular file may be either a text file or a binary file
▪ These files may be read or written to by users with the appropriate access permission
▪ Regular files may be created, browsed through and modified by various means such as text editors or
compilers, and they can be removed by specific system commands
❖ Directory file
▪ It is like a folder that contains other files, including sub-directory files.
▪ It provides a means for users to organise their files into some hierarchical structure based on file
relationship or uses.
▪ Ex: /bin directory contains all system executable programs, such as cat, rm, sort
▪ A directory may be created in UNIX by the mkdir command
o Ex: mkdir /usr/foo/xyz
▪ A directory may be removed via the rmdir command
o Ex: rmdir /usr/foo/xyz
▪ The content of directory may be displayed by the ls command
❖ Device file
Block device file                       Character device file
It represents a physical device that    It represents a physical device that
transmits data a block at a time.       transmits data in a character-based manner.
Ex: hard disk drives and floppy disk    Ex: line printers, modems, and consoles
drives

▪ A physical device may have both block and character device files representing it for
different access methods.
▪ An application program may perform read and write operations on a device file, and the OS will
automatically invoke an appropriate device driver function to perform the actual data transfer between the
physical device and the application.
▪ An application program may in turn choose to transfer data in either a character-based manner (via the
character device file) or a block-based manner (via the block device file).
▪ A device file is created in UNIX via the mknod command
o Ex: mknod /dev/cdsk c 115 5

Here, c - character device file
115 - major device number
5 - minor device number
o For a block device file, use argument ‘b’ instead of ‘c’.
▪ Major device number: an index to a kernel table that contains the addresses of all device
driver functions known to the system. Whenever a process reads data from or writes data to a device file,
the kernel uses the device file’s major number to select and invoke a device driver function to carry out
the actual data transfer with a physical device.
▪ Minor device number: an integer value passed as an argument to a device driver
function when it is called. It tells the device driver function which physical device it is talking to and
the I/O buffering scheme to be used for data transfer.

❖ FIFO file
▪ It is a special pipe device file which provides a temporary buffer for two or more processes to
communicate by writing data to and reading data from the buffer.
▪ The size of the buffer is fixed to PIPE_BUF.
▪ Data in the buffer is accessed in a first-in-first-out manner.
▪ The buffer is allocated when the first process opens the FIFO file for read or write
▪ The buffer is discarded when all processes close their references (stream pointers) to the FIFO file.
▪ Data stored in a FIFO buffer is temporary.
▪ A FIFO file may be created via the mkfifo command.
o The following command creates a FIFO file (if it does not exist)
mkfifo /usr/prog/fifo_pipe
o The following mknod command does the same, using the p (pipe) argument
mknod /usr/prog/fifo_pipe p
▪ FIFO files can be removed using the rm command.
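
The behaviour described above can be tried with two cooperating processes. This is a minimal sketch, assuming a POSIX shell; the scratch directory and the message text are made up, and the writer is simply a background subshell.

```shell
# Sketch: one process writes to a FIFO, another reads from it.
dir=$(mktemp -d)                         # hypothetical scratch directory
fifo="$dir/fifo_pipe"
mkfifo "$fifo"                           # create the FIFO file

if [ -p "$fifo" ]; then is_fifo=yes; else is_fifo=no; fi

# The writer blocks in open() until a reader opens the FIFO.
( echo "hello through the pipe" > "$fifo" ) &

msg=$(cat "$fifo")                       # reader: data arrives first-in-first-out
wait                                     # let the background writer finish

echo "is_fifo=$is_fifo msg=$msg"
rm "$fifo"                               # a FIFO is removed like any other file
rmdir "$dir"
```

Nothing is left in the FIFO file after both sides close it, which matches the note that data in a FIFO buffer is temporary.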

❖ Symbolic link file


▪ BSD UNIX and System V Release 4 define a symbolic link file.
▪ A symbolic link file contains a path name which references another file in either the local or a remote file
system.
▪ The original POSIX.1 standard did not define the symbolic link file type; later revisions of the standard do.
▪ A symbolic link may be created in UNIX via the ln command
▪ Ex: ln -s /usr/divya/original /usr/raj/slink
▪ It is possible to create a symbolic link to reference another symbolic link.
▪ The rm and mv commands operate on the symbolic link itself and not on the file that it references.
On Linux, chmod follows the link and changes the permissions of the referenced file.

pwd: Present Working Directory


At any time you can determine where you are in the file system hierarchy with the pwd (present working
directory) command:
$ pwd
/home/pq/ab
What is seen above is the pathname, a sequence of directory names separated by slashes. These slashes act
as delimiters between file and directory names, except the first slash, which is a synonym for root.
Absolute Pathnames
A file in another directory can be accessed from your current directory using either an absolute pathname or
a relative pathname.
An absolute pathname is a path that starts at the root directory, which means it begins with the slash
(/) character. This pathname uses the root as the ultimate reference point. Example:

cd /home/user1/exam
No two files in a UNIX system can have identical pathnames. If two files have the same name, they must be in
different directories, which means their absolute pathnames are also different.

Relative Pathnames
With a relative pathname, the current directory is used as the point of reference and the path is specified
relative to it. UNIX allows the use of two symbols in pathnames that use the current and parent directory
as the reference point:
. (a single dot) This represents the current directory.
.. (two dots) This represents the parent directory.
Pathnames that begin with either of these symbols are known as relative pathnames. Example:
cd ./progs
which refers to the subdirectory progs under your current directory.
Any number of .. components separated by / can be combined. Each .. moves one level up instead of
down, so cd ../.. moves two levels up.
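
A short session makes the difference between . and .. concrete. This is only a sketch: the directory tree under the scratch directory is made up to mirror the /home/user1/exam example, and a POSIX shell is assumed.

```shell
# Sketch: navigating with relative pathnames inside a scratch tree.
base=$(mktemp -d)                                   # hypothetical root of the demo tree
mkdir -p "$base/home/user1/exam" "$base/home/user1/progs"

cd "$base/home/user1/exam" || exit 1                # absolute pathname
cd ../progs                                         # up to user1, then down into progs
sibling=$(pwd)
cd ../..                                            # two levels up: now in home
up_two=$(pwd)
cd ./user1                                          # "." is the current directory
down_one=$(pwd)

echo "$sibling"
echo "$up_two"
echo "$down_one"
cd / && rm -r "$base"                               # clean up
```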
ls: Listing Files
The command to list your directories and files is ls. With options it can provide information about the
size, type of file, permissions, and dates of file modification, status change and access.
Syntax

ls [options] [argument]

Common Options

ls -a Lists all files, including those beginning with a dot (.).
ls -d Lists only the name of a directory, not the files in the directory
ls -F Indicates type of entry with a trailing symbol: executables with *, directories with / and symbolic links
with @
ls -R Recursive list
ls -u Sorts filenames by last access time
ls -t Sorts filenames by last modification time
ls -i Displays inode number
ls -l Long listing: lists the mode, link count, owner, group, size and last
modification time of each file. If the file is a symbolic link, an arrow (->) precedes the pathname of the
linked-to file.
The ls command is used to obtain a list of all filenames in the current directory. ls -l looks up each file's
inode to fetch its attributes. It lists seven attributes of all files in the current directory:

ls -l
total 179
-rw-r--r-- 1 prami prami 0 Oct 18 2021 03061200
-rw-r--r-- 2 prami prami 84 Oct 18 2021 1.c
-rw-r--r-- 1 prami prami 0 Oct 18 2021 10061200
-rw-r--r-- 1 prami prami 216 Dec 20 2021 10a.sh
-rw-r--r-- 1 prami prami 208 Dec 13 2021 10a.sh.save
-rw-r--r-- 1 prami prami 240 Dec 27 2021 11a.sh

Type and Permissions- The first character of the first field shows the file type. The three most common
values are - (ordinary file), d (directory) and l (symbolic link); every entry above is an ordinary file. The
remaining nine characters form a string of permissions which can take the values r, w, x, and -.
Links- The second field indicates the number of links associated with the file. UNIX lets a file have
multiple names, and each name is interpreted as a link.
Ownership and Group Ownership- Every file has an owner. The third field shows prami as the owner of
the files. A user also belongs to a group, and the fourth field shows prami as the group owner of
the files.
Size- The fifth field shows the file size in bytes. This reflects the character count and not the disk
space consumption of the file.
Last Modification time- The sixth field displays the last modification time in three columns, a time stamp
that is stored to the nearest second.

Filename- The last field displays the filename, which can be up to 255 characters long.

Listing Directory Attributes (-ld): To see the attributes of a directory bar rather than the filenames it
contains, use ls -ld bar.
File Permissions
UNIX follows a three-tiered file protection system that determines a file‘s access rights.

Figure 2: Structure of a file’s permissions string

Each group here represents a category.

There are three categories representing the user (owner), group owner, and others.
Each category contains three slots representing the read, write, and execute permissions of the file.
r indicates read permission, which means the cat command can display the file.
w indicates write permission; such a file can be edited with an editor.
x indicates execute permission; the file can be executed as a program. The - shows the absence of the
corresponding permission.
For the permissions string rwxr-xr--, the first category (rwx) shows that the file is readable, writable, and
executable by the owner of the file. The second category (r-x) indicates the absence of write permission for
the group owner of the file. The third category (r--) applies to others (neither owner nor group owner), who
can only read the file.

chmod: Changing File Permissions


chmod command changes a file‘s permissions. The command uses the following syntax:
chmod [-R] mode file

Relative Permissions: chmod only changes the permissions specified in the command line and leaves the
other permissions unchanged. Its syntax is:

Figure 3: Structure of chmod command


Table 1: Abbreviations used by chmod

Examples: Initially,

-rw-r--r-- 1 kumar metal 1906 sep 23:38 xstart


chmod u+x xstart
-rwxr--r-- 1 kumar metal 1906 sep 23:38 xstart
The command assigns (+) execute (x) permission to the user (u), other permissions remain unchanged.
Absolute Permissions: When = is used to assign permissions, it is called absolute assignment.
Only the listed permissions are assigned to the named categories; their other permissions are removed. Initially,

-rwxr-xr-x 1 kumar metal 1906 sep 23:38 xstart

chmod go=r xstart
Then, it becomes
-rwxr--r-- 1 kumar metal 1906 sep 23:38 xstart
Octal Notation: A string of three octal digits is used to express the permission. The permission can be
represented by one octal digit for each category. For each category, we add octal digits.

Read permission = 4
Write permission = 2
Execute permission = 1

Octal Permissions Significance


0 --- no permissions
1 --x execute only
2 -w- write only
3 -wx write and execute
4 r-- read only
5 r-x read and execute
6 rw- read and write
7 rwx read, write and execute
To give read and write permission to all categories using relative permissions, we have,
chmod a+rw xstart

Using octal permissions, we have,

chmod 666 xstart
chmod recursive: chmod can be applied recursively with the -R option.

chmod -R a+x shell_scripts


This makes all the files and subdirectories found in the shell_scripts directory executable by all users.
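
The relative, absolute and octal forms can be compared side by side. The following is a minimal sketch, assuming GNU stat (stat -c %a prints the mode in octal); the file name xstart is reused from the examples above and lives in a made-up scratch directory.

```shell
# Sketch: the same file under relative (+), absolute (=) and octal chmod.
dir=$(mktemp -d)                              # hypothetical scratch directory
f="$dir/xstart"
touch "$f"

chmod 644 "$f"; chmod u+x "$f"                # relative: add x for the user only
rel=$(stat -c %a "$f")                        # 644 becomes 744

chmod 755 "$f"; chmod go=r "$f"               # absolute: group and others get exactly r
abs=$(stat -c %a "$f")                        # 755 becomes 744

chmod 644 "$f"; chmod a+rw "$f"               # relative: add rw for all categories
sym=$(stat -c %a "$f")                        # 644 becomes 666

chmod 666 "$f"                                # octal: rw- for all three categories
oct=$(stat -c %a "$f")

echo "u+x: $rel  go=r: $abs  a+rw: $sym  666: $oct"
rm -r "$dir"
```

Note that a+rw and 666 give the same result here only because the file had no execute bits set beforehand; the octal form always replaces the whole permission string.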
Directory Permissions
A file may be inaccessible even though it has read permission, and may be removed
even when it is write-protected, because these operations also depend on the permissions of the
directory that contains the file. The default permissions of a directory are,
rwx r-x r-x (755)
A directory should normally not be writable by group and others. Example:
$ mkdir c_progs
$ ls -ld c_progs
drwxr-xr-x 2 kumar metal 512 may 9 09:57 c_progs
When read, write or execute permission is absent, a directory behaves as follows:
• Without read permission
The ls command cannot list the directory, but files inside it can still be accessed by name if execute
permission is present.

• Without write permission
Files cannot be created or removed in the directory (cat >, rm, cp and mv fail), and subdirectories cannot be
created or removed (mkdir and rmdir fail).
• Without execute permission
The directory cannot be entered or searched (cd fails, as do cat, mkdir and rmdir on paths inside it).

umask: Default File Permissions


umask is a command that sets a mask controlling how file permissions are assigned to
newly created files. The name may also refer to the system call that sets the mask.
If the umask command is invoked without any arguments, it displays the current mask. The output is
in either octal or symbolic notation, depending on the OS.
Before the mask is applied, the system-wide default permissions for files and directories created by any user are:
• rw-rw-rw- (octal 666) for regular files
• rwxrwxrwx (octal 777) for directories
$ umask
0022

The mask is an octal number whose set bits are removed from the system default to obtain the actual default.
For a mask such as 022 this gives the same result as subtraction: 644 (666 - 022) for ordinary files and
755 (777 - 022) for directories. When you create a file on this system, it will have the permissions rw-r--r--.
A directory will have the permissions rwxr-xr-x.
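
The masking can be checked on a scratch directory. This is a minimal sketch, assuming GNU stat and a shell that accepts octal constants in arithmetic expansion; the names f1, d1 and f2 are made up. The second mask, 033, is chosen to show that the operation is a bitwise clearing of bits rather than true subtraction.

```shell
# Sketch: how the umask shapes the permissions of new files and directories.
dir=$(mktemp -d)                       # hypothetical scratch directory
cd "$dir" || exit 1
old_mask=$(umask)                      # remember the current mask

umask 022
touch f1; mkdir d1
file_mode=$(stat -c %a f1)             # 666 with the 022 bits cleared
dir_mode=$(stat -c %a d1)              # 777 with the 022 bits cleared

umask 033
touch f2
masked=$(stat -c %a f2)                # 644, not 633: bits are cleared, not subtracted
computed=$(printf '%o' $(( 0666 & ~0033 )))

echo "file=$file_mode dir=$dir_mode umask033=$masked computed=$computed"
umask "$old_mask"                      # restore the original mask
cd / && rm -r "$dir"
```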

File ownership
There are two commands meant to manipulate the ownership of a file or a directory: chown, which changes the
file owner, and chgrp, which changes the group owner.

• chown
$ ls -l note
-rwxr----x 1 kumar metal 347 may 10 20:30 note
$ chown sharma note; ls -l note
-rwxr----x 1 sharma metal 347 may 10 20:30 note
Once ownership of the file has been given away to sharma, the user file permissions that previously applied
to kumar now apply to sharma.

• chgrp
This command changes the file’s group owner.

Inodes in UNIX System V


▪ In UNIX System V, a file system has an inode table, which keeps track of all files. Each entry of the
inode table is an inode record which contains all the attributes of a file, including the inode # and the
physical disk address where the data of the file is stored.
▪ For any operation, if the kernel needs to access information of a file with inode # 15, it will scan the inode
table to find the entry which contains inode # 15 in order to access the necessary data.
▪ An inode # is unique within a file system. A file inode record is identified by a file system ID and an inode
#.
▪ Generally an OS does not keep the name of a file in its inode record, because the mapping of filenames to
inode # is done via directory files, i.e. a directory file contains a list of names and their respective inode #
for all files stored in that directory.
▪ Ex: a sample directory file content
Inode number   File name
115            .
89             ..
201            xyz
346            a.out
201            xyz_ln1
▪ To access a file, for example /usr/divya, the UNIX kernel always knows the “/” (root) directory inode # of
any process. It will scan the “/” directory file to find the inode number of the usr file. Once it gets the usr
file inode #, it accesses the contents of the usr file. It then looks for the inode # of the divya file.
▪ Whenever a new file is created in a directory, the UNIX kernel allocates a new entry in the inode table to
store the information of the new file.
▪ It will assign a unique inode # to the file and add the new file name and inode # to the directory file that
contains it.
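
The name-to-inode mapping kept in a directory file can be inspected with ls -i and stat. This is a minimal sketch, assuming GNU stat; the usr/divya names are borrowed from the example above but are created inside a made-up scratch directory, so the inode numbers printed will differ from the sample table.

```shell
# Sketch: a directory maps names to inode numbers; two names may share one inode.
dir=$(mktemp -d)                                  # hypothetical scratch directory
mkdir "$dir/usr"
echo "data" > "$dir/usr/divya"
ln "$dir/usr/divya" "$dir/usr/divya_ln1"          # second name for the same inode

ls -ai "$dir/usr"                                 # inode number and name of every entry

ino_file=$(stat -c %i "$dir/usr/divya")
ino_link=$(stat -c %i "$dir/usr/divya_ln1")
ino_dot=$(stat -c %i "$dir/usr/.")                # the "." entry inside usr
ino_usr=$(stat -c %i "$dir/usr")                  # the usr entry in its parent

rm -r "$dir"
```

Here divya and divya_ln1 report the same inode number, just as xyz and xyz_ln1 both map to 201 in the sample directory, and the "." entry carries the directory's own inode number.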
Application Program Interface to Files

The general interfaces to files on UNIX and POSIX systems are:
▪ Files are identified by pathnames.
▪ Files should be created before they can be used. The various commands and system calls to create files
are listed below.

File type            Commands          System calls
Regular file         vi, pico, emacs   open, creat
Directory file       mkdir             mkdir, mknod
FIFO file            mkfifo            mkfifo, mknod
Device file          mknod             mknod
Symbolic link file   ln -s             symlink

▪ Before an application can access a file, the file must be opened. Generally the open system call is used,
and the returned value is an integer termed the file descriptor.
▪ There is a limit on the number of files a process may have open: at most OPEN_MAX files can be open at
once. The value is defined in the <limits.h> header.
▪ Data transfer on an opened file is carried out by the read and write system calls.
▪ The hard link count of a file can be increased by the link system call and decreased by the unlink system call.
▪ File attributes can be changed by the chown, chmod and link system calls.
▪ File attributes can be queried (found out or retrieved) by the stat and fstat system calls.
▪ UNIX and POSIX.1 define a structure of data type stat, which is declared in the <sys/stat.h> header file. It
contains the user-accessible attributes of a file. The definition of the structure can differ among
implementations, but it could look like

struct stat
{
    dev_t   st_dev;     /* file system ID */
    ino_t   st_ino;     /* file inode number */
    mode_t  st_mode;    /* contains file type and permission */
    nlink_t st_nlink;   /* hard link count */
    uid_t   st_uid;     /* file user ID */
    gid_t   st_gid;     /* file group ID */
    dev_t   st_rdev;    /* contains major and minor device # */
    off_t   st_size;    /* file size in bytes */
    time_t  st_atime;   /* last access time */
    time_t  st_mtime;   /* last modification time */
    time_t  st_ctime;   /* last status change time */
};
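
Most of these members can be viewed from the shell with the stat command, which calls the stat family of system calls on the file. The following is a minimal sketch, assuming GNU stat and its -c format sequences; the labels in the format strings are added only to show which struct member each sequence corresponds to.

```shell
# Sketch: print the members of struct stat for a scratch file (GNU stat assumed).
f=$(mktemp)                                   # hypothetical scratch file
printf 'hello\n' > "$f"

# %d device, %i inode, %f raw mode in hex, %A mode as a permission string,
# %h hard links, %u uid, %g gid, %s size in bytes
stat -c 'st_dev=%d st_ino=%i st_mode=%f (%A) st_nlink=%h st_uid=%u st_gid=%g st_size=%s' "$f"

# %X, %Y, %Z: access, modification and status change times (seconds since the epoch)
stat -c 'st_atime=%X st_mtime=%Y st_ctime=%Z' "$f"

size=$(stat -c %s "$f")                       # 6 bytes: "hello" plus a newline
nlink=$(stat -c %h "$f")                      # a new file has one hard link
rm "$f"
```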

UNIX Kernel Support for Files


In UNIX System V, the kernel maintains a file table that has an entry for every opened file, and also an
inode table that contains a copy of the file inodes that were most recently accessed.
A process, which is created when a command is executed, has its own data space (data
structure) that holds its file descriptor table. The file descriptor table has a
maximum of OPEN_MAX entries. Whenever the process calls the open function to open a file for reading
or writing, the kernel resolves the pathname to the file inode number.
The steps involved are:
1. The kernel searches the process file descriptor table for the first unused entry. If an entry
is found, that entry is designated to reference the file. The index of the entry is returned to the
process as the file descriptor of the opened file.
2. The kernel scans the file table in its kernel space to find an unused entry that can be assigned
to reference the file.
If an unused entry is found, the following events occur:
▪ The process file descriptor table entry is set to point to this file table entry.
▪ The file table entry is set to point to the inode table entry where the inode record of the file is stored.
▪ The file table entry contains the current file pointer of the open file. This is an offset from the beginning
of the file where the next read or write will occur.
▪ The file table entry contains an open mode that specifies whether the file is opened for read only, write only
or read and write, etc. This is specified in the open function call.
▪ The reference count (rc) in the file table entry is set to 1. The reference count keeps track of how
many file descriptors from any process refer to the entry.
▪ The reference count of the in-memory inode of the file is increased by 1. This count specifies how many
file table entries point to that inode.
If either (1) or (2) fails, the open system call returns -1 (failure/error).

Data Structure for File Manipulation

Normally the reference count in the file table entry is 1. It can be increased with the fork, dup and dup2
system calls. When an open system call succeeds, its return value
is an integer (file descriptor). Whenever the process wants to read or write data from the file, it should
use the file descriptor as one of the arguments.
The following events occur whenever a process calls the close function to close an open file:
1. The kernel sets the corresponding file descriptor table entry to be unused.
2. It decrements the rc in the corresponding file table entry by 1; if the rc is not equal to 0, go to step 6.
3. The file table entry is marked as unused.
4. The rc in the corresponding file inode table entry is decremented by 1; if the rc is not equal to 0, go to
step 6.
5. If the hard link count of the inode is not zero, it returns to the caller with a success status; otherwise
it marks the inode table entry as unused and de-allocates all the physical disk storage of the file.
6. It returns to the process with a 0 (success) status.
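
The shared file table entry can be observed from the shell, whose exec redirections open, duplicate and close file descriptors. This is a minimal sketch, assuming a POSIX shell; descriptor numbers 5 and 6 and the three-line scratch file are arbitrary choices for illustration.

```shell
# Sketch: a duplicated descriptor shares the file pointer of the original.
f=$(mktemp)                            # hypothetical scratch file
printf 'first\nsecond\nthird\n' > "$f"

exec 5< "$f"                           # open the file for reading on descriptor 5
read -r line1 <&5                      # advances the file pointer past line 1

exec 6<&5                              # duplicate: 6 refers to the same file table entry
read -r line2 <&6                      # continues where descriptor 5 stopped

exec 5<&-                              # close 5; the entry is still referenced by 6
read -r line3 <&6                      # so reading through 6 still works

exec 6<&-                              # last reference closed
echo "$line1 / $line2 / $line3"
rm "$f"
```

Because the duplicate shares the file pointer, the three reads return the three lines in order rather than the first line three times.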

Directory Files
▪ It is a record-oriented file.
▪ Each record contains the information of a file residing in that directory.
▪ The record data type is struct dirent in UNIX System V and POSIX.1, and struct direct in BSD UNIX.
▪ The record content is implementation-dependent.
▪ They all contain 2 essential member fields
o File name
o Inode number
▪ Its usage is to map file names to the corresponding inode numbers.

Directory function   Purpose
opendir              Opens a directory file
readdir              Reads the next record from the file
closedir             Closes a directory file
rewinddir            Sets the file pointer to the beginning of the file

Hard and Symbolic Links


▪ A hard link is a UNIX pathname for a file. Most UNIX files have only one
hard link.
▪ In order to create a hard link, we use the command ln.
Example: Consider a file /usr/divya/old; we can create a hard link to it by
ln /usr/divya/old /usr/divya/new
After this we can refer to the file by either /usr/divya/old or /usr/divya/new.
▪ A symbolic link can be created by the same command ln but with the option -s
▪ Example:
▪ ln -s /usr/divya/old /usr/divya/new
▪ The ln command differs from the cp (copy) command in that cp creates a duplicate copy of a file under
a different pathname, whereas ln creates a new directory entry that references an existing file.
▪ Let’s visualize the content of a directory file after the execution of the ln command.
Case 1: for a hard link

ln /usr/divya/abc /usr/raj/xyz

The content of the directory files /usr/divya and /usr/raj are


Both /usr/divya/abc and /usr/raj/xyz refer to the same inode number 201, thus no new file is created.
Case 2: For the same operation, if the ln -s command is used, then a new inode will be created.
ln -s /usr/divya/abc /usr/raj/xyz
The content of the directory files divya and raj will be

Dept. Of CSE, NHCE 2019-2020

If the cp command were used, the data contents would be identical and the 2 files would be separate objects
in the file system, whereas with ln -s the data of the new file contains only the path name of the original.
Limitations of hard links:
1. A user cannot create hard links to directories unless he has super-user privileges.
2. A user cannot create a hard link to a file that resides on a different file system,
because an inode number is unique only within a file system.
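
The two cases can be contrasted in a scratch directory. This is a minimal sketch, assuming GNU stat and readlink; the names abc, hard and soft are made up for illustration.

```shell
# Sketch: a hard link shares the inode; a symbolic link only stores a path name.
dir=$(mktemp -d)                        # hypothetical scratch directory
cd "$dir" || exit 1
echo "payload" > abc
ln abc hard                             # Case 1: hard link
ln -s abc soft                          # Case 2: symbolic link

ino_abc=$(stat -c %i abc)
ino_hard=$(stat -c %i hard)             # same inode as abc
ino_soft=$(stat -c %i soft)             # stat does not follow the link: a new inode
target=$(readlink soft)                 # the data of the link is the path name

rm abc                                  # remove the original name
via_hard=$(cat hard)                    # data survives through the hard link
if [ -e soft ]; then dangling=no; else dangling=yes; fi   # the path no longer resolves

echo "target=$target via_hard=$via_hard dangling=$dangling"
cd / && rm -r "$dir"
```

After the original name is removed, the hard link still reaches the data because the inode's link count has not dropped to zero, while the symbolic link is left pointing at a path that no longer exists.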

Differences between hard link and symbolic link are listed below:

File Handling Commands:


wc: Line, Word and character counting

This command counts lines, words and characters depending on the options used.

Syntax: wc [options] <filename>

Options:
'-l' counts the number of lines.
'-w' counts the number of words.
'-c' counts the number of characters (bytes).

wc filename - Displays a 4-column output

3 20 103 sample.txt
The command counts 3 lines, 20 words and 103 characters. The filename is
shown in the 4th column.
• A line is any group of characters not containing a newline character.
• A word is a group of characters not containing a space, tab or newline.
• A character is the smallest unit of information, and includes all spaces, tabs and newlines.
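
The three counts can be verified on a small file whose contents are known. This is a minimal sketch, assuming a POSIX shell; the two-line sample text is made up, and tr is used only to strip the padding that some wc implementations add.

```shell
# Sketch: count lines, words and characters of a known two-line file.
f=$(mktemp)                                  # hypothetical scratch file
printf 'one two\nthree\n' > "$f"             # 2 lines, 3 words, 14 bytes

lines=$(wc -l < "$f" | tr -d ' ')
words=$(wc -w < "$f" | tr -d ' ')
chars=$(wc -c < "$f" | tr -d ' ')            # counts the spaces and newlines too

echo "$lines lines, $words words, $chars characters"
rm "$f"
```

The character count of 14 includes one space and two newlines, which is why it is larger than the 11 visible letters.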

od: Displaying data in Octal

Many files contain non-printing characters, and most UNIX commands don't display them properly.
To make these characters visible, use od (octal dump), which displays the octal value of each byte of a file's
contents.
The -b option displays this value for each character separately.
od -b filename
Each line displays 16 bytes of data in octal, preceded by the position in the file of the first byte in the line.
The option -c combined with -b gives an output with octal representations in the first line,
and the printable characters and escape sequences in the line below.
od -bc filename
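
A four-byte input is enough to see how od exposes a tab and a newline. This is a minimal sketch; the input string is made up, and the -An option (suppress the offset column) is used only to make the octal values easy to capture.

```shell
# Sketch: dump a short string that contains a tab and a newline.
printf 'A\tB\n' | od -bc                     # octal values, with characters underneath

# Capture just the octal values: A=101, tab=011, B=102, newline=012
octal=$(printf 'A\tB\n' | od -An -b | tr -d ' \n')
echo "$octal"
```

In the -bc output the tab appears as \t and the newline as \n beneath their octal values 011 and 012.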

pr:Paginating Files

This command prepares a file for printing by adding suitable headers, footers and formatted text.

$cat dept.lst

01|accounts|6213
02|progs|5423
03|marketing|6521
04|personnel|2365
05|production|9876
06|sales|1006
pr adds five lines of margin at the top and bottom of each page. The header shows the date and
time of last modification of the file along with the filename and page number. The contents of
the file are not changed.

$ pr dept.lst

oct 06 10:38 2022 dept.lst page 1


01|accounts|6213
02|progs|5423
03|marketing|6521
04|personnel|2365
05|production|9876
06|sales|1006
…blank lines…
pr options
The different options for the pr command are:
-k prints k (integer) columns
-t suppresses the header and footer
-h uses a header of the user's choice
-d double-spaces the input
-n numbers each line, which helps in debugging
-o n offsets the lines by n spaces, increasing the left margin of the page
+10 starts printing from page 10, for example pr +10 chap01
-l 54 sets the page length to 54 lines, for example pr -l 54 chap01

cat: Concatenate and print files

$ cat filename

cat is short for concatenate, which means to join together. This utility is most often used to display the
contents of a single file. You can also use cat to display the contents of several files in
succession; in that case the filenames are separated by a space.

You can also create a file using cat.


$ cat> test <enter>
Hai this is introduction to unix session.
We will explore our self to a large extent.
<ctrl d>
$ cat test
Hai this is introduction to unix session.
We will explore our self to a large extent.

mkdir: Creating a directory


$ mkdir ABC
Creates the directory ABC.

$ mkdir ABC
mkdir: Cannot create directory ABC : File exists
Running the command a second time fails because ABC already exists.
You can also use mkdir to create several directories with one command.
$ mkdir XYZ XYZ/a XYZ/b
Creates the directory XYZ and the subdirectories a and b under XYZ. The parent XYZ must be listed before its subdirectories.

cd: Changes to the specified directory


$ cd ABC
~/ABC$
$ cd ..     Moves up one level, to the parent directory
$ cd /      Moves to the root directory

rmdir: Removes a directory


Removes the specified directory, provided the directory is empty.
The directory to be removed must not be the current working directory.
$ rmdir XYZ/a XYZ/b XYZ
Removes the directory tree XYZ. The subdirectories are listed first so that XYZ is empty by the time it is removed.

ls: Listing the files in a directory


Ex : $ls
abc
text
test.c
test1.C

Everything is treated as a file in UNIX, so ls lists files of all types.
From the plain ls output it is difficult to tell which is an ordinary file, which is a device file, and so on. For this,
ls supports the -l option, which lists the files in long format (a long listing).

$ ls -l     long listing of all files and directories

-rw-r--r-- 1 root nhce 4096 Jul 11 13:34 test.c
drwxr-xr-x 2 root nhce 29 Jul 11 13:34 abc
-rw-r--r-- 1 root nhce 345 Jul 11 13:34 test1.C

The significance of each field is:

1) The first character of the first column gives the file type: a hyphen (-) means an ordinary file and d means a
directory. The remaining 9 characters show the permissions associated with the file.
There are three basic permissions: read (r), write (w) and execute (x). These permissions are assigned to
three categories of user: owner, group and others. This concept is covered under chmod.
2) The second column indicates the number of links associated with the file, that is, the
number of filenames the system maintains for that file.
3) The third column shows the owner of the file.
4) The fourth column shows the group owner.
5) The fifth column shows the size of the file in bytes.
6) The sixth, seventh and eighth columns show the last modification time of the file, which is stored to the
nearest second.
7) The last column shows the filename.

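$ ls -x     lists files in multiple columns, sorted across the rows
$ ls -a     lists all files, including hidden ones (names beginning with a dot)
$ ls -lt    long listing sorted by modification time, newest first
$ ls -lu    long listing showing the last access time instead of the modification time
$ ls -S     lists files sorted by size, largest first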

pwd: Present working directory


Displays the full pathname of the current working directory.
$ pwd
/root/abc

cp: Copying a file
$ cp file1 file2
Copies the contents of file1 to file2. If file2 already exists, it is overwritten.

$ cp -r cse ise
Recursively copies the directory cse, with all its files and subdirectories, to ise.

mv: Moving or renaming a file


$ mv file1 file2
Renames file1 to file2; the name file1 no longer exists. If file2 is an existing directory, file1 is moved into it instead.

rm: Removing files or directories


$ rm filename     removes the file filename

NOTE: To remove more than one file, separate the filenames with spaces.


$ rm -r XYZ       recursively removes XYZ and all files and directories under it

CAUTION: Do not run rm * carelessly; it removes every (non-hidden) file in the current directory.


chmod: Changing the permission mode

UNIX allows the user to change the default permissions that are assigned to a file. The precondition is
that you must be the owner of the file (or the superuser). Unless you own the file, you cannot change the
permissions assigned to it.
chmod is the key to UNIX permission modes, which provide a simple yet effective method for controlling
access to files. Whenever a file is created, the system assigns default access permissions to it.
The owner can change these permissions with the chmod command.

Syntax: chmod [who] op-code mode <file>

who:      u → file owner
          g → group
          o → all others
          a → all (default)

op-code:  + → add permission
          - → remove permission

mode:     r → read
          w → write
          x → execute

$ ls -l filename

-rw-rw-rw- 1 nhce root 62 Jul 9 9:35 filename

$ chmod go-rw filename

$ ls -l filename

-rw------- 1 nhce root 62 Jul 9 9:35 filename

Note: The modification time has not changed. This is because changing the access permissions does
not modify the contents of the file. The modification time changes only when the file's contents are modified
by a write operation.

This is called the symbolic format of specifying the mode.

The other format is the absolute format, which is based on octal numbers (digits 0 through 7).

The octal values for the read, write and execute modes are as follows:
Read → 4
Write → 2
Execute→ 1

To express the ways in which a particular file may be accessed, add the octal values
that correspond to the individual permissions (read, write, execute):

No access = 0
Read access only = 4
Read and execute access = 4 + 1 = 5
Read and write access = 4 + 2 = 6
Read and write and execute access = 4 + 2 + 1 = 7

Finally, the sums are written as a group of three octal digits, which give the desired
access modes for the file owner, the group owner and other users, in that order.

User      Group     Others    Octal value
rwx       rwx       rwx
4+2+1     4+2+1     4+2+1     777
4+2+1     4+0+1     0+0+1     751
4+2+0     0+0+0     0+0+0     600

$ ls -l filename
-rw-rw-rw- 1 nhce root 62 Jul 9 9:35 filename

$ chmod 600 filename

$ ls -l filename
-rw------- 1 nhce root 62 Jul 9 9:35 filename

umask
New files and directories are created with a default set of permissions. For directories,
the base permissions are 0777 (rwxrwxrwx) and for files they are 0666 (rw-rw-rw-).
The kernel restricts these default permissions by applying a permission
mask called the umask. This is an octal number whose set bits are cleared from the base permissions
(result = base AND NOT umask). For common masks such as 0002 and 0022 this gives the same answer as
subtracting the umask from the base value.
$ umask
0002

The default umask for a normal user is 0002. With this mask, default directory permissions are 0775 and
default file permissions are 0664.

touch: Changing the timestamps

The touch command is used to change the timestamps (i.e., the dates and times of the most recent access and
modification) of existing files and directories.

$ touch <options> filename

When used without any options, touch creates new, empty files for any filenames given
as arguments that do not already exist. touch can create any number of files in one command.

The -a option changes only the access time, while the -m option changes only the modification
time. Using both of these options together changes both the access and modification times to the current
time.
tar command
The GNU tar (short for Tape ARchiver) command is the most widely used archiving utility on Linux
systems. Run directly in the terminal, tar creates archives, extracts them, and lists their contents.
The utility has many helpful options for compressing files, managing backups,
or extracting a raw installation.

tar <operation mode> <option(s)> <archive> <file(s) or location(s)>

• Operation mode indicates which operation executes on the files (creation, extraction, etc.).
The command allows and requires exactly one operation.
• Options modify the operation mode and are optional. There is no limit on the number of options.
• The archive is the archive file name, including its extension.
• The file name(s) is a space-separated list of files or directories to archive or extract; wildcards may be used.

There are three possible syntax styles to use the operations and options:

1. Traditional style, clustered together without any dashes.

For example:

tar cfv <archive> <file(s) or location(s)>

2. UNIX short option style, using a single dash and clustered options. In this style the f option takes
the archive name as its argument, so f must come last in the cluster:

tar -cvf <archive> <file(s) or location(s)>

Alternatively, a dash before each option, with the archive name directly after -f:

tar -c -v -f <archive> <file(s) or location(s)>

3. GNU long-option style with a double-dash and a descriptive option name:

tar --create --file <archive> --verbose <file(s) or location(s)>


All three styles can be mixed in a single tar command.

du command

The du command is a standard Linux/Unix command that allows a user to gain disk usage information
quickly.
It is best applied to specific directories and allows many variations for customizing the output to meet your
needs.
As with most commands, the user can take advantage of many options or flags. Also, like many Linux
commands,
most users only use the same two or three flags to meet their specific set of needs. The aim here is to
introduce the
basic flags that people use, but also to look at some that are less common in hopes of improving our use
of du.
Let's first look at the standalone command, and then add in various options.

[tcarrigan@rhel article_submissions]$ du /home/tcarrigan/article_submissions/


12 /home/tcarrigan/article_submissions/my_articles
36 /home/tcarrigan/article_submissions/community_content
48 /home/tcarrigan/article_submissions/

-h , --human-readable

The -h flag prints sizes, such as the ones above, in a human-readable format by appending
a unit of measure (K, M, G). If we now run the du -h command on the same directory, we see that the 12, 36,
and 48 values are in KB.

[tcarrigan@rhel article_submissions]$ du -h /home/tcarrigan/article_submissions/


12K /home/tcarrigan/article_submissions/my_articles
36K /home/tcarrigan/article_submissions/community_content
48K /home/tcarrigan/article_submissions/
-s, --summarize
The -s flag is added to the -h flag on occasion. With their powers combined, they do not become an eco-
friendly demi-god. Instead, they allow us to get a summary of the directory's usage in a human-readable
format.
[tcarrigan@rhel article_submissions]$ du -sh /home/tcarrigan/article_submissions/
48K /home/tcarrigan/article_submissions/
If that output seems familiar, it's because it's an exact copy of the last line of the -h output.
-a, --all
This helpful option does exactly what you would think. It lists the sizes of all files and directories in the
given file path. The -a option is often combined with the -h flag for ease of use. Notice the individual file
sizes are listed with the directories.
[tcarrigan@rhel article_submissions]$ du -ah /home/tcarrigan/article_submissions/
8.0K /home/tcarrigan/article_submissions/my_articles/Creating_physical_volumes
4.0K /home/tcarrigan/article_submissions/my_articles/Creating_volume_groups
12K /home/tcarrigan/article_submissions/my_articles
4.0K /home/tcarrigan/article_submissions/community_content/article
4.0K /home/tcarrigan/article_submissions/community_content/article2
4.0K /home/tcarrigan/article_submissions/community_content/article3
4.0K /home/tcarrigan/article_submissions/community_content/article4
12K /home/tcarrigan/article_submissions/community_content/real_sysadmins
8.0K /home/tcarrigan/article_submissions/community_content/podman_pulling
36K /home/tcarrigan/article_submissions/community_content
48K /home/tcarrigan/article_submissions/
--time
I especially love this flag. It shows the time of the last modification to any file in the directory or
subdirectory that you run it against. This flag was incredibly useful to me as a storage admin. On more
than one occasion, I would have a customer write files to a subdirectory by accident, and then we needed
to find where the write took place. I could use this flag in conjunction with the -ah flags to find the directory
last modified.
[tcarrigan@rhel article_submissions]$ du -ah --time /home/tcarrigan/article_submissions/
8.0K 2020-04-07 11:30 /home/tcarrigan/article_submissions/my_articles/Creating_physical_volumes
4.0K 2020-04-07 11:31 /home/tcarrigan/article_submissions/my_articles/Creating_volume_groups
12K 2020-04-07 11:31 /home/tcarrigan/article_submissions/my_articles
4.0K 2020-02-06 16:47 /home/tcarrigan/article_submissions/community_content/article
4.0K 2020-02-06 16:48 /home/tcarrigan/article_submissions/community_content/article2
4.0K 2020-02-06 16:48 /home/tcarrigan/article_submissions/community_content/article3
4.0K 2020-02-06 16:48 /home/tcarrigan/article_submissions/community_content/article4
12K 2020-04-07 11:32 /home/tcarrigan/article_submissions/community_content/real_sysadmins
8.0K 2020-04-07 11:32 /home/tcarrigan/article_submissions/community_content/podman_pulling
36K 2020-04-07 11:32 /home/tcarrigan/article_submissions/community_content
48K 2020-04-07 11:32 /home/tcarrigan/article_submissions/
Note: This does not sort by last modification so you still need to pay attention to the times. The last
modification is not always at the top
-c, --total
This option is more of a dummy check than it is useful, however, some people really like having a total
measurement output. The -c flag adds a line to the bottom of the output that gives you a grand total of all
of the disk usage for the file path given.
[tcarrigan@rhel article_submissions]$ du -ch /home/tcarrigan/article_submissions/
12K /home/tcarrigan/article_submissions/my_articles
36K /home/tcarrigan/article_submissions/community_content
48K /home/tcarrigan/article_submissions/
48K total
Notice the bottom line here. The other examples of du display the same information,
but without the 'total' banner to remind you.
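--exclude=PATTERN
(The short option -X is different: it stands for --exclude-from=FILE and reads the patterns from a file.)
The --exclude option is a nifty little trick you can do if you know that your environment has a large number of a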
certain type of file that you do not wish to calculate in your findings. In my experience, certain customers
would have large amounts of metadata files with the same file extension and did not wish to include those
in their findings. I cannot demonstrate this here on my virtual machine; however, here is the syntax and an
example.
[tcarrigan@rhel]$ du -ah --exclude="*.dll" /home/tcarrigan/article_submissions
This command would list all files and directory usage info in a human-readable format while excluding
any file with the extension .dll. This is a bit niche, however, it does have a place in the world.
Wrap up and man page
Hopefully, you now have a better understanding of how useful the du utility can be. It is easy to get into the
routine of only ever running du -h and forgetting about all of the other incredibly powerful flags you have
at your disposal. There are many flags that I did not cover in this article, but you can find all the information
on the manual page for this command. To access the manpage, simply run man du

df Command Usage & Syntax:

The df command displays the amount of disk space available on the file system containing each file name
argument.
• If no file name is passed as an argument, df shows the space available on all
currently mounted file systems.
• df cannot show the space available on unmounted file systems, because doing this on some systems
requires very deep knowledge of file system structures.
• By default, df shows the disk space in 1K blocks.
• df displays values in units of the first available SIZE from the --block-size option and from
the DF_BLOCK_SIZE, BLOCK_SIZE and BLOCKSIZE environment variables.
• Otherwise, units default to 1024 bytes (or 512 bytes if POSIXLY_CORRECT is set). Here, SIZE is an
integer with an optional unit; the units are K, M, G, T, P, E, Z, Y (K as in kilo).
df Syntax:

df [OPTION]... [FILE]...

This command gives disk space usage details.


gzip & gunzip
gzip compresses a file and gives it the .gz extension.
gunzip uncompresses a .gz file.
zip & unzip
zip compresses files into an archive with the .zip extension, and unzip extracts .zip archives.
chgrp: Changing the group owner
$ ls -l dept.lst
-rw-r--r-- 1 kumar metal 139 jun 8 16:43 dept.lst
$ chgrp dba dept.lst; ls -l dept.lst
-rw-r--r-- 1 kumar dba 139 jun 8 16:43 dept.lst
File Modification and Access Time
A UNIX file has 3 timestamps associated with it:

• Time of last file modification
• Time of last access
• Time of last inode modification

When a file's contents are changed, its last modification time is updated by the kernel. ls -l shows this time
for a file.
A file's access time is the last time someone read or executed the file. ls -lu shows the
access time.
touch: Changing the timestamps
touch is used to update the access time and/or modification time of a file or directory. In its default usage, it
is the equivalent of opening a file and saving it without any change to the file content: it
simply updates the times associated with the file or directory. The simplest use case for touch is this:
$ touch myfile.txt
When used without options or a time expression, both times are set to the current time. touch creates the file if it doesn't
exist, but does not overwrite it if it does.
With -m only the modification time is changed, and with -a only the access time. Adding -t with a timestamp
(as in -mt or -at) sets that time to the given value instead of the current time.

ln: Creating hard and symbolic links


There are two types of links:
Symbolic links: Refer to a symbolic path indicating the abstract location of another file.

Hard links: Refer to the specific location of the physical data. A hard link cannot be created for a directory
and cannot link files on different file systems.
To create a hard link, enter the following command:
ln {target-filename} {hardlink-filename}
To create a symbolic link, enter the following command:
ln -s {target-filename} {symbolic-filename}
For example, to create a soft link to /webroot/home/httpd/test.com/index.php as
/home/vivek/index.php, enter the following command:
ln -s /webroot/home/httpd/test.com/index.php /home/vivek/index.php
ls -l

Output: lrwxrwxrwx 1 vivek vivek 16 2007-09-25 22:53 index.php -> /webroot/h

find: Locating Files


This command searches for files in a directory hierarchy.
The syntax of the find command is
find pathnames selection-criteria action

find -name "sum.java"
finds all files named "sum.java" in the current directory and its sub-directories.
find . -perm 777
displays the files that have read, write and execute permission for owner, group and others.
find . -mtime -1
displays files that were modified within the last day.
find -name "*java*" -exec rm -r {} \;
removes the files whose names contain "java".



System Calls for File Management

General file APIs


Files in a UNIX and POSIX system may be any one of the following types:
• Regular file
• Directory file
• FIFO file
• Block device file
• Character device file
• Symbolic link file
There are special APIs to create these types of files. There is also a set of generic APIs that can be used to
create and manipulate more than one type of file. These APIs are:

open

• open establishes a connection between a process and a file: it opens an existing file for
data transfer, or it can create a new file.
• The value returned by the open system call is the file descriptor, a small integer that indexes the
process's file descriptor table; through that entry the kernel reaches the file table entry and the inode of the open file.
• The prototype of the open function is
#include <sys/types.h>
#include <fcntl.h>
int open(const char *pathname, int accessmode, mode_t permission);

• If successful, open returns a nonnegative integer representing the open file descriptor.
• If unsuccessful, open returns -1.
• The first argument is the name of the file to be created or opened. This may be an absolute pathname or
a relative pathname.
• If the given pathname is a symbolic link, the open function resolves the link and opens the
file to which it refers.
• The second argument is the access mode, an integer value that specifies how the file is to
be accessed by the calling process.
• The access modes are defined in <fcntl.h>. The access modes are:
O_RDONLY - open the file for reading only
O_WRONLY - open the file for writing only
O_RDWR - open the file for reading and writing

There are other flags, termed access modifier flags. One or more of the following
can be bitwise-ORed with one of the above access mode flags to alter the access
mechanism of the file:
O_APPEND - Append data to the end of the file.
O_CREAT - Create the file if it doesn't exist.
O_EXCL - Generate an error if O_CREAT is also specified and the file already exists.
O_TRUNC - If the file exists, discard the file content and set the file size to zero bytes.
O_NONBLOCK - Specify that subsequent reads or writes on the file should be nonblocking.
O_NOCTTY - Specify not to use the terminal device file as the calling process's controlling terminal.
• To illustrate the use of the above flags, the following example statement opens a file called /usr/usp for
read and write in append mode:
int fd = open("/usr/usp", O_RDWR | O_APPEND, 0);

• If a file is opened read-only, the write-related modifier flags (O_APPEND, O_TRUNC) serve no purpose and
should be avoided; the effect of O_TRUNC on a file opened read-only is not defined by POSIX.
• If a file is opened write-only or read-write, any of the modifier flags may be used along with
the access mode.
• The third argument is used only when a new file is being created. The symbolic names for file permission
are given in the table in the previous page.

creat
• This system call is used to create new regular files. The prototype of creat is

#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

int creat(const char *pathname, mode_t mode);

• Returns: file descriptor opened for write-only if OK, -1 on error.
• The first argument, pathname, specifies the name of the file to be created.
• The second argument, mode, specifies the access permissions of the file for owner, group and others.
• The creat function can be implemented using the open function as:
#define creat(pathname, mode) \
    open(pathname, O_WRONLY | O_CREAT | O_TRUNC, mode)

read
• The read function fetches a block of data of a requested size from the file referenced by a
given file descriptor.
• The prototype of the read function is:

#include <sys/types.h>
#include <unistd.h>
ssize_t read(int fdesc, void *buf, size_t nbyte);

• If successful, read returns the number of bytes actually read.
• If unsuccessful, read returns -1.
• The first argument, fdesc, is an integer that refers to an opened file.
• The second argument, buf, is the address of a buffer that receives the data read.
• The third argument, nbyte, specifies how many bytes of data are to be read from the file.
• The size_t data type is defined in the <sys/types.h> header and is an unsigned integer type. The return type
ssize_t is its signed counterpart, which allows -1 to be returned on error.
• There are several cases in which the number of bytes actually read is less than the amount requested:
o When reading from a regular file, if the end of file is reached before the requested number of bytes has
been read. For example, if 30 bytes remain until the end of file and we try to read 100 bytes, read returns
30. The next time we call read, it will return 0 (end of file).
o When reading from a terminal device. Normally, up to one line is read at a time.
o When reading from a network. Buffering within the network may cause less than the requested amount to
be returned.
o When reading from a pipe or FIFO. If the pipe contains fewer bytes than requested, read will return only
what is available.

write
• The write system call is used to write data into a file.
• The write function writes a block of data of a given size to the file referenced by a given
file descriptor.
• The prototype of write is

#include <sys/types.h>
#include <unistd.h>
ssize_t write(int fdesc, const void *buf, size_t size);

• If successful, write returns the number of bytes actually written.
• If unsuccessful, write returns -1.
• The first argument, fdesc, is an integer that refers to an opened file.
• The second argument, buf, is the address of a buffer that contains the data to be written.
• The third argument, size, specifies how many bytes of data are in the buf argument.
• The return value is usually equal to the size argument; a smaller value means that only part of the data was written.
close
• The close system call is used to terminate the connection between a process and a file.
• The prototype of close is
#include <unistd.h>
int close(int fdesc);

• If successful, close returns 0.
• If unsuccessful, close returns -1.
• The argument fdesc refers to an opened file.
• The close function frees unused file descriptors so that they can be reused to reference other files. This is
important because a process may have at most OPEN_MAX files open at any time; by closing descriptors,
a process can access more than OPEN_MAX files in the course of its execution.
• The close function also de-allocates system resources, such as the file table entry and the memory buffers allocated to hold
the read/write data.

fcntl
• The fcntl function lets a process query or set the access control flags and the close-on-exec flag of any file descriptor.
• The prototype of fcntl is
#include<fcntl.h>
int fcntl(int fdesc, int cmd, …);
• The first argument is the file descriptor.
• The second argument cmd specifies what operation has to be performed.
• The third argument is dependent on the actual cmd value. The possible cmd values are defined in
<fcntl.h> header.

cmd value   Use
F_GETFL     Returns the access control flags of the file descriptor fdesc.
F_SETFL     Sets or clears the access control flags that are specified in the
            third argument to fcntl. The allowed access control flags are
            O_APPEND and O_NONBLOCK.
F_GETFD     Returns the close-on-exec flag of the file referenced by fdesc.
            If the return value is zero, the flag is off; otherwise it is on.
F_SETFD     Sets or clears the close-on-exec flag of fdesc. The third
            argument to fcntl is an integer value, which is 0 to clear the
            flag, or 1 to set the flag.
F_DUPFD     Duplicates the file descriptor fdesc with another file descriptor.
            The third argument to fcntl is an integer value which
            specifies that the duplicated file descriptor must be greater
            than or equal to that value. The return value of fcntl is the
            duplicated file descriptor.
• The fcntl function is useful for changing the access control flags of a file descriptor.
• For example, after a file is opened for blocking read-write access, a process that needs to change the
access to non-blocking and write-append mode can call:

int cur_flags = fcntl(fdesc, F_GETFL);
int rc = fcntl(fdesc, F_SETFL, cur_flags | O_APPEND | O_NONBLOCK);

The following example reports the close-on-exec flag of fdesc and then sets it to on:
printf("close-on-exec flag: %d\n", fcntl(fdesc, F_GETFD));
(void)fcntl(fdesc, F_SETFD, 1);        /* turn on close-on-exec flag */

The following statements change the standard input of a process to a file called FOO:
int fdesc = open("FOO", O_RDONLY);     /* open FOO for read */
close(0);                              /* close standard input */
if (fcntl(fdesc, F_DUPFD, 0) == -1)
    perror("fcntl");                   /* on success, stdin now reads from FOO */
char buf[256];
int rc = read(0, buf, 256);            /* read data from FOO */

The dup and dup2 functions in UNIX perform the same file descriptor duplication as fcntl. They can be
implemented using fcntl as:

#define dup(fdesc) fcntl(fdesc, F_DUPFD, 0)
#define dup2(fdesc1, fd2) (close(fd2), fcntl(fdesc1, F_DUPFD, fd2))

lseek
• The read and write calls advance the file offset as they transfer data. The lseek function changes the file offset to a different value.
• Thus lseek allows a process to perform random access of data on any opened file.
• The prototype of lseek is

#include <sys/types.h>
#include <unistd.h>
off_t lseek(int fdesc, off_t pos, int whence);

• On success it returns the new file offset, and -1 on error.
• The first argument, fdesc, is an integer file descriptor that refers to an opened file.
• The second argument, pos, specifies a byte offset to be added to a reference location to derive the new
file offset value.
• The third argument, whence, is the reference location.

Whence value    Reference location
SEEK_CUR        Current file pointer address
SEEK_SET        The beginning of a file
SEEK_END        The end of a file

• They are defined in the <unistd.h> header.
• If an lseek call results in a new file offset that is beyond the current end-of-file:
o lseek itself succeeds; a read at that offset returns 0 (end of file).
o If the file is open for writing and data is written at the new offset, the file is extended, and the gap between
the old end-of-file and the new offset reads back as NULL (zero) bytes.

link
• The link function creates a new link for an existing file. The prototype of the link function is

#include <unistd.h>
int link(const char *cur_link, const char *new_link);

• If successful, the link function returns 0.
• If unsuccessful, link returns -1.
• The first argument, cur_link, is the pathname of an existing file.
• The second argument, new_link, is a new pathname to be assigned to the same file.
• If this call succeeds, the hard link count is increased by 1.
• The UNIX ln command is implemented using the link API.

unlink
• The unlink function deletes a link of an existing file.
• This function decreases the hard link count attributes of the named file, and removes the file name entry of
the link from directory file.
• A file is removed from the file system when its hard link count is zero and no process has any file descriptor
referencing that file.
• The prototype of unlink is

#include <unistd.h>
int unlink(const char *cur_link);
• If successful, the unlink function returns 0.
• If unsuccessful, unlink returns -1.
• The argument cur_link is a path name that references an existing file.
• ANSI C defines the rename function, which on UNIX has the effect of linking the file under the new name
and then unlinking the old name.
• The prototype of the rename function is:
#include <stdio.h>
int rename(const char *old_path_name, const char *new_path_name);

stat, fstat
• The stat and fstat functions retrieve the file attributes of a given file.
• The only difference between stat and fstat is that the first argument of stat is a file pathname, whereas
the first argument of fstat is a file descriptor. The prototypes of these functions are
#include <sys/stat.h>
#include <unistd.h>

int stat(const char *pathname, struct stat *statv);
int fstat(int fdesc, struct stat *statv);

• The second argument to stat and fstat is the address of a struct stat-typed variable; struct stat is defined in the
<sys/stat.h> header.
• Its declaration is as follows:
struct stat
{
dev_t st_dev; /* file system ID */
ino_t st_ino; /* file inode number */
mode_t st_mode; /* contains file type and permission */
nlink_t st_nlink; /* hard link count */
uid_t st_uid; /* file user ID */
gid_t st_gid; /* file group ID */
dev_t st_rdev; /* contains major and minor device # */
off_t st_size; /* file size in bytes */
time_t st_atime; /* last access time */
time_t st_mtime; /* last modification time */
time_t st_ctime; /* last status change time */
};
• The return value of both functions is
o 0 if they succeed
o -1 if they fail, in which case errno contains an error status code
• The lstat function prototype is the same as that of stat:
int lstat(const char *path_name, struct stat *statv);
Unlike stat, lstat does not follow a symbolic link; it returns the attributes of the link itself.

• We can determine the file type by applying the following macros to st_mode:
Macro         Type of file
S_ISREG()     regular file
S_ISDIR()     directory file
S_ISCHR()     character special file
S_ISBLK()     block special file
S_ISFIFO()    pipe or FIFO
S_ISLNK()     symbolic link
S_ISSOCK()    socket

access
• The access system call checks whether a named file exists and whether the calling user has permission to access it.
• The prototype of access function is:

#include<unistd.h>
int access(const char *path_name, int flag);

• On success access returns 0, on failure it returns –1. The first argument is the pathname of a file.
• The second argument, flag, contains one or more of the following bit flags.

Bit flag   Use
F_OK       Checks whether a named file exists
R_OK       Tests for read permission
W_OK       Tests for write permission
X_OK       Tests for execute permission

The flag argument value to an access call is composed by bitwise-ORing one or more of the above bit flags, as shown:
int rc=access("/usr/usp.txt",R_OK | W_OK);

Example to check whether a file exists:

if(access("/usr/usp.txt", F_OK)==-1)
printf("file does not exist");
else
printf("file exists");

chmod, fchmod
• The chmod and fchmod functions change the file access permissions for owner, group and others, as well as the set-UID, set-GID and sticky flags.
• The calling process must have the effective UID of either the superuser or the owner of the file.
• The prototypes of these functions are:

#include<sys/types.h>
#include<sys/stat.h>
#include<unistd.h>

int chmod(const char *pathname, mode_t flag);
int fchmod(int fdesc, mode_t flag);

• The pathname argument of chmod is the path name of a file, whereas the fdesc argument of fchmod is the file descriptor of a file.
• The chmod function operates on the specified file, whereas the fchmod function operates on a file that has already been opened.
• To change the permission bits of a file, the effective user ID of the process must be equal to the owner ID of the file, or the process must have superuser permissions.
• The mode is specified as the bitwise OR of the constants shown below.
Mode       Description
S_ISUID    set-user-ID on execution
S_ISGID    set-group-ID on execution
S_ISVTX    saved-text (sticky bit)
S_IRWXU    read, write, and execute by user (owner)
S_IRUSR    read by user (owner)
S_IWUSR    write by user (owner)
S_IXUSR    execute by user (owner)
S_IRWXG    read, write, and execute by group
S_IRGRP    read by group
S_IWGRP    write by group
S_IXGRP    execute by group
S_IRWXO    read, write, and execute by other (world)
S_IROTH    read by other (world)
S_IWOTH    write by other (world)
S_IXOTH    execute by other (world)

chown, fchown, lchown


• The chown functions change the user ID and group ID of files.
• The prototypes of these functions are:
#include<unistd.h>
#include<sys/types.h>

int chown(const char *path_name, uid_t uid, gid_t gid);
int fchown(int fdesc, uid_t uid, gid_t gid);
int lchown(const char *path_name, uid_t uid, gid_t gid);

o The path_name argument is the path name of a file.
o The uid argument specifies the new user ID to be assigned to the file.
o The gid argument specifies the new group ID to be assigned to the file.
o fchown identifies the file by an open file descriptor, and lchown changes the ownership of a symbolic link itself rather than the file it points to.

• A sample program that changes file ownership takes at least two command line arguments:
o The first one is the user name to be assigned to the files.
o The second and any subsequent arguments are file path names.
• The program first converts the given user name to a user ID via the getpwnam function. If that succeeds, the program processes each named file as follows: it calls stat to get the file group ID, then it calls chown to change the file user ID. If either stat or chown fails, an error is displayed.

utime Function
• The utime function modifies the access time and the modification time stamps of a file.
• The prototype of utime function is
#include<sys/types.h>
#include<unistd.h>
#include<utime.h>

int utime(const char *path_name, struct utimbuf *times);


• On success it returns 0, on failure it returns –1.
• The path_name argument specifies the path name of a file.
• The times argument specifies the new access time and modification time for the file.
• The struct utimbuf is defined in the <utime.h> header as:

struct utimbuf
{
time_t actime; /* access time */
time_t modtime; /*modification time */
};

• The time_t data type is an integer type (typically long), and its value is the number of seconds elapsed since the birthday of UNIX: 00:00:00 UTC, January 1, 1970.
• If the times (variable) is specified as NULL, the function will set the named file access and modification
time to the current time.
• If the times (variable) is an address of the variable of the type struct utimbuf, the function will set the file
access time and modification time to the value specified by the variable.

File and Record Locking


• Multiple processes can perform read and write operations on the same file concurrently.
• This provides a means for data sharing among processes, but it also makes it difficult for any process to determine when another process may overwrite data in the file.
• To overcome this drawback, UNIX and the POSIX standard support a file locking mechanism.
• File locking is applicable to regular files.
• A process can impose a write lock or a read lock on either a portion of a file or on the entire file.
• The difference between the read lock and the write lock is that when a write lock is set, it prevents other processes from setting any overlapping read or write lock on the locked region.
• Similarly, when a read lock is set, it prevents other processes from setting any overlapping write locks on the locked region.
• The intention of the write lock is to prevent other processes from both reading and writing the locked region while the process that sets the lock is modifying the region, so a write lock is termed an “exclusive lock”.
• The use of a read lock is to prevent other processes from writing to the locked region while the process that sets the lock is reading data from the region.
• Other processes are allowed to lock and read data from the locked region. Hence a read lock is also called a “shared lock”.
• File locks are mandatory if they are enforced by the operating system kernel.
• If a mandatory exclusive lock is set on a file, no process can use the read or write system calls to access the data in the locked region.
• These mechanisms can be used to synchronize reading and writing of shared files by multiple processes.
• If a process locks a file, other processes that attempt to write to the locked regions are blocked until the former process releases its lock.
• The problem with mandatory locks is that if a runaway process sets a mandatory exclusive lock on a file and never unlocks it, no other process can access the locked region of the file until the runaway process is killed or the system is rebooted.
• If locks are not mandatory, they are advisory locks.
• The kernel does not enforce advisory locks at the system call level.
• This means that even though a lock may be set on a file, other processes can still use the read and write functions to access the file.
• To make use of advisory locks, processes that manipulate the same file must cooperate such that they follow the procedure given below for every read or write operation on the file:
o Try to set a lock on the region to be accessed. If this fails, a process can either wait for the lock request to become successful or do something else and try to lock the region again later.
o After a lock is acquired successfully, read or write the locked region.
o Release the lock.

• If a process sets a read lock on a file, for example from address 0 to 256, then sets a write lock on the file
from address 0 to 512, the process will own only one write lock on the file from 0 to 512, the previous read
lock from 0 to 256 is now covered by the write lock and the process does not own two locks on the region
from 0 to 256. This process is called “Lock Promotion”.
• Furthermore, if the process now unlocks the file from 128 to 480, it will own two write locks on the file: one from 0 to 127 and the other from 481 to 512. This process is called “Lock Splitting”.
• UNIX systems provide fcntl function to support file locking. By using fcntl it is possible to impose read or
write locks on either a region or an entire file.
• The prototype of fcntl is
#include<fcntl.h>
int fcntl(int fdesc, int cmd_flag, .... );

• The first argument specifies the file descriptor.


• The second argument, cmd_flag, specifies what operation has to be performed. If fcntl is used for file locking, it can take the following values:

cmd_flag value   Use
F_SETLK          Sets a file lock; does not block if this cannot succeed immediately.
F_SETLKW         Sets a file lock and blocks the process until the lock is acquired.
F_GETLK          Queries which process locked a specified region of a file.

• For file locking purposes, the third argument to fcntl is the address of a struct flock type variable.
• This variable specifies a region of a file where lock is to be set, unset or queried.

struct flock
{
short l_type; /* what lock to be set or to unlock file */
short l_whence; /* Reference address for the next field */
off_t l_start ; /*offset from the l_whence reference addr*/
off_t l_len ; /*how many bytes in the locked region */
pid_t l_pid ; /*pid of a process which has locked the file */
};

The l_type field specifies the lock type to be set or unset. The possible values, which are defined in the <fcntl.h> header, and their uses are:

l_type value   Use
F_RDLCK        Set a read lock on a specified region
F_WRLCK        Set a write lock on a specified region
F_UNLCK        Unlock a specified region

• The l_whence, l_start & l_len define a region of a file to be locked or unlocked.
• The possible values of l_whence and their uses are:

l_whence value   Use
SEEK_CUR         The l_start value is added to the current file pointer address
SEEK_SET         The l_start value is added to byte 0 of the file
SEEK_END         The l_start value is added to the end of the file

• A lock set by the fcntl API is an advisory lock, but fcntl can also be used for mandatory locking if the following attributes are set on the file before calling fcntl:
o Turn on the set-GID flag of the file.
o Turn off the group execute permission of the file.
• The example program below sets a read lock on a file “usp”, covering 15 bytes starting at byte offset 10.

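Example Program

#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>

int main(void)
{
int fd;
struct flock lock;

fd=open("usp",O_RDONLY);
if(fd==-1) { perror("open"); return 1; }
lock.l_type=F_RDLCK; /* read (shared) lock */
lock.l_whence=SEEK_SET; /* l_start is measured from byte 0 */
lock.l_start=10; /* region starts at byte offset 10 */
lock.l_len=15; /* region is 15 bytes long */
if(fcntl(fd,F_SETLK,&lock)==-1) perror("fcntl");
return 0;
}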

Directory File API’s


• A Directory file is a record-oriented file, where each record stores a file name and the inode number of a
file that resides in that directory.
• Directories are created with the mkdir API and deleted with the rmdir API.
• The prototype of mkdir is
#include<sys/stat.h>
#include<unistd.h>

int mkdir(const char *path_name, mode_t mode);


• The first argument is the path name of a directory file to be created.
• The second argument mode, specifies the access permission for the owner, groups and others to be assigned
to the file. This function creates a new empty directory.
• The entries for “.” and “..” are automatically created. The specified file access permission, mode, are
modified by the file mode creation mask of the process.

• To allow a process to scan directories in a file-system-independent manner, a directory record is defined as struct dirent in the <dirent.h> header for UNIX.
• Some of the functions that are defined for directory file operations in the above header, and their uses, are:

Function    Use
opendir     Opens a directory file for read-only. Returns a file handle of type DIR * for future reference of the file.
readdir     Reads a record from a directory file referenced by dir_fdesc and returns that record information.
rewinddir   Resets the file pointer to the beginning of the directory file referenced by dir_fdesc. The next call to readdir will read the first record from the file.
closedir    Closes a directory file referenced by dir_fdesc.

• An empty directory is deleted with the rmdir API.


• The prototype of rmdir is
#include<unistd.h>
int rmdir (const char * path_name);
• If the link count of the directory becomes 0 with the call and no other process has the directory open, then the space occupied by the directory is freed.

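• The following declarations make the directory record type available under the name Dirent on both BSD and POSIX systems:

#include<sys/types.h>
#if defined (BSD)&&!_POSIX_SOURCE
#include<sys/dir.h>
typedef struct direct Dirent;
#else
#include<dirent.h>
typedef struct dirent Dirent;
#endif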

• UNIX systems have defined additional functions for random access of directory file records.
Function   Use
telldir    Returns the file pointer of a given dir_fdesc
seekdir    Changes the file pointer of a given dir_fdesc to a specified address

Device file APIs

• Device files are used to interface physical device with application programs.
• A process must have superuser privileges to create a device file, which it does by calling the mknod API.
• The user ID and group ID attributes of a device file are assigned in the same manner as for regular files.
• When a process reads or writes to a device file, the kernel uses the major and minor device numbers of a
file to select a device driver function to carry out the actual data transfer.
• Device file support is implementation dependent. UNIX System defines the mknod API to create device
files.
• The prototype of mknod is
#include<sys/stat.h>
#include<unistd.h>

int mknod(const char* path_name, mode_t mode, int device_id);


• The first argument pathname is the pathname of a device file to be created.
• The second argument mode specifies the access permission for the owner, group and others, and also the S_IFCHR or S_IFBLK flag to be assigned to the file.
• The third argument device_id contains the major and minor device numbers.
• Example:
mknod("SCSI5",S_IFBLK | S_IRWXU | S_IRWXG | S_IRWXO,(15<<8) | 3);
The above call creates a block device file “SCSI5”, to which read, write and execute permissions are granted for user, group and others, with major number 15 and minor number 3.


• On success mknod API returns 0 , else it returns -1

FIFO file API’s


• FIFO files are sometimes called named pipes.
• Pipes can be used only between related processes when a common ancestor has created the pipe. FIFOs remove this restriction, so unrelated processes can exchange data through a FIFO.
• Creating a FIFO is similar to creating a file.
• Indeed the pathname for a FIFO exists in the file system.
• The prototype of mkfifo is

#include<sys/types.h>
#include<sys/stat.h>
#include<unistd.h>

int mkfifo(const char *path_name, mode_t mode);


• The first argument path_name is the path name (file name) of a FIFO file to be created.
• The second argument mode specifies the access permission for user, group and others, as well as the S_IFIFO flag to indicate that it is a FIFO file.
• On success it returns 0 and on failure it returns -1.
• Example:
mkfifo("FIFO5",S_IFIFO | S_IRWXU | S_IRGRP | S_IROTH);
• The above statement creates a FIFO file “FIFO5” with read-write-execute permission for user and only read permission for group and others.
• Once we have created a FIFO using mkfifo, we open it using open.
• Indeed, the normal file I/O functions (read, write, unlink etc) all work with FIFOs.
• When a process opens a FIFO file for reading, the kernel will block the process until there is another
process that opens the same file for writing.
• Similarly, whenever a process opens a FIFO file for writing, the kernel will block the process until another
process opens the same FIFO for reading.
• This provides a means for synchronization in order to undergo inter-process communication.
• If a particular process tries to write something to a FIFO file that is full, then that process will be blocked
until another process has read data from the FIFO to make space for the process to write.
• Similarly, if a process attempts to read data from an empty FIFO, the process will be blocked until another
process writes data to the FIFO.
• If the process does not want to be blocked in any of the above conditions, it should specify O_NONBLOCK in the open call to the FIFO file.
• With O_NONBLOCK set, an operation that would otherwise block returns immediately (in most cases with -1 and errno set) instead of blocking the process.
• If a process writes to a FIFO file that has no other process attached to it for reading, the kernel sends the SIGPIPE signal to the process to notify it that this is an illegal operation.
• Another way to create a pipe for inter-process communication (an unnamed pipe rather than a FIFO file) is to use the pipe system call.
• The prototype of pipe is

#include <unistd.h>
int pipe(int fds[2]);

• Returns 0 on success and –1 on failure.


• If the pipe call executes successfully, the process can read from fds[0] and write to fds[1]. A single process with a pipe is not very useful. Usually a parent process uses pipes to communicate with its children.

Symbolic Link File API’s


• A symbolic link is an indirect pointer to a file, unlike a hard link, which points directly to the inode of the file.
• Symbolic links were developed to get around the limitations of hard links:
o Symbolic links can link files across file systems.
o Symbolic links can link directory files.
o Symbolic links always reference the latest version of the files to which they link.
o There are no file system limitations on a symbolic link and what it points to, and anyone can create a symbolic link to a directory.
o Symbolic links are typically used to move a file or an entire directory hierarchy to some other location on a system.
• A symbolic link is created with the symlink function. The prototypes of symlink and the related functions are:
#include<unistd.h>
#include<sys/types.h>
#include<sys/stat.h>

int symlink(const char *org_link, const char *sym_link);
int readlink(const char *sym_link, char *buf, int size);
int lstat(const char *sym_link, struct stat *statv);

The org_link and sym_link arguments to a symlink call specify the original file path name and the symbolic link path name to be created. The readlink function copies the path name stored in a symbolic link into buf, and lstat returns the attributes of the link itself rather than of the file it points to.
Module 2
Laboratory Component: (minimum 3 experiments / programs)
1. Execute the "ls" command to display comprehensive file attributes with all
available options, view the file's contents, perform file copying and moving
operations between locations, and subsequently remove the file.
i) $ cat filename
cat is short form of concatenate, which means to join together. This utility is used most
often to display contents of single file. You may also use the cat command to display the
contents of several files in succession. In that case files should be separated by a blank
space.
You can also create a file using cat.

ii) cp: - copy a file


$ cp file1 file2
If target file already exists, it is overwritten.
Contents of file1 are copied onto file2
$cp -r rose tt
Recursively copies the directory rose and all of its files to tt

iii) mv: - move or rename a file


$mv file1 file2
Renames file1 to file2, so file1 no longer exists

iv) rm:- remove files or directories


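$ rm filename removes the file filename

NOTE: To remove more than one file, separate the file names with spaces.

$rm -r XYZ recursively removes the directory XYZ and all files and directories under it
CAUTION: Do not give $rm * because it removes every (non-hidden) file in the current directory.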
v) ls: - to list the files in directories.

Short Form   Description
-a           List all files, including hidden files.
-x           List entries in multiple columns, sorted across rows.
-l           Use the long format, providing detailed file information.
-lt          Long listing sorted by modification time, newest first.
-u           Use the last access time instead of the modification time (for example, -lu shows access times and -ltu sorts by them).
-h           Display file sizes in a human-readable format (used with -l).
-R           List directories and their contents recursively.
-S           Sort by file size, largest first.
-r           Reverse the order of listing.
-F           Append file type indicators to file names.
-i           Display the inode number for each file.
Commands with Output:
2. Execute the following directory-related commands:
(i) Create a new directory, navigate between directories, print the current directory
path, check disk space usage, compress file content, and archive files.
i) mkdir: Creating a directory
$mkdir ABC
Directory ABC is created
$mkdir ABC
mkdir: Cannot create directory ABC : File exists
You may also use the mkdir command to create several directories in succession.

$mkdir XYZ XYZ/a XYZ/b


Creates a directory XYZ and subdirectories a and b under XYZ

ii) cd:
Changes the current directory to the specified directory
$cd ABC
$cd ~/ABC
Here ~ stands for the home directory.
iii) $cd ..
Moves up one level, to the parent directory
$cd / Moves to the root directory
iv) rmdir:
Removes the specified directory, provided the directory is empty.
The directory to be removed should not be the current working directory.
$rmdir XYZ/a XYZ/b XYZ
Removes the directory tree XYZ (the subdirectories a and b first, then XYZ itself)
v) pwd:
Prints the present working directory
vi) df
Check disk space usage details.

vii) gzip & gunzip


gzip is used to compress a file; the compressed file gets the .gz extension
gunzip is used to uncompress a .gz file

viii) tar
$ tar -cvf archive.tar file1.txt file2.txt my_directory/
Archives multiple files and directories into a single archive file. The -c option creates the archive, -f names the archive file, and -v makes tar list each file as it is added.
3. Identify commands for adjusting user, group, and others' permissions using
symbolic and octal notation, create files using the "touch" command, modify access
and modification timestamps, and alter default permissions for files or directories
using "umask".

i) chmod: Change permission mode
UNIX allows the user to change the default permissions that are assigned. The precondition, however, is that you must be the owner of the file (or the superuser). Unless you own the file, you cannot change the permissions assigned to it.

This command is the key to UNIX permission modes, which provide a simple yet effective method for controlling access to files. Whenever a file is created, the system assigns its default access permissions to the file. The owner can change these permissions with the help of the chmod command.

Note: chmod does not change the modification time of a file, because changing the access permissions does not modify the contents of the file. The modification time is changed only if the file's contents are modified by a write operation.
In the symbolic format, the mode names a user class (u, g, o or a), an operator (+, - or =) and the permissions (r, w, x), for example:
$chmod u+x,g-w filename
The other format is the absolute format, which is based on octal numbers (digits 0 through 7), for example:
$chmod 644 filename
ii) touch: change the timestamps of a file
The touch command is used to change the timestamps (i.e., dates and times of the most recent access and modification) on existing files and directories.
$touch filename
When used without any options, touch creates new files for any file names that are provided as arguments if files with such names do not already exist. touch can create any number of files in a single command.

Options restrict which timestamp is changed: the -a option changes only the access time, while the -m option changes only the modification time. Using both of these options together changes both the access and modification times to the current time.
iii) umask
New files and directories are created with a default set of permissions. For directories, the base permissions are 0777 (rwxrwxrwx) and for files they are 0666 (rw-rw-rw-).
The kernel restricts the default permissions on files and directories by applying a permission mask called the umask. This is an octal number whose set bits are removed from the default permissions; for common masks the result is the same as subtracting the mask.
$umask 0002
With this mask, new files get 0664 (rw-rw-r--) and new directories get 0775 (rwxrwxr-x).
Sample Questions
1) Explain Linux file system.
2) What are the file attributes available in Linux file system?
3) Explain file types.
4) Explain how to enable file permissions.
5) Explain chmod command
6) Explain differences between hard link and symbolic link
7) Explain commonly used tar operations and options
8) List steps to create an Archive
9) What are the uses of df command?
10) Explain write system call
