22CSE341-Module 2[1]
File System and Attributes: Introduction to LINUX file system, inode, File Types, File
Attributes, Application Program Interface to Files, LINUX kernel support for files.
File Handling Commands: ls, cat, cp, mv, rm, wc, od, printf, pwd, mkdir, rmdir, cd, file and
directory permissions (chmod), file ownership (chown, chgrp), umask, tar, gzip, du, df, find,
file modification and access times, and the touch command.
• Files in UNIX or POSIX systems are stored in a tree-like hierarchical file system.
• The root of a file system is the root (“/”) directory.
• The leaf nodes of a file system tree are either empty directory files or other types of files.
• The absolute path name of a file consists of the names of all the directories, starting from the root, that lead to the file.
• Ex: /usr/cse/a.out
• Relative path name may consist of the “.” and “..” characters. These are references to current and parent
directories respectively.
• Ex: ../../.login denotes .login file which may be found 2 levels up from the current directory
• A file name may not exceed NAME_MAX characters (14 bytes on older UNIX systems, 255 on Linux) and the total number of characters of a
path name may not exceed PATH_MAX (1024 bytes on older systems, 4096 on Linux).
• POSIX.1 defines the minimum values _POSIX_NAME_MAX and _POSIX_PATH_MAX in the <limits.h> header.
• For portability, a file name should use only the following characters:
A to Z, a to z, 0 to 9, _ (POSIX also allows the period and the hyphen)
• A path name of a file is called a hard link.
• A file may be referenced by more than one path name if a user creates one or more hard links to the file
using ln command.
ln /usr/foo/path1 /usr/prog/new/n1
• If the -s option is used, ln creates a symbolic (soft) link instead.
Directory        Content
/bin             The directory where all the commonly used UNIX
                 commands (binaries, hence the name bin) are found.
/sbin and        Commands that a common user can't execute but the system
/usr/sbin        administrator can are found in these directories.
/etc             This directory contains the configuration files of the system.
                 You can change a very important aspect of system functioning by
                 editing a text file in this directory.
LINUX SYSTEM PROGRAMMING MODULE 2
Directory        Content
/dev             This directory contains all device files. These files don't
                 occupy space on disk.
/lib and         These directories contain all library files in binary form. You
/usr/lib         would link your C programs with files in these directories.
/usr/include     This directory contains the standard header files used by C
                 programs. The statement #include <stdio.h> used in most C
                 programs refers to the file stdio.h in this directory.
/tmp             The directory where users are allowed to create temporary
                 files. These files are wiped away regularly by the system.
/var             The variable part of the file system. Contains all of your
                 print jobs and your outgoing and incoming mail.
/home            The directory under which users' home directories are usually
                 placed, for example /home/kumar.
File             Use
/etc             Stores system administrative files and programs
/etc/passwd      Stores all user information
/etc/shadow      Stores user passwords
/etc/group       Stores all group information
/bin             Stores all the system programs like cat, rm, cp, etc.
/dev             Stores all character device and block device files
/usr/include     Stores all standard header files
/usr/lib         Stores standard libraries
/tmp             Stores temporary files created by programs
❖ Regular file
▪ A regular file may be either a text file or a binary file
▪ These files may be read or written to by users with the appropriate access permission
▪ Regular files may be created, browsed through and modified by various means such as text editors or
compilers, and they can be removed by specific system commands
❖ Directory file
▪ It is like a folder that contains other files, including sub-directory files.
▪ It provides a means for users to organise their files into some hierarchical structure based on file
relationship or uses.
▪ Ex: /bin directory contains all system executable programs, such as cat, rm, sort
▪ A directory may be created in UNIX by the mkdir command
o Ex: mkdir /usr/foo/xyz
▪ A directory may be removed via the rmdir command
o Ex: rmdir /usr/foo/xyz
▪ The content of directory may be displayed by the ls command
❖ Device file
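▪ There are two kinds of device file. A block device file represents a device that transfers data a block at a time
(for example a hard disk). A character device file represents a device that transfers data a character at a time
(for example a terminal or a line printer).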
▪ A physical device may have both block and character device files representing it for
different access methods.
▪ An application program may perform read and write operations on a device file and the OS will
automatically invoke an appropriate device driver function to perform the actual data transfer between the
physical device and the application
▪ An application program in turn may choose to transfer data by either a character-based (via character device
file) or block-based (via block device file) method.
▪ A device file is created in UNIX via the mknod command
o Ex: mknod /dev/cdsk c 115 5
Here c denotes a character device file (b would denote a block device file), 115 is the major device number and 5 is
the minor device number.
▪ Major device number: the kernel uses the device file's major number to select and invoke a device driver function to carry out
actual data transfer with a physical device.
▪ Minor device number: an integer value passed as an argument to the device driver
function when it is called. It tells the device driver function which physical device it is talking to and
the I/O buffering scheme to be used for data transfer.
❖ FIFO file
▪ It is a special pipe device file which provides a temporary buffer for two or more processes to
communicate by writing data to and reading data from the buffer.
▪ The size of the buffer is fixed by the system; a write of up to PIPE_BUF bytes is guaranteed to be atomic.
▪ Data in the buffer is accessed in a first-in-first-out manner.
▪ The buffer is allocated when the first process opens the FIFO file for read or write
▪ The buffer is discarded when all processes close their references (stream pointers) to the FIFO file.
▪ Data stored in a FIFO buffer is temporary.
▪ A FIFO file may be created via the mkfifo command.
o The following command creates a FIFO file (if it does not exist)
mkfifo /usr/prog/fifo_pipe
o The following command also creates a FIFO file (if it does not exist), using mknod with the p argument
mknod /usr/prog/fifo_pipe p
▪ FIFO files can be removed using rm command.
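The mkfifo command has a C counterpart of the same name. The sketch below is only an illustration: ensure_fifo is a hypothetical helper name, and the permission value 0644 is an assumption chosen for the example.

```c
#include <errno.h>
#include <sys/stat.h>
#include <sys/types.h>

/* ensure_fifo is a hypothetical helper, not a library function.
 * It creates a FIFO at `path` unless one already exists there.
 * Returns 0 if a FIFO is present afterwards, -1 otherwise. */
int ensure_fifo(const char *path)
{
    struct stat sb;

    if (mkfifo(path, 0644) == -1 && errno != EEXIST)
        return -1;                         /* creation failed for another reason */
    if (stat(path, &sb) == -1)
        return -1;
    return S_ISFIFO(sb.st_mode) ? 0 : -1;  /* reject a non-FIFO with that name */
}
```

Calling ensure_fifo twice on the same path succeeds both times, which mirrors the "if it does not exist" wording used for the commands above.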
cd /home/user1/exam
No two files in a UNIX system can have identical pathnames. If two files have the same name, they must be in
different directories, which means their absolute pathnames are also different.
Relative Pathnames
In this the current directory is used as the point of reference and specifies the path relative to it. UNIX
allows the use of two symbols in pathnames that use the current and parent directory as the reference
point:
. (a single dot) This represents the current directory
.. (two dots) This represents the parent directory.
Pathnames that begin with either of these symbols are known as relative pathnames. Example:
cd ./progs
which refers to the subdirectory progs under your current directory.
Any number of .. components separated by / can be combined; for example, cd ../.. moves two levels up.
Whereas a directory name after a / moves one level down, each .. moves one level up.
ls: Listing Files
The command to list your directories and files is ls. With options it can provide information about the
size, type of file, permissions, dates of file creation, change and access.
Syntax
-l  Long listing: lists the attributes of a file, namely the mode, link information, owner, group, size and last
modification time. If the file is a symbolic link, an arrow (->) precedes the pathname of the linked-to
file.
The ls command is used to obtain a list of all filenames in the current directory. ls -l looks up each file's inode to
fetch its attributes. It lists seven attributes of all files in the current directory:
ls -l
total 179
-rw-r--r-- 1 prami prami 0 Oct 18 2021 03061200
-rw-r--r-- 2 prami prami 84 Oct 18 2021 1.c
-rw-r--r-- 1 prami prami 0 Oct 18 2021 10061200
-rw-r--r-- 1 prami prami 216 Dec 20 2021 10a.sh
-rw-r--r-- 1 prami prami 208 Dec 13 2021 10a.sh.save
-rw-r--r-- 1 prami prami 240 Dec 27 2021 11a.sh
Types and Permission- The first column of the first field shows the file type. Here we see three possible
values—a - (ordinary file), d (directory), or l (symbolic link). The remaining nine characters form a string
of permissions which can take the values r, w, x, and -
Links- The second field indicates the number of links associated with the file. UNIX lets a file have
multiple names, and each name is interpreted as a link.
Ownership and Group Ownership- Every file has an owner. The third field shows prami as the owner of
the files. A user also belongs to a group, and the fourth field shows prami as the group owner of
the files.
Size- The fifth field shows the file size in bytes. This actually reflects the character count and not the disk
space consumption of the file.
Last Modification time- The sixth field displays the last modification time in three columns, a time stamp
that is stored to the nearest second.
Filename- The last field displays the filename, which can be up to 255 characters long.
Listing Directory Attributes (-ld)- To see the attributes of a directory bar rather than the filenames it
contains, use ls -ld bar.
File Permissions
UNIX follows a three-tiered file protection system that determines a file‘s access rights.
Relative Permissions: chmod only changes the permissions specified in the command line and leaves the
other permissions unchanged. Its syntax is:
Examples: Initially,
Read permission: 4
Write permission: 2
Execute permission: 1
For a directory, the absence of a permission has the following effects:
• Write permission
Without it you can't create or remove files in the directory (cat >, rm, cp and mv fail) and you can't create or remove
subdirectories (mkdir and rmdir fail).
• Execute permission
Without it you can't access the directory at all (cd fails, and so do mkdir, rmdir and cat on anything inside it).
The umask is an octal number which has to be subtracted from the system default to obtain the actual default.
With a umask of 022, this becomes 644 (666 – 022) for ordinary files and 755 (777 – 022) for directories. When you create a
file on this system, it will have the permissions rw-r--r--. A directory will have the permissions rwxr-xr-x.
File ownership
There are two commands meant to change the ownership of a file or a directory: chown (changing the file
owner) and chgrp (changing the group owner).
• chown
$ ls -l note
-rwxr----x 1 kumar metal 347 may 10 20:30 note
$ chown sharma note; ls -l note
-rwxr----x 1 sharma metal 347 may 10 20:30 note
Once ownership of the file has been given away to sharma, the user file permissions that previously applied
to kumar now apply to sharma.
• chgrp
This command changes the file‘s group owner.
Inode number     File name
115              .
89               ..
201              xyz
346              a.out
201              xyz_ln1
▪ To access a file, for example /usr/divya, the UNIX kernel always knows the “/” (root) directory inode # of
any process. It will scan the “/” directory file to find the inode number of the usr file. Once it gets the usr
file inode #, it accesses the contents of usr file. It then looks for the inode # of divya file.
▪ Whenever a new file is created in a directory, the UNIX kernel allocates a new entry in the inode table to
store the information of the new file
▪ It will assign a unique inode # to the file and add the new file name and inode # to the directory file that
contains it.
Application Program Interface to Files
The general interfaces to the files on UNIX and POSIX system are
▪ Files are identified by pathnames.
▪ Files should be created before they can be used. The various commands and system calls to create files
are listed below.
▪ Before an application can access a file, the file must be opened. Generally the open system call is used to open a
file, and the returned value is an integer called the file descriptor.
▪ There is a limit on the number of files a process may have open: at most OPEN_MAX files can be open at a time.
The value is defined in the <limits.h> header.
▪ The data transfer function on any opened file is carried out by read and write system call.
▪ File hard links can be increased by link system call, and decreased by unlink system call.
▪ File attributes can be changed by chown, chmod and link system calls.
▪ File attributes can be queried (found out or retrieved) by stat and fstat system call.
▪ UNIX and POSIX.1 define a structure of data type stat, which is declared in the <sys/stat.h> header file. It
contains the user-accessible attributes of a file. The definition of the structure can differ among
implementations, but it could look like
struct stat
{
dev_t st_dev; /* file system ID */
ino_t st_ino; /* file inode number */
mode_t st_mode; /* contains file type and permission */
nlink_t st_nlink; /* hard link count */
uid_t st_uid; /* file user ID */
gid_t st_gid; /* file group ID */
dev_t st_rdev; /*contains major and minor device#*/
off_t st_size; /* file size in bytes */
time_t st_atime; /* last access time */
time_t st_mtime; /* last modification time */
time_t st_ctime; /* last status change time */
};
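As an illustration of how these members are read, the sketch below queries two of them through the stat system call. The names file_size and link_count are hypothetical helpers introduced only for this example.

```c
#include <sys/stat.h>
#include <sys/types.h>

/* file_size and link_count are hypothetical helpers, not library functions. */

/* Return the size in bytes of the file at `path`, or -1 if stat fails. */
long long file_size(const char *path)
{
    struct stat sb;

    if (stat(path, &sb) == -1)
        return -1;
    return (long long)sb.st_size;
}

/* Return the hard link count of the file at `path`, or -1 if stat fails. */
long link_count(const char *path)
{
    struct stat sb;

    if (stat(path, &sb) == -1)
        return -1;
    return (long)sb.st_nlink;
}
```

The same pattern works with fstat when an open file descriptor is available instead of a path name.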
▪ The file table entry will be set to point to the inode table entry, where the inode record of the file is stored.
▪ The file table entry will contain the current file pointer of the open file. This is an offset from the beginning
of the file where the next read or write will occur.
▪ The file table entry will contain an open mode that specifies that the file opened is for read only, write only
or read and write etc. This should be specified in open function call.
▪ The reference count (rc) in the file table entry is set to 1. The reference count keeps track of how
many file descriptors, from any process, refer to the entry.
▪ The reference count of the in-memory inode of the file is increased by 1. This count specifies how many
file table entries are pointing to that inode.
If the kernel cannot find an unused entry in either the file descriptor table (1) or the file table (2), the open
system call returns -1 (failure/error).
[Figure: Data Structure for File Manipulation]
Normally the reference count in the file table entry is 1. It can be increased with the fork, dup and dup2
system calls. When an open system call succeeds, its return value
is an integer (the file descriptor). Whenever the process wants to read data from or write data to the file, it should
pass the file descriptor as one of the arguments.
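One visible consequence of dup sharing a file table entry is that both descriptors see the same file pointer. The following sketch demonstrates this under that assumption; shared_offset_after_write is a hypothetical helper name, and the comments about rc describe the kernel bookkeeping explained above rather than anything the program can read directly.

```c
#include <fcntl.h>
#include <unistd.h>

/* shared_offset_after_write is a hypothetical helper, not a library function.
 * It opens `path`, duplicates the descriptor, writes 3 bytes through the
 * original and returns the offset seen through the duplicate (-1 on error). */
long shared_offset_after_write(const char *path)
{
    int fd, fd2;
    off_t pos;

    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd == -1)
        return -1;

    fd2 = dup(fd);                 /* same file table entry, rc becomes 2 */
    if (fd2 == -1) {
        close(fd);
        return -1;
    }

    if (write(fd, "abc", 3) != 3) {
        close(fd);
        close(fd2);
        return -1;
    }
    pos = lseek(fd2, 0, SEEK_CUR); /* the offset lives in the shared entry */

    close(fd);                     /* rc drops to 1, the entry stays in use */
    close(fd2);                    /* rc drops to 0, the entry is released */
    return (long)pos;
}
```

Two independent open calls on the same file would instead get two file table entries, each with its own file pointer.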
The following events will occur whenever a process calls the close function to close the files that are
opened.
1. The kernel sets the corresponding file descriptor table entry to be unused.
2. It decrements the rc in the corresponding file table entry by 1; if the rc is not equal to 0, go to step 6.
3. The file table entry is marked as unused.
4. The rc in the corresponding file inode table entry is decremented by 1; if the rc value is not equal to 0, go to step
6.
5. If the hard link count of the inode is not zero, it returns to the caller with a success status; otherwise
it marks the inode table entry as unused and de-allocates all the physical disk storage of the file.
6. It returns to the process with a 0 (success) status.
Directory Files
▪ It is a record-oriented file
▪ Each record contains the information of a file residing in that directory
▪ The record data type is struct dirent in UNIX System V and POSIX.1 and struct direct in BSD UNIX.
▪ The record content is implementation-dependent
▪ They all contain 2 essential member fields
o File name
o Inode number
▪ Usage is to map file names to corresponding inode number
ln /usr/divya/abc /usr/raj/xyz
If the cp command is used, the data contents will be identical but the two files will be separate objects
in the file system, whereas with ln -s the data of the new file will contain only the path name of the original.
Limitations of hard link:
1. A user cannot create hard links to directories unless he has super-user privileges.
2. A user cannot create a hard link that refers to a file on a different file system,
because an inode number is unique only within a file system.
Differences between hard link and symbolic link are listed below:
wc: This command counts lines, words and characters, depending on the options used.
Many files contain non-printing characters, and most UNIX commands don't display them properly.
To make these characters visible, use od (octal dump), which displays the octal value of each byte of a file's
contents.
The -b option displays this value for each character separately.
od -b filename
Each line displays 16 bytes of data in octal, preceded by the position in the file of the first byte in the line.
The -c option combined with -b gives output with the octal representations on the first line
and the printable characters and escape sequences on the line below.
od -bc filename
pr: Paginating Files
This command prepares a file for printing by adding suitable headers, footers and formatted text.
$ cat dept.lst
01|accounts|6213
02|progs|5423
03|marketing|6521
04|personnel|2365
05|production|9876
06|sales|1006
pr adds five lines of margin at
the top and bottom of each page. The header shows the date and time of last modification of the file along
with the filename and page number.
$ pr dept.lst
$ cat filename
cat is short for concatenate, which means to join together. This utility is used most often to display the
contents of a single file. You may also use the cat command to display the contents of several files in
succession.
In that case the file names should be separated by a blank space.
$ mkdir ABC
mkdir: Cannot create directory ABC : File exists
This error is reported when the directory already exists.
You may also use the mkdir command to create several directories in succession.
$ mkdir XYZ XYZ/a XYZ/b
Creates a directory XYZ and subdirectories a and b under XYZ.
Everything is treated as a file in UNIX, hence ls will list all types of files.
From the names alone it is difficult to make out which is an ordinary file, which is a device file and so on. For this, ls supports the
-l option, which lists the files in long format. We call it a long listing of files.
$ cp -r cse ise
Copies the directory cse, with all its files and subdirectories, to ise.
UNIX allows the user to change the default permissions that are assigned. The precondition, however, is that
you must be the owner of the file. Unless you own the file, you cannot change the permissions
assigned to it.
The chmod command is the key to UNIX permission modes, which provide a simple yet effective method for controlling
access to files. Whenever a file is created, the system assigns default access permissions to the file.
The owner can change these permissions with the help of the chmod command.
mode: r → read
w → write
x → execute
$ ls -l filename
$ ls -l filename
Note: Modification time has not been changed. This is because changing the access permissions does
not modify the contents of file. The modification time is changed only if the file’s contents are modified
by write operation.
There is one more format, called the absolute format, which is based on octal numbers (digits 0 through 7).
All octal values for read, write and execute modes are as follows:
Read → 4
Write → 2
Execute→ 1
To express the ways in which you want a particular file to be accessed, simply add the octal values that
correspond to the individual types of permission (i.e. read, write, execute).
No access = 0
Read access only = 4
Read and execute access = 4 + 1 = 5
Read and write access = 4 + 2 = 6
Read and write and execute access = 4 + 2 + 1 = 7
Finally, the resulting octal digits are written as a group of three, which indicate the desired access
modes for the file owner, the group owner and other users, in that order.
$ ls -l filename
-rw-rw-rw- 1 nhce root 62 Jul 9 9:35 filename
$ ls -l filename
-rw------- 1 nhce root 62 Jul 9 9:35 filename
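The change from rw-rw-rw- to rw------- in the two listings corresponds to the absolute mode 600. As a rough sketch of the same operation from a C program, the chmod system call takes the octal mode directly. The helper names octal_mode and set_and_read_mode are hypothetical and exist only for this example.

```c
#include <sys/stat.h>
#include <sys/types.h>

/* octal_mode and set_and_read_mode are hypothetical helpers. */

/* Combine one octal digit (0-7) per category into a permission mode,
 * e.g. octal_mode(6, 4, 4) gives 0644. */
mode_t octal_mode(int owner, int group, int others)
{
    return (mode_t)((owner << 6) | (group << 3) | others);
}

/* Apply `mode` to `path` and return the permission bits read back
 * from the inode, or -1 on error. */
int set_and_read_mode(const char *path, mode_t mode)
{
    struct stat sb;

    if (chmod(path, mode) == -1 || stat(path, &sb) == -1)
        return -1;
    return (int)(sb.st_mode & 0777);
}
```

As with the chmod command, the call succeeds only for the owner of the file or the super-user.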
umask
New files and directories are created with a default set of permissions. For directories,
the base permissions are 0777 (rwxrwxrwx) and for files they are 0666 (rw-rw-rw-).
The kernel restricts the default permissions of files and directories by applying a permission
mask called the umask. This is an octal number whose set bits are removed from the base permissions; for the
usual masks this gives the same result as subtracting it.
$ umask
0002
The default umask for a normal user on this system is 0002. With this mask, default directory permissions are 0775 and
default file permissions are 0664.
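The arithmetic can be checked with a few lines of C. This is a sketch of the rule only, not of the kernel code: default_mode is a hypothetical helper that clears the umask bits from the base permissions.

```c
#include <sys/types.h>

/* default_mode is a hypothetical helper, not a library function.
 * It returns the permissions a new file or directory receives when the
 * base permissions `base` are restricted by the mask `mask`. */
mode_t default_mode(mode_t base, mode_t mask)
{
    return base & ~mask & 0777;    /* clear every bit that is set in the mask */
}
```

For the values quoted above, default_mode(0666, 0002) is 0664 and default_mode(0777, 0002) is 0775. The bitwise form also covers masks where plain subtraction would mislead: default_mode(0666, 0033) is 0644, not 0633.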
The touch command is used to change the timestamps of a file (i.e., the dates and times of the most recent access and
modification).
When used without any options, touch creates new files for any file names that are provided
as arguments if files with such names do not already exist. touch can create any number of files simultaneously.
Options restrict what is changed. For example, the -a option changes only the access time, while the -m option changes
only the modification time.
The use of both of these options together changes both the access and modification times to the current
time.
Tar command
The GNU tar (short for Tape ARchiver) command is the most widely used archiving utility in Linux systems.
Available directly in the terminal, the tar command helps create, extract, and list archive contents.
The utility is simple and has many helpful options for compressing files, managing backups,
or extracting a raw installation.
• Operation mode indicates which operation executes on the files (creation, extraction, etc.).
• Options modify the operation mode and are not necessary. There is no limit on the number of options.
• The archive is the file name and extension.
• The file name(s) is a space-separated list for extraction or compression or wildcard matched name.
There are three possible syntax styles to use the operations and options:
For example:
2. UNIX short option style, using a single dash and clustered options:
du command
The du command is a standard Linux/Unix command that allows a user to gain disk usage information quickly.
It is best applied to specific directories and allows many variations for customizing the output to meet your needs.
As with most commands, the user can take advantage of many options or flags. Also, like many Linux commands,
most users only use the same two or three flags to meet their specific set of needs. The aim here is to introduce the
basic flags that people use, but also to look at some that are less common in hopes of improving our use of du.
Let's first look at the standalone command, and then add in various options.
-h, --human-readable
The -h flag prints sizes in a human-readable format, appending a unit of measure (K, M, G) to each value.
If we run the du -h command on the same directory, we see that the 12, 36,
and 48 values are in KB.
48K total
Notice the bottom line here. The same information is displayed that is shown in the other examples
of du but without the 'total' banner to remind you.
--exclude=PATTERN
The --exclude option is useful if you know that your environment has a large number of a
certain type of file that you do not wish to include in your findings, for example large amounts of
metadata files with the same file extension. (The related short option -X FILE reads the exclude patterns
from FILE.) Here is the syntax and an example.
[tcarrigan@rhel]$ du -ah --exclude="*.dll" /home/tcarrigan/article_submissions
This command lists usage information for all files and directories in a human-readable format while excluding
any file with the extension .dll.
Wrap up and man page
It is easy to get into the routine of only ever running du -h and to forget the other powerful flags
of the du utility. The flags not covered here are described on the manual page for this command.
To access the man page, simply run man du
df command
The df command displays the amount of disk space available on the file system containing each file name
argument.
• If no file name is passed as an argument, df shows the space available on all
currently mounted file systems.
• df cannot show the space available on
unmounted file systems, because doing this on some systems requires very deep
knowledge of file system structures.
• By default, df shows the disk space in 1 K blocks.
• df displays values in units of the first available SIZE from the --block-size option and from
the DF_BLOCK_SIZE, BLOCK_SIZE and BLOCKSIZE environment variables.
• Otherwise, units default to 1024 bytes (or 512 bytes if POSIXLY_CORRECT is set). Here, SIZE is an
integer and optional unit; the units are K, M, G, T, P, E, Z, Y (K as in kilo).
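A program can obtain the same kind of figure for a mounted file system through the POSIX statvfs call. The sketch below is an illustration, not how df itself is written; available_kblocks is a hypothetical helper name.

```c
#include <sys/statvfs.h>

/* available_kblocks is a hypothetical helper, not a library function.
 * It returns the space available to unprivileged users on the file system
 * containing `path`, in 1 K blocks, or -1 if statvfs fails. */
long long available_kblocks(const char *path)
{
    struct statvfs vfs;

    if (statvfs(path, &vfs) == -1)
        return -1;
    /* f_bavail counts free blocks of f_frsize bytes each. */
    return (long long)vfs.f_bavail * (long long)vfs.f_frsize / 1024;
}
```

Like df, this works only for a path on a mounted file system, because the call needs a path name to locate it.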
df Syntax :
When a file's contents are changed, its last modification time is updated by the kernel. ls -l shows this time
for a file.
A file's access time is the last time someone read, wrote or executed the file. ls -lu shows the
access time.
touch: Changing the timestamps
touch is used to update the access date and/or modification date of a file or directory. In its default usage, it
is the equivalent of creating or opening a file and saving it without any change to the file content. It
simply updates the dates associated with the file or directory. The simplest use case for touch is this:
$ touch myfile.txt
When used without options or an expression, both times are set to the current time. touch creates the file if it doesn't
exist, but does not overwrite it if it does.
With the -m option only the modification time is altered; with the -a option only the access time is altered.
Either can be combined with -t to supply an explicit time.
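Setting an explicit time stamp can also be done from C. The following sketch uses the older utime interface because it is short; newer code often prefers utimensat. The helper name set_file_times is hypothetical.

```c
#include <sys/stat.h>
#include <sys/types.h>
#include <utime.h>

/* set_file_times is a hypothetical helper, not a library function.
 * It sets both the access and the modification time of `path` to `when`
 * (seconds since the epoch) and returns the modification time read back
 * from the inode, or -1 on error. */
long set_file_times(const char *path, time_t when)
{
    struct utimbuf times;
    struct stat sb;

    times.actime = when;           /* new access time */
    times.modtime = when;          /* new modification time */
    if (utime(path, &times) == -1 || stat(path, &sb) == -1)
        return -1;
    return (long)sb.st_mtime;
}
```

Unlike touch, this sketch does not create the file when it is missing; utime simply fails in that case.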
Hard Links: Refer to the specific location of physical data. Cannot be used to create a link for a directory.
Cannot be used to link files in a different file system.
To create a hard link, enter the following command:
ln {target-filename} {hardlink-filename}
To create a symbolic link, enter the following command:
ln -s {target-filename} {symbolic-filename}
For example to create softlink for /webroot/home/httpd/test.com/index.php as
/home/vivek/index.php, enter the following command:
ln -s /webroot/home/httpd/test.com/index.php /home/vivek/index.php
ls -l
open
• This is used to establish a connection between a process and a file, i.e. it is used to open an existing file for
data transfer, or else it may be used to create a new file.
• The returned value of the open system call is the file descriptor, an index into the file descriptor table of the
process; that entry leads, through the file table, to the inode information of the file.
• The prototype of open function is
#include <sys/types.h>
#include <fcntl.h>
int open(const char *pathname, int accessmode, mode_t permission);
• If successful, open returns a nonnegative integer representing the open file descriptor.
• If unsuccessful, open returns –1.
• The first argument is the name of the file to be created or opened. This may be an absolute pathname or
relative pathname.
• If the given pathname is symbolic link, the open function will resolve the symbolic link reference to a non
symbolic link file to which it refers.
• The second argument is access modes, which is an integer value that specifies how actually the file should
be accessed by the calling process.
• Generally the access modes are specified in <fcntl.h>. Various access modes are:
O_RDONLY - Open the file for reading only.
O_WRONLY - Open the file for writing only.
O_RDWR - Open the file for reading and writing.
There are other access modes, which are termed access modifier flags. One or more of the following
can be specified by bitwise-ORing them with one of the above access mode flags to alter the access
mechanism of the file.
O_APPEND - Append data to the end of file.
O_CREAT - Create the file if it doesn't exist.
O_EXCL - Generate an error if O_CREAT is also specified and the file already exists.
O_TRUNC - If the file exists, discard the file content and set the file size to zero bytes.
O_NONBLOCK - Specify that subsequent reads or writes on the file should be nonblocking.
O_NOCTTY - Specify not to use the terminal device file as the controlling terminal of the calling process.
• To illustrate the use of the above flags, the following example statement opens a file called /usr/usp for
read and write in append mode:
int fd = open("/usr/usp", O_RDWR | O_APPEND, 0);
• If the file is opened read-only, modifier flags that change the file, such as O_APPEND and O_TRUNC, should not be used.
• If a file is opened write-only or read-write, any of the modifier flags may be used along with
the access mode.
• The third argument is used only when a new file is being created. The symbolic names for file permission
are given in the table in the previous page.
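A common combination of these flags is O_CREAT with O_EXCL, which creates a file only if it does not exist yet. The sketch below shows one way to use it; create_exclusive is a hypothetical helper name and the permission 0644 is an assumption for the example.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* create_exclusive is a hypothetical helper, not a library function.
 * It creates `path` only if it does not exist yet.
 * Returns 0 on creation, 1 if the file already existed, -1 on other errors. */
int create_exclusive(const char *path)
{
    /* The third argument is used because O_CREAT may create a new file. */
    int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0644);

    if (fd == -1)
        return (errno == EEXIST) ? 1 : -1;
    close(fd);
    return 0;
}
```

Because the existence check and the creation happen inside one system call, two processes calling this at the same time cannot both get the return value 0.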
creat
• This system call is used to create new regular files. The prototype of creat is
#include <sys/types.h>
#include <fcntl.h>
int creat(const char *pathname, mode_t mode);
read
• The read function fetches a fixed-size block of data from a file referenced by a
given file descriptor.
• The prototype of read function is:
#include <sys/types.h>
#include <unistd.h>
ssize_t read(int fdesc, void *buf, size_t nbyte);
• If successful, read returns the number of bytes actually read.
• If unsuccessful, read returns –1.
• The first argument is an integer, fdesc that refers to an opened file.
• The second argument, buf is the address of a buffer holding any data read.
• The third argument specifies how many bytes of data are to be read from the file.
• The size_t data type is defined in the <sys/types.h> header and is an unsigned integer type (the same as unsigned int
on many 32-bit systems).
• There are several cases in which the number of bytes actually read is less than the amount requested:
o When reading from a regular file, if the end of file is reached before the requested number of bytes has
been read. For example, if 30 bytes remain until the end of file and we try to read 100 bytes, read returns
30. The next time we call read, it will return 0 (end of file).
o When reading from a terminal device. Normally, up to one line is read at a time.
o When reading from a network. Buffering within the network may cause less than the requested amount to
be returned.
o When reading from a pipe or FIFO. If the pipe contains fewer bytes than requested, read will return only
what is available.
write
• The write system call is used to write data into a file.
• The write function puts a fixed-size block of data into a file referenced by a given
file descriptor.
• The prototype of write is
#include <sys/types.h>
#include <unistd.h>
ssize_t write(int fdesc, const void *buf, size_t size);
• If successful, write returns the number of bytes actually written.
• If unsuccessful, write returns –1.
• The first argument, fdesc is an integer that refers to an opened file.
• The second argument, buf is the address of a buffer that contains data to be written.
• The third argument, size specifies how many bytes of data are in the buf argument.
• The return value is usually equal to the size argument; a smaller value means that only part of the data was written.
close
• The close system call is used to terminate the connection to a file from a process.
• The prototype of the close is
#include <unistd.h>
int close(int fdesc);
fcntl
• The fcntl function helps a user to query or set the access control flags and the close-on-exec flag of any file descriptor.
• The prototype of fcntl is
#include<fcntl.h>
int fcntl(int fdesc, int cmd, …);
• The first argument is the file descriptor.
• The second argument cmd specifies what operation has to be performed.
• The third argument is dependent on the actual cmd value. The possible cmd values are defined in
<fcntl.h> header.
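cmd value    Use
F_GETFL      Returns the access control flags of the file descriptor fdesc
F_SETFL      Sets or clears the access control flags given in the third argument (for example O_APPEND, O_NONBLOCK)
F_GETFD      Returns the close-on-exec flag of the file descriptor fdesc
F_SETFD      Sets or clears the close-on-exec flag of fdesc; the third argument is 0 (clear) or 1 (set)
F_DUPFD      Duplicates fdesc; the third argument is the lowest acceptable value for the new descriptor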
int cur_flags=fcntl(fdesc,F_GETFL);
The following example reports the close-on-exec flag of fdesc, sets it to on afterwards:
printf("%d close-on-exec: %d\n",fdesc,fcntl(fdesc,F_GETFD)); //report the flag
(void)fcntl(fdesc,F_SETFD,FD_CLOEXEC); //turn on close-on-exec flag
The following statements change the standard input of a process to a file called FOO:
int fdesc=open("FOO",O_RDONLY); //open FOO for read
close(0); //close standard input
if(fcntl(fdesc,F_DUPFD,0)==-1) //descriptor 0 now refers to FOO
    perror("fcntl");
The dup and dup2 functions in UNIX perform the same file duplication function as fcntl. They can be
implemented using fcntl as:
lseek
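• The lseek function is also used to change the file offset to a different value.
• Thus lseek allows a process to perform random access of data on any opened file.
• The prototype of lseek is
#include<sys/types.h>
#include<unistd.h>
off_t lseek(int fdesc, off_t pos, int whence);
• If successful, lseek returns the new file offset; if unsuccessful, it returns –1.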
link
• The link function creates a new link for an existing file.
• The prototype of the link function is
#include <unistd.h>
int link(const char *cur_link, const char *new_link);
• If successful, the link function returns 0.
• If unsuccessful, link returns –1.
• The first argument, cur_link, is the pathname of an existing file.
• The second argument, new_link, is a new pathname to be assigned to the same file.
• If this call succeeds, the hard link count will be increased by 1.
• The UNIX ln command is implemented using the link API.
unlink
• The unlink function deletes a link of an existing file.
• This function decreases the hard link count attributes of the named file, and removes the file name entry of
the link from directory file.
• A file is removed from the file system when its hard link count is zero and no process has any file descriptor
referencing that file.
• The prototype of unlink is
#include <unistd.h>
int unlink(const char *cur_link);
• If successful, the unlink function returns 0.
• If unsuccessful, unlink returns –1.
• The argument cur_link is a path name that references an existing file.
• ANSI C defines the rename function, which performs a similar operation: it gives a file a new path name and removes the old one.
• The prototype of the rename function is:
#include<stdio.h>
int rename(const char * old_path_name,const char * new_path_name);
stat, fstat
• The stat and fstat functions retrieve the file attributes of a given file.
• The only difference between stat and fstat is that the first argument of stat is a file pathname, whereas
the first argument of fstat is a file descriptor. The prototypes of these functions are
#include<sys/stat.h>
#include<unistd.h>
int stat(const char *pathname, struct stat *statv);
int fstat(int fdesc, struct stat *statv);
• The second argument to stat and fstat is the address of a struct stat-typed variable which is defined in the
<sys/stat.h> header.
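• Its declaration is as follows (main POSIX fields; the exact layout is system dependent):
struct stat
{
dev_t st_dev; /* file system ID */
ino_t st_ino; /* file inode number */
mode_t st_mode; /* file type and access flags */
nlink_t st_nlink; /* hard link count */
uid_t st_uid; /* file user ID */
gid_t st_gid; /* file group ID */
dev_t st_rdev; /* major and minor device numbers */
off_t st_size; /* file size in bytes */
time_t st_atime; /* last access time */
time_t st_mtime; /* last modification time */
time_t st_ctime; /* last status change time */
};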
• We can determine the file type with the macros as shown.
macro Type of file
S_ISREG() regular file
S_ISDIR() directory file
S_ISCHR() character special file
S_ISBLK() block special file
S_ISFIFO() pipe or FIFO
S_ISLNK() symbolic link
S_ISSOCK() socket
access
• The access system call checks the existence and access permission of user to a named file.
• The prototype of access function is:
#include<unistd.h>
int access(const char *path_name, int flag);
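• On success, access returns 0; on failure, it returns –1. The first argument is the pathname of a file.
• The second argument, flag, contains one or more of the following bit flags:
Bit flag Use
F_OK Checks whether the named file exists
R_OK Checks whether the calling process has read permission
W_OK Checks whether the calling process has write permission
X_OK Checks whether the calling process has execute permission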
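chmod, fchmod
• The chmod and fchmod functions change the access permissions of a file. Their prototypes are
#include<sys/types.h>
#include<sys/stat.h>
#include<unistd.h>
int chmod(const char *pathname, mode_t flag);
int fchmod(int fdesc, mode_t flag);
chown, fchown
• The chown and fchown functions change the user ID and group ID of a file. Their prototypes are
#include<sys/types.h>
#include<unistd.h>
int chown(const char *path_name, uid_t uid, gid_t gid);
int fchown(int fdesc, uid_t uid, gid_t gid);
o The path_name argument is the path name of a file.
o The uid argument specifies the new user ID to be assigned to the file.
o The gid argument specifies the new group ID to be assigned to the file.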
utime Function
• The utime function modifies the access time and the modification time stamps of a file.
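• The prototype of the utime function is
#include<sys/types.h>
#include<unistd.h>
#include<utime.h>
int utime(const char *path_name, const struct utimbuf *times);
• The struct utimbuf is defined in the <utime.h> header as
struct utimbuf
{
time_t actime; /* access time */
time_t modtime; /* modification time */
};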
• The time_t datatype is an integer type (typically a long) and its data is the number of seconds elapsed since the birthday
of UNIX: 00:00:00 UTC, January 1, 1970.
• If the times (variable) is specified as NULL, the function will set the named file access and modification
time to the current time.
• If the times (variable) is an address of the variable of the type struct utimbuf, the function will set the file
access time and modification time to the value specified by the variable.
• If a process sets a read lock on a file, for example from address 0 to 256, and then sets a write lock on the file
from address 0 to 512, the process will own only one write lock on the file from 0 to 512. The previous read
lock from 0 to 256 is now covered by the write lock, and the process does not own two locks on the region
from 0 to 256. This process is called “Lock Promotion”.
• Furthermore, if the process now unlocks the file from 128 to 480, it will own two write locks on the file:
one from 0 to 127 and the other from 481 to 512. This process is called “Lock Splitting”.
• UNIX systems provide fcntl function to support file locking. By using fcntl it is possible to impose read or
write locks on either a region or an entire file.
• The prototype of fcntl is
#include<fcntl.h>
int fcntl(int fdesc, int cmd_flag, ...);
• For file locking purposes, the third argument to fcntl is the address of a struct flock type variable.
• This variable specifies a region of a file where lock is to be set, unset or queried.
struct flock
{
short l_type; /* what lock to be set or to unlock file */
short l_whence; /* Reference address for the next field */
off_t l_start ; /*offset from the l_whence reference addr*/
off_t l_len ; /*how many bytes in the locked region */
pid_t l_pid ; /*pid of a process which has locked the file */
};
l_type value Use
F_RDLCK Set a read lock on a specified region
F_WRLCK Set a write lock on a specified region
F_UNLCK Unlock a specified region
• The l_whence, l_start & l_len define a region of a file to be locked or unlocked.
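• The possible values of l_whence and their uses are
l_whence value Use
SEEK_CUR The l_start value is added to the current file pointer address
SEEK_SET The l_start value is added to byte 0 of the file
SEEK_END The l_start value is added to the end (current size) of the file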
• A lock set by the fcntl API is an advisory lock, but fcntl can also be used for mandatory locking if
the following attributes of the file are set before calling fcntl:
• Turn on the set-GID flag of the file.
• Turn off the group execute right permission of the file.
• In the given example program we have performed a read lock on a file “usp” from the 10th byte to 25th
byte.
Example Program
• To allow a process to scan directories in a file system independent manner, a directory record is defined as
struct dirent in the <dirent.h> header for UNIX.
• Some of the functions that are defined for directory file operations in the above header are
#include<sys/types.h>
#include<dirent.h>
DIR *opendir(const char *path_name);
struct dirent *readdir(DIR *dir_fdesc);
• The uses of these functions are
Function Use
opendir Opens a directory file for read-only. Returns a file handle DIR * for future reference of the file.
readdir Reads a record from a directory file referenced by dir_fdesc and returns that record information.
#include<sys/types.h>
#else
• UNIX systems have defined additional functions for random access of directory file records.
Function Use
telldir Returns the file pointer of a given dir_fdesc
seekdir Changes the file pointer of a given dir_fdesc to a specified address
• Device files are used to interface physical device with application programs.
• To create a device file, a process must have superuser privileges and must call the mknod API.
• The user ID and group ID attributes of a device file are assigned in the same manner as for regular files.
• When a process reads or writes to a device file, the kernel uses the major and minor device numbers of a
file to select a device driver function to carry out the actual data transfer.
• Device file support is implementation dependent. UNIX System defines the mknod API to create device
files.
• The prototype of mknod is
#include<sys/stat.h>
#include<unistd.h>
int mknod(const char *path_name, mode_t mode, dev_t device_id);
symlink, readlink, lstat
• The prototypes of the symlink, readlink and lstat functions are
#include<sys/types.h>
#include<sys/stat.h>
#include<unistd.h>
int symlink(const char *org_link, const char *sym_link);
ssize_t readlink(const char *sym_link, char *buf, size_t size);
int lstat(const char *sym_link, struct stat *statv);
• The org_link and sym_link arguments to a symlink call specify the original file path name and the
symbolic link path name to be created.
Module 2
Laboratory Component: (minimum 3 experiments / programs)
1. Execute the "ls" command to display comprehensive file attributes with all
available options, view the file's contents, perform file copying and moving
operations between locations, and subsequently remove the file.
i) $ cat filename
cat is short form of concatenate, which means to join together. This utility is used most
often to display contents of single file. You may also use the cat command to display the
contents of several files in succession. In that case files should be separated by a blank
space.
You can also create a file using cat.
viii) tar
$ tar -cvf archive.tar file1.txt file2.txt my_directory/
Archives multiple files and directories into a single archive file. The -c option creates the
archive, -f names the archive file, and -v makes tar list each file as it is added.
3. Identify commands for adjusting user, group, and others' permissions using
symbolic and octal notation, create files using the "touch" command, modify access
and modification timestamps, and alter default permissions for files or directories
using "umask".
i) chmod: Change permission mode. UNIX allows the user to change the default
permissions that are assigned. The precondition, however, is that you must be the
owner of the file. Unless you own the file, you cannot change the
permissions assigned to the file.
This command is the key to UNIX permission modes, which provides a simple yet effective
method for controlling access to files. Whenever a file is created, system assigns its default
access permissions to the file. The owner can change these permissions with help of chmod
command.
Note: Modification time has not been changed. This is because changing the access
permissions does not modify the contents of the file. The modification time is changed only if the
file’s contents are modified by a write operation. Specifying permissions with letters
(u, g, o, a and r, w, x) is called the symbolic format.
The other format is the absolute format, which is based on octal numbers
(digits 0 through 7).
ii) touch: change and modify the timestamp
The touch command is used to change the timestamps (i.e., dates and times of the
most recent access and modification) on existing files and directories.
$ touch filename
When used without any options, touch creates new files for any file names that are
provided as arguments (i.e., input data) if files with such names do not already exist.
touch can create any number of files simultaneously.
Options select which timestamp is changed. For example, the -a option changes only the access time, while the -m option changes only
the modification time. The use of both of these options together changes both the access
and modification times to the current time.
iii) umask
New files and directories are created with a default set of permissions. For directories, the
base permissions are 0777 (rwxrwxrwx) and for files they are 0666 (rw-rw-rw-).
The kernel restricts the default permissions on files and directories by applying a
permission mask called the umask. This is an octal number whose set bits are turned off in
the default permission; for common masks this gives the same result as subtracting the umask.
$ umask 0002
With this mask, new files are created with mode 0664 and new directories with mode 0775.
Sample Questions
1) Explain Linux file system.
2) What are the file attributes available in Linux file system?
3) Explain file types.
4) Explain how to enable file permissions.
5) Explain the chmod command.
6) Explain the differences between a hard link and a symbolic link.
7) Explain commonly used tar operations and options.
8) List the steps to create an archive.
9) What are the uses of the df command?
10) Explain the write system call.