CH 13


Chapter 13: File-System Interface

Operating System Concepts – 10th Edition, Silberschatz, Galvin and Gagne


Chapter 13: File-System Interface

 File Concept
 Access Methods
 Disk and Directory Structure
 File-System Mounting
 File Sharing
 Protection

Objectives

 To explain the function of file systems
 To describe the interfaces to file systems
 To discuss file-system design tradeoffs,
including access methods, file sharing, file
locking, and directory structures
 To explore file-system protection

File System Components
Physical Reality                             File System Abstraction
Block oriented                               Byte oriented
Physical sector #’s                          Named files
No protection                                Users protected from each other
Data might be corrupted if machine crashes   Robust to machine failures

 Disk management: how to arrange a collection of disk
blocks into files
 Naming: the user gives a file name, not "track 50, platter 5",
etc.
 Protection: keep information secure
 Reliability/durability: when the system crashes, whatever is
in memory is lost, but files should be durable.

User vs. System View of a File
 User’s view:
 Durable data structures – executable com file (static data
region, relocation table, code, etc.)
 Memory-mapped files: operations are reads/writes to memory
 Serialization (also pointer swizzling, marshalling)
 System’s view (system call interface):
 Collection of bytes (UNIX)
 System’s view (inside OS):
 Collection of blocks
 A block is the logical transfer unit, while a sector is the
physical transfer unit.
 Block size >= sector size
 In UNIX, the block size is typically 4 KB.

Translating from user to system view
 What happens if user says: give me bytes 2 – 12?
 a. Fetch block corresponding to those bytes
 b. Return just the correct portion of the block
 What about: write bytes 2 – 12?
 a. Fetch block
 b. Modify portion
 c. Write out block
 Everything inside the file system is in whole-size blocks.
 For example, getc and putc buffer 4096 bytes, even though the
interface is one byte at a time.
 From now on, a file is a collection of blocks.
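
The fetch / modify / write-out steps above can be sketched in Java as follows. This is only an in-memory illustration, not how any particular kernel does it: the class name BlockFile, the byte-array "disk", and the transfer counters are assumptions made for the example; the 4096-byte block size is the one mentioned on the slide.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class BlockFile {
    static final int BLOCK_SIZE = 4096;      // block size used on the slide
    private final byte[][] blocks;           // hypothetical stand-in for the disk
    int blockReads = 0;                      // whole-block transfers from "disk"
    int blockWrites = 0;                     // whole-block transfers to "disk"

    BlockFile(int numBlocks) {
        blocks = new byte[numBlocks][BLOCK_SIZE];
    }

    private byte[] fetchBlock(int n) {       // always moves a whole block
        blockReads++;
        return Arrays.copyOf(blocks[n], BLOCK_SIZE);
    }

    private void storeBlock(int n, byte[] data) {
        blockWrites++;
        blocks[n] = Arrays.copyOf(data, BLOCK_SIZE);
    }

    /** Read bytes first..last (inclusive), as in "give me bytes 2 - 12". */
    byte[] read(int first, int last) {
        byte[] out = new byte[last - first + 1];
        for (int pos = first; pos <= last; ) {
            int blockNo = pos / BLOCK_SIZE;
            int offset = pos % BLOCK_SIZE;
            int count = Math.min(BLOCK_SIZE - offset, last - pos + 1);
            byte[] block = fetchBlock(blockNo);                        // a. fetch block
            System.arraycopy(block, offset, out, pos - first, count);  // b. return portion
            pos += count;
        }
        return out;
    }

    /** Write data starting at byte offset first. */
    void write(int first, byte[] data) {
        int last = first + data.length - 1;
        for (int pos = first; pos <= last; ) {
            int blockNo = pos / BLOCK_SIZE;
            int offset = pos % BLOCK_SIZE;
            int count = Math.min(BLOCK_SIZE - offset, last - pos + 1);
            byte[] block = fetchBlock(blockNo);                        // a. fetch block
            System.arraycopy(data, pos - first, block, offset, count); // b. modify portion
            storeBlock(blockNo, block);                                // c. write out block
            pos += count;
        }
    }

    public static void main(String[] args) {
        BlockFile f = new BlockFile(4);
        // "hello world" is 11 bytes, so it occupies bytes 2 through 12
        f.write(2, "hello world".getBytes(StandardCharsets.US_ASCII));
        System.out.println(new String(f.read(2, 12), StandardCharsets.US_ASCII));
        System.out.println(f.blockReads + " block reads, " + f.blockWrites + " block writes");
    }
}
```

Even though only 11 bytes are touched, every transfer to or from the simulated disk is a full block, which is the point of the slide.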

File Concept
 A file is a named collection of related information that is recorded on
secondary storage
 Files represent:
 Data – numeric, alphabetic, alphanumeric, binary
 Programs (source and object)
 Files may be free form, such as text files, or may be formatted rigidly.
 In general, a file is a sequence of bits, bytes, lines, or records
 Contents defined by file’s creator, different types of information
stored in a file:
 Numeric, text, source code, executable code, photos, music,
videos, etc.
 A file has a certain defined structure, which depends on its type.
 A text file is a sequence of characters organized into lines (and
pages).
 A source file is a sequence of functions, each of which is further
organized as declarations followed by executable statements.
 An executable file is a series of code sections that the loader can
bring into memory and execute.

File Attributes
 Name – symbolic file name is the only information kept in human-
readable form
 Identifier – unique tag (number) identifies file within file system; it is the
non-human-readable name for the file.
 Type – needed for systems that support different types
 Location – pointer to file location on device
 Size – current file size (in bytes, words, or blocks), possibly maximum
size
 Protection – access-control information determines who can do
reading, writing, executing
 Timestamps and user identification – information kept for creation, last
modification, and last use, useful for protection, security, and usage
monitoring
 Many variations, including extended file attributes such as the character
encoding of the file and security features such as a file checksum
 Information about files is kept in the directory structure, which is
maintained on the disk
 A directory entry consists of the file's name and its unique identifier; the
identifier in turn locates the other file attributes

File info Window on Mac OS X


File Operations
 File is an abstract data type
 Create - two steps: first, space must be found in the file system for the
file; second, an entry for the new file must be made in a directory.
 Open - all operations except create and delete require a file
open() first; it returns a file handle that is used as an argument in the
other calls
 Write – a system call specifying both the open file handle and the
information to be written to the file at the write-pointer location
 Read – a system call that specifies the file handle and where (in
memory) the next block of the file, at the read-pointer location, should be put
 Current-file-position pointer - kept per process and used as the current
operation location for both read and write, which saves space and reduces
system complexity
 Reposition within file - the current-file-position pointer of the open
file is repositioned to a given value; also called a seek
 Delete – search the directory and release all file space. With hard
links (multiple names, i.e. directory entries, for the same file), the actual
file contents are not deleted until the last link is deleted
 Truncate - erase the contents of a file but keep its attributes,
reset length to 0, release file space

Open Files
 Several pieces of data are needed to manage open files:
 Open-file table: tracks open files
 When a file operation is requested, the file is specified via an
index into this table, so no searching of the directory is
required
 When a file is no longer in active use, the process closes it and the
OS removes its entry from the open-file table, potentially releasing locks
 The open() operation takes a file name and searches the
directory, copying the directory entry into the open-file table.
 The open() call can also accept access-mode information:
create, read-only, read–write, append-only, and so on.
 This mode is checked against the file's permissions; if it is allowed,
the file is opened for the process.
 The open() system call returns a pointer to the entry in the
open-file table. This pointer, not the actual file name, is used in all I/O
operations, avoiding any further searching and simplifying the
system-call interface.

Open Files
 The OS uses two levels of internal tables: a per-process table
and a system-wide table.
 The per-process table tracks all files that a process has open,
and contains the current file pointer for each file, access rights,
and accounting information
 Each entry in the per-process table points to a system-wide
open-file table, which contains process-independent
information, such as the location of the file on disk, access
dates, and file size.
 File-open count: number of processes having the file open
 Each close() decreases this open count, and when the open
count reaches zero, the file is no longer in use, and the file's
entry is removed from the open-file table.

Open Files
 Information associated with an open file:
 File pointer - the system must track the last read–write location
as a current-file-position pointer, unique to each process, kept
separate from the on-disk file attributes.
 File-open count - tracks the number of opens and closes and
reaches zero on the last close. The system can then remove the
entry to reuse space.
 Location of the file - information needed to locate the file (mass
storage, file server, RAM drive) is kept in memory so that the
system does not have to read it from the directory structure for
each operation.
 Access rights - each process opens a file in an access mode. This
information is stored in the per-process table so the operating
system can allow or deny subsequent I/O requests.
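
The two-level table scheme and the file-open count described above can be illustrated with a small Java sketch. It is a simplified model, not an actual OS structure: the class and field names (OpenFileTables, SystemEntry, ProcessEntry) are hypothetical, the per-process table is reduced to the handle objects returned by open(), and permission checking is left out.

```java
import java.util.HashMap;
import java.util.Map;

public class OpenFileTables {
    /** System-wide entry: process-independent information plus the open count. */
    static class SystemEntry {
        final String name;
        int openCount = 0;
        SystemEntry(String name) { this.name = name; }
    }

    /** Per-process entry: own file pointer and access mode, plus a link to the shared entry. */
    static class ProcessEntry {
        final SystemEntry sys;
        final String mode;
        long position = 0;   // current-file-position pointer, unique to each open
        ProcessEntry(SystemEntry sys, String mode) {
            this.sys = sys;
            this.mode = mode;
        }
    }

    private final Map<String, SystemEntry> systemTable = new HashMap<>();

    /** open(): find or create the system-wide entry and increase its open count. */
    ProcessEntry open(String name, String mode) {
        SystemEntry sys = systemTable.computeIfAbsent(name, SystemEntry::new);
        sys.openCount++;
        return new ProcessEntry(sys, mode);
    }

    /** close(): decrease the open count; drop the system-wide entry on the last close. */
    void close(ProcessEntry e) {
        e.sys.openCount--;
        if (e.sys.openCount == 0) {
            systemTable.remove(e.sys.name);
        }
    }

    boolean isOpen(String name) {
        return systemTable.containsKey(name);
    }

    public static void main(String[] args) {
        OpenFileTables tables = new OpenFileTables();
        ProcessEntry a = tables.open("notes.txt", "r");   // opened by one process
        ProcessEntry b = tables.open("notes.txt", "rw");  // opened by another
        a.position = 100;                                  // does not affect b.position
        tables.close(a);
        System.out.println("still open: " + tables.isOpen("notes.txt"));
        tables.close(b);
        System.out.println("still open: " + tables.isOpen("notes.txt"));
    }
}
```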

Open File Locking
 Provided by some operating systems and file systems
 File locks allow one process to lock a file and prevent other
processes from gaining access to it, useful for shared files
 Similar to reader-writer locks
 Shared lock similar to reader lock – several processes can
acquire concurrently
 Exclusive lock similar to writer lock; only one process at a
time can acquire such a lock
 Mandatory or advisory file-locking mechanisms:
 Mandatory – once a process acquires an exclusive lock, the
operating system will prevent any other process from
accessing the locked file
 Advisory – processes can find the status of locks and decide what
to do
 It is up to software developers to ensure that locks are
appropriately acquired and released.
 Windows operating systems adopt mandatory locking, and
UNIX systems employ advisory locks

File Locking Example – Java API
import java.io.*;
import java.nio.channels.*;

public class LockingExample {
    public static final boolean EXCLUSIVE = false;
    public static final boolean SHARED = true;

    public static void main(String[] args) throws IOException {
        FileLock sharedLock = null;
        FileLock exclusiveLock = null;
        try {
            RandomAccessFile raf = new RandomAccessFile("file.txt", "rw");
            // get the channel for the file
            FileChannel ch = raf.getChannel();
            long half = raf.length() / 2;
            // this locks the first half of the file - exclusive
            exclusiveLock = ch.lock(0, half, EXCLUSIVE);
            /** Now modify the data . . . */
            // release the lock
            exclusiveLock.release();
            // this locks the second half of the file - shared
            // (the second argument of lock() is the region size, not its end)
            sharedLock = ch.lock(half, raf.length() - half, SHARED);
            /** Now read the data . . . */
            // release the lock
            sharedLock.release();
        } catch (java.io.IOException ioe) {
            System.err.println(ioe);
        } finally {
            // releasing an already released lock has no effect
            if (exclusiveLock != null)
                exclusiveLock.release();
            if (sharedLock != null)
                sharedLock.release();
        }
    }
}

File Types – Name, Extension
 A common technique for implementing file types is to include
the type as part of the file name. The name is split into two
parts—a name and an extension, usually separated by a period

File Structure
 None - sequence of words, bytes
 Simple record structure
 Lines
 Fixed length
 Variable length
 Complex Structures
 Formatted document
 Relocatable load file
 Can simulate last two with first method by inserting
appropriate control characters
 Who decides:
 Operating system
 Program
 Some operating systems impose (and support) only a minimal
number of file structures; this approach is adopted in UNIX, Windows, and
others.
 UNIX considers each file to be a sequence of 8-bit bytes; no
interpretation of these bits is made by the operating system.

Sequential-access File
 Information in the file is processed in order, one record after
the other.
 This mode of access is by far the most common; for example,
editors and compilers usually access files in this fashion.
 read_next() — reads the next portion of the file and
automatically advances a file pointer, which tracks the I/O
location.
 write_next() — appends to the end of the file and advances to
the end of the newly written material (the new end of file).
 A file can be reset to the beginning, and on some systems, a
program may be able to skip forward or backward n records
for some integer n, perhaps only for n = 1.

Direct Access (Logical access)
 A file is made up of fixed-length logical records that allow
programs to read and write records rapidly in no particular order.
This is based on a disk model of a file, since disks allow random
access to any block; databases are a typical use.
 read(n), where n is the block number, and write(n)
 An alternative approach is to retain read_next() and write_next()
and to add an operation position_file(n) where n is the block
number.
 Then, to effect a read(n), we would position_file(n) and then
read_next(); a write(n) is a position_file(n) followed by write_next().
 rewrite(n)
 n = relative block number
 A relative block number is an index relative to the beginning of the
file.
 Thus, the first relative block of the file is 0, the next is 1, and so
on, even though the absolute disk address may be 14703 for the
first block and 3192 for the second.
 Relative block numbers allow OS to decide where file should be
placed
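
As a rough sketch of the position_file(n) approach above, the following Java class builds read(n) and write(n) on top of a seek followed by a sequential read or write. It uses the standard java.io.RandomAccessFile; the class name DirectAccessFile, the method names, and the 16-byte record size are assumptions chosen for illustration.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Files;
import java.nio.file.Path;

public class DirectAccessFile implements AutoCloseable {
    static final int RECORD_SIZE = 16;   // assumed fixed record length
    private final RandomAccessFile raf;

    DirectAccessFile(Path path) throws IOException {
        raf = new RandomAccessFile(path.toFile(), "rw");
    }

    /** position_file(n): move the file pointer to relative block n. */
    void positionFile(long n) throws IOException {
        raf.seek(n * RECORD_SIZE);
    }

    /** read_next(): read one record at the pointer; the pointer advances. */
    byte[] readNext() throws IOException {
        byte[] rec = new byte[RECORD_SIZE];
        raf.readFully(rec);
        return rec;
    }

    /** write_next(): write one record at the pointer; the pointer advances. */
    void writeNext(byte[] rec) throws IOException {
        if (rec.length != RECORD_SIZE) {
            throw new IllegalArgumentException("record must be " + RECORD_SIZE + " bytes");
        }
        raf.write(rec);
    }

    /** read(n) = position_file(n) followed by read_next(). */
    byte[] read(long n) throws IOException {
        positionFile(n);
        return readNext();
    }

    /** write(n) = position_file(n) followed by write_next(). */
    void write(long n, byte[] rec) throws IOException {
        positionFile(n);
        writeNext(rec);
    }

    @Override
    public void close() throws IOException {
        raf.close();
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("direct", ".dat");
        try (DirectAccessFile f = new DirectAccessFile(tmp)) {
            byte[] rec = new byte[RECORD_SIZE];
            rec[0] = 42;
            f.write(3, rec);                       // relative block 3, written first
            System.out.println(f.read(3)[0]);      // read it back in no particular order
        } finally {
            Files.delete(tmp);
        }
    }
}
```

The caller only ever names relative block numbers; where those bytes end up on the device is left to the operating system.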

Simulation of Sequential Access on Direct-access File
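
The figure for this slide is not reproduced here. The usual simulation keeps a current-position variable cp: reset sets cp = 0, read_next performs read(cp) and then cp = cp + 1, and write_next performs write(cp) and then cp = cp + 1. A minimal Java sketch of that idea follows; the class name and the in-memory list standing in for the direct-access file are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class SequentialOnDirect {
    private final List<String> blocks = new ArrayList<>(); // hypothetical direct-access store
    private int cp = 0;                                    // current position

    // Direct-access primitives
    String read(int n) {
        return blocks.get(n);
    }

    void write(int n, String data) {
        if (n == blocks.size()) {
            blocks.add(data);        // writing just past the end grows the file
        } else {
            blocks.set(n, data);     // otherwise overwrite block n
        }
    }

    // Sequential operations expressed with the primitives above
    void reset() {
        cp = 0;
    }

    String readNext() {
        String data = read(cp);
        cp = cp + 1;
        return data;
    }

    void writeNext(String data) {
        write(cp, data);
        cp = cp + 1;
    }

    public static void main(String[] args) {
        SequentialOnDirect f = new SequentialOnDirect();
        f.writeNext("first");
        f.writeNext("second");
        f.reset();
        System.out.println(f.readNext());
        System.out.println(f.readNext());
    }
}
```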

Other Access Methods

 Can be built on top of a direct-access method
 Generally involve creation of an index for the file
 Keep the index in memory for fast determination of the location of data
to be operated on
 Consider a file of records sorted by UPC code, each holding the price
of that item, with an index made of the first UPC code in each block
 To find the price of a particular item, we can make a binary
search of the index.
 From this search, we learn exactly which block contains
the desired record and access that block.
 If the index is too large, keep an index (in memory) of the index (on disk)
 The primary index file contains pointers to secondary index
files, which point to the actual data items.
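
The index lookup described above can be sketched as a binary search over the first key stored in each block. This is a single-level illustration under assumed names (IndexedLookup, blockFor) and made-up key values; a two-level scheme would apply the same search first to the primary index and then to the selected secondary index.

```java
import java.util.Arrays;

public class IndexedLookup {
    /**
     * firstKey[i] is the smallest key stored in block i; the array is sorted.
     * Returns the block that would contain key, or -1 if key is smaller
     * than every key in the file.
     */
    static int blockFor(long[] firstKey, long key) {
        int i = Arrays.binarySearch(firstKey, key);
        if (i >= 0) {
            return i;                 // key is exactly the first key of block i
        }
        int insertionPoint = -i - 1;  // index of the first entry greater than key
        return insertionPoint - 1;    // the block just before that entry
    }

    public static void main(String[] args) {
        long[] index = {100, 200, 300};   // hypothetical first UPC of blocks 0, 1, 2
        System.out.println(blockFor(index, 250));
        System.out.println(blockFor(index, 300));
        System.out.println(blockFor(index, 50));
    }
}
```

Only the small index is searched in memory; a single block is then read from disk and scanned for the record.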

Other Access Methods
 IBM indexed sequential-access method (ISAM)
 Small master index, points to disk blocks of secondary index
 File kept sorted on a defined key
 To find a particular item, we first make a binary search of
the master index, which provides the block number of the
secondary index.
 This block is read in, and again a binary search is used to
find the block containing the desired record.
 Finally, this block is searched sequentially.
 VMS operating system provides index and relative files as
another example (see next slide)

Example of Index and Relative Files

File Usage Patterns
1. Most files are small (for example, .login, .c files)
2. Large files use up most of the disk space
3. Large files account for most of the bytes transferred to/from
disk

 Bad news: need everything to be efficient.
 Need small files to be efficient, since there are lots of them.
 Need large files to be efficient, since they account for most of
the disk space and most of the I/O.

Directory Structure

 A collection of nodes containing information about all files

[Figure: a directory whose entries point to the files F1, F2, F3, F4, ..., Fn]

Both the directory structure and the files reside on disk

Disk Structure
 Disk can be subdivided into partitions
 Disks or partitions can be RAID protected against failure
 Disk or partition can be used raw – without a file system, or
formatted with a file system
 Partitions also known as minidisks, slices
 Entity containing file system known as a volume
 Each volume containing file system also tracks that file
system’s info in device directory or volume table of contents
 As well as general-purpose file systems there are many special-
purpose file systems, frequently all within the same operating
system or computer

Directory Structure
 The directory can be viewed as a symbol table that translates file
names into their file control blocks.
 The directory organization must allow us to insert entries, delete
entries, search for a named entry, and list all the entries in the
directory.
 Operations on a directory:
 Search for a file
 Create a file
 Delete a file: a delete leaves a hole in the directory structure
and the file system may have a method to defragment the
directory structure.
 List a directory
 Rename a file
 Traverse the file system

Directory Organization
 The directory is organized logically to obtain:
 Efficiency – locating a file quickly
 Naming – convenient to users
 Two users can have same name for different files
 The same file can have several different names
 Grouping – logical grouping of files by properties (e.g., all
Java programs, all games, …)

Single-Level Directory
 A single directory for all users

 Limitations
 Large number of files and multiple users
 Need unique names
 Grouping problem

Two-Level Directory
 Separate directory for each user
 Each user has his own user file directory (UFD).
 At login, the system's master file directory (MFD) is searched.
 The MFD is indexed by user name or account number, and
each entry points to the UFD for that user

 Can have the same file name for different users, but unique within user’s
UFD
 Path name - Specifying a user name and a file name defines a path in the
tree from the root (the MFD) to a leaf (the specified file).
 search path: sequence of directories searched when a file is named
 No grouping capability

Tree-Structured Directories

Tree-Structured Directories (Cont.)
 In many implementations, a directory is simply another file, but it is
treated in a special way.
 All directories have the same internal format.
 One bit in each directory entry defines the entry as a file (0) or as a
subdirectory (1).
 Special system calls are used to create and delete directories.
 In this case the operating system (or the file system code)
implements another file format, that of a directory.
 Efficient searching and grouping capability
 Current directory (working directory)
 cd /spell/mail/prog
 type list
 When reference is made to a file, the current directory is searched.
 To change directories, a system call takes a directory name as a
parameter and uses it to redefine the current directory.
 Other systems leave it to the application (say, a shell) to track and
operate on a current directory, as each process could have
different current directories.

Tree-Structured Directories (Cont.)
 Absolute or relative path name
 Absolute path name begins at the root (initial “/”) and follows a
path down to the specified file, giving the directory names on
the path.
 Relative path name defines a path from the current directory.
 If the current directory is /spell/mail, then the relative path
name prt/first refers to the absolute path name
/spell/mail/prt/first.

 Creating a new file is done in current directory
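
The translation from a relative path name to an absolute one can be tried out with the standard java.nio.file.Path API. This is a small sketch, assuming a UNIX-like system where "/" is the root; the class name PathNames and the helper toAbsolute are hypothetical, and the directories do not need to exist because only the names are manipulated.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class PathNames {
    /**
     * Interpret name relative to currentDir. Path.resolve() returns the
     * argument unchanged when it is already absolute, so both kinds of
     * path name are handled by the same call.
     */
    static Path toAbsolute(Path currentDir, String name) {
        return currentDir.resolve(name).normalize();
    }

    public static void main(String[] args) {
        Path current = Paths.get("/spell/mail");             // current directory from the slide
        System.out.println(toAbsolute(current, "prt/first")); // relative path name
        System.out.println(toAbsolute(current, "../list"));   // ".." climbs one level
    }
}
```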

Tree-Structured Directories (Cont)
 Policy decision on how to handle the deletion of a directory.
 If a directory is empty, its entry in the directory that contains it can
simply be deleted.
 If the directory to be deleted is not empty (it contains several files or
subdirectories), there are two approaches.
 Some systems will not delete a directory unless it is empty, so the user
must first delete all the files and subdirectories in it.
 This approach can result in a substantial amount of work.
 An alternative approach, such as that taken by the UNIX rm
command, is to provide an option: when a request is made to delete
a directory, all that directory's files and subdirectories are also to be
deleted.
 Delete a file
rm <file-name>
 Creating a new subdirectory is done in current directory
mkdir <dir-name>
Example: if in current directory /mail
mkdir count

Deleting “mail” ⇒ deleting the entire subtree rooted by “mail”
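
Deleting a whole subtree, as the recursive option does, means removing every file and subdirectory before the directory itself. A minimal Java sketch using the standard Files.walkFileTree follows; the class name DeleteTree and the temporary directory names are assumptions for the example, and this is not a description of how rm itself is implemented.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;

public class DeleteTree {
    /** Delete root and everything below it, children before parents. */
    static void deleteTree(Path root) throws IOException {
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs)
                    throws IOException {
                Files.delete(file);              // plain files (and links) go first
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult postVisitDirectory(Path dir, IOException exc)
                    throws IOException {
                if (exc != null) {
                    throw exc;                   // a failure while listing the directory
                }
                Files.delete(dir);               // the directory is empty by now
                return FileVisitResult.CONTINUE;
            }
        });
    }

    public static void main(String[] args) throws IOException {
        Path mail = Files.createTempDirectory("mail");   // stands in for "mail"
        Path count = Files.createDirectory(mail.resolve("count"));
        Files.createFile(count.resolve("msg.txt"));
        deleteTree(mail);
        System.out.println("exists after delete: " + Files.exists(mail));
    }
}
```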


Acyclic-Graph Directories
 Have shared subdirectories and files

Acyclic-Graph Directories (Cont.)
 A shared file (or directory) is not the same as two copies of the
file.
 With two copies, each programmer can view the copy rather
than the original, but if one programmer changes the file, the
changes will not appear in the other's copy.
 With a shared file, only one actual file exists, so any changes
made by one person are immediately visible to the other.
 Sharing is particularly important for subdirectories; a new file
created by one person will automatically appear in all the
shared subdirectories.
 Implementing shared files and subdirectories
 #1: create a new directory entry called a link. A link is
effectively a pointer to another file or subdirectory
 A link may be implemented as an absolute or a relative path
name.
 On a reference to a file, search the directory. In case of a link,
the name of the real file is included in the link information.
 Resolve the link by using that path name to locate the real file.
Operating  Links are effectively indirect
System Concepts – 10 th Edition 13.37pointers. Silberschatz, Galvin and Gagne
Acyclic-Graph Directories (Cont.)
 Implementing shared files and subdirectories (cont.)
 #2: duplicate all information about the shared files in both sharing
directories.
 Thus, both entries are identical and equal.
 A major problem with duplicate directory entries is maintaining
consistency when a file is modified.
 Problems in Acyclic-graph directories
 #1: A file may have multiple absolute path names; consequently,
distinct file names may refer to the same file (aliasing).
 Traversing the entire file system to gather statistics on all files might
lead to traversing shared structures more than once
 #2: file deletion - when can the space allocated to a shared file be
deallocated and reused?
 Remove the file whenever anyone deletes it; this may leave dangling
pointers to the now-nonexistent file, e.g., if dict deletes list ⇒
dangling pointer
 If the remaining file pointers contain actual disk addresses, and
the space is subsequently reused for other files, these dangling
pointers may point into the middle of other files.

Acyclic-Graph Directories (Cont.)
 Problems in Acyclic-graph directories (cont.)
 Symbolic links for implementing sharing
 Deleting a link removes only the link; the original file is unaffected
 If the file entry itself is deleted, the space for the file is
deallocated, leaving the links dangling.
 The system can search for these links and remove them as well, but
this search can be expensive.
 Alternatively, leave the links until an attempt is made to
use them; at that point the link name fails to resolve, which
shows that the file no longer exists
 In UNIX and Windows, symbolic links are left when a file is
deleted, and it is up to the user to realize that the original
file is gone or has been replaced.
 Another approach to deletion is to preserve the file until all
references to it are deleted.
 keep a count of the number of references; when the count
is 0, the file can be deleted
 UNIX uses this approach for non-symbolic links (hard links)
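
The reference-count approach can be modeled in a few lines of Java. This is an in-memory sketch only: the names LinkCount, FileNode, link, and unlink are hypothetical, and a real file system keeps the count in the file's on-disk metadata rather than in a map.

```java
import java.util.HashMap;
import java.util.Map;

public class LinkCount {
    /** The file itself: its contents would live here, plus the reference count. */
    static class FileNode {
        int links = 0;
        boolean freed = false;   // set when the space may be reused
    }

    // Directory entries: several names may map to the same FileNode.
    private final Map<String, FileNode> directory = new HashMap<>();

    FileNode create(String name) {
        FileNode node = new FileNode();
        node.links = 1;
        directory.put(name, node);
        return node;
    }

    /** Add a second name (a hard link) for an existing file. */
    void link(String existing, String newName) {
        FileNode node = directory.get(existing);
        if (node == null) {
            throw new IllegalArgumentException("no such file: " + existing);
        }
        node.links++;
        directory.put(newName, node);
    }

    /** Remove one name; the file is freed only when the count reaches 0. */
    void unlink(String name) {
        FileNode node = directory.remove(name);
        if (node == null) {
            throw new IllegalArgumentException("no such file: " + name);
        }
        node.links--;
        if (node.links == 0) {
            node.freed = true;
        }
    }

    public static void main(String[] args) {
        LinkCount fs = new LinkCount();
        FileNode node = fs.create("list");
        fs.link("list", "dict-list");
        fs.unlink("list");
        System.out.println("freed after first unlink: " + node.freed);
        fs.unlink("dict-list");
        System.out.println("freed after last unlink: " + node.freed);
    }
}
```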
General Graph Directory

General Graph Directory (Cont.)
 Ensure acyclic-graph structure does not have cycles
 Adding new files and subdirectories to an existing tree-
structured directory preserves the tree-structured nature.
 However, adding links destroys the tree structure, resulting in
a simple graph structure
 Cycles could result in infinite loop continually searching
through the cycle
 limit arbitrarily the number of directories that will be
accessed during a search
 File deletion - when cycles exist, the reference count may
not be 0 even when it is no longer possible to refer to a
directory or file because of self referencing

General Graph Directory (Cont.)
 How do we guarantee no cycles?
 Garbage collection - determine when the last reference has
been deleted and the disk space can be reallocated
 Involves traversing the entire file system, marking
everything that can be accessed.
 Then, a second pass collects everything that is not marked
onto a list of free space.
 Extremely expensive for disk-based file systems; seldom
attempted
 Every time a new link is added use a cycle detection
algorithm to determine whether it is OK
 Computationally expensive on disk-based file system
 A simpler algorithm in the special case of directories and
links is to bypass links during directory traversal.
 Cycles are avoided, and no extra overhead is incurred.
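
The check made each time a new link is added can be sketched as a reachability test: a link from parent to child would close a cycle exactly when parent is already reachable from child. The Java below is an in-memory illustration with hypothetical names (DirectoryGraph, addLink, reaches); on a disk-based file system every step of the search would cost I/O, which is why the slide calls it expensive.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

public class DirectoryGraph {
    // directory name -> names of the directories it contains or links to
    private final Map<String, Set<String>> children = new HashMap<>();

    /** Depth-first search: is target reachable from start? */
    boolean reaches(String start, String target) {
        Deque<String> stack = new ArrayDeque<>();
        Set<String> seen = new HashSet<>();
        stack.push(start);
        while (!stack.isEmpty()) {
            String dir = stack.pop();
            if (dir.equals(target)) {
                return true;
            }
            if (seen.add(dir)) {   // expand each directory only once
                for (String c : children.getOrDefault(dir, Collections.<String>emptySet())) {
                    stack.push(c);
                }
            }
        }
        return false;
    }

    /** Add parent -> child unless that would create a cycle. */
    boolean addLink(String parent, String child) {
        if (reaches(child, parent)) {
            return false;          // child already leads back to parent
        }
        children.computeIfAbsent(parent, k -> new LinkedHashSet<>()).add(child);
        children.computeIfAbsent(child, k -> new LinkedHashSet<>());
        return true;
    }

    public static void main(String[] args) {
        DirectoryGraph g = new DirectoryGraph();
        System.out.println(g.addLink("root", "a"));   // ordinary subdirectory
        System.out.println(g.addLink("a", "b"));
        System.out.println(g.addLink("root", "b"));   // sharing, still acyclic
        System.out.println(g.addLink("b", "root"));   // would close a cycle
    }
}
```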

Protection
 Laptop - user name and password authentication to access it,
encrypting the secondary storage and firewalling network
access
 Multiuser system - advanced mechanisms to allow only valid
access of the data.
 File owner/creator should be able to control:
 what can be done
 by whom
 Types of access
 Read
 Write
 Execute
 Append
 Delete
 List
 Attribute change

Access Lists and Groups
 Mode of access: read, write, execute
 Three classes of users on Unix / Linux
                       RWX
a) owner access    7 ⇒ 111
b) group access    6 ⇒ 110
c) public access   1 ⇒ 001
 Ask manager to create a group (unique name), say G, and add
some users to the group.
 For a particular file (say game) or subdirectory, define an
appropriate access.
 Attach a group to a file:
chgrp G game
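
The mapping from the octal digits 7, 6, 1 to the RWX bit patterns above can be worked through in code. The following Java sketch decodes a three-digit octal mode into the familiar rwx string; the class and method names are assumptions for illustration.

```java
public class ModeBits {
    /** Turn one octal digit (0-7) into its rwx form, for example 6 -> "rw-". */
    static String rwx(int digit) {
        return ((digit & 4) != 0 ? "r" : "-")    // bit 100 = read
             + ((digit & 2) != 0 ? "w" : "-")    // bit 010 = write
             + ((digit & 1) != 0 ? "x" : "-");   // bit 001 = execute
    }

    /** Decode owner, group, and public digits of a mode such as 0761. */
    static String modeString(int mode) {
        return rwx((mode >> 6) & 7)   // owner
             + rwx((mode >> 3) & 7)   // group
             + rwx(mode & 7);         // public
    }

    public static void main(String[] args) {
        // 0761 is a Java octal literal: owner 7, group 6, public 1
        System.out.println(modeString(0761));   // prints rwxrw---x
    }
}
```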

Windows 7 Access-Control List Management

A Sample UNIX Directory Listing

End of Chapter 13
