Assignment 3

Download as pdf or txt
Download as pdf or txt
You are on page 1of 4

NAME – Prince Kumar

ROLL NO - 1806048
GROUP - G2
BRANCH - INFORMATION TECHNOLOGY
SUBJECT - Database Management System
DATE - 25 APRIL 2020

ASSIGNMENT 3

Q1. Type of Indexing in DBMS. Explain each of them using suitable example.

An index is a data structure that organizes data records on disk to optimize the
retrieval operations. An index is used to efficiently retrieve all records that satisfy
search conditions on the search key fields of the index The index structure typically
provides secondary access path, which provides alternative way of retrieving the
records without affecting the physical storage of records on the disk On a same file,
multiple indexes on different fields can be constructed.

There are two type of Indexing -


1. Single-level Indexing
A. Primary indexes
B. Secondary indexes
C. Clustering indexes
D. Dense/Sparse indexes
2. Multilevel Indexing
A. Search Tree
B. B-Tree
C. B⁺ Tree

SINGLE-LEVEL INDEXES
Index is usually defined on a single attribute or field of a file, called indexing field or
indexing attribute Generally, the index stores each value of the index field along with
a list of pointers to all disk blocks that contain records with that field value.
The values in the index are ordered so that binary search can be performed on the
index. The index file is much smaller than the data file, so searching the index using a
binary search is
highly efficient.
STUDENT
Roll Name Age Gender Contact
101 Ashok 19 Male 123
102 Mahesh 19 Male 345
103 Jency 21 Female 678
104 Vikash 20 Male 245
105 Manish 20 Male 789
106 Rahul 22 Male 567
A. PRIMARY INDEXES

A primary index is an ordered file whose records are of fixed length with two fields.
The first field is of the same data type as the ordering key field of the data file, called
the primary key and the second field is a pointer to a disk block.There is one index
entry in the index file for each block in the data file. Each index entry has the value of
the primary key field for the first record in a block and a pointer to that block as its
two field values.The index file for a primary index needs substantially fewer blocks
than does the data file, for two reasons. First, there are fewer index entries than there
are records in the data file. Second,each index entry is typically smaller in size than a
data record because it has only two fields. A binary search on the index file hence
requires fewer block accesses than a binary search on the data file When a user wants
to insert a record to its correct position in the data file, the existing records should be
moved to make space for the new record as well as index entries will be changed.
Similarly, deletion process is also difficult due to the index entries updation. But,
record deletion is commonly handled by using deletion markers.

B. CLUSTERING INDEXES

If records of a file are physically ordered on a non-key field, that field is called the
clustering field. Thus, clustering index is the index created on the clustering field to
speed up retrieval of records that have the same value for the clustering field This
differs from a primary index, which requires that the ordering field of the data file
have a distinct value for each record.
There is one entry in the clustering index for each distinct value of the clustering field,
containing the value and a pointer to the first block in the data file that has a record
with that value for its clustering field To solve the problem of insertion, it is common
to reserve a whole block (or a cluster of contiguous blocks) for each value of the
clustering field; all records with that value are placed in the block (or block cluster).
This makes insertion and deletion relatively straightforward.

C. SECONDARY INDEXES

A secondary index is also an ordered file with two fields. The first field is of the same
data type as some non ordering field of the data file that is an indexing field. The
second field is either a block pointer or a record pointer. There can be many
secondary indexes for the same file In this case there is one index entry for each
record in the data file, which contains the value of the secondary key for the record
and a pointer either to the block in which the record is stored or to the record itself.
A secondary index usually needs more storage space and longer search time than does
a primary index, because of its larger number of entries Secondary index provides a
logical ordering on the records by the indexing field A data file can have either a
primary index or a cluster index depending on its ordering field. It can have several
secondary indexes
D. DENSE/SPARSE INDEXES

Indexes can also be characterized as dense or sparse:


• A dense index has an index entry for every search key value (and hence every record)
in the data file.
• A sparse or non dense index has index entries for only some of the search values
Primary index → Non dense
Clustering index → Non dense
Secondary index → Dense

MULTILEVEL INDEXES

A multilevel indexing can contain any number of levels, each of which acts as a
non-dense index to the level below. The top level contains a single entry A multilevel
index can be created for any type of first-level index (whether it is primary, clustering
or secondary) as long as the first-level index consists of more than one disk block The
advantage of multilevel index is that it reduces the number of blocks accessed when
searching a record, given its indexing field value .The problems associated with index
insertions and deletions still exist because all index levels are physically ordered files.

Because of the insertion and deletion problem, most multilevel indexes use B-tree or
B⁺ tree data structures, which leave space in each tree node (disk block) to allow for
new index entries In B-Tree and B⁺ Tree data structures, each node corresponds
to a disk block. Here, each node is kept between half-full and completely full .An
insertion into a node that is not full is quite efficient. On the other hand, if a node is
full the insertion causes a split into two nodes Similarly, a deletion is quite efficient if
a node does not become less than half full. If a deletion causes a node to become less
than half-full, it must be merged with the neighboring nodes.
Q2. Two phase locking protocol ensure serializiability but not free from starvation
and dead lock. Justify the statement with suitable example.

Two-phase locking protocol is a protocol which ensures serializability .This protocol


requires that each transaction issues lock and unlock requests in two phases. The two
phases are:
• Growing phase: Here, a transaction acquires all required locks without unlocking
any data i.e. the transaction may not release any lock.
• Shrinking phase: Here, a transaction releases all locks and cannot obtain any new
lock.
The point in the schedule where the transaction has obtained its final lock is called the
lock point of the transaction. Transactions can be ordered according to their lock
points

Two-phase locking does not ensure freedom from deadlock.

EXAMPLE -
Suppose a transaction T1 has a shared-mode lock on a data item, and another
transaction T2 requests an exclusive-mode lock on that same data item. In this
situation, T2 has to wait for T1 to release the shared-mode lock.

Suppose, another transaction T3 requests a shared-mode lock on the same data item
while T1 is holding a shared lock on it. As the lock request is compatible with lock
granted to T1, so T3 may be granted the shared-mode lock. But, T2 has to wait for the
release of the lock from that data item.

At this point, T1 may release the lock, but still T2 has to wait for T3 to finish. There
may be a new transaction T4 that requests a shared-mode lock on the same data item,
and is granted the lock before T3 releases it.

In such a situation, T2 never gets the exclusive-mode lock on the data item. Thus, T2
cannot progress at all and is said to be starved. This problem is called as the
starvation problem .

We can avoid starvation of transactions by granting locks in the following manner;


when a transaction Ti requests a lock on a data item Q in a particular mode M, the
concurrency-control manager grants the lock provided that:

• There is no other transaction holding a lock on Q in a mode that conflicts with M


• There is no other transaction that is waiting for a lock on Q and that made its lock
request before Ti.

You might also like