Unit 4 Notes


Database Management System (Guru Gobind Singh Indraprastha University)


Database Management System

Unit-4

Notes

Transaction Processing

• A transaction is a logical unit of work of database processing that includes


one or more database access operations.

• A transaction can be defined as an action or series of actions that is carried


out by a single user or application program to perform operations for
accessing the contents of the database. The operations can include
retrieval, insertion, deletion and modification

• A transaction must be either completed or aborted.

• A transaction is a program unit whose execution may change the contents


of a database. It can either be embedded within an application program or
can be specified interactively via a high level query language such as SQL.

• If the database is in a consistent state before a transaction executes, then


the database should still be in consistent state after its execution.

• Therefore, to ensure these conditions and preserve the integrity of the

database, a transaction must be atomic.

• Atomic transaction is a transaction in which either all actions associated


with the transaction are executed to completion or none are performed.

Basic operations on a database are read and write:


1. read_item(X): Reads a database item named X into a program variable. To
simplify our notation, we assume that the program variable is also named X.
2. write_item(X): Writes the value of program variable X into the database
item named X.
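These two operations can be sketched in code. The following is a minimal illustration (not from the notes), using a dict as the stored database and another dict as the transaction's program variables:

```python
# A minimal sketch of the two basic operations against an in-memory
# "database" held in a dict; names and structure are illustrative.

database = {"X": 10}      # the stored database items
program_vars = {}         # the transaction's local program variables

def read_item(name):
    """read_item(X): copy database item X into a program variable, also named X."""
    program_vars[name] = database[name]

def write_item(name):
    """write_item(X): copy program variable X back into database item X."""
    database[name] = program_vars[name]

read_item("X")
program_vars["X"] += 5    # compute on the local copy
write_item("X")
print(database["X"])      # -> 15
```

Note that computation happens only on the program variable; the database changes only when write_item is issued.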


Example:

You are working on a system for a bank. A customer goes to the ATM and
instructs it to transfer Rs. 1000 from savings to a checking account. This simple
transaction requires two steps:

• Subtracting the money from the savings account


balance. Savings -1000

• Adding the money to the checking account


balance. Checking + 1000

The code to create this transaction will require two updates to the database. For
example, there will be two SQL statements: one UPDATE command to decrease
the balance in savings and a second UPDATE command to increase the balance in
the checking account.

You have to consider what would happen if the machine crashed between these two
operations. The money that has already been subtracted from the savings account
would not be added to the checking account; it would be lost. You might consider
performing the addition to checking first, but then the customer ends up with
extra money, and the bank loses. The point is that both changes must be made
successfully together.

Thus, a transaction is defined as a set of changes that must be made together.
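The two-UPDATE transfer described above can be sketched with the standard-library sqlite3 module; the table and column names here are illustrative assumptions, not part of the notes. The point is that both UPDATEs commit together or neither takes effect:

```python
# A hedged sketch of the ATM transfer as one atomic unit using sqlite3.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES ('savings', 5000), ('checking', 2000)")
conn.commit()

def transfer(amount):
    """Both UPDATEs succeed together or neither takes effect."""
    try:
        with conn:  # opens a transaction; commits on success, rolls back on error
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = 'savings'",
                (amount,))
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = 'checking'",
                (amount,))
    except sqlite3.Error:
        pass  # the rollback already happened; the database is unchanged

transfer(1000)
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(balances)  # savings 4000, checking 3000
```

If a crash (here, an exception) occurred between the two UPDATEs, the rollback would restore both balances, so the money can never be "lost in the middle".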

What causes a Transaction to fail

There are several reasons for a transaction to fail in the middle of execution.

1. Computer failure: a hardware, software or network error occurs in the


computer system during transaction execution.

2. Transaction or system error: some operations in the transaction may cause it to


fail, such as integer overflow or division by zero. The user may also interrupt
the transaction during its execution.


3. Local errors or exception conditions detected by the transaction: during


transaction execution, certain conditions may occur that necessitate the
cancellation of the transaction.

Example: insufficient account balance in a banking database may cause a


transaction to be cancelled.

4. Concurrency control enforcement: the concurrency control method may decide to


abort the transaction, for example because several transactions are in a state of deadlock.

5. Disk failure: some disk blocks may lose their data because of a disk
read/write head crash. This may happen during a read/ write operation of a
transaction.

6. Physical problem & catastrophes: this refers to an endless list of problems


that include fire, theft etc.

A transaction can be in one of the following states:-


1. Active state:- After the transaction starts its operation.

2. Partially committed:- When the last statement of the transaction has been executed.

3. Aborted:- After the transaction has been rolled back and the database has
been restored to its state prior to the start of the transaction.

4. Committed:- After successful completion of transaction.

5. Failed:- When the normal execution can no longer proceed.
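The legal moves between these five states can be encoded as a small table. This is an illustrative sketch (the state names follow the list above; the transition set is the standard one implied by the descriptions):

```python
# Legal transitions of the transaction state diagram, as a lookup table.
ALLOWED = {
    "active":              {"partially committed", "failed"},
    "partially committed": {"committed", "failed"},
    "failed":              {"aborted"},
    "committed":           set(),   # terminal state
    "aborted":             set(),   # terminal state
}

def can_move(src, dst):
    """True if the state diagram permits moving from src to dst."""
    return dst in ALLOWED[src]

print(can_move("active", "partially committed"))  # True
print(can_move("aborted", "active"))              # False: aborted is terminal
```

For example, a transaction can reach "committed" only via "partially committed", matching the rule that all statements must finish before the commit is final.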

State diagram of a transaction


Transaction Properties

A transaction must have the following four properties, called ACID properties, to
ensure that a database remains in stable state after the transaction is executed:

• Atomicity

• Consistency

• Isolation

• Durability

Atomicity(All or nothing):- The atomicity property of a transaction requires


that all operations of a transaction be completed, if not, the transaction is
aborted.

In other words, a transaction is treated as a single, indivisible logical unit of work.


Therefore, a transaction must execute and complete each operation in its logic
before it commits its changes.

Consistency(No violation of integrity constraints):-Database consistency


is the property that every transaction sees a consistent database instance.

In other words, execution of a transaction must leave a database in either its prior
stable state or a new stable state that reflects the new modifications (updates)
made by the transaction.


If the transaction fails, the database must be returned to the state it was in prior
to the execution of the failed transaction. If the transaction commits, the
database must reflect the new changes.

Isolation(concurrent changes invisibles):- Isolation property of a transaction


means that the data used during the execution of a transaction cannot be used by
a second transaction until the first one is completed.

This property isolates transactions from one another. In other words, if a


transaction T1 is being executed and is using the data item X, that data item
cannot be accessed by any other transaction (T2…Tn) until T1 ends.

Durability(committed update persist):- It states that the changes made by a


transaction are permanent. They cannot be lost by either a system failure or by
the erroneous operation of a faulty transaction.

When a transaction is completed, the database reaches a consistent state, and


that state cannot be lost, even in the event of a system failure.

Durability property is the responsibility of the recovery subsystem of the DBMS.

Let Ti be a transaction that transfers Rs 50 from account A to account B. Using

the basic read and write operations, this transaction can be defined as:-

Ti: read_item(A);
    A := A - 50;
    write_item(A);
    read_item(B);
    B := B + 50;
    write_item(B);


Overview of serializability
When multiple transactions are being executed by the operating system in a
multiprogramming environment, there are possibilities that instructions of one
transaction are interleaved with some other transaction.

Schedule

A schedule is a sequence of actions or operations (for example, reading, writing,


aborting or committing) that is constructed by merging the actions of a set of
transactions, respecting the sequence of actions within each transaction.

e.g. if in a transaction T1 the instruction write(A) appears before read(A), then in


any valid schedule this sequence must also be preserved.

A schedule S of n transactions is a sequential ordering of the operations of the


n transactions.

A schedule maintains the order of operations within the individual transaction.

For each transaction T if operation a is performed in T before operation b,


then operation a will be performed before operation b in S.
The operations are in the same order as they were before the transactions
were interleaved

A transaction schedule is a tabular representation of how a set of transactions


were executed over time

Types of schedule:


1. Complete schedule
2. Serial schedule
3. Non-serial schedule
4. Equivalent schedule
5. Serializable schedule

Complete Schedule: A Schedule that contains either an abort or a commit for


each transaction whose actions are listed in it is called a complete schedule. A
complete schedule must contain all the actions of every transaction that appears
in it.

Serial Schedule: Each serial schedule consists of a sequence of instructions from


various transactions, where the instructions belonging to one single transaction
appear together in that schedule. If the actions of different transactions are not
interleaved-that is, transactions are executed from start to finish, one by one-we
call that schedule a serial schedule.

T1                T2
A = A + 100
B = B - 100
                  A = A * 7.06
                  B = B * 7.06

Schedule 1 (serial: T1 followed by T2)

T1                T2
                  A = A * 7.06
                  B = B * 7.06
A = A + 100
B = B - 100

Schedule 2 (serial: T2 followed by T1)

Non Serial Schedule: A schedule in which the operations from a set of concurrent
transactions are interleaved is called a non-serial schedule.

T1                T2
A = A + 100
                  A = A * 7.06
B = B - 100
                  B = B * 7.06

Schedule 3 (non-serial: T1 and T2 interleaved)
SERIAL SCHEDULE vs NON-SERIAL SCHEDULE

• Serial: a sequence of operations by a set of concurrent transactions that
preserves the order of operations within each individual transaction.
Non-serial: a schedule in which the operations of a group of concurrent
transactions are interleaved.

• Serial: transactions are performed in serial order. Non-serial: transactions
are performed in non-serial order, but the result should be the same as some
serial order.

• Serial: there is no interference between transactions. Non-serial: concurrency
problems can arise.

• Serial: it does not matter which transaction is executed first, as long as
every transaction is executed in its entirety from beginning to end.
Non-serial: the problems seen earlier (lost update, uncommitted data,
inconsistent analysis) arise if scheduling is not proper.

• Example (serial): if some transaction T is long, the other transactions must
wait for T to complete all its operations. Example (non-serial): the execution
of the other transactions goes on without waiting for the completion of T.

In a banking scenario, suppose there are two transactions: one transaction
calculates the interest on the account and another deposits some money into the
account. Here the order of execution is important, as the results will be
different depending on whether the interest is calculated before or after the
money is deposited into the account.

Equivalent Schedule: For any database state, the effect (on the set of objects
in the database) of executing the first schedule is identical to the effect of
executing the second schedule.

e.g. Schedule 3 is equivalent to Schedule 1

Serializable Schedule: A non serial schedule that is equivalent to some serial


execution of transactions is called a serializable schedule.

For e.g. Schedule 3 is a serializable schedule, since it is equivalent to the


serial Schedule 1.
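The equivalence can be checked numerically. The sketch below replays the serial Schedule 1 and the interleaved Schedule 3 as straight-line code (sample starting balances are an assumption) and confirms they yield the same final state:

```python
# Replaying Schedule 1 (serial) and Schedule 3 (interleaved) on the same
# starting values to show they produce identical results.
def serial_T1_then_T2(A, B):
    A = A + 100; B = B - 100          # T1 runs to completion
    A = A * 7.06; B = B * 7.06        # then T2 runs to completion
    return A, B

def schedule3(A, B):
    A = A + 100                        # T1
    A = A * 7.06                       # T2
    B = B - 100                        # T1
    B = B * 7.06                       # T2
    return A, B

print(serial_T1_then_T2(1000, 500))
print(schedule3(1000, 500))   # same pair -> Schedule 3 is serializable
```

Note the order still matters per item: doing A * 7.06 before A + 100 would give a different result, which is why not every interleaving is serializable.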

The objective of serializability is to find non-serial schedules that allow


transactions to execute concurrently without interfering with one another, and
thereby produce a database state that could also be produced by a serial
execution. It is important to guarantee serializability of concurrent
transactions in order to prevent inconsistency arising from transactions
interfering with one another. In serializability, the ordering of read and
write operations is important.

Operations Conflict

• In a given schedule, the order of read/write operations can be changed but


not always.

• Such read/write operations whose order cannot be changed without


affecting the consistency of data are called as conflict operations.

Rules to define conflict operations

• Rule 1: If two transactions just read a data object then they do not conflict
and the order is not important.

• Rule 2: If two transactions either read or write completely separate data


objects then they do not conflict and the order is not important.

• Rule 3: If one transaction writes a data object and another either reads or
writes the same data object then the order of execution is important.

• Rule 4: Two actions on the same data object conflict if at least one
of them is a write.

Consider the following Schedule:-


• Are there any conflicting operations in T1 or T2?

• Find a serial schedule for it. Is it conflict serializable?

Downloaded by mogili siva ([email protected])


lOMoARcPSD|27917820

Downloaded by mogili siva ([email protected])


lOMoARcPSD|27917820

• If a schedule S can be transformed into a schedule S’ by a series of swaps


of non-conflicting instructions, we say that S and S’ are conflict equivalent.


• Here Schedule S and S’ are conflict equivalent as Read and Write of T1


can be swapped with Read and Write of T2.

• This leads to the concept of conflict serializability also.

• We say that a Schedule S, is conflict serializable if it is conflict equivalent


to a serial schedule.


• Here, Schedule S is conflict serializable as it is conflict equivalent to S’,


serial schedule

Testing for Serializability


The test is as follows:-

Let us assume that there is a schedule, S. We construct its directed graph,


or precedence graph.

This graph G, is a pair G=(V, E)

Where

‘V’ is a set of vertices

‘E’ is a set of edges.

Set of vertices consists of all the transactions participating in the schedule.

Set of edges consists of all edges Ti → Tj for which one of the following three
conditions holds:-

1. Ti executes Write(A) before Tj executes Read(A)

2. Ti executes Read(A) before Tj executes Write(A)

3. Ti executes Write(A) before Tj executes Write(A)

We also say that, if an edge Ti → Tj exists in the precedence graph, then in any
serial schedule S’ equivalent to S, Ti must appear before Tj.

e.g. Say Schedule S is:-

• Its precedence graph is:-

• There is one edge and two vertices T1 and T2. Arrow indicates that all
instructions of T1 are executed before the execution of first instruction of
T2.

• But if the schedule is


• Then the precedence graph is:-

• This is because firstly T1 is executed then T2 then again T1 and then


T2. Here the graph contains a cycle.

• Note: If the precedence graph for S has a cycle, then the schedule, S is not
conflict serializable. But, if the graph contains no cycles, then the
schedule, S, is conflict serializable.

Concurrency Control

• Concurrency control is the process of managing simultaneous execution of


transactions (such as queries, updates, inserts, deletes and so on) in a
multiprocessing database system without having them interfere with one
another.

• This property of DBMS allows many transactions to access the same


database at the same time without interfering with each other.

• The primary goal of concurrency control is to ensure the serializability of


transaction executions in a multi-user database environment.


Problems of Concurrency Control

When concurrent transactions are executed in an uncontrolled manner, several


problems can occur. The concurrency control has the following three main
problems:-

• Lost updates

• Dirty read

• Unrepeatable read

Lost updates:- A lost update problem occurs when two transactions that access
the same database items have their operations interleaved in a way that makes
the value of some database item incorrect.

In other words, if transactions T1 and T2 both read a record and then update it,
the effects of the first update will be overwritten by the second update.

eg:

T1                  T2
read (A)
A := A - 100
                    read (A)
                    X := A * 0.02
                    A := A + X
write (A)
read (B)
                    write (A)
B := B + 100
write (B)


Suppose that the initial values of account A and B are $2000 and $1500,
respectively. Then, after the serial execution of transactions T1 and T2 (T1
followed by T2), the value of account A should be $1938 and that of account B
should be $1600.

Suppose that the operations of T1 and T2 are interleaved in such a way that T2
reads the value of account A before T1 updates its value in the database. Now,
when T2 updates the value of account A in the database, the value of account A
updated by transaction T1 is overwritten and hence is lost. This is known as lost
update problem.
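Replaying that interleaving as straight-line code makes the anomaly concrete. This is a sketch: the local variables stand in for each transaction's program variables:

```python
# The lost update, step by step: T2 reads A before T1 writes it back,
# so T2's write overwrites (loses) T1's debit.
A = 2000            # initial balance of account A

t1_a = A            # T1: read(A)
t1_a = t1_a - 100   # T1: A := A - 100
t2_a = A            # T2: read(A)  <- still sees the OLD value, 2000
x = t2_a * 0.02     # T2: X := A * 0.02
t2_a = t2_a + x     # T2: A := A + X
A = t1_a            # T1: write(A) -> 1900
A = t2_a            # T2: write(A) -> 2040; T1's debit is lost

print(A)            # 2040.0, but the correct serial result is 1938.0
```

Under a serial execution T1 then T2, T2 would read 1900 and write 1900 + 38 = 1938, matching the figure quoted above.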

Dirty Read:- A dirty read problem occurs when one transaction updates a
database item and then the transaction fails for some reason. The updated
database item is accessed by another transaction before it is changed back to the
original value.

In other words, a transaction T1 updates a record, which is read by the


transaction T2. Then T1 aborts and T2 now has values which have never formed
part of the stable database.

eg: Assume that T1 fails after debiting $100 from account A, but before crediting
this amount to account B. This will leave the database in an inconsistent state.
The value of account A is now $1900, which must be changed back to the original
one, that is, $2000. However, before the transaction T1 is rolled back, suppose
another transaction T2 reads the incorrect value of account A. This incorrect
value of account A that is read by transaction T2 is called dirty data, and the
problem is called the dirty read problem.

Unrepeatable Read:- The third problem occurs when a transaction tries to read
the value of the data item twice, and another transaction updates the same data
item in between the two read operations of the first transaction. As a result, the
first transaction reads varied values of same data item during its execution.

eg:

T3                  T4
read (A)
                    read (A)
                    A := A + 100
                    write (A)
read (A)
sum := sum + A
write (sum)

Consider a transaction T3 that reads the value of account A. At this point, let
another transaction T4 update the value of account A. Now if T3 again tries to
read the value of account A, it will get a different value. As a result, the
transaction T3 receives different values for two reads of account A. This
interleaved schedule of transactions T3 and T4 leads to the problem of
unrepeatable read.

Lock-Based Protocols

A lock is defined as a variable associated with a data item that describes the
status of the item with respect to possible operations that can be applied to it.

It prevents access to a database record by a second transaction until the


first transaction has completed all of its actions.

Lock Types

Locks are put under two categories:-

• Binary Locks

• Shared/Exclusive (for R/W) Locks.

Binary Locking:- In binary locking there are two states of locking, namely (a)
locked (‘1’) and (b) unlocked (‘0’).

If an object of a database (a table, page, tuple or attribute) is locked by a

transaction, no other transaction can use that object. A distinct lock is
associated with each database item. If the value of the lock on data item X is
1, item X cannot be accessed by a database operation that requires the item. If
a data item X is unlocked, any transaction can lock the object for its use.

Two operations are used with binary locking:-

1. lock_item (data item)

2. unlock_item (data item)

Rules followed during Binary Locking

• A transaction, T must issue the operation, lock_item(A) before any


read_item(A) or write_item(A) operations are performed in T.

• T must issue the operation, unlock_item(A) after all read_item(A) and


write_item(A) operations are completed in T.

• T will not issue a lock_item(A) operation if it already holds the lock on item-
A.

• T will not issue an unlock_item(A) operation unless it already holds the lock
on item-A.

• At the most, only one transaction can hold the lock on a particular item. No
two transactions can access the same item concurrently.
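The rules above can be sketched as a small lock manager. This is an illustrative sketch (class and method names are assumptions), enforcing one holder per item and unlock-only-what-you-hold:

```python
# A binary lock manager enforcing the rules above: lock before use,
# at most one holder per item, unlock only a lock you actually hold.
class BinaryLockManager:
    def __init__(self):
        self.holder = {}  # item -> transaction currently holding its lock

    def lock_item(self, txn, item):
        if item in self.holder:
            raise RuntimeError(f"{item} already locked by {self.holder[item]}")
        self.holder[item] = txn      # lock value goes from 0 to 1

    def unlock_item(self, txn, item):
        if self.holder.get(item) != txn:
            raise RuntimeError(f"{txn} does not hold the lock on {item}")
        del self.holder[item]        # lock value goes back to 0

mgr = BinaryLockManager()
mgr.lock_item("T1", "A")
try:
    mgr.lock_item("T2", "A")         # rejected: at most one holder per item
except RuntimeError as err:
    print(err)
mgr.unlock_item("T1", "A")
mgr.lock_item("T2", "A")             # now allowed
```

A real DBMS would block T2 on a wait queue rather than raise; raising keeps the sketch short while still showing the rule.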

Advantages and Disadvantages of Binary Locks

• Binary locking is easy to implement, but it is too restrictive to yield


optimal concurrency.

• DBMS will not allow the transactions to read the same database object
even if neither of the transaction updates the database.

• At most one transaction can hold the lock on a particular item. No two
transactions can access the same data concurrently.


Shared/Exclusive locking:- A shared/exclusive (or Read/Write) lock uses


multiple-mode lock.

In this type of locking, there are three locking operations, namely

(a) read_lock (A)

(b) write_lock (A)

(c) unlock (A)

A read_locked item is also called share-locked, because other transactions


are allowed to read the item.

A write_locked item is called exclusive-locked, because a single


transaction exclusively holds the lock on the item.

A shared lock is denoted by S and the exclusive lock is denoted by X.

In shared/exclusive notation, the three locking operations are written as:

• lock_S(A)
• lock_X(A)
• unlock(A)
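The S/X compatibility rules (many readers may share an item; a writer needs it exclusively) can be sketched as follows. This is an illustrative sketch, with assumed names and no wait queues or lock upgrades:

```python
# Shared/exclusive locking: S locks are compatible with each other,
# X conflicts with everything held by another transaction.
class SXLockManager:
    def __init__(self):
        self.shared = {}     # item -> set of transactions holding S
        self.exclusive = {}  # item -> the single transaction holding X

    def lock_S(self, txn, item):
        if item in self.exclusive and self.exclusive[item] != txn:
            return False                      # another transaction holds X
        self.shared.setdefault(item, set()).add(txn)
        return True

    def lock_X(self, txn, item):
        if item in self.exclusive and self.exclusive[item] != txn:
            return False                      # another writer holds the item
        if self.shared.get(item, set()) - {txn}:
            return False                      # other readers share the item
        self.exclusive[item] = txn
        return True

    def unlock(self, txn, item):
        self.shared.get(item, set()).discard(txn)
        if self.exclusive.get(item) == txn:
            del self.exclusive[item]

mgr = SXLockManager()
print(mgr.lock_S("T1", "A"))  # True: first reader
print(mgr.lock_S("T2", "A"))  # True: S locks are compatible
print(mgr.lock_X("T3", "A"))  # False: X conflicts with the readers
mgr.unlock("T1", "A"); mgr.unlock("T2", "A")
print(mgr.lock_X("T3", "A"))  # True: no one else holds a lock now
```

Returning False stands in for "the requester must wait"; a full implementation would queue the request instead.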

There are two approaches to write concurrency control Algorithms:

• Pessimistic Approach

• Optimistic Approach


[Diagram: two types of concurrency control approaches - the pessimistic approach
(locking protocols such as two phase locking and graph-based locking, and
timestamp ordering) and the optimistic approach.]
Pessimistic Approach: An approach in which the transactions are delayed if
they conflict with each other at some point of time in future is called as a
pessimistic approach.

During pessimistic execution, the validate operation is performed first. If there is


a validation according to compatibility of lock then only read, compute and
write operations are performed.

i.e, Validate → Read → Compute → Write

Based on the pessimistic approach, there are two commonly used
locking protocols, namely:-

Two phase locking protocol

Non two phase locking protocol


Optimistic Approach: It is based on the assumption that conflicts of operations on


database are rare. It is better to run the transactions to completion and to check
for conflicts only before they commit.

No checking is done while the transactions are executing. Also note that this
method does not require locking or time-stamping technique.

Here, a transaction is executed without restrictions until it is committed. It allows


conflict checking at the end only.

This approach works on the premise that conflicts are rare.

i.e, Read → Compute → Validate → Write

Two Phase Locking

• If all locking operations precede the first unlock operation in the


transaction then a transaction follows the two phase locking protocol.

• 2 PL has two phases in a given schedule

• Phase-I Growing Phase: A phase during which all locks are requested.

• Phase-II Shrinking Phase: A phase during which all locks are released.


Initially, a transaction is in the growing phase and acquires locks as


needed. Once the transaction releases a lock, it enters the shrinking phase and
can issue no more lock requests.

Rules for 2PL

• Two transactions cannot have conflicting locks.

• No unlock operation can precede a lock operation in the same transaction.

• No data are affected until all locks are obtained.

Note:- As per 2PL protocol, the transaction cannot acquire a new lock after it
has unlocked any of its existing locked items.
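The two-phase property is easy to check mechanically: scan a transaction's lock/unlock sequence and reject any lock that follows an unlock. The encoding of operations as ("lock"/"unlock", item) pairs is an assumption for illustration:

```python
# Checks the 2PL rule: no lock request may follow the first unlock.
def is_two_phase(ops):
    """ops: a transaction's sequence of ("lock", item) / ("unlock", item)."""
    seen_unlock = False
    for action, _item in ops:
        if action == "unlock":
            seen_unlock = True            # shrinking phase has begun
        elif seen_unlock:
            return False                  # a lock request in the shrinking phase
    return True

# Mirrors the T1/T2 example discussed below: T1 locks B then A, then unlocks;
# T2 unlocks A before locking B, so it is not two-phase.
t1 = [("lock", "B"), ("lock", "A"), ("unlock", "B"), ("unlock", "A")]
t2 = [("lock", "A"), ("unlock", "A"), ("lock", "B"), ("unlock", "B")]
print(is_two_phase(t1))  # True
print(is_two_phase(t2))  # False
```

The checker only looks at each transaction in isolation; it says nothing about deadlocks or cascading rollbacks, which 2PL does not prevent.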

• In T1 we put an exclusive lock on B(data item), so we can read or write B.


Then, we put an exclusive lock on A and thus we can read or write to A.

• Finally, we unlock both B and A in that order. So, T1 is in two phase.

• On the other hand T2 is not in two phase. This is because T2 puts a


shared lock on data-item(A), so read operation can be performed.


• Then, it unlocks A. Then it locks B in shared mode and reads it. Then it
unlocks B.

• Finally, T2 tries to display the result. This schedule of T2 is not two-phase,

as it releases the lock on A and then immediately locks B.

• The 2PL protocol requires that a transaction acquire all of its locks
before releasing any. So, T2 is not in two phase.

Limitations of 2 PL

• Deadlock: 2PL protocols do not prevent deadlocks. The transaction T1 is


in two phase but still has a problem of deadlock. This is because if we do
not unlock a data item, before requesting a lock on another data item,
then deadlocks may occur.

• Cascading Rollback: Cascading Rollback is a phenomenon in which a


single transaction failure leads to a series of transaction rollback.

Non Two Phase Locking Protocol

• Graph Based Protocol

• Time Stamp ordering Protocol

Graph Based Protocols

• In this protocol, we must have prior knowledge about the order in which
the data items will be accessed. Here, the data items are arranged in a tree
form.

• The nodes of the tree represent data items to be accessed. If (A,B) is an arc
of the tree, then A is called as the parent of B.

• If there is a directed path from A to B, then A is called as an ancestor of B.

Rules for Graph/Tree Protocol

• No data item can be accessed unless the transaction locks it.


• A data-item, X can be locked by a transaction, T, only if T currently holds


the lock on the parent of X, unless X is the root of sub tree accessed by T.

• A transaction, T, after unlocking a data item cannot subsequently relock it.

• Transactions can unlock data items any time.

• Here d0-d10 are the data items, arranged in a tree.

• Say, now 3-transactions follow a tree protocol on this graph-

• T1 needs exclusive lock on d0,d1,d4,d9

• T2 needs exclusive lock on d1,d4

• T3 needs exclusive lock on d3,d7,d8

A schedule of transactions obeying the above locks and using the tree protocol is
as follows:


• These transactions are not in two phase, because transaction T1 releases
its lock on the data item d1 as soon as its processing of d1 is over,
thereby allowing T2 to proceed concurrently.

• But if we use tree protocol then lock is released by T1 and T2 can proceed
concurrently as shown above. So, we have a better concurrency now. Please
note that the above schedule is equivalent to a serial schedule as follows:-

• T1 → T3 → T2

• If we draw its precedence graph then we find that it does not have a
cycle. Hence, the tree/graph protocol ensures serializability

Advantages of Tree/Graph Protocol

• Unlocking may occur earlier in this protocol, so there is shorter waiting


time of transactions and hence increased concurrency.

• This protocol is free from deadlocks and thus, no rollbacks are required.

Disadvantages of Tree/Graph Protocol

• If there is no prior knowledge of what data items will need to be locked,


then in such a case, the transaction will have to lock the root of the
tree. This also can reduce concurrency.


Timestamp Ordering Protocol

• Timestamp is a unique identifier created by the DBMS to identify the


relative starting time of a transaction.

• Typically timestamp values are assigned in the order in which the


transactions are submitted to the system. So, a timestamp can be thought
of as the transaction start time.

• Therefore, time stamping is a method of concurrency control in which each


transaction is assigned a transaction timestamp.

• To order the transactions, we associate a unique fixed timestamp,

denoted by TS(Ti), with each transaction Ti.

• The timestamp is a unique integer value. It is assigned by the DBMS before
the transaction, Ti, starts execution.

• If a transaction (Ti) has been assigned timestamp, TS(Ti), and a


new transaction, Tj, enters the system, then

TS(Ti)<TS(Tj)

Timestamps must have two properties:-

1. Uniqueness

2. Monotonicity

The uniqueness property assures that no equal timestamp values can exist
and monotonicity assures that timestamp values always increase.

The READ & WRITE operations of database within the same transaction must
have the same timestamp.

Timestamping is a concurrency control protocol in which the fundamental goal is


to order transactions globally in such a way that older transactions get priority in
the event of a conflict.


The timestamp method does not require any locks. Therefore, there are no
deadlocks. The timestamp methods do not make the transactions wait to prevent
conflicts as is the case with locking.

Deadlocks

• A deadlock is a condition in which two (or more) transactions in a set are


waiting simultaneously for locks held by some other transaction in the set.
Neither transaction can continue because each transaction in the set is on a
waiting queue, waiting for one of the other transactions in the set to
release the lock on an item

Deadlock Detection and Prevention

Deadlock detection is a periodic check by the DBMS to determine if the waiting


line for some resource exceeds a predetermined limit.

There are following three basic schemes to detect and prevent deadlock:

1. Deadlock Prevention

2. Deadlock Detection

3. Deadlock Avoidance

Deadlock Prevention (Never allow deadlock):- Deadlock prevention technique


avoids the conditions that lead to deadlocking. It requires that every transaction
lock all data items it needs in advance. If any of the items cannot be obtained,
none of the items are locked. In other words, a transaction requesting a new lock
is aborted if there is the possibility that a deadlock can occur.

Conflict Resolution in TimeStamp

Wait-Die:- In this scheme, if a transaction requests to lock a resource (data item),


which is already held with a conflicting lock by another transaction, then one of
the two possibilities may occur −


If TS(Ti) < TS(Tj) − that is Ti, which is requesting a conflicting lock, is older
than Tj − then Ti is allowed to wait until the data-item is available.

If TS(Ti) > TS(Tj) − that is, Ti is younger than Tj − then Ti dies. Ti is restarted


later with a random delay but with the same timestamp.

This scheme allows the older transaction to wait but kills the younger one.

Wound-Wait Scheme:-

In this scheme, if a transaction requests to lock a resource (data item), which is


already held with conflicting lock by some another transaction, one of the two
possibilities may occur −

If TS(Ti) < TS(Tj), then Ti forces Tj to be rolled back − that is Ti wounds Tj. Tj
is restarted later with a random delay but with the same timestamp.

If TS(Ti) > TS(Tj), then Ti is forced to wait until the resource is available.

This scheme, allows the younger transaction to wait; but when an older
transaction requests an item held by a younger one, the older transaction forces
the younger one to abort and release the item.

In both the cases, the transaction that enters the system at a later stage is
aborted.

Deadlock Detection:- In a deadlock detection technique, the DBMS periodically

tests the database for deadlocks. If a deadlock is found, one of the
transactions is aborted and the other transaction continues. The aborted
transaction is then rolled back and restarted.

Deadlock Avoidance (Detect deadlocks periodically):- In a deadlock avoidance


technique, the transaction must obtain all the locks it needs before it can be
executed. Thus, it avoids rollback of conflicting transactions by requiring that
locks be obtained in succession.

The simplest way to detect a state of deadlock is for the system to construct and maintain a wait-for graph.


In a wait-for graph, an arrow is drawn from the transaction to the record being sought, and then an arrow is drawn from that record to the transaction that is currently using it. If the graph has cycles, deadlock is detected.

Thus, in a wait-for graph, one node is created for each transaction that is
currently executing.

Whenever a transaction T1 is waiting to lock a data item X that is currently locked by transaction T2, a directed edge (T1 → T2) is created in the wait-for graph.

When transaction T2 releases the lock(s) on the data items that transaction T1 was waiting for, the directed edge is dropped from the wait-for graph.

We have a state of deadlock if and only if the wait-for graph has a cycle.
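Cycle detection over the wait-for graph can be sketched with a depth-first search. The graph representation below (a dict mapping each transaction to the set of transactions it waits for) is an assumption for illustration:

```python
def has_deadlock(wait_for):
    """Return True if the wait-for graph contains a cycle (i.e., a deadlock).
    wait_for: dict mapping a transaction id to the set of transactions it waits for."""
    WHITE, GRAY, BLACK = 0, 1, 2          # unvisited / on current DFS path / done
    color = {t: WHITE for t in wait_for}

    def visit(t):
        color[t] = GRAY
        for u in wait_for.get(t, ()):
            if color.get(u, WHITE) == GRAY:        # back edge: cycle found
                return True
            if color.get(u, WHITE) == WHITE and visit(u):
                return True
        color[t] = BLACK
        return False

    return any(color[t] == WHITE and visit(t) for t in list(color))
```

For example, `has_deadlock({"T1": {"T2"}, "T2": {"T1"}})` reports a deadlock, while a chain with no cycle does not.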

MULTI VERSION CONCURRENCY CONTROL


This concurrency control technique keeps the old values of a data item when the
item is updated. These are known as multi-version concurrency control, because
several versions (values) of an item are maintained. When a transaction requires
access to an item, an appropriate version is chosen to maintain the serializability
of the currently executing schedule, if possible. The idea is that some read
operations that would be rejected in other techniques can still be accepted by
reading an older version of the item to maintain serializability. When a
transaction writes an item, it writes a new version and the old version of the item
is retained. Some multi-version concurrency control algorithms use the concept of
view serializability rather than conflict serializability.

An obvious drawback of multi-version techniques is that more storage is needed to maintain multiple versions of the database items. However, older versions may have
to be maintained anyway—for example, for recovery purposes. In addition, some
database applications require older versions to be kept to maintain a history of the
evolution of data item values. The extreme case is a temporal database, which keeps
track of all changes and the times at which they occurred. In such cases, there is no additional storage penalty for multi-version techniques, since older versions are already maintained.

Multi-version Technique Based on Timestamp Ordering

In this method, several versions X1, X2, ..., Xk of each data item X are maintained. For each version, the value of the version and the following two timestamps are kept:

1. read_TS(Xi): The read timestamp of Xi is the largest of all the timestamps of transactions that have successfully read version Xi.

2. write_TS(Xi): The write timestamp of Xi is the timestamp of the transaction that wrote the value of version Xi.

Whenever a transaction T is allowed to execute a write_item(X) operation, a new version Xk+1 of item X is created, with both write_TS(Xk+1) and read_TS(Xk+1) set to TS(T). Correspondingly, when a transaction T is allowed to read the value of version Xi, the value of read_TS(Xi) is set to the larger of the current read_TS(Xi) and TS(T).

To ensure serializability, the following two rules are used:

1. If transaction T issues a write_item(X) operation, and version i of X has the highest write_TS(Xi) of all versions of X that is also less than or equal to TS(T), and read_TS(Xi) > TS(T), then abort and roll back transaction T; otherwise, create a new version Xj of X with read_TS(Xj) = write_TS(Xj) = TS(T).

2. If transaction T issues a read_item(X) operation, find the version i of X that has the highest write_TS(Xi) of all versions of X that is also less than or equal to TS(T); then return the value of Xi to transaction T, and set the value of read_TS(Xi) to the larger of TS(T) and the current read_TS(Xi).

As we can see in case 2, a read_item(X) is always successful, since it finds the appropriate version Xi to read based on the write_TS of the various existing versions of X. In case 1, however, transaction T may be aborted and rolled back. This happens if T is attempting to write a version of X that should have been read by another transaction T′ whose timestamp is read_TS(Xi); however, T′ has already read version Xi, which was written by the transaction with timestamp equal to write_TS(Xi). If this conflict occurs, T is rolled back; otherwise, a new version of X, written by transaction T, is created. Notice that, if T is rolled back, cascading rollback may occur. Hence, to ensure recoverability, a transaction T should not be allowed to commit until after all the transactions that have written some version that T has read have committed.
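The two rules above can be sketched as follows. The MVItem class, its (value, read_TS, write_TS) tuple layout, and the timestamp-0 initial version are illustrative assumptions, not a real DBMS structure:

```python
class MVItem:
    """A data item X kept as a list of versions (value, read_TS, write_TS)."""

    def __init__(self, value):
        # initial version, assumed written by a transaction with timestamp 0
        self.versions = [(value, 0, 0)]

    def _latest_le(self, ts):
        """Index of the version with the highest write_TS <= ts."""
        best = None
        for i, (_, _, wts) in enumerate(self.versions):
            if wts <= ts and (best is None or wts > self.versions[best][2]):
                best = i
        return best

    def read(self, ts):
        """Rule 2: a read always succeeds against the appropriate version."""
        i = self._latest_le(ts)
        value, rts, wts = self.versions[i]
        self.versions[i] = (value, max(rts, ts), wts)   # bump read_TS(Xi)
        return value

    def write(self, ts, value):
        """Rule 1: abort (return False) if a younger reader saw the chosen version."""
        i = self._latest_le(ts)
        _, rts, wts = self.versions[i]
        if rts > ts:
            return False                       # T must be aborted and rolled back
        if wts == ts:
            self.versions[i] = (value, rts, wts)        # overwrite T's own version
        else:
            self.versions.append((value, ts, ts))       # new version, read_TS = write_TS = ts
        return True
```

For example, if a reader with timestamp 5 has read the initial version, a writer with timestamp 3 is rejected, while a writer with timestamp 7 creates a new version.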

VALIDATION BASED CONCURRENCY CONTROL


In optimistic concurrency control techniques, also known as validation or
certification techniques, no checking is done while the transaction is executing. In
this scheme, updates in the transaction are not applied directly to the database
items until the transaction reaches its end. During transaction execution, all updates
are applied to local copies of the data items that are kept for the transaction. At the
end of transaction execution, a validation phase checks whether any of the
transaction’s updates violate serializability. Certain information needed by the
validation phase must be kept by the system. If serializability is not violated, the
transaction is committed and the database is updated from the local copies;
otherwise, the transaction is aborted and then restarted later.

There are three phases for this concurrency control protocol:

1. Read phase: A transaction can read values of committed data items from the
database. However, updates are applied only to local copies (versions) of the
data items kept in the transaction workspace.

2. Validation phase: Checking is performed to ensure that serializability will not be violated if the transaction updates are applied to the database.

3. Write phase: If the validation phase is successful, the transaction updates are
applied to the database; otherwise, the updates are discarded and the
transaction is restarted.

The idea behind optimistic concurrency control is to do all the checks at once;
hence, transaction execution proceeds with a minimum of overhead until the validation phase is reached. If there is little interference among transactions, most will be validated successfully. However, if there is much interference, many
transactions that execute to completion will have their results discarded and must
be restarted later. Under these circumstances, optimistic techniques do not work
well. The techniques are called "optimistic" because they assume that little
interference will occur and hence that there is no need to do checking during
transaction execution.

In the validation phase for transaction Ti, the protocol checks that Ti does not
interfere with any committed transactions or with any other transactions
currently in their validation phase. The validation phase for Ti checks that, for
each such transaction Tj that is either committed or is in its validation phase, one
of the following conditions holds:

1. Transaction Tj completes its write phase before Ti starts its read phase.

2. Ti starts its write phase after Tj completes its write phase, and the read_set of
Ti has no items in common with the write_set of Tj.

3. Both the read_set and write_set of Ti have no items in common with the
write_set of Tj, and Tj completes its read phase before Ti completes its
read phase.

When validating transaction Ti, the first condition is checked first for each
transaction Tj, since (1) is the simplest condition to check. Only if condition (1) is
false is condition (2) checked, and only if (2) is false is condition (3)—the most
complex to evaluate—checked. If any one of these three conditions holds, there is
no interference and Ti is validated successfully. If none of these three conditions
holds, the validation of transaction Ti fails and it is aborted and restarted later
because interference may have occurred.
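The three conditions can be sketched as a single check. The dictionary keys marking phase boundaries (read_start, read_end, write_start, write_end) are illustrative assumptions about how the system records each transaction's phases:

```python
def validate(ti, tj):
    """Return True if Ti passes validation against one committed/validating Tj.
    Each transaction is a dict with read_set, write_set, and phase-boundary
    times: read_start, read_end, write_start, write_end (names are illustrative)."""
    # (1) Tj completed its write phase before Ti started its read phase.
    if tj["write_end"] < ti["read_start"]:
        return True
    # (2) Ti starts its write phase after Tj completes its write phase,
    #     and Tj wrote nothing that Ti read.
    if tj["write_end"] < ti["write_start"] and not (ti["read_set"] & tj["write_set"]):
        return True
    # (3) Tj's write_set overlaps neither Ti's read_set nor write_set,
    #     and Tj completes its read phase before Ti completes its read phase.
    if (not ((ti["read_set"] | ti["write_set"]) & tj["write_set"])
            and tj["read_end"] < ti["read_end"]):
        return True
    return False   # none of the conditions holds: abort and restart Ti
```

In a full validator this check is repeated against every such Tj, and Ti commits only if it passes against all of them.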

Database Recovery


• Database recovery is the process of restoring the database to a correct state in the event of a failure.

• In other words, it is the process of restoring the database to the most recent consistent state that existed shortly before the time of system failure.

• The failure may be the result of a system crash due to hardware or software errors, a media failure such as a head crash, or a software error in the application, such as a logical error in the program that is accessing the database.

Types of database failures

• Hardware Failures:- Hardware failures may include memory errors, disk crashes, bad disk sectors, disk-full errors and so on.

• Software Failures:- Software failures may include failures related to software such as the operating system, DBMS software, application programs and so on.

• System crashes:- System crashes are due to hardware or software errors, resulting in the loss of main memory.

• Network Failures:- Network failures can occur while using a client-server configuration or a distributed database system where multiple database servers are connected by common networks.

• Natural physical disasters:- These are failures such as fires, floods, earthquakes or power failures.

• Carelessness:- These are failures due to unintentional destruction of data or facilities by operators or users.


Recovery from transaction failure means that the database is restored to the most recent consistent state just before the time of failure. To do this, the system must keep information about the changes that were applied to data items by the various transactions. This information is kept in the system log.

Techniques used:

1. Deferred update (NO-UNDO/REDO algorithm)
2. Immediate update (UNDO/NO-REDO algorithm)
3. Shadow paging

Deferred update (NO-UNDO/REDO algorithm)

These techniques defer or postpone any actual updates to the database until the transaction reaches its commit point. During transaction execution, the updates are written to the log file. After the transaction reaches its commit point, the log file is force-written to disk, and then the updates are recorded in the database.

[Diagram: during execution, a transaction's updates go only to the log file; after the commit point, they are applied to the database.]


If the transaction fails before reaching its commit point, there is no need to undo
any operations because the transaction has not affected the database on disk in
any way.

[Diagram: a transaction that fails before its commit point has written nothing to the database, so there is no need to UNDO.]

A typical deferred update protocol uses the following procedure:

A transaction cannot change the database on disk until it reaches its commit point.

A transaction does not reach its commit point until all its update operations are recorded in the log file and the log file is force-written to disk.

Recovery techniques based on deferred update are therefore known as NO-UNDO/REDO techniques.

REDO is needed in case the system fails after a transaction commits but
before all its changes are recorded on disk. In this case, the transaction
operations are redone from the log file.
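A NO-UNDO/REDO restart can be sketched as a single pass over the log, redoing only the writes of committed transactions. The log-record tuples used here are an illustrative format, not a real log layout:

```python
def recover_no_undo_redo(log):
    """NO-UNDO/REDO recovery: redo writes of committed transactions only.
    log: list of records ("write", txn, item, value) or ("commit", txn)."""
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    db = {}
    for rec in log:
        if rec[0] == "write" and rec[1] in committed:
            _, _, item, value = rec
            db[item] = value          # redo in log order
    return db
```

Writes of uncommitted transactions are simply ignored, which is exactly why no UNDO is ever needed with deferred update.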


[Diagram: if the system fails after the commit point but before the updates reach the disk database, the committed transaction's operations are redone from the log.]

Immediate update (UNDO/NO-REDO algorithm)

In this technique, when a transaction issues an update command, the database can be updated immediately without any need to wait for the transaction to reach its commit point. If the recovery technique ensures that all updates of a transaction are recorded in the database on disk before the transaction commits, there is no need to redo any operations of committed transactions.

[Diagram: updates are applied to the database (and logged) as they are issued; since all of a transaction's updates reach disk before commit, no REDO is needed.]


Provisions must be made for undoing the effect of update operations that have been applied to the database by a failed transaction. This is done by rolling back the transaction and undoing the effect of its write operations.

[Diagram: if the transaction fails before committing, its writes that already reached the database are undone (rolled back) using the log.]
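The rollback can be sketched as a reverse pass over a log that stores before-images (old values). The record format is an illustrative assumption:

```python
def recover_undo_no_redo(db, log):
    """UNDO/NO-REDO recovery: undo writes of uncommitted transactions using
    before-images kept in the log. Records are
    ("write", txn, item, old_value, new_value) or ("commit", txn)."""
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    for rec in reversed(log):          # undo in reverse log order
        if rec[0] == "write" and rec[1] not in committed:
            _, _, item, old_value, _ = rec
            db[item] = old_value       # restore the before-image
    return db
```

Scanning the log in reverse guarantees that if an uncommitted transaction wrote the same item twice, the earliest before-image is the one that finally sticks.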

Shadow Paging

This technique does not require a log in a single-user environment; in a multi-user environment, a log may still be needed for the concurrency control method.

Shadow paging considers the following:

o The database is partitioned into fixed-length blocks referred to as PAGES.

o The page table has n entries – one for each database page.

o Each entry contains a pointer to a page on disk (entry 1 points to the first database page, and so on).

The idea is to maintain two page tables during the life of the transaction:

o The current page table, in main memory (volatile storage)

o The shadow page table, in non-volatile storage


When the transaction starts, both page tables are identical.

o The shadow page table is never changed over the duration of the transaction.
o The current page table may be changed when a transaction performs a write operation.
o All input and output operations use the current page table to locate database pages
on disk.

During transaction execution, all updates are performed using the current directory, and the shadow directory is never modified.

When a write_item operation is performed

– A new copy of the modified DB page is created and the old copy is not
overwritten.

Two versions, of the pages updated by the transaction, are kept.

– The new page is written elsewhere on some unused disk block.

– The current directory entry is modified to point to the new disk block.

• The shadow directory is not modified.
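A write under shadow paging can be sketched as copy-on-write plus a current-table update. The append-only "disk" list, the dict page tables, and the function name are illustrative simplifications:

```python
def shadow_write(disk, current_table, page_no, new_page):
    """Write a page under shadow paging: never overwrite the old copy;
    place the new copy on an unused block and repoint only the current
    page table. The shadow page table is left untouched."""
    new_block = len(disk)              # pick an unused disk block (append-only here)
    disk.append(new_page)              # old copy stays where it was
    current_table[page_no] = new_block # current table now points to the new copy
    return current_table

# Commit: atomically install the current table as the new shadow table.
# Abort: simply discard the current table; the shadow table is still intact.
```

Because the shadow table still points at the old blocks, recovery after a crash is just "keep using the shadow table" – nothing to undo or redo.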

Advantages

No Overhead for writing log records.


No Undo / No Redo algorithm.


Recovery is faster.

Disadvantages

Data gets fragmented or scattered.


Hard to extend the algorithm to allow transactions to run concurrently.

Authentication and authorization

Authentication is the mechanism whereby systems may securely identify their users. Authentication systems provide answers to the questions:

Who is the user?


Is the user really who he/she represents himself to be?

An authentication system may be as simple (and insecure) as a plain-text password challenging system. In order to verify the identity of a user, the authenticating system typically challenges the user to provide his unique information (his password, fingerprint, etc.) -- if the authenticating system can verify that the shared secret was presented correctly, the user is considered authenticated.

Authorization, by contrast, is the mechanism by which a system determines what level of access a particular authenticated user should have to secure resources controlled by the system. For example, a database management system might be designed so as to provide certain specified individuals with the ability to retrieve information from a database but not the ability to change data stored in the database, while giving other individuals the ability to change data. Authorization systems provide answers to the questions:

Is user X authorized to access resource R?


Is user X authorized to perform operation P?
Is user X authorized to perform operation P on resource R?
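The last of these questions can be sketched as a lookup in an access-control list. The ACL contents and function name below are purely illustrative:

```python
# Hypothetical access-control list of (user, operation, resource) grants.
ACL = {
    ("alice", "read", "accounts"),
    ("alice", "write", "accounts"),
    ("bob", "read", "accounts"),
}

def is_authorized(user, operation, resource):
    """Answer: is user X authorized to perform operation P on resource R?"""
    return (user, operation, resource) in ACL
```

The coarser questions ("is X authorized to access R at all?") can be answered over the same table by ignoring the operation field.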

Authentication and authorization are somewhat tightly-coupled mechanisms -- authorization systems depend on secure authentication systems to ensure that users are who they claim to be and thus prevent unauthorized users from gaining access to secured resources.

Figure I (not reproduced here) depicts the interactions between arbitrary authentication and authorization systems and a typical client/server application. In that diagram, a user working at a client system interacts with the authentication system to prove his identity and then carries on a conversation with a server system. The server system, in turn, interacts with an authorization system to determine what rights and privileges the client's user should be granted.

Authentication: Prove it.


Authorization: Here is what you can do.
