Unit 5 Transaction and Concurrency Control



Subject: Database Management Systems

UNIT-V
Transaction and Concurrency Control
Transactions
Transaction Concept
• A transaction is a unit of program execution that accesses and
possibly updates various data items.
• E.g., transaction to transfer Rs.50 from account A to account B:
1. read(A)
2. A := A – 50
3. write(A)
4. read(B)
5. B := B + 50
6. write(B)
• Two main issues:
• Failures of various kinds, such as hardware failures and
system crashes
• Concurrent execution of multiple transactions
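• As a concrete illustration, a minimal sketch of the transfer above as one atomic SQL transaction, using Python's sqlite3 (the accounts table and its starting balances are illustrative assumptions, not part of the original example):

import sqlite3

# The Rs. 50 transfer as one atomic transaction.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES ('A', 1000), ('B', 2000)")

try:
    with conn:  # opens a transaction; commits on success, rolls back on error
        conn.execute("UPDATE accounts SET balance = balance - 50 WHERE name = 'A'")
        conn.execute("UPDATE accounts SET balance = balance + 50 WHERE name = 'B'")
except sqlite3.Error:
    pass  # after a failure, neither update is reflected in the database

print(dict(conn.execute("SELECT name, balance FROM accounts")))
# {'A': 950, 'B': 2050}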
Required Properties of a Transaction
• Consider a transaction to transfer Rs. 50 from account A to account B:
1. read(A)
2. A := A – 50
3. write(A)
4. read(B)
5. B := B + 50
6. write(B)
• Atomicity requirement
• If the transaction fails after step 3 and before step 6, money will be
“lost” leading to an inconsistent database state
• Failure could be due to software or hardware
• The system should ensure that updates of a partially executed
transaction are not reflected in the database
• Durability requirement — once the user has been notified that the
transaction has completed (i.e., the transfer of the Rs. 50 has taken
place), the updates to the database by the transaction must persist
even if there are software or hardware failures.
Required Properties of a Transaction
• Consistency requirement in above example:
• The sum of A and B is unchanged by the execution of the
transaction
• In general, consistency requirements include
• Explicitly specified integrity constraints such as primary keys
and foreign keys
• Implicit integrity constraints
• e.g., sum of balances of all accounts, minus sum of loan
amounts must equal value of cash-in-hand
• A transaction, when starting to execute, must see a consistent
database.
• During transaction execution the database may be temporarily
inconsistent.
• When the transaction completes successfully the database must be
consistent
• Erroneous transaction logic can lead to inconsistency
Required Properties of a Transaction
• Isolation requirement — if between steps 3 and 6 (of the fund transfer transaction), another transaction T2 is allowed to access the partially updated database, it will see an inconsistent database (the sum A + B will be less than it should be).

T1                          T2
1. read(A)
2. A := A – 50
3. write(A)
                            read(A), read(B), print(A+B)
4. read(B)
5. B := B + 50
6. write(B)
• Isolation can be ensured trivially by running transactions serially
• That is, one after the other.
• However, executing multiple transactions concurrently has significant
benefits.
ACID Properties
A transaction is a unit of program execution that accesses and
possibly updates various data items. To preserve the integrity of data
the database system must ensure:
• Atomicity. Either all operations of the transaction are properly
reflected in the database or none are.
• Consistency. Execution of a transaction in isolation preserves the
consistency of the database.
• Isolation. Although multiple transactions may execute concurrently,
each transaction must be unaware of other concurrently executing
transactions. Intermediate transaction results must be hidden from
other concurrently executed transactions.
• That is, for every pair of transactions Ti and Tj, it appears to Ti that either Tj finished execution before Ti started, or Tj started execution after Ti finished.
• Durability. After a transaction completes successfully, the changes it
has made to the database persist, even if there are system failures.
Transaction State
• Active – the initial state; the transaction stays in this state
while it is executing
• Partially committed – after the final statement has been
executed.
• Failed -- after the discovery that normal execution can no
longer proceed.
• Aborted – after the transaction has been rolled back and the
database restored to its state prior to the start of the
transaction. Two options after it has been aborted:
• Restart the transaction
• can be done only if no internal logical error
• Kill the transaction
• Committed – after successful completion.
Transaction State
Concurrent Executions
• Multiple transactions are allowed to run concurrently in the
system. Advantages are:
• Increased processor and disk utilization, leading to better
transaction throughput
• E.g. one transaction can be using the CPU while another is
reading from or writing to the disk
• Reduced average response time for transactions: short
transactions need not wait behind long ones.
• Concurrency control schemes – mechanisms to achieve isolation
• That is, to control the interaction among the concurrent
transactions in order to prevent them from destroying the
consistency of the database
Schedules
• Schedule – a sequence of instructions that specifies the chronological order in which instructions of concurrent transactions are executed
• A schedule for a set of transactions must consist of all
instructions of those transactions
• Must preserve the order in which the instructions appear in
each individual transaction.
• A transaction that successfully completes its execution will have a commit instruction as the last statement
• By default transaction assumed to execute commit instruction
as its last step
• A transaction that fails to successfully complete its execution will
have an abort instruction as the last statement
Schedule 1
• Let T1 transfer Rs. 50 from A to B, and T2 transfer 10% of the balance from A to B.
• An example of a serial schedule in which T1 is followed by T2:
Schedule 2
• A serial schedule in which T2 is followed by T1:
Schedule 3
• Let T1 and T2 be the transactions defined previously. The following
schedule is not a serial schedule, but it is equivalent to Schedule 1.

Note -- In schedules 1, 2 and 3, the sum “A + B” is preserved.


Schedule 4
• The following concurrent schedule does not preserve the sum of “A + B”
Serializability
Serializability
• Basic Assumption – Each transaction preserves database
consistency.
• Thus, serial execution of a set of transactions preserves
database consistency.
• A (possibly concurrent) schedule is serializable if it is
equivalent to a serial schedule. Different forms of schedule
equivalence give rise to the notions of:
1. conflict serializability
2. view serializability
Simplified view of transactions

• We ignore operations other than read and write instructions


• We assume that transactions may perform arbitrary
computations on data in local buffers in between reads and
writes.
• Our simplified schedules consist of only read and write
instructions.
Conflicting Instructions
• Let li and lj be two instructions of transactions Ti and Tj respectively. Instructions li and lj conflict if and only if there exists some item Q accessed by both li and lj, and at least one of these instructions wrote Q.
1. li = read(Q), lj = read(Q). li and lj don’t conflict.
2. li = read(Q), lj = write(Q). They conflict.
3. li = write(Q), lj = read(Q). They conflict
4. li = write(Q), lj = write(Q). They conflict
• Intuitively, a conflict between li and lj forces a (logical) temporal
order between them.
• If li and lj are consecutive in a schedule and they do not
conflict, their results would remain the same even if they
had been interchanged in the schedule.
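• A minimal Python sketch of this conflict test; the (transaction, operation, item) encoding of an instruction is an assumption made for illustration:

def conflicts(li, lj):
    ti, op_i, q_i = li
    tj, op_j, q_j = lj
    # Different transactions, the same data item, and at least one write.
    return ti != tj and q_i == q_j and 'write' in (op_i, op_j)

print(conflicts(('T1', 'read', 'Q'), ('T2', 'read', 'Q')))   # False: two reads
print(conflicts(('T1', 'read', 'Q'), ('T2', 'write', 'Q')))  # True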
Conflict Serializability
• If a schedule S can be transformed into a schedule S´ by a series
of swaps of non-conflicting instructions, we say that S and S´ are
conflict equivalent.
• We say that a schedule S is conflict serializable if it is conflict
equivalent to a serial schedule
Conflict Serializability
• Schedule 3 can be transformed into Schedule 6 -- a serial schedule
where T2 follows T1, by a series of swaps of non-conflicting
instructions. Therefore, Schedule 3 is conflict serializable.

Schedule 3 (left); Schedule 6 (right)
Conflict Serializability

• Example of a schedule that is not conflict serializable:

• We are unable to swap instructions in the above schedule to obtain either the serial schedule < T3, T4 >, or the serial schedule < T4, T3 >.
Precedence Graph
• Consider some schedule of a set of transactions T1, T2, ..., Tn
• Precedence graph — a directed graph where the vertices are the transactions (names).
• We draw an arc from Ti to Tj if the two transactions conflict, and Ti accessed the data item on which the conflict arose earlier.
• We may label the arc by the item that was accessed.
• Example
Testing for Conflict Serializability
• A schedule is conflict serializable if and
only if its precedence graph is acyclic.
• Cycle-detection algorithms exist which take order n² time, where n is the number of vertices in the graph.
• (Better algorithms take order n + e where e is the number of edges.)
• If precedence graph is acyclic, the
serializability order can be obtained by a
topological sorting of the graph.
• That is, a linear order consistent with
the partial order of the graph.
• For example, a serializability order for
the schedule (a) would be one of
either (b) or (c)
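• A minimal Python sketch of this test, assuming the same (transaction, operation, item) schedule encoding as before: it builds the precedence graph and topologically sorts it; a cycle means the schedule is not conflict serializable:

from collections import defaultdict

def serializability_order(schedule):
    edges = defaultdict(set)
    indegree = defaultdict(int)
    txns = {t for t, _, _ in schedule}
    for i, (ti, op_i, q_i) in enumerate(schedule):
        for tj, op_j, q_j in schedule[i + 1:]:
            if ti != tj and q_i == q_j and 'write' in (op_i, op_j):
                if tj not in edges[ti]:      # arc Ti -> Tj
                    edges[ti].add(tj)
                    indegree[tj] += 1
    order = []
    ready = [t for t in txns if indegree[t] == 0]
    while ready:                             # Kahn's topological sort
        t = ready.pop()
        order.append(t)
        for u in edges[t]:
            indegree[u] -= 1
            if indegree[u] == 0:
                ready.append(u)
    # With a cycle, some transactions never reach indegree 0.
    return order if len(order) == len(txns) else None

s = [('T1', 'read', 'A'), ('T2', 'write', 'A'), ('T1', 'write', 'A')]
print(serializability_order(s))  # None: arcs T1->T2 and T2->T1 form a cycle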
Recoverable Schedules
• Recoverable schedule — if a transaction Tj reads a data item previously
written by a transaction Ti , then the commit operation of Ti must appear
before the commit operation of Tj.
• The following schedule is not recoverable if T9 commits immediately after
the read(A) operation.

• If T8 should abort, T9 would have read (and possibly shown to the user)
an inconsistent database state. Hence, database must ensure that
schedules are recoverable.
Cascading Rollbacks
• Cascading rollback – a single transaction failure leads to a series
of transaction rollbacks. Consider the following schedule where
none of the transactions has yet committed (so the schedule is
recoverable)

If T10 fails, T11 and T12 must also be rolled back.


• Can lead to the undoing of a significant amount of work
Cascadeless Schedules
• Cascadeless schedules — for each pair of transactions Ti and Tj
such that Tj reads a data item previously written by Ti, the
commit operation of Ti appears before the read operation of Tj.
• Every cascadeless schedule is also recoverable
• It is desirable to restrict the schedules to those that are
cascadeless
• Example of a schedule that is NOT cascadeless
Concurrency Control
• A database must provide a mechanism that will ensure that all
possible schedules are both:
• Conflict serializable.
• Recoverable and preferably cascadeless
• A policy in which only one transaction can execute at a time
generates serial schedules, but provides a poor degree of
concurrency
• Concurrency-control schemes tradeoff between the amount of
concurrency they allow and the amount of overhead that they incur
• Testing a schedule for serializability after it has executed is a little
too late!
• Tests for serializability help us understand why a concurrency
control protocol is correct
• Goal – to develop concurrency control protocols that will assure
serializability.
View Serializability
• Let S and S' be two schedules with the same set of transactions. S and S' are view equivalent if the following three conditions are met for each data item Q:
1. If in schedule S, transaction Ti reads the initial value of Q, then
in schedule S’ also transaction Ti must read the initial value of
Q.
2. If in schedule S transaction Ti executes read(Q), and that value
was produced by transaction Tj (if any), then in schedule S’
also transaction Ti must read the value of Q that was produced
by the same write(Q) operation of transaction Tj .
3. The transaction (if any) that performs the final write(Q)
operation in schedule S must also perform the final write(Q)
operation in schedule S’.
• As can be seen, view equivalence is also based purely on reads and
writes alone.
View Serializability
• A schedule S is view serializable if it is view equivalent to a serial
schedule.
• Every conflict serializable schedule is also view serializable.
• Below is a schedule which is view-serializable but not conflict
serializable.

• What serial schedule is the above schedule equivalent to?


• Every view serializable schedule that is not conflict serializable
has blind writes.
Test for View Serializability
• The precedence graph test for conflict serializability cannot be
used directly to test for view serializability.
• Extension to test for view serializability has cost exponential in
the size of the precedence graph.
• The problem of checking if a schedule is view serializable falls in
the class of NP-complete problems.
• Thus, existence of an efficient algorithm is extremely unlikely.
• However, practical algorithms that just check some sufficient conditions for view serializability can still be used.
More Complex Notions of Serializability
• The schedule below produces the same outcome as the serial schedule <
T1, T5 >, yet is not conflict equivalent or view equivalent to it.

If we start with A = 1000 and B = 2000, the final result is 960 and 2040
• Determining such equivalence requires analysis of operations other than
read and write.
Concurrency Control
Lock-Based Protocols
• A lock is a mechanism to control concurrent access to a data item
• Data items can be locked in two modes :
1. exclusive (X) mode. Data item can be both read as well as
written. X-lock is requested using lock-X instruction.
2. shared (S) mode. Data item can only be read. S-lock is
requested using lock-S instruction.
• Lock requests are made to the concurrency-control manager by the
programmer. Transaction can proceed only after request is granted.
Lock-Based Protocols (Cont.)
• Lock-compatibility matrix
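        S       X
  S     true    false
  X     false   false

(true means a lock in the row mode can be granted while another transaction holds the column-mode lock; false means it cannot)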

• A transaction may be granted a lock on an item if the requested lock is compatible with locks already held on the item by other transactions
• Any number of transactions can hold shared locks on an item,
• But if any transaction holds an exclusive lock on the item, no other transaction may hold any lock on the item.
• If a lock cannot be granted, the requesting transaction is made to
wait till all incompatible locks held by other transactions have been
released. The lock is then granted.
Lock-Based Protocols (Cont.)
• Example of a transaction performing locking:
T2: lock-S(A);
read (A);
unlock(A);
lock-S(B);
read (B);
unlock(B);
display(A+B)
• Locking as above is not sufficient to guarantee serializability — if A
and B get updated in-between the read of A and B, the displayed
sum would be wrong.
• A locking protocol is a set of rules followed by all transactions while
requesting and releasing locks. Locking protocols restrict the set of
possible schedules.
The Two-Phase Locking Protocol
• This protocol ensures conflict-serializable schedules.
• Phase 1: Growing Phase
• Transaction may obtain locks
• Transaction may not release locks
• Phase 2: Shrinking Phase
• Transaction may release locks
• Transaction may not obtain locks
• The protocol assures serializability. It can be proved that the
transactions can be serialized in the order of their lock points (i.e.,
the point where a transaction acquired its final lock).
The Two-Phase Locking Protocol

• There can be conflict serializable schedules that cannot be obtained if two-phase locking is used.
• However, in the absence of extra information (e.g., ordering of
access to data), two-phase locking is needed for conflict
serializability in the following sense:
• Given a transaction Ti that does not follow two-phase locking,
we can find a transaction Tj that uses two-phase locking, and a
schedule for Ti and Tj that is not conflict serializable.
Lock Conversions
• Two-phase locking with lock conversions:
– First Phase:
• can acquire a lock-S on item
• can acquire a lock-X on item
• can convert a lock-S to a lock-X (upgrade)
– Second Phase:
• can release a lock-S
• can release a lock-X
• can convert a lock-X to a lock-S (downgrade)
• This protocol assures serializability. But still relies on the
programmer to insert the various locking instructions.
Automatic Acquisition of Locks
• A transaction Ti issues the standard read/write instruction, without
explicit locking calls.
• The operation read(D) is processed as:
if Ti has a lock on D
then
read(D)
else begin
if necessary wait until no other
transaction has a lock-X on D
grant Ti a lock-S on D;
read(D)
end
Automatic Acquisition of Locks
• write(D) is processed as:
if Ti has a lock-X on D
then
write(D)
else begin
if necessary wait until no other transaction has any lock on D,
if Ti has a lock-S on D
then
upgrade lock on D to lock-X
else
grant Ti a lock-X on D
write(D)
end;
• All locks are released after commit or abort
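• A minimal single-process Python sketch of this automatic acquisition logic; a real lock manager makes an ungrantable request wait, whereas this sketch raises instead, to keep it short:

class LockManager:
    def __init__(self):
        self.locks = {}  # item -> {txn: 'S' or 'X'}

    def lock_s(self, txn, item):            # called by read(D)
        holders = self.locks.setdefault(item, {})
        if any(m == 'X' and t != txn for t, m in holders.items()):
            raise RuntimeError(f"{txn} must wait for S-lock on {item}")
        holders.setdefault(txn, 'S')

    def lock_x(self, txn, item):            # called by write(D)
        holders = self.locks.setdefault(item, {})
        if any(t != txn for t in holders):
            raise RuntimeError(f"{txn} must wait for X-lock on {item}")
        holders[txn] = 'X'                  # new X-lock, or S upgraded to X

    def release_all(self, txn):             # at commit or abort
        for holders in self.locks.values():
            holders.pop(txn, None)

lm = LockManager()
lm.lock_s('T1', 'D')    # read(D) takes an S-lock
lm.lock_x('T1', 'D')    # write(D) upgrades it to an X-lock
lm.release_all('T1')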
Deadlocks
• Consider the partial schedule

• Neither T3 nor T4 can make progress — executing lock-S(B) causes T4 to wait for T3 to release its lock on B, while executing lock-X(A) causes T3 to wait for T4 to release its lock on A.
• Such a situation is called a deadlock.
• To handle a deadlock one of T3 or T4 must be rolled back
and its locks released.
Deadlocks

• Two-phase locking does not ensure freedom from deadlocks.


• In addition to deadlocks, there is a possibility of starvation.
• Starvation occurs if the concurrency control manager is badly
designed. For example:
• A transaction may be waiting for an X-lock on an item, while a
sequence of other transactions request and are granted an S-lock
on the same item.
• The same transaction is repeatedly rolled back due to deadlocks.
• Concurrency control manager can be designed to prevent starvation.
Deadlocks

• The potential for deadlock exists in most locking protocols. Deadlocks


are a necessary evil.
• When a deadlock occurs there is a possibility of cascading roll-backs.
• Cascading roll-back is possible under two-phase locking. To avoid this,
follow a modified protocol called strict two-phase locking -- a
transaction must hold all its exclusive locks till it commits/aborts.
• Rigorous two-phase locking is even stricter. Here, all locks are held
till commit/abort. In this protocol transactions can be serialized in the
order in which they commit.
Implementation of Locking
• A lock manager can be implemented as a separate process to
which transactions send lock and unlock requests
• The lock manager replies to a lock request by sending a lock grant message (or a message asking the transaction to roll back, in case of a deadlock)
• The requesting transaction waits until its request is answered
• The lock manager maintains a data-structure called a lock table to
record granted locks and pending requests
• The lock table is usually implemented as an in-memory hash table
indexed on the name of the data item being locked
Lock Table
• Dark blue rectangles indicate granted
locks; light blue indicate waiting requests
• Lock table also records the type of lock
granted or requested
• New request is added to the end of the
queue of requests for the data item, and
granted if it is compatible with all earlier
locks
• Unlock requests result in the request being
deleted, and later requests are checked to
see if they can now be granted
• If transaction aborts, all waiting or granted
requests of the transaction are deleted
• lock manager may keep a list of locks
held by each transaction, to implement
this efficiently
Deadlock Handling
• System is deadlocked if there is a set of transactions such that every
transaction in the set is waiting for another transaction in the set.
• Deadlock prevention protocols ensure that the system will never
enter into a deadlock state. Some prevention strategies :
• Require that each transaction locks all its data items before it
begins execution (predeclaration).
• Impose partial ordering of all data items and require that a
transaction can lock data items only in the order specified by the
partial order.
Additional Deadlock Prevention Strategies
• Following schemes use transaction timestamps for the sake of
deadlock prevention alone.
• wait-die scheme — non-preemptive
• older transaction may wait for younger one to release a data item (older means smaller timestamp). Younger transactions never wait for older ones; they are rolled back instead.
• a transaction may die several times before acquiring needed
data item
• wound-wait scheme — preemptive
• older transaction wounds (forces rollback of) the younger transaction instead of waiting for it. Younger transactions may wait for older ones.
• may be fewer rollbacks than wait-die scheme.
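• A minimal Python sketch of the two rules (timestamps are integers; smaller means older):

def wait_die(ts_requester, ts_holder):
    # Non-preemptive: older waits, younger is rolled back (dies).
    return 'wait' if ts_requester < ts_holder else 'roll back requester'

def wound_wait(ts_requester, ts_holder):
    # Preemptive: older wounds (rolls back) the younger holder.
    return 'roll back holder' if ts_requester < ts_holder else 'wait'

print(wait_die(1, 2))    # 'wait': the older requester may wait
print(wound_wait(1, 2))  # 'roll back holder': the older requester wounds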
Deadlock prevention
• Both in wait-die and in wound-wait schemes, a rolled-back transaction is restarted with its original timestamp. Older transactions thus have precedence over newer ones, and starvation is hence avoided.
• Timeout-Based Schemes:
• a transaction waits for a lock only for a specified amount of time. If the lock has not been granted within that time, the transaction is rolled back and restarted.
• Thus, deadlocks are not possible
• simple to implement; but starvation is possible. Also difficult to determine a good value for the timeout interval.
Deadlock Detection
• Deadlocks can be described as a wait-for graph, which consists of
a pair G = (V,E),
• V is a set of vertices (all the transactions in the system)
• E is a set of edges; each element is an ordered pair Ti → Tj.
• If Ti → Tj is in E, then there is a directed edge from Ti to Tj, implying that Ti is waiting for Tj to release a data item.
• When Ti requests a data item currently being held by Tj, then the edge Ti → Tj is inserted in the wait-for graph. This edge is removed only when Tj is no longer holding a data item needed by Ti.
• The system is in a deadlock state if and only if the wait-for graph
has a cycle. Must invoke a deadlock-detection algorithm
periodically to look for cycles.
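• A minimal Python sketch of detection at edge-insertion time: before adding the edge Ti → Tj, check whether a waits-for path already leads from Tj back to Ti:

def creates_deadlock(wait_for, ti, tj):
    # wait_for maps each transaction to the transactions it waits for.
    seen, stack = set(), [tj]
    while stack:
        t = stack.pop()
        if t == ti:
            return True          # path Tj ... Ti exists: Ti -> Tj closes a cycle
        if t not in seen:
            seen.add(t)
            stack.extend(wait_for.get(t, ()))
    return False

wait_for = {'T1': ['T2'], 'T2': ['T3']}
print(creates_deadlock(wait_for, 'T3', 'T1'))  # True: T1->T2->T3->T1
print(creates_deadlock(wait_for, 'T0', 'T1'))  # False: no path back to T0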
Deadlock Detection

[Figure: a wait-for graph without a cycle (left) and a wait-for graph with a cycle (right).]


Deadlock Recovery
• When deadlock is detected :
• Some transaction will have to be rolled back (made a victim) to break the deadlock. Select as victim the transaction that will incur minimum cost.
• Rollback -- determine how far to roll back transaction
• Total rollback: Abort the transaction and then restart it.
• More effective to roll back transaction only as far as
necessary to break deadlock.
• Starvation happens if same transaction is always chosen as
victim. Include the number of rollbacks in the cost factor to
avoid starvation
Multiple Granularity
• Allow data items to be of various sizes and define a hierarchy of
data granularities, where the small granularities are nested within
larger ones
• Can be represented graphically as a tree.
• When a transaction locks a node in the tree explicitly, it implicitly locks all the node's descendants in the same mode.
• Granularity of locking (level in tree where locking is done):
• fine granularity (lower in tree): high concurrency, high locking
overhead
• coarse granularity (higher in tree): low locking overhead, low
concurrency
Example of Granularity Hierarchy

The levels, starting from the coarsest (top) level, are:
• database
• area
• file
• record
Intention Lock Modes
• In addition to S and X lock modes, there are three additional lock
modes with multiple granularity:
• intention-shared (IS): indicates explicit locking at a lower level
of the tree but only with shared locks.
• intention-exclusive (IX): indicates explicit locking at a lower
level with exclusive or shared locks
• shared and intention-exclusive (SIX): the subtree rooted by
that node is locked explicitly in shared mode and explicit
locking is being done at a lower level with exclusive-mode
locks.
• Intention locks allow a higher-level node to be locked in S or X mode without having to check all descendant nodes.
Compatibility Matrix with Intention Lock Modes

• The compatibility matrix for all lock modes is:
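        IS     IX     S      SIX    X
  IS    yes    yes    yes    yes    no
  IX    yes    yes    no     no     no
  S     yes    no     yes    no     no
  SIX   yes    no     no     no     no
  X     no     no     no     no     no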


Multiple Granularity Locking Scheme
• Transaction Ti can lock a node Q, using the following rules:
1. The lock compatibility matrix must be observed.
2. The root of the tree must be locked first, and may be locked in
any mode.
3. A node Q can be locked by Ti in S or IS mode only if the parent
of Q is currently locked by Ti in either IX or IS mode.
4. A node Q can be locked by Ti in X, SIX, or IX mode only if the
parent of Q is currently locked by Ti in either IX or SIX mode.
5. Ti can lock a node only if it has not previously unlocked any
node (that is, Ti is two-phase).
6. Ti can unlock a node Q only if none of the children of Q are
currently locked by Ti.
• Observe that locks are acquired in root-to-leaf order, whereas they
are released in leaf-to-root order.
• Lock granularity escalation: in case there are too many locks at a particular level, switch to a coarser-granularity S or X lock
Timestamp-Based Protocols
• Each transaction is issued a timestamp when it enters the system. If an old transaction Ti has time-stamp TS(Ti), a new transaction Tj is assigned time-stamp TS(Tj) such that TS(Ti) < TS(Tj).
• The protocol manages concurrent execution such that the time-
stamps determine the serializability order.
• In order to assure such behavior, the protocol maintains for each
data Q two timestamp values:
• W-timestamp(Q) is the largest time-stamp of any transaction
that executed write(Q) successfully.
• R-timestamp(Q) is the largest time-stamp of any transaction
that executed read(Q) successfully.
Timestamp-Based Protocols

• The timestamp ordering protocol ensures that any conflicting read and write operations are executed in timestamp order.
• Suppose a transaction Ti issues a read(Q)
• If TS(Ti) ≤ W-timestamp(Q), then Ti needs to read a value of Q that was already overwritten.
• Hence, the read operation is rejected, and Ti is rolled back.
• If TS(Ti) ≥ W-timestamp(Q), then the read operation is executed, and R-timestamp(Q) is set to max(R-timestamp(Q), TS(Ti)).
Timestamp-Based Protocols
• Suppose that transaction Ti issues write(Q).
1. If TS(Ti) < R-timestamp(Q), then the value of Q that Ti is producing was needed previously, and the system assumed that that value would never be produced.
• Hence, the write operation is rejected, and Ti is rolled back.
2. If TS(Ti) < W-timestamp(Q), then Ti is attempting to write an obsolete value of Q.
• Hence, this write operation is rejected, and Ti is rolled back.
3. Otherwise, the write operation is executed, and W-timestamp(Q) is set to TS(Ti).
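• A minimal Python sketch of the read and write tests above (timestamps are positive integers, and the default 0 stands for "no read or write yet"):

W, R = {}, {}   # item -> W-timestamp, item -> R-timestamp

def tso_read(ts, q):
    if ts <= W.get(q, 0):            # Q was already overwritten
        return 'rollback'
    R[q] = max(R.get(q, 0), ts)
    return 'read executed'

def tso_write(ts, q):
    if ts < R.get(q, 0):             # a later reader needed the old value
        return 'rollback'
    if ts < W.get(q, 0):             # Ti would write an obsolete value
        return 'rollback'
    W[q] = ts
    return 'write executed'

print(tso_write(2, 'Q'))   # 'write executed': W-timestamp(Q) becomes 2
print(tso_read(1, 'Q'))    # 'rollback': T1 would read an overwritten value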
Example Use of the Protocol

A partial schedule for several data items for transactions with timestamps 1, 2, 3, 4, 5
Correctness of Timestamp-Ordering Protocol
• The timestamp-ordering protocol guarantees serializability since all the arcs in the precedence graph are of the form Ti → Tj with TS(Ti) < TS(Tj). Thus, there will be no cycles in the precedence graph.

• The timestamp protocol ensures freedom from deadlock as no transaction ever waits.
• But the schedule may not be cascade-free, and may not even be
recoverable.
Recoverability and Cascade Freedom
• Problem with timestamp-ordering protocol:
• Suppose Ti aborts, but Tj has read a data item written by Ti
• Then Tj must abort; if Tj had been allowed to commit earlier, the
schedule is not recoverable.
• Further, any transaction that has read a data item written by Tj
must abort
• This can lead to cascading rollback --- that is, a chain of rollbacks
• Solution 1:
• A transaction is structured such that its writes are all performed at
the end of its processing
• All writes of a transaction form an atomic action; no transaction
may execute while a transaction is being written
• A transaction that aborts is restarted with a new timestamp
• Solution 2: Limited form of locking: wait for data to be committed
before reading it
• Solution 3: Use commit dependencies to ensure recoverability
Thomas’ Write Rule
• Modified version of the timestamp-ordering protocol in which
obsolete write operations may be ignored under certain
circumstances.
• When Ti attempts to write data item Q, if TS(Ti) < W-timestamp(Q), then Ti is attempting to write an obsolete value of Q.
• Rather than rolling back Ti as the timestamp ordering protocol would have done, this write operation can be ignored.
• Otherwise this protocol is the same as the timestamp ordering
protocol.
• Thomas' Write Rule allows greater potential concurrency.
• Allows some view-serializable schedules that are not conflict-
serializable.
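• A sketch of the modified write test, reusing the R and W dictionaries from the timestamp-ordering sketch above; only the obsolete-write branch changes:

def thomas_write(ts, q):
    if ts < R.get(q, 0):
        return 'rollback'
    if ts < W.get(q, 0):
        return 'ignore write'    # obsolete write: skipped, no rollback
    W[q] = ts
    return 'write executed'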
Validation-Based Protocol
• Execution of transaction Ti is done in three phases.
1. Read and execution phase: Transaction Ti writes only to
temporary local variables
2. Validation phase: Transaction Ti performs a ''validation test''
to determine if local variables can be written without violating
serializability.
3. Write phase: If Ti is validated, the updates are applied to the
database; otherwise, Ti is rolled back.
• The three phases of concurrently executing transactions can be
interleaved, but each transaction must go through the three phases
in that order.
• Assume for simplicity that the validation and write phase occur
together, atomically and serially
• I.e., only one transaction executes validation/write at a time.
• Also called optimistic concurrency control since the transaction executes fully in the hope that all will go well during validation
Validation-Based Protocol
• Each transaction Ti has 3 timestamps
• Start(Ti) : the time when Ti started its execution
• Validation(Ti): the time when Ti entered its validation phase
• Finish(Ti) : the time when Ti finished its write phase
• Serializability order is determined by timestamp given at validation
time; this is done to increase concurrency.
• Thus, TS(Ti) is given the value of Validation(Ti).
• This protocol is useful and gives greater degree of concurrency if
probability of conflicts is low.
• because the serializability order is not pre-decided, and
• relatively few transactions will have to be rolled back.
Validation Test for Transaction Tj

• If for all Ti with TS(Ti) < TS(Tj) one of the following conditions holds:
• finish(Ti) < start(Tj)
• start(Tj) < finish(Ti) < validation(Tj) and the set of data items written by Ti does not intersect with the set of data items read by Tj
then validation succeeds and Tj can be committed. Otherwise, validation fails and Tj is aborted.
• Justification: Either the first condition is satisfied, and there is no
overlapped execution, or the second condition is satisfied and
• the writes of Tj do not affect reads of Ti since they occur after Ti
has finished its reads.
• the writes of Ti do not affect reads of Tj since Tj does not read
any item written by Ti.
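• A minimal Python sketch of this test; representing each transaction by its phase timestamps and read/write sets is an encoding assumed for illustration:

def validate(tj, earlier):
    # 'earlier' holds every Ti with TS(Ti) < TS(Tj).
    for ti in earlier:
        if ti['finish'] < tj['start']:
            continue                 # condition 1: no overlapped execution
        if (ti['finish'] < tj['validation']
                and not (ti['write_set'] & tj['read_set'])):
            continue                 # condition 2: no read-write clash
        return False                 # validation fails: abort Tj
    return True                      # Tj may commit

t1 = {'finish': 5, 'write_set': {'A'}}
t2 = {'start': 3, 'validation': 8, 'read_set': {'B'}}
print(validate(t2, [t1]))  # True: T1 wrote A, which T2 never read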
Schedule Produced by Validation
• Example of schedule produced using validation
Recovery System
Failure Classification
• Transaction failure :
• Logical errors: transaction cannot complete due to some internal
error condition
• System errors: the database system must terminate an active
transaction due to an error condition (e.g., deadlock)
• System crash: a power failure or other hardware or software failure
causes the system to crash.
• Fail-stop assumption: non-volatile storage contents are assumed
to not be corrupted by system crash
• Database systems have numerous integrity checks to prevent
corruption of disk data
• Disk failure: a head crash or similar disk failure destroys all or part
of disk storage
• Destruction is assumed to be detectable: disk drives use
checksums to detect failures
Recovery Algorithms
• Consider transaction Ti that transfers Rs. 50 from account A to account B
• Two updates: subtract 50 from A and add 50 to B
• Transaction Ti requires updates to A and B to be output to the
database.
• A failure may occur after one of these modifications has been made but before both of them are made.
• Modifying the database without ensuring that the transaction
will commit may leave the database in an inconsistent state
• Not modifying the database may result in lost updates if failure
occurs just after transaction commits
• Recovery algorithms have two parts
• Actions taken during normal transaction processing to ensure
enough information exists to recover from failures
• Actions taken after a failure to recover the database contents to
a state that ensures atomicity, consistency and durability
Storage Structure
• Volatile storage:
• does not survive system crashes
• examples: main memory, cache memory
• Nonvolatile storage:
• survives system crashes
• examples: disk, tape, flash memory, non-volatile RAM
• but may still fail, losing data
• Stable storage:
• a mythical form of storage that survives all failures
• approximated by maintaining multiple copies on distinct
nonvolatile media
Stable-Storage Implementation
• Maintain multiple copies of each block on separate disks
• copies can be at remote sites to protect against disasters such as
fire or flooding.
• Failure during data transfer can still result in inconsistent copies:
Block transfer can result in
• Successful completion
• Partial failure: destination block has incorrect information
• Total failure: destination block was never updated
• Protecting storage media from failure during data transfer:
• Execute output operation as follows (assume two copies of
block):
1. Write the information onto the first physical block.
2. When the first write successfully completes, write the same
information onto the second physical block.
3. The output is completed only after the second write
successfully completes.
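• A minimal Python sketch of this two-copy output rule; the file names are illustrative stand-ins for the two physical blocks, and fsync forces each copy to the media before the next write starts:

import os

def stable_write(block, path1='copy1.blk', path2='copy2.blk'):
    for path in (path1, path2):      # strictly one copy after the other
        with open(path, 'wb') as f:
            f.write(block)
            f.flush()
            os.fsync(f.fileno())     # force the write out to the media

stable_write(b'account A = 950')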
Stable-Storage Implementation
• Protecting storage media from failure during data transfer.
• Copies of a block may differ due to failure during output
operation. To recover from failure:
• First find inconsistent blocks:
• Expensive solution: Compare the two copies of every disk
block.
• Better solution:
• Record in-progress disk writes on non-volatile storage.
• Use this information during recovery to find blocks that
may be inconsistent, and only compare copies of these.
• Used in hardware RAID systems
Data Access
• Physical blocks are those blocks residing on the disk.
• Buffer blocks are the blocks residing temporarily in main memory.
• Block movements between disk and main memory are initiated
through the following two operations:
• input(B) transfers the physical block B to main memory.
• output(B) transfers the buffer block B to the disk, and replaces
the appropriate physical block there.
• We assume, for simplicity, that each data item fits in, and is stored
inside, a single block.
Block Storage Operations
Example of Data Access
[Figure: main memory holds buffer blocks A and B; the disk holds the corresponding physical blocks. input(A) copies a physical block into the buffer, and output(B) writes a buffer block back to disk. read(X) and write(Y) move values between buffer blocks and the private work areas of the transactions (x1, y1 in T1's work area; x2 in T2's work area).]
Data Access
• Each transaction Ti has its private work-area in which local copies
of all data items accessed and updated by it are kept.
• Ti's local copy of a data item X is called xi.
• Transferring data items between system buffer blocks and its private work-area is done by:
• read(X) assigns the value of data item X to the local variable xi.
• write(X) assigns the value of local variable xi to data item X in the buffer block.
• Transactions
• Must perform read(X) before accessing X for the first time
• write(X) can be executed at any time before the transaction commits
Recovery and Atomicity
• To ensure atomicity despite failures, we first output information
describing the modifications to stable storage without modifying
the database itself.
• We study log-based recovery mechanisms in detail
• We first present key concepts
• And then present the actual recovery algorithm
• Less used alternative: shadow-copy and shadow-paging

Log-Based Recovery
• A log is kept on stable storage.
• The log is a sequence of log records, and maintains a record of
update activities on the database.
• When transaction Ti starts, it registers itself by writing a <Ti start> log record
• Before Ti executes write(X), a log record
<Ti, X, V1, V2>
is written, where V1 is the value of X before the write (the old
value), and V2 is the value to be written to X (the new value).
• When Ti finishes its last statement, the log record <Ti commit> is written.
• Two approaches using logs
• Deferred database modification
• Immediate database modification
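• A minimal Python sketch of update logging, in which the <Ti, X, V1, V2> record is appended before the data item is changed (in a real system the record would also be forced to stable storage):

log = []
db = {'A': 1000, 'B': 2000}

def tx_write(tid, item, new_value):
    log.append((tid, item, db[item], new_value))   # <Ti, X, old, new>
    db[item] = new_value

log.append(('T0', 'start'))
tx_write('T0', 'A', 950)
tx_write('T0', 'B', 2050)
log.append(('T0', 'commit'))
print(log)   # [('T0', 'start'), ('T0', 'A', 1000, 950), ...]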
Immediate Database Modification
• The immediate-modification scheme allows updates of an
uncommitted transaction to be made to the buffer, or the disk
itself, before the transaction commits
• Update log record must be written before database item is written
• Output of updated blocks to stable storage can take place at any
time before or after transaction commit
• Order in which blocks are output can be different from the order
in which they are written.
• The deferred-modification scheme performs updates to
buffer/disk only at the time of transaction commit
• Simplifies some aspects of recovery
• But has overhead of storing local copy
Transaction Commit

• A transaction is said to have committed when its commit log record is output to stable storage
• all previous log records of the transaction must have been output already
• Writes performed by a transaction may still be in the buffer when the transaction commits, and may be output later
Immediate Database Modification Example
Log                       Write            Output

<T0 start>
<T0, A, 1000, 950>
<T0, B, 2000, 2050>
                          A = 950
                          B = 2050
<T0 commit>
<T1 start>
<T1, C, 700, 600>
                          C = 600
                                           BB, BC
<T1 commit>
                                           BA

• Here BX denotes the block containing X.
• BC is output before T1 commits; BA is output after T0 commits.
Checkpoints
• Redoing/undoing all transactions recorded in the log can be very
slow
• processing the entire log is time-consuming if the system has
run for a long time
• we might unnecessarily redo transactions which have
already output their updates to the database.
• Streamline recovery procedure by periodically performing
checkpointing
• Output all log records currently residing in main memory
onto stable storage.
• Output all modified buffer blocks to the disk.
• Write a log record <checkpoint L> onto stable storage, where L is a list of all transactions active at the time of the checkpoint.
• All updates are stopped while doing checkpointing
Checkpoints
• During recovery we need to consider only the most recent
transaction Ti that started before the checkpoint, and
transactions that started after Ti.
• Scan backwards from end of log to find the most recent
<checkpoint L> record
• Only transactions that are in L or started after the checkpoint
need to be redone or undone
• Transactions that committed or aborted before the
checkpoint already have all their updates output to stable
storage.
• Some earlier part of the log may be needed for undo operations
• Continue scanning backwards till a record <Ti start> is found
for every transaction Ti in L.
• Parts of log prior to earliest <Ti start> record above are not
needed for recovery, and can be erased whenever desired.
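• A minimal Python sketch of the backward scan; representing log records as tuples is an encoding assumed for illustration:

def transactions_to_recover(log):
    # Only transactions in the checkpoint's list L, plus those that
    # started after the checkpoint, need to be redone or undone.
    relevant = set()
    for record in reversed(log):
        if record[0] == 'checkpoint':
            relevant |= set(record[1])   # L: active at the checkpoint
            break
        if record[0] == 'start':
            relevant.add(record[1])      # started after the checkpoint
    return relevant

log = [('start', 'T1'), ('commit', 'T1'),
       ('checkpoint', ['T2']), ('start', 'T3')]
print(transactions_to_recover(log))      # {'T2', 'T3'}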
Example of Checkpoints
[Figure: a timeline with a checkpoint at time Tc and a system failure at time Tf. T1 completes before the checkpoint; T2 and T3 commit between the checkpoint and the failure; T4 is still active when the failure occurs.]

• T1 can be ignored, as its updates were already output to disk due to the checkpoint
• T2 and T3 are redone
• T4 is undone
Shadow Paging
• Shadow paging is an alternative to log-based recovery; this scheme
is useful if transactions execute serially
• Idea: maintain two page tables during the lifetime of a transaction –
the current page table, and the shadow page table
• Store the shadow page table in nonvolatile storage, such that state
of the database prior to transaction execution may be recovered.
• Shadow page table is never modified during execution
• To start with, both the page tables are identical. Only current page
table is used for data item accesses during execution of the
transaction.
• Whenever any page is about to be written for the first time
• A copy of this page is made onto an unused page.
• The current page table is then made to point to the copy
• The update is performed on the copy
Sample Page Table
Example of Shadow Paging
Shadow and current page tables after write to page 4
Shadow Paging
• To commit a transaction :
1. Flush all modified pages in main memory to disk
2. Output current page table to disk
3. Make the current page table the new shadow page table, as
follows:
• keep a pointer to the shadow page table at a fixed (known)
location on disk.
• to make the current page table the new shadow page table,
simply update the pointer to point to current page table on disk
• Once pointer to shadow page table has been written, transaction is
committed.
• No recovery is needed after a crash — new transactions can start
right away, using the shadow page table.
• Pages not pointed to from current/shadow page table should be
freed (garbage collected).
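• A minimal Python sketch of the copy-on-write idea, with dicts standing in for the disk pages and the two page tables (committing would flush the pages and atomically redirect the on-disk shadow pointer, not shown):

pages = {'p0': 'old data'}               # pages "on disk"
shadow = {1: 'p0'}                       # shadow page table, never modified
current = dict(shadow)                   # current table starts identical

def write_page(page_no, data):
    if current[page_no] == shadow[page_no]:        # first write to this page
        copy_loc = shadow[page_no] + '_copy'       # copy onto an unused page
        pages[copy_loc] = pages[shadow[page_no]]
        current[page_no] = copy_loc                # current table -> the copy
    pages[current[page_no]] = data                 # update only the copy

write_page(1, 'new data')
print(pages[shadow[1]], '/', pages[current[1]])    # old data / new data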
Shadow Paging
• Advantages of shadow-paging over log-based schemes
• no overhead of writing log records
• recovery is trivial
• Disadvantages :
• Copying the entire page table is very expensive
• Can be reduced by using a page table structured like a B+-tree
• No need to copy entire tree, only need to copy paths in the
tree that lead to updated leaf nodes
• Commit overhead is high even with above extension
• Need to flush every updated page, and page table
• Data gets fragmented (related pages get separated on disk)
• After every transaction completion, the database pages containing
old versions of modified data need to be garbage collected
• Hard to extend algorithm to allow transactions to run concurrently
• Easier to extend log based schemes

Thank You !
