DBMS Unit 4
DBMS Unit 4
DBMS Unit 4
UNIT 4
UNIT-4
Transactions, Concurrency
Control and Recovery
Transaction Management
Transaction Management
Transaction
Transaction Concept
Transaction State
Concurrent Executions
Serializability
Recoverability
Testing for Serializability
Concurrency control
Lock-based protocols
Timestamp-based protocols
Multiple granularity
Multiversion schemes
Recovery Systems
Log-based recovery
Recovery with concurrent transactions
Transactions:
A transaction is a collection of actions that make consistent transformations of system
states.
Formally, it is a partial order over the operations that are part of the transactions.
Examples-
Consider an airline reservation example:
Transaction Reservation
begin
input (flight, date, customer_name);
temp <- Read(flight(date).sold_seats);
Write(flight(date).sold_seats, temp + 1);
Write(flight(date).cust_name, customer_name);
Output ("reservation completed")
end. {transaction}
What if there are no free seats?
Transaction Reservation
begin
input (flight, date, customer_name);
temp <- Read(flight(date).sold_seats);
if temp = flight(date).maximum then
begin
output("no free seats");
Abort
end
else begin
Write(flight(date).sold_seats, temp + 1);
Write(flight(date).cust_name, customer_name);
Commit;
output("reservation completed")
end
end. {transaction}
Transaction Concept
A transaction is a unit of program execution that accesses and possibly updates
various data items.
E.g. transaction to transfer €50 from account A to account B:
1. read_from_acoount(A)
2. A := A – 50
3. write_to_account(A)
4. read_from_accont(B)
5. B := B + 50
6. write_to_account(B)
Two main issues to deal with:
Failures of various kinds, such as hardware failures and system crashes
Concurrent execution of multiple transactions
Transaction ACID properties
E.g. transaction to transfer €50 from account A to account B:
1. read_from_acoount(A)
2. A := A – 50
3. write_to_account(A)
4. read_from_accont(B)
5. B := B + 50
6. write_to_account(B)
Atomicity requirement
if the transaction fails after step 3 and before step 6, money will be “lost” leading to an
inconsistent database state
Failure could be due to software or hardware
the system should ensure that updates of a partially executed transaction are not
reflected in the database
All or nothing, regarding the execution of the transaction
Durability requirement — once the user has been notified of transaction has completion, the
updates must persist in the database even if there are software or hardware failures.
Transaction ACID properties (Cont.)
Transaction to transfer €50 from account A to account B:
1. read_from_acoount(A)
2. A := A – 50
3. write_to_account(A)
4. read_from_accont(B)
5. B := B + 50
6. write_to_account(B)
Consistency requirement in above example:
the sum of A and B is unchanged by the execution of the transaction
In general, consistency requirements include
Explicitly specified integrity constraints such as primary keys and foreign keys
Implicit integrity constraints
e.g. sum of balances of all accounts, minus sum of loan amounts must equal value of
cash-in-hand
A transaction must see a consistent database and must leave a consistent database
During transaction execution the database may be temporarily inconsistent.
Constraints to be verified only at the end of the transaction
Transaction ACID properties (Cont.)
Isolation requirement — if between steps 3 and 6, another transaction T2 is
allowed to access the partially updated database, it will see an inconsistent
database (the sum A + B will be less than it should be).
T1 T2
1. read(A)
2. A := A – 50
3. write(A)
read(A), read(B), print(A+B)
4. read(B)
5. B := B + 50
6. write(B
Isolation can be ensured trivially by running transactions serially
that is, one after the other.
However, executing multiple transactions concurrently has significant benefits, as
we will see later.
ACID Properties - Summary
A transaction is a unit of program execution that accesses and
possibly updates various data items.To preserve the integrity of
data the database system must ensure:
Atomicity Either all operations of the transaction are properly reflected in the
database or none are.
Consistency Execution of a (single) transaction preserves the consistency of
the database.
Isolation Although multiple transactions may execute concurrently, each
transaction must be unaware of other concurrently executing transactions.
Intermediate transaction results must be hidden from other concurrently
executed transactions.
That is, for every pair of transactions Ti and Tj, it appears to Ti that either
Tj, finished execution before Ti started, or Tj started execution after Ti
finished.
Durability. After a transaction completes successfully, the changes it has
made to the database persist, even if there are system failures.
Non-ACID Transactions
There are application domains where ACID properties are not necessarily desired
or, most likely, not always possible.
This is the case of so-called long-duration transactions
Suppose that a transaction takes a lot of time
In this case it is unlikely that isolation can/should be guaranteed
E.g. Consider a transaction of booking a hotel and a flight
Without Isolation, Atomicity may be compromised
Consistency and Durability should be preserved
Usual solution for long-duration transaction is to define compensation action –
what to do if later the transaction fails
In (centralized) databases long-duration transactions are usually not considered.
But these are more and more important, specially in the context of the Web.
Transaction State
Active – the initial state; the transaction stays in this state while it is executing
Partially committed – after the final statement has been executed.
Failed – after the discovery that normal execution can no longer proceed.
Aborted – after the transaction has been rolled back and the database restored to
its state prior to the start of the transaction. Two options after it has been aborted:
restart the transaction
can be done only if no internal logical error
kill the transaction
Committed – after successful completion.
Schedule 3 Schedule 6
Conflict Serializability (Cont.)
Example of a schedule that is not conflict serializable:
We are unable to swap instructions in the above schedule to obtain either the
serial schedule < T3, T4 >, or the serial schedule < T4, T3 >.
Testing for Serializability
Consider some schedule of a set of transactions T1, T2, ..., Tn
Precedence graph — a direct graph where
the vertices are the transactions (names).
there is an arc from Ti to Tj if the two transaction conflict, and Ti
accessed the data item on which the conflict arose earlier.
We may label the arc by the item that was accessed.
Example 1
x
y
Example Schedule (Schedule A) + Precedence Graph
T1 T2 T3 T4
T5
read(X)
read(Y)
read(Z)
read(V)
read(W) T T
read(W)
read(Y) 1 2
write(Y)
write(Z)
read(U)
read(Y)
write(Y)
read(Z) T T
write(Z)
4
read(U) 3
write(U)
T
5
Test for Conflict Serializability
A schedule is conflict serializable if and only if its
precedence graph is acyclic.
Cycle-detection algorithms exist which take order n2 time,
where n is the number of vertices in the graph.
(Better algorithms take order n + e where e is the
number of edges.)
If precedence graph is acyclic, the serializability order can
be obtained by a topological sorting of the graph.
This is a linear order consistent with the partial order
of the graph.
For example, a serializability order for Schedule A
would be
T5 T1 T3 T2 T4
View Serializability
Sometimes it is possible to serialize schedules that are not conflict serializable
View serializability provides a weaker and still consistency preserving notion of
serialization
Let S and S´ be two schedules with the same set of transactions. S and S´ are
view equivalent if the following three conditions are met, for each data item
Q,
1. If in schedule S, transaction Ti reads the initial value of Q, then in schedule
S’ also transaction Ti must read the initial value of Q.
2. If in schedule S transaction Ti executes read(Q), and that value was
produced by transaction Tj (if any), then in schedule S’ also transaction Ti
must read the value of Q that was produced by the same write(Q) operation
of transaction Tj .
3. The transaction (if any) that performs the final write(Q) operation in
schedule S must also perform the final write(Q) operation in schedule S’.
Example: schedule 1 and schedule3 are view equivalent
View Serializability (Cont.)
A schedule S is view serializable if it is view equivalent to a serial schedule.
Every conflict serializable schedule is also view serializable.
Below is a schedule which is view-serializable but not conflict serializable.
If T8 should abort, T9 would have read (and possibly shown to the user) an inconsistent
database state. Hence, database must ensure that schedules are recoverable.
Cascading Rollbacks
Cascading rollback – a single transaction failure leads to a series of transaction
rollbacks. Consider the following schedule where none of the transactions has
yet committed (so the schedule is recoverable)
Concurrency-control protocols allow concurrent schedules, but ensure that the schedules are
conflict/view serializable, and are recoverable and cascadeless .
Concurrency control protocols generally do not examine the precedence graph as it is being
created
Instead a protocol imposes a discipline that avoids nonseralizable schedules.
Different concurrency control protocols provide different tradeoffs between the amount of
concurrency they allow and the amount of overhead that they incur.
Tests for serializability help us understand why a concurrency control protocol is correct.
Lock-Based Protocols
A lock is a mechanism to control concurrent access to a data item
Data items can be locked in two modes :
1. exclusive (X) mode. Data item can be both read as well as
written. X-lock is requested using lock-X instruction.
2. shared (S) mode. Data item can only be read. S-lock is
requested using lock-S instruction.
Lock requests are made to concurrency-control manager. Transaction can proceed only after request
is granted.
Lock-Based Protocols (Cont.)
Lock-compatibility matrix
Neither T3 nor T4 can make progress — executing lock-S(B) causes T4 to wait for
T3 to release its lock on B, while executing lock-X(A) causes T3 to wait for T4
to release its lock on A.
Such a situation is called a deadlock.
To handle a deadlock one of T3 or T4 must be rolled back
and its locks released.
Pitfalls of Lock-Based Protocols (Cont.)
The potential for deadlock exists in most locking protocols. Deadlocks are a
necessary evil.
Starvation is also possible if concurrency control manager is badly designed.
For example:
A transaction may be waiting for an X-lock on an item, while a sequence of
other transactions request and are granted an S-lock on the same item.
The same transaction is repeatedly rolled back due to deadlocks.
Concurrency control manager can be designed to prevent starvation.
Two-phase locking does not ensure freedom from deadlocks
Deadlock prevention protocols or deadlock detection mechanisms are
needed!
With detection mechanisms when deadlock is detected :
Some transaction will have to rolled back (made a victim) to break
deadlock. Select that transaction as victim that will incur minimum cost.
Deadlock Detection
Deadlocks can be described as a wait-for graph where:
vertices are all the transactions in the system
There is an edge Ti Tk in case Ti is waiting for Tk
When Ti requests a data item currently being held by Tk, then the edge Ti Tk
is inserted in the wait-for graph. This edge is removed only when Tk is no
longer holding a data item needed by Ti.
The system is in a deadlock state if and only if the wait-for graph has a cycle.
Must invoke a deadlock-detection algorithm periodically to look for cycles.
transaction transaction
with smaller with larger
timestamp timestamp
Some applications are willing to live with weak levels of consistency, allowing
schedules that are not serializable
E.g. a read-only transaction that wants to get an approximate total
balance of all accounts
E.g. database statistics computed for query optimization can be
approximate
Such transactions need not be serializable with respect to other
transactions
Tradeoff accuracy for performance
Levels of Consistency in SQL-92
Serializable — default in SQL standard
Repeatable read — only committed records to be read, repeated reads of
same record must return same value (no updates of read items in between).
However, a transaction may not be serializable – it may find some records
inserted by a transaction but not find others.
Read committed — only committed records can be read, but successive reads
of record may return different (but committed) values.
Read uncommitted — even uncommitted records may be read.
By default, in Oracle all consistency tests are made immediately after each DML command
(insert, delete or update).
However, it is possible to defer consistency checking of constraints (primary keys,
candidate keys, foreign keys, and check conditions) to the end of transactions.
Only this makes it possible e.g. to insert tuples in relation with circular dependencies
in foreign keys
For this:
each constraints that may possibly be deferred must be declared as deferrable:
At the definition of the constraint add deferrable immediately afterwards
at the transaction in which one wants to defer the verification of the constraints, add
command:
set constraints all deferred
In this command, instead of all it is possible to specify which constraints are to be
deferred, by giving their names separated by commas