Ch-5 Transactions

Download as pdf or txt
Download as pdf or txt
You are on page 1of 65

Database Management System

UNIT II: Chapter 5: Transactions


Transaction Processing
Motivated by two independent reqs.

 Concurrent DB access

 Resilience to system failures


Concurrent Database Access
 Attribute-level Inconsistency

Update Project Set amount = amount + 10000 where


projectID = ‘DM22’;

concurrent with …

Update Project Set amount = amount + 20000 where


projectID = ‘DM22’;
Concurrent Database Access cont
 Tuple-level Inconsistency

Update Department Set phoneNo = ‘2378402’ where


deptID = ‘3’

concurrent with …

Update Department Set location = ‘Main Building’


where deptID = ‘3’
Concurrent Database Access cont
 Table-level Inconsistency

Update Professor Set decision = ‘accept’ where


profID in ( select profID from Project where amount
> 20000;)

concurrent with …

Update Project Set amount = 1.1 * amount where


startDate > 01/01/2006
 Concurrency Goal
 Execute sequence of SQL statements so they appear to
be running in isolation.

 System-Failure Goal
 Guarantee all-or-nothing execution, regardless of
failures.
Transactions
 Solution for both concurrency and failures

 A transaction is a sequence of one or more SQL


operations treated as a unit
 Transactions appear to run in isolation
 If the system fails, each transaction’s changes are reflected
either entirely or not at all

 SQL standard:
 Transaction begins automatically on first SQL statement
 On “commit” transaction ends and new one begins
Types of Failures
 Computer Failure
 A hardware or software error in the computer during
transaction execution.

 Transaction Error
 Failure caused by an operation in the transaction
Types of Failures cont…
 Local Errors
 Conditions that cause the cancellation of transaction.
(eg: data needed for transaction not found.)

 Concurrency control enforcement


 Transaction may be aborted by the concurrency
control method.
Types of Failures cont…
 Disk Failure
 Loss of data in disk blocks during a transaction due
to, disk read/write head crash.

 Physical problems and catastrophes


 Problems like power failure, fire etc.
What is a Transaction?
 A transaction is an atomic unit of database access,
which is either completely executed or not executed at
all.

 “All or nothing” principle


 Example: Online booking of railway reservation
tickets
Transactions
 Many enterprises use databases to store information
about their state
 Eg: Balances of all depositors at a bank

 When an event occurs in the real world that changes the


state of the enterprise, a program is executed to change
the database state in a corresponding way
 Eg: Bank balance must be updated when deposit is
made
 A transaction is a program that accesses the database in
response to real-world events.
Example of Transactions
 Debit Transaction
 Debit(Account_Number, Withdraw_Amt)
 Begin_Transaction
 Read(Account_Number,Balance) (ID1)
 Balance = Balance – Withdraw_Amt (ID2)
 Write(Account_Number, Balance) (ID3)
 End_Transaction
Example of Transactions cont…
 Credit Transaction
 Credit(Account_Number, Credit_Amt)
 Begin_Transaction
 Read(Account_Number, Balance) (IC1)
 Balance = Balance + Credit_Amt (IC2)
 Write(Account_Number, Balance) (IC3)
 End_Transaction
Read Operation – Read(X)
 Find the address of the disk block that contains item X.

 Copy that disk block into a buffer in main memory.

 Copy item X from the buffer to the program variable


named X.
Write Operation – Write(X)
 Find the address of the disk block that contains X.

 Copy that disk block into a buffer in main memory.

 Copy item X from the program variable named X into


the correct location in the buffer.

 Store the updated block from the buffer back to the


disk.
Transaction Properties
 Atomicity
 Consistency
 Isolation
 Durability
Scenario One
 What happens if the credit transaction fails after
executing IC1 and before executing IC2?

 Transaction Properties (Atomicity): Either all the


instructions are executed in full or none.

 Explanation: In the case of credit transaction, all the


instructions starting from begin transaction to end
transaction (IC1, IC2, IC3) will be executed in full or
none: Atomicity property.
Scenario Two
 What happens if Credit transaction and Debit transaction execute
simultaneously?
 Eg: IC1, ID1, IC2, ID2, IC3, ID3

 Transaction Properties (Consistency): When multiple


transactions are accessing data simultaneously, the access is
protected through concurrency control mechanisms.

 Explanation: In case both debit transaction and credit


transaction access the account number and balance data
simultaneously, they will be protected through concurrency
control mechanisms.
Scenario Three
 What happens if the results of credit transaction are
visible to debit transaction before it is actually written
onto the database?

 Transaction Properties (Isolation): the results of one


transaction will not be visible to other transactions till
the transaction “Commits”.

 Explanation: Either debit or credit transaction results


will not be available until they are committed.
Scenario Four
 What happens if the database server crashes before the
changed data is written onto a stable storage?

 Transaction Properties (Durability): The effects of


committed transactions are not lost after commitment.

 Explanation: The effects are debit transaction will not


be lost once it is committed.
Transaction Terminology
 Commit – Transaction completely performs all the
read’s and write’s and changes the database state
accordingly.

 Abort – Transaction is unable to complete all the read’s


and write’s and hence will undo the operations that were
performed till that point. Database will remain in the
same state as it was prior to the beginning of this
transaction.
Transaction Processing
 Two sample transactions:
 (a) Transaction T1 (b) Transaction T2
Why Concurrency Control is needed
 Multiple transactions are allowed to run concurrently.

 Advantages are:
 increased processor and disk utilization, leading to better
transaction throughput
 one transaction can be using the CPU while another is reading from or
writing to the disk
 throughput: # of transactions executed in a given amount of time.
 reduced average response time for transactions
 short transactions need not wait behind long ones.

 Concurrency control is needed to achieve isolation i.e.,


to control the interaction among the concurrent
transactions in order to prevent them from destroying the
database consistency.
Why Concurrency Control is needed
 Problems occur when concurrent transactions execute
in an uncontrolled manner:
 The Lost Update
 The Temporary Update (or Dirty Read)
 The Incorrect Summary

 Unrepeatable Read:
 A transaction T1 may read a given value. If another
transaction later updates that value and T1 reads that value
again, then T1 will see a different value.
The Lost Update Problem
 This occurs when two transactions that access the same
database items have their operations interleaved in a way
that makes the value of some database item incorrect.
The Temporary Update Problem (Dirty Read)
 This occurs when one transaction updates a database item
and then the transaction fails for some reason.
 The updated item is accessed by another transaction before it is
changed back to its original value.
Incorrect Summary Problem
 If one transaction is calculating an aggregate summary
function on a number of records while other
transactions are updating some of these records, the
aggregate function may calculate some values before
they are updated and others after they are updated.
Incorrect Summary Problem
Transaction and System Concepts
 A transaction is an atomic unit of work that is either completed in its entirety or not done at all.
 For recovery purposes, the system needs to keep track of when the transaction starts, terminates,
and commits or aborts.
 A transaction must be in one of the following states:
 Active:
 the initial state; the transaction stays in this state while it is executing
 Partially committed:
 after the final statement has been executed.
 Failed:
 after the discovery that normal execution can no longer proceed.
 Committed:
 after successful completion.
 Aborted:
 after the transaction has been rolled back and the database restored to its state prior to the
start of the transaction. Two options after it has been aborted:
 restart the transaction (only if no internal logical error)

 kill the transaction


Transaction and System Concepts
 Recovery manager keeps track of the following operations:
 begin_transaction
 This marks the beginning of transaction execution.
 read or write:
 These specify read or write operations on the database items that are executed as part of a
transaction.
 end_transaction:
 This specifies that read and write transaction operations have ended and marks the end
limit of transaction execution.
 At this point it may be necessary to check whether the changes introduced by the
transaction can be permanently applied to the database or whether the transaction has to be
aborted because it violates concurrency control or for some other reason.
 commit_transaction:
 This signals a successful end of the transaction so that any changes (updates) executed by the
transaction can be safely committed to the database and will not be undone.
 rollback (or abort):
 This signals that the transaction has ended unsuccessfully, so that any changes or effects that
the transaction may have applied to the database must be undone.
Transaction State
The System Log
 Log or Journal: The log keeps track of all transaction
operations that affect the values of database items.
 This information may be needed to permit recovery
from transaction failures.

 The log is kept on disk, so it is not affected by any


type of failure except for disk failure.

 In addition, the log is periodically backed up to


archival storage to guard against such failures.
The System Log
 T in the following discussion refers to a unique transaction-id that is
generated automatically by the system and is used to identify each
transaction:
 Types of log record:
 [start_transaction, T]:
 Records that transaction T has started execution.
 [write_item, T, X, old_value, new_value]:
 Records that transaction T has changed the value of database item X
from old_value to new_value.
 [read_item, T, X]:
 Records that transaction T has read the value of database item X.
 [commit, T]:
 Records that transaction T has completed successfully, and affirms
that its effect can be committed (recorded permanently) to the
database.
 [abort, T]:
 Records that transaction T has been aborted.
Schedules
 A schedule S of n transactions T1, T2, ..., Tn is an ordering
of all the operations in these transactions subject to the
constraint that:
 for each transaction Ti, the operations of Ti in S must
appear in the same order as they do in Ti.
 Note, however, that operations from other transactions
Tk can be interleaved with the operations of Ti in S.

 Example: Given
 T1 = R1(Q) W1(Q) & T2 = R2(Q) W2(Q)

 a schedule: R1(Q) R2(Q) W1(Q) W2(Q)

 not a schedule: W1(Q) R1(Q) R2(Q) W2(Q)


Schedules
 Sa: r1(X); r2(X); w1(X); r1(Y); w2(X); w1(Y);
Schedules
 Sb: r1(X); w1(X); r2(X); w2(X); r1(Y); a1;
Conflict Operations
 Instructions (Operations) Oi and Oj of transactions Ti
and Tj respectively, conflict if and only if there exists
some item X accessed by both Oi and Oj, and at least
one of these instructions is write X.
1. Oi = Read(X), Oj = Read(X). Oi and Oj don’t conflict
2. Oi = Read(X), Oj = Write(X). They conflict
3. Oi = Write(X), Oj = Read(X). They conflict
4. Oi = Write(X), Oj = Write(X). They conflict

 Two operations in a schedule are conflicting if:


1) They belong to different transactions,
2) They access the same item X, and
3) At least one of them is a Write(X) operation.
Recoverable Schedule
 Once a transaction T is committed, it should never be
necessary to rollback T.
 Def: Schedule S is Recoverable if Tj reads a data item
written by Ti , the COMMIT operation of Ti appears
before the COMMIT operation of Tj
 Sa, Sb and Sa’ are recoverable:
 Sa’: r1(X); r2(X); w1(X); r1(Y); w2(X); c2; w1(Y); c1;

 Consider the following schedules:


 Sc: r1(X); w1(X); r2(X); r1(Y); w2(X); c2; a1;
 Sd: r1(X); w1(X); r2(X); r1(Y); w2(X); w1(Y); c1; c2;
 Se: r1(X); w1(X); r2(X); r1(Y); w2(X); w1(Y); a1; a2;
Cascading Rollback
 Cascading rollback/ Cascading abort:
 A schedule in which uncommitted transactions has to be
rolled back because it read an item from a transaction that
failed.
 As shown in schedule Se

 A single transaction abort leads to a series of transaction


Rollback.

 Q: What is the main cause of cascading rollback?


 Ans: DIRTY READ: reading an output of an uncommitted
transaction.
Cascadeless Schedule
 One where every transaction reads only the items that are
written by committed transactions.
 r2(X) in Sd and Se must be postponed until after T1 has
committed (or aborted), thus delaying T2 but ensuring no
cascading rollback if T1 aborts.

 Def: Schedule S is CASCADELESS if Tj reads a data item


written by Ti, the COMMIT operation of Ti appears before the
READ operation of Tj.
Strict Schedule
 Strict Schedules:
 A schedule in which a transaction can neither read or
write an item X until the last transaction that wrote X
has committed.
 Consider the following schedule:
 Sf: w1(X, 5); w2(X , 8); a1;

 It is not strict
 It permits T2 to write X even though T1 that last
wrote X had not yet committed (or aborted).
Serial Schedules
 A schedule S is serial,
 if for every transaction T participating in the
schedule, all the operations of T are executed
consecutively
 if operations from different transactions are not
interleaved.

 otherwise the schedule is called nonserial.


 Serial schedules:
 R1(Q) W1(Q) R2(Q) W2(Q)
 R2(Q) W2(Q) R1(Q) W1(Q)
 Non-serial schedule:
 R1(Q) R2(Q) W1(Q) W2(Q)
Example Serial Schedules
 (a) Serial schedule A: T1 followed by T2.
 (b) Serial schedules B: T2 followed by T1.
Example non-serial Schedule
 (c) Two nonserial schedules C and D with interleaving of
operations.
Several Observations
 Serial schedule guarantees database consistency.

 n transactions may form n! different serial schedules.

 Allowing only serial schedule may cause poor system


performance (i.e., low throughput)

 Serial schedule is not a must for guaranteeing


transaction consistency.
Serializability
 One way to ensure correctness of concurrent transactions is
to enforce serializability of transactions
 that is the interleaved execution of the transactions must be
equivalent to some serial execution of those transactions.

 Definition: A schedule is said to be serializable if the


result of executing that schedule is the same as the result of
executing some serial schedule.
 A schedule S of n transactions is serializable if it is
equivalent to some serial schedule of the same n
transactions.

 Different forms of schedule equivalence :


 Conflict equivalence (conflict serializability)
 View equivalence (view serializability)
Conflict Serializability
 Two schedules are said to be conflict equivalent if the order of
any two conflicting operations is the same in both schedules.

 We say that a schedule S is conflict serializable if it is conflict


equivalent to a serial schedule

 Example: Consider two transactions:


 T1 = R1(X) W1(X) R1(Y) W1(Y)

 T2 = R2(X) W2(X)

 The following two schedules are equivalent:


 S1: R1(X) W1(X) R2(X) W2(X) R1(Y) W1(Y)

 S2: R1(X) W1(X) R1(Y) W1(Y) R2(X) W2(X)


Conflict Serializability
 Two schedules are said to be conflict equivalent if the order of any
two conflicting operations is the same in both schedules.
 [OR: If a schedule S can be transformed into a schedule S´ by a series
of swaps of non-conflicting instructions, we say that S and S´ are
conflict equivalent.]
 We say that a schedule S is conflict serializable if it is conflict
equivalent to a serial schedule
 Example: Consider two transactions:
 T1 = R1(X) W1(X) R1(Y) W1(Y)

 T2 = R2(X) W2(X)

 The following two schedules are equivalent:


 S1: R1(X) W1(X) R2(X) W2(X) R1(Y) W1(Y)

 S2: R1(X) W1(X) R1(Y) W1(Y) R2(X) W2(X)


Conflict Serializability (Cont.)
 Example of a schedule that is not conflict serializable:
T3 T4

read(Q)
write(Q)
write(Q)

 Serial schedule - T3, T4 or T4, T3


Testing for Conflict Serializability
Schedules
Algorithm Precedence_Graph()
{

Step-1: Draw nodes corresponding to each of the transaction in S.

Step-2: Draw an edge from Ti to Tj if Ti precedes and conflicts


with Tj. Complete the precedence graph. (The conflicting actions
are RW, WR, and WW).

Step-3: If the precedence graph is acyclic,


then print "S is conflict serializable".
else print "S is not conflict serializable".
}
Example on Serializability testing
Example on Serializability testing
Example on Serializability testing
Example on Serializability testing
View Serializability
 Let S1 and S2 be two schedules with the same set of
transactions. S1 and S2 are view equivalent if the
following three conditions are met:

 If Ti reads initial value of A in S1, then Ti also reads


initial value of A in S2
 If Tj reads value of A written by Ti in S1, then Tj also
reads value of A written by Ti in S2
 If Ti writes final value of A in S1, then Ti also writes
final value of A in S2
View Serializability
 As can be seen, view equivalence is based purely on
reads and writes alone.

 Conditions 1 and 2 ensure that


 each transaction reads the same values in both
schedules.
 Condition 3, coupled with conditions 1 and 2,
ensures that
 both schedules results in the same final state
View Serializability
 A schedule S is view serializable
 if it is view equivalent to a serial schedule.
 Every conflict serializable schedule is also view serializable.
 Below Schedule is view-serializable but not conflict serializable.
 Every view serializable schedule that is not conflict serializable has
blind writes.
 a write operation without having performed a read operation

T1 T2 T3
read(x)
write(x)
write(x)
write(x)
Summary: Schedules
 Schedules must be conflict or view
serializable, and recoverable - for the sake
of database consistency, and preferably
cascadeless.
Exercise Question1
 Which of the following schedules is (conflict) serializable?
For each serializable schedule, determine the equivalent
serial schedules.

(a) r1 (X); r3 (X); w1(X); r2(X); w3(X)

(b) r1 (X); r3 (X); w3(X); w1(X); r2(X)

(c) r3 (X); r2 (X); w3(X); r1(X); w1(X)

(d) r3 (X); r2 (X); r1(X); w3(X); w1(X)


Solution1
Let there be three transactions T1, T2, and T3. They are executed concurrently and produce a schedule S. S is
serializable if it can be reproduced as at least one serial schedule (T1 T2 T3 or T1 T3 T2 or
T2 T1 T3 or T2 T3 T1 or T3 T1 T2 or T3 T2 T1).

(a) This schedule is not serializable because T1 reads X (r1(X)) before T3 but T3 reads X (r3(X)) before T1 writes
X (w1(X)), where X is a common data item. The operation r2(X) of T2 does not affect the schedule at all so
its position in the schedule is irrelevant. In a serial schedule T1, T2, and T3, the operation w1(X) comes after
r3(X), which does not happen in the question.

(b) This schedule is not serializable because T1 reads X ( r1(X)) before T3 but T3 writes X (w3(X)) before T1
writes X (w1(X)). The operation r2(X) of T2 does not affect the schedule at all so its position in the schedule
is irrelevant. In a serial schedule T1, T3, and T2, r3(X) and w3(X) must come after w1(X), which does not
happen in the question.

(c) This schedule is serializable because all conflicting operations of T3 happens before all conflicting operation
of T1. T2 has only one operation, which is a read on X (r2(X)), which does not conflict with any other
operation. Thus this serializable schedule is equivalent to r2(X); r3(X); w3(X); r1(X); w1(X) (e.g., T2 T3
T1) serial schedule.

(d) This is not a serializable schedule because T3 reads X (r3(X)) before T1 reads X (r1(X)) but r1(X) happens
before T3 writes X (w3(X)). In a serial schedule T3, T2, and T1, r1(X) will happen after w3(X), which does not
happen in the question.
Exercise Question2
 Consider the three transactions T1, T2, and T3, and the
schedules S1 and S2 given below. Draw the serializibility
(precedence) graphs for S1 and S2 and state whether each
schedule is serializable or not. If a schedule is serializable, write
down the equivalent serial schedule(s).

T1: r1(x); r1(z); w1(x)


T2: r2(z); r2(y); w2(z); w2(y)
T3: r3(x); r3(y); w3(y)
S1: r1(x); r2(z); r1(x); r3(x); r3(y); w1(x); w3(y); r2(y); w2(z); w2(y)
S2: r1(x); r2(z); r3(x); r1(z); r2(y); r3(y); w1(x); w2(z); w3(y); w2(y)
Exercise Question3
 Consider schedules S3, S4, and S5 below. Determine
whether each schedule is strict, cascadeless, recoverable, or
nonrecoverable.

S3: r1(x); r2(z); r1(z); r3(x); r3(y); w1(x); c1; w3(y); c3;
r2(y); w2(z); w2(y);c2
S4: r1(x); r2(z); r1(z); r3(x); r3(y); w1(x); w3(y); r2(y);
w2(z); w2(y); c1; c2; c3;
S5: r1(x); r2(z); r3(x); r1(z); r2(y); r3(y); w1(x); w2(z);
w3(y); w2(y); c3; c2;
Resources
 Chapter 21 of Fundamentals of Database Systems (FODS),
6th Edition.
 References books
 Internet Surf

 (Most of the slides adapted from above references)

You might also like