Unit 5 Transaction and Concurrency Control
Unit 5 Transaction and Concurrency Control
Unit 5 Transaction and Concurrency Control
UNIT-V
Transaction and Concurrency Control
Transactions
Transaction Concept
• A transaction is a unit of program execution that accesses and
possibly updates various data items.
• E.g., transaction to transfer Rs.50 from account A to account B:
1. read(A)
2. A := A – 50
3. write(A)
4. read(B)
5. B := B + 50
6. write(B)
• Two main issues:
• Failures of various kinds, such as hardware failures and
system crashes
• Concurrent execution of multiple transactions
Required Properties of a Transaction
• Consider a transaction to transfer Rs. 50 from account A to account B:
1. read(A)
2. A := A – 50
3. write(A)
4. read(B)
5. B := B + 50
6. write(B)
• Atomicity requirement
• If the transaction fails after step 3 and before step 6, money will be
“lost” leading to an inconsistent database state
• Failure could be due to software or hardware
• The system should ensure that updates of a partially executed
transaction are not reflected in the database
• Durability requirement — once the user has been notified that the
transaction has completed (i.e., the transfer of the Rs. 50 has taken
place), the updates to the database by the transaction must persist
even if there are software or hardware failures.
Required Properties of a Transaction
• Consistency requirement in above example:
• The sum of A and B is unchanged by the execution of the
transaction
• In general, consistency requirements include
• Explicitly specified integrity constraints such as primary keys
and foreign keys
• Implicit integrity constraints
• e.g., sum of balances of all accounts, minus sum of loan
amounts must equal value of cash-in-hand
• A transaction, when starting to execute, must see a consistent
database.
• During transaction execution the database may be temporarily
inconsistent.
• When the transaction completes successfully the database must be
consistent
• Erroneous transaction logic can lead to inconsistency
Required Properties of a Transaction
• Isolation requirement — if between steps 3 and 6 (of the fund transfer
transaction) , another transaction T2 is allowed to access the partially
updated database, it will see an inconsistent database (the sum A + B
will be less than it should be).
T1 T2
1. read(A)
2. A := A – 50
3. write(A)
read(A), read(B), print(A+B)
4. read(B)
5. B := B + 50
6. write(B
• Isolation can be ensured trivially by running transactions serially
• That is, one after the other.
• However, executing multiple transactions concurrently has significant
benefits.
ACID Properties
A transaction is a unit of program execution that accesses and
possibly updates various data items. To preserve the integrity of data
the database system must ensure:
• Atomicity. Either all operations of the transaction are properly
reflected in the database or none are.
• Consistency. Execution of a transaction in isolation preserves the
consistency of the database.
• Isolation. Although multiple transactions may execute concurrently,
each transaction must be unaware of other concurrently executing
transactions. Intermediate transaction results must be hidden from
other concurrently executed transactions.
• That is, for every pair of transactions Ti and Tj, it appears to Ti that
either Tj, finished execution before Ti started, or Tj started
execution after Ti finished.
• Durability. After a transaction completes successfully, the changes it
has made to the database persist, even if there are system failures.
Transaction State
• Active – the initial state; the transaction stays in this state
while it is executing
• Partially committed – after the final statement has been
executed.
• Failed -- after the discovery that normal execution can no
longer proceed.
• Aborted – after the transaction has been rolled back and the
database restored to its state prior to the start of the
transaction. Two options after it has been aborted:
• Restart the transaction
• can be done only if no internal logical error
• Kill the transaction
• Committed – after successful completion.
Transaction State
Concurrent Executions
• Multiple transactions are allowed to run concurrently in the
system. Advantages are:
• Increased processor and disk utilization, leading to better
transaction throughput
• E.g. one transaction can be using the CPU while another is
reading from or writing to the disk
• Reduced average response time for transactions: short
transactions need not wait behind long ones.
• Concurrency control schemes – mechanisms to achieve isolation
• That is, to control the interaction among the concurrent
transactions in order to prevent them from destroying the
consistency of the database
Schedules
• Schedule – a sequences of instructions that specify the
chronological order in which instructions of concurrent
transactions are executed
• A schedule for a set of transactions must consist of all
instructions of those transactions
• Must preserve the order in which the instructions appear in
each individual transaction.
• A transaction that successfully completes its execution will have a
commit instructions as the last statement
• By default transaction assumed to execute commit instruction
as its last step
• A transaction that fails to successfully complete its execution will
have an abort instruction as the last statement
Schedule 1
• Let T1 transfer $50 from A to B, and T2 transfer 10% of the balance from A to
B.
• An example of a serial schedule in which T1 is followed by T2 :
Schedule 2
• A serial schedule in which T2 is followed by T1 :
Schedule 3
• Let T1 and T2 be the transactions defined previously. The following
schedule is not a serial schedule, but it is equivalent to Schedule 1.
Schedule 3 Schedule 6
Conflict Serializability
• If T8 should abort, T9 would have read (and possibly shown to the user)
an inconsistent database state. Hence, database must ensure that
schedules are recoverable.
Cascading Rollbacks
• Cascading rollback – a single transaction failure leads to a series
of transaction rollbacks. Consider the following schedule where
none of the transactions has yet committed (so the schedule is
recoverable)
If we start with A = 1000 and B = 2000, the final result is 960 and 2040
• Determining such equivalence requires analysis of operations other than
read and write.
Concurrency Control
Lock-Based Protocols
• A lock is a mechanism to control concurrent access to a data item
• Data items can be locked in two modes :
1. exclusive (X) mode. Data item can be both read as well as
written. X-lock is requested using lock-X instruction.
2. shared (S) mode. Data item can only be read. S-lock is
requested using lock-S instruction.
• Lock requests are made to the concurrency-control manager by the
programmer. Transaction can proceed only after request is granted.
Lock-Based Protocols (Cont.)
• Lock-compatibility matrix
• If for all Ti with TS (Ti) < TS (Tj) either one of the following condition
holds:
• finish(Ti) < start(Tj)
• start(Tj) < finish(Ti) < validation(Tj) and the set of data items
written by Ti does not intersect with the set of data items read
by Tj.
then validation succeeds and Tj can be committed. Otherwise,
validation fails and Tj is aborted.
• Justification: Either the first condition is satisfied, and there is no
overlapped execution, or the second condition is satisfied and
• the writes of Tj do not affect reads of Ti since they occur after Ti
has finished its reads.
• the writes of Ti do not affect reads of Tj since Tj does not read
any item written by Ti.
Schedule Produced by Validation
• Example of schedule produced using validation
Recovery System
Failure Classification
• Transaction failure :
• Logical errors: transaction cannot complete due to some internal
error condition
• System errors: the database system must terminate an active
transaction due to an error condition (e.g., deadlock)
• System crash: a power failure or other hardware or software failure
causes the system to crash.
• Fail-stop assumption: non-volatile storage contents are assumed
to not be corrupted by system crash
• Database systems have numerous integrity checks to prevent
corruption of disk data
• Disk failure: a head crash or similar disk failure destroys all or part
of disk storage
• Destruction is assumed to be detectable: disk drives use
checksums to detect failures
Recovery Algorithms
• Consider transaction Ti that transfers $50 from account A to
account B
• Two updates: subtract 50 from A and add 50 to B
• Transaction Ti requires updates to A and B to be output to the
database.
• A failure may occur after one of these modifications have been
made but before both of them are made.
• Modifying the database without ensuring that the transaction
will commit may leave the database in an inconsistent state
• Not modifying the database may result in lost updates if failure
occurs just after transaction commits
• Recovery algorithms have two parts
• Actions taken during normal transaction processing to ensure
enough information exists to recover from failures
• Actions taken after a failure to recover the database contents to
a state that ensures atomicity, consistency and durability
Storage Structure
• Volatile storage:
• does not survive system crashes
• examples: main memory, cache memory
• Nonvolatile storage:
• survives system crashes
• examples: disk, tape, flash memory, non-volatile RAM
• but may still fail, losing data
• Stable storage:
• a mythical form of storage that survives all failures
• approximated by maintaining multiple copies on distinct
nonvolatile media
Stable-Storage Implementation
• Maintain multiple copies of each block on separate disks
• copies can be at remote sites to protect against disasters such as
fire or flooding.
• Failure during data transfer can still result in inconsistent copies:
Block transfer can result in
• Successful completion
• Partial failure: destination block has incorrect information
• Total failure: destination block was never updated
• Protecting storage media from failure during data :
• Execute output operation as follows (assume two copies of
block):
1. Write the information onto the first physical block.
2. When the first write successfully completes, write the same
information onto the second physical block.
3. The output is completed only after the second write
successfully completes.
Stable-Storage Implementation
• Protecting storage media from failure during data transfer.
• Copies of a block may differ due to failure during output
operation. To recover from failure:
• First find inconsistent blocks:
• Expensive solution: Compare the two copies of every disk
block.
• Better solution:
• Record in-progress disk writes on non-volatile storage.
• Use this information during recovery to find blocks that
may be inconsistent, and only compare copies of these.
• Used in hardware RAID systems
Data Access
• Physical blocks are those blocks residing on the disk.
• Buffer blocks are the blocks residing temporarily in main memory.
• Block movements between disk and main memory are initiated
through the following two operations:
• input(B) transfers the physical block B to main memory.
• output(B) transfers the buffer block B to the disk, and replaces
the appropriate physical block there.
• We assume, for simplicity, that each data item fits in, and is stored
inside, a single block.
Block Storage Operations
Example of Data Access
buffer
Buffer Block A input(A)
X A
Buffer Block B Y B
output(B)
read(X)
write(Y)
x2
x1
y1
memory disk
Data Access
• Each transaction Ti has its private work-area in which local copies
of all data items accessed and updated by it are kept.
• Ti's local copy of a data item X is called xi.
• Transferring data items between system buffer blocks and its
private work-area done by:
• read(X) assigns the value of data item X to the local variable xi.
• write(X) assigns the value of local variable xi to data item {X} in
the buffer block.
• Transactions
• Must perform read(X) before accessing X for the first time
write(X) can be executed at any time before the transaction
commits
Recovery and Atomicity
• To ensure atomicity despite failures, we first output information
describing the modifications to stable storage without modifying
the database itself.
• We study log-based recovery mechanisms in detail
• We first present key concepts
• And then present the actual recovery algorithm
• Less used alternative: shadow-copy and shadow-paging
shadow-copy
Log-Based Recovery
• A log is kept on stable storage.
• The log is a sequence of log records, and maintains a record of
update activities on the database.
• When transaction Ti starts, it registers itself by writing a
<Ti start>log record
• Before Ti executes write(X), a log record
<Ti, X, V1, V2>
is written, where V1 is the value of X before the write (the old
value), and V2 is the value to be written to X (the new value).
• When Ti finishes it last statement, the log record <Ti commit> is
written.
• Two approaches using logs
• Deferred database modification
• Immediate database modification
Immediate Database Modification
• The immediate-modification scheme allows updates of an
uncommitted transaction to be made to the buffer, or the disk
itself, before the transaction commits
• Update log record must be written before database item is written
• Output of updated blocks to stable storage can take place at any
time before or after transaction commit
• Order in which blocks are output can be different from the order
in which they are written.
• The deferred-modification scheme performs updates to
buffer/disk only at the time of transaction commit
• Simplifies some aspects of recovery
• But has overhead of storing local copy
Transaction Commit
<T0 start>
<T0, A, 1000, 950>
<To, B, 2000, 2050
A = 950
B = 2050
<T0 commit>
<T1 start>
<T1, C, 700, 600>
C = 600 BC output before T1
BB , BC commits
<T1 commit>
BA
Thank You !