Adb CH 1 2013


CHAPTER ONE

1. Transaction Management and Concurrency Control


Learning Objectives: After completing this chapter, you will understand the following concepts:
• Transaction
• Concurrency control
• Concurrency control techniques
• Problems of concurrent sharing
• Concept of serializability
• Transaction recovery
1.1. Single-User Versus Multiuser Systems
One criterion for classifying a database system is the number of users who can use the system
concurrently. A DBMS is single-user if at most one user at a time can use the system, and multiuser if
many users can use the system and hence access the database concurrently. For example, database
systems used in banks, insurance agencies, stock exchanges, supermarkets, and many other applications
are multiuser systems. In these systems, hundreds or thousands of users typically operate on the database
by submitting transactions concurrently to the system.

1.2. Transaction
A transaction is a collection of operations that performs a single logical function in a database
application. A transaction is a mechanism for applying the desired modifications/operations to a
database. It is a logical unit of work on the database. A transaction could be a whole program,
part/module of a program or a single command. Changes made in real time to a database are called
transactions. Examples include ATM transactions, credit card approvals, flight reservations, hotel check-in,
phone calls, supermarket scanning, and academic registration and billing. A transaction can have one of two
outcomes. If it completes successfully, the transaction is said to have committed and the database
reaches a new consistent state. On the other hand, if the transaction does not execute successfully, the
transaction is aborted. If a transaction is aborted, the database must be restored to the consistent state it
was in before the transaction started. Such a transaction is rolled back or undone. A committed
transaction cannot be aborted.
1.2.1. Transaction Subsystem
• Transaction Manager: coordinates transactions on behalf of application programs. It
communicates with the scheduler (lock manager), which is responsible for implementing a strategy
for concurrency control. The transaction manager manages the execution of DML requests, is
responsible for scheduling transactions, and provides the safest path to complete each task,
ensuring that concurrent access to data leaves the database in a consistent state.
• Scheduler: ensures that concurrently executing transactions preserve consistency.
• Recovery Manager: ensures that the database is restored to a consistent state in case a failure occurs.
• Buffer Manager: responsible for the transfer of data between disk storage and main memory.

1.3. Transaction Management


Transaction management plays a crucial role in a DBMS by ensuring its efficiency and consistency.
A transaction is the execution of a user program in the DBMS. The transaction management component
ensures that the database remains in a consistent state despite system failures and transaction failures.
In a web environment, several users attempt to access data stored in the same database; to maintain
the accuracy and consistency of the database, several scheduling algorithms are used.
1.3.1. Key Notations in Transaction Management
The key notations in transaction management are as follows:
• Object: the smallest data item read or updated by a transaction is called an object.
• Transaction: represented by the symbol T; it denotes the execution of a query in the DBMS.
• Read operation: a read operation on a particular object is written R(object-name).
• Write operation: a write operation on a particular object is written W(object-name).
• Commit: denotes the successful completion of a transaction.
• Abort: denotes an unsuccessful, interrupted transaction.
1.3.2. Transaction Processing System(TPS)
A transaction processing system usually allows multiple transactions to run concurrently, updating data
concurrently while preserving its consistency. A TPS generally consists of a transaction processing monitor,
one or more DBMSs, and a set of application programs containing transactions. A transaction is a group
of logical operations that must all succeed or fail as a group; systems dedicated to supporting such
operations are known as transaction processing systems. Transactions can be started, attempted, and
then committed or aborted via the data manipulation commands of SQL. Any transaction has one
of two outcomes:
• Success: the transaction commits and the database reaches a new consistent state. A committed
transaction cannot be aborted or rolled back.
• Failure: the transaction aborts, and the database must be restored to the consistent state it was in
before the transaction started. Such a transaction is rolled back or undone. An aborted transaction
that has been rolled back can be restarted later.
1.3.3. State of a Transaction
A transaction is an atomic unit of work that should either be completed in its entirety or not done at all.
For recovery purposes, the system needs to keep track of when each transaction starts, terminates,
and commits or aborts. The recovery manager needs to keep track of the following operations:
• Begin transaction: marks the beginning of transaction execution.
• Read: specifies a read operation on a database item, executed as part of a transaction.
• Write: specifies a write operation on a database item, executed as part of a transaction.
• End transaction: specifies that the transaction's read and write operations have ended.
A transaction passes through a number of states during its execution and can end in the following states:
• Active: the initial state; the transaction stays in this state while it is executing.
• Aborted: reached after the transaction has been rolled back and the database has been restored to
its state prior to the start of the transaction.
• Committed: a transaction that completes its execution successfully is said to be committed. Once
a transaction has been committed, we cannot undo its effect by aborting it.

(Diagram omitted: a transaction is initiated in a consistent state and modifies the database. If no error
is detected, it reaches end of transaction, is ok to commit, and the commit leaves the database in a new
consistent state. If the system detects an error, a rollback is initiated and the database is left
unmodified, still in its original consistent state.)
Figure: State transition diagram for a transaction.
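The transitions in the figure can be captured as a tiny state machine. This is an illustrative sketch only; the event names ("modify", "commit", "rollback") are assumptions made for this example, not part of any DBMS API.

```python
# Sketch of the transaction state transitions described above.
# Only the states named in the text are used: active, committed, aborted.

TRANSITIONS = {
    ("active", "modify"): "active",       # transaction modifies the database
    ("active", "commit"): "committed",    # no error detected: ok to commit
    ("active", "rollback"): "aborted",    # error detected: roll back
}

def run(events):
    """Drive a transaction through its states; a KeyError means an illegal move."""
    state = "active"
    for event in events:
        state = TRANSITIONS[(state, event)]
    return state

print(run(["modify", "commit"]))    # committed
print(run(["modify", "rollback"]))  # aborted
```

Note that no transition leaves the committed or aborted states, matching the rule that a committed transaction cannot be undone by aborting it.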

1.3.4. ACID Property
Transactions access data using read and write operations. ACID properties, in totality, provide a
mechanism to ensure correctness and consistency of a database. Without the ACID property, the
integrity of the database cannot be guaranteed.
1. Atomicity: The all or nothing property. A transaction is an indivisible unit that is either performed
in its entirety or is not performed at all. Every transaction should be considered an atomic process
that cannot be subdivided into smaller tasks. Transactions do not occur partially: each transaction
is treated as one unit and either runs to completion or is not executed at all. It has two outcomes:
Abort: If a transaction aborts, changes made to database are not visible.
Commit: If a transaction commits, changes made are visible. Consider the following transaction
T consisting of T1 and T2: Transfer of 100 from account X to account Y.
Before: X = 500, Y = 200
T1: Read(X); X := X - 100; Write(X)
T2: Read(Y); Y := Y + 100; Write(Y)
After: X = 400, Y = 300
If the transaction fails after completion of T1 but before completion of T2 (say, after Write(X) but
before write(Y)), then amount has been deducted from X but not added to Y. This results in an
inconsistent database state. Therefore, the transaction must be executed in entirety in order to ensure
correctness of database state.
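The all-or-nothing behaviour can be sketched in a few lines. Here a snapshot of the account balances stands in for the DBMS's undo mechanism; this is an illustrative sketch, not real recovery code, and the function name is our own.

```python
# Sketch of the atomic transfer above: both writes happen, or neither does.
accounts = {"X": 500, "Y": 200}

def transfer(db, src, dst, amount):
    before = dict(db)                 # snapshot so we can roll back
    try:
        db[src] -= amount             # T1: Read(X), X := X - amount, Write(X)
        if db[src] < 0:
            raise ValueError("insufficient funds")
        db[dst] += amount             # T2: Read(Y), Y := Y + amount, Write(Y)
    except Exception:
        db.clear()
        db.update(before)             # rollback: restore the snapshot
        raise

transfer(accounts, "X", "Y", 100)
print(accounts)  # {'X': 400, 'Y': 300}
```

If the second step fails, the snapshot is restored, so a partial transfer is never visible.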
2. Consistency: a correct transaction must, at the end of its execution, leave the database in a
consistent state. This means that integrity constraints must be maintained, so that the database is
consistent before and after the transaction; consistency refers to the correctness of a database.
Referring to the example above, the total amount before and after the transaction must be
maintained.
• Total before T occurs = 500 + 200 = 700.
• Total after T occurs = 400 + 300 = 700. Therefore, database is consistent. Inconsistency
occurs in case T1 completes but T2 fails. As a result, T is incomplete.
3. Isolation: Transactions execute independently of one another. A transaction must execute without
interference from other concurrent transactions. Changes occurring in a particular transaction will
not be visible to any other transaction until that particular change in that transaction is written to
memory or has been committed. This property ensures that executing transactions concurrently
results in a state equivalent to one that would have been achieved had they been executed serially
in some order. Let X = 500 and Y = 500, and consider two transactions T1 and T2.
T1: Read(X); X := X * 100; Write(X); Read(Y); Y := Y - 50; Write(Y)
T2: Read(X); Read(Y); Z := X + Y; Write(Z)
Suppose T1 has been executed till Read (Y) and then T2 starts. As a result, interleaving of operations
takes place due to which T2 reads correct value of X but incorrect value of Y and sum computed by
T2: (X+Y = 50,000+500=50,500) is thus not consistent with the sum at end of transaction:
T1: (X+Y = 50,000 + 450 = 50,450). This results in database inconsistency, due to a loss of 50 units.
Hence, transactions must take place in isolation, and changes should be visible only after they have
been written to main memory.
4. Durability: This property ensures that once the transaction has completed execution, the updates
and modifications to the database are stored in and written to disk and they persist even if system
failure occurs. These updates now become permanent and are stored in a non-volatile memory. The
effects of the transaction, thus, are never lost.
• Atomicity and Durability are closely related.
• Consistency and Isolation are closely related.
To enforce the consistency and isolation properties, the DBMS has to maintain certain scheduling algorithms.

1.4. Types of Transaction Schedules


A schedule is a sequence of operations by a set of concurrent transactions that preserves the order of the
operations within each individual transaction.
1. Serial Schedules: in this scheduling method, transactions are executed one by one from start
to finish. A serial schedule is one in which no transaction starts until the running transaction
has ended. An example of a serial schedule is illustrated below.
T1: R(A)
T1: W(A)
T1: Commit
T2: R(A)
T2: W(A)
T2: Commit
Figure: Serial scheduling

2. Complete Schedules: Schedules in which the last operation of each transaction is either abort or
commit are called complete schedules. Example, consider the following schedule involving three
transactions T1, T2 and T3.
T1: R(A)
T1: W(A)
T2: R(B)
T2: W(B)
T2: Commit
T1: Commit
T3: Abort
This is a complete schedule since the last operation performed under every transaction is either commit
or abort.
3. Recoverable Schedules: schedules in which a transaction commits only after all transactions whose
changes it read have committed are called recoverable schedules. In other words, if some transaction Tj
reads a value updated or written by some other transaction Ti, then the commit of Tj must occur
after the commit of Ti. For example, consider the following schedule involving two transactions T1 and T2.
T1: R(A)
T1: W(A)
T2: W(A)
T2: R(A)
T1: Commit
T2: Commit
This is a recoverable schedule since T1 commits before T2, which makes the value read by T2 correct.
As another example, consider the following schedule involving two transactions T1 and T2.
T1: R(A)
T1: W(A)
T2: W(A)
T2: R(A)
T2: Commit
T1: Abort
T2 read the value of A written by T1 and committed, but T1 later aborted; the value read by T2 is
therefore wrong, and since T2 has already committed, this schedule is unrecoverable.
4. Cascadeless Schedules: also called schedules that avoid cascading aborts/rollbacks (ACA).
Schedules in which transactions read values only after all transactions whose changes they are
going to read have committed are called cascadeless schedules. This avoids a single transaction
abort leading to a series of transaction rollbacks. A strategy to prevent cascading aborts is to
disallow a transaction from reading uncommitted changes of another transaction in the same
schedule. In other words, if some transaction Tj wants to read a value updated or written by some
other transaction Ti, then Tj must read it only after Ti commits. For example, consider the
following schedule involving two transactions T1 and T2.
T1: R(A)
T1: W(A)
T2: W(A)
T1: Commit
T2: R(A)
T2: Commit
This schedule is cascadeless, since the updated value of A is read by T2 only after the updating
transaction, T1, commits. As another example, consider the following schedule involving two transactions T1 and T2.
T1: R(A)
T1: W(A)
T2: R(A)
T2: W(A)
T1: Abort
T2: Abort
It is a recoverable schedule, but it does not avoid cascading aborts: if T1 aborts, T2 must be aborted
too in order to maintain the correctness of the schedule, as T2 has already read the uncommitted
value written by T1.
5. Strict Schedules: a schedule is strict if, for any two transactions Ti and Tj, whenever a write
operation of Ti precedes a conflicting operation of Tj (either read or write), the commit or abort
of Ti also precedes that conflicting operation of Tj. In other words, Tj can read or write a value
updated or written by Ti only after Ti commits or aborts. For example, consider the following schedule.
T2: R(A)
T1: R(A)
T1: W(A)
T1: Commit
T2: W(A)
T2: R(A)
T2: Commit
This is a strict schedule, since T2 reads and writes A, which is written by T1, only after the commit of
T1. Note:
• Cascadeless schedules are stricter than recoverable schedules.

• Strict schedules are stricter than cascadeless schedules.
• Serial schedules satisfy constraints of all recoverable, cascadeless and strict schedules.
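The recoverability condition above can be checked mechanically. The following is a minimal sketch; the schedule encoding (lists of transaction/action/item tuples) and the function name are our own, not from the text, and it checks only the reads-from commit-order rule.

```python
# Checker for the recoverability rule: if Tj reads a value written by Ti,
# then Tj may commit only after Ti has committed.

def is_recoverable(schedule):
    last_writer = {}   # item -> transaction that last wrote it
    reads_from = {}    # txn -> set of txns whose writes it read
    committed = set()
    for txn, action, item in schedule:
        if action == "R":
            w = last_writer.get(item)
            if w and w != txn:
                reads_from.setdefault(txn, set()).add(w)
        elif action == "W":
            last_writer[item] = txn
        elif action == "commit":
            # every transaction we read from must already have committed
            if any(src not in committed for src in reads_from.get(txn, ())):
                return False
            committed.add(txn)
    return True

recoverable   = [("T1", "R", "A"), ("T1", "W", "A"), ("T2", "R", "A"),
                 ("T1", "commit", None), ("T2", "commit", None)]
unrecoverable = [("T1", "R", "A"), ("T1", "W", "A"), ("T2", "R", "A"),
                 ("T2", "commit", None), ("T1", "abort", None)]
print(is_recoverable(recoverable))    # True
print(is_recoverable(unrecoverable))  # False
```

In the second schedule T2 commits while T1, whose write it read, is still active; the checker flags exactly the situation the text calls unrecoverable.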
1.4.1. Ways of Transaction Execution
Basically, there are two ways of executing a set of transactions:

1. Serial Execution: transactions are executed strictly serially. Transaction Ti completes and writes
its results to the database, and only then is the next transaction Tj scheduled for execution. The good
things about serial execution are: (1) correct execution, i.e., if the input is correct then the output
will be correct; (2) fast execution of each transaction, since all the resources are available to the
active transaction. The worst thing about serial execution is very inefficient resource utilization. As an
example of serial execution, suppose data items X = 10, Y = 6, and N = 1, and T1 and T2 are transactions.
T1: read(X)     {X = 10}
T1: X := X + N  {X = 11}
T1: write(X)    {X = 11}
T1: read(Y)     {Y = 6}
T1: Y := Y + N  {Y = 7}
T1: write(Y)    {Y = 7}
T2: read(X)     {X = 11}
T2: X := X + N  {X = 12}
T2: write(X)    {X = 12}
The final values of X and Y at the end of T1 and T2 are X = 12 and Y = 7. Thus, in serial execution
of transactions, if we have two transactions Ti and Ti+1, then Ti+1 is executed only after the
completion of Ti.
2. Concurrent Execution: the reverse of serial execution, i.e., the reads and writes of different
transactions are interleaved in some order.
T1: read(X)     {X = 10}
T2: read(X)     {X = 10}
T1: X := X + N  {X = 11}
T2: X := X + N  {X = 11}
T1: write(X)    {X = 11}
T2: write(X)    {X = 11}
T1: read(Y)     {Y = 6}
T1: Y := Y + N  {Y = 7}
T1: write(Y)    {Y = 7}
The final values at the end of T1 and T2 are X = 11 and Y = 7. This improves resource utilization but
unfortunately gives an incorrect result: the correct value of X is 12, but in the concurrent execution
X = 11. The reason for this error is incorrect sharing of X by T1 and T2. In the serial execution,
T2 read the value of X written by T1 (i.e., 11), but in the concurrent execution T2 read the same value
of X (i.e., 10) as T1 did, and the update made by T1 was overwritten by T2's update. This is why the
final value of X is one less than what is produced by the serial execution.
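The two executions above can be replayed in code to confirm the final values. This is an illustrative sketch: each step copies a value into a local variable, updates it, and writes it back, mirroring the read/compute/write rows in the tables.

```python
# Replaying the serial and interleaved executions above (X = 10, Y = 6, N = 1).

def serial():
    db = {"X": 10, "Y": 6}
    N = 1
    x = db["X"]; x += N; db["X"] = x   # T1 on X: 10 -> 11
    y = db["Y"]; y += N; db["Y"] = y   # T1 on Y: 6 -> 7
    x = db["X"]; x += N; db["X"] = x   # T2 on X: sees T1's 11 -> 12
    return db

def interleaved():
    db = {"X": 10, "Y": 6}
    N = 1
    x1 = db["X"]                        # T1 reads X = 10
    x2 = db["X"]                        # T2 reads the SAME X = 10
    x1 += N; x2 += N                    # both compute 11
    db["X"] = x1                        # T1 writes X = 11
    db["X"] = x2                        # T2 overwrites with X = 11 (lost update)
    y = db["Y"]; y += N; db["Y"] = y    # T1 on Y: 6 -> 7
    return db

print(serial())       # {'X': 12, 'Y': 7}
print(interleaved())  # {'X': 11, 'Y': 7}
```

The interleaved run reproduces the anomaly: T1's update to X is silently overwritten.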
Anomalies Due to Interleaved Execution
If all the transactions in a DBMS perform only read operations on the database, no problem arises.
When read and write operations are interleaved, however, several types of anomalies become possible.
These are classified into three categories.
1. Write-Read Conflicts (WR Conflicts)
Dirty Read: this happens when a transaction T2 reads an object A that has been modified by another
transaction T1 which has not yet committed. This type of read is called a dirty read.
Example: consider two transactions T1 and T2, each of which, run alone, preserves database
consistency. T1 transfers Birr 100 from A to B, and T2 increments both A and B by 6%.
T1: R(A)
T1: W(A)
T2: R(A)
T2: W(A)
T2: R(B)
T2: W(B)
T2: Commit
T1: R(B)
T1: W(B)
T1: Commit
Figure: Reading uncommitted data.
If the transactions are interleaved according to the above schedule, the account transfer
program T1 deducts Birr 100 from account A, then the interest deposit program T2 reads the current
values of accounts A and B and adds 6% interest to each, and then the account transfer program credits
Birr 100 to account B. The outcome of this execution differs from that of a serial execution in which
the two transactions run one after the other. This type of anomaly leaves the database in an
inconsistent state.
2. Read-Write Conflicts (RW Conflicts)
Unrepeatable Read: here the anomaly is that a transaction T2 changes the value of an object A that
has already been read by a transaction T1, while T1 is still in progress. If T1 tries to read A again,
it will get a different result. This type of read is called an unrepeatable read.
Example: let A denote an account, and consider two transactions T1 and T2, each of which reduces
account A by Birr 100.
T1: R(A)
T2: R(A)
T2: W(A)
T2: Commit
T1: W(A)
T1: Commit
Figure: RW conflicts
At first, T1 checks whether the money in account A is more than Birr 100. Immediately the execution
is interleaved and T2 also checks the account and reduces it by Birr 100. After T2, T1 tries to reduce
the same account A. If the initial amount in A is Birr 101, then after the execution of T2 only Birr 1
remains in account A; T1 will nevertheless try to reduce it by Birr 100. This makes the database inconsistent.
3. Write-Write Conflicts (WW Conflicts)
The third type of anomalous behavior is that one transaction updates an object while another
transaction updating the same object is still in progress. Example: consider two transactions T1 and
T2 and two objects A and B.

T1: R(A)
T1: W(A)
T2: R(A)
T2: W(A)
T2: R(B)
T2: W(B)
T2: Commit
T1: R(B)
T1: W(B)
T1: Commit
Figure: WW conflicts
If A and B are two accounts whose values must always be kept equal, suppose transaction T1 updates
both objects to 3,000 and T2 updates both objects to 2,000. First T1 updates the value of object A to
3,000. Immediately T2 sets A to 2,000 and B to 2,000 and commits. After the completion of T2, T1
updates B to 3,000. Now the value of A is 2,000 and the value of B is 3,000; they are not equal. The
constraint is violated because the two updates were interleaved rather than executed serially.

1.5. Concurrency Control


Concurrency control is the process of managing simultaneous operations on the database without
having them interfere with one another. Concurrency control manager controls the interaction among
the concurrent transactions, to ensure the consistency of the database. In a DBMS several transactions
are executed simultaneously; to achieve this, the technique of interleaved execution is used. With this
technique, however, certain anomalies can occur when one transaction overrides a database object that
is already being referenced by another transaction.

Purpose of Concurrency Control
• To enforce Isolation (through mutual exclusion) among conflicting transactions.
• To preserve database consistency through consistency preserving execution of transactions.
• To resolve read-write and write-write conflicts.
There are three reasons for allowing concurrency: improved throughput, improved resource
utilization, and reduced waiting time.
1.5.1. Concurrency Control Techniques
Concurrency control is the process of managing simultaneous operations on the database without
having them interfere with one another. The basic concurrency control techniques are:
• Locking techniques
• Timestamp ordering techniques
• Multi-version concurrency control techniques
• Optimistic concurrency control techniques
1.5.1.1. Locking Technique
A lock is a mechanism for enforcing limits on access to a resource in an environment where there are
many threads of execution. Locks are one way of enforcing concurrency control policies. A transaction
uses locks to deny access to other transactions and so prevent incorrect updates. A lock prevents
another transaction from modifying a data item, or even reading it in the case of a write lock.
• Lock(X): if a transaction T1 applies a lock on data item X, then X is locked and is not
available to any other transaction.
• Unlock(X): T1 unlocks X, and X becomes available to other transactions.
Types of a Lock
Shared lock: a read operation does not change the value of a data item, so a data item can be read
by two different transactions simultaneously under shared lock mode. To only read a data item, T1 will
do: Shared lock(X), then Read(X), and finally Unlock(X).
Exclusive lock: a write operation changes the value of the data item. Hence two write operations from
two different transactions, or a write from T1 and a read from T2, are not allowed. A data item can be
modified only under an exclusive lock. To modify a data item, T1 will do: Exclusive lock(X), then
Write(X), and finally Unlock(X). When these locks are applied, a transaction must behave in a special
way; this special behavior is called well-formed. Well-formed: a transaction is well-formed if it does
not lock a locked data item and does not try to unlock an unlocked data item.
Locking Basic Rules:
▪ If transaction has shared lock on item, can read but not update item.
▪ If transaction has exclusive lock on item, can both read and update item.
▪ Reads cannot conflict, so more than one transaction can hold shared locks simultaneously.
▪ Exclusive lock gives transaction exclusive access to that item.
▪ Some systems allow transaction to upgrade a shared lock to an exclusive lock, or vice-versa.
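The first four rules above amount to a lock compatibility table: a new request is granted only if it is compatible with every lock already held. A minimal sketch follows; the mode names and function name are assumptions made for the example.

```python
# Lock compatibility: may a new request be granted given the locks held?

COMPATIBLE = {
    ("shared", "shared"): True,        # two readers never conflict
    ("shared", "exclusive"): False,    # a writer must wait for readers
    ("exclusive", "shared"): False,    # a reader must wait for the writer
    ("exclusive", "exclusive"): False, # two writers always conflict
}

def can_grant(held_modes, requested):
    """Grant only if the request is compatible with every held lock."""
    return all(COMPATIBLE[(held, requested)] for held in held_modes)

print(can_grant(["shared", "shared"], "shared"))  # True
print(can_grant(["shared"], "exclusive"))         # False
print(can_grant([], "exclusive"))                 # True
```

With no locks held, any request is granted; multiple shared locks coexist; an exclusive lock excludes everything else.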
Example: T1 and T2 are two transactions executed under locking as follows. T1 locks A in exclusive
mode. When T2 wants to lock A, it finds A locked by T1, so T2 waits for T1 to unlock A; when A is
released, T2 locks A and begins execution. Suppose a lock on a data item is applied, the data item is
processed, and it is unlocked immediately after the read/write completes, as follows.
Initial values: A = 10 and B = 20.
Serial execution of T1 and then T2:
T1: Lock(A); read(A) {A = 10}; A := A + 100; write(A) {A = 110}; Unlock(A)
T1: Lock(B); read(B) {B = 20}; B := B + 10; write(B) {B = 30}; Unlock(B)
T2: Lock(B); read(B) {B = 30}; B := B * 5; write(B) {B = 150}; Unlock(B)
T2: Lock(A); read(A) {A = 110}; A := A + 20; write(A) {A = 130}; Unlock(A)
Final result: A = 130, B = 150

Concurrent execution of T1 and T2:
T1: Lock(A); read(A) {A = 10}; A := A + 100; write(A) {A = 110}; Unlock(A)
T2: Lock(B); read(B) {B = 20}; B := B * 5; write(B) {B = 100}; Unlock(B)
T1: Lock(B); read(B) {B = 100}; B := B + 10; write(B) {B = 110}; Unlock(B)
T2: Lock(A); read(A) {A = 110}; A := A + 20; write(A) {A = 130}; Unlock(A)
Final result: A = 130, B = 110
The final result of the two transactions under the two execution modes (serial and concurrent) is not
the same. This indicates that the above method of locking and unlocking is not correct. Thus, to
preserve consistency we have to use another approach to locking: the two-phase locking scheme.
Two-Phase Locking (2PL)
A transaction follows 2PL protocol if all locking operations precede the first unlock operation in
the transaction. The 2PL protocol demands that the locking and unlocking of a transaction occur in two phases.
• Growing phase: acquires all locks but cannot release any locks.
• Shrinking phase: releases locks but cannot acquire any new locks.
Example: strict 2PL.
T1: read_lock(Y); read_item(Y); write_lock(X); read_item(X); X := X + Y; write_item(X); Commit; unlock(Y); unlock(X)
T2: read_lock(X); read_item(X); write_lock(Y); read_item(Y); Y := X + Y; write_item(Y); Commit; unlock(X); unlock(Y)
These transactions obey the strict 2PL protocol, since they release their locks only after committing.
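The two-phase rule itself, no lock acquired after the first unlock, is easy to enforce mechanically. A minimal sketch follows; the class and method names are our own, and real lock managers also handle lock modes and blocking between transactions.

```python
# Enforcing the 2PL rule: once the shrinking phase starts, no more locks.

class TwoPhaseTxn:
    def __init__(self, name):
        self.name = name
        self.locks = set()
        self.shrinking = False   # becomes True at the first unlock

    def lock(self, item):
        if self.shrinking:
            raise RuntimeError(f"{self.name}: 2PL violation - lock after unlock")
        self.locks.add(item)

    def unlock(self, item):
        self.shrinking = True    # growing phase is over
        self.locks.discard(item)

t = TwoPhaseTxn("T1")
t.lock("Y"); t.lock("X")         # growing phase
t.unlock("Y"); t.unlock("X")     # shrinking phase
try:
    t.lock("Z")                  # illegal under 2PL
except RuntimeError as e:
    print(e)   # T1: 2PL violation - lock after unlock
```

Strict 2PL would additionally delay every unlock until commit or abort; here only the two-phase shape is checked.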
Locking Method Problems
Deadlock: a deadlock occurs when two or more transactions are in a simultaneous wait state, each
one waiting for one of the others to release a lock. In a multi-process system, a deadlock arises in a
shared-resource environment when a process waits indefinitely for a resource held by another process,
which is in turn waiting for a resource held by yet another process. For example, assume a set of
transactions {T0, T1, T2, ..., Tn}. T0 needs a resource X to complete its task; X is held by T1, and T1
is waiting for a resource Y, which is held by T2; T2 is waiting for a resource Z, which is held by T0.
All processes thus wait for each other to release resources, and none of them can finish its task. This
situation is known as a deadlock. A few methods can be used to keep the system deadlock-free.
T1: read_lock(Y)
T1: read_item(Y)
T2: read_lock(X)
T2: read_item(X)
T1: write_lock(X)   (T1 waits for T2 to release X)
T2: write_lock(Y)   (T2 waits for T1 to release Y)
T1 and T2 both followed the two-phase policy, but they are deadlocked.
Deadlock Possible Solutions
The only way to break a deadlock is to abort one or more of the transactions involved. Deadlock
should be transparent to the user, so the DBMS should restart the aborted transaction(s). There are
two general techniques for handling deadlock:
A. Deadlock Prevention: to prevent any deadlock situation, the DBMS aggressively inspects the
operations that transactions are about to execute. If it finds that a deadlock situation might
occur, that transaction is never allowed to execute. Deadlock prevention schemes use the
timestamp ordering of transactions to pre-decide a deadlock situation.
Wait-Die Scheme: if a transaction requests a lock on a resource (data item) that is already held with
a conflicting lock by another transaction, one of two things happens. If TS(Ti) < TS(Tj), that is, Ti,
which is requesting the conflicting lock, is older than Tj, then Ti is allowed to wait until the data
item is available. If TS(Ti) > TS(Tj), that is, Ti is younger than Tj, then Ti dies; Ti is restarted later,
after a random delay, with the same timestamp. This scheme lets the older transaction wait but kills
the younger one.
Wound-Wait Scheme: if a transaction requests a lock on a resource (data item) that is already held
with a conflicting lock by another transaction, one of two things happens. If TS(Ti) < TS(Tj), that is,
Ti, which is requesting the conflicting lock, is older than Tj, then Ti forces Tj to be rolled back;
that is, Ti wounds Tj. Tj is restarted later, after a random delay, with the same timestamp. If
TS(Ti) > TS(Tj), that is, Ti is younger than Tj, then Ti is forced to wait until the resource is
available. This scheme lets the younger transaction wait, but when an older transaction requests an
item held by a younger one, the older transaction forces the younger one to abort and release the
item. In both schemes, the transaction that entered the system later is the one that is aborted.
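Both schemes reduce to a single timestamp comparison, with a smaller timestamp meaning an older transaction. A sketch of the two decision rules follows; the function names and return strings are assumptions for the example, not part of any scheduler.

```python
# The wait-die and wound-wait decisions, given the requester's and the
# lock holder's timestamps (smaller timestamp = older transaction).

def wait_die(ts_requester, ts_holder):
    # older requester waits; younger requester dies and restarts later
    return "wait" if ts_requester < ts_holder else "die"

def wound_wait(ts_requester, ts_holder):
    # older requester wounds (aborts) the holder; younger requester waits
    return "wound holder" if ts_requester < ts_holder else "wait"

print(wait_die(5, 9))    # wait  (requester is older)
print(wait_die(9, 5))    # die   (requester is younger)
print(wound_wait(5, 9))  # wound holder
print(wound_wait(9, 5))  # wait
```

Note the symmetry: in both schemes it is always the younger transaction that ends up aborted, which is what makes the schemes deadlock-free.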
B. Deadlock Detection and Recovery: this mechanism detects deadlock situations and recovers
from them.
Timeout: deadlock detection can be done using the technique of timeout. Every transaction is given
a period to wait. If a transaction waits idle for longer than this predefined period, the DBMS assumes
that a deadlock has occurred, and it aborts and restarts the transaction.

1.5.1.2. Timestamp Ordering Techniques
A timestamp is a unique identifier created by the DBMS that indicates the relative starting time of a
transaction. It can be generated by using the system clock at the time the transaction starts, or by
incrementing a logical counter every time a new transaction starts. Timestamping is a concurrency
control protocol that orders transactions so that older transactions, those with smaller timestamps,
get priority in the event of conflict. A conflict is resolved by rolling back and restarting the
offending transaction; since no locks are used, no deadlock can occur.
In timestamp ordering, the schedule is equivalent to the particular serial order that corresponds to
the order of the transaction timestamps. To implement this scheme, every transaction is given a
timestamp, a unique identifier of the transaction: if Ti entered processing before Tj, then TS(Tj)
is larger than TS(Ti). In addition, each data item carries a timestamp for read and one for write.
Rules for permitting execution of operations in the timestamping method. Suppose transaction Ti
issues Read(A):
• If TS(Ti) < WTS(A): Ti needs to read a value of A that has already been overwritten. The read
operation must be rejected and Ti is rolled back.
• If TS(Ti) >= WTS(A): the read is executed, and RTS(A) is set to the maximum of RTS(A) and TS(Ti).
Suppose transaction Ti issues Write(A):
• If TS(Ti) < RTS(A): the value of A that Ti is producing was previously needed, and it was assumed
that it would never be produced. The write operation must be rejected and Ti is rolled back.
• If TS(Ti) < WTS(A): Ti is attempting to write an obsolete value of A; this write operation can be
ignored. Otherwise, the write operation is executed and WTS(A) is set to the maximum of WTS(A)
and TS(Ti).
A transaction that is rolled back due to conflict will be restarted and be given a new timestamp.
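The four rules above can be written directly as code. A minimal sketch follows; the class and function names are our own, and "rollback" here simply means the transaction would be restarted with a new timestamp.

```python
# Timestamp ordering rules: each item keeps RTS (largest read timestamp)
# and WTS (largest write timestamp) of the transactions that accessed it.

class TOItem:
    def __init__(self):
        self.rts = 0   # read timestamp
        self.wts = 0   # write timestamp

def read(item, ts):
    if ts < item.wts:            # value already overwritten by a younger txn
        return "rollback"
    item.rts = max(item.rts, ts)
    return "ok"

def write(item, ts):
    if ts < item.rts:            # a younger txn already read the old value
        return "rollback"
    if ts < item.wts:            # obsolete write: ignore it
        return "ignored"
    item.wts = max(item.wts, ts)
    return "ok"

a = TOItem()
print(write(a, 10))  # ok       (WTS(A) becomes 10)
print(read(a, 5))    # rollback (T5 is older than the writer T10)
print(read(a, 12))   # ok       (RTS(A) becomes 12)
print(write(a, 11))  # rollback (T12 already read A)
```

Each rejected operation corresponds to a rule above: a read of an overwritten value, or a write that would invalidate a younger transaction's read.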
Problem with timestamp-ordering protocol:
Suppose Ti aborts, but Tj has read a data item written by Ti
• Then Tj must abort; if Tj had been allowed to commit earlier, the schedule is not recoverable.
• Further, any transaction that has read a data item written by Tj must also abort
• This can lead to cascading rollback.

Cascading Rollback: a cascading rollback occurs when a transaction (T1) causes a failure and must
be rolled back, and other transactions dependent on T1's actions must also be rolled back due to
T1's failure, causing a cascading effect: one transaction's failure causes many to fail. A cascading
rollback is an undesirable result, so practical database recovery techniques guarantee cascadeless
rollback.
1.5.1.3. Optimistic Technique
Locking and assigning and checking timestamp values may be unnecessary for some transactions
assumes that conflict is rare. When transaction reaches the level of executing commit, a check is
performed to determine whether conflict has occurred. If there is a conflict, transaction is rolled
back and restarted. At commit, check is made to determine whether conflict has occurred. If there
is a conflict, transaction must be rolled back and restarted. Potentially allows greater concurrency
than traditional protocols. It has three phases:
1. Read Phase: Extends from the start of the transaction until immediately before commit. The
transaction reads values from the database and stores them in local variables. Updates are
applied to a local copy of the data.
2. Validation Phase: Follows the read phase. For a read-only transaction, it checks that the data
read are still the current values; if there is no interference, the transaction is committed,
otherwise it is aborted and restarted. For an update transaction, it checks that the transaction
leaves the database in a consistent state, with serializability maintained.
3. Write Phase: Follows a successful validation phase for update transactions. Updates made to
the local copy are applied to the database.
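The three phases can be sketched as follows. This is a minimal illustration of backward validation: a committing transaction is valid only if no transaction that committed during its read phase wrote an item it read. The `Txn` class, field names, and `validate`/`commit` helpers are all assumptions for the sketch, not a real DBMS interface.

```python
# Sketch of optimistic concurrency control with backward validation.

class Txn:
    def __init__(self, name):
        self.name = name
        self.read_set, self.write_set = set(), set()
        self.local = {}            # read phase: updates go to a local copy

def validate(txn, overlapping_committed):
    """Validation phase: txn's read set must not intersect the write set
    of any transaction that committed while txn was running."""
    for other in overlapping_committed:
        if txn.read_set & other.write_set:
            return False           # conflict: txn must abort and restart
    return True

def commit(txn, db, overlapping_committed):
    if not validate(txn, overlapping_committed):
        return False
    db.update(txn.local)           # write phase: apply local copy to database
    return True

db = {"A": 100}
t1, t2 = Txn("T1"), Txn("T2")
t1.read_set.add("A"); t1.write_set.add("A"); t1.local["A"] = 90
t2.read_set.add("A"); t2.write_set.add("A"); t2.local["A"] = 200
assert commit(t2, db, [])          # T2 commits first: no overlap, succeeds
assert not commit(t1, db, [t2])    # T1 fails validation: T2 wrote what T1 read
```

Note that T1 is not blocked at any point; it simply discovers the conflict at commit time and would be restarted, which is cheap when conflicts really are rare.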
1.5.2. Granularity of Data Items
Granularity is the size of the data items chosen as the unit of protection by a concurrency control
protocol. It could be the entire database, a file, a page (a section of physical disk in which relations
are stored), a record, or a field value of a record. The granularity affects the performance of the
system: since locking prevents access to the data, the size of the locked unit determines how many
other transactions are blocked. If the entire database is locked, consistency is easily maintained
but system performance suffers. If a single data item is locked, consistency may be at risk but
concurrent processing and performance are enhanced. Thus, as one moves from the entire database
down to a single value, performance and concurrent processing are enhanced, but consistency is at
greater risk and needs a good concurrency control mechanism and strategy.
1.5.3. Problems of Concurrent Sharing
Although two transactions may be correct in themselves, interleaving of their operations may
produce an incorrect result, which calls for control over access. Concurrent transaction processing
can enhance the throughput of the system: since reading and writing are performed to and from
secondary storage, the system need not sit idle during these operations. Every transaction may be
correct by itself, but this does not guarantee that interleaving the transactions will produce a
correct result. The three potential problems caused by concurrency are:
Lost Update Problem: A successfully completed update on a data item by one transaction is
overridden by another transaction/user. Example, an account with balance A = 100:
▪ T1 reads the account A; T1 withdraws 10 from A; T1 makes the update in the database.
▪ T2 reads the account A; T2 adds 100 to A; T2 makes the update in the database.
In the above case, if done one after the other (serially) then we have no problem.
▪ If the execution is T1 followed by T2 then A = 190.
▪ If the execution is T2 followed by T1 then A = 190.
But if they start at the same time in the following sequence:
▪ T1 reads the account A=100
▪ T1 withdraws 10 making the balance A=90
▪ T2 reads the account A=100
▪ T2 adds 100 making A=200
▪ T1 makes the update in the Database A=90
▪ T2 makes the update in the Database A=200
After the successful completion of the operations in this schedule, the final value of A will be 200,
overriding the update made by the first transaction, which changed the value from 100 to 90.
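The interleaving above can be replayed step by step. This is a sketch only: the dictionary stands in for the database, and the local variables stand in for each transaction's private workspace.

```python
# Replay of the lost-update interleaving: each transaction reads A into
# a local copy, modifies it, and writes it back, so T2's write silently
# overwrites T1's update.

db = {"A": 100}

a_t1 = db["A"]        # T1 reads A = 100
a_t1 -= 10            # T1 withdraws 10 -> 90
a_t2 = db["A"]        # T2 reads A = 100 (before T1 has written back)
a_t2 += 100           # T2 adds 100 -> 200
db["A"] = a_t1        # T1 writes 90 to the database
db["A"] = a_t2        # T2 writes 200, losing T1's update

print(db["A"])        # 200, not the correct serial result of 190
```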
Uncommitted Dependency Problem: Occurs when one transaction can see the intermediate results
of another transaction before it has committed. Example, T2 adds 100 making the balance 200, but
then aborts before committing. T1 reads 200, subtracts 10, and makes it 190, but the actual
balance should be 90.

Inconsistent Analysis Problem: Occurs when one transaction reads several values but a second
transaction updates some of them during the first transaction's execution and before its completion.
Example, T2 would like to add the values A = 10, B = 20 and C = 30. After the values are read by
T2 and before its completion, T1 updates the value of B to 50. At the end of the execution of the
two transactions, T2 will come up with a sum of 60, while it should be 90 since B was updated to 50.
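The example can be replayed the same way. The dictionary standing in for the database is an illustrative assumption.

```python
# Replay of the inconsistent-analysis example: T2 sums A, B and C while
# T1 updates B after T2 has already read it.

db = {"A": 10, "B": 20, "C": 30}

total = db["A"]       # T2 reads A = 10
total += db["B"]      # T2 reads B = 20
db["B"] = 50          # T1 updates B to 50 before T2 finishes
total += db["C"]      # T2 reads C = 30

print(total)          # 60, but the sum over the final state is 90
```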
1.5.4. Serializability
The objective of concurrency control protocol is to schedule transactions in such a way as to avoid
any interference between them. This demands a new principle in transaction processing, which is
serializability of the schedule of execution of multiple transactions. A non-serial schedule is said
to be serializable if it is conflict-equivalent or view-equivalent to a serial schedule. Example:
T1          T2
Read (A)
Write (A)
            Read (A)
            Write (A)
Read (B)
Write (B)
            Read (B)
            Write (B)
• Serial schedule: A schedule where the operations of each transaction are executed
consecutively without any interleaved operations from other transactions.
• Non-serial schedule: A schedule where the operations from a set of concurrent transactions
are interleaved.
The objective of serializability is to find non-serial schedules that allow transactions to execute
concurrently without interfering with one another, and thereby produce a database state that could
be produced by a serial execution. In serializability, the ordering of read and write operations is
important:
• If two transactions only read a data item, they do not conflict and order is not important.
• If two transactions either read or write completely separate data items, they do not conflict
and order is not important.
• If one transaction writes a data item and another either reads or writes the same data item, the
order of execution is important.
Consider the schedule S1 shown in Figure (a) containing operations from two concurrently
executing transactions T1 and T2. Since the write operation on X in T2 does not conflict with the

subsequent read operation on Y in T1, we can change the order of these operations to produce the
equivalent schedule S2 shown in Figure (b). If we also now change the order of the following
non-conflicting operations, we produce the equivalent serial schedule S3 shown in Figure (c):
• Change the order of the write(X) of T2 with the write(Y) of T1.
• Change the order of the read(X) of T2 with the read(Y) of T1.
• Change the order of the read(X) of T2 with the write(Y) of T1.
Time  (a) S1                 (b) S2                 (c) S3
      T1        T2           T1        T2           T1        T2
t1    Begin T1               Begin T1               Begin T1
t2    R(X)                   R(X)                   R(X)
t3    W(X)                   W(X)                   W(X)
t4              Begin T2               Begin T2     R(Y)
t5              R(X)                   R(X)         W(Y)
t6              W(X)         R(Y)                   Commit
t7    R(Y)                             W(X)                   Begin T2
t8    W(Y)                   W(Y)                             R(X)
t9    Commit                 Commit                           W(X)
t10             R(Y)                   R(Y)                   R(Y)
t11             W(Y)                   W(Y)                   W(Y)
t12             Commit                 Commit                 Commit
Figure: Equivalent schedules: (a) nonserial schedule S1; (b) nonserial schedule S2 equivalent to
S1; (c) serial schedule S3, equivalent to S1 and S2.
Schedule S3 is a serial schedule and, since S1 and S2 are equivalent to S3, S1 and S2 are
serializable schedules. This type of serializability is known as conflict serializability. A conflict
serializable schedule orders any conflicting operations in the same way as some serial execution.
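Conflict serializability can be tested mechanically with a precedence graph: add an edge Ti → Tj whenever an operation of Ti conflicts with a later operation of Tj on the same item, then check the graph for cycles. The following is a sketch under an assumed schedule encoding of `(transaction, operation, item)` triples; it is not any particular DBMS's implementation.

```python
# Sketch of a conflict-serializability test via a precedence graph.
# A schedule is conflict serializable iff the graph is acyclic.

def conflict_serializable(schedule):
    """schedule: list of (txn, op, item) with op in {'R', 'W'}."""
    edges = set()
    for i, (ti, op_i, x) in enumerate(schedule):
        for tj, op_j, y in schedule[i + 1:]:
            # Two operations conflict if they are from different transactions,
            # touch the same item, and at least one of them is a write.
            if ti != tj and x == y and (op_i == 'W' or op_j == 'W'):
                edges.add((ti, tj))
    graph = {}
    for a, b in edges:
        graph.setdefault(a, []).append(b)

    def cyclic(node, path, done):      # depth-first cycle detection
        if node in path:
            return True
        if node in done:
            return False
        path.add(node)
        if any(cyclic(n, path, done) for n in graph.get(node, [])):
            return True
        path.remove(node)
        done.add(node)
        return False

    done = set()
    return not any(cyclic(n, set(), done) for n in list(graph))

# Schedule S1 from the figure: only T1 -> T2 edges, so it is serializable.
s1 = [("T1", "R", "X"), ("T1", "W", "X"), ("T2", "R", "X"), ("T2", "W", "X"),
      ("T1", "R", "Y"), ("T1", "W", "Y"), ("T2", "R", "Y"), ("T2", "W", "Y")]
print(conflict_serializable(s1))      # True

# A schedule with a cycle T1 -> T2 (on X) and T2 -> T1 (on Y) is not.
s_bad = [("T1", "W", "X"), ("T2", "W", "X"), ("T2", "W", "Y"), ("T1", "W", "Y")]
print(conflict_serializable(s_bad))   # False
```

For S1, every conflicting pair has T1's operation first, so all edges point from T1 to T2 and the graph is acyclic, matching the equivalence to the serial schedule S3 shown above.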

1.6. Transaction Recovery


Transaction recovery is the process of restoring a transaction to a correct (consistent) state in the
event of a failure. Transaction recovery reverses all the changes that the transaction made to
the database before it was aborted. A transaction begins with the successful execution of its first
operation and ends with the successful execution of either a commit or a rollback statement.
Transactions are not only the unit of work but also the unit of recovery.

1.6.1. Failure Classification
Local Failure: A local failure affects only the transaction in which the failure has actually occurred.
Global Failure: A global failure affects all of the transactions in progress at the time of the failure.
Global failures fall into two broad categories: system failure and media failure.
1. System Failure: There are various types of failure that may occur in a system.
i. Transaction Failure: a transaction failure can be:
Logical Error: The transaction can no longer continue with its normal execution because
of some internal condition such as wrong input, data not found, or a resource limit exceeded.
System Error: The system has entered an undesirable state as a result of which a transaction
cannot continue with its normal execution, e.g., a deadlock or starvation in the system.
ii. System Crash: A system failure affects all transactions currently in progress but does
not physically damage the database. A system failure is sometimes called a soft crash.
2. Media Failure: A media failure is sometimes called a hard crash. Disk Failure is an example
of Media Failure. A disk block loses its content as a result of either a head crash or failure
during a data transfer operation.
1.6.2. Types of Transaction Recovery
In case of any type of failures, a transaction must either be aborted or committed to maintain data
integrity. Transaction log plays an important role for database recovery and brings the database
in a consistent state in the event of failure. The recovery manager guarantees the atomicity and
durability properties of transactions in the event of failure. During recovery from failure, the
recovery manager ensures that either all the effects of a given transaction are permanently recorded
in the database or none of them are recorded. There are two types of transaction recovery:

1.6.2.1. Forward Recovery (REDO)
Forward recovery (also called roll-forward) is the recovery procedure, which is used in case of a
physical damage, for example crash of disk pack (secondary storage), failures during writing of
data to database buffers, or failure during transferring buffers to secondary storage. The
intermediate results of the transactions are written in the database buffers. The data is transferred
to buffer and from secondary storage of the database. If the transaction had already issued its
commit, the recovery manager redo (roll forward) so that transaction's updates to the database.
The forward recovery guarantees the durability property of transaction.

Figure: Forward (roll-forward) recovery or redo


1.6.2.2. Backward Recovery or UNDO
Backward recovery (also called roll-backward) is the recovery procedure used when an error
occurs in the midst of normal operation on the database. If the transaction had not committed at
the time of failure, it will cause inconsistency in the database because, in the interim, other
programs may have read the incorrect data and made use of it. The recovery manager must then
undo (roll back) any effects of the transaction on the database. Backward recovery guarantees the
atomicity property of transactions.

Figure: Backward recovery or UNDO.

Backward recovery, example: at restart time, the system goes through the following procedure in
order to identify all transactions of types T1 to T5.

Figure: Five transaction categories

• Start with two lists of transactions, the UNDO list and the REDO list. Set the UNDO list equal
to the list of all transactions given in the most recent checkpoint record; set the REDO list to
empty.
• Search forward through the log, starting from the check point record.
• If a BEGIN TRANSACTION log entry is found for transaction T add T to the UNDO list.
• If a COMMIT log entry is found for transaction T, move T from UNDO list to the REDO list.
• When the end of the log is reached, the UNDO and REDO lists identify, respectively,
transactions of types T3 and T5 and transactions of types T2 and T4. The system now works
backward through the log, undoing the transactions in the UNDO list. Restoring the database
to a consistent state by undoing work is called backward recovery.
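The restart procedure above can be sketched directly. The log-record encoding (a list of `(entry, txn)` pairs covering records after the most recent checkpoint) is an assumption for the sketch; real logs carry much more information.

```python
# Sketch of the restart scan: UNDO starts as the transactions active at
# the checkpoint, REDO starts empty, then the log is scanned forward.

def restart_lists(checkpoint_active, log):
    """log: list of (entry, txn) with entry in {'BEGIN', 'COMMIT'}."""
    undo = set(checkpoint_active)   # active at the most recent checkpoint
    redo = set()
    for entry, txn in log:
        if entry == "BEGIN":
            undo.add(txn)           # started after the checkpoint
        elif entry == "COMMIT":
            undo.discard(txn)       # committed: move to the REDO list
            redo.add(txn)
    return undo, redo

# T2: active at the checkpoint, committed later (type T2)
# T3: active at the checkpoint, never committed (type T3)
# T4: began and committed after the checkpoint (type T4)
# T5: began after the checkpoint, never committed (type T5)
log = [("COMMIT", "T2"), ("BEGIN", "T4"), ("BEGIN", "T5"), ("COMMIT", "T4")]
undo, redo = restart_lists({"T2", "T3"}, log)
print(sorted(undo), sorted(redo))   # ['T3', 'T5'] ['T2', 'T4']
```

The result matches the five-category figure: T3 and T5 are undone, while T2 and T4 are redone.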
Recovery with Concurrent Transactions: When more than one transaction is being executed in
parallel, the logs are interleaved. At the time of recovery, it would become hard for the recovery
system to backtrack all logs and then start recovering. To ease this situation, most modern DBMSs
use the concept of checkpoints.

Checkpoint: Keeping and maintaining logs in real time and in a real environment may fill all the
memory space available in the system. As time passes, the log file may become too big to be
handled at all. A checkpoint declares a point before which the DBMS was in a consistent state and
all the transactions were committed. When a system with concurrent transactions crashes and
recovers, it behaves in the following manner:

Recovery with concurrent transactions
The recovery system reads the logs backwards from the end to the last checkpoint, maintaining
two lists, an undo-list and a redo-list. If the recovery system sees a log with <Tn, Start> and
<Tn, Commit>, or just <Tn, Commit>, it puts the transaction in the redo-list. If it sees a log with
<Tn, Start> but no commit or abort entry, it puts the transaction in the undo-list. All transactions
in the undo-list are then undone and their logs are removed. For all transactions in the redo-list,
their previous logs are removed, the operations are redone, and the logs are saved again.

1.7. Database Recovery


Database recovery is the process of restoring the database to a correct state in the event of a failure.
Data can be stored in: main memory, magnetic disk, magnetic tape and optical disk. Some failures
affect main memory only, and others may affect secondary storage. Some of causes of failure are:
• System crashes due to hardware or software errors, resulting in loss of main memory.
• Media failures, such as head crashes resulting in the loss of parts of secondary storage.
• Application software errors, such as logical errors in the program that is accessing the database,
which cause one or more transactions to fail.
• Natural physical disasters, such as fires, floods, earthquakes, or power failures.
• Carelessness or unintentional destruction of data or facilities by operators or users;
• Sabotage, or intentional corruption or destruction of data, hardware, or software facilities.
1.7.1. Recovery Facilities
DBMS should provide following facilities to assist with recovery:
Backup Mechanism: Backup refers to the copying of physical or virtual files or databases to a
secondary location for preservation in case of equipment failure or catastrophe. Depending on
whether you make your database backups when the database is open or closed, backups can be

further categorized into the backup modes known as consistent backup and inconsistent backup.
Typically, a backup copy is produced at least once per day. The copy should be stored in a secured
location where it is protected from loss or damage. The backup copy is used to restore the database
in the event of hardware failure, catastrophic loss, or damage.

Logging Facilities: There are two types of logs, transaction log & database change log:
Transaction log: Transaction log contains a record of the essential data for each transaction that
is processed against the database. Data that are typically recorded for each transaction include the
transaction code or identification, action or type of transaction, time of the transaction, terminal
number or user ID, input data values, table and records accessed, records modified, and possibly
the old and new field values.
Database Change Log: The database change log contains before and after images of records that
have been modified by transactions. A before-image is a copy of a record before it has been
modified, and an after-image is a copy of the same record after it has been modified.
Checkpoint Facility: A checkpoint facility in a DBMS periodically refuses to accept any new
transactions. All transactions in progress are completed, and the journal files are brought up to
date. At this point, the system is in a quiet state, and the database and transaction logs are
synchronized. The DBMS writes a special record (called a checkpoint record) to the log file, which
is like a snapshot of the state of the database. A DBMS may perform checkpoints automatically or
in response to commands in user application programs. Checkpoints should be taken frequently.

Recovery Manager (RMAN): The recovery manager is a module of the DBMS which restores
the database to a correct condition when a failure occurs and which resumes processing user
requests. The recovery manager uses the logs to restore the database.

1.7.2. Recovery Techniques


Three main recovery techniques that are commonly employed are:
Deferred update: Deferred updates are not written to the database until after a transaction has
reached its commit point. If a transaction fails before commit, it will not have modified the
database, so no undoing of changes is required. With deferred update it may still be necessary to
redo the updates of committed transactions, as their effects may not have reached the database.
Immediate update: In the case of immediate update, updates are applied to database as they occur.
There is a need to redo updates of committed transactions following a failure. Also, there may be

need to undo the effects of transactions that had not committed at the time of failure. It is essential
that log records are written before the write to the database. If there is no "transaction commit"
record in the log, then that transaction was active at the failure and must be undone. Undo
operations are performed in the reverse of the order in which they were written to the log.
Shadow paging: Shadow paging maintains two page tables during the life of a transaction: the
current page table and the shadow page table. When the transaction starts, the two page tables are
the same. The shadow page table is never changed thereafter and is used to restore the database in
the event of failure. During the transaction, the current page table records all updates to the
database. When the transaction completes, the current page table becomes the shadow page table.
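Shadow paging can be sketched with dictionaries standing in for the page store and the two page tables; this copy-on-write representation is an illustrative assumption, not a real storage engine.

```python
# Sketch of shadow paging: the shadow page table is frozen at transaction
# start; updates go to fresh pages via the current page table, and commit
# installs the current table as the new shadow table.

pages = {0: "old-A", 1: "old-B"}     # page store: page id -> contents
shadow_table = {"A": 0, "B": 1}      # item -> page id, frozen at txn start
current_table = dict(shadow_table)   # working copy for the transaction

def write(item, value):
    """Copy-on-write: place the new value on a fresh page."""
    new_page = max(pages) + 1
    pages[new_page] = value
    current_table[item] = new_page   # shadow_table still points at old page

write("A", "new-A")

# On failure, recovery simply keeps shadow_table: A still maps to page 0.
assert pages[shadow_table["A"]] == "old-A"
# On commit, the current page table becomes the new shadow page table.
shadow_table = current_table
assert pages[shadow_table["A"]] == "new-A"
```

Because the old pages are never overwritten in place, no undo or redo log is needed: recovery is simply a matter of which page table is treated as authoritative.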
1.7.3. Crash Recovery
Crash recovery is the process of protecting the database from catastrophic system failures and
media failures. The recovery manager of a DBMS is responsible for ensuring transaction atomicity
and durability.
Saving checkpoints: The system saves the status of the database at regular intervals, so if any
crash occurs, the database can be restored to the last saved checkpoint.
Steal approach: In this case, an object can be written to disk before the transaction that holds the
object has committed. This happens when the buffer manager chooses the frame holding the page
for replacement while another transaction still requires that page. This method is called stealing
frames.
Forcing pages: In this case, once a transaction completes, all the objects associated with it are
forced to disk, i.e., written to the disk. But this results in more I/O cost, so normally a no-force
approach is used instead.

