Unit 4 Chapter 2 Transaction Management
Unit 4 Chapter 2 Transaction Management
Unit 4 Chapter 2 Transaction Management
Management
Syllabus Topics
• Basic concept of a Transaction
• ACID Properties
• State diagram
• Concept of Schedule
• Serializability – Conflict and View
• Concurrency Control Protocols
• Recovery techniques
What is Transaction in DBMS?
• Transactions are a set of operations that are used to perform some logical set of work.
• A transaction is made to change data in a database which can be done by inserting new
data, updating the existing data, or by deleting the data that is no longer required.
• For example, you are transferring money from your bank account (Say Account A) to
your friend’s account(Say Account B ), the set of operations would be like this:
• Simple Transaction Example
• Read your account balance
• Deduct the amount from your balance
• Write the remaining balance to your account
• Read your friend’s account balance
• Add the amount to his account balance
• Write the new updated balance to his account
What is Transaction in DBMS?
• In this case, we need to perform at least two updates.
• The first update is happening in the sender’s account from where the money
is sending and the other update is happening in the receiver’s account who is
receiving the money.
• Both of these updates should either get committed or get rollback if there is
an error. We don’t want the transaction to be in a half-committed state.
• Operations of Transaction:
• Following are the main operations of transaction:
• Read(X): Read operation is used to read the value of X from the database and
stores it in a buffer in main memory.
• Write(X): Write operation is used to write the value back to the database
from the buffer.
ACID properties
• To ensure the integrity and consistency of data during a transaction, the database
system maintains four properties. These properties are widely known as ACID
properties.
ACID properties
• Atomicity
• This property ensures that either all the operations of a transaction reflect in
database or none. Either it executes completely or it doesn’t, there shouldn’t be a
partial execution.
• Suppose Account A has a balance of 400$ & B has 700$. Account A is transferring
100$ to Account B.
• This is a transaction that has two operations
• a) Debiting 100$ from A’s balance
• b) Crediting 100$ to B’s balance.
• Let’s say first operation passed successfully while second failed, in this case A’s
balance would be 300$ while B would be having 700$ instead of 800$.
• This is unacceptable in a banking system. Either the transaction should fail without
executing any of the operation or it should process both the operations. The
Atomicity property ensures that.
ACID properties
• Atomicity
• There are two key operations are involved in a transaction to maintain the
atomicity of the transaction.
• Rollback : If there is a failure in the transaction, abort the execution and
rollback the changes made by the transaction.
• Commit: If transaction executes successfully, commit the changes to the
database.
• Consistency
• Database must be in consistent state before and after the execution of the
transaction. This ensures that there are no errors in the database at any point
of time. Application programmer is responsible for maintaining the
consistency of the database.
ACID properties
• Consistency
• Example:
A transferring 1000 dollars to B. A’s initial balance is 2000 and B’s initial
balance is 5000.
• Before the transaction:
Total of A+B = 2000 + 5000 = 7000$
• After the transaction:
Total of A+B = 1000 + 6000 = 7000$
• There is no other transaction which change values of A or B during
execution of T, then the new values of A and B will be 1000 and 6000 i.e.,
(A+B=7000 after the transaction).
• The data is consistent before and after the execution of the transaction so
this example maintains the consistency property of the database.
ACID properties
• Isolation
• A transaction shouldn’t interfere with the execution of another transaction.
• To preserve the consistency of database, the execution of transaction should
take place in isolation (that means no other transaction should run
concurrently when there is a transaction already running).
• For example account A is having a balance of 400$ and it is transferring 100$
to account B & C both.
• So we have two transactions here. Let’s say these transactions run
concurrently and both the transactions read 400$ balance, in that case the
final balance of A would be 300$ instead of 200$. This is wrong.
• If the transaction were to run in isolation then the second transaction would
have read the correct balance 300$ (before debiting 100$) once the first
transaction went successful.
ACID properties
• Durability
• Once a transaction completes successfully,
the changes it has made into the database
should be permanent even if there is a
system failure.
• The recovery-management component of
database systems ensures the durability of
transaction.
• ACID properties are the backbone of a
database management system.
• These properties ensure that even though
there are multiple transaction reading and
writing the data in the database, the data is
always correct and consistent.
State diagram
• During the lifetime of a
transaction, there are a
lot of states to go through.
These states update the
operating system about
the current state of the
transaction.
• These states decide the
regulations which
decide the fate of a
transaction whether it
will commit or abort.
State diagram
• Active State
• If a transaction is in execution then it is said to be in active state.
• During this state read or write operations can be performed.
• Partially Committed State
• Once the whole transaction is successfully executed, the transaction goes into
partially committed state where we have all the read and write operations
performed on the main memory (local memory) instead of the actual database.
• The reason why we have this state is because a transaction can fail during
execution so if we are making the changes in the actual database instead of local
memory, database may be left in an inconsistent state in case of any failure.
• This state helps us to rollback the changes made to the database in case of a
failure during execution.
State diagram
• Failed State
• A transaction considers failed when any one of the checks fails or if the
transaction is aborted while it is in the active state.
• If a transaction is executing and a failure occurs, either a hardware failure or a
software failure then the transaction goes into failed state from the active
state.
• Committed State
• If a transaction completes the execution successfully then all the changes
made in the local memory during partially committed state are permanently
stored in the database.
State diagram
• Aborted State
• If the transaction fails during its execution, it goes from failed state to aborted
state and because in the previous states all the changes were only made in the
main memory, these uncommitted changes are either deleted or rolled back.
• The transaction at this point can restart and start afresh from the active state.
• Terminated State
• State of transaction reaches terminated state when certain transactions which
are leaving the system can’t be restarted.
DBMS Schedule Time Transaction T1 Transaction T2
t1 Read(A)
• A series of operation from one transaction to
another transaction is known as schedule. t2 A=A+50
t3 Write(A)
• It is used to preserve the order of the operation in
each of the individual transaction. t4 Read(A)
t5 A+A+100
• When multiple transactions are running
concurrently then there needs to be a sequence in t6 Write(A)
Above two schedules, S1 and S2 are view equivalent because final write
operation in schedule S1 is done by T1 and in S2, T1 also does the final
write operation.
Concurrency Control Protocols
• Concurrency Control in Database Management System is a procedure of
managing simultaneous operations without conflicting with each other.
• Concurrent access is quite easy if all users are just reading data. There is no
way they can interfere with one another.
• Though for any practical Database, it would have a mix of READ and WRITE
operations and hence the concurrency is a challenge.
• Problems of Concurrency
• Lost Updates occur when multiple transactions select the same row and
update the row based on the value selected
• Uncommitted dependency issues occur when the second transaction selects a
row which is updated by another transaction
• Non-Repeatable Read occurs when a second transaction is trying to access the
same row several times and reads different data each time.
Concurrency Control Protocols
• Why use Concurrency method?
• To resolve read-write and write-write conflict issues
• Concurrency control helps to ensure serializability.
• Example
• Assume that two people who go to electronic kiosks at the same time to buy
a movie ticket for the same movie and the same show time.
• However, there is only one seat left in for the movie show in that particular
theatre. Without concurrency control in DBMS, it is possible that both
moviegoers will end up purchasing a ticket. However, concurrency control
method does not allow this to happen. Both moviegoers can still access
information written in the movie seating database. But concurrency control
only provides a ticket to the buyer who has completed the transaction
process first.
Concurrency Control Protocols
• Different concurrency control protocols offer different benefits between the amount of
concurrency they allow and the amount of overhead that they impose. Following are
the Concurrency Control techniques in DBMS:
• Lock-Based Protocols
• Two Phase Locking Protocol
• Timestamp-Based Protocols
• Validation-Based Protocols
• Lock-based Protocols
• Lock Based Protocols in DBMS is a mechanism in which a transaction cannot Read or
Write the data until it acquires an appropriate lock. Lock based protocols help to
eliminate the concurrency problem in DBMS for simultaneous transactions by locking
or isolating a particular transaction to a single user.
• A lock is a data variable which is associated with a data item. This lock signifies that
operations that can be performed on the data item.
Concurrency Control Protocols
• Lock-based Protocols
• Binary Locks: A Binary lock on a data item can either locked or unlocked states.
• Shared/exclusive: This type of locking mechanism separates the locks in DBMS
based on their uses. If a lock is acquired on a data item to perform a write
operation, it is called an exclusive lock.
• 1. Shared Lock (S):
• A shared lock is also called a Read-only lock. With the shared lock, the data item
can be shared between transactions. This is because you will never have
permission to update data on the data item.
• For example, consider a case where two transactions are reading the account
balance of a person. The database will let them read by placing a shared lock.
However, if another transaction wants to update that account’s balance, shared
lock prevent it until the reading process is over.
Concurrency Control Protocols
• Lock-based Protocols
• 2. Exclusive Lock (X):
• With the Exclusive Lock, a data item can be read as well as written. This is
exclusive and can’t be held concurrently on the same data item. X-lock is
requested using lock-x instruction. Transactions may unlock the data item
after finishing the ‘write’ operation.
• For example, when a transaction needs to update the account balance of a
person. You can allows this transaction by placing X lock on it. Therefore,
when the second transaction wants to read or write, exclusive lock prevent
this operation.
Concurrency Control Protocols
• Two Phase Locking Protocol
• Two Phase Locking Protocol also known as 2PL protocol is a method of
concurrency control in DBMS that ensures serializability by applying a lock to
the transaction data which blocks other transactions to access the same data
simultaneously.
• This locking protocol divides the execution phase of a transaction into three
different parts.
• In the first phase, when the transaction begins to execute, it requires
permission for the locks it needs.
• The second part is where the transaction obtains all the locks. When a
transaction releases its first lock, the third phase starts.
• In this third phase, the transaction cannot demand any new locks. Instead, it
only releases the acquired locks.
Concurrency Control Protocols
• Two Phase Locking Protocol
• The Two-Phase Locking protocol allows each
transaction to make a lock or unlock request
in two steps:
• Growing Phase: In this phase transaction
may obtain locks but may not release any
locks.
• Shrinking Phase: In this phase, a transaction
may release locks but not obtain any new
lock
Concurrency Control Protocols
• Timestamp-based Protocols
• Timestamp based Protocol in DBMS is an algorithm which uses the System
Time or Logical Counter as a timestamp to serialize the execution of
concurrent transactions. The Timestamp-based protocol ensures that every
conflicting read and write operations are executed in a timestamp order.
• The older transaction is always given priority in this method. It uses system
time to determine the time stamp of the transaction.
• Suppose there are there transactions T1, T2, and T3.
• T1 has entered the system at time 0010
• T2 has entered the system at 0020
• T3 has entered the system at 0030
• Priority will be given to transaction T1, then transaction T2 and lastly
Transaction T3.
Concurrency Control Protocols
• Validation Based Protocol
• Validation based Protocol in DBMS also known as Optimistic Concurrency Control Technique is a method
to avoid concurrency in transactions. In this protocol, the local copies of the transaction data are updated
rather than the data itself, which results in less interference while execution of the transaction.
• The Validation based Protocol is performed in the following three phases:
1. Read Phase
2. Validation Phase
3. Write Phase
• Read Phase
• In the Read Phase, the data values from the database can be read by a transaction but the write operation
or updates are only applied to the local data copies, not the actual database.
• Validation Phase
• In Validation Phase, the data is checked to ensure that there is no violation of serializability while
applying the transaction updates to the database.
• Write Phase
• In the Write Phase, the updates are applied to the database if the validation is successful, else; the
updates are not applied, and the transaction is rolled back.
Recovery techniques
• Database systems, like any other computer system, are subject to failures
but the data stored in them must be available as and when required.
• When a database fails it must possess the facilities for fast recovery. It
must also have atomicity i.e. either transaction are completed successfully
and committed (the effect is recorded permanently in the database) or the
transaction should have no effect on the database.
• The techniques used to recover the lost data due to system crashes,
transaction errors, viruses, catastrophic failure, incorrect commands
execution, etc. are database recovery techniques.
• Recovery techniques are heavily dependent upon the existence of a special
file known as a system log.
• It contains information about the start and end of each transaction and
any updates which occur during the transaction. The log keeps track of all
transaction operations that affect the values of database items.
Recovery techniques
• This information is needed to recover from transaction failure.
• The log is kept on disk start_transaction(T): This log entry records that transaction T
starts the execution.
• read_item(T, X): This log entry records that transaction T reads the value of database
item X.
• write_item(T, X, old_value, new_value): This log entry records that transaction T
changes the value of the database item X from old_value to new_value. The old value is
sometimes known as a before an image of X, and the new value is known as an
afterimage of X.
• commit(T): This log entry records that transaction T has completed all accesses to the
database successfully and its effect can be committed (recorded permanently) to the
database.
• abort(T): This records that transaction T has been aborted.
• checkpoint: Checkpoint is a mechanism where all the previous logs are removed from
the system and stored permanently in a storage disk. Checkpoint declares a point before
which the DBMS was in a consistent state, and all the transactions were committed.
Recovery techniques
• Undoing – If a transaction crashes, then the recovery manager may undo
transactions i.e. reverse the operations of a transaction. This involves
examining a transaction for the log entry write_item(T, x, old_value,
new_value) and set the value of item x in the database to old-value.
• Deferred update – This technique does not physically update the database on
disk until a transaction has reached its commit point. Before reaching commit,
all transaction updates are recorded in the local transaction workspace. If a
transaction fails before reaching its commit point, it will not have changed the
database in any way so UNDO is not needed.
• Immediate update – In the immediate update, the database may be updated by
some operations of a transaction before the transaction reaches its commit
point.
• Caching/Buffering – In this one or more disk pages that include data items to be
updated are cached into main memory buffers and then updated in memory
before being written back to disk.
Recovery techniques
• Shadow paging –When a transaction began executing the current directory
is copied into a shadow directory. When a page is to be modified, a
shadow page is allocated in which changes are made and when it is ready
to become durable, all pages that refer to the original are updated to refer
new replacement page.
• Backward Recovery – The term “Rollback ” and “UNDO” can also refer to
backward recovery. When a backup of the data is not available and
previous modifications need to be undone, this technique can be helpful.
With the backward recovery method, unused modifications are removed
and the database is returned to its prior condition.
• Forward Recovery – “Roll forward “and “REDO” refers to forwarding
recovery. When a database needs to be updated with all changes verified,
this forward recovery technique is helpful.