Unit 4 Notes
Unit 4 Notes
Unit 4 Notes
Unit-4 notes
Unit-4
Notes
Transaction Processing
Example:
You are working on a system for a bank. A customer goes to the ATM and
instructs it to transfer Rs. 1000 from savings to a checking account. This simple
transaction requires two steps:
The code to create this transaction will require two updates to the database. For
example, there will be two SQL statements: one UPDATE command to decrease
the balance in savings and a second UPDATE command to increase the balance in
the checking account.
You have to consider what would happen if a machine crashed between these two
operations. The money has already been subtracted from the savings account will
not be added to the checking account. It is lost. You might consider performing the
addition to checking first, but then the customer ends up with extra money, and the
bank loses. The point is that both changes must be made successfully.
There are several reasons for a transaction to fail in the middle of execution.
5. Disk failure: some disk blocks may lose their data because of a disk
read/write head crash. This may happen during a read/ write operation of a
transaction.
3. Aborted:- After the transaction has been rolled back and the database has
been restored to its state prior to the start of the transaction.
Transaction Properties
A transaction must have the following four properties, called ACID properties, to
ensure that a database remains in stable state after the transaction is executed:
• Atomicity
• Consistency
• Isolation
• Durability
In other words, execution of a transaction must leave a database in either its prior
stable state or a new stable state that reflects the new modifications (updates)
made by the transaction.
If the transaction fails, the database must be returned to the state it was in prior
to the execution of the failed transaction. If the transaction commits, the
database must reflect the new changes.
Overview of serializability
When multiple transactions are being executed by the operating system in a
multiprogramming environment, there are possibilities that instructions of one
transaction are interleaved with some other transaction.
Schedule
Types of schedule:
1. Complete schedule
2. Serial schedule
3. Non-serial schedule
4. Equivalent schedule
5. Serializable schedule
T1 T2
A=A+100
B=B-100
A=A*7.06
B=B*7.06
Schedule 1
T1 T2
A=A+100
B=B-100
A=A*7.06
B=B*7.06
Schedule 2
Non Serial Schedule: A schedule in which the operations from a set of concurrent
transactions are interleaved is called a non- serial schedule.
T1 T2
A=A+100
A=A*7.06
B=B-100
B=B*7.06
Schedule 3
SERIALSCHEDULE NON-SERIALSCHEDULE
A serial schedule is a sequence of operation by a set A non-serial schedule is a schedule where the operations of a of
concurrent transaction that preserves the order of group of concurrent transactions are interleaved. operations in
each of the individual transactions.
Transactions are performed in serial order. Transactions are performed in non-serial order, but result
should be same as serial.
EXAMPLE: EXAMPLE:
If some transaction T is long, the other transaction In this schedule the execution of other transaction goes on
must wait for T to complete all its operations. without waiting the completion of T.
Equivalent Schedule: For any database state, the effect (on the set of objects
in the database) of executing the first schedule is identical to the effect of
executing the second schedule.
Operations Conflict
• Rule 1: If two transactions just read a data object then they do not conflict
and the order is not important.
• Rule 3: If one transaction writes a data object and another either reads or
writes the same data object then the order of execution is important.
• Rule 4: Two actions on the same data object conflict if at least one
of them is a write.
Where
Set of edges consists of all edges Tià Tj for which one of the three conditions
hold:-
We also say that, if an edge Tià Tj exists in the precedence graph then in any
serial schedule S’ equivalent to S, Ti must appear before T.
• There is one edge and two vertices T1 and T2. Arrow indicates that all
instructions of T1 are executed before the execution of first instruction of
T2.
• Note: If the precedence graph for S has a cycle, then the schedule, S is not
conflict serializable. But, if the graph contains no cycles, then the
schedule, S, is conflict serializable.
Concurrency Control
• Lost updates
• Dirty read
• Unrepeatable read
Lost updates:- A lost update problem occurs when two transactions that access
the same database items have their operations in a way that makes the value of
some database item incorrect.
In other words, if transaction T1 and T2 both read a record and then update it, the
effects of the first update will be overwritten by the second update.
eg: T1 T2
read (A)
A:= A-100
read (A)
X:= A*0.02
A:= A+X
write (A)
read (B)
write (A)
B:=B+100
write (B)
Suppose that the initial values of account A and B are $2000 and $1500,
respectively. Then, after the serial execution of transactions T1 and T2 (T1
followed by T2), the value of account A should be $1938 and that of account B
should be $1600.
Suppose that the operations of T1 and T2 are interleaved in such a way that T2
reads the value of account A before T1 updates its value in the database. Now,
when T2 updates the value of account A in the database, the value of account A
updated by transaction T1 is overwritten and hence is lost. This is known as lost
update problem.
Dirty Read:- A dirty read problem occurs when one transaction updates a
database item and then the transaction fails for some reason. The updated
database item is accessed by another transaction before it is changed back to the
original value.
eg: Assume that T1 fails after debiting $100 from account A, but before crediting
this amount to account B. This will leave the database in an inconsistent state.
The value of account A is now $1900, which must be changed back to original
one, that is, $2000. However, before the transaction T1 is rolled back, let another
transaction T2 reads the incorrect value of account A. This incorrect value of
account A that is read by transaction T2 is called dirty data, and the problem is
called dirty read problem.
Unrepeatable Read:- The third problem occurs when a transaction tries to read
the value of the data item twice, and another transaction updates the same data
item in between the two read operations of the first transaction. As a result, the
first transaction reads varied values of same data item during its execution.
eg: T3 T4
read (A)
read (A)
A:= A+100
write (A)
read (A)
sum:= sum + A
write (sum)
Consider a transaction T3 that reads the value of account A. At this point let
another transaction T4 updates the value of account A. Now if T3 again tries to
read the value of account A, it will get a different value. As a result, the
transaction T3 receives different values for two reads of account A. This
interleaved schedule of transaction T3 and T4 that leads to a problem of
unrepeatable read.
Lock-Based Protocols
A lock is defined as a variable associated with a data item that describes the
status of the item with respect to possible operations that can be applied to it.
Lock Types
• Binary Locks
Binary Locking:- In binary locking there are two states of locking namely (a)
locked (‘1’) (b) unlocked (‘0’).
with each database item. If the value of lock on data item X is 1, item X cannot be
accessed by a database operation that requires the item. If a data item X is
unlocked, any transaction can lock the object for its use.
• T will not issue a lock_item(A) operation if it already holds the lock on item-
A.
• T will not issue an unlock_item(A) operation unless it already holds the lock
on item-A.
• At the most, only one transaction can hold the lock on a particular item. No
two transactions can access the same item concurrently.
• DBMS will not allow the transactions to read the same database object
even if neither of the transaction updates the database.
• At most one transaction can hold the lock on a particular item. No two
transactions can access the same data concurrently.
• Lock_S(A)
• LOCK_X(A)
• UNLOCK(A)
• Pessimistic Approach
• Optimistic Approach
2lptgnoc o er
sy
m n c u r r e n
-oseaopc s
t t c o n t r o l
h s
ml
pciapni g
t
o r i m
hkmph2ma p
p r o a c h e
t s
ss
aibpis a
ds
snieaht
ci
o ps
egaleac
c
lPpkdea ri
n
o
orglp a
h
cocp r
ktko i
c
iona
ncgh
go
l s
Pessimistic Approach: An approach in which the transactions are delayed if
they conflict with each other at some point of time in future is called as a
pessimistic approach.
i.e, ValidateàReadàComputeàWrite
Based on pessimistic (not sure) approach, there are two commonly used
locking protocols, namely:-
No checking is done while the transactions are executing. Also note that this
method does not require locking or time-stamping technique.
• Phase-I Growing Phase: A phase during which all locks are requested.
• Phase-II Shrinking Phase: A phase during which all locks are released.
Note:- As per 2PL protocol, the transaction cannot acquire a new lock after it
has unlocked any of its existing locked items.
• Then, it unlocks A. Then it locks B in shared mode and reads it. Then,
it unlocks B.
• And 2PL protocol means it should acquire all locks first and then
release them. So, it is not in 2-phase.
Limitations of 2 PL
• In this protocol, we must have prior knowledge about the order in which
the data items will be accessed. Here, the data items are arranged in a tree
form.
• The nodes of the tree represent data items to be accessed. If (A,B) is an arc
of the tree, then A is called as the parent of B.
A schedule of transactions obeying the above locks and using tree protocol are
as follows:
• These transactions are not in two phase. This is because the transaction,
T1, could release a lock on the data item, d1, after its processing with it is
over, thereby allowing T2 to proceed concurrently.
• But if we use tree protocol then lock is released by T1 and T2 can proceed
concurrently as shown above. So, we have a better concurrency now. Please
note that the above schedule is equivalent to a serial schedule as follows:-
• T1àT3àT2
• If we draw its precedence graph then we find that it does not have a
cycle. Hence, the tree/graph protocol ensures serializability
• This protocol is free from deadlocks and thus, no rollbacks are required.
• For each transaction, Ti. The time stamp is an integer (unique) value. It
is assigned by DBMS before the transaction, Ti, starts execution.
TS(Ti)<TS(Tj)
1. Uniqueness
2. Monotonicity
The uniqueness property assures that no equal timestamp values can exist
and monotonicity assures that timestamp values always increase.
The READ & WRITE operations of database within the same transaction must
have the same timestamp.
The timestamp method does not require any locks. Therefore, there are no
deadlocks. The timestamp methods do not make the transactions wait to prevent
conflicts as is the case with locking.
Deadlocks
There are following three basic schemes to detect and prevent deadlock:
1. Deadlock Prevention
2. Deadlock Detection
3. Deadlock Avoidance
If TS(Ti) < TS(Tj) − that is Ti, which is requesting a conflicting lock, is older
than Tj − then Ti is allowed to wait until the data-item is available.
This scheme allows the older transaction to wait but kills the younger one.
Wound-Wait Scheme:-
If TS(Ti) < TS(Tj), then Ti forces Tj to be rolled back − that is Ti wounds Tj. Tj
is restarted later with a random delay but with the same timestamp.
If TS(Ti) > TS(Tj), then Ti is forced to wait until the resource is available.
This scheme, allows the younger transaction to wait; but when an older
transaction requests an item held by a younger one, the older transaction forces
the younger one to abort and release the item.
In both the cases, the transaction that enters the system at a later stage is
aborted.
A simplest way to detect a state of deadlock is for the system to construct and
maintain a wait-for graph.
In a wait-for graph, an arrow is drawn from the transaction to the record being
sought and then drawing an arrow from that record to the transaction that is
currently using it. If the graph has cycles, deadlock is detected.
Thus, in a wait-for graph, one node is created for each transaction that is
currently executing.
When transaction T2 releases the lock (s) on the data items that the transaction
T1 was waiting for, the directed edge is dropped from the wait-for graph.
We have a state of deadlock if and only if the wait-for graph has a cycle.
In this method, several versions... of each data item X are maintained. For each
version, the value of version and the following two timestamps are kept:
1. Read phase: A transaction can read values of committed data items from the
database. However, updates are applied only to local copies (versions) of the
data items kept in the transaction workspace.
3. Write phase: If the validation phase is successful, the transaction updates are
applied to the database; otherwise, the updates are discarded and the
transaction is restarted.
The idea behind optimistic concurrency control is to do all the checks at once;
hence, transaction execution proceeds with a minimum of overhead until the
In the validation phase for transaction Ti, the protocol checks that Ti does not
interfere with any committed transactions or with any other transactions
currently in their validation phase. The validation phase for Ti checks that, for
each such transaction Tj that is either committed or is in its validation phase, one
of the following conditions holds:
1. Transaction Tj completes its write phase before Ti starts its read phase.
2. Ti starts its write phase after Tj completes its write phase, and the read_set of
Ti has no items in common with the write_set of Tj.
3. Both the read_set and write_set of Ti have no items in common with the
write_set of Tj, and Tj completes its read phase before Ti completes its
read phase.
When validating transaction Ti, the first condition is checked first for each
transaction Tj, since (1) is the simplest condition to check. Only if condition (1) is
false is condition (2) checked, and only if (2) is false is condition (3)—the most
complex to evaluate—checked. If any one of these three conditions holds, there is
no interference and Ti is validated successfully. If none of these three conditions
holds, the validation of transaction Ti fails and it is aborted and restarted later
because interference may have occurred.
Database Recovery
Recovery from transaction failure means that the database is restored to the
most recent consistent state just before the time of failure. To do this, the system
must keep information about the changes that were applied to data items by he
various transaction. This information is kept in system log.
Techniques used:
These techniques defer or postpone any actual updates to the database until the
transaction reaches it commit point. During transaction execution, the updates
are written to the log file. After the transaction reaches it commit point, the log
file is force-written to disk, then the updates are recorded in the database.
Transaction
Commit point
Updates
Database
Log file
If the transaction fails before reaching its commit point, there is no need to undo
any operations because the transaction has not affected the database on disk in
any way.
Fails
Transaction Commit point
No
need
to
UNDO
Updates Database
Log
REDO is needed in case the system fails after a transaction commits but
before all its changes are recorded on disk. In this case, the transaction
operations are redone from the log file.
Database
Done
Updates
Fail
REDO
Commit point
Transaction transaction not
Committed
Updates Database
Log
*no need to REDO.
Provisions must be made for undoing the effect of update operation that have
been applied to the database by a failed transaction. This is done by rolling back
the transaction and undoing the effect of the transaction write operation.
Commit point
Transaction transaction not
Committed
Done
Database
Updates
Fail
Log
UNDO
Shadow Paging
o Each contain pointer to a page on disk (1 to 1st page on database and so on…).
o The shadow page table is never changed over the duration of the transaction.
o The current page table may be changed when a transaction performs a write operation.
o All input and output operations use the current page table to locate database pages
on disk.
– A new copy of the modified DB page is created and the old copy is not
overwritten.
– The current directory entry is modified to point to the new disk block.
Advantages
Disadvantages
users are who they claim to be and thus prevent unauthorized users from gaining
access to secured resources.
In the diagram above, a user working at a client system interacts with the
authentication system to prove his identity and then carries on a conversation
with a server system. The server system, in turn, interacts with an authorization
system to determine what rights and privileges the client's user should be
granted.