OODBMS and ORDBMS
Suseta Datta
Assistant Professor
Department of Computer Applications
University of Engineering & Management, Kolkata
Concurrency Control
Concurrency control is a database-management-system concept that addresses the conflicts which arise when multiple users simultaneously access or alter the same data. Its objective is to ensure the serializability of transactions in a multi-user database environment.
Concurrency control is important because the
simultaneous execution of transactions over a shared
database can create several data integrity and
consistency problems.
Lock-based Protocols
A lock is a mechanism to control concurrent access to a
data item.
Data items can be locked in two modes:
exclusive (X) mode: the data item can be both read and written. An X-lock is requested using the lock-X instruction.
shared (S) mode: the data item can only be read. An S-lock is requested using the lock-S instruction.
Lock requests are made to the concurrency-control manager by the programmer. A transaction can proceed only after its request is granted.
Lock-based Protocols
Lock-compatibility matrix:

      S      X
S     true   false
X     false  false

An S-lock on an item can be granted while only S-locks are held on it; an X-lock can be granted only if no other transaction holds any lock on the item.
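For illustration, here is a minimal Python sketch of a lock manager that grants or blocks requests according to this matrix (the LockManager class, its method names, and the transaction labels are hypothetical, not taken from any real DBMS):

from collections import defaultdict

# Lock compatibility: S is compatible only with S; X is compatible with nothing.
COMPATIBLE = {("S", "S"): True, ("S", "X"): False,
              ("X", "S"): False, ("X", "X"): False}

class LockManager:
    def __init__(self):
        # data item -> list of (transaction, mode) pairs currently holding locks
        self.held = defaultdict(list)

    def acquire(self, txn, item, mode):
        """Grant lock-S / lock-X on item, or return False (caller must wait)."""
        for holder, held_mode in self.held[item]:
            if holder != txn and not COMPATIBLE[(held_mode, mode)]:
                return False                  # conflicting lock held: block
        self.held[item].append((txn, mode))
        return True

    def release(self, txn, item):
        self.held[item] = [(t, m) for (t, m) in self.held[item] if t != txn]

lm = LockManager()
assert lm.acquire("T1", "A", "S")             # lock-S(A) granted
assert lm.acquire("T2", "A", "S")             # S is compatible with S
assert not lm.acquire("T3", "A", "X")         # lock-X(A) must wait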
Thomas Write Rule
The Thomas write rule ensures that an obsolete write by T2 is never seen by any transaction, so the write can safely be ignored rather than causing a rollback. If the ignored write operation is deleted from transaction T2, the resulting schedule is conflict serializable (illustrating figure omitted).
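As a hedged sketch, the write check under timestamp ordering with the Thomas write rule might look like this in Python, assuming per-item read and write timestamps (the record format and all names are illustrative):

# Timestamp-ordering write check with the Thomas write rule: an obsolete
# write (older than the item's latest write) is silently ignored instead
# of forcing a rollback.

def write(item, value, ts, read_ts, write_ts, db):
    if ts < read_ts[item]:
        raise Exception("rollback: a younger transaction already read the item")
    if ts < write_ts[item]:
        return                   # Thomas write rule: ignore the obsolete write
    db[item] = value             # otherwise perform the write
    write_ts[item] = ts

db = {"Q": 0}
read_ts, write_ts = {"Q": 0}, {"Q": 5}
write("Q", 42, ts=3, read_ts=read_ts, write_ts=write_ts, db=db)
print(db["Q"])                   # still 0: the write at ts=3 was ignored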
Multiple Granularity
Allow data items to be of various sizes and define a hierarchy of data granularities, where smaller granularities are nested within larger ones.
Can be represented graphically as a tree.
When a transaction locks a node in the tree explicitly, it implicitly locks all the node's descendants in the same mode (see the sketch after this list).
Granularity of locking (level in tree where locking is done):
fine granularity (lower in tree): high concurrency, high locking overhead
coarse granularity (higher in tree): low locking overhead, low concurrency
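A small Python sketch of implicit locking, assuming a toy granularity tree (all names are illustrative): an explicit lock placed on a node covers every descendant, found by walking from the descendant up towards the root.

# Toy granularity tree: database -> files -> records.
tree = {"db": ["file1", "file2"],
        "file1": ["rec1", "rec2"],
        "file2": ["rec3"],
        "rec1": [], "rec2": [], "rec3": []}

parent = {c: p for p, children in tree.items() for c in children}
explicit = {}                       # node -> mode of an explicit lock

def lock(node, mode):
    explicit[node] = mode           # one explicit lock at this granularity

def effective_mode(node):
    """Walk up the tree: an explicit lock on any ancestor covers node."""
    while node is not None:
        if node in explicit:
            return explicit[node]
        node = parent.get(node)
    return None

lock("file1", "X")                  # coarse lock: low locking overhead
print(effective_mode("rec2"))       # 'X' - rec2 is implicitly X-locked
print(effective_mode("rec3"))       # None - the other subtree is unaffected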
Example of Granularity Hierarchy
[Tree figure omitted]
Example (immediate database modification): the log, the database writes, and the block outputs, in order:

Log                      Write        Output
<T0 start>
<T0, A, 1000, 950>
<T0, B, 2000, 2050>
                         A = 950
                         B = 2050
<T0 commit>
<T1 start>
<T1, C, 700, 600>
                         C = 600
                                      BB, BC
<T1 commit>
                                      BA

Note: BX denotes the block containing X.
Immediate Database Modification
Recovery procedure has two operations instead of one:
undo(Ti) restores the value of all data items updated by Ti to
their old values, going backwards from the last log record for Ti
redo(Ti) sets the value of all data items updated by Ti to the new
values, going forward from the first log record for Ti.
Both operations must be idempotent.
That is, even if the operation is executed multiple times the
effect is the same as if it is executed once.
Needed since operations may get re-executed during recovery.
When recovering after failure:
Transaction Ti needs to be undone if the log contains the record
<Ti start>, but does not contain the record <Ti commit>.
Transaction Ti needs to be redone if the log contains both the
record <Ti start> and the record <Ti commit>.
Undo operations are performed first, then redo operations.
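A minimal sketch of idempotent undo and redo over update records of the form <Ti, X, old value, new value> (the tuple format and function names are illustrative):

# undo restores old values scanning backwards; redo reapplies new values
# scanning forwards. Both are idempotent: running them twice leaves the
# database in the same state as running them once.

def undo(txn, log, db):
    for t, item, old, new in reversed(log):   # backwards from the last record
        if t == txn:
            db[item] = old

def redo(txn, log, db):
    for t, item, old, new in log:             # forwards from the first record
        if t == txn:
            db[item] = new

db = {"A": 950, "B": 2050}
log = [("T0", "A", 1000, 950), ("T0", "B", 2000, 2050)]
undo("T0", log, db)
undo("T0", log, db)                           # idempotent: same effect again
print(db)                                     # {'A': 1000, 'B': 2000}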
Immediate Database Modification
[Figure omitted: the log as it appears at three instants of time]
The recovery system reads the log backwards from the end to the last checkpoint. It maintains two lists, an undo-list and a redo-list.
If the recovery system sees both <Tn start> and <Tn commit> for a transaction, or just <Tn commit> (the start record lies before the checkpoint), it puts the transaction in the redo-list.
If it sees <Tn start> but no commit or abort record, it puts the transaction in the undo-list.
All transactions in the undo-list are undone and their log records are discarded; all transactions in the redo-list are then redone using their log records.
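A sketch of that backward classification pass, assuming log records are simple (transaction, action) tuples (the record format and names are illustrative):

# One backward pass over the log: a transaction with both <start> and
# <commit> goes to the redo-list; one with <start> but no <commit>/<abort>
# goes to the undo-list.

def classify(log):
    undo_list, redo_list, finished = [], [], set()
    for txn, action in reversed(log):        # read backwards from the end
        if action in ("commit", "abort"):
            finished.add(txn)
            if action == "commit":
                redo_list.append(txn)
        elif action == "start" and txn not in finished:
            undo_list.append(txn)
    return undo_list, redo_list

log = [("T0", "start"), ("T0", "commit"), ("T1", "start")]
print(classify(log))                         # (['T1'], ['T0'])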
Shadow Paging
Shadow paging is an alternative to log-based recovery; this scheme
is useful if transactions execute serially.
Idea: maintain two page tables during the lifetime of a transaction –
the current page table, and the shadow page table.
Store the shadow page table in nonvolatile storage, such that state of
the database prior to transaction execution may be recovered.
Shadow page table is never modified during execution.
To start with, both the page tables are identical. Only current page
table is used for data item accesses during execution of the
transaction.
Whenever any page is about to be written for the first time:
A copy of this page is made onto an unused page (this copy becomes the new current page).
The current page table is then made to point to the copy.
The update is performed on the copy.
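A toy Python sketch of the copy-on-first-write step, with the page tables as plain dictionaries (all structures are illustrative stand-ins for disk-resident tables):

pages = {0: "old A", 1: "old B"}        # disk pages, keyed by page number
shadow = {"A": 0, "B": 1}               # shadow page table (never modified)
current = dict(shadow)                  # current page table (starts identical)

def write_page(name, data):
    if current[name] == shadow[name]:   # first write: copy to an unused page
        new_pno = max(pages) + 1
        pages[new_pno] = pages[current[name]]
        current[name] = new_pno         # current table now points to the copy
    pages[current[name]] = data         # the update is performed on the copy

write_page("A", "new A")
print(pages[shadow["A"]])               # 'old A' - shadow state is untouched
print(pages[current["A"]])              # 'new A' - current state sees the update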
Shadow Page Table
[Figure omitted: current and shadow page tables pointing to database pages]
Shadow Paging
To commit a transaction:
Flush all modified pages in main memory to disk
Output current page table to disk
Make the current page table the new shadow page table, as follows:
keep a pointer to the shadow page table at a fixed (known) location on disk.
to make the current page table the new shadow page table, simply update the pointer to point to the current page table on disk.
Once pointer to shadow page table has been written, transaction is
committed.
No recovery is needed after a crash — new transactions can start
right away, using the shadow page table.
Pages not pointed to from current/shadow page table should be freed
(garbage collected).
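A sketch of the commit sequence, using an atomic file rename as a stand-in for the single atomic pointer write at a fixed disk location (the paths, file layout, and names are all illustrative assumptions):

import json, os, tempfile

def commit(current_table, table_dir, ptr_path):
    # 1-2. flush modified pages (elided here) and output the current page table
    table_path = os.path.join(table_dir, "current_table.json")
    with open(table_path, "w") as f:
        json.dump(current_table, f)
        f.flush(); os.fsync(f.fileno())
    # 3. update the fixed pointer in one atomic step: rename is atomic on
    # POSIX, so the pointer names either the old table or the new one,
    # never a half-written mixture. This write is the commit point.
    tmp = ptr_path + ".tmp"
    with open(tmp, "w") as f:
        f.write(table_path)
        f.flush(); os.fsync(f.fileno())
    os.rename(tmp, ptr_path)

d = tempfile.mkdtemp()
commit({"A": 2, "B": 1}, d, os.path.join(d, "ptr"))
print(open(os.path.join(d, "ptr")).read())   # pointer now names the new table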
Shadow Paging
Advantages of shadow-paging over log-based schemes
no overhead of writing log records.
recovery is trivial.
Disadvantages:
Copying the entire page table is very expensive.
Can be reduced by using a page table structured like a B+-tree.
No need to copy entire tree, only need to copy paths in the tree
that lead to updated leaf nodes.
Commit overhead is high even with above extension.
Need to flush every updated page, and page table.
Data gets fragmented (related pages get separated on disk).
After every transaction completion, the database pages containing old
versions of modified data need to be garbage collected.
Hard to extend the algorithm to allow transactions to run concurrently; log-based schemes are easier to extend.
Thank You