Distributed Database Systems - Chhanda Ray


Chapter 7. Distributed Transaction Management
ACID Properties of Transactions
A transaction has four properties that lead to the consistency and reliability of a database
management system. These are atomicity, consistency, isolation and durability.
Atomicity Atomicity refers to the fact that each transaction is a single logical unit of
work in a database environment that may consist of a number of operations. Thus, either
all operations of a transaction are to be completed successfully, or none of them are
carried out at all. This property is known as all-or-nothing property of a transaction. It is
the responsibility of the DBMS to maintain atomicity even when failures occur in the
system. If the execution of a transaction is interrupted by any sort of failures, the DBMS
will determine what actions should be performed with that transaction to recover from
the failed state. To maintain atomicity during failures, either the system should complete
the remaining operations of the transaction or it should undo all the operations that
have already been executed by the transaction before the failure.

The execution of a transaction may be interrupted owing to a variety of reasons such as bad
input data, deadlock, system crash, processor failures, power failure, or media and
communication links failures (in case of distributed system). The recovery after a transaction
failure is divided into two different categories based on the different types of failures that
can occur in a database environment. The activity of ensuring transaction atomicity in the
case of transaction aborts due to failures caused by bad input data, deadlock and other
factors (except system crash and hardware failures) is called transaction recovery. On the
other hand, the activity of preserving transaction atomicity in the case of system crash or
hardware failures is called crash recovery.

Consistency Referring to the correctness of data, this property states that a transaction must
transform the database from one consistent state to another consistent state. It is the
responsibility of both the semantic data controller (integrity manager) and the concurrency
control mechanisms to maintain consistency of the data. The semantic data controller can
ensure consistency by enforcing all the constraints that have been specified in the database
schema, such as integrity and enterprise constraints. Similarly, the responsibility of a
concurrency control mechanism is to disallow transactions from reading or updating “dirty
data”. Dirty data refers to the state of the data at which the data have been updated by a
transaction, but the transaction has not yet committed. (The details of distributed
concurrency control mechanisms are discussed in Chapter 8).
A classification based on dirty data has been defined [Gray et al., 1976], which groups
transactions into four degrees of consistency, listed in the following.
Degree 3: A transaction T1 observes degree 3 consistency if it is restricted by the following
rules:
T1 does not overwrite dirty data of other transactions.
T1 does not commit any write operation until it completes all its write operations, that is,
until the end of the transaction.
T1 does not read dirty data from other transactions.
Other transactions in the system do not read any dirty data updated by T1 before T1
completes.
Degree 2: A transaction T1 observes degree 2 consistency if it is restricted by the following
rules.
T1 does not overwrite dirty data of other transactions.
T1 does not commit any write operation until it completes all its write operations, that is,
until the end of the transaction.
T1 does not read dirty data from other transactions.

Degree 1: A transaction T1 observes degree 1 consistency if it is restricted by the following
rules.
T1 does not overwrite dirty data of other transactions.
T1 does not commit any write operation until it completes all its write operations, that is,
until the end of the transaction.
Degree 0: A transaction T1 observes degree 0 consistency if it is restricted by the following
rule.
T1 does not overwrite dirty data of other transactions.

Thus, it is clear that a higher degree of consistency covers all the lower levels. These
different degrees of consistency provide flexibility to application developers to define
transactions that can operate at different levels.
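The cumulative relationship among these degrees can be sketched in a few lines of illustrative Python (a sketch for intuition, not part of the original formalization):

```python
def degree_rules(degree):
    """Return the restrictions a transaction observes at a given consistency
    degree (after Gray et al., 1976). Each degree includes all the rules of
    the lower degrees."""
    rules = []
    if degree >= 0:
        rules.append("does not overwrite dirty data of other transactions")
    if degree >= 1:
        rules.append("does not commit writes before the end of the transaction")
    if degree >= 2:
        rules.append("does not read dirty data of other transactions")
    if degree >= 3:
        rules.append("its dirty data is not read by others before it completes")
    return rules

# A higher degree of consistency covers all the lower levels.
assert set(degree_rules(1)) <= set(degree_rules(2)) <= set(degree_rules(3))
```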

Isolation. According to this property, transactions can execute independently in a database
environment; that is, they are not dependent on each other. In other words, in a database
environment, transactions are isolated from one another and the partial effect of incomplete
transactions should not be visible to other transactions. This isolation property of
transactions is ensured by the concurrency control mechanism of a DBMS. Consequently, no
transaction can read or modify data that is being modified by another transaction. If this
property is not maintained, two problems can arise in the database: lost updates and
cascading aborts caused by uncommitted dependencies.
Durability. Durability refers to the fact that the effects of a successfully completed
(committed) transaction are permanently recorded in the database and will not be affected
by a subsequent failure. This property ensures that once a transaction commits, the changes
are durable and cannot be erased from the database. It is the responsibility of the recovery
subsystem of a DBMS to maintain durability property of transactions. The recovery
manager determines how the database is to be recovered to a consistent state (database
recovery) in which all the committed actions are reflected.

Objectives of Distributed Transaction Management
This section introduces the objectives of a transaction management subsystem of a
distributed DBMS. The responsibilities of the transaction manager in a distributed DBMS are
the same as those of a corresponding subsystem in a centralized database system, but its task is
complicated owing to fragmentation, replication and the overall distributed nature of the
database. The consistency of the database should not be violated because of transaction
executions. Therefore, the primary objective of a transaction manager is to execute
transactions in an efficient way in order to preserve the ACID properties of transactions
(local as well as global transactions) irrespective of concurrent execution of transactions and
system failures. At the same time, the efficiency of transaction executions should be
improved to meet the performance criteria. In addition, a good transaction management
scheme must consider the following issues.
CPU and main memory utilization should be improved. Most of the typical database
applications spend much of their time waiting for I/O operations rather than on
computations. In a large system, the concurrent execution of these I/O bound
applications can turn into a bottleneck in main memory or in CPU time utilization. To
alleviate this problem, that is, to improve CPU and main memory utilization, a
transaction manager should adopt specialized techniques to deal with specific database
applications. This aspect is common to both centralized and distributed DBMSs.
Response time should be minimized. To improve the performance of transaction
executions, the response time of each individual transaction must be considered
and should be minimized. In the case of distributed applications, it is very difficult
to achieve an acceptable response time owing to the additional time required to
communicate between different sites.
Availability should be maximized. Although the availability in a distributed system is
better than that in a centralized system, it must be maximized for transaction recovery
and concurrency control in distributed databases. The algorithms implemented by the
distributed transaction manager must not block the execution of those transactions that
do not strictly need to access a site that is not operational.
Communication cost should be minimized. In a distributed system an additional
communication cost is incurred for each distributed or global application, because a
number of message transfers are required between sites to control the execution of a
global application. These messages are not only used to transfer data,
but are required to control the execution of the application. Preventative measures
should be adopted by the transaction manager to minimize the communication cost.

A Model for Transaction Management in a Distributed System
This section represents an abstract model for transaction management in a distributed
database environment. In a distributed system, transactions are classified into two different
categories: local transactions and global transactions (or distributed transactions). If the
data requirement of a transaction can be fulfilled from the local site, it is called a local
transaction. Local transactions access data only from local sites. On the other hand, global
transactions access data from remote sites or multiple sites. It is obvious that the
transaction management in a distributed DBMS is more complicated than in a centralized
DBMS, as the distributed DBMS must ensure the atomicity of the global transaction as well
as of each component subtransaction executed at the local sites. This complicated task is
performed by four high-level interconnected modules of the DBMS. These are transaction
manager, concurrency control manager, recovery manager and buffer manager. The
transaction manager coordinates transactions on behalf of application programs by
communicating with the scheduler and implements a particular strategy for concurrency
control. The responsibility of the concurrency control manager is to maximize
concurrency, without allowing concurrently executing transactions to interfere with one
another, and thereby maintain the consistency of the database as well as the isolation
property of the transactions. The recovery manager preserves the database in a consistent
state in case of failures.
The buffer manager manages the efficient transfer of data between disk storage and
main memory.
In a distributed DBMS, all these modules exist in each local DBMS. In addition, a global
transaction manager or transaction coordinator is required at each site to control the execution
of global transactions as well as of local transactions initiated at that site. Therefore, an
abstract model of transaction management at each site of a distributed system consists of two
submodules: the transaction manager and the transaction coordinator.
Transaction manager – The transaction manager at each site manages the execution of the
transactions that access data stored at that local site. Each such transaction may be a local
transaction or part of a global transaction. The structure of the transaction manager is similar to
that of its counterpart in a centralized system, and it is responsible for the following:

Maintaining a log for recovery purposes in case of failures.
Implementing an appropriate concurrency control mechanism to coordinate the
concurrent execution of transactions executing at that local site.
Transaction coordinator – The transaction coordinator at each site in a distributed system
coordinates the execution of both local and global transactions initiated at that site. This
component or module is not required in centralized DBMSs, as there is only one site in a
centralized system. The transaction coordinator at each site performs the following tasks to
coordinate the execution of transactions initiated at that local site.
It starts the execution of the transactions initiated at that site.
It breaks the transactions into a number of subtransactions and distributes these
subtransactions to the appropriate sites for execution.
It coordinates the termination of the transactions, which may result in the transactions
being committed at all sites or aborted at all sites.
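The coordinator's second task above, breaking a global transaction into per-site subtransactions via the global system catalog, can be sketched as follows; the catalog layout and operation tuples are illustrative assumptions, not the book's notation:

```python
# Hypothetical fragment-to-site catalog; a real global system catalog also
# records replication and fragmentation details.
CATALOG = {"EMP": "site1", "ACCOUNT": "site2", "BRANCH": "site2"}

def split_into_subtransactions(operations, catalog):
    """Group a global transaction's operations by the site that stores the
    fragment each operation touches."""
    subtransactions = {}
    for op, fragment in operations:
        site = catalog[fragment]
        subtransactions.setdefault(site, []).append((op, fragment))
    return subtransactions

global_txn = [("read", "EMP"), ("write", "ACCOUNT"), ("read", "BRANCH")]
subs = split_into_subtransactions(global_txn, CATALOG)
assert subs == {"site1": [("read", "EMP")],
                "site2": [("write", "ACCOUNT"), ("read", "BRANCH")]}
```

The resulting subtransactions are distributed to site1 and site2, where they execute in parallel as agents on behalf of the application.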
The structure of this model is depicted in figure 7.1.

Figure 7.1. A Model for Transaction Management at Each Site in a DDBMS

A transaction is considered as a part of an application. When a user requests the
execution of an application (which may be local or global), the application issues
a begin_transaction primitive. All actions that are performed by the application after that are
to be considered part of the same transaction until a commit or an abort primitive is issued. In
some systems, all these primitives are implicitly associated with the application; thus, it is not
necessary to define all these primitives explicitly. To execute a global application generated at
site S1, the transaction coordinator of site S1 initiates the transaction and breaks the transaction
into a number of subtransactions. It then involves multiple sites by consulting the global
system catalog for parallel execution of these subtransactions at different sites. When these
subtransactions are distributed over multiple local sites and perform some tasks on behalf of
the application, they are called agents. Agents reside at multiple sites and communicate with
each other via message passing. The transaction manager at a local site maintains the log
when a local transaction or part of a global transaction is executed at that local site. It also
controls the concurrent execution of transactions executing at that site. The results retrieved
from the parallel execution of these subtransactions at multiple sites are integrated by the
transaction coordinator of site S1, and finally the transaction is terminated.
Chapter 8. Distributed Concurrency Control
Concurrency Control Anomalies
The goal of concurrency control is to prevent interference among users who are simultaneously
accessing a database. This section introduces the different anomalies that can arise owing to
concurrent accesses of data in a multi-user centralized database environment as well as in a
distributed database environment. These are lost update anomaly, uncommitted dependency
or dirty read anomaly, inconsistent analysis anomaly, non-repeatable or fuzzy read
anomaly and phantom read anomaly. In addition, in a DDBMS an extra problem can arise,
namely, multiple-copy consistency problem, as data items may be replicated in distributed
databases. [Some of these anomalies were already introduced
in Chapter 7.]

Lost update anomaly – This anomaly can occur when an apparently successfully
completed update operation made by one user (transaction) is overridden by another
user (transaction). This anomaly is illustrated in example 8.1.
Example 8.1.

Consider the example of an online electronic funds transfer system accessed via
remote automated teller machines (ATMs). Assume that one customer (T1) is
withdrawing $500 from an account with balance B, initially $2,000, while another
customer (T2) is depositing an amount $600 into the same account via ATM from a
different place. If these two tasks are executed serially, then the balance B will be
$2,100. Further assume that both the tasks are started nearly at the same time and both
read the same value $2,000. Due to concurrent access to the same data item, when
customer T1 is decrementing the value of B to $1,500, customer T2 is incrementing
the value of B to $2,100. Thus, update to the data item B made by customer T1 will be
overridden by the customer T2. Here, the consistency of the data is violated owing to
this anomaly, known as lost update anomaly.
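The interleaving in example 8.1 can be replayed directly (amounts taken from the text; the variable names are illustrative):

```python
balance = 2000                   # initial balance B from the text

t1_read = balance                # T1 reads $2,000
t2_read = balance                # T2 reads $2,000 concurrently

balance = t1_read - 500          # T1 writes $1,500
balance = t2_read + 600          # T2 writes $2,600, overwriting T1's update

assert balance == 2600           # T1's withdrawal is lost
assert 2000 - 500 + 600 == 2100  # a serial execution would give $2,100
```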
Uncommitted dependency or dirty read anomaly – This problem occurs when one
transaction allows other transactions to read its data before it has committed and then
decides to abort. The dirty read anomaly is explained in example 8.2.
Example 8.2.

Assume that two transactions T1 and T2 are concurrently accessing an Employee


database. The transaction T1 is updating the salary of an employee by 10 percent
whose basic salary is salx, initially $4,000, while transaction T2 is calculating tax
deduction for the same employee depending on his basic salary, salx. Transaction T1
changes the value of salx to $4,400, allows transaction T2 to read this value before
transaction T1 has committed, and then decides to abort. Since transaction T1 is
aborted, salx should be restored to its original value $4,000. However, transaction T2
has read the new value of salx and calculates the tax deduction depending on this new
value $4,400 instead of $4,000. The value of salx read by the transaction T2 is called
dirty data, which leads to uncommitted dependency or dirty read anomaly. This
problem can be avoided by preventing transaction T2 from reading the new value
of salx updated by transaction T1, until the decision has been made to either commit
or abort transaction T1.
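Example 8.2 can be replayed in the same style (the 10 percent raise and the amounts are from the text; the tax rate is an illustrative assumption):

```python
committed_salx = 4000

salx = committed_salx * 110 // 100   # T1 updates salx to $4,400 (uncommitted)
t2_reads = salx                      # T2 reads the dirty value
salx = committed_salx                # T1 aborts; salx is restored to $4,000

tax = t2_reads * 20 // 100           # hypothetical 20% tax rate, for illustration
assert t2_reads == 4400              # T2's tax is based on a value that never
assert salx == 4000                  # existed in any committed database state
assert tax == 880
```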
Inconsistent analysis anomaly – The problem of inconsistent analysis occurs when a
transaction reads several values from the database but a second transaction updates
some of them during the execution of the first.
Example 8.3.

Let us consider the example of a banking system in which two transactions T3 and
T4 are accessing the database concurrently. The transaction T3 is calculating the
average balance of all customers of a particular branch while the transaction T4 is
depositing $1,000 into one customer account with balance balx at the same time.
Hence, the average balance calculated by the transaction T3 is incorrect, because the
transaction T4 incremented and updated the balance of one customer during the
execution of the transaction T3. This anomaly, known as inconsistent analysis
anomaly, can be avoided by preventing the transaction T4 from reading and updating
the value of balx until the transaction T3 completes its execution.
Non-repeatable or fuzzy read anomaly – Fuzzy read or non-repeatable read
anomaly occurs when one transaction reads the value of a data item that is
subsequently modified or deleted by another transaction. If the first transaction
attempts to re-read the data item either that data item may not be found or it may
read a different value.
Example 8.4.

Assume that in the employee database, one transaction T1 reads the salary of an
employee while another transaction T2 is updating the salary of the same employee
(or the same employee record is deleted from the employee database by the
transaction T2) concurrently. Now, if the transaction T1 attempts to re-read the value
of the salary of the same employee, it will read a different value (or that record will be
not found) since employee database is updated by the transaction T2. Thus, two read
operations within the same transaction T1 return different values. This anomaly occurs
due to concurrent access of the same data item, known as fuzzy read anomaly.
Phantom read anomaly – This anomaly occurs when a transaction is performing some
operation on the database based on a selection predicate, while another transaction inserts
new tuples satisfying that predicate into the same database. This is known as phantom
read. For example, assume that in the employee database, one transaction T1 retrieves
all employees belonging to the R&D department while another transaction T2 inserts
new employees into that department. Hence, if the transaction T1 re-executes the same
operation, then the retrieved data set will contain additional (phantom) tuples that have
been inserted by the transaction T2 in the meantime.
Besides all the above anomalies, multiple-copy consistency problem can arise in a
distributed database system as a result of data distribution. This problem occurs when
data items are replicated and stored at different locations in a DDBMS. To maintain the
consistency of the global database, when a replicated data item is updated at one site, all
other copies of the same data item must be updated. If any copy is not updated, the
database becomes inconsistent. The updates of the replicated data items are carried out
either synchronously or asynchronously to preserve the consistency of the data.

Distributed Serializability
A transaction consists of a sequence of read and write operations together with some
computation steps that represents a single logical unit of work in a database environment. All
the operations of a transaction are units of atomicity; that is, either all the operations should
be completed successfully or none of them are carried out at all. Ozsu has defined a formal
notation for the transaction concept. According to this formalization, a transaction Ti is a
partial ordering of its operations and the termination condition. The partial order P = {Σ, α}
defines an ordering among the elements of Σ (called the domain), where Σ consists of the
read and write operations and the termination condition (abort, commit) of the transaction Ti,
and α indicates the execution order of these operations within Ti.
The execution sequence or execution ordering of operations of a transaction is called the
schedule. Let E denote an execution sequence of transactions T1, T2, . . ., Tn. E is a serial
execution or serial schedule if no transactions execute concurrently in E; that is, each
transaction is executed to completion before the next one begins. Every serial execution or
serial schedule is deemed to be correct, because the properties of transactions imply that a
serial execution terminates properly and preserves database consistency. On the other hand,
an execution is said to be a non-serial execution or non-serial schedule if the operations
from a set of concurrent transactions execute simultaneously. A non-serial schedule (or
execution) is serializable if it is computationally equivalent to a serial schedule, that is, if it
produces the same result and has the same effect on the database as some serial execution.
To prevent data inconsistency, it is essential to guarantee serializability of concurrent
transactions.

In serializability, the ordering of read and write operations is important. In this context, it is
necessary to introduce the concept of conflicting transactions. If two transactions perform
read or write operations on different data items, then they are not conflicting, and hence the
execution order is not important. If two transactions perform only read operations on the same
data item, then they are non-conflicting. Two transactions are said to be conflicting if they access
the same data item concurrently, and at least one transaction performs a write operation.
Thus, a concurrent read-read operation by two transactions is non-conflicting, but concurrent
read-write, write-read and write-write operations by two transactions are conflicting. The
execution order is important if there is a conflicting operation caused by two transactions.
If concurrent accesses of data items are allowed in a database environment, then the schedule
may be non-serial, which indicates that the consistency of data may be violated. To allow
maximum concurrency and to preserve the consistency of data, it is sometimes possible to
generate a schedule from a non-serial schedule that is equivalent to a serial schedule or serial
execution order by swapping the order of the non-conflicting operations. If schedule S1 is
derived from a non-serial schedule S2 by swapping the execution order of non-conflicting
operations in S2, and if S1 produces the same output as that of a serial schedule S3 with the
same set of transactions, then S1 is called a serializable schedule. This type of serializability
is known as conflict serializability, and schedule S1 is said to be conflict equivalent to
schedule S2. A conflict serializable schedule orders any conflicting operations in the same
way as a serial execution. A directed precedence graph or a serialization graph can be
produced to test for conflict serializability.
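A precedence-graph test can be sketched as follows (an illustrative implementation, not the book's code): a schedule is a list of (transaction, operation, item) tuples in execution order, an edge Ti → Tj is added whenever a conflicting operation of Ti precedes one of Tj, and the schedule is conflict serializable exactly when the graph is acyclic.

```python
def is_conflict_serializable(schedule):
    """schedule: list of (transaction, op, item) tuples, op is 'r' or 'w'.
    Build the serialization graph and report whether it is acyclic."""
    edges = set()
    for i, (ti, op_i, x) in enumerate(schedule):
        for tj, op_j, y in schedule[i + 1:]:
            if ti != tj and x == y and 'w' in (op_i, op_j):
                edges.add((ti, tj))       # ti's conflicting op precedes tj's

    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)

    # Detect a cycle with depth-first search.
    visiting, done = set(), set()
    def has_cycle(u):
        visiting.add(u)
        for v in graph.get(u, ()):
            if v in visiting or (v not in done and has_cycle(v)):
                return True
        visiting.discard(u)
        done.add(u)
        return False

    return not any(has_cycle(u) for u in graph if u not in done)

# T1 and T2 conflict in both directions on x -> cycle -> not serializable.
s1 = [('T1', 'r', 'x'), ('T2', 'w', 'x'), ('T1', 'w', 'x')]
# Conflicts point only T1 -> T2 -> equivalent to the serial schedule T1; T2.
s2 = [('T1', 'r', 'x'), ('T1', 'w', 'x'), ('T2', 'w', 'x')]

assert not is_conflict_serializable(s1)
assert is_conflict_serializable(s2)
```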
Serializability theory for centralized databases can be extended in a straightforward manner
for distributed non-replicated databases. In a distributed system, each local site maintains a
schedule or execution order of transactions or subtransactions (part of a global transaction)
running at that site, called local schedule. The global schedule or global execution order is
the union of all local schedules in a non-replicated distributed database system. Hence, if
each local site maintains serializability and the local serialization orders are identical, then
the global schedule is also serializable. Replication of data items in a distributed system
adds extra complexity in maintaining global serializability or distributed serializability. In
this case also, it is possible that local schedules are serializable, but mutual consistency of
the replicated data items may not be preserved owing to the conflicting operations on the
same data item at multiple sites. Thus, a distributed schedule or global schedule (the union
of all local schedules) is said to be distributed serializable if the execution order at each
local site is serializable and the local serialization orders are identical. This requires that all
subtransactions appear in the same order in the equivalent serial schedule at all sites.

Formally, distributed serializability can be defined as follows. Consider S is the union of all
local serializable schedules S1, S2, . . ., Sn in a distributed system. Now, the
global schedule S is said to be distributed serializable if, for each pair of conflicting
operations Oi and Oj from distinct transactions Ti and Tj executing at different sites,
Oi precedes Oj in the total ordering S if and only if Ti precedes Tj in all of the local
schedules where they appear together. To attain serializability, a DDBMS must incorporate
synchronization techniques that control the relative ordering of conflicting operations.
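The pairwise condition stated above — every pair of transactions appearing together at several sites is ordered the same way everywhere — can be checked with a short sketch (the encoding of per-site serialization orders as lists of transaction ids is an illustrative assumption):

```python
def is_distributed_serializable(local_orders):
    """local_orders: one serialization order per site, each a list of
    transaction ids. Checks the pairwise condition from the text: sites
    must agree on the relative order of every pair they share."""
    seen = {}   # (Ti, Tj) -> relative order first observed
    for order in local_orders:
        pos = {t: i for i, t in enumerate(order)}
        for ti in order:
            for tj in order:
                if ti < tj:
                    rel = pos[ti] < pos[tj]
                    if seen.setdefault((ti, tj), rel) != rel:
                        return False   # two sites disagree on Ti vs Tj
    return True

# All sites order T1 before T2 -> consistent.
assert is_distributed_serializable([['T1', 'T2'], ['T1', 'T2', 'T3']])
# Site 1 says T1 < T2, site 2 says T2 < T1 -> not distributed serializable.
assert not is_distributed_serializable([['T1', 'T2'], ['T2', 'T1']])
```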

Example 8.5.

Let us consider a simple transaction Ti that consists of the following operations:

Transaction Ti: Read(a);
Read(b);
a := a + b;
Write(a);
Commit;

The partial ordering of the above transaction Ti can be formally represented as P = {Σ, α},
Where Σ = {R(a), R(b), W(a), C} and
α = {(R(a), W(a)), (R(b), W(a)), (W(a), C), (R(a), C), (R(b), C)}.
Here, the ordering relation specifies the relative ordering of all operations with respect to
the termination condition. The partial ordering of a transaction facilitates deriving the
corresponding directed acyclic graph (DAG) for the transaction. The DAG of the above
transaction Ti is illustrated in figure 8.1.

Figure 8.1. Directed Acyclic Graph Representation of Transaction Ti

Centralized 2PL
In the centralized 2PL method, the lock manager or scheduler is a central component, and
hence a single site is responsible for managing all activities of the lock manager for the
entire distributed system. Before accessing any data item at any site, appropriate locks must
be obtained from the central lock manager. If a global transaction Ti is initiated at site Si of
the distributed system, then the centralized 2PL for that global transaction should be
implemented in the following way.
The transaction manager at the site Si (called transaction coordinator) partitions the global
transaction into several subtransactions using the information stored in the global system
catalog. Thus, the transaction coordinator is responsible for maintaining consistency of data. If
the data items are replicated, then it is also the responsibility of the transaction coordinator to
ensure that all replicas of the data item are updated. Therefore, for a write operation, the
transaction coordinator requests exclusive locks on all replicas of the data item that are
stored at different sites. However, for a read operation, the transaction coordinator can select
any one of the replicas of the data item that are available, preferably at its own site. The local
transaction managers of the participating sites request and release locks from/to the
centralized lock manager following the normal rules of the 2PL protocol. Participating sites
are those sites that are involved in the execution of the global transaction (subtransactions).
After receiving the lock request, the centralized lock manager checks whether that lock
request is compatible with the existing locking status or not. If it is compatible, the lock
manager sends a message to the corresponding site that the lock has been granted; otherwise,
the lock request is put in a queue until it can be granted.
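The grant-or-queue decision of the centralized lock manager can be sketched as follows, using the standard shared/exclusive compatibility matrix; the data structures and returned message strings are illustrative assumptions:

```python
# Standard lock compatibility: only shared (S) locks coexist.
COMPATIBLE = {('S', 'S'): True, ('S', 'X'): False,
              ('X', 'S'): False, ('X', 'X'): False}

class CentralLockManager:
    def __init__(self):
        self.held = {}     # item -> list of (transaction, mode) holding locks
        self.queue = {}    # item -> list of waiting (transaction, mode)

    def request(self, txn, item, mode):
        """Grant the lock if compatible with all current holders;
        otherwise queue the request until it can be granted."""
        holders = self.held.setdefault(item, [])
        if all(COMPATIBLE[(held_mode, mode)] for _, held_mode in holders):
            holders.append((txn, mode))
            return 'granted'           # lock-granted message to the site
        self.queue.setdefault(item, []).append((txn, mode))
        return 'queued'

lm = CentralLockManager()
assert lm.request('T1', 'x', 'S') == 'granted'
assert lm.request('T2', 'x', 'S') == 'granted'   # shared locks are compatible
assert lm.request('T3', 'x', 'X') == 'queued'    # exclusive conflicts with S
```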

In some systems, a slight variation of this method is followed. Here the transaction
coordinator sends lock requests to the centralized lock manager on behalf of the participating
sites. In this case, a global update operation that involves n sites requires a minimum of 2n +
3 messages to implement the centralized 2PL method. These are 1 lock request from the
transaction coordinator, 1 lock grant message from the centralized lock manager, n update
messages from the transaction coordinator, n acknowledgement messages from
the n participating sites and 1 unlock request from the transaction coordinator, as illustrated
in figure 8.3.

Figure 8.3. Communicating Messages in Centralized 2PL Method

The main advantage of the centralized 2PL method in DDBMSs is that it is a straightforward
extension of the 2PL protocol for centralized DBMSs; thus, it is less complicated and
implementation is relatively easy. In this case, the deadlock detection can be handled easily,
because the centralized lock manager maintains all the locking information. Hence, the
communication cost is also relatively low. In the case of DDBMSs, the disadvantages of
implementing the centralized 2PL method are system bottlenecks and lower reliability. For
example, as all lock requests go to the centralized lock manager, that central site may
become a bottleneck to the system. Similarly, the failure of the central site causes a major
failure of the system; thus, it is less reliable. This method also hampers the local autonomy.
Distributed 2PL
Distributed 2PL method implements the lock manager at each site of a distributed system. Thus,
the lock manager at each site is responsible for managing locks on data items that are stored
locally. If the database is not replicated, then distributed 2PL becomes the same as primary copy
2PL. If the database is replicated, then distributed 2PL implements a read-one-write-all
(ROWA) replica control protocol. In ROWA replica control protocol, any copy of a replicated
data item can be used for a read operation, but for a write operation, all copies of the replicated
data item must be exclusively locked before performing the update operation.
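The ROWA rule can be stated as a small sketch (the replica map and the choice of the first listed replica for reads are illustrative placeholders, not a real API):

```python
REPLICAS = {'x': ['site1', 'site2', 'site3']}   # hypothetical replica map

def sites_to_lock(item, operation, replicas):
    """ROWA: a read may use any single replica (here, the first listed);
    a write must exclusively lock every replica of the item."""
    if operation == 'read':
        return [replicas[item][0]]      # any one copy suffices for a read
    return list(replicas[item])         # all copies for a write

assert sites_to_lock('x', 'read', REPLICAS) == ['site1']
assert sites_to_lock('x', 'write', REPLICAS) == ['site1', 'site2', 'site3']
```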

In distributed 2PL, when a global transaction is initiated at a particular site, the transaction
coordinator (the transaction manager of that site is called transaction coordinator) sends lock
requests to lock managers of all participating sites. In response to the lock requests the lock
manager at each participating site can send a lock granted message to the transaction
coordinator. However, the transaction coordinator does not wait for the lock granted
message in some implementations of the distributed 2PL method. The operations that are to
be performed at a participating site are passed by the lock manager of that site to the
corresponding transaction manager instead of the transaction coordinator. At the end of the
operation the transaction manager at each participating site can send the corresponding
message to the transaction coordinator. In an alternative approach, the transaction manager at
a participating site can also pass the “end of operation” message to its own lock manager,
who can then release the locks and inform the transaction coordinator. The communication
between the participating sites and the transaction coordinator when executing a global
transaction using distributed 2PL is depicted in figure 8.4.

Figure 8.4. Communicating Messages in Distributed 2PL Method

In the distributed 2PL method, locks are handled in a decentralized manner; thus, it overcomes
the drawbacks of the centralized 2PL method and ensures better reliability than both
centralized 2PL and primary copy 2PL. However, this approach has two main disadvantages:
deadlock handling is more complex owing to the presence of multiple lock managers, and the
communication cost is much higher than in primary copy 2PL, as all copies of a data item
must be locked before an update operation. In the distributed 2PL method, a global
update operation that involves n participating sites may require a minimum of 5n messages.
These are n lock requests, n lock granted messages, n update messages, n acknowledgement
messages and n unlock requests.
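The message tally above can be made concrete with a trivial illustration (a sketch of my own for counting purposes, not part of any protocol API):

```python
def distributed_2pl_message_count(n):
    """Minimum messages for a global update over n participating sites
    under distributed 2PL, itemized as in the text."""
    messages = {
        "lock_request": n,       # coordinator -> each lock manager
        "lock_granted": n,       # each lock manager -> coordinator
        "update": n,             # coordinator -> each participant
        "acknowledgement": n,    # each participant -> coordinator
        "unlock_request": n,     # coordinator -> each lock manager
    }
    return sum(messages.values())   # 5n in total
```

So an update involving four sites needs at least 20 messages, versus fewer round trips in primary copy 2PL, where only one lock manager per data item is contacted.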

Majority Locking Protocol


Chapter 9. Distributed Deadlock Management
Centralized Deadlock Detection
In Centralized Deadlock Detection method, a single site is chosen as Deadlock Detection
Coordinator (DDC) for the entire distributed system. The DDC is responsible for
constructing the GWFG for the system. Each lock manager in the distributed database
system transmits its LWFG to the DDC periodically. The DDC constructs the GWFG from
these LWFGs and checks for cycles in it. The occurrence of a global deadlock situation is
detected if there are one or more cycles in the GWFG. The DDC must break each cycle in
the GWFG by selecting the transactions to be rolled back and restarted to recover from a
deadlock situation. The information regarding the transactions that are to be rolled back and
restarted must be transmitted to the corresponding lock managers by the deadlock detection
coordinator.
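The DDC's two tasks, merging the LWFGs into a GWFG and searching it for cycles, can be sketched in Python as follows. This is an illustrative sketch under my own representation (a wait-for graph as a dict mapping each waiting transaction to the set of transactions it waits for); it is not code from the text.

```python
def merge_lwfgs(lwfgs):
    """Merge local wait-for graphs into a global one (edge-wise union)."""
    gwfg = {}
    for lwfg in lwfgs:
        for t, waits_for in lwfg.items():
            gwfg.setdefault(t, set()).update(waits_for)
    return gwfg

def find_cycle(gwfg):
    """Return one cycle (a list of transactions) in the GWFG, or None."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {t: WHITE for t in gwfg}
    def dfs(t, path):
        colour[t] = GREY
        path.append(t)
        for u in gwfg.get(t, ()):
            if colour.get(u, WHITE) == GREY:      # back edge: cycle found
                return path[path.index(u):]
            if colour.get(u, WHITE) == WHITE:
                found = dfs(u, path)
                if found:
                    return found
        path.pop()
        colour[t] = BLACK
        return None
    for t in list(gwfg):
        if colour[t] == WHITE:
            cycle = dfs(t, [])
            if cycle:
                return cycle
    return None
```

If, say, site 1 reports T1 waiting for T2 and site 2 reports T2 waiting for T1, the merged GWFG contains the cycle T1 -> T2 -> T1, and the DDC would select a victim from it to roll back.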
To minimize the communication cost, the DDC should send the lock managers only the changes that
have to be made in the LWFGs; these changes represent the addition or removal of edges in
the LWFGs. The actual length of the period for global deadlock detection is a system design
decision, and it is a trade-off between the communication cost of the deadlock detection process
and the cost of detecting deadlocks late. The communication cost increases
if the period is shorter, whereas deadlocks in the system remain undetected for longer if
the deadlock detection period is larger.
The centralized deadlock detection approach is very simple, but it has several drawbacks. This
method is less reliable, as the failure of the central site makes deadlock detection impossible.
The communication cost is very high, as all other sites in the distributed system must send
their LWFGs to the central site. Another disadvantage of the centralized deadlock detection
technique is that false detection of deadlocks can occur, whereby the deadlock recovery
procedure may be initiated although no deadlock actually exists. In this method, unnecessary
rollbacks and restarts of transactions may also result owing to phantom deadlocks. [False
deadlocks and phantom deadlocks are discussed in detail in Section 9.4.4.]

Hierarchical Deadlock Detection


Hierarchical deadlock detection method reduces the communication overhead of
centralized deadlock detection method. With this approach, all sites in a distributed database
system are organized into a hierarchy, and a complete tree of deadlock detectors is
constructed instead of a single centralized deadlock detector. Each site in the distributed
system sends its LWFG to the deadlock detection site above it (adjacent parent node) in the
hierarchy. Thus, local deadlock detection is performed in the leaf nodes of the tree, whereas
the non-leaf nodes are responsible for detecting any deadlock situation involving all its child
nodes. The hierarchical deadlock detection method is illustrated with an example in figure
9.3.
Figure 9.3. Hierarchical Deadlock Detection

In figure 9.3, local deadlock detection is performed at the leaf nodes site1, site2, site3, site4 and site5.
The deadlock detector at site6, namely DD12, is responsible for detecting any deadlock situation
involving its child nodes site1 and site2. Similarly, site3, site4 and site5 send their LWFGs to site7,
and the deadlock detector at site7 searches for any deadlocks involving its child nodes.
A global deadlock detector exists at the root of the tree that would detect the occurrence of
global deadlock situations in the entire distributed system. Here, the global deadlock detector
resides at site8 and would detect the deadlocks involving site6 and site7.
The performance of the hierarchical deadlock detection approach depends on the
hierarchical organization of nodes in the system. This organization should reflect the
network topology and the pattern of access requests to different sites of the network. This
approach is more reliable than centralized deadlock detection, as it reduces the dependence
on a central site, and it also reduces the communication cost. However, its implementation is
considerably more complicated, particularly with the possibility of site and communication
link failures. In this approach, detection of false deadlocks can also occur.
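One way to see where a given deadlock is detected in the hierarchy: a deadlock spanning a set of sites is caught at the lowest detector whose subtree covers all of them, i.e., the sites' lowest common ancestor in the detector tree. The sketch below (my own illustration, using the tree of figure 9.3 with site8 as the root) computes that node.

```python
# Detector tree from figure 9.3: each site maps to its parent detector.
PARENT = {
    "site1": "site6", "site2": "site6",
    "site3": "site7", "site4": "site7", "site5": "site7",
    "site6": "site8", "site7": "site8", "site8": None,
}

def detecting_node(sites, parent=PARENT):
    """Lowest detector in the hierarchy that sees all the involved sites."""
    def ancestors(s):
        chain = []
        while s is not None:
            chain.append(s)
            s = parent[s]
        return chain
    chains = [ancestors(s) for s in sites]
    common = set(chains[0]).intersection(*chains[1:])
    # The first common node on any chain is the lowest common ancestor.
    for node in chains[0]:
        if node in common:
            return node
```

A deadlock between transactions at site1 and site2 is detected at site6 (by DD12), one between site3 and site5 at site7, while a deadlock spanning site2 and site4 is only visible to the global detector at the root, site8.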

Distributed Deadlock Detection


In distributed deadlock detection method, a deadlock detector exists at each site of the
distributed system. In this method, each site has the same amount of responsibility, and there
is no such distinction as local or global deadlock detector. A variety of approaches have been
proposed for distributed deadlock detection algorithms, but the most well-known and
simplified version, which is presented here, was developed by R. Obermarck in 1982.
In this approach, a LWFG is constructed for each site by the respective local deadlock
detectors. An additional external node is added to the LWFGs, as each site in the distributed
system receives the potential deadlock cycles from other sites. In the distributed deadlock
detection algorithm, the external node Tex is added to the LWFGs to indicate whether any
transaction from any remote site is waiting for a data item that is being held by a transaction
at the local site or whether any transaction from the local site is waiting for a data item that
is currently being used by any transaction at any remote site. For instance, an edge from the
node Ti to Tex exists in the LWFG, if the transaction Ti is waiting for a data item that is
already held by any transaction at any remote site. Similarly, an edge from the external node
Tex to Ti exists in the graph, if a transaction from a remote site is waiting to acquire a data
item that is currently being held by the transaction Ti at the local site. Thus, the local detector
checks for two things to determine a deadlock situation. If a LWFG contains a cycle that
does not involve the external node Tex, then it indicates that a deadlock has occurred locally
and it can be handled locally. On the other hand, a global deadlock potentially exists if the
LWFG contains a cycle involving the external node Tex. However, the existence of such a
cycle does not necessarily imply that there is a global deadlock, as the external
node Tex represents different agents.
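The local detector's two checks can be sketched as follows. This is an illustrative Python sketch under my own graph representation (a LWFG as a dict from each waiting transaction to the set of transactions it waits for, with the string "Tex" standing for the external node); it is not code from the text.

```python
def find_cycle(wfg):
    """Return one cycle in the wait-for graph, or None."""
    def dfs(node, path, on_path):
        for nxt in wfg.get(node, ()):
            if nxt in on_path:                  # closed a cycle
                return path[path.index(nxt):]
            found = dfs(nxt, path + [nxt], on_path | {nxt})
            if found:
                return found
        return None
    for start in wfg:
        cycle = dfs(start, [start], {start})
        if cycle:
            return cycle
    return None

def classify(lwfg):
    """Apply the two checks: local cycle vs. cycle through Tex."""
    cycle = find_cycle(lwfg)
    if cycle is None:
        return "no deadlock detected locally"
    if "Tex" in cycle:
        return "potential global deadlock"      # must be confirmed with remote sites
    return "local deadlock"                     # resolve at this site
```

A cycle T1 -> T2 -> T1 is a genuine local deadlock, whereas T1 -> Tex -> T1 only indicates a *potential* global deadlock that the sites must cooperate to confirm.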

The LWFGs are merged so as to determine global deadlock situations. To avoid every site
transmitting its LWFG to every other site, a simple strategy is followed here. According to this
strategy, a timestamp value is allocated to each transaction, and a rule is imposed such that a
site Si transmits its LWFG to the site Sk if a transaction, say Tk, at site Sk is waiting for a
data item that is currently being held by a transaction Ti at site Si and ts(Ti) < ts(Tk). The
site Sk then adds this information to its own LWFG and checks for cycles not involving the
external node Tex in the extended graph. If there is no such cycle, the process continues; it
may happen that the entire GWFG is constructed and no cycle is detected, in which case it is
concluded that there is no deadlock in the entire distributed system. On the other hand, if the
GWFG contains a cycle not involving the external node Tex, it is concluded that a deadlock
has occurred in the system. The distributed deadlock detection method is illustrated in figure
9.4.

Figure 9.4. Distributed Deadlock Detection


After detecting a deadlock in the system, an appropriate recovery protocol is invoked to
resolve the deadlock situation. One or more transactions are chosen as victims, and
these transactions, together with all their subtransactions, are rolled back and restarted.
The major benefit of the distributed deadlock detection algorithm is that it is potentially more
robust than the centralized or hierarchical methods. However, deadlock detection is more
complicated in this method, because no single site contains all the information that is
necessary to detect a global deadlock situation in the system; therefore, substantial inter-site
communication is required, which increases the communication overhead. Another
disadvantage of the distributed deadlock detection approach is that it is more vulnerable to
the occurrence of false deadlocks than the centralized or hierarchical methods.

Chapter 10. Distributed Recovery Management

Failures in a Distributed Database System


To design a reliable system that can recover itself from failures, it is necessary to identify the
types of failures the system has to deal with. There are various types of failures that may
occur in a system, each of which needs to be dealt with in a different manner. A distributed
database management system may suffer from all types of failures that may occur in a
centralized DBMS. The following is the list of failures that may occur in a centralized
DBMS.
Transaction failure – There are two types of errors that may cause one or more
transactions to fail: application software errors and system errors. A transaction can
no longer continue owing to logical errors in the application software that is used for
database access, such as bad input, data not found, overflow or resource limit
exceeded. Alternatively, the system may enter an undesirable state (e.g., deadlock),
resulting in the failure of one or more transactions; this is known as a system error.
System crash – A hardware malfunction, or a bug in the software (either in the
database software or in the operating system), may result in the loss of main memory,
which forces the processing of transactions to halt.
Disk failure – A disk block loses its contents as a result of a head crash, unreadable
media or a failure during a data transfer operation. This also causes the processing
of transactions to halt.
In a distributed DBMS, several different types of failures can occur in addition to the failures
listed above for a centralized DBMS. These are site failure, loss of messages, communication
link failure and network partition, as described in the following.
Site failure – In a distributed DBMS, system failures are typically referred to as site
failures, and they can occur owing to a variety of reasons as mentioned above. In this
context, only system failures that result in the loss of the contents of main memory
are considered here. A site failure may be total or partial. A partial failure indicates
the failure of some sites in the distributed system while the other sites remain
operational. A total failure indicates the simultaneous failure of all sites in the
distributed system.
Loss of messages – Handling the loss of messages, or improperly ordered messages, is
the responsibility of the underlying computer network protocol. It is assumed that these
types of failures are handled by the data communication component of the
distributed DBMS, so they are not considered here.
Communication link failure – The performance of a distributed DBMS is highly
dependent on the ability of all sites in the network to communicate reliably with one
another. Although communication technology has improved significantly, link
failures can still occur in a distributed DBMS.
Network partition – Owing to communication link failures, it may happen that the
entire network gets split into two or more partitions, known as a network partition.
In this case, all sites in the same partition can communicate with each other, but
cannot communicate with sites in the other partitions.
In some cases, it is very difficult to distinguish the type of failure that has occurred in
the system. Generally, the system identifies that a failure has occurred and initiates the
appropriate technique for recovery from the failure.

Two-Phase Commit Protocol (2PC)

Figure 10.1. Two-phase Commit Protocol
