Study Materials of Module 4 DBMS

Download as pdf or txt
Download as pdf or txt
You are on page 1of 5

Paper Name: Database Management Systems

Paper Code: PCCCS501


Study Material
(Module 4)
 Transactions:
Collections of operations that form a single logical unit of work are called transactions. A database system
must ensure proper execution of transactions despite failures—either the entire transaction executes, or none
of it does. Furthermore, it must manage concurrent execution of transactions in a way that avoids the
introduction of inconsistency. In our funds-transfer example, a transaction computing the customer’s total
balance might see the checking-account balance before it is debited by the funds-transfer transaction, but see
the savings balance after it is credited. As a result, it would obtain an incorrect result.
A transaction is a unit of program execution that accesses and possibly updates various data items. Usually,
a transaction is initiated by a user program written in a high-level data-manipulation language (typically
SQL), or programming language (for example, C++, or Java), with embedded database accesses in JDBC or
ODBC. A transaction is delimited by statements (or function calls) of the form begin transaction and end
transaction. The transaction consists of all operations executed between the begin transaction and end
transaction.
This collection of steps must appear to the user as a single, indivisible unit. Since a transaction is indivisible,
it either executes in its entirety or not at all. Thus, if a transaction begins to execute but fails for whatever
reason, any changes to the database that the transaction may have made must be undone. This requirement
holds regardless of whether the transaction itself failed (for example, if it divided by zero), the operating
system crashed, or the computer itself stopped operating. As we shall see, ensuring that this requirement is
met is difficult since some changes to the database may still be stored only in the main-memory variables of
the transaction, while others may have been written to the database and stored on disk. This “all-or-none”
property is referred to as atomicity.
Furthermore, since a transaction is a single unit, its actions cannot appear to be separated by other database
operations not part of the transaction. While we wish to present this user-level impression of transactions, we
know that reality is quite different. Even a single SQL statement involves many separate accesses to the
database, and a transaction may consist of several SQL statements. Therefore, the database system must take
special actions to ensure that transactions operate properly without interference from concurrently executing
database statements. This property is referred to as isolation.
Even if the system ensures correct execution of a transaction, this serves little purpose if the system
subsequently crashes and, as a result, the system “forgets” about the transaction. Thus, a transaction’s actions
must persist across crashes. This property is referred to as durability.
Because of the above three properties, transactions are an ideal way of structuring interaction with a database.
This leads us to impose a requirement on transactions themselves. A transaction must preserve database
consistency—if a transaction is run atomically in isolation starting from a consistent database, the database
must again be consistent at the end of the transaction. This consistency requirement goes beyond the data
integrity constraints we have seen earlier (such as primary-key constraints, referential integrity, check
constraints, and the like). Rather, transactions are expected to go beyond that to ensure preservation of those
application-dependent consistency constraints that are too complex to state using the SQL constructs for data
integrity. How this is done is the responsibility of the programmer who codes a transaction. This property is
referred to as consistency.
To restate the above more concisely, we require that the database system maintain the following properties
of the transactions:
•Atomicity. Either all operations of the transaction are reflected properly in the database, or none are.
•Consistency. Execution of a transaction in isolation (that is, with no other transaction executing
concurrently) preserves the consistency of the database.
• Isolation. Even though multiple transactions may execute concurrently, the system guarantees that, for
every pair of transactions Ti and Tj , it appears to Ti that either Tj finished execution before Ti started or Tj
started execution after Ti finished. Thus, each transaction is unaware of other transactions executing
concurrently in the system.
• Durability. After a transaction completes successfully, the changes it has made to the database persist, even
if there are system failures.
These properties are often called the ACID properties; the acronym is derived from the first letter of each
of the four properties. As we shall see later, ensuring the isolation property may have a significant adverse
effect on system performance. For this reason, some applications com promise on the isolation property.
We shall study these compromises after first studying the strict enforcement of the ACID properties.
 Transaction Model:
Because SQL is a powerful and complex language, we begin our study of transactions with a simple database
language that focuses on when data are moved from disk to main memory and from main memory to disk.
The data items in our simplified model contain a single data value (a number in our examples). Each data
item is identified by a name (typically a single letter in our examples, that is, A, B, C, etc.). We shall illustrate
the transaction concept using a simple bank application consisting of several accounts and a set of transactions
that access and update those accounts. Transactions access data using two operations:
• read(X), which transfers the data item X from the database to a variable, also called X, in a buffer in main
memory belonging to the transaction that executed the read operation.
• write(X), which transfers the value in the variable X in the main-memory buffer of the transaction that
executed the write to the data item X in the database.
It is important to know if a change to a data item appears only in main memory or if it has been written to the
database on disk. In a real database system, the write operation does not necessarily result in the immediate
update of the data on the disk; the write operation may be temporarily stored elsewhere and executed on the
disk later. For now, however, we shall assume that the write operation updates the database immediately.
Let Ti be a transaction that transfers $50 from account A to account B. This transaction can be defined as:

Let us now consider each of the ACID properties. (For ease of presentation, we consider them in an order
different from the order A-C-I-D.)

 Consistency: The consistency requirement here is that the sum of A and B be unchanged by the execution
of the transaction. Without the consistency requirement, money could be created or destroyed by the
transaction! It can be verified easily that, if the database is consistent before an execution of the transaction,
the database remains consistent after the execution of the transaction. Ensuring consistency for an
individual transaction is the responsibility of the application programmer who codes the transaction.

 Atomicity: Suppose that, just before the execution of transaction Ti, the values of accounts A and B are
$1000 and $2000, respectively. Now suppose that, during the execution of transaction Ti, a failure occurs
that prevents Ti from completing its execution successfully. Further, suppose that the failure happened
after the write(A) operation but before the write(B) operation. In this case, the values of accounts A and
B reflected in the database are $950 and $2000. The system destroyed $50 as a result of this failure. In
particular, we note that the sum A + B is no longer preserved. Thus, because of the failure, the state of the
system no longer reflects a real state of the world that the database is supposed to capture. We term such
a state an inconsistent state. We must ensure that such inconsistencies are not visible in a database system.
Note, however, that the system must at some point be in an inconsistent state. Even if transaction Ti is
executed to completion, there exists a point at which the value of account A is $950 and the value of
account B is $2000, which is clearly an inconsistent state. This state, however, is eventually replaced by
the consistent state where the value of account A is $950, and the value of account B is $2050. Thus, if
the transaction never started or was guaranteed to complete, such an inconsistent state would not be visible
except during the execution of the transaction. That is the reason for the atomicity requirement: If the
atomicity property is present, all actions of the transaction are reflected in the database, or none are. The
basic idea behind ensuring atomicity is this: The database system keeps track (on disk) of the old values
of any data on which a transaction performs a write. This information is written to a file called the log. If
the transaction does not complete its execution, the database system restores the old values from the log
to make it appear as though the transaction never executed.

 Durability: Once the execution of the transaction completes successfully, and the user who initiated the
transaction has been notified that the transfer of funds has taken place, it must be the case that no system
failure can result in a loss of data corresponding to this transfer of funds. The durability property guarantees
that, once a transaction completes successfully, all the updates that it carried out on the database persist,
even if there is a system failure after the transaction completes execution. We assume for now that a failure
of the computer system may result in loss of data in main memory, but data written to disk are never lost.
Protection against loss of data on disk is discussed in Chapter 16. We can guarantee durability by ensuring
that either: 1. The updates carried out by the transaction have been written to disk before the transaction
completes. 2. Information about the updates carried out by the transaction and writ ten to disk is sufficient
to enable the database to reconstruct the updates when the database system is restarted after the failure.

 Isolation: Even if the consistency and atomicity properties are ensured for each transaction, if several
transactions are executed concurrently, their operations may interleave in some undesirable way, resulting
in an inconsistent state. For example, as we saw earlier, the database is temporarily inconsistent while the
transaction to transfer funds from A to B is executing, with the deducted total written to A and the increased
total yet to be written to B. If a second concurrently running transaction reads A and B at this intermediate
point and computes A+B, it will observe an inconsistent value. Furthermore, if this second transaction
then performs updates on A and B based on the inconsistent values that it read, the database may be left
in an inconsistent state even after both transactions have completed. A way to avoid the problem of
concurrently executing transactions is to execute transactions serially— that is, one after the other.
 Transaction Atomicity and Durability:
As we noted earlier, a transaction may not always complete its execution successfully. Such a transaction
is termed aborted. If we are to ensure the atomicity property, an aborted transaction must have no effect
on the state of the database. Thus, any changes that the aborted transaction made to the database must be
undone. Once the changes caused by an aborted transaction have been undone, we say that the transaction
has been rolled back. It is part of the responsibility of the recovery scheme to manage transaction aborts.
This is done typically by maintaining a log. Each database modification made by a transaction is first
recorded in the log. We record the identifier of the transaction performing the modification, the identifier
of the data item being modified, and both the old value (prior to modification) and the new value (after
modification) of the data item. Only then is the database itself modified. Maintaining a log provides the
possibility of redoing a modification to ensure atomicity and durability as well as the possibility of
undoing a modification to ensure atomicity in case of a failure during transaction execution.
A transaction that completes its execution successfully is said to be committed. A committed transaction
that has performed updates transforms the database into a new consistent state, which must persist even
if there is a system failure.
Once a transaction has committed, we cannot undo its effects by aborting it. The only way to undo the
effects of a committed transaction is to execute a compensating transaction. For instance, if a transaction
added $20 to an account, the compensating transaction would subtract $20 from the account. However,
it is not always possible to create such a compensating transaction. Therefore, the responsibility of writing
and executing a compensating transaction is left to the user, and is not handled by the database system.
We need to be more precise about what we mean by successful completion of a transaction. We therefore
establish a simple abstract transaction model. A transaction must be in one of the following states:
• Active, the initial state; the transaction stays in this state while it is executing.
• Partially committed, after the final statement has been executed.
• Failed, after the discovery that normal execution can no longer proceed.
• Aborted, after the transaction has been rolled back and the database has been restored to its state prior
to the start of the transaction.
• Committed, after successful completion.

State diagram of a transaction


Similarly, we say that a transaction has aborted only if it has entered the aborted state. A transaction is
said to have terminated if it has either committed or aborted. A transaction starts in the active state. When
it finishes its final statement, it enters the partially committed state. At this point, the transaction has
completed its execution, but it is still possible that it may have to be aborted, since the actual output may
still be temporarily residing in main memory, and thus a hardware failure may preclude its successful
completion. The database system then writes out enough information to disk that, even in the event of a
failure, the updates performed by the transaction can be re-created when the system restarts after the
failure. When the last of this information is written out, the transaction enters the committed state.
A transaction enters the failed state after the system determines that the transaction can no longer proceed
with its normal execution (for example, because of hardware or logical errors). Such a transaction must
be rolled back. Then, it enters the aborted state. At this point, the system has two options:
• It can restart the transaction, but only if the transaction was aborted as a result of some hardware or
software error that was not created through the internal logic of the transaction. A restarted transaction is
considered to be a new transaction.
• It can kill the transaction. It usually does so because of some internal logical error that can be corrected
only by rewriting the application program, or because the input was bad, or because the desired data were
not found in the database.
We must be cautious when dealing with observable external writes, such as writes to a user’s screen, or
sending email. Once such a write has occurred, it cannot be erased, since it may have been seen external
to the database systems.
Most systems allow such writes to take place only after the transaction has entered the committed state.
One way to implement such a scheme is for the database system to store any value associated with such
external writes temporarily in a special relation in the database, and to perform the actual writes only
after the transaction enters the committed state. If the system should fail after the transaction has entered
the committed state, but before it could complete the external writes, the database system will carry out
the external writes (using the data in nonvolatile storage) when the system is restarted. Handling external
writes can be more complicated in some situations. For example, suppose the external action is that of
dispensing cash at an automated teller machine, and the system fails just before the cash is actually
dispensed (we assume that cash can be dispensed atomically). It makes no sense to dispense cash when
the system is restarted, since the user may have left the machine. In such a case a compensating
transaction, such as depositing the cash back in the user’s account, needs to be executed when the system
is restarted. As another example, consider a user making a booking over the Web. It is possible that the
database system or the application server crashes just after the booking transaction commits. It is also
possible that the network connection to the user is lost just after the booking transaction commits. In
either case, even though the transaction has committed, the external write has not taken place. To handle
such situations, the application must be designed such that when the user connects to the Web application
again, she will be able to see whether her transaction had succeeded or not. For certain applications, it
may be desirable to allow active transactions to display data to users, particularly for long-duration
transactions that run for minutes or hours. Unfortunately, we cannot allow such output of observable data
unless we are willing to compromise transaction atomicity.

You might also like