DDBS Lec2
Distributed Database
Distributed DBMS Architecture
Source:
1. Principles of Distributed Database Systems
By M. Tamer Özsu and Patrick Valduriez
2. Slides available
Architectural models for distributed DBMS
• Consider the possible ways in which multiple databases may be put
together for sharing by multiple DBMSs.
• Fig. 1.10 classifies the systems along three dimensions: 1) the
autonomy of local systems, 2) their distribution, and 3) their heterogeneity.
Dimensions of autonomy:
1. Design autonomy:
Individual DBMSs are free to use the data models and transaction
management techniques that they prefer.
2. Communication autonomy:
Each of the individual DBMSs is free to make its own decision as to what
type of information it wants to provide to the other DBMSs or to the
software that controls their global execution.
3. Execution autonomy:
Each DBMS can execute the transactions that are submitted to it in any way
that it wants to.
Distribution
The distribution dimension of the taxonomy deals with data.
We consider the physical distribution of data over multiple sites; the user
still sees the data as one logical pool.
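The idea of one logical pool over physically separate sites can be sketched as follows. This is only an illustration, not a real distributed DBMS: each "site" is an in-memory SQLite database, and the names make_site, LogicalPool, select_all and the emp table are hypothetical, chosen for this example.

```python
import sqlite3


def make_site(rows):
    """Create one 'site': an independent in-memory database (hypothetical helper)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE emp (eno INTEGER, ename TEXT)")
    db.executemany("INSERT INTO emp VALUES (?, ?)", rows)
    return db


class LogicalPool:
    """Presents the data held at several sites as a single relation."""

    def __init__(self, sites):
        self._sites = sites

    def select_all(self):
        # The caller never names a site; the pool visits each one
        # and merges the partial results.
        rows = []
        for db in self._sites:
            rows.extend(db.execute("SELECT eno, ename FROM emp").fetchall())
        return sorted(rows)


pool = LogicalPool([make_site([(2, "Bob")]), make_site([(1, "Ann")])])
print(pool.select_all())
```

The user of select_all does not need to know which site stores which row, which is the point of the distribution dimension as described above.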
Client/server distribution:
It concentrates data management duties at servers while the clients focus on
providing the application environment including the user interface.
Peer-to-peer distribution:
There is no distinction of client machines versus servers. Each machine has full
DBMS functionality and can communicate with other machines to execute
queries and transactions.
Heterogeneity
Heterogeneity may occur in various forms in distributed systems, ranging
from hardware heterogeneity and differences in networking protocols to
variations in data managers.
The important ones relate to data models, query languages, and transaction
management protocols.
Architectural alternatives
• Consider the architectural alternatives starting at the origin in Fig. 4.3, and
moving along the autonomy dimension.
• We use a notation based on the alternatives along the three dimensions.
• The dimensions are identified as A (autonomy), D (distribution), and H
(heterogeneity).
• The alternatives along each dimension are identified by numbers 0, 1 or 2.
• Along the autonomy dimension, 0 represents tight integration, 1 represents
semiautonomous systems and 2 represents total isolation.
• Along the distribution dimension, 0 is for no distribution, 1 is for
client/server systems, and 2 is for peer-to-peer distribution.
• Along the heterogeneity dimension, 0 identifies homogeneous systems
while 1 stands for heterogeneous systems.
Some examples:
(A0, D2, H0): peer-to-peer distributed homogeneous DBMS,
(A0, D1, H0): client/server distributed homogeneous DBMS,
(A2, D2, H1): peer-to-peer distributed heterogeneous multidatabase system.
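The (A, D, H) notation can be written down as a small lookup, shown here as a minimal sketch. The class name Architecture and its methods are assumptions made for illustration; only the numeric codes and their meanings come from the notes above.

```python
from dataclasses import dataclass

# Meanings of the codes along each dimension, as listed in the notes.
AUTONOMY = {0: "tight integration", 1: "semiautonomous", 2: "total isolation"}
DISTRIBUTION = {0: "no distribution", 1: "client/server", 2: "peer-to-peer"}
HETEROGENEITY = {0: "homogeneous", 1: "heterogeneous"}


@dataclass(frozen=True)
class Architecture:
    """One point in the taxonomy (hypothetical helper class)."""
    a: int  # autonomy
    d: int  # distribution
    h: int  # heterogeneity

    def __post_init__(self):
        # Reject codes that the notation does not define.
        if self.a not in AUTONOMY:
            raise ValueError(f"invalid autonomy code: {self.a}")
        if self.d not in DISTRIBUTION:
            raise ValueError(f"invalid distribution code: {self.d}")
        if self.h not in HETEROGENEITY:
            raise ValueError(f"invalid heterogeneity code: {self.h}")

    def code(self):
        return f"(A{self.a}, D{self.d}, H{self.h})"

    def describe(self):
        return ", ".join(
            [AUTONOMY[self.a], DISTRIBUTION[self.d], HETEROGENEITY[self.h]]
        )


arch = Architecture(a=2, d=2, h=1)
print(arch.code(), "=", arch.describe())
```

Because heterogeneity has only two alternatives, a value such as H2 is rejected, whereas autonomy and distribution each accept 0, 1 or 2.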
Client/Server Systems
• Distinguish the functionality that needs to be provided
and divide these functions into two classes: server
functions and client functions.
• This provides a two-level architecture which makes it
easier to manage the complexity of modern DBMSs and
the complexity of distribution.
• The server does most of the data management work: query processing and
optimization, transaction management, and storage management are all
done at the server.
• The client, in addition to the application and the user interface, has a DBMS
client module that is responsible for managing the data that is cached to the
client and sometimes managing the transaction locks.
• There is OS and communication software that runs on both the client and the
server. The client/server architecture is depicted in Fig. 4.4.
• The client passes SQL queries to the server without trying to understand or
optimize them. The server does most of the work and returns the result relation
to the client.
• Users of a local DBMS define their own views on the local database and do not
need to change their applications if they do not want to access data from
another database.
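The division of work between client and server described above can be sketched as follows. This is a simplified single-process illustration, not the architecture of any particular product: the classes Server and Client, the emp table, and the cache keyed by query text are all assumptions made for the example, and the cache is never invalidated.

```python
import sqlite3


class Server:
    """Does the data management work: parsing, optimization, storage."""

    def __init__(self):
        self._db = sqlite3.connect(":memory:")
        self._db.execute("CREATE TABLE emp (eno INTEGER PRIMARY KEY, ename TEXT)")
        self._db.executemany(
            "INSERT INTO emp VALUES (?, ?)", [(1, "Ann"), (2, "Bob")]
        )
        self.queries_served = 0

    def execute(self, sql):
        # The server interprets the SQL and returns the result relation.
        self.queries_served += 1
        return self._db.execute(sql).fetchall()


class Client:
    """Forwards SQL text unchanged and manages data cached at the client."""

    def __init__(self, server):
        self._server = server
        self._cache = {}

    def query(self, sql):
        # The client does not try to understand or optimize the query;
        # it only checks whether the result is already cached locally.
        if sql not in self._cache:
            self._cache[sql] = self._server.execute(sql)
        return self._cache[sql]


server = Server()
client = Client(server)
print(client.query("SELECT ename FROM emp ORDER BY eno"))
print(client.query("SELECT ename FROM emp ORDER BY eno"))  # served from cache
print(server.queries_served)
```

In a real system the two sides would run on different machines and talk through the communication software mentioned above; here a direct method call stands in for that network hop.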
Distributed Database Design
The design of a distributed computer system involves making decisions on the
placement of data and programs across the sites of a computer network, as
well as possibly designing the network itself.
The distribution of applications involves two things: the distribution of the
distributed DBMS software and the distribution of the application programs
that run on it.
Design Strategies
The two design strategies are:
1. top-down approach and
2. bottom-up approach.
Top-down design process
The activity begins with a requirements analysis that defines the environment
of the system and elicits both the data and processing needs of all potential
database users.
The observation and monitoring phase is used for constant monitoring and
periodic adjustment and tuning of the design and development activity.