Chapter Two: Database System Concepts and Architecture




1. Data Models, Schema and Instances
Data Model: a collection of concepts that can be used to describe the structure of a database.
It provides the necessary means to achieve data abstraction. By structure of a database we mean
the data types, relationships, and constraints that should hold on the data. Most data models
also include a set of basic operations for specifying retrievals and updates on the database.

Categories of Data Models

Many data models have been proposed, and we can categorize them according to the types of
concepts they use to describe the database structure.

Conceptual (high-level) data models provide concepts that are close to the way many end users
perceive data. Conceptual Data Models use concepts such as entities, attributes, and
relationships.

Physical Data Models describe how data is stored in the computer by representing information
such as stored record formats, record orderings, and access paths. An access path is a
structure that makes the search for particular database records efficient.

Representational (or implementation) data models provide concepts that may be easily
understood by end users but that are not too far removed from the way data is organized in
computer storage. Representational data models hide many details of data storage on disk but can
be implemented on a computer system directly.

Database Schema: the description of a database, depicting its overall design, including the
data structures and constraints. The schema is defined during the database design process and
changes very rarely afterwards. It can be regarded as a template or building plan for one or
several database instances. The terms intension and metadata are used interchangeably with
database schema. There are three types of schema: subschema, logical schema and physical
schema. A database may have any number of subschemas, but only one logical schema and one
physical schema.

Database State: the actual content of a database at a particular moment in time. The terms
occurrence, database instance, snapshot and extension are used interchangeably with database
state. The state of the database when it is first loaded with data is called the initial
state. The state changes every time the database is updated. A state that satisfies the
structure and constraints specified in the schema is called a valid state.
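
The difference between schema and state can be sketched with Python's built-in sqlite3 module.
This is only an illustration: the student table, its columns and the sample rows are
assumptions made for the example, not part of the chapter.

```python
import sqlite3

# In-memory database; table, columns and rows are illustrative only.
conn = sqlite3.connect(":memory:")

# Schema (intension): defined once at design time, rarely changed afterwards.
conn.execute(
    "CREATE TABLE student ("
    " id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " age INTEGER CHECK (age >= 0))"
)

def current_schema(connection):
    """Return the stored table definition (the metadata) from SQLite's catalog."""
    return connection.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table' AND name = 'student'"
    ).fetchone()[0]

def current_state(connection):
    """Return the database state (extension): the rows stored right now."""
    return connection.execute(
        "SELECT id, name, age FROM student ORDER BY id"
    ).fetchall()

schema_before = current_schema(conn)
empty_state = current_state(conn)            # no data loaded yet

# Loading the first data gives the initial state.
conn.executemany(
    "INSERT INTO student (id, name, age) VALUES (?, ?, ?)",
    [(1, "Abebe", 20), (2, "Sara", 22)],
)
initial_state = current_state(conn)

# Every update produces a new state.
conn.execute("UPDATE student SET age = 21 WHERE id = 1")
later_state = current_state(conn)

# An update that violates a schema constraint is rejected,
# so the database remains in a valid state.
try:
    conn.execute("INSERT INTO student (id, name, age) VALUES (3, 'Kebede', -5)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

schema_after = current_schema(conn)          # the schema itself never changed
```

The state moved through three snapshots while the schema text stayed identical throughout.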

2. DBMS Architecture and Data Independence


DBMS Architecture

Database management systems are complex software that has often been developed and optimized
over many years. From the user's point of view, however, most of them share a quite similar
basic architecture. Discussing this basic architecture helps to understand its connection with
data modeling and with the 'data independence' of the database approach that was postulated in
the introduction to this module.

The Three-Schema Architecture

The goal of the three-schema architecture, illustrated in Figure 1, is to separate the user applications from
the physical database. In this architecture, schemas can be defined at the following three levels:

External Level schema: It describes part of a database that is relevant to a particular user.
Different users have their own customized view of the database independent of other users. It
describes the various user views, often a restricted view of a database. Entity/Object based
data models like ER could be used for this level.

Conceptual Level schema: It is the enterprise-level or community view of the database. It
describes the data that is stored in the database and the relationships among the data, i.e.,
the structure and constraints of the whole database for a community of users. It uses
record-based data models to do so. It is needed by database developers, DBAs or
'power users'.

Internal Level schema: It is all about the physical representation of the database on the
computer including how data is stored in the database. A physical data model is used. This
level is needed by DBMS implementers and maintainers.
Figure 1: Three Schema Architecture
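
As a rough sketch of the three levels in a relational system, the example below uses Python's
sqlite3 module: a table stands for the conceptual schema, an index for an internal-level access
path, and a view for one external schema. The employee table, the index name and the view name
are assumptions made for illustration, and a real DBMS exposes far more of the internal level
than an index.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual level: structure and constraints for the whole database.
conn.execute(
    "CREATE TABLE employee ("
    " id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " department TEXT NOT NULL,"
    " salary REAL NOT NULL)"
)

# Internal level: an access path that changes how rows are found, not what they mean.
conn.execute("CREATE INDEX idx_employee_department ON employee (department)")

# External level: a restricted view for one group of users (salary is hidden).
conn.execute(
    "CREATE VIEW employee_directory AS SELECT name, department FROM employee"
)

conn.executemany(
    "INSERT INTO employee (id, name, department, salary) VALUES (?, ?, ?, ?)",
    [(1, "Abebe", "Sales", 5000.0), (2, "Sara", "IT", 7000.0)],
)

# A user of the external schema sees only the columns the view exposes.
cursor = conn.execute("SELECT * FROM employee_directory ORDER BY name")
directory_columns = [column[0] for column in cursor.description]
directory_rows = cursor.fetchall()
```

Users of employee_directory never see the salary column, and they are unaware that the index
exists.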

Depending on how the three levels are set, there are centralized databases and decentralized
(client/server) database architectures.

• Centralized Database Architecture: It combines everything into a single system, including
the DBMS software, hardware, application programs, and user interface processing software.
Users can still connect through a remote terminal; however, all processing is done at the
centralized site.
Figure 2: Centralized Database Architecture

• Client/Server Architecture: It is an architecture in which the client (user interface) is
separated from the servers that provide the business logic and data services. Depending on
whether the application logic and the data store reside on one server or not, there are two
types of architecture:
o 2-tier client/server architecture: one server is used to store the database schema
and database state, and it provides both the business logic and the data services.
[Figure: 2-tier client/server architecture (Presentation Layer; Servers for business logic
and data services)]
• Clients provide appropriate interfaces through a client software module to
access and utilize the various server resources.
• Clients may be diskless machines, or PCs or workstations with disks on which
only the client software is installed.
• Clients are connected to the servers via some form of network
(LAN: local area network, wireless network, etc.).
• The DBMS server provides database query and transaction services to the
clients.
• Relational DBMS servers are often called SQL servers, query servers, or
transaction servers.
• Applications running on clients use an Application Program Interface
(API) to access server databases via a standard interface such as:
    • ODBC: Open Database Connectivity standard
    • JDBC: for Java programming access
• Client and server must install the appropriate client module and server module
software for ODBC or JDBC.
o 3-tier client/server architecture: separate servers are used for the business logic
and for the data services. It is common for web applications. The business logic runs
on an intermediate application server or web server, between the clients and the
database server. This enhances security and reduces the burden on the database server.
[Figure: 3-tier client/server architecture (Presentation Layer; Business Logic Layer;
Data Services Layer)]
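
The separation of responsibilities in a 3-tier design can be sketched in a few lines of Python.
This is a deliberately simplified illustration: in a real deployment the three layers run on
separate machines and communicate over a network, whereas here they are plain functions in one
process. The account table, the withdrawal rule and all function names are hypothetical.

```python
import sqlite3

# --- Data services layer: the only code that issues SQL. ---
def open_database():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)"
    )
    conn.execute("INSERT INTO account (id, balance) VALUES (1, 100)")
    return conn

def read_balance(conn, account_id):
    row = conn.execute(
        "SELECT balance FROM account WHERE id = ?", (account_id,)
    ).fetchone()
    return None if row is None else row[0]

def write_balance(conn, account_id, balance):
    conn.execute(
        "UPDATE account SET balance = ? WHERE id = ?", (balance, account_id)
    )

# --- Business logic layer: applies the rules, contains no SQL. ---
def withdraw(conn, account_id, amount):
    balance = read_balance(conn, account_id)
    if balance is None:
        return "unknown account"
    if amount <= 0 or amount > balance:
        return "rejected"
    write_balance(conn, account_id, balance - amount)
    return "ok"

# --- Presentation layer: formats results for the user. ---
def show_withdrawal(conn, account_id, amount):
    status = withdraw(conn, account_id, amount)
    balance = read_balance(conn, account_id)
    return f"Withdraw {amount}: {status}, balance is now {balance}"

conn = open_database()
first = show_withdrawal(conn, 1, 30)     # allowed by the business rule
second = show_withdrawal(conn, 1, 500)   # exceeds the balance, so it is rejected
```

Because the rule lives in the middle layer, an invalid request never reaches the database
server, which is one way the middle tier reduces its burden.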

Data Independence

The three-schema architecture can be used to explain the concept of data independence, which
can be defined as the capacity to change the schema at one level of a database system without
having to change the schema at the next higher level. We can define two types of data
independence:

• Logical data independence is the capacity to change the conceptual schema without
having to change external schemas or application programs. We may change the
conceptual schema to expand the database (by adding a record type or data item), or to
reduce the database (by removing a record type or data item). In the latter case, external
schemas that refer only to the remaining data should not be affected. Only the view
definition and the mappings need be changed in a DBMS that supports logical data
independence. Application programs that reference the external schema constructs must
work as before, after the conceptual schema undergoes a logical reorganization. Changes
to constraints can be applied also to the conceptual schema without affecting the external
schemas or application programs.
• Physical data independence is the capacity to change the internal schema without
having to change the conceptual (or external) schemas. Changes to the internal schema
may be needed because some physical files had to be reorganized - for example, by
creating additional access structures - to improve the performance of retrieval or update.
If the same data as before remains in the database, we should not have to change the
conceptual schema.
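
Both kinds of data independence can be demonstrated on a small scale with Python's sqlite3
module. In this sketch the book table, the book_titles view and the sample rows are assumptions
made for the example; the function application_query stands in for an application program
written against an external schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT NOT NULL, year INTEGER)"
)
# External schema used by the application.
conn.execute("CREATE VIEW book_titles AS SELECT id, title FROM book")
conn.executemany(
    "INSERT INTO book (id, title, year) VALUES (?, ?, ?)",
    [(1, "Database Systems", 2016), (2, "Data Models", 2010)],
)

def application_query(connection):
    """An application program that only knows the external schema."""
    return connection.execute(
        "SELECT id, title FROM book_titles ORDER BY id"
    ).fetchall()

baseline = application_query(conn)

# Physical data independence: add an access structure to the internal schema.
conn.execute("CREATE INDEX idx_book_year ON book (year)")
after_index = application_query(conn)

# Logical data independence: expand the conceptual schema with a new data item.
conn.execute("ALTER TABLE book ADD COLUMN publisher TEXT")
after_new_column = application_query(conn)
```

The application query is never edited, yet it returns the same rows after each change.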
3. Database Language and Interface

Database Languages
So far, we have covered databases, database management systems and the components of a database
system. This section explains how data gets into a database system and how information gets to
the users. More precisely, the following questions will be answered:
• How does an application interact with a database management system?
• How does a user look at a database system?
• How can a user query a database system and view the results in his/her application?
Data Definition Language (DDL)

For describing data and data structures, a suitable description tool, a data definition language
(DDL), is needed. With it, a data schema can be defined and also changed later.

• The DDL is used to define both the conceptual (e.g., relational) and external (e.g., views)
schemas.
• In the relational model, SQL is used for the DDL tasks, which can be divided into three
subtasks, one for each of the three levels: physical (storage definition language, SDL, used
by the DBA), conceptual (DDL) and external (view definition language, VDL).

Typical DDL operations (with their respective keywords in the structured query language SQL):

• Creation of tables and definition of attributes (CREATE TABLE ...)
• Change of tables by adding or deleting attributes (ALTER TABLE ...)
• Deletion of a whole table including its content (!) (DROP TABLE ...), etc.
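
The three DDL operations listed above can be tried out with Python's sqlite3 module. The course
table and its columns are made up for the example, and the two helper functions are only there
to inspect the result; other SQL systems expose their catalog differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def table_names(connection):
    """List the tables currently defined in the database."""
    rows = connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [row[0] for row in rows]

def column_names(connection, table):
    """List a table's columns. The table name is a trusted constant here."""
    # PRAGMA table_info returns one row per column; the column name is at index 1.
    return [row[1] for row in connection.execute(f"PRAGMA table_info({table})")]

# CREATE TABLE: define a table and its attributes.
conn.execute("CREATE TABLE course (code TEXT PRIMARY KEY, title TEXT NOT NULL)")
columns_after_create = column_names(conn, "course")

# ALTER TABLE: change the table by adding an attribute.
conn.execute("ALTER TABLE course ADD COLUMN credit_hours INTEGER")
columns_after_alter = column_names(conn, "course")

# DROP TABLE: delete the whole table, including any content.
conn.execute("DROP TABLE course")
tables_after_drop = table_names(conn)
```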
Data Manipulation Language (DML)

Additionally, a language is needed for describing operations on data such as storing, searching,
reading and changing, the so-called data manipulation. Such operations can be done with a data
manipulation language (DML). Within such languages, keywords like insert, modify, update, delete,
select, etc. are common.
Typical DML operations (with their respective keywords in the structured query language SQL):

• Add data (INSERT)
• Change data (UPDATE)
• Delete data (DELETE)
• Query data (SELECT), etc.
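
The four DML operations listed above, run in sequence through Python's sqlite3 module. The
student table and its rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT NOT NULL, year INTEGER)"
)

# INSERT: add data.
conn.executemany(
    "INSERT INTO student (id, name, year) VALUES (?, ?, ?)",
    [(1, "Abebe", 1), (2, "Sara", 2), (3, "Kebede", 3)],
)

# UPDATE: change data (move student 1 to the next year).
conn.execute("UPDATE student SET year = year + 1 WHERE id = ?", (1,))

# DELETE: remove data (student 3 leaves).
conn.execute("DELETE FROM student WHERE id = ?", (3,))

# SELECT: query data.
remaining = conn.execute(
    "SELECT id, name, year FROM student ORDER BY id"
).fetchall()
```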
a. Standalone DML query language interface: can be non-procedural and used on its own
for complex database operations. Example: Oracle SQL*Plus. This is called a query
language.
b. Programmer interfaces for embedding DML in programming languages:
I. Embedded Approach: DML can be embedded in a programming language like C,
as with Oracle Pro*C.
II. Procedure Call Approach: A library of functions can also be provided to access
the DBMS from a programming language like Java or C++ (e.g., through JDBC
and ODBC connectivity).
III. Database Programming Language Approach: e.g., Oracle has PL/SQL, a
programming language based on SQL; the language incorporates SQL and its data
types as integral components.
IV. Scripting Languages: PHP and Python (server-side scripting) are used to write
database programs.
c. Other user-friendly interfaces for interaction with DML, built with computer-aided
software engineering (CASE) tools like Sybase PowerBuilder, Borland JBuilder, Oracle
Forms and JDeveloper, e.g.:
I. Menu-based, forms-based, graphics-based, etc.
II. Forms-based, designed for naïve users used to filling in entries on a form
III. Graphics-based, including point and click, drag and drop, etc.; specifying a
query on a schema diagram
IV. Natural language: requests in written English
V. Combinations of the above, including natural language, speech, web browser with
keyword search, and parametric interfaces with function keys, e.g., as used by bank
tellers.
d. Mobile Interfaces: interfaces allowing users to perform transactions using mobile apps
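
The procedure call approach (item b.II) can be illustrated with Python's sqlite3 module, which
plays a role comparable to the one JDBC and ODBC play for Java and C++: the program calls
library functions, passes host-language values as parameters, and receives rows back as ordinary
language objects. The book table and both function names are hypothetical.

```python
import sqlite3

def add_book(connection, title, year):
    """Insert one row and return the key generated for it."""
    with connection:  # commits on success, rolls back if an exception is raised
        cursor = connection.execute(
            "INSERT INTO book (title, year) VALUES (?, ?)", (title, year)
        )
    return cursor.lastrowid

def titles_published_since(connection, year):
    """Run a query and turn the result set into a plain Python list."""
    cursor = connection.execute(
        "SELECT title FROM book WHERE year >= ? ORDER BY title", (year,)
    )
    return [title for (title,) in cursor]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE book ("
    " id INTEGER PRIMARY KEY, title TEXT NOT NULL, year INTEGER NOT NULL)"
)

first_id = add_book(conn, "Database Systems", 2016)
second_id = add_book(conn, "Data Models", 2010)
recent = titles_published_since(conn, 2015)
```

The `?` placeholders keep the SQL text fixed and let the library bind the values, which avoids
building statements by string concatenation.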
Data Control Language (DCL)
A Data Control Language (DCL) is a computer language and a subset of SQL, used to control access to
data in a database.
Examples of DCL commands include:
• GRANT: used to allow specified users to perform specified tasks.
• REVOKE: used to cancel previously granted or denied permissions.
Database Interfaces

With the help of SQL, a query language, the application poses a query to the database system.
There, the corresponding answer (result set) is prepared and returned to the application, again
by means of SQL. This communication can take place interactively or be embedded into
another language.
Type and Use of the Database Interface
Two important uses of a database interface like SQL are listed below:
Interactive: SQL can be used interactively from a terminal.
Embedded: SQL can be embedded into another language (host language) which might be used
to create a database application.
User Interfaces
A user interface is the view of a database interface that is seen by the user. User interfaces
are often graphical or at least partly graphical (GUI, graphical user interface) and offer
tools which make interaction with the database easier.
1. Form-based Interfaces
This interface consists of forms which are adapted to the user. The user can fill in all of the
fields to make new entries in the database, or fill in only some of the fields to query for the
others. Some operations might be restricted by the application. Form-based user interfaces are
widespread and are a very important means of interacting with a DBMS. They are easy to use and
have the advantage that the user does not need special knowledge of database languages like
SQL.
Figure 3: Form-based Interfaces
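
One way an application might turn a partly filled-in form into a query is sketched below. This
is an assumption about how such a form could be implemented, not a description of any particular
product; the employee table, the field list and build_form_query are hypothetical.

```python
import sqlite3

# Columns the form exposes; any other key in the submitted form is ignored.
FORM_FIELDS = ("name", "department", "city")

def build_form_query(form):
    """Turn the filled-in form fields into a parameterized SELECT statement."""
    filled = [(field, form[field]) for field in FORM_FIELDS if form.get(field)]
    sql = "SELECT name, department, city FROM employee"
    if filled:
        sql += " WHERE " + " AND ".join(f"{field} = ?" for field, _ in filled)
    sql += " ORDER BY name"
    return sql, [value for _, value in filled]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, department TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO employee (name, department, city) VALUES (?, ?, ?)",
    [
        ("Abebe", "Sales", "Adama"),
        ("Sara", "IT", "Adama"),
        ("Kebede", "IT", "Hawassa"),
    ],
)

# The user filled in only the department field and left the others blank.
query, params = build_form_query({"name": "", "department": "IT", "city": ""})
matches = conn.execute(query, params).fetchall()
```

The user never writes SQL: blank fields are simply left out of the WHERE clause, and the values
typed into the form are passed as parameters.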

2. Text-based Interfaces
Database administrators and other professional users can communicate with the DBMS directly in
the query language (in code form) via an input/output window.
Text-based interfaces are very powerful tools and allow comprehensive interaction with a
DBMS. However, their use requires active knowledge of the respective database
language.

Figure 4: Text-based Interfaces


4. The Database System Environment
A database environment is a collective system of components that comprise and regulate the
collection, management, and use of data. It consists of software, hardware, people, techniques
for handling the database, and the data itself.

Here, the hardware in a database environment means the computers and computer peripherals
that are used to manage a database. The software means everything from the operating system
(OS) to the application programs, including database management software like MS Access or
SQL Server. The people in a database environment are those who administer and use the system.
The techniques are the rules, concepts, and instructions given to both the people and the
software. The data is the group of facts and information held within the database environment.

Figure 5: database environment

5. Classification of Database Management Systems


Database management systems can be classified based on several criteria, such as the data
model, user numbers and database distribution, all described below.

Classification Based on Data Model


The most popular data model in use today is the relational data model. Well-known DBMSs like
Oracle, MS SQL Server, DB2 and MySQL support this model. Other traditional models, such as
hierarchical data models and network data models, are still used in industry mainly on
mainframe platforms. However, they are not commonly used due to their complexity. These are
all referred to as traditional models because they preceded the relational model.

In recent years, newer object-oriented data models have been introduced. In an object-oriented
database management system, information is represented in the form of objects, as used in
object-oriented programming. Object-oriented databases are different from relational databases,
which are table-oriented. Object-oriented database management systems (OODBMS) combine database
capabilities with object-oriented programming language capabilities.

The object-oriented models have not caught on as expected so are not in widespread use. Some
examples of object-oriented DBMSs are O2, ObjectStore and Jasmine.

Classification Based on User Numbers


A DBMS can be classified based on the number of users it supports. It can be a single-user
database system, which supports one user at a time, or a multiuser database system, which
supports multiple users concurrently.

Classification Based on Database Distribution


There are four main distribution systems for database systems and these, in turn, can be used to
classify the DBMS.

Centralized systems
With a centralized database system, the DBMS and database are stored at a single site that is
used by several other systems too. This is illustrated in Figure 6.

Figure 6: Example of a centralized database system.

In the early 1980s, many Canadian libraries used the GEAC 8000 to convert their manual card
catalogues to machine-readable centralized catalogue systems. Each book catalogue had a
barcode field similar to those on supermarket products.

Distributed database system


In a distributed database system, the actual database and the DBMS software are distributed
across various sites that are connected by a computer network, as shown in Figure 7.
Figure 7: Example of a distributed database system.

Homogeneous distributed database systems


Homogeneous distributed database systems use the same DBMS software at multiple sites.
Data exchange between these various sites can be handled easily. For example, library
information systems by the same vendor, such as Geac Computer Corporation, use the same
DBMS software which allows easy data exchange between the various Geac library sites.

Heterogeneous distributed database systems


In a heterogeneous distributed database system, different sites might use different DBMS
software, but there is additional common software to support data exchange between these sites.
For example, the various library database systems use the same machine-readable cataloguing
(MARC) format to support library record data exchange.
