Chapter Two: Database System Concepts and Architecture
Chapter Two: Database System Concepts and Architecture
Chapter Two: Database System Concepts and Architecture
Many data models have been proposed, and we can categorize them according to the types of
concepts they use to describe the database structure.
Conceptual (high-level) data models provide concepts that are close to the way many end users
perceive data. Conceptual Data Models use concepts such as entities, attributes, and
relationships.
Physical Data Models describes how data is stored in the computer by representing information
such as stored record formats, record orderings, and access paths. An access path is a
structure that makes the search for particular database records efficient.
representational (or implementation) data models, which provide concepts that may be easily
understood by end users but that are not too far removed from the way data is organized in
computer storage. Representational data models hide many details of data storage on disk but can
be implemented on a computer system directly.
Database Schema: It is the description of a database which is expected to depict the overall
design of the database including the data structure and constraints of the database. Database
schema is defined during the database design process and changes very rarely afterwards. It
could be looked at like a template or building plan for one or several database instances. The
terms intension and metadata are interchangeably used to mean database schema. Database
schema is of three types: subschema, logical schema and physical schema. Furthermore, there
could be unlimited numbers of subschema, only one logical and physical schema.
Database State: It is the actual content of a database at a particular moment in time. The terms
occurrence, database instance, snapshot and extension are interchangeably used to mean database
state. The database state when it is initially loaded into a system is said to be initial database
state. The database state changes every time the database is updated.A database state that
satisfies the structure and constraints of the database in database schema is called valid state.
Database management systems are complex software which were often developed and optimized
over years. From the view of the user, however, most of them have a quite similar basic
architecture. The discussion of this basic architecture shall help to understand the connection
with data modeling and the introductionally to this module postulated 'data independence' of the
database approach.
The goal of the three-schema architecture, illustrated in Figure 1, is to separate the user applications from
the physical database. In this architecture, schemas can be defined at the following three levels:
External Level schema: It describes part of a database that is relevant to a particular user.
Different users have their own customized view of the database independent of other users. It
describes the various user views, often a restricted view of a database. Entity/Object based
data models like ER could be used for this level.
Internal Level schema: It is all about the physical representation of the database on the
computer including how data is stored in the database. Physical data model is used. This
level is needed by DBMS implementers and maintainers.
Figure 1: Three Schema Architecture
Depending on how the three levels are set, there are centralized databases and decentralized
(client/server) database architectures.
Data Independence
The three-schema architecture can be used to explain the concept of data independence, which
can be defined as the capacity to change the schema at one level of a database system without
having to change the schema at the next higher level. We can define two types of data
independence:
Logical data independence is the capacity to change the conceptual schema without
having to change external schemas or application programs. We may change the
conceptual schema to expand the database (by adding a record type or data item), or to
reduce the database (by removing a record type or data item). In the latter case, external
schemas that refer only to the remaining data should not be affected. Only the view
definition and the mappings need be changed in a DBMS that supports logical data
independence. Application programs that reference the external schema constructs must
work as before, after the conceptual schema undergoes a logical reorganization. Changes
to constraints can be applied also to the conceptual schema without affecting the external
schemas or application programs.
Physical data independence is the capacity to change the internal schema without
having to change the conceptual (or external) schemas. Changes to the internal schema
may be needed because some physical files had to be reorganized - for example, by
creating additional access structures - to improve the performance of retrieval or update.
If the same data as before remains in the database, we should not have to change the
conceptual schema.
3. Database Language and Interface
Database Languages
So far, we have got to know about database, database management system descriptions and the
components of a database system. In this section, it is explained how 'a data gets into a database system'
and 'how the information gets to the users'. More correctly formulated the following questions will be
answered:
How does an application interact with a database management system?
How does a user look at a database system?
How can a user query a database system and view the results in his/her application?
Data Definition Language (DDL)
For describing data and data structures a suitable description tool, a data definition language (DDL), is
needed. With this help a data scheme can be defined and also changed later.
The DDL is used to define both the conceptual (e.g., relational) and external (e.g., views)
schemas.
SQL is used in the relational model for the DDL tasks which have three
subtasks for each of the three levels:
of physical (as storage definition language-SDL done by DBA), conceptual (DDL)
and external (as view definition language –VDL).
Typical DDL operations (with their respective keywords in the structured query language SQL):
Additionally a language for the descriptions of the operations with data like store, search, read, change,
etc. the so-called data manipulation, is needed. Such operations can be done with a data manipulation
language (DML). Within such languages keywords like insert, modify, update, delete, select, etc. are
common.
Typical DML operations (with their respective keywords in the structured query language SQL):
The application poses with the help of SQL, a query language, a query to the database system.
There, the corresponding answer (result set) is prepared and also with the help of SQL given
back to the application. This communication can take place interactively or be embedded into
another language.
Type and Use of the Database Interface
Following, two important uses of a database interface like SQL are listed:
Interactive: SQL can be used interactively from a terminal.
Embedded: SQL can be embedded into another language (host language) which might be used
to create a database application.
User Interfaces
A user interface is the view of a database interface that is seen by the user. User interfaces are
often graphical or at least partly graphical (GUI - graphical user interface) constructed and offer
tools which make the interaction with the database easier.
1. Form-based Interfaces
This interface consists of forms which are adapted to the user. He/She can fill in all of the fields
and make new entries to the database or only some of the fields to query the other ones. But
some operations might be restricted by the application. Form-based user interfaces are wide
spread and are a very important means of interacting with a DBMS. They are easy to use and
have the advantage that the user does not need special knowledge about database languages like
SQL.
Figure 3: Form-based Interfaces
2. Text-based Interfaces
To be able to administrate the database or for other professional users there are possibilities to
communicate with the DBMS directly in the query language (in code form) via an input/output
window.
Text-based interfaces are very powerful tools and allow a comprehensive interaction with a
DBMS. However, the use of these is based on active knowledge of the respective database
language.
Here, the hardware in a database environment means the computers and computer peripherals
that are being used to manage a database and the software means the whole thing right from the
operating system (OS) to the application programs that includes database management software
like M.S. Access or SQL Server. Again the people in a database environment include those
people who administrate and use the system. The techniques are the rules, concepts, and
instructions given to both the people and the software along with the data with the group of facts
and information positioned within the database environment.
In recent years, the newer object-oriented data models were introduced. This model is a database
management system in which information is represented in the form of objects as used in object-
oriented programming. Object-oriented databases are different from relational databases, which
are table-oriented. Object-oriented database management systems (OODBMS) combine database
capabilities with object-oriented programming language capabilities.
The object-oriented models have not caught on as expected so are not in widespread use. Some
examples of object-oriented DBMSs are O2, ObjectStore and Jasmine.
Centralized systems
With a centralized database system, the DBMS and database are stored at a single site that is
used by several other systems too. This is illustrated in Figure 6.1.
In the early 1980s, many Canadian libraries used the GEAC 8000 to convert their manual card
catalogues to machine-readable centralized catalogue systems. Each book catalogue had a
barcode field similar to those on supermarket products.