DBMS 1st Unit

Download as pdf or txt
Download as pdf or txt
You are on page 1of 49

Database Management

System
By:
Dr. Shailendra Prakash Saxena
Associate Professor, CSE Department
IIMT College of Engg., Greater Noida
SYLLABUS
• Unit I
• Introduction: An overview of database management
system, database system Vs file system, Database system
concept and architecture, data model schema and
instances, data independence and database language
and interfaces, data definitions language, DML, Overall
Database Structure.

• Data Modeling using the Entity Relationship Model:


ER model concepts, notation for ER diagram, mapping
constraints, keys, Concepts of Super Key, candidate key,
primary key, Generalization, aggregation, reduction of
an ER diagrams to tables, extended ER model,
relationship of higher degree.
Course Outcomes

• On successful completion of the course, the students will be able to:

• CO1. : Able to master the basic concepts and understand the


applications of database systems.
• CO2.: Able to construct an Entity-Relationship (E-R) model from
specifications and to perform the
• transformation of the conceptual model into corresponding logical
data structures.
• CO3.: Able to understand the basic database storage structures and
access techniques.
• CO4.: Able to distinguish between good and bad database design,
apply data normalization principles, and
• be aware of the impact of data redundancy on database integrity
and maintainability.
• CO5. : Able to apply database transaction management and
database recovery
An overview of Database Management
System
• A database is a logically coherent organized collection of similar
data. Similar data refers to the data based on same context.
• For Example: An ‘Employee’ database contains similar data for all
employees and every employee's entry contains a similar type of
information.
• The database serves as a base from which the desired information
can be retrieved, conclusions can be drawn and decisions can be
made.
• A Database Management System (DBMS) is a collection of
interrelated data and a set of programs to access those data.
• DBMS is the combination of two words:
Database + Management System = DBMS
A database is a collection of related information stored, so that can
be available to many users for different purposes. Management
system is a collection of programs that enables users to create and
maintain the database.
Introduction to DBMS
• DBMS is a collection of programs that enables users to create
and maintain a database or we can say that DBMS is a general
purpose software system that facilitates the process of
defining, construction, manipulation and sharing database
among various users and applications.
• Defining a database involves specifying the data types,
structures and for the data to be stored in database.
constraints
• Constructing the database is the process of storing the data
itself on some storage medium that is controlled to PR.
• Manipulating database includes functions querying the
database to retrieve specific data, update the database and
generating reports from the data.
• Sharing a database allows multiple users and programs to
access the database concurrently.
Introduction to DBMS
Components of DBMS
• This system involves data, the hardware that physically stores data, the
software that utilizes the hardware's file system in order to store the data
and provide a method to retrieve or change the data and finally the user
who turn the data into information.
• 1. User: In DBMS there are three classes of users are considered:
• (i) Application programmers: Develop the application program. These
programs can manipulate the database in all possible ways.
• (ii) End Users: Access the database from a terminal using a query language
or through application program.
• (iii) Data base Administrator (DBA): is the person responsible for design,
construction and maintenance of a database.
• 2. Software: Software includes the DBMS, operating, System, Network
Software (If necessary) and the application program.
• 3. Hardware: IT can range from a PC to a network of computers. It also
includes various storage devices (Like hard disk) and input – output
devices.
• 4. Data: Data can be numerical data (Numbers, floating point numbers) and
non numerical data (Character, data or logical data).
Advantages of DBMS
• (i) Reduction in
Data Redundancy:
Data redundancy
refers to the
duplication of data.
In non database
system, each
application has its
own separate files.

• Ex. : In College data


base various files are
manages as:
Advantages of DBMS
• It is clear from above,
that in file system. We
have to repeat common
data in all files that
results in problem of
redundancy and causes
wastage of space.
• But in case of centralized
database, data can be
shared by number of
applications, so college
can maintain data base
as:
Advantages of DBMS
• Clearly we can see in above database Roll No.,
Name, Class, Father Name, Address are stored
repeatedly in file system in each application, need
not to be stored repeatedly in case of DBMS.
• In data base every other application can access this
information with the help of common entry, Roll
No.
• So, we can say that centralized system of DBMS
reduces the redundancy to a great extent but
cannot eliminate it, as Roll No. is still repeated in
all relations.
Advantages of DBMS
• (ii) Integrity can be enforced:
Integrity of data means that data in database is accurate, i.e.
incorrect information cannot be stored in database.
For Example: In College date base – suppose that College
having only BA, BCA, BBA classes. But if an user enters the
class MCA, then this in correct information must not be stored
in database and must be prompted that this is a an invalid
data entry.
- In case of File system, this integrity constraint is applied to
all the applications a separately.
- In database, this integrity constraint is applied once on the
class field of general office. So, Integrity constraint can be
easily enforced in centralized DBMS system as compared to
file system.
Advantages of DBMS
• (iii)Inconsistency can be avoided: When the same
data is duplicated and changes are made at one
site, and other remains unchanged, then it gives
rise to inconsistency.
• (iv) Data can be shared: Data can be shared by
multiple applications in centralized DBMS.
• (v) Enforcement of Standards: In data base
system data is stored at one central place, so
standards can be easily enforced by DBA. This
ensures standardized data formats to facilitate
data transfer between systems.
Advantages of DBMS
• (vi)Improved Security: Database Security means
protecting the data in the database from unauthorized
uses. Data security checks can be carried out
whenever access is attempted to sensitive data.
• (vii) Improved backup & Recovery facility: Through
the recovery and backup sub systems, database
system provides the facility for recovering from
hardware and software failure. The recovery sub
system ensures that the database is restored in
position, where it was before the system crashes.
• (viii) Concurrency control: The database systems are
designed to manage concurrent access of database by
many users. They also prevent any loss of information
or loss of integrity due to these concurrent accesses
Disadvantages of Database Systems
• (i) Complexity Increases: The data structure may become more
complex because of centralized database, supporting many
applications in an organization.

• (ii) Complexity of Backup and Recovery: For a database system to


accurate and available all times, a procedure is required to be
developed and used for providing i.e. incorrect information and
backup copies to all its users, when damage.

• (iii)Cost of Software: There are many softwares which are used and
are very costly. Hence from economic point of view it is a drawback.

• (iv) Cost of Hardware: Cost of hardware increases, as it needs more


disk space, need additional manpower and also increase installation
and management etc.
Data Abstraction (View of Data)
• DBMS provide abstract view of data to users, i.e. the system hides
certain details of how the data are stored and maintained. Generally
there are three level of abstraction.

• (i) Physical Level (Internal Level): Lowest level of abstraction


describes This level describes how the data are actually stored and
complex low level data structures in detail

• (ii) Logical Level (Conceptual Level): next higher level of


abstraction describes what data are stored in database, and what
relationship exist among those data.

• (iii)View Level (External Level): Highest Level of abstraction


describes only a part of the entire database. The view level
simplifies the interaction with the system. The system may provide
many views for the same database
Data Abstraction (View of Data)
Schemas, Sub Schemas and Instances
• Database changes, when any information is inserted or deleted.
• The collection of information stores in the database at a particular
moment is called an instance of database.
• The overall design of the database is called, the database schema.
For Ex: A Student Schema:

Name Roll No. Class Marks Course Department

• The schema will remain same, while the values filled into it changes
from instant to instant.
• Subschema is a subset of schema having the same properties that a
schema has. Subschema allows the user to view only that part of the
database that is of interest to him.
Architecture of DBMS
• Architecture of DBMS can be classified as:

• (i) Application Architecture of DBMS.

• (ii) Schema architecture of DBMS.


Application Architecture of DBMS
• This is of two types:

• (a) Two Tier Architecture

• (b) Three Tier Architecture:


Two Tier Architecture
• In this architecture users are connected with database
through network. Two different machines are used
• Client Machine: on which database user works.
• Server Machine: on which the database system runs.

• In two tier architecture, the application is participated


into a component that resides at client machine and
invokes the database system functionality at the server
machine through query language statements.

• Application program interface standards like ODBC


and JDBC are used for interaction between the client
and the server.
Three Tier Architecture
• In three tier architecture, the client machine acts as
merely a front end and does not contain any direct
data base calls. Instead the client end communicates
with an application server, through forms interface.
• The application server in turn communicates with a
database system to access data. In this the business
logic of the application, i.e. 'what actions to carry
under what conditions' is embedded in the application
server, instead of being distributed among multiple
clients.
• Three tier applications are more appropriate for large
applications and for applications that runs on the
World Wide Web.
Two and Three Tier Architecture
Schema Architecture of DBMS
• There are three different types of schemas in the database
corresponding to each view of database.

• (a) External Schema: External view is described by means of a


scheme called external scheme that corresponds to different
views of data.

• (b) Conceptual Schema: Conceptual view is defined by


conceptual schema, which describes all the entities, attributes
and relationship together with integrity constraints.

• (c) Internal Schema: Internal view is defined by internal
schemes, which is complete description of the internal model,
containing definition of stored records, the methods of
representation, the data fields and the indexes used.
Schema Architecture of DBMS
Consider a College database schema:
Data Independence
• Three schema architecture provides data
Independence i.e. upper levels are unaffected by
changes in lower levels.
• Two kind of data independence is there.
• (i) Logical Data Independence: Logical data
independence indicates that the conceptual
schema can be changed without affecting the
external schemas.
• (ii) Physical data Independence: Physical data
independence indicates that the physical storage
structures or devices can be changed without
affecting conceptual schema
Database Languages
• Database Languages are used to create and maintain database on
computer. There are large numbers of database languages like
oracle, MYSQL, MS Access, DBase, Fox Pro etc.
• SQL statements commonly used in oracle and MS Access can be
categorized as DDL, DCL and DML.
• (i) Data Definition Language (DDL): It is a language that allows
user to define data and their relationship to other types of data. It is
mainly used to create files, data bases, data dictionary and tables
within databases. Main tasks of DDL are:
• To create the database objects like tables, indexes etc.
• To modify database objects. Ex. we can change the data type of a
field or we can drop / add a field from / to the table.
• To destroy the data base objects.
• Example. Syntax: SQL > Create Table <Table Name> (FieldName
DataType1 (Width 1), FieldName2 Datatype2 (Width2), …..);
• Create Table Student (Name Varchar2 (12), Age Number (2),
Address Varchar2 (20), Class Number (2));
• This will create a table of students having field Name, Age, Address
and Class.
Database Languages
• (ii) Data Manipulation Language (DML): This language is used in
data manipulation operations on the data held in the databases. It
allows users to insert, update, delete, and retrieve data from the
database. Part of DML that used for data retrieval is called a query
language.
• DML Can be categorized as:
• (a) Procedural DML: It required a user to specify what data are
needed and how to get those data.
• (b) Non procedural DML: It is also called declarative DML. It
requires a user to specify what data are needed without specifying
how to get those data.
• Example. (i) Inserting new records in the database: New values can
be inserted into table by use of INSERT INTO <Table Name>
Command.
• SQL > Insert into Student (Name, Age, Address, Class) Values
(‘Vijay', 24, 'Kanpur', 1); & thus 1 row created in the database.
• (ii) Retrieve the data: Data filled in the table can be viewed or
displayed on the screen by 'SELECT' Command.
• SQL > Select * From Student;
Database Languages
• (iii)Data control Language (DCL): DCL
statements controls access to data and
database. Occasionally DCL statements
are grouped with DML Statements.
• Using DCL we can:
• 1. Commit: Save work done.
• 2. Roll back: Restore database to original
since last commit.
Data Models
• Data models can be defined as integrated collection of
concepts for describing and manipulating data,
relationships between data and constraints on the data
in an organization.
• A data model has three components.
• A Structural Part: Consisting of a set of rules
according to which, databases can be constructed.
• A Manipulative part: Defines types of operations that
are allowed on data.
• Set of Integrity rules: Which insures data is accurate.
• Data models can be categorized in three broad
categories;
• 1. Object Based Data Models.
• 2. Physical Data Models.
• 3. Record Based Data Model.
Object Based Data Models
• Object based data models use concepts such as
entities, attributes and relationships.
• An Entity is a distinct object (a person, place,
concepts, events etc.) in the organization.
• An attribute is a property that describes some
aspect of the object that we wish to record.
• A relationship is an association between entities.
• Some of the more common types of object based
data models are:
• Entity Relationship Model
• Object oriented Model
• Functional Model
Physical Data Models
• A physical data model describes how data
is stored in the computer, and how
information is represented such as record
structures, record ordering and access
parts.
Record based Logical Models
• Record based logical models are used in
describing data at the logical and view
levels.
• Three mostly accepted record based data
models are:
• Hierarchical Model
• Network Model
• Relational Model
Hierarchical Model
• This is one of the oldest data base model, dating from the late
1950s. One of the first hierarchical database information
management system was developed jointly by North
American Rockwell company and IBM.
• This model is like a structure of a tree with the records
forming the nodes and fields forming the branches of a tree.
• Consider a company’s organizational structure. At the top we
have a general manager (GM).
• Under him, we have several Deputy General Managers
(DGMs).
• Each DGM looks after a couple of departments and each
department will have a manager and many employees.
• When represented in a hierarchical model, there will be
separate rows for representing the GM, each DGM, each
department, each manager and each employee.
Hierarchical Model
Hierarchical Model
Hierarchical Model
Hierarchical Model
• Problem:
• Insert anomalies: It is not possible to insert the information of
supplier S4, who does not supply any part. This is because a
node can not exist without a root. However a part P5, which is
not supplied by any supplier, can be inserted without any
problem. We can say that insert anomaly exist only for those
children, which has no corresponding parent.
• Update anomalies: Update anomaly may exist only for
children, not for parent, because children may have multiple
entries in the database. Example. Suppose we wish to change
the city of supplier S1 from Gzb. to Agra, then we have to
carry out two operations.
• First, searching S1 for each part and then multiple updations
for each occurrence of S1. But city of Part P1 can be changed
easily from Gzb. to Agra, because there is only one entry for
part P1 and problem of inconstancy will not arise.
Advantages of Hierarchical model
• (a) Simplicity: In interracial structure, the
relationship between various layers is logically
simple.
• (b) Data Security: This was first database model
that offered data security.
• (c) Data Integrity: As this model is based on
parent / child relationship. So, there is always a
link between the parent and the child under it. So,
this model promotes integrity.
• (d) Efficiency: This model is very efficient when
database contains a large number of 1:N
relationship.
Network Model
• Formalized in 1971, by Database task group
(DBTG) of the conference on system languages.
This model allows multiple paths among
segments i.e. more then one parent child
relationship. Hence this model allows having 1:1
(one to one), 1:M (one to many) and M:M (Many
to many relationship).
• Network model is based on graph structure. A
network database consists of collection of records,
which are connected to each other by links.
• Record: A Record is a collection of attributes, each
contain only one data value.
• Link: A link is an association between two records.
So, network model is collection of graphs.
Network Model
Example.: Consider relation working for between Employee and Department.

These models were costly and great difficult of use.

Main difference of N/W model from hierarchical model is its ability


to handle many to many relations.
Relational Data Model
Relational Model stores data in the form of
tables. This concept is proposed by Dr. E. F.
Codd, a researcher of IBM in 1960s. The
relational model consists of three major
components:
• The set of relations and set of domains that
defines the way data can be represented
(data structure).
• Integrity rules that defines the procedure to
protect data (Data Integrity).
• The operations that can be performed on data
(Data Manipulation).
Relational Data Model
• Definition: A relational model database is defined as a database
that allows you to group its data item into one or more
independent tables that can be related to one another using fields
common to each related table.
• Relational database systems have the following characteristics:
• The whole data is represented as an orderly arrangement of data
into rows and columns, called a relation or table.
• All values are scalar i.e. at any given position (row / column), there
is one and only one value.
• Basic terminology:
•  Each row (record) of a relational table is called tuple.
•  Columns of the table represents attributes or fields.
•  The values for these attributes are called the domain.
•  Number of tuples in a relation determines its cardinality.
•  Number of attributes in a relation determines its degree.
Overall Database Structure
• Overall database structure can be portioned in four parts.

• (i) Database Users:


• a) Naïve Users: These are the users who interact with system
by invoking one of the application programs that have been
written earlier. Typical interface for new users is a form
Interface.
• b) Application programmer: These are the users who develop
the application programs.
• c) Sophisticated users: These are the users who interact with
the system without writing programs. Instead they form their
request in a database query language.
• d) Database Administrator (DBA): is a person with
responsibility of controlling & protecting the data. It also
coordinate the database design of implementation of data
security procedures, protect the integrity of data and make
sure that system performance is satisfactory.
Overall Database Structure
• (ii) Query Processor: Query processor helps in
simplify the database access. The query processor
components include:
• a) DDL Interpreter: It interprets DDL statements
and records the definitions in the data dictionary.
• b) DML Compiler: It translated DML statements in
a query language into an evaluation plan consisting
of low level instructions that the query evaluation
engine understand. A query can be translated into
many number of alternative evaluation plans that all
give the same result. DML compiler also performs
query optimization i.e. it picks lowest cost
evaluation plan among the alternatives.
• c) Query Evaluation Engine: It executes low level
instructions generated by DML complier.
Overall Database Structure
• (iii)Storage Manager:
• A storage manager is a program module that provides the interface
between the low level data stored in database and application programs
& queries submitted to the system. Storage manager translates various
DML statements into low level file system commands.
• This storage manager is responsible for storing, retrieving and updating
data in the database. The Storage manager components include:
• a) Authorization and Integrity Manager: It tests satisfaction of integrity
constraints and check the authority of users to access data.
• b) Transaction Manager: It ensures that database remains in consistent
state even in system failures and concurrent transaction execution
proceed without conflicting.
• c) File Manager: It manages the allocation of space on disk storage and
data structures used to represent information stored on disk.
• d) Buffer Manager: It is responsible for fetching data from disk storage
to main memory, and deciding what data to cache in main memory. It
enables database to handle data sizes that are much larger then size of
main memory.
Overall Database Structure
• (iv) Disk Storage:
• Storage manager implements several data
structures as part of physical system
implementation.
• a) Data Files: Stores the database itself.
• b) Data Dictionary: Stores metadata about the
structure of database, in particular schema of
database.
• c) Indices: It can provide fast access to database
like the index in any text book a database index
provides pointers to those data items, which
hold a particular value.
Overall Database Structure
Question Bank
• Q.1: Distinguish between a File Processing System and a DBMS.
• Q.2: What is the role of DBA?
• Q.3: What is the role of data independence in DBMS?
• Q.4: Draw a diagram of database system architecture.
• Q.5: What do you mean by data abstraction? Explain the
difference between physical level, conceptual level and view
level of abstraction.
• Q.6: What is the difference between logical data independence
and physical data independence?
• Q.7: Define following terms:
• (i) Data (ii) Database (iii) Database System (iv) End user
• (v) DDL (vi) DML(v ii) Data redundancy (viii) Key
• Q.8: What do you mean by External schema, Conceptual schema,
and Internal schema?
• Q.9: Explain three level architecture of DBMS in detail.

You might also like