DBMS Unit-1
DBMS Unit-1
DBMS Unit-1
Need of DBMS:
Database systems are basically developed for dealing with huge amount of data. There
are two things that require to do are: 1) Storage of data 2) retrieval of data.
Storage: According to the principles of database systems, the data is stored in such a
way that it acquires lot less space as the redundant data (duplicate data) has been
removed before storage.
Retrieval of data: Along with storing the data in an optimized and systematic
manner, it is also important that we retrieve the data quickly when needed. Database
systems ensure that the data is retrieved effectively and efficiently.
Databases are usually designed to manage large bodies of information. This involves
Definition of structures for information storage (data modeling).
Provision of mechanisms for the manipulation of information (File and systems
structure, query processing).
Providing for the safety of information in the database (crash recovery and
security).
Concurrency control if the system is shared by users.
Application of DBMS
The advantages of using a DBMS and the capabilities that a good DBMS
should possess are:
1. Controlling Redundancy
In traditional file processing, every user group maintains its own files
for handling its data-processing applications. This may lead to duplicate
some or all of the same data in their own files. This redundancy in storing
the same data multiple times leads to several problems.
This redundancy is prevented in a database as there is a single
database and any change in it is reflected immediately. Because of this, there
is no chance of encountering duplicate data.
Economies of Scale.
The DBMS approach permits consolidation of data
and applications, thus reducing the amount of wasteful overlap between
Advantage of DBMS
There are several advantages of Database management system over file system.
Data Consistency and Integrity: As we discussed earlier the root cause of data
inconsistency is data redundancy, since data normalization takes care of the data
redundancy, data inconsistency also been taken care of as part of it
Data Security: It is easier to apply access constraints in database systems so that only
authorized user is able to access the data. Each user has a different set of access thus data
is secured from the issues such as identity theft, data leaks and misuse of data.
Privacy: Limited access means privacy of data.
Easy access to data – Database systems manages data in such a way so that the data is
easily accessible with fast response times.
Easy recovery: Since database systems keeps the backup of data, it is easier
to do a full recovery of data in case of a failure.
Flexible: Database systems are more flexible than file processing systems.
DATA MODEL
A data model-a collection of concepts that can be used to describe the structure of a
database
provides the necessary means to achieve this abstraction.
structure of a database, we mean the data types, relationships, and constraints that should
hold for the data.
Most data models also include a set of basic operations for specifying retrievals and
updates on the database.
Data Model gives us an idea that how the final system will look like after its complete
implementation. It defines the data elements and the relationships between the data elements.
Data M
odels are used to show how data is stored, connected, accessed and updated in the database
management system.
The Relational model is the most widely used model. Apart from the Relational model,. Some of
the Data Models in DBMS are:
1. Hierarchical Model
2. Network Model
3. Entity-Relationship Model
4. Relational Model
Hierarchical Model
Hierarchical Model was the first DBMS model. This model organises the data in the
hierarchical tree structure. The hierarchy starts from the root which has root data and then it
expands in the form of a tree adding child node to the parent node. This model easily represents
some of the real-world relationships like food recipes, sitemap of a website etc. Example: We
Network Model
This model is an extension of the hierarchical model. It was the most popular model
before the relational model. This model is the same as the hierarchical model, the only difference
is that a record can have more than one parent. It replaces the hierarchical tree with a
graph. Example: In the example below we can see that node student has two parents i.e. CSE
Department and Library. This was earlier not possible in the hierarchical model.
Entity-Relationship Model
Entity-Relationship Model or simply ER Model is a high-level data model diagram. In
this model, we represent the real-world problem in the pictorial form to make it easy for the
stakeholders to understand. It is also very easy for the developers to understand the system by
just looking at the ER diagram. We use the ER diagram as a visual tool to represent an ER
Model. ER diagram has the following three components:
Entities: Entity is a real-world thing. It can be a person, place, or even a
concept. Example: Teachers, Students, Course, Building, Department, etc are some of the
entities of a School Management System.
Attributes: An entity contains a real-world property called attribute. This is the
characteristics of that attribute. Example: The entity teacher has the property like teacher
id, salary, age, etc.
Relationship: Relationship tells how two attributes are related. Example: Teacher works
for a department.
Example:
In the above diagram, the entities are Teacher and Department. The attributes of Teacher entity
are Teacher_Name, Teacher_id, Age, Salary, Mobile_Number. The attributes of
entity Department entity are Dept_id, Dept_name. The two entities are connected using the
relationship. Here, each teacher works for a department.
Features of ER Model
Graphical Representation for Better Understanding: It is very easy and simple to
understand so it can be used by the developers to communicate with the stakeholders.
ER Diagram: ER diagram is used as a visual tool for representing the model.
Advantages of ER Model
Simple: Conceptually ER Model is very easy to build. If we know the relationship
between the attributes and the entities we can easily build the ER Diagram for the model.
Effective Communication Tool: This model is used widely by the database designers for
communicating their ideas.
Easy Conversion to any Model: This model maps well to the relational model and can be
easily converted relational model by converting the ER model to the table. This model
can also be converted to any other model like network model, hierarchical model etc.
Disadvatages of ER Model
No industry standard for notation: There is no industry standard for developing an ER
model. So one developer might use notations which are not understood by other
developers.
Hidden information: Some information might be lost or hidden in the ER model. As it is
a high-level view so there are chances that some details of information might be hidden.
Relational Model
Relational Model is the most widely used model. In this model, the data is maintained
in the form of a two-dimensional table. All the information is stored in the form of row and
columns. The basic structure of a relational model is tables. So, the tables are also
called relations in the relational model. Example: In this example, we have an Employee table.
Levels of abstraction
1. Data Abstraction
The major purpose of a database system is to provide users with an
abstract view of the system. The system hides certain details of how data
is stored and maintained.
Complexity should be hidden from database users.
There are three levels of abstraction:
Physical Level:
Deals with how the data are stored.
E.g. index, B-tree, hashing.
Lowest level of abstraction.
Complex low-level structures described in detail.
Conceptual Level:
Next highest level of abstraction.
Describes what data are stored.
View Level:
Highest level.
Describes part of the database for a particular group of users.
Can be many different views of a database.
E.g. tellers in a bank get a view of customer accounts, but not of payroll
data.
Three levels of abstraction.
Physical data independence helps you to separate conceptual levels from the internal/physical
levels. It allows you to provide a logical description of the database without the need to specify
physical structures. Compared to Logical Independence, it is easy to achieve physical data
independence.
With Physical independence, you can easily change the physical storage structures or devices
with an effect on the conceptual schema. Any change done would be absorbed by the mapping
between the conceptual and internal levels. Physical data independence is achieved by the
Due to Physical independence, any of the below change will not affect the conceptual layer.
1. External views
2. External API or programs
Any change made will be absorbed by the mapping between external and conceptual levels.
Due to Logical independence, any of the below change will not affect the external layer.
Database Administrators
Database Designers
Database designers are responsible for identifying the data to be stored in the data-base
and for choosing appropriate structures to represent and store this data
Database designers typically interact with each potential group of users and
develop views of the database that meet the data and processing requirements of these
groups.
End Users
End users are the people whose jobs require access to the database for
querying, updating, and generating reports; the database primarily exists for
their use. There are several categories of end users:
Casual end users occasionally access the database, but they may need
different information each time. They use a sophisticated database query
language to specify their requests and are typically middle- or high-level
managers or other occasional browsers.
Responsibilities of DBA:
The day-to-day activities that a DBA perform include:
In addition to those who design, use, and administer a database, others are associated
with the design, development, and operation of the DBMS software and
system environment.
DBMS system designers and implementers design and implement the DBMS modules
and interfaces as a software package.
Tool developers design and implement tools—the software packages that facilitate
database modeling and design, database system design, and improved performance.
Database Architecture:
The architecture of a database system is greatly influenced by the underlying computer
system on which the database system runs.
Database systems can be centralized, or client-server, where one server machine
executes work on behalf of multiple client machines.
Database systems can also be designed to exploit parallel computer architectures.
Distributed databases span multiple geographically separated machines.
A database system is partitioned into modules that deal with each of the responsibilities
of the overall system. The functional components of a database system can be broadly
divided into the storage manager and the query processor components.
The storage manager is important because databases typically require a large amount of
storage space. The query processor is important because it helps the database system
simplify and facilitate access to data.
It is the job of the database system to translate updates and queries written in a
nonprocedural language, at the logical level, into an efficient sequence of operations at
the physical level.
Database applications are usually partitioned into two or three parts, as in Figure 1.4. In a
two-tier architecture, the application resides at the client machine, where it invokes
database system functionality at the server machine through query language statements.
Application program interface standards like ODBC and JDBC are used for interaction
between the client and the server.
In contrast, in a three-tier architecture, the client machine acts as merely a front end and
does not contain any direct database calls. Instead, the client end communicates with an
application server, usually through a forms interface.
The application server in turn communicates with a database system to access data. The
business logic of the application, which says what actions to carry out under what
conditions, is embedded in the application server, instead of being distributed across
multiple clients. Three-tier applications are more appropriate for large applications, and
for applications that run on the WorldWideWeb.
Query Processor:
The query processor components include
· DDL interpreter, which interprets DDL statements and records the definitions in the data
dictionary.
Storage Manager:
A storage manager is a program module that provides the interface between the lowlevel
data stored in the database and the application programs and queries submitted to the
system. The storage manager is responsible for the interaction with the file manager.
The raw data are stored on the disk using the file system, which is usually provided by a
conventional operating system. The storage manager translates the various DML
statements into low-level file-system commands. Thus, the storage manager is
responsible for storing, retrieving, and updating data in the database.
The storage manager components include:
· Authorization and integrity manager, which tests for the satisfaction of integrity constraints
and checks the authority of users to access data.
·
· Transaction manager, which ensures that the database remains in a consistent (correct) state
despite system failures, and that concurrent transaction executions proceed without conflicting.
·
· File manager, which manages the allocation of space on disk storage and the data structures
used to represent information stored on disk.
·
· Buffer manager, which is responsible for fetching data from disk storage into main memory,
and deciding what data to cache in main memory. The buffer manager is a critical part of the
database system, since it enables the database to handle data sizes that are much larger than the
size of main memory.
Entity-Relationship Model
An entity-relationship model (ERM) is a data model which represents data as a
collection of entities and their relationships.
In this, data is represented as a collection of entities and relationships among the entities.
An entity is a thing or object in the real world that is distinguishable from other objects.
Each entity is described by set of attributes.
E.g. Customer entity is described by the attributes cust_no, cust_name, and address.
A relationship is an association among several entities.
e.g. A CustAcct relationship associates a Customer entity with Account entity.
The set of all entities of the same type is called the entity set and the set of relationships
of the same type is called relationship set.
The overall logical structure of a database can be expressed graphically by an E-R
diagram:
Different notations used for ER Diagram
Attributes
Attributes are the properties of entities. Attributes are represented by means of ellipses.
Every ellipse represents one attribute and is directly connected to its entity (rectangle).
Attributes are categorized into following types:
Simple attribute – is an attribute that cannot be divided into sub parts.
E.g. AccountNumber
Composite attribute – is an attribute that can be divided into sub parts.
E.g. Name can be divided into sub parts as first_ name, middle _name and last_ name.
Single valued and Multi valued attributes – if an attribute takes only one value then it is
single valued attribute. if an attribute takes more than one value then it is multi valued
attribute.
E.g. AccountNumber is single valued attribute and PhoneNumber is multi valued
attribute.
Derived attribute – the value for this type of attribute can be derived from the values of
other related attributes.
E.g. Age is a derived attribute as its value can be derived from Date_of_birth attribute.
Key attribute
Key is a attribute or set of attributes that uniquely identify an entity set from set of
entities.
E.g. suppose student entity consists of the attributes (Roll_no, Name, DOB, Address)
then the attribute Roll_no identifies uniquely a student from set of students. Hence
Roll_no is the key for student entity.
Transforming an E-R diagram with aggregation into tabular form is easy. We create a
table for each entity and relationship set as before.
The table for relationship set uses contains a column for each attribute in the primary key
of machinery and work.
A strong entity set with only simple attributes will require only one table in relational model.
Attributes of the table will be the attributes of the entity set.
The primary key of the table will be the key attribute of the entity set.
Example-
Example-
A strong entity set with any number of multi valued attributes will require two tables in relational
model.
One table will contain all the simple attributes with the primary key.
Other table will contain the primary key and all the multi valued attributes.
Example-
Roll_no City
Roll_no Mobile_no
NOTE-
If we consider the overall ER diagram, three tables will be required in relational model-
One table for the entity set ―Employee‖
One table for the entity set ―Department‖
One table for the relationship set ―Works in‖
NOTE- Here, combined table will be drawn for the entity set B and relationship set R.
NOTE- Here, combined table will be drawn for the entity set A and relationship set R.
Here, two tables will be required. Either combine ‗R‘ with ‗A‘ or ‗B‘
Way-01: Way-02:
1. A ( a1 , a2 ) 1.AR ( a1 , a2 , b1 )
2. BR ( a1 , b1 , b2 ) 2. B ( b1 , b2 )
Rule-06: For Binary Relationship With Both Cardinality Constraints and Participation
Constraints-
Case-01: For Binary Relationship With Cardinality Constraint and Total Participation
Constraint From One Side-
Because cardinality ratio = 1 : n , so we will combine the entity set B and relationship set R.
Then, two tables will be required-
1. A ( a1 , a2 )
2. BR ( a1 , b1 , b2 )
Because of total participation, foreign key a1 has acquired NOT NULL constraint, so it can‘t be
null now.
Case-02: For Binary Relationship With Cardinality Constraint and Total Participation
Constraint From Both Sides-
If there is a key constraint from both the sides of an entity set with total participation, then that
binary relationship is represented using only single table.
ARB(a1,a2,b1,b2)
-
Rule-07: For Binary Relationship With Weak Entity Set-
Weak entity set always appears in association with identifying relationship with total
participation constraint.
Problem-01:
Find the minimum number of tables required for the following ER diagram in relational model-
Solution-
Applying the rules, minimum 3 tables will be required-
MR1 (M1 , M2 , M3 , P1)
P (P1 , P2)
NR2 (P1 , N1 , N2)
Problem-02:
Find the minimum number of tables required to represent the given ER diagram in relational
model-
Solution-
Applying the rules, minimum 4 tables will be required-
AR1R2 (a1 , a2 , b1 , c1)
B (b1 , b2)
C (c1 , c2)
R3 (b1 , c1)
Solution-
Applying the rules, minimum 5 tables will be required-
Problem-04:
Find the minimum number of tables required to represent the given ER diagram in relational
model-
Solution-
Applying the rules, minimum 3 tables will be required-
E1 (a1 , a2)
E2R1R2 (b1 , b2 , a1 , c1 , b3)
E3 (c1 , c2)
Solution-
Applying the rules that we have learnt, minimum 6 tables will be required-
Account (Ac_no , Balance , b_name)
Branch (b_name , b_city , Assets)
Loan (L_no , Amt , b_name)
Borrower (C_name , L_no)
Customer (C_name , C_street , C_city)
Depositor (C_name , Ac_no)