Database Notes

Database management system:
A database management system is the software system that allows users to define, create and
maintain a database and provides controlled access to the data. A Database Management System
(DBMS) is basically a collection of programs that enables users to store, modify, and extract
information from a database as per the requirements. DBMS is an intermediate layer between
programs and the data. Programs access the DBMS, which then accesses the data. There are
different types of DBMS ranging from small systems that run on personal computers to huge
systems that run on mainframes. The following are main examples of database applications: •
Computerized library systems • Automated teller machines • Flight reservation systems •
Computerized parts inventory systems A database management system is a piece of software that
provides services for accessing a database, while maintaining all the required features of the data.
Commercially available Database management systems in the market are dbase, FoxPro, IMS and
Oracle, MySQL, SQL Servers and DB2 etc. These systems allow users to create update, and extract
information from their databases.Compared to a manual filing system, the biggest advantages to a
computerized database system are speed, accuracy, and’ accessibility.
Advantages of DBMS:
a)Controls Redundancy: In file system, each application has its own private files, which cannot
be shared between multiple applications. 1:his can often lead to considerable redundancy in the
stored data, which results in wastage of storage space. By having centralized database most of this
can be avoided. It is not possible that all redundancy should be eliminated. Sometimes there are
sound business and technical reasons for maintaining multiple copies of the same data. In a
database system, however this redundancy can be controlled.

b) Integrity can be enforced: Integrity of data means that data in database is always accurate,
such that incorrect information cannot be stored in database. In order to maintain the integrity of
data, some integrity constraints are enforced on the database. A DBMS should provide capabilities
for defining and enforcing the constraints.
c) Inconsistency can be avoided : When the same data is duplicated and changes are made at one
site, which is not propagated to the other site, it gives rise to inconsistency and the two entries
regarding the same data will not agree. At such times the data is said to be inconsistent. So, if the
redundancy is removed chances of having inconsistent data is also removed.
d) Data can be shared: As explained earlier, the data about Name, Class, Father __name etc. of
General_Office is shared by multiple applications in centralized DBMS as compared to file system
so now applications can be developed to operate against the same stored data. The applications
may be developed without having to create any new stored files.
e) Standards can be enforced : Since DBMS is a central system, so standard can be enforced
easily may be at Company level, Department level, National level or International level. The
standardized data is very helpful during migration or interchanging of data. The file system is an
independent system so standard cannot be easily enforced on multiple independent applications.
f) Restricts unauthorized access: When multiple users share a database, it is likely that some
users will not be authorized to access all information in the database. For example, account office
data is often considered confidential, and hence only authorized persons are allowed to access such
data. In addition, some users may be permitted only to retrieve data, whereas other are allowed
both to retrieve and to update. Hence, the type of access operation retrieval or update must also be
controlled. Typically, users or user groups are given account numbers protected by passwords,
which they can use to gain access to the database. A DBMS should provide a security and
authorization subsystem, which the DBA uses to create accounts and to specify account
restrictions. The DBMS should then enforce these restrictions automatically.
g)Solves Enterprise Requirement than Individual Requirement: Since many types of users
with varying level of technical knowledge use a database, a DBMS should provide a variety of
user interface. The overall requirements of the enterprise are more important than the individual
user requirements. So, the DBA can structure the database system to provide an overall service
that is “best for the enterprise”.
h) Provides Backup and Recovery: A DBMS must provide facilities for recovering from
hardware or software failures. The backup and recovery subsystem of the DBMS is responsible
for recovery. For example, if the computer system fails in the middle of a complex update program,
the recovery subsystem is responsible for making sure that the .database is restored to the state it
was in before the program started executing.
i)Cost of developing and maintaining system is lower: It is much easier to respond to
unanticipated requests when data is centralized in a database than when it is stored in a
conventional file system. Although the initial cost of setting up of a database can be large, but the
cost of developing and maintaining application programs to be far lower than for similar service
using conventional systems. The productivity of programmers can be higher in using non-
procedural languages that have been developed with DBMS than using procedural languages.
j) Data Model can be developed : The centralized system is able to represent the complex data
and interfile relationships, which results better data modeling properties. The data madding
properties of relational model is based on Entity and their Relationship, which is discussed in detail
in chapter 4 of the book.

k)Concurrency Control : DBMS systems provide mechanisms to provide concurrent access of
data to multiple users.
Disadvantages of DBMS
a) Complexity : The provision of the functionality that is expected of a good DBMS makes the
DBMS an extremely complex piece of software. Database designers, developers, database
administrators and end-users must understand this functionality to take full advantage of it. Failure
to understand the system can lead to bad design decisions, which can have serious consequences
for an organization.
b) Size : The complexity and breadth of functionality makes the DBMS an extremely large piece
of software, occupying many megabytes of disk space and requiring substantial amounts of
memory to run efficiently.
c) Performance: Typically, a File Based system is written for a specific application, such as
invoicing. As result, performance is generally very good. However, the DBMS is written to be
more general, to cater for many applications rather than just one. The effect is that some
applications may not run as fast as they used to.
d)Higher impact of a failure: The centralization of resources increases the vulnerability of the
system. Since all users and applications rely on the ~vailabi1ity of the DBMS, the failure of any
component can bring operations to a halt.
e) Cost of DBMS: The cost of DBMS varies significantly, depending on the environment and
functionality provided. There is also the recurrent annual maintenance cost.

f) Additional Hardware costs: The disk storage requirements for the DBMS and the database
may necessitate the purchase of additional storage space. Furthermore, to achieve the required
performance it may be necessary to purchase a larger machine, perhaps even a machine dedicated
to running the DBMS. The procurement of additional hardware results in further expenditure.
g) Cost of Conversion: In some situations, the cost oftlle DBMS and extra hardware may be
insignificant compared with the cost of converting existing applications to run on the new DBMS
and hardware. This cost also includes the cost of training staff to use these new systems and
possibly the employment of specialist staff to help with conversion and running of the system.
This cost is one of the main reasons why some organizations feel tied to their current systems and
cannot switch to modern database technology.
Type of Database System:
On the basis of the number of users:
• Single-user DBMS
• Multi-user DBMS
On the basis of the site location
• Centralized DBMS
• Parallel DBMS
• Distributed DBMS
• Client/server DBMS
The database system may be multi-user or single-user. The configuration of the hardware and the
size of the organization will determine whether it is a multi-user system or a single user system.
In single user system the database resides on one computer and is only accessed by one user at a
time. This one user may design, maintain, and write database programs. Due to large amount of
data management most systems are multi-user. In this situation the data are both integrated and
shared. A database is integrated when the same information is not recorded in two places. For
example, both the Library department and the Account department of the college database may
need student addresses. Even though both departments may access different portions of the
database, the students’ addresses should only reside in one place. It is the job of the DBA to make
sure that the DBMS makes the correct addresses available from one central storage area.
On the basis of site location

a)Centralized Database System:
The centralized database system consists of a single processor together with its associated data
storage devices and other peripherals. It is physically confined to a single location. Data can be
accessed from the multiple sites with the use of a computer network while the database is
maintained at the central site.
Disadvantages of Centralized Database System
• When the central site computer or database system goes down, then every one (users) is blocked
from using the system until the system comes back. • Communication costs from the terminals to
the central site can be expensive.
b)Parallel Database System:

Parallel database system architecture consists of a multiple Central Processing Units (CPUs) and
data storage disk in parallel. Hence, they improve processing and Input/Output (I/O) speeds.
Parallel database systems are used in the application that have to query extremely large databases
or that have to process an extremely large number of transactions per second.
Advantages of a Parallel Database System
• Parallel database systems are very useful for the applications that have to query extremely large
databases (of the order of terabytes, for example, 1012 bytes) or that have to process an extremely
large number of transactions per second (of the order of thousands of transactions per second). •
In a parallel database system, the throughput (that is, the number of tasks that can be completed in
a given time interval) and the response time (that is, the amount of time it takes to complete a
single task from the time it is· submitted) are very high.
Disadvantages of a Parallel Database System
• In a parallel database system, there· is a startup cost associated with initiating a single process
and the startup-time may overshadow the actual processing time, affecting speedup adversely. •
Since process executing in a parallel system often access shared resources, a slowdown may result
from interference of each new process as it completes with existing processes for commonly held
resources, such as shared data storage disks, system bus and so on.
c)Distributed Database System:
A logically interrelated collection of shared data physically distributed over a computer network
is called as distributed database and the software system that permits the management of the
distributed database and makes the distribution transparent to users is called as Distributed
DBMS. It consists of a single logical database that is split into a number of fragments. Each
fragment is stored on one or more computers under the control of a separate DBMS, with the
computers connected by a communications network. As shown, in distributed database system,
data is spread across a variety of different databases. These are managed by a variety of different
DBMS software running on a variety of different operating systems. These machines are spread
(or distributed) geographically and connected together by a variety of communication
networks.
Advantages of Distributed Database System
• Distributed database architecture provides greater efficiency and better performance. • A single
database (on server) can be shared across several distinct client (application) systems. • As data
volumes and transaction rates increase, users can grow the system incrementally. • It causes less
impact on ongoing operations when adding new locations. • Distributed database system provides
local autonomy. Disadvantages of Distributed Database System • Recovery from failure is
more complex in distributed database systems than in centralized systems.
d)Client-Server DBMS:
Client/Server architecture of database system has two logical components namely client, and
server. Clients are generally personal computers or workstations whereas server is large
workstations, mini range computer system or a mainframe computer system. The applications and
tools of DBMS run on one or more client platforms, while the DBMS soft wares reside on the
server. The server computer is caned backend and the client’s computer is called front end. These
server and client computers are connected into a network. The applications and tools act as clients
of the DBMS, making requests for its services. The DBMS, in turn, processes these requests and
returns the results to the client(s). Client/Server architecture handles the Graphical User Interface
(GUI) and does computations and other programming of interest to the end user. The server
handles parts of the job that are common to many clients, for example, database access and updates.
Multi-Tier client server computing models
In a single-tier system the database is centralized, which means the DBMS Software and the data
reside in one location and the dumb terminals were used to access the DBMS as
shown. The rise of
personal computers in businesses during the 1980s, the increased reliability of networking
hardware causes Two-tier and Three-tier systems became common. In a two-tier system, different
software is required for the server and for the client. Illustrates the two-tier client server model. At
the early stages client server computing model was called two-tier-computing model in which
client is considered as data capture and validation tier and Server was considered as data storage
tier. This scenario is depicted.

Problems of two-tier architecture
The need of enterprise scalability challenged this traditional two-tier client-server model. In the
mid-1990s, as application became more complex and could be deployed to hundreds or thousands
of end-users, the client side, now undergoes with following
problems: • A’ fat’
client requiring considerable resources on client’s computer to run effectively. This includes disk
space, RAM and CPU. • Client machines require administration which results overhead.
Three-tier architecture
By 1995, three-tier architecture appears as improvement over two-tier architecture. It has three
layers, which are: • First Layer: User Interface which runs on end-user’s computer (the client) . •
Second Layer: Application Server It is a business logic and data processing layer. This middle
tier runs on a server which is called as Application Server. • Third Layer: Database Server It is a
DBMS, which stores the data required by the middle tier. This tier may run on a separate server
called the database server. As, described earlier, the client is now responsible for application’s user
interface, thus it requires less computational resources now clients are called as ‘thin client’ and it
requires less maintenance.
Advantages of Client/Server Database System
• Client/Server system has less expensive platforms to support applications that had previously
been running only on large and expensive mini or mainframe computers • Client offer icon-based
menu-driven interface, which is superior to the traditional command-line, dumb terminal interface
typical of mini and mainframe computer systems. • Client/Server environment facilitates in more
productive work by the users and making better use of existing
data.
• Client/Server database system is more flexible as compared to the Centralized system. •
Response time and throughput is high. • The server (database) machine can be custom-built
(tailored) to the DBMS function and thus can provide a better DBMS performance. • The client
(application database) might be a personnel workstation, tailored to the needs of the end users and
thus able to provide better interfaces, high availability, faster responses and overall improved ease
of use to the user. + A single database (on server) can be shared across several distinct client
(application) systems.
Disadvantages of Client/Server Database System
• Programming cost is high in client/server environments, particularly in initial phases. • There is
a lack of management tools for diagnosis, performance monitoring and tuning and security control,
for the DBMS, client and operating systems and networking environments.
Database Languages:
A DBMS must provide appropriate languages and interfaces for each category of users to express
database queries and updates. Database Languages are used to create and maintain database
on computer. There are large numbers of database languages like Oracle, MySQL, MS Access,
dBase, FoxPro etc. SQL statements commonly used in Oracle and MS Access can be categorized
as data definition language (DDL), data control language (DCL) and data manipulation language
(DML).
Types
a)Data Definition Language (DDL):
It is a language that allows the users to define data and their relationship to other types of data. It
is mainly used to create files, databases, data dictionary and tables within databases.It is also
used to specify the structure of each table, set of associated values with each attribute, integrity
constraints, security and authorization information for each table and physical storage structure
of each table on disk.
b)Data Manipulation Language (DML):
It is a language that provides a set of operations to support the basic data manipulation operations
on the data held in the databases. It allows users to insert, update, delete and retrieve data from
the database. The part of DML that involves data retrieval is called a query language.
c)Data Control Language (DCL):

DCL statements control access to data and the database using statements such as GRANT and
REVOKE. A privilege can either be granted to a User with the help of GRANT statement. The
privileges assigned can be SELECT, ALTER, DELETE, EXECUTE, INSERT, INDEX etc. In
addition to granting of privileges, you can also revoke (taken back) it by using REVOKE
command.
File Processing System:

File processing systems was an early attempt to computerize the manual filing system that we are
all familiar with. A file system is a method for storing and organizing computer files and the data
they contain to make it easy to find and access them. File systems may use a storage device such
as a hard disk or CD-ROM and involve maintaining the physical location of the files. In our own
home, we probably have some sort of filing system, which contains receipts, guarantees,
invoices, bank statements, and such like. When we need to look something up, we go to the
filing system and search through the system starting from the first entry until we find what we
want. Alternatively, we may have an indexing system that helps to locate what we want more
quickly. For example we may have divisions in the filing system or separate folders for different
types of item that are in some way logically related.The manual filing system works well when
the number of items to be stored is small. It even works quite adequately when there are large
numbers of items and we have only to store and retrieve them. However, the manual filing
system breaks down when we have to cross-reference or process the information in the files. For
example, a typical real estate agent’s office might have a separate file for each property for sale
or rent, each potential buyer and renter, and each member of staff. Clearly the manual system is
inadequate for this’ type of work. The file based system was developed in response to the needs
of industry for more efficient data access. In early processing systems, an organization’s
information was stored as groups of records in separate files.In the traditional approach, we used
to store information in flat files which are maintained by the file system under the operating
system’s control. Here, flat files are files containing records having no structured relationship
among them. The file handling which we learn under C/C ++ is the example of file processing
system. The Application programs written in C/C ++ like programming languages go through the
file system to access these flat.
Characteristics of File Processing System:

• It is a group of files storing data of an organization. • Each file is independent from one
another. • Each file is called a flat file. • Each file contained and processed information for one
specific function, such as accounting or inventory. • Files are designed by using programs
written in programming languages such as COBOL, C, C++. • The physical implementation and
access procedures are written into database application; therefore, physical changes resulted in
intensive rework on the part of the programmer. • As systems became more complex, file
processing systems offered little flexibility, presented many limitations, and were difficult to
maintain.
Limitations of the File Processing System I File-

Based Approach:
1.Separated and Isolated Data: To make a decision, a user might need data from two separate
files. First, the files were evaluated by analysts and programmers to determine the specific data
required from each file and the relationships between the data and then applications could be
written in a programming language to process and extract the needed data. Imagine the work
involved if data from several files was needed.
2.Duplication of data: Often the same information is stored in more than one file. Uncontrolled
duplication of data is not required for several reasons, such as: • Duplication is wasteful. It costs
time and money to enter the data more than once • It takes up additional storage space, again
with associated costs. • Duplication can lead to loss of data integrity; in other words the data is
no longer consistent. For example, consider the duplication of data between the Payroll and
Personnel departments. If a member of staff moves to new house and the change of address is
communicated only to Personnel and not to Payroll, the person’s pay slip will be sent to the
wrong address. A more serious problem occurs if an employee is promoted with an associated
increase in salary. Again, the change is notified to Personnel but the change does not filter
through to Payroll. Now, the employee is receiving the wrong salary. When this error is detected,
it will take time and effort to resolve. Both these examples, illustrate inconsistencies that may
result from the duplication of data. As there is no automatic way for Personnel to update the data
in the Payroll files, it is difficult to foresee such inconsistencies arising. Even if Payroll is
notified of the changes, it is possible that the data will be entered incorrectly.
3. Data Dependence: In file processing systems, files and records were described by specific
physical formats that were coded into the application program by programmers. If the format of a
certain record was changed, the code in each file containing that format must be updated.
Furthermore, instructions for data storage and access were written into the application’s code.
Therefore, .changes in storage structure or access methods could greatly affect the processing or
results of an application. In other words, in file based approach application programs are data
dependent. It means that, with the change in the physical representation (how the data is
physically represented in disk) or access technique (how it is physically accessed) of data,
application programs are also affected and needs modification. In other words application
programs are dependent on the how the data is physically stored and accessed.
4. Difficulty in representing data from the user’s view: To create useful applications for the
user, often data from various files must be combined. In file processing it was difficult to
determine relationships between isolated data in order to meet user requirements.
5. Data Inflexibility: Program-data interdependency and data isolation, limited the flexibility of
file processing systems in providing users with ad-hoc information requests
6. Incompatible file formats: As the structure of files is embedded in the application programs,
the structures are dependent on the application programming language. For example, the
structure of a file generated by a COBOL program may be different from the structure of a file
generated by a ‘C’ program. The direct incompatibility of such files makes them difficult to
process jointly.
7. Data Security. The security of data is low in file based system because, the data is maintained
in the flat file(s) is easily accessible. For Example: Consider the Banking System. The Customer
Transaction file has details about the total available balance of all customers. A Customer wants
information about his account balance. In a file system it is difficult to give the Customer access
to only his data in the· file. Thus enforcing security constraints for the entire file or for certain
data items are difficult.
8. Transactional Problems. The File based system approach does not satisfy transaction
properties like Atomicity, Consistency, Isolation and Durability properties commonly known as
ACID properties.
9. Concurrency problems. When multiple users access the same piece of data at same interval
of time then it is called as concurrency of the system. When two or more users read the data
simultaneously there is ll( problem, but when they like to update a file simultaneously, it may
result in a problem.
10. Poor data modeling of real world. The file based system is not able to represent the
complex data and interfile relationships, which results poor data modeling properties.
Entity-Relationship Model:
In Entity-Relationship model a database is modeled as a collection of entities and relationship
among entities. The ER model views the real world as a construct of entities and association
between entities.
Entities
An entity is an object whose information is stored in the database. It is distinguishable from other
objects. For example: specific person, company, event, plant. In other words, any thing that may
‘have an independent existence and about which we intend to collect data is known as Entity. It
is also known as Entity type. Entities are the principal data object about which information is to
be collected. Entities are usually recognizable concepts, either concrete or abstract, such as
person, places, things, or events, which have relevance to the database. Some specific examples
of entities are EMPLOYEES, PROJECTS, and INVOICES. An entity is analogous to a table in
the relational model. Entities are classified as independent or dependent (in some methodologies,
the terms used are strong and weak, respectively). An independent entity is one that does not rely
on another for identification. A dependent entity is one that relies on another for identification.
An entity occurrence (also called an instance) is an individual occurrence of an entity. An
occurrence is analogous to a row in the relational table.
Entity Set
An entity set is a set of entities of the same type that share the same properties. For example: set
of all persons, companies, trees, holidays. In object oriented terminology entity is considered as
an object and entity set is considered as a class. For example, Student is an entity set or class and
a particular student is an entity or an object.
Special Entity Types

Associative entities (also known as intersection entities) are entities used to associate two or
more entities in order to reconcile a many-to-many relationship. Subtypes entities are used in
generalization hierarchies to represent a subset of instances of their parent entity, called the super
type, but which have attributes or relationships that apply only to the subset. Associative entities
and generalization hierarchies are discussed in more detail in next sections.
Relationships
A Relationship represents an association between two or more entities. Relationships are
classified in terms of degree, connectivity, cardinality, and existence. An example of a
relationship would be: • Employees are assigned to projects • Projects have subtasks •
Departments manage one or more projects
Attributes
Attributes describe the properties of the entity of which they are associated. A particular instance
of an attribute is a value. For example, “Ram” is one value of the attribute Name. The domain of
an attribute is the collection of all possible values an attribute can have. The domain of Name is a
character string.
Attribute
We can classify attributes as following types: • Simple • Composite • Single-values • Multi-
values • DerivedSimple Attribute: A simple attribute is an attribute composed of a single
component with an independent existence. Simple attributes cannot be further subdivided.
Examples of simple attributes include Sex, Age, and Salary etc. Simple attributes are sometimes
called atomic attributes. Composite Attribute: An attribute composed of multiple components,
each with an independent existence is called a composite attribute. Some attributes can be further
divided to yield smaller components with an independent existence of their own. For example,
the Address attribute can be composed of components like Street number, Area, City, Pin code
and so on. The decision to model the Address, Area, and City etc. is dependent on whether the
user view of the model refers to the Address attribute as a single unit or as individual
components. Composite attributes are
shown.
Si
ngle-valued Attribute: A single-valued attribute is one that holds a single value for a single
entity. The majority of attributes are single-valued for a particular entity. For example, the
Classroom entity has as single value for the room_number attribute and therefore the
room_number attribute is referred to as being single-valued. Multi-valued Attribute: A multi-
valued attribute is one that holds multiple values for a single entity. Some attribute has multiple
values for a particular entity. For example, a student entity can have multiple values for the
Hobby attribute-reading, music, movies and so on. A multi-valued attribute may have set of
numbers with upper and lower limits. For example, the Hobby attribute of a Student may have
between one and five values. In other words, a student may have a minimum of one hobby and
maximum of 5 hobbies. Derived Attribute: A derived attribute is one that represents a value
that is derivable from the value of a related attribute or set of attributes, not necessarily in the
same entity. Some attributes may be related for a particular entity. For example the Age attribute
can be derived from the date-of-birth attribute and therefore they are related. We refer the age
attribute as a derived attribute, the value of which is derived from the date-of-birth
attribute.
Degreeof a Relationship
The degree of a relationship is the number of entities associated with the relationship. The n-
ary relationship is the general form for degree n. Special cases are the binary, and ternary, where
the degree is 2, and 3, respectively. Binary relationships, the association between two entities are
the most common type in the real world. A recursive binary relationship occurs when an entity is
related to itself. An example might be “some employees are married to other employees”. A
ternary relationship involves three entities and is used when a binary relationship is inadequate.
Many modeling approaches recognize only binary relationships. Ternary orn-ary relationships
are decomposed into two or more binary relationships.

Connectivity and Cardinality
The connectivity of a relationship describes the mapping of associated entity instances in the
relationship. The values of connectivity are “one” or “many”. The cardinality of a relationship is
the actual number of related occurrences for each of the two entities. The basic types of
connectivity for relations are: • One to One (1: 1). • One to Many (1:M) • Many to One (M:l) •
Many to Many (M:M) A one-to-one (1:1) relationship is when at most one instance of an entity
A is associated with one instance of entity B. For example, “employees in the company are each
assigned their own office. For each employee there exists a unique office and for each office
there exists a unique employee. A one-to-many (1:M) relationships is when for one instance of
entity A, there are zero, one, or many instances of entity B, but for one instance of entity B, there
is only one instance of entity A. Examples of l:M relationships are: • A department has many
employees. • Each employee is assigned to one department. A many-to-one (M: 1) relationships
is when for one instance of entity A is associated with at most one instances of entity B, but for
one instance of entity B, there may be any number of instances of entity A. Examples of M: 1
relationships is • Many employees one department. A many-to-many (M: M) relationship,
sometimes called non-specific, is when for one instance of entity A, there are zero, one, or many
instances of entity B and for one instance of entity B there are zero, one, or many instances of
entity A. Examples are: • Employees can be assigned to no more than two projects at the same
time. • Projects must have assigned at least three employees. A single employee can be assigned
to many projects; conversely, a single project can have assigned to it many employees. Here, the
cardinality for the relationship between employees and projects is two and the cardinality
between project and employee is three. Many-to-many relationships cannot be directly translated
to relational tables but instead must be transformed into two or more one-to-many relationships
using associative entities.
Direction
The direction of a relationship indicates the originating entity of a relationship. The entity from
which a relationship originates is the parent entity; the entity where the relationship terminates is
the child entity. The type of the relation is determined by the direction of line connecting
relationship component and the entity. To distinguish different types of relation, we draw either a
directed line or an undirected line between the relationship set and the entity set. Directed line is
used to indicate one occurrence and undirected line is used to indicate many occurrences in a
relation as shown in next case. To illustrate these different types of relationships consider the
following entity sets: DEPARTMENT, MANAGER, EMPLOYEE, PROJECT. The

relationship between a DEPARTMENT and a MANAGER is usually one-to-one; there is only
one manager per department and a manager manages only one department. This relationship
between entities is shown in figure 4.3. Each entity is represent by a rectangle and a direct line
indicates the relationship between them. The relationship for MANAGER to DEPARTMENT
and from DEPARTMENT to MANAGER is both 1:1. Note that a one to one relationship
between two entity set does not imply that for an occurrence of an entity from one set at any time
there must be an occurrence of an entity in the other set. In the case of an organization, there
could be times when a department is without a manager or when an employee who is classified
as a manager may be without a department to manage. Some instance of one to one relationship
between the entities DEP ARTMENT and
MANAGER.
A one to many,
relationship exists from the entity MANAGER to the entity EMPLOYEE because there are
several employees reporting to the manager. As we just pointed out, there could be an occurrence
of the entity type MANAGER having zero occurrences of the entity type EMPLOYEE reporting
to him or her. A reverse relationship, from EMPLOYEE to MANAGER, would be many to one,
since a single manager may supervise many employees However, given an instance of the entity
set EMPLOYEE, there could be only one instance of the entity set MANAGER to whom that
employee reports (assuming that no employee reports to more than one manager). The
relationship between entities is illustrated and figure shows some instances of this
relationship.
The relationship
between the entity EMPLOYEE and the entity PROJECT can be derived as follows: Each
employee could be involved in a number of different projects, and a number of employees could
be working on a given project. This relationship Between EMPLOYEE and PROJECT is many
to many. It is illustrated and shows some instances of such a
relationship.
Entity Relationship Diagram:
1. Gathering information: This could be a written document that describes the system in question
with reasonable amount of details.
2. Producing ERD: ERD or Entity Relationship Diagram is a diagrammatic representation of the
description we have gathered about the system.
3. Designing the database: Out of the ERD we have created, it is very easy to determine the tables,
the attributes which the tables must contain and the relationship among these tables.
4. Normalization: This is a process of removing different kinds of impurities from the tables we
have just created in the above step.

How to Prepare an ERD
Step 1 Let us take a very simple example and we try to reach a fully organized database from it.
Let us look at the following simple statement: A boy eats an ice cream. This is a description of a
real word activity, and we may consider the above statement as a written document (very short, of
course).
Step 2 Now we have to prepare the ERD. Before doing that we have to process the statement a
little. We can see that the sentence contains a subject (boy), an object (ice cream) and a verb (eats)
that defines the relationship between the subject and the object. Consider the nouns as entities
(boy and ice cream) and the verb (eats) as a relationship. To plot them in the diagram, put the
nouns within rectangles and the relationship within a diamond. Also, show the relationship with a
directed arrow, starting from the subject entity (boy) towards the object entity (ice
cream).
Well, fine. Up to this point the ERD shows how boy and ice cream are related. Now, every boy
must have a name, address, phone number etc. and every ice cream has a manufacturer, flavor,
price etc. Without these the diagram is not complete. These items which we mentioned here are
known as attributes, and they must be incorporated in the ERD as connected
ovals.
But can only entities have attributes? Certainly not. If we want then the relationship must have
their attributes too. These attribute do not inform anything more either about the boy or the ice
cream, but they provide additional information about the relationships between the boy and the ice
cream.
Step 3 We are almost complete now. If you look carefully, we now have defined structures for at
least three tables like the

following:
However, this is still not a working database, because by
definition, database should be “collection of related tables.” To make them connected, the tables
must have some common attributes. If we chose the attribute Name of the Boy table to play the
role of the common attribute, then the revised structure of the above tables become something like
the following. This is as complete
as it can be. We now have information about the boy, about the ice cream he has eaten and about
the date and time when the eating was done.

Cardinality of Relationship While creating relationship between two entities, we may often need
to face the cardinality problem. This simply means that how many entities of the first set are related
to how many entities of the second set. Cardinality can be of the following three types.
One-to-One Only one entity of the first set is related to only one entity of the second set. E.g. A
teacher teaches a student. Only one teacher is teaching only one student. This can be expressed in
the followingdiagram
as:
One-to-Many Only one entity of the first set is related to multiple entities of the second set.
E.g. A teacher teaches students. Only one teacher is teaching many students. This can be expressed
in the followingdiagram
as:
Many-to-One Multiple entities of the first set are related to multiple entities of the second set.
E.g. Teachers teach a student. Many teachers are teaching only one student. This can be expressed
in the followingdiagram
as:
Many-to-Many Multiple entities of the first set is related to multiple entities of the second set.
E.g. Teachers teach students. In any school or college many teachers are teaching many students.
This can be consideredas a two way one-to-many relationship. This can be expressed in the
following diagram
as: In
this discussion we have not included the attributes, but you can understand that they can be used
without any problem if we want to.
The Concept of Keys
A key is an attribute of a table which helps to identify a row. There can be many different types of
keys which are explained here.
Super Key or Candidate Key: It is such an attribute of a table that can uniquely identify a row in
a table. Generally they contain unique values and can never contain NULL values. There can be
more than one super key or candidate key in a table e.g. within a STUDENT table Roll and Mobile
No. can both serve to uniquely identify a student.
Primary Key: It is one of the candidate keys that are chosen to be the identifying key for the entire
table. E.g. although there are two candidate keys in the STUDENT table, the college would
obviously use Roll as the primary key of the table.
Alternate Key: This is the candidate key which is not chosen as the primary key of the table. They
are named so because although not the primary key, they can still identify a row.
Composite Key: Sometimes one key is not enough to uniquely identify a row. E.g. in a single
class Roll is enough to find a student, but in the entire school, merely searching by the Roll is not
enough, because there could be 10 classes in the school and each one of them may contain a certain
roll no 5. To uniquely identify the student we have to say something like “class VII, roll no 5”. So,
a combination of two or more attributes is combined to create a unique combination of values,
such as Class + Roll.
Foreign Key: Sometimes we may have to work with an attribute that does not have a primary key
of its own. To identify its rows, we have to use the primary attribute of a related table. Such a copy
of another related table’s primary key is called foreign key.
Strong and Weak Entity Based on the concept of foreign key, there may arise a situation when
we have to relate an entity having a primary key of its own and an entity not having a primary key
of its own. In such a case, the entity having its own primary key is called a strong entity and the
entity not having its own primary key is called a weak entity. Whenever we need to relate a strong
and a weak entity together, the ERD would change just a little. Say, for example, we have a
statement “A Student lives in a Home.” STUDENT is obviously a strong entity having a primary
key Roll. But HOME may not have a unique primary key, as its only attribute Address may be
shared by many homes (what if it is a housing estate?). HOME is a weak entity in this case. The
ERD of this statement would be like the
following
As you can see, the weak entity itself and the relationship linking a strong and weak entity must
have double border.
Different Types of Database
There are three different types of data base. The difference lies in the organization of the database
and the storage structure of the data. We shall briefly mention them here.
Relational DBMS
This is our subject of study. A DBMS is relational if the data is organized into relations, that is,
tables. In RDBMS, all data are stored in the well-known row-column format.
Hierarchical DBMS
In HDBMS, data is organized in a tree like manner. There is a parent-child relationship among
data items and the data model is very suitable for representing one-to-many relationship. To access
the data items, some kind of tree-traversal techniques are used, such as preorder traversal. Because
HDBMS is built on the one-to-many model, we have to face a little bit of difficulty to organize a
hierarchical database into row column format. For example, consider the following hierarchical
database that shows four employees (E01, E02, E03, and E04) belonging to the same department
D1. There are two ways to
represent the above one-to-many information into a relation that is built in one-to-one relationship.
The first is called Replication, where the department id is replicated a number of times in the table
like the following. Replication makes the same data
item redundant and is an inefficient way to store data. A better way is to use a technique called
the Virtual Record. While using this, the repeating data item is not used in the table. It is kept at
a separate place. The table, instead of containing the repeating information, contains a pointer to
that place where the data item is
stored. This
organization saves a lot of space as data is not made redundant. Network DBMS The NDBMS
is built primarily on a one–to-many relationship, but where a parent-child representation among
the data items cannot be ensured. This may happen in any real world situation where any entity
can be linked to any entity. The NDBMS was proposed by a group of theorists known as
the Database Task Group (DBTG). What they said looks like this: In NDBMS, all entities are
called Records and all relationships are called Sets. The record from where the relationship starts
is called the Owner Record and where it ends is called Member Record. The relationship or set
is strictly one-to-
many. In case
we need to represent a many-to-many relationship, an interesting thing happens. In NDBMS,
Owner and Member can only have one-to-many relationship. We have to introduce a third common
record with which both the Owner and Member can have one-to-many relationship. Using this
common record, the Owner and Member can be linked by a many-to-many relationship. Suppose
we have to represent the statement Teachers teach students. We have to introduce a third record,
suppose CLASS to which both teacher and the student can have a many-to-many relationship.
Using the class in the middle, teacher and student can be linked to a virtual many-to many
relationship.

Database Notes

Uploaded by

Copyright:

Available Formats

Database Notes

Uploaded by

Document Information

Original Description:

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Database Notes

Uploaded by

Copyright:

Available Formats

Database management system:

Computerized library systems • Automated teller machines • Flight reservation systems •

computerized database system are speed, accuracy, and’ accessibility.

database system, however this redundancy can be controlled.

for defining and enforcing the constraints.

redundancy is removed chances of having inconsistent data is also removed.

General_Office is shared by multiple applications in centralized DBMS as compared to file system

may be developed without having to create any new stored files.

independent system so standard cannot be easily enforced on multiple independent applications.

restrictions. The DBMS should then enforce these restrictions automatically.

that is “best for the enterprise”.

was in before the program started executing.

i)Cost of developing and maintaining system is lower: It is much easier to respond to

unanticipated requests when data is centralized in a database than when it is stored in a

in chapter 4 of the book.

data to multiple users.

DBMS an extremely complex piece of software. Database designers, developers, database

memory to run efficiently.

applications may not run as fast as they used to.

component can bring operations to a halt.

functionality provided. There is also the recurrent annual maintenance cost.

cannot switch to modern database technology.

Type of Database System:

On the basis of the number of users:

On the basis of the site location

On the basis of site location

maintained at the central site.

Disadvantages of Centralized Database System

the central site can be expensive.

b)Parallel Database System:

or that have to process an extremely large number of transactions per second.

Advantages of a Parallel Database System

Disadvantages of a Parallel Database System

c)Distributed Database System:

computers connected by a communications network. As shown, in distributed database system,

(or distributed) geographically and connected together by a variety of communication

Advantages of Distributed Database System

local autonomy. Disadvantages of Distributed Database System • Recovery from failure is

more complex in distributed database systems than in centralized systems.

Multi-Tier client server computing models

shown. The rise of

tier. This scenario is depicted.

of end-users, the client side, now undergoes with following

requires less maintenance.

Advantages of Client/Server Database System

productive work by the users and making better use of existing

• Client/Server database system is more flexible as compared to the Centralized system. •

Disadvantages of Client/Server Database System

• Programming cost is high in client/server environments, particularly in initial phases. • There is

b)Data Manipulation Language (DML):

c)Data Control Language (DCL):

File Processing System:

file system to access these flat.

Characteristics of File Processing System:

Limitations of the File Processing System I File-

involved if data from several files was needed.

physically represented in disk) or access technique (how it is physically accessed) of data,

file processing systems in providing users with ad-hoc information requests

data items are difficult.