Database Management System: Prof. Dr. O.P.Vyas
Database Management System: Prof. Dr. O.P.Vyas
Database Management System: Prof. Dr. O.P.Vyas
Database
Management System
Prof. Dr. O.P.Vyas
M.Tech. ( Computer Sc. ) IIT Kharagpur
Ph.D. (Computer Sc) IIT Kharagpur
DAAD Fellow (Germany), AOTS Fellow(Japan)
Professor & Program Coordinator(SE)
Indian Institute of Information
Technology, Allahabad
Silberschatz, Korth and Sudarshan 3.2 Database System Concepts
2
We are pouring in
data but starving in
Knowledge
Jiawei Han
Silberschatz, Korth and Sudarshan 3.3 Database System Concepts
3
Database Management System
Data Information Knowledge
Chronology: Data & Knowledge Engineering
DBMS Definition & Application
Why DBMS ? Purpose
Developing Information Systems
Transaction Management
DBMS Types & users
DBMS languages
DDL Vs. DML
DBMS Architecture
Client-server based
Web-based architecture
DBMS Implementation
Data Modeling
E-R Diagram
From diagram to Tables
Data Base design
Data Querying: Concepts & Constructs
Query / Report format design, development.
Recent trends in DBMS
Silberschatz, Korth and Sudarshan 3.4 Database System Concepts
Our Approach.
Conceptual understanding & then Implementation...
Case Study & Modeling discussions
Through Class Assignments, Group presentations
Practical on SQL, PL/SQL on Oracle
Database Designing: Concepts & Tests.
Recent Trends in DBMS
Reference Material:
Database system concepts by Abraham Silberschatz ;
Henry F. Korth ; S. Sudarshan (Mc Graw Hill)
Fundamentals of Database Systems by Elmasri, Ramez;
Navathe, Shamkant B.
Book on SQL / Oracle : Complete Reference
4
Silberschatz, Korth and Sudarshan 3.5 Database System Concepts
Data, Information & Knowledge
Computer processing requires data, which is a collection of
raw facts, figures and symbols, such as numbers, words,
images, video and sound, given to the computer during the
input phase.
Computer processes the data to create Information which is
data that is organized, meaningful, and useful.
During the output Phase, the information that has been
created is put into some form, such as a printed report.
A DBMS processes the data & provide information.
Knowledge is not simply the information presented, but is
information further processed with intelligent mechanism
incorporating experience, domain knowledge and specialized
techniques.
To generate (business) insight / strategy based on the
Knowledge acquired from IT based systems. KDD (
Knowledge Discovery in Database) is an significant concept
related to data mining and business intelligence.
Silberschatz, Korth and Sudarshan 3.6 Database System Concepts
6
Data Engineering-Chronological Development
Evolutionary steps Business query Enabling
technologies
Product vendors Characteristics
Data
Collection
(1960s)
What was my
total revenue in
the last five
years
Computers,
tapes, disks
IBM Retrospective,
static data
delivery
Data access
(1980s)
What were A.C
unit sales in New
England last
March?
Relational
Databases, SQL,
ODBC
Oracle, Sybase,
Informix,
IBM,Microsoft
Retrospective
dynamic data
delivery at
record level.
Data
Warehousing
& Decision
support
systems1990s)
What were
A.C.unit sales in
New England
last March?Drill
down to
Boston.
OLAP, Multi -
dimensional
databases, DW
Pilot, Comshare,
Arbor,Cognos,
Microstrategy.
Retrospective,
dynamic data
delivery at
multiple levels.
Data Mining
(Emerging )
Whats likely to
happen to
Bostons
A.C.unit sales
next month?
Why ?
Advanced
algorithms,
multiprocessor
computers,
massive
databases.
Pilot, Lockheed,
IBM, SGI, Mineset
etc.
Prospective,
proactive
information
delivery.
Silberschatz, Korth and Sudarshan 3.7 Database System Concepts
7
Database Management System (DBMS)
Data Base - Collection of interrelated data
DBMS - Set of programs to access the data.
DBMS is designed to manage large bodies of data, contains
information about a particular enterprise.
Management of DB would involve Storage, Modify, Update &
Retrieval of Data as per requirements.
DBMS provides an environment that is both convenient and
efficient to use.
Database Applications:
Banking: all transactions
Airlines: reservations, schedules (first database in a geographical
distributed)
Universities: registration, grades
Sales: customers, products, purchases
Manufacturing: production, inventory, orders, supply chain
Human resources: employee records, salaries, tax deductions
Databases touch all aspects of our lives.
Silberschatz, Korth and Sudarshan 3.8 Database System Concepts
8
Database Management System
There are a large number of commercial database
systems in use today, which are based on the
DBMS concepts. The major ones are ;
Oracle
Microsoft Access
SQL server.
IBM DB2 .
Sybase
Informix.
Some are free / public domain database systems
MySQL, PostgresSQL
Silberschatz, Korth and Sudarshan 3.9 Database System Concepts
9
Database Management system
Purpose of Database Systems: Developing
Information system
View of Data
Data Models
Data Definition Language
Data Manipulation Language
Transaction Management
Storage Management
Database Administrator
Database Users
Overall System Structure
Silberschatz, Korth and Sudarshan 3.10 Database System Concepts
10
Developing Information systems
The purpose of a implementing a database system in any
organization is to develop an effective Information System in
the organization.
What is information system?
Information system is an computerization of some existing
manual /system which after automation provides useful
information to the targeted user.
One of the most popular example of an information system is
computerization of Railway Reservation Systems
Developed by CRIS [Centre for Railways Information
Systems] Is providing very valuable information like
1.Availability of trains between two stations .
2. Availability of Tickets in a particular Train, on particular date etc.
3. Booking a Ticket online.
Being used as Web based system it is sound implementation
of DBMS !
Silberschatz, Korth and Sudarshan 3.11 Database System Concepts
11
Information systems
Did we ever bother to see how this computerization
of railways reservation system would have first
conceived, designed, developed and implemented.
Database designing is important step for developing
such a system ,
Same is true for developing a business information
system for a big corporate house like Hindustan
Lever , Godrej, Reliance group of companies etc.
Database system has evolved a systematic
mechanism for this aspect of developing
information system and the very initial step is
called Modeling
Silberschatz, Korth and Sudarshan 3.12 Database System Concepts
Before Database Systems.
In the early days, database applications were built
directly on top of file systems with a program. language
Transaction Management is typical example which
highlights the significance of a DBMS.
In case of transferring funds from one bank A/C to
other if System failure occurs !
This failure may occur at any stage & the system
failures (e.g., power failures, operating system
crashes) will result not only as a transaction failure
but may be in serious inconsistencies in database.
Money withdrawn from one Account ( and network
fails) so not received in the Other A/C. !
Also in concurrent transactions how to ensure the
consistency of the database.
And many other problems.
Silberschatz, Korth and Sudarshan 3.13 Database System Concepts
Purpose of Database Systems
Drawbacks of using file systems to store data:
Data redundancy and inconsistency
Multiple file formats, duplication of information in
different files
Difficulty in accessing data
Need to write a new program to carry out each new task
Data isolation multiple files and formats
Integrity problems
Integrity constraints (e.g. account balance >0 ) become
buried in program code rather than being stated
explicitly
Hard to add new constraints or change existing ones
Silberschatz, Korth and Sudarshan 3.14 Database System Concepts
Purpose of Database Systems (Cont.)
Drawbacks of using file systems (cont.)
Atomicity of updates
Failures may leave database in an inconsistent state with
partial updates carried out
Example: Transfer of funds from one account to another
should either complete or not happen at all
Concurrent access by multiple users
Concurrent accessed needed for performance
Uncontrolled concurrent accesses can lead to inconsistencies
Example: Two people reading a balance and updating it at
the same time
Security problems
Hard to provide user access to some, but not all, data
Database Management systems offer solutions to all the above
problems
Silberschatz, Korth and Sudarshan 3.15 Database System Concepts
Developing an Information System
Developing the System:
Architecture of the Database
User Categories of Database
Software Tools for DBMS implementation
Database Design : Requires understanding the given
domain
Logical Design
Data Modeling: Types of Modeling
Physical design
Constructing the Model of given scenario:
Banking system.
Car Insurance Company database
15
Silberschatz, Korth and Sudarshan 3.16 Database System Concepts
Database Architecture
The architecture of a database systems is greatly
influenced by the underlying computer system
on which the database is running:
Centralized
Client-server
Parallel (multiple processors and disks)
Distributed
Data Base Application Architecture can be categorized as
Two-Tier and Three Tier Architecture
Silberschatz, Korth and Sudarshan 3.17 Database System Concepts
Database Application Architectures
(web browser)
Old
Modern
Silberschatz, Korth and Sudarshan 3.18 Database System Concepts
18
Database Users
Users are differentiated by the way they expect to
interact with the system
Application programmers interact with system
through DML calls
Sophisticated users give requests in a database
query language
Specialized users write specialized database
applications that do not fit into the traditional data
processing framework
Naive users invoke one of the permanent application
programs that have been written previously
E.g. people accessing database over the web, bank
tellers, clerical staff
Silberschatz, Korth and Sudarshan 3.19 Database System Concepts
19
Database Administrator
Coordinates all the activities of the database
system;
The database administrator has a good
understanding of the enterprises information
resources and needs.
Database administrator's duties include:
Schema definition
Storage structure and access method definition
Schema and physical organization modification
Granting user authority to access the database
Specifying integrity constraints
Monitoring performance and responding to changes in
requirements
Silberschatz, Korth and Sudarshan 3.20 Database System Concepts
20
Data base User categories & Architecture
Silberschatz, Korth and Sudarshan 3.21 Database System Concepts
21
Levels of Abstraction
Physical level describes how a record (e.g., customer) is stored.
Logical level: describes data stored in database, and the
relationships among the data.
type customer = record
name : string;
street : string;
city : integer;
end;
View level: application programs hide details of data types.
Views can also hide information (e.g., salary) for security
purposes.
Silberschatz, Korth and Sudarshan 3.22 Database System Concepts
22
View of Data
An Architecture for a database system
Silberschatz, Korth and Sudarshan 3.23 Database System Concepts
23
Data Base Management System
Ordinary Users interact
with DBMS via Menu in Front-End
Application like Visual Basic.
Whereas Sophisticated Users
& Database administrator
interacts with DBMS
via SQL interface or directly
with Oracle / MS SQL Server / IBM DB-2
Users
Front-end
Database
Application
DBMS
Database
Silberschatz, Korth and Sudarshan 3.24 Database System Concepts
24
DBMS Project Team
Data Specialists
--> Modeler
--> Architect
Developers
FRONT END
(VB, VC++, JAVA)
DBAs
BACK END
( Oracle,
SQL-Server, DB-2)
Project Leader
25
Database Models
There are different Database models among which Hierarchical
Model was also there but there was no standards for it.
Although basic model was incorporated in all Hierarchal database
software, there was no precise standards or concepts.
These models lacked a standard DDL (data definition Language)
component and this created many problems there was also lacking
in DML(Data manipulation Language) which made database very
difficult.
In 1975 there was a meeting of database professionals to establish a
standard set for database design & implementations .
This meeting was called by (CODASYL) Conference on DATA
Systems Languages committee created by DataBase Task Group
(DBTG) to create standard set for Database Task Group
Silberschatz, Korth and Sudarshan 3.26 Database System Concepts
Other Data Models
Although Relational model is one of the most
widely used DBMS,
but there are other approaches being used for
some specific applications;
Object-oriented data model
Object-relational data model
Silberschatz, Korth and Sudarshan 3.27 Database System Concepts
27
Relational Database Management System
The large number of commercial database systems
in use today are based on the concepts of
Relational DBMS. The major ones are ;
Oracle
Microsoft Access
SQL server.
IBM DB2 .
Sybase
Informix.
Some are free / public domain database systems
MySQL, PostgresSQL
Silberschatz, Korth and Sudarshan 3.28 Database System Concepts
28
Database implementation
Date Modeling:
E-R Modeling : Purpose.
E-R model for Banking Enterprise, Car Insurance
Company
Conversion of ER model into Table
Table Creation
Data Base Languages & Querying
Relation Algebra
Query Language : DDL, DML
Silberschatz, Korth and Sudarshan 3.29 Database System Concepts
29
Data Models
A Data modeling is extremely significant move
towards understanding the main components of
the existing system, the working of the system and
understanding the users expectations ( i.e.queries
from the system ).
Data model: A collection of tools for describing;
data
data relationships
Entity-Relationship model: Most effective model
Relational model
Other models:
object-oriented model
Older models: network model and hierarchical
model
Silberschatz, Korth and Sudarshan 3.30 Database System Concepts
The Entity-Relationship Model
Models an enterprise as a collection of entities
and relationships
Entity: a thing or object in the enterprise
that is distinguishable from other objects
Described by a set of attributes
Relationship: an association among several
entities
Represented diagrammatically by an entity-
relationship diagram:
Silberschatz, Korth and Sudarshan 3.31 Database System Concepts
31
Purpose of E/R Model
The E/R model allows us to sketch the design of a
database informally.
Designs are pictures called entity-relationship
diagrams.
Fairly mechanical ways to convert E/R diagrams
to real implementations like relational databases.
It is most widely used for database design .
Database design in E-R model is converted into design
in the Relational model.
* Entity-Relationship is a high-level data model.
* Relational model is a lower level model.
.It uses a collection of tables to represent both data
and the relationships among those data.
Silberschatz, Korth and Sudarshan 3.32 Database System Concepts
32
Entity Sets
Entity = thing or object.
Eg .Book for a Library System
Entity set = collection of similar entities.
Similar to a class in object-oriented languages.
Attribute = property of an entity set.
Eg: book has title, author, publisher, price, year
publication.
Generally, all entities in a set have the same
properties.
Attributes are simple values, e.g. integers or
character strings.
Silberschatz, Korth and Sudarshan 3.33 Database System Concepts
33
Cont..
There are three basic elements in ER models:
Entities are the "things" about which we seek
information.
Attributes are the data we collect about the
entities.
Relationships provide the structure needed to
draw information from multiple entities
or
Relationship is an Association among several
entities.
Silberschatz, Korth and Sudarshan 3.34 Database System Concepts
34
E/R Diagrams
In an entity-relationship diagram, each entity set
is represented by a rectangle.
Each attribute of an entity set is represented by an
oval, with a line to the rectangle representing its
entity set.
Silberschatz, Korth and Sudarshan 3.35 Database System Concepts
35
Relationships
A relationship connects two or more entity sets.
It is represented by a diamond, with lines to each
of the entity sets involved.
Silberschatz, Korth and Sudarshan 3.36 Database System Concepts
36
ER Model for A Bank
In a bank each customer is Entity and bank account
can be considered to be Entity, which are described by
attributes.
Here Attributes are account -number and balance,
describing one particular Account in a bank.
A relationship is an association among several entities.
For Ex. a Depositor relationship associates a customer
with each account that she has .
Silberschatz, Korth and Sudarshan 3.37 Database System Concepts
37
Relational Model
Example of tabular data in the relational model
customer-
name
Customer-
id
customer-
street
customer-
city
account-
number
Johnson
Smith
Johnson
Jones
Smith
192-83-7465
019-28-3746
192-83-7465
321-12-3123
019-28-3746
Alma
North
Alma
Main
North
Palo Alto
Rye
Palo Alto
Harrison
Rye
A-101
A-215
A-201
A-217
A-201
Attributes
Silberschatz, Korth and Sudarshan 3.38 Database System Concepts
38
A Sample Relational Database
Silberschatz, Korth and Sudarshan 3.39 Database System Concepts
39
E.R Model
Construct an E.R diagram for a Carinsurance
company whose Customers own one or more cars
each.Each car has associated number of accidents
with it, ranging from zero to any number of recorded
accidents.
List of Entities in this scenario:
Person
Car
and ..?
Accident
Silberschatz, Korth and Sudarshan 3.40 Database System Concepts
40
person
Driver-id
address
name
owns
car
license model
year
participated
Damage-amount
accident
Report-no
location
date
driver
E.R.Model for car Insurance company
car
Driver-id
Silberschatz, Korth and Sudarshan 3.41 Database System Concepts
41
Relational table of car insurance co.
PERSON
DRIVER- ID Address Name
101 ravi Civil lines
102
rajesh
M.G Road
150 Nitin Civil lines
license model year
de-240811 Maruti-zen 1998
Bb-39101 Maruti alto 2004
Gg-45001 Tata sumo 2003
Report- no. location date
02- 203 Vivekanand
nager
02-01-04
05-503 Civil lines 29-03-04
06-405 Ashram
road
15-04-04
CAR
ACCIDENT
Construct the table for
all relationships ?
Silberschatz, Korth and Sudarshan 3.42 Database System Concepts
42
Specialization
An entity set may include subgroupings of entities that are
distinct in some way from other entities in the set.
For e.g A subset of entities within an entity set may have
attributes that are not shared by all entites in the entity set.
It is represented by
ISA
Silberschatz, Korth and Sudarshan 3.43 Database System Concepts
43
Example with specialization
Person
name
street
city
ISA
employee customer
Silberschatz, Korth and Sudarshan 3.44 Database System Concepts
SUMMARY OF ER-DIAGRAM
NOTATION FOR ER SCHEMAS
Meaning
ENTITY TYPE
WEAK ENTITY TYPE
RELATIONSHIP TYPE
IDENTIFYING RELATIONSHIP TYPE
ATTRIBUTE
KEY ATTRIBUTE
MULTIVALUED ATTRIBUTE
COMPOSITE ATTRIBUTE
DERIVED ATTRIBUTE
TOTAL PARTICIPATION OF E
2
IN R
CARDINALITY RATIO 1:N FOR E
1
:E
2
IN R
STRUCTURAL CONSTRAINT (min, max) ON
PARTICIPATION OF E IN R
Symbol
E
1
R E
2
E
1
R E
2
R
(min,max)
E
N
Silberschatz, Korth and Sudarshan 3.45 Database System Concepts
Assignment #1
Designing an ER Diagram ( Last Date 16
th
July2011)
Consider the following set of requirements for a University database. Design
an ER diagram for this application:
The university keeps track of each student's name, student number, social
security number, current address and phone number, permanent address and
phone number, birthdate, sex, class (freshman, graduate), major department,
minor department (if any), degree program (B.A., B.S., ... Ph.D.). Some user
applications need to refer to the city, state, and zip code of the student's
permanent address and to the student's last name. Both social security number
and student number are unique for each student. All students will have at least
a major department.
Each department is described by a name, department code, office number,
office phone, and college. Both the name and code have unique values for each
department.
Each course has a course name, description, course number, number of credits,
level and offering department. The course number is unique for each course.
Each section has an instructor, semester, year, course, and section
number. The section number distinguishes sections of the same course that are
taught during the same semester/year; its value is an integer (1, 2, 3, ... up to
the number of sections taught during each semester).
A grade report must be generated for each student that lists the section, letter
grade, and numeric grade (0,1,2,3, or 4) for each student and calculates his or
her average GPA.
Silberschatz, Korth and Sudarshan 3.46 Database System Concepts
46
Silberschatz, Korth and Sudarshan 3.47 Database System Concepts
47
Database implementation
Date Modeling:
E-R Modeling : Purpose.
E-R model for Banking Enterprise, Car Insurance Company
Key, Super Key Attributes & Weak Entity Sets
Conversion of ER model into Table
Table Creation using SQL
Data Base Languages & Querying
Relation Algebra
Query Language : Procedural & Non-procedural
DDL, DML
Silberschatz, Korth and Sudarshan 3.48 Database System Concepts
Concept of Keys
As we have seen that in many of ER models,
converted into tables; we can find some special
attributes who can uniquely identify the record.
Key Attributes
Super Key
Primary Key
Any Entity set without Key attribute ?
48
Silberschatz, Korth and Sudarshan 3.49 Database System Concepts
49
Concept of Keys
A Key Attributes is special attribute which can
uniquely identify the record for Entity
A super key of an entity set is a set of one or more
attributes whose values uniquely determine each
entity.
Ex. Super key : ( customer-id, Customer-name)
A candidate key of an entity set is a minimal super key
Customer-id is candidate key of customer
account-number is candidate key of account
Although several candidate keys may exist, one of the
candidate keys is selected to be the primary key.
The combination of primary keys of the participating
entity sets forms a super key of a relationship set.
(customer-id, account-number) is the super key of depositor
Silberschatz, Korth and Sudarshan 3.50 Database System Concepts
50
Keys..
Real world keys sometimes cause problems in a
database.
Example: Customer
Customer-ids are not easy to remember by account
holder !
Avoid phone numbers: people may not notify you when
numbers change.
Often best to let the DBMS generate unique values
Access: AutoNumber
SQL Server: Identity
Oracle: Sequences (but require additional programming)
Drawback: Numbers are not related to any business
data, so the application needs to hide them and
provide other look up mechanisms.
Silberschatz, Korth and Sudarshan 3.51 Database System Concepts
51
Weak Entity Sets
An entity set that does not have a primary key is
referred to as a weak entity set.
The existence of a weak entity set depends on the
existence of a identifying entity set
The primary key of a weak entity set is formed by
the primary key of the strong entity set on which
the weak entity set is existence dependent, plus
the weak entity sets discriminator.
Silberschatz, Korth and Sudarshan 3.52 Database System Concepts
52
ER Model to Table
Example of a Bank.
Each entity is converted into Table with various
attributes becoming Column headers
The relationship is created by taking key
attributes of the two Entities.
SQL is used with CREATE Table command
Silberschatz, Korth and Sudarshan 3.53 Database System Concepts
53
A Sample Relational Database
Silberschatz, Korth and Sudarshan 3.54 Database System Concepts
54
How to convert E.R in relational table
5
D
D
A
A
T
T
A
A
B
B
A
A
S
S
E
E
Notation
Table name
Primary key is underlined
Table columns
Customer(CustomerID, Phone, Name, Address, City, State, ZipCode)
CustomerID Phone LastName FirstName Address City State Zipcode
1 502-666-7777 Johnson Martha 125 Main Street Alvaton KY 42122
2 502-888-6464 Smith Jack 873 Elm Street Bowling Green KY 42101
3 502-777-7575 Washington Elroy 95 Easy Street Smiths Grove KY 42171
4 502-333-9494 Adams Samuel 746 Brown Drive Alvaton KY 42122
5 502-474-4746 Rabitz Victor 645 White Avenue Bowling Green KY 42102
6 616-373-4746 Steinmetz Susan 15 Speedway Drive Portland TN 37148
7 615-888-4474 Lasater Les 67 S. Ray Drive Portland TN 37148
8 615-452-1162 Jones Charlie 867 Lakeside Drive Castalian Springs TN 37031
9 502-222-4351 Chavez Juan 673 Industry Blvd. Caneyville KY 42721
10 502-444-2512 Rojo Maria 88 Main Street Cave City KY 42127
Silberschatz, Korth and Sudarshan 3.55 Database System Concepts
55
create table
create table account (
account-number char(10),
balance integer)
The above instruction creates a table named
account, having two columns, namely account
number and balance.
The account number column will accept the input
data as character with maximum length of 10
characters.
Whereas the balance is defined as Integer We will
discuss all possible domain types in coming slides.
Silberschatz, Korth and Sudarshan 3.56 Database System Concepts
56
Database
A database consists of multiple relations, which means
multiple tables.
Information about an enterprise is broken up
into parts, with each relation ( table) storing
one part of the information
Ex: Table account : stores information about accounts
Table customer : stores information about customers
Table depositor : stores information about which customer
owns which account
But Why to Break into Smaller parts .?
Silberschatz, Korth and Sudarshan 3.57 Database System Concepts
Database
Storing all information as a single relation such as
bank(account-number, balance, customer-name, ..)
results in
repetition of information (e.g. two customers own an
account)
the need for null values (e.g. represent a customer
without an account)
Normalization theory deals with how to design relational
schemas.
Let us have look at detail ER Diagram for Banking
Enterprise
57
Silberschatz, Korth and Sudarshan 3.58 Database System Concepts
Database Design
The process of designing the general structure of the database:
Logical Design Deciding on the database schema.
Database design requires that we find a good collection of relation
schemas.
Business decision What attributes should we record in the
database?
Computer Science decision What relation schemas should
we have and how should the attributes be distributed among the
various relation schemas?
Physical Design Deciding on the physical layout of the database
Silberschatz, Korth and Sudarshan 3.59 Database System Concepts
Querying the Database
Users put Query to a Database and the
efficiency and accuracy of the database
depends upon the fast, accurate and
appropriate information provided by the
Database.
The reasons for longevity and success of
RDBMS is its underlying concepts/mechanism
to address the user queries.
It is required to understand the query and its
working mechanism in a RDBMS.
There were many Querying Languages
proposed for working with RDBMS, some of
which became popular & standard.
59
Silberschatz, Korth and Sudarshan 3.60 Database System Concepts
Query Languages
Language in which user requests information
from the database.
Can also be viewed as either DDL or DML.
DDL ( Data Definition Language)
DML ( Data Manipulation Language)
Query Language has its own syntax and working
mechanism?
60
Silberschatz, Korth and Sudarshan 3.61 Database System Concepts
Query Languages
Categories of languages
Procedural
Non-procedural, or declarative
Pure languages:
Relational algebra
Tuple relational calculus
Domain relational calculus
Pure languages form underlying basis of query languages
that people use.
Silberschatz, Korth and Sudarshan 3.62 Database System Concepts
Data Definition Language (DDL)
The schema for each relation, including
attribute types.
Integrity constraints
Authorization information for each relation.
Non-standard SQL extensions also allow
specification of
The set of indices to be maintained for each
relations.
The physical storage structure of each relation
on disk.
Allows the specification of:
Silberschatz, Korth and Sudarshan 3.63 Database System Concepts
Data Manipulation Language (DML)
Language for accessing and manipulating the data
organized by the appropriate data model
DML also known as query language
Two classes of languages
Procedural user specifies what data is
required and how to get those data
Declarative (nonprocedural) user
specifies what data is required without
specifying how to get those data
SQL is the most widely used query language
Silberschatz, Korth and Sudarshan 3.64 Database System Concepts
Data Definition Language (DDL)
Specification notation for defining the database schema
Example: create table account (
account_number char(10),
branch_name char(10),
balance integer)
DDL compiler generates a set of tables stored in a data
dictionary
Data dictionary contains metadata (i.e., data about data)
Database schema
Data storage and definition language
Specifies the storage structure & access methods
used
Integrity constraints & Authorization
Domain constraints
Referential integrity (e.g. branch_name must correspond
to a valid branch in the branch table)
Silberschatz, Korth and Sudarshan 3.65 Database System Concepts
65
Data Manipulation Language (DML)
DML is part of DDL. It is Language for accessing and
manipulating the data organized by the appropriate
data model
DML also known as query language
SQL is the most widely used query language, and
definitely Nonprocedural language.
IBM developed the original version of SQL .
SQL being most popular commercial language is
being implemented in most RDBMS namely Oracle,
Access, SQL server etc.
Silberschatz, Korth and Sudarshan 3.66 Database System Concepts
Create Table Construct
An SQL relation is defined using the create table
command:
create table r (A
1
D
1
, A
2
D
2
, ..., A
n
D
n
,
(integrity-constraint
1
),
...,
(integrity-constraint
k
))
r is the name of the relation
each A
i
is an attribute name in the schema of
relation r
D
i
is the data type of attribute A
i
Example:
create table branch
(branch_name char(15),
branch_city char(30),
assets integer)
Silberschatz, Korth and Sudarshan 3.67 Database System Concepts
SQL: History
IBM Sequel language developed as part of System R project
at the IBM San Jose Research Laboratory
Renamed Structured Query Language (SQL)
ANSI and ISO standard SQL:
SQL-86
SQL-89
SQL-92
SQL:1999 (language name became Y2K compliant!)
SQL:2003
Commercial systems offer most, if not all, SQL-92 features,
plus varying feature sets from later standards and special
proprietary features.
Not all examples here may work on your particular
system.
Silberschatz, Korth and Sudarshan 3.68 Database System Concepts
68
Basic Structure
SQL is based on set and relational operations with
certain modifications and enhancements
A typical SQL query has the form:
select A
1
, A
2
, ..., A
n
from r
1
, r
2
, ..., r
m
where P
A
i
s represent attributes
r
i
s represent relations
P is a predicate.
This query is equivalent to the relational algebra
expression.
[
A1, A2, ..., An
(o
P
(r
1
x r
2
x ... x r
m
))
The result of an SQL query is a relation.
Silberschatz, Korth and Sudarshan 3.69 Database System Concepts
Domain Types in SQL
char(n). Fixed length character string, with user-specified length n.
varchar(n). Variable length character strings, with user-specified maximum
length n.
int. Integer (a finite subset of the integers that is machine-dependent).
smallint. Small integer (a machine-dependent subset of the integer
domain type).
numeric(p,d). Fixed point number, with user-specified precision of p digits,
with n digits to the right of decimal point.
real, double precision. Floating point and double-precision floating point
numbers, with machine-dependent precision.
float(n). Floating point number, with user-specified precision of at least n
digits.
More are there.
Silberschatz, Korth and Sudarshan 3.70 Database System Concepts
Working with RDBMS
E-R Modeling: Database Design
Normalization
Conversion from ER Schema into Table
Creation of Table using SQL constructs
Integrity Constraints
Working with Oracle
70
Silberschatz, Korth and Sudarshan 3.71 Database System Concepts
71
E-R Diagram for the Banking Enterprise
Silberschatz, Korth and Sudarshan 3.72 Database System Concepts
72
RDBMS Design issues
A database consists of multiple relations
Information about an enterprise is broken up into parts, with each
relation storing one part of the information
E.g.: account : stores information about accounts
depositor : stores information about which customer
owns which account
customer : stores information about customers
Each Relational database schema consists of a number of relation
schemas.
Ex. The Relational database schema for Banking Enterprise, consists
of many Relation schemas ( Entity-Relationships).
So far we have assumed that Entity - attributes are grouped to form
a relation schema by using the common sense of database
designer or by mapping a schema defined by ER model.
We still need some formal measure of why one grouping of
attributes into a relation schema may be better than another.
In general, the goal of schema design is to reduce various
anomalies.
Silberschatz, Korth and Sudarshan 3.73 Database System Concepts
73
Pitfalls in Relational Database Design
Relational database design requires that we find a
good collection of relation schemas. A bad design
may lead to
Repetition of Information.
Inability to represent certain information.
Design Goals:
Avoid redundant data
Ensure that relationships among attributes are
represented
Facilitate the checking of updates for violation of
database integrity constraints.
Silberschatz, Korth and Sudarshan 3.74 Database System Concepts
Integrity Constraints
Integrity constraints guard against accidental
damage to the database, by ensuring that
authorized changes to the database do not result
in a loss of data consistency.
A checking account must have a
balance greater than $10,000.00
A salary of a bank employee must be
at least $4.00 an hour
A customer must have a (non-null)
phone number
Silberschatz, Korth and Sudarshan 3.75 Database System Concepts
Integrity Constraints on Tables
not null
primary key (A
1
, ..., A
n
)
Example: Declare branch_name as the primary key for
branch
.
create table branch
(branch_name char(15),
branch_city char(30) not null,
assets integer,
primary key (branch_name))
primary key declaration on an attribute automatically
ensures not null in SQL-92 onwards, needs to be
explicitly stated in SQL-89
Silberschatz, Korth and Sudarshan 3.76 Database System Concepts
Constraints on a Single Relation
not null
primary key
unique
check (P ), where P is a predicate
Silberschatz, Korth and Sudarshan 3.77 Database System Concepts
Not Null Constraint
Declare branch_name for branch is not null
branch_name char(15) not null
Declare the domain Dollars to be not null
create domain Dollars numeric(12,2) not null
Silberschatz, Korth and Sudarshan 3.78 Database System Concepts
The Unique Constraint
unique ( A
1
, A
2
, , A
m
)
The unique specification states that the attributes
A1, A2, Am
form a candidate key.
Candidate keys are permitted to be null (in contrast to primary
keys).
Silberschatz, Korth and Sudarshan 3.79 Database System Concepts
The check clause
check (P ), where P is a predicate
Example: Declare branch_name as the primary key for
branch and ensure that the values of assets are non-
negative.
create table branch
(branch_name char(15),
branch_city char(30),
assets integer,
primary key (branch_name),
check (assets >= 0))
Silberschatz, Korth and Sudarshan 3.80 Database System Concepts
The check clause (Cont.)
The check clause in SQL-92 permits domains to be restricted:
Use check clause to ensure that an
hourly_wage domain allows only values
greater than a specified value.
create domain hourly_wage
numeric(5,2)
constraint value_test check(value
> = 4.00)
The domain has a constraint that ensures
that the hourly_wage is greater than 4.00
The clause constraint value_test is optional;
useful to indicate which constraint an
update violated.
Silberschatz, Korth and Sudarshan 3.81 Database System Concepts
Referential Integrity
Ensures that a value that appears in one relation for a given
set of attributes also appears for a certain set of attributes in
another relation.
Example: If Perryridge is a branch name appearing in one
of the tuples in the account relation, then there exists a tuple
in the branch relation for branch Perryridge.
Primary and candidate keys and foreign keys can be specified as
part of the SQL create table statement:
The primary key clause lists attributes that comprise the
primary key.
The unique key clause lists attributes that comprise a
candidate key.
The foreign key clause lists the attributes that comprise
the foreign key and the name of the relation referenced
by the foreign key. By default, a foreign key references
the primary key attributes of the referenced table.
Silberschatz, Korth and Sudarshan 3.82 Database System Concepts
Foreign Key
In the context of relational databases, a foreign key is a
referential constraint between two tables.
The foreign key identifies a column or a set of columns in
one (referencing) table that refers to set of columns in
another (referenced) table.
The columns in the referencing table must be the primary
key or other candidate key in the referenced table. The
values in one row of the referencing columns must occur in
a single row in the referenced table.
Thus, a row in the referencing table cannot contain values
that don't exist in the referenced table (except potentially
NULL).
82
Silberschatz, Korth and Sudarshan 3.83 Database System Concepts
Referential Integrity in SQL Example
create table customer
(customer_name char(20),
customer_street char(30),
customer_city char(30),
primary key (customer_name ))
create table branch
(branch_name char(15),
branch_city char(30),
assets numeric(12,2),
primary key (branch_name ))
Silberschatz, Korth and Sudarshan 3.84 Database System Concepts
Referential Integrity in SQL Example (Cont.)
create table account
(account_number char(10),
branch_name char(15),
balance integer,
primary key (account_number),
foreign key (branch_name) references branch )
create table depositor
(customer_name char(20),
account_number char(10),
primary key (customer_name, account_number),
foreign key (account_number ) references account,
foreign key (customer_name ) references customer
)
Silberschatz, Korth and Sudarshan 3.85 Database System Concepts
Basic Insertion and Deletion of Tuples
Newly created table is empty
Add a new tuple to account
insert into account
values ('A-9732', 'Perryridge', 1200)
Insertion fails if any integrity constraint is
violated
Delete all tuples from account
delete from account
Note: Will see later how to delete selected tuples
Silberschatz, Korth and Sudarshan 3.86 Database System Concepts
Drop and Alter Table Constructs
The drop table command deletes all information about the
dropped relation from the database.
The alter table command is used to add attributes to an
existing relation:
alter table r add A D
where A is the name of the attribute to be added to relation r
and D is the domain of A.
All tuples in the relation are assigned null as the value for
the new attribute.
The alter table command can also be used to drop attributes of
a relation:
alter table r drop A
where A is the name of an attribute of relation r
Dropping of attributes not supported by many databases
Silberschatz, Korth and Sudarshan 3.87 Database System Concepts
Schema Diagram for the Banking Enterprise
Silberschatz, Korth and Sudarshan 3.88 Database System Concepts
88
RDBMS Design
A database consists of multiple relations
Information about an enterprise is broken up into parts, with
each relation storing one part of the information
E.g.: account : stores information about accounts
depositor : stores information about which
customer owns which account
customer : stores information about customers
Each Relational database schema consists of a number of
relation schemas.
Ex. The Relational database schema for Banking
Enterprise, consists of many Relation schemas ( Entity-
Relationships).
So far we have assumed that Entity - attributes are grouped to
form a relation schema by using the common sense of
database designer or by mapping a schema defined by ER
model.
Silberschatz, Korth and Sudarshan 3.89 Database System Concepts
RDBMS
We still need some formal measure of why one grouping of
attributes into a relation schema may be better than another.
In general, the goal of schema design is to reduce various
anomalies.
But once the schema is finalized with appropriate design, the
Database is required to be implemented for handling user-
queries.
It is worth mentioning that accurate query answering is of
extreme requirement.
So we would first understand the basis of querying
mechanism in RDBMS and then will discuss How to develop
new Database systems.
89
Silberschatz, Korth and Sudarshan 3.90 Database System Concepts
Relational Algebra as a basis
Relational algebras received little attention until
the publication of E.F. Codd s relational model
of data in 1970.
Codd proposed such an algebra as a basis for
database query languages.
As in any algebra, some operators are primitive
and the others, being definable in terms of the
primitive ones, are derived.
90
Silberschatz, Korth and Sudarshan 3.91 Database System Concepts
Relational Algebra as a basis
It is proposed to treat the tables of Data base as Relations
and consequently as Sets in mathematical term.
Codd proposed six basic operators and some derived ones
forming the basis of querying mechanism in a database.
Although three of the six basic operators are taken from set
theory,
there are additional constraints that are present in their
relational algebra counterparts:
For set union and set difference, the two
relations involved must be union-compatible
that is, the two relations must have the same
set of attributes.
As set intersection can be defined in terms of
set difference, the two relations involved in set
intersection must also be union-compatible.
91
Silberschatz, Korth and Sudarshan 3.92 Database System Concepts
Attribute Types
Each attribute of a relation has a name
The set of allowed values for each attribute is called the
domain of the attribute
Attribute values are (normally) required to be atomic; that is,
indivisible
E.g. the value of an attribute can be an
account number, but cannot be a set of
account numbers
Domain is said to be atomic if all its members are atomic
The special value null is a member of every domain
The null value causes complications in the definition of
many operations
We shall ignore the effect of null values in our main
presentation and consider their effect later
Silberschatz, Korth and Sudarshan 3.93 Database System Concepts
Relation Schema
Formally, given domains D
1
, D
2
, . D
n
a relation r is a subset
of D
1
x D
2
x x D
n
Thus, a relation is a set of n-tuples (a
1
, a
2
, , a
n
) where each
a
i
e D
i
Schema of a relation consists of
attribute definitions
name
type/domain
integrity constraints
Silberschatz, Korth and Sudarshan 3.94 Database System Concepts
Relation Instance
The current values (relation instance) of a relation are
specified by a table
An element t of r is a tuple, represented by a row in a
table
Order of tuples is irrelevant (tuples may be stored in an
arbitrary order)
Jones
Smith
Curry
Lindsay
customer_name
Main
North
North
Park
customer_street
Harrison
Rye
Rye
Pittsfield
customer_city
customer
attributes
(or columns)
tuples
(or rows)
Silberschatz, Korth and Sudarshan 3.95 Database System Concepts
95
Silberschatz, Korth and Sudarshan 3.96 Database System Concepts
Relational Algebra
Procedural language
Six basic operators
select
project
union
set difference
Cartesian product
rename
The operators take two or more relations as inputs and give
a new relation as a result.
Silberschatz, Korth and Sudarshan 3.97 Database System Concepts
Select Operation Example
Relation r A B C D
o
o
|
|
o
|
|
|
1
5
12
23
7
7
3
10
o
A=B ^ D > 5
(r)
A B C D
o
|
o
|
1
23
7
10
Silberschatz, Korth and Sudarshan 3.98 Database System Concepts
Select Operation
Notation: o
p
(r)
p is called the selection predicate
Defined as:
o
p
(r) = {t | t e r and p(t)}
Where p is a formula in propositional
calculus consisting of terms connected by :
. (and), v (or), (not)
Each term is one of:
<attribute> op <attribute> or
<constant>
where op is one of: =, =, >, >. <. s
Example of selection:
o
branch-name=Perryridge
(account)
Silberschatz, Korth and Sudarshan 3.99 Database System Concepts
Project Operation Example
Relation r:
A B C
o
o
|
|
10
20
30
40
1
1
1
2
A C
o
o
|
|
1
1
1
2
=
A C
o
|
|
1
1
2
[
A,C
(r)
Silberschatz, Korth and Sudarshan 3.100 Database System Concepts
Project Operation
Notation:
[
A1, A2, , Ak
(r)
where A
1
, A
2
are attribute names and r is a relation name.
The result is defined as the relation of k columns
obtained by erasing the columns that are not
listed
Duplicate rows removed from result, since
relations are sets
E.g. To eliminate the branch-name attribute of account
[
account-number, balance
(account)
Silberschatz, Korth and Sudarshan 3.101 Database System Concepts
Union Operation Example
Relations r, s:
r s:
A
B
o
o
|
1
2
1
A B
o
|
2
3
r
s
A B
o
o
|
|
1
2
1
3
Silberschatz, Korth and Sudarshan 3.102 Database System Concepts
Union Operation
Notation: r s
Defined as:
r s = {t | t e r or t e s}
For r s to be valid.
1. r, s must have the same arity (same number of attributes)
2. The attribute domains must be compatible (e.g., 2nd
column
of r deals with the same type of values as does the 2nd
column of s)
E.g. to find all customers with either an account or a loan
[
customer-name
(depositor) [
customer-name
(borrower)
Silberschatz, Korth and Sudarshan 3.103 Database System Concepts
Set Difference Operation Example
Relations r, s:
r s:
A B
o
o
|
1
2
1
A B
o
|
2
3
r
s
A B
o
|
1
1
Silberschatz, Korth and Sudarshan 3.104 Database System Concepts
Set Difference Operation
Notation r s
Defined as:
r s = {t | t e r and t e s}
Set differences must be taken between compatible relations.
r and s must have the same arity
attribute domains of r and s must be
compatible
Silberschatz, Korth and Sudarshan 3.105 Database System Concepts
Cartesian-Product Operation-Example
Relations r, s:
r x s:
A B
o
|
1
2
A B
o
o
o
o
|
|
|
|
1
1
1
1
2
2
2
2
C D
o
|
|
o
|
|
10
10
20
10
10
10
20
10
E
a
a
b
b
a
a
b
b
C D
o
|
|
10
10
20
10
E
a
a
b
b
r
s
Silberschatz, Korth and Sudarshan 3.106 Database System Concepts
Cartesian-Product Operation
Notation r x s
Defined as:
r x s = {t q | t e r and q e s}
Assume that attributes of r(R) and s(S) are
disjoint. (That is, R S = C).
If attributes of r(R) and s(S) are not disjoint, then
renaming must be used.
Silberschatz, Korth and Sudarshan 3.107 Database System Concepts
Composition of Operations
Can build expressions using multiple operations
Example: o
A=C
(r x s)
r x s
o
A=C
(r x s)
A B
o
o
o
o
|
|
|
|
1
1
1
1
2
2
2
2
C D
o
|
|
o
|
|
10
19
20
10
10
10
20
10
E
a
a
b
b
a
a
b
b
A B C D E
o
|
|
1
2
2
o
|
|
10
20
20
a
a
b
Silberschatz, Korth and Sudarshan 3.108 Database System Concepts
Rename Operation
Allows us to name, and therefore to refer to, the
results of relational-algebra expressions.
Allows us to refer to a relation by more than one
name.
Example:
x
(E)
returns the expression E under the name X
If a relational-algebra expression E has arity n, then
x
(A1, A2, , An)
(E)
returns the result of expression E under the name X, and with
the attributes renamed to A1, A2, ., An.
Silberschatz, Korth and Sudarshan 3.109 Database System Concepts
109
Basic Structure
SQL is based on set and relational operations with
certain modifications and enhancements
A typical SQL query has the form:
select A
1
, A
2
, ..., A
n
from r
1
, r
2
, ..., r
m
where P
A
i
s represent attributes
r
i
s represent relations
P is a predicate.
This query is equivalent to the relational algebra
expression.
[
A1, A2, ..., An
(o
P
(r
1
x r
2
x ... x r
m
))
The result of an SQL query is a relation.
Silberschatz, Korth and Sudarshan 3.110 Database System Concepts
110
Silberschatz, Korth and Sudarshan 3.111 Database System Concepts
Banking Example
branch (branch-name, branch-city, assets)
customer (customer-name, customer-street, customer-
city)
account (account-number, branch-name, balance)
loan (loan-number, branch-name, amount)
depositor (customer-name, account-number)
borrower (customer-name, loan-number)
Silberschatz, Korth and Sudarshan 3.112 Database System Concepts
Example Queries
Find all loans of over $1200
o
amount >1200
(loan)
Find the loan number for each loan of an amount greater
than $1200
[
loan-number
(o
amount >1200
(loan))
Silberschatz, Korth and Sudarshan 3.113 Database System Concepts
Example Queries
Find the names of all customers who have a loan,
an account, or both, from the bank
[
customer-name
(borrower) [
customer-name
(depositor)
Find the names of all customers who have a loan
and an account at bank.
[
customer-name
(borrower) [
customer-name
(depositor)
Silberschatz, Korth and Sudarshan 3.114 Database System Concepts
Example Queries
Find the names of all customers who have a loan at the
Perryridge branch.
[
customer-name
(o
branch-name=Perryridge
(o
borrower.loan-number = loan.loan-number
(borrower x loan)))
Find the names of all customers who have a loan at the Perryridge
branch but do not have an account at any branch of the bank.
[
customer-name
(o
branch-name = Perryridge
(o
borrower.loan-number = loan.loan-number
(borrower x loan)))
[
customer-name
(depositor)
Silberschatz, Korth and Sudarshan 3.115 Database System Concepts
Example Queries
Find the names of all customers who have a loan at the Perryridge
branch.
Query 1
[
customer-name
(o
branch-name = Perryridge
(o
borrower.loan-number = loan.loan-number
(borrower x loan)))
Query 2
[
customer-name
(o
loan.loan-number = borrower.loan-number
(
(o
branch-name = Perryridge
(loan)) x
borrower)
)
Silberschatz, Korth and Sudarshan 3.116 Database System Concepts
Example Queries
Find the largest account balance
Rename account relation as d
The query is:
[
balance
(account) - [
account.balance
(o
account.balance < d.balance
(account x
d
(account)))
Silberschatz, Korth and Sudarshan 3.117 Database System Concepts
Formal Definition
A basic expression in the relational algebra consists of either one
of the following:
A relation in the database
A constant relation
Let E
1
and E
2
be relational-algebra expressions; the following are
all relational-algebra expressions:
E
1
E
2
E
1
- E
2
E
1
x E
2
o
p
(E
1
), P is a predicate on attributes in E
1
[
s
(E
1
), S is a list consisting of some of the attributes in
E
1
x
(E
1
), x is the new name for the result of E
1
Silberschatz, Korth and Sudarshan 3.118 Database System Concepts
Additional Operations
We define additional operations that do not add any
power to the relational algebra, but that simplify
common queries.
Set intersection
Natural join
Division
Assignment
Silberschatz, Korth and Sudarshan 3.119 Database System Concepts
Set-Intersection Operation
Notation: r s
Defined as:
r s ={ t | t e r and t e s }
Assume:
r, s have the same arity
attributes of r and s are compatible
Note: r s = r - (r - s)
Silberschatz, Korth and Sudarshan 3.120 Database System Concepts
Set-Intersection Operation - Example
Relation r, s:
r s
A B
o
o
|
1
2
1
A B
o
|
2
3
r
s
A B
o 2
Silberschatz, Korth and Sudarshan 3.121 Database System Concepts
Natural-Join Operation
Notation: r s
Let r and s be relations on schemas R and S respectively. The result
is a relation on schema R S which is obtained by considering each
pair of tuples t
r
from r and t
s
from s.
If t
r
and t
s
have the same value on each of the attributes in R S, a
tuple t is added to the result, where
t has the same value as t
r
on r
t has the same value as t
s
on s
Example:
R = (A, B, C, D)
S = (E, B, D)
Result schema = (A, B, C, D, E)
r s is defined as:
[
r.A, r.B, r.C, r.D, s.E
(o
r.B = s.B r.D = s.D
(r x s))
Silberschatz, Korth and Sudarshan 3.122 Database System Concepts
Natural Join Operation Example
Relations r, s:
A B
o
|
o
o
1
2
4
1
2
C D
o
|
|
a
a
b
a
b
B
1
3
1
2
3
D
a
a
a
b
b
E
o
|
o
e
r
A B
o
o
o
o
o
1
1
1
1
2
C D
o
o
|
a
a
a
a
b
E
o
o
o
s
r s
Silberschatz, Korth and Sudarshan 3.123 Database System Concepts
Division Operation
Suited to queries that include the phrase for all.
Let r and s be relations on schemas R and S respectively
where
R = (A
1
, , A
m
, B
1
, , B
n
)
S = (B
1
, , B
n
)
The result of r s is a relation on
schema
R S = (A
1
, , A
m
)
r s = { t | t e [
R-S
(r) . u e s ( tu e r ) }
r s
Silberschatz, Korth and Sudarshan 3.124 Database System Concepts
Division Operation Example
Relations r, s:
r s:
A
B
o
|
1
2
A B
o
o
o
|
o
o
o
e
e
|
1
2
3
1
1
1
3
4
6
1
2
r
s
Silberschatz, Korth and Sudarshan 3.125 Database System Concepts
Another Division Example
A B
o
o
o
|
|
a
a
a
a
a
a
a
a
C D
o
|
a
a
b
a
b
a
b
b
E
1
1
1
1
3
1
1
1
Relations r, s:
r s:
D
a
b
E
1
1
A B
o
a
a
C
r
s
Silberschatz, Korth and Sudarshan 3.126 Database System Concepts
Division Operation (Cont.)
Property
Let q r s
Then q is the largest relation satisfying q x s _ r
Definition in terms of the basic algebra operation
Let r(R) and s(S) be relations, and let S _ R
r s = [
R-S
(r) [
R-S
( ([
R-S
(r) x s) [
R-S,S
(r))
To see why
[
R-S,S
(r) simply reorders attributes of r
[
R-S
([
R-S
(r) x s) [
R-S,S
(r)) gives those tuples t
in
[
R-S
(r) such that for some tuple u e s, tu e r.
Silberschatz, Korth and Sudarshan 3.127 Database System Concepts
Assignment Operation
The assignment operation () provides a convenient way to
express complex queries, write query as a sequential program
consisting of a series of assignments followed by an expression
whose value is displayed as a result of the query.
Assignment must always be made to a temporary relation
variable.
Example: Write r s as
temp1
[
R-S
(r)
temp2 [
R-S
((temp1 x s) [
R-S,S
(r))
result = temp1 temp2
The result to the right of the is assigned to the relation variable on
the left of the .
May use variable in subsequent expressions.
Silberschatz, Korth and Sudarshan 3.128 Database System Concepts
Example Queries
Find all customers who have an account from at least the
Downtown and the Uptown branches.
Query 1
[
CN
(o
BN=Downtown
(depositor account))
[
CN
(o
BN=Uptown
(depositor account))
where CN denotes customer-name and BN denotes
branch-name.
Query 2
[
customer-name, branch-name
(depositor account)
temp(branch-name
)
({(Downtown), (Uptown)})
Silberschatz, Korth and Sudarshan 3.129 Database System Concepts
Find all customers who have an account at all branches located
in Brooklyn city.
[
customer-name, branch-name
(depositor account)
[
branch-name
(o
branch-city = Brooklyn
(branch))
Example Queries
Silberschatz, Korth and Sudarshan 3.130 Database System Concepts
Extended Relational-Algebra-Operations
Generalized Projection
Outer Join
Aggregate Functions
Silberschatz, Korth and Sudarshan 3.131 Database System Concepts
Generalized Projection
Extends the projection operation by allowing arithmetic functions
to be used in the projection list.
[
F1, F2, , Fn
(E)
E is any relational-algebra expression
Each of F
1
, F
2
, , F
n
are are arithmetic expressions involving
constants and attributes in the schema of E.
Given relation credit-info(customer-name, limit, credit-balance),
find how much more each person can spend:
[
customer-name, limit credit-balance
(credit-info)
Silberschatz, Korth and Sudarshan 3.132 Database System Concepts
Aggregate Functions and Operations
Aggregation function takes a collection of values and returns a
single value as a result.
avg: average value
min: minimum value
max: maximum value
sum: sum of values
count: number of values
Aggregate operation in relational algebra
G1, G2, , Gn
g
F1( A1), F2( A2),, Fn( An)
(E)
E is any relational-algebra expression
G
1
, G
2
, G
n
is a list of attributes on which to group
(can be empty)
Each F
i
is an aggregate function
Each A
i
is an attribute name
Silberschatz, Korth and Sudarshan 3.133 Database System Concepts
Aggregate Operation Example
Relation r:
A B
o
o
|
|
o
|
|
|
C
7
7
3
10
g
sum(c)
(r)
sum-C
27
Silberschatz, Korth and Sudarshan 3.134 Database System Concepts
Aggregate Operation Example
Relation account grouped by branch-name:
branch-name
g
sum(balance)
(account)
branch-name account-number balance
Perryridge
Perryridge
Brighton
Brighton
Redwood
A-102
A-201
A-217
A-215
A-222
400
900
750
750
700
branch-name balance
Perryridge
Brighton
Redwood
1300
1500
700
Silberschatz, Korth and Sudarshan 3.135 Database System Concepts
Aggregate Functions (Cont.)
Result of aggregation does not have a name
Can use rename operation to give it a name
For convenience, we permit renaming as part of
aggregate operation
branch-name
g
sum(balance) as sum-balance
(account)
Silberschatz, Korth and Sudarshan 3.136 Database System Concepts
Outer Join
An extension of the join operation that avoids loss of
information.
Computes the join and then adds tuples form one relation
that does not match tuples in the other relation to the result
of the join.
Uses null values:
null signifies that the value is unknown or
does not exist
All comparisons involving null are (roughly
speaking) false by definition.
Will study precise meaning of comparisons with
nulls later
Silberschatz, Korth and Sudarshan 3.137 Database System Concepts
Outer Join Example
Relation loan
loan-number amount
L-170
L-230
L-260
3000
4000
1700
Relation borrower
customer-name loan-number
Jones
Smith
Hayes
L-170
L-230
L-155
branch-name
Downtown
Redwood
Perryridge
Silberschatz, Korth and Sudarshan 3.138 Database System Concepts
Outer Join Example
Inner Join
loan Borrower
loan borrower
Left Outer Join
loan-number amount
L-170
L-230
3000
4000
customer-name
Jones
Smith
branch-name
Downtown
Redwood
loan-number amount
L-170
L-230
L-260
3000
4000
1700
customer-name
Jones
Smith
null
branch-name
Downtown
Redwood
Perryridge
Silberschatz, Korth and Sudarshan 3.139 Database System Concepts
Outer Join Example
Right Outer Join
loan borrower
loan-number amount
L-170
L-230
L-155
3000
4000
null
customer-name
Jones
Smith
Hayes
loan-number amount
L-170
L-230
L-260
L-155
3000
4000
1700
null
customer-name
Jones
Smith
null
Hayes
loan borrower
Full Outer Join
branch-name
Downtown
Redwood
null
branch-name
Downtown
Redwood
Perryridge
null
Silberschatz, Korth and Sudarshan 3.140 Database System Concepts
Null Values
It is possible for tuples to have a null value, denoted by null, for
some of their attributes
null signifies an unknown value or that a value does not exist.
The result of any arithmetic expression involving null is null.
Aggregate functions simply ignore null values
Is an arbitrary decision. Could have
returned null as result instead.
We follow the semantics of SQL in its
handling of null values
For duplicate elimination and grouping, null is treated like any
other value, and two nulls are assumed to be the same
Alternative: assume each null is different
from each other
Both are arbitrary decisions, so we simply
follow SQL
Silberschatz, Korth and Sudarshan 3.141 Database System Concepts
Modification of the Database
The content of the database may be modified
using the following operations:
Deletion
Insertion
Updating
All these operations are expressed using the
assignment operator.
Silberschatz, Korth and Sudarshan 3.142 Database System Concepts
Deletion
A delete request is expressed similarly to a query, except
instead of displaying tuples to the user, the selected tuples
are removed from the database.
Can delete only whole tuples; cannot delete values on only
particular attributes
A deletion is expressed in relational algebra by:
r r E
where r is a relation and E is a relational algebra query.
Silberschatz, Korth and Sudarshan 3.143 Database System Concepts
Deletion Examples
Delete all account records in the Perryridge branch.
account account o
branch-name = Perryridge
(account)
Delete all loan records with amount in the range of 0 to 50
loan loan o
amount > 0 and amount s 50
(loan)
Delete all accounts at branches located in Needham.
r
1
o
branch-city = Needham
(account branch)
r
2
[
branch-name, account-number, balance
(r
1
)
r
3
[
customer-name, account-number
(r
2
depositor)
account account r
2
depositor depositor r
3
Silberschatz, Korth and Sudarshan 3.144 Database System Concepts
Insertion
To insert data into a relation, we either:
specify a tuple to be inserted
write a query whose result is a set of tuples
to be inserted
in relational algebra, an insertion is expressed by:
r r E
where r is a relation and E is a relational algebra expression.
The insertion of a single tuple is expressed by letting E be a
constant relation containing one tuple.
Silberschatz, Korth and Sudarshan 3.145 Database System Concepts
Insertion Examples
Insert information in the database specifying that Smith has
$1200 in account A-973 at the Perryridge branch.
account account {(Perryridge, A-973, 1200)}
depositor depositor {(Smith, A-973)}
Provide as a gift for all loan customers in the Perryridge
branch, a $200 savings account. Let the loan number serve
as the account number for the new savings account.
r
1
(o
branch-name = Perryridge
(borrower loan))
account account [
branch-name, account-number,200
(r
1
)
depositor depositor [
customer-name, loan-number
,(r
1
)
Silberschatz, Korth and Sudarshan 3.146 Database System Concepts
Updating
A mechanism to change a value in a tuple without charging
all values in the tuple
Use the generalized projection operator to do this task
r [
F1, F2, , FI,
(r)
Each F, is either the ith attribute of r, if the ith attribute is not
updated, or, if the attribute is to be updated
F
i
is an expression, involving only constants and the
attributes of r, which gives the new value for the attribute
Silberschatz, Korth and Sudarshan 3.147 Database System Concepts
Update Examples
Make interest payments by increasing all balances by 5
percent.
account [
AN, BN, BAL * 1.05
(account)
where AN, BN and BAL stand for account-number, branch-
name and balance, respectively.
Pay all accounts with balances over $10,000
6 percent interest and pay all others 5 percent
account [
AN, BN, BAL * 1.06
(o
BAL > 10000
(account))
[
AN, BN, BAL * 1.05
(o
BAL s 10000
(account))
Silberschatz, Korth and Sudarshan 3.148 Database System Concepts
Views
In some cases, it is not desirable for all users to
see the entire logical model (i.e., all the actual
relations stored in the database.)
Consider a person who needs to know a
customers loan number but has no need to see
the loan amount. This person should see a
relation described, in the relational algebra, by
[
customer-name, loan-number
(borrower loan)
Any relation that is not of the conceptual model
but is made visible to a user as a virtual relation
is called a view.
Silberschatz, Korth and Sudarshan 3.149 Database System Concepts
View Definition
A view is defined using the create view statement
which has the form
create view v as <query expression
where <query expression> is any legal relational
algebra query expression. The view name is
represented by v.
Once a view is defined, the view name can be
used to refer to the virtual relation that the view
generates.
View definition is not the same as creating a new
relation by evaluating the query expression
Rather, a view definition causes the saving of an
expression to be substituted into queries using
the view.
Silberschatz, Korth and Sudarshan 3.150 Database System Concepts
View Examples
Consider the view (named all-customer) consisting of
branches and their customers.
create view all-customer as
[
branch-name, customer-name
(depositor account)
[
branch-name, customer-name
(borrower loan)
We can find all customers of the Perryridge branch by writing:
[
branch-name
(o
branch-name = Perryridge
(all-customer))
Silberschatz, Korth and Sudarshan 3.151 Database System Concepts
Updates Through View
Database modifications expressed as views must be translated
to modifications of the actual relations in the database.
Consider the person who needs to see all loan data in the loan
relation except amount. The view given to the person, branch-
loan, is defined as:
create view branch-loan as
[
branch-name, loan-number
(loan)
Since we allow a view name to appear wherever a relation name
is allowed, the person may write:
branch-loan branch-loan {(Perryridge, L-37)}
Silberschatz, Korth and Sudarshan 3.152 Database System Concepts
Updates Through Views (Cont.)
The previous insertion must be represented by an insertion into
the actual relation loan from which the view branch-loan is
constructed.
An insertion into loan requires a value for amount. The insertion
can be dealt with by either.
rejecting the insertion and returning an error message
to the user.
inserting a tuple (L-37, Perryridge, null) into the loan
relation
Some updates through views are impossible to translate into
database relation updates
create view v as o
branch-name = Perryridge
(account))
v v (L-99, Downtown, 23)
Others cannot be translated uniquely
all-customer all-customer (Perryridge, John)
Have to choose loan or account, and
create a new loan/account number!
Silberschatz, Korth and Sudarshan 3.153 Database System Concepts
Views Defined Using Other Views
One view may be used in the expression defining another view
A view relation v
1
is said to depend directly on a view relation v
2
if v
2
is used in the expression defining v
1
A view relation v
1
is said to depend on view relation v
2
if either v
1
depends directly to v
2
or there is a path of dependencies from
v
1
to v
2
A view relation v is said to be recursive if it depends on itself.
Silberschatz, Korth and Sudarshan 3.154 Database System Concepts
View Expansion
A way to define the meaning of views defined in terms of other
views.
Let view v
1
be defined by an expression e
1
that may itself contain
uses of view relations.
View expansion of an expression repeats the following
replacement step:
repeat
Find any view relation v
i
in e
1
Replace the view relation v
i
by the expression defining v
i
until no more view relations are present in e
1
As long as the view definitions are not recursive, this loop will
terminate
Silberschatz, Korth and Sudarshan 3.155 Database System Concepts
Banking Example
branch (branch-name, branch-city, assets)
customer (customer-name, customer-street, customer-city)
account (account-number, branch-name, balance)
loan (loan-number, branch-name, amount)
depositor (customer-name, account-number)
borrower (customer-name, loan-number)
Silberschatz, Korth and Sudarshan 3.156 Database System Concepts
Basic Query Structure
A typical SQL query has the form:
select A
1
, A
2
, ..., A
n
from r
1
, r
2
, ..., r
m
where P
A
i
represents an attribute
R
i
represents a relation
P is a predicate.
This query is equivalent to the relational algebra
expression.
The result of an SQL query is a relation.
)) ( (
2 1 , , ,
2 1
m P A A A
r r r
n
[
o
Silberschatz, Korth and Sudarshan 3.157 Database System Concepts
157
SQL
SQL: widely used non-procedural language
E.g. find the name of the customer with customer-id 192-83-7465
select customer.customer-name
from customer
where customer.customer-id = 192-83-7465
E.g. find the balances of all accounts held by the customer with
customer-id 192-83-7465
select account.balance
from depositor, account
where depositor.customer-id = 192-83-7465 and
depositor.account-number = account.account-number
Application programs generally access databases through one of
Language extensions to allow embedded SQL
Application program interface (e.g. ODBC/JDBC) which allow SQL
queries to be sent to a database
Silberschatz, Korth and Sudarshan 3.158 Database System Concepts
158
The select Clause
The select clause corresponds to the projection
operation of the relational algebra. It is used to
list the attributes desired in the result of a query.
Find the names of all branches in the loan relation
select branch-name
from loan
In the pure relational algebra syntax, the query
would be:
[
branch-name
(loan)
Silberschatz, Korth and Sudarshan 3.159 Database System Concepts
SQL
An asterisk in the select clause denotes all
attributes
select *
from loan
NOTE: SQL does not permit the - character in
names, so you would use, for example,
branch_name instead of branch-name in a real
implementation. We use - since it looks nicer!
NOTE: SQL names are case insensitive, meaning
you can use upper case or lower case.
You may wish to use upper case in places where we use
bold font.
159
Silberschatz, Korth and Sudarshan 3.160 Database System Concepts
160
The select Clause (Cont.)
SQL allows duplicates in relations as well as in query
results.
To force the elimination of duplicates, insert the
keyword distinct after select.
Find the names of all branches in the loan relations,
and remove duplicates
select distinct branch-name
from loan
The keyword all specifies that duplicates not be removed.
select all branch-name
from loan
Silberschatz, Korth and Sudarshan 3.161 Database System Concepts
161
The select Clause (Cont.)
The select clause can contain arithmetic expressions
involving the operation, +, , -, and /, and operating
on constants or attributes of tuples.
The query:
select loan-number, branch-name, amount - 100
from loan
would return a relation which is the same as the loan
relations, except that the attribute amount is multiplied
by 100.
Silberschatz, Korth and Sudarshan 3.162 Database System Concepts
162
The where Clause
The where clause corresponds to the selection predicate
of the relational algebra. If consists of a predicate
involving attributes of the relations that appear in the from
clause.
The find all loan number for loans made a the Perryridge
branch with loan amounts greater than $1200.
select loan-number
from loan
where branch-name = Perryridge and amount > 1200
Comparison results can be combined using the logical
connectives and, or, and not.
Comparisons can be applied to results of arithmetic
expressions.
Silberschatz, Korth and Sudarshan 3.163 Database System Concepts
163
The where Clause (Cont.)
SQL Includes a between comparison operator in
order to simplify where clauses that specify that a
value be less than or equal to some value and greater
than or equal to some other value.
Find the loan number of those loans with loan
amounts between $90,000 and $100,000 (that is,
>$90,000 and s$100,000)
select loan-number
from loan
where amount between 90000 and 100000
Silberschatz, Korth and Sudarshan 3.164 Database System Concepts
The Rename Operation
SQL allows renaming relations and attributes using the as clause:
old-name as new-name
E.g. Find the name, loan number and loan amount of all customers;
rename the column name loan_number as loan_id.
select customer_name, borrower.loan_number as loan_id, amount
from borrower, loan
where borrower.loan_number = loan.loan_number
Silberschatz, Korth and Sudarshan 3.165 Database System Concepts
Tuple Variables
Tuple variables are defined in the from clause via the use
of the as clause.
Find the customer names and their loan numbers and
amount for all customers having a loan at some branch.
Find the names of all branches that have greater assets than
some branch located in Brooklyn.
select distinct T.branch_name
from branch as T, branch as S
where T.assets > S.assets and S.branch_city = 'Brooklyn'
Keyword as is optional and may be omitted
borrower as T borrower T
Some database such as Oracle require as to be omitted
select customer_name, T.loan_number, S.amount
from borrower as T, loan as S
where T.loan_number = S.loan_number
Silberschatz, Korth and Sudarshan 3.166 Database System Concepts
String Operations
SQL includes a string-matching operator for comparisons
on character strings. The operator like uses patterns that
are described using two special characters:
percent (%). The % character matches any substring.
underscore (_). The _ character matches any character.
Find the names of all customers whose street includes the
substring Main.
select customer_name
from customer
where customer_street like '% Main%'
Match the name Main%
like 'Main\%' escape '\'
SQL supports a variety of string operations such as
concatenation (using ||)
converting from upper to lower case (and vice versa)
finding string length, extracting substrings, etc.
Silberschatz, Korth and Sudarshan 3.167 Database System Concepts
Ordering the Display of Tuples
List in alphabetic order the names of all customers having a
loan in Perryridge branch
select distinct customer_name
from borrower, loan
where borrower loan_number = loan.loan_number and
branch_name = 'Perryridge'
order by customer_name
We may specify desc for descending order or asc for
ascending order, for each attribute; ascending order is the
default.
Example: order by customer_name desc
Silberschatz, Korth and Sudarshan 3.168 Database System Concepts
Duplicates
In relations with duplicates, SQL can define how many copies
of tuples appear in the result.
Multiset versions of some of the relational algebra operators
given multiset relations r
1
and r
2
:
1. o
u
(r
1
): If there are c
1
copies of tuple t
1
in r
1
, and t
1
satisfies selections o
u,
, then there are c
1
copies of t
1
in
o
u
(r
1
).
2. H
A
(r ): For each copy of tuple t
1
in r
1
, there is a copy
of tuple H
A
(t
1
) in H
A
(r
1
) where H
A
(t
1
) denotes the
projection of the single tuple t
1
.
3. r
1
x r
2
: If there are c
1
copies of tuple t
1
in r
1
and c
2
copies of tuple t
2
in r
2
, there are c
1
x c
2
copies of the
tuple t
1
. t
2
in r
1
x r
2
Silberschatz, Korth and Sudarshan 3.169 Database System Concepts
Duplicates (Cont.)
Example: Suppose multiset relations r
1
(A, B) and r
2
(C)
are as follows:
r
1
= {(1, a) (2,a)} r
2
= {(2), (3), (3)}
Then H
B
(r
1
) would be {(a), (a)}, while H
B
(r
1
) x r
2
would be
{(a,2), (a,2), (a,3), (a,3), (a,3), (a,3)}
SQL duplicate semantics:
select A
1
,
,
A
2
, ..., A
n
from r
1
, r
2
, ..., r
m
where P
is equivalent to the multiset version of the expression:
)) ( (
2 1 , , ,
2 1
m P A A A
r r r
n
[
o
Silberschatz, Korth and Sudarshan 3.170 Database System Concepts
Set Operations
The set operations union, intersect, and except operate on
relations and correspond to the relational algebra operations
, , .
Each of the above operations automatically eliminates
duplicates; to retain all duplicates use the corresponding
multiset versions union all, intersect all and except all.
Suppose a tuple occurs m times in r and n times in s, then, it
occurs:
m
+ n times in r union all s
min(m,n) times in r intersect all s
max(0, m n) times in r except all s
Silberschatz, Korth and Sudarshan 3.171 Database System Concepts
Set Operations
Find all customers who have a loan, an account, or
both:
(select customer_name from depositor)
except
(select customer_name from borrower)
(select customer_name from depositor)
intersect
(select customer_name from borrower)
Find all customers who have an account but no loan.
(select customer_name from depositor)
union
(select customer_name from borrower)
Find all customers who have both a loan and an account.
Silberschatz, Korth and Sudarshan 3.172 Database System Concepts
Aggregate Functions
These functions operate on the multiset
of values of a column of a relation, and
return a value
avg: average value
min: minimum value
max: maximum value
sum: sum of values
count: number of values
Silberschatz, Korth and Sudarshan 3.173 Database System Concepts
Aggregate Functions (Cont.)
Find the average account balance at the Perryridge branch.
Find the number of depositors in the bank.
Find the number of tuples in the customer relation.
select avg (balance)
from account
where branch_name = 'Perryridge'
select count (*)
from customer
select count (distinct customer_name)
from depositor
Silberschatz, Korth and Sudarshan 3.174 Database System Concepts
Aggregate Functions Group By
Find the number of depositors for each branch.
Note: Attributes in select clause outside of aggregate functions must
appear in group by list
select branch_name, count (distinct customer_name)
from depositor, account
where depositor.account_number = account.account_number
group by branch_name
Silberschatz, Korth and Sudarshan 3.175 Database System Concepts
Aggregate Functions Having Clause
Find the names of all branches where the average account
balance is more than $1,200.
Note: predicates in the having clause are applied after the
formation of groups whereas predicates in the where
clause are applied before forming groups
select branch_name, avg (balance)
from account
group by branch_name
having avg (balance) > 1200
Silberschatz, Korth and Sudarshan 3.176 Database System Concepts
Views
In some cases, it is not desirable for all users to
see the entire logical model (i.e., all the actual
relations stored in the database.)
Consider a person who needs to know a
customers loan number but has no need to see
the loan amount. This person should see a
relation described, in the relational algebra, by
[
customer-name, loan-number
(borrower loan)
Any relation that is not of the conceptual model
but is made visible to a user as a virtual relation
is called a view.
Silberschatz, Korth and Sudarshan 3.177 Database System Concepts
View Definition
A view is defined using the create view statement
which has the form
create view v as <query expression
where <query expression> is any legal relational
algebra query expression. The view name is
represented by v.
Once a view is defined, the view name can be
used to refer to the virtual relation that the view
generates.
View definition is not the same as creating a new
relation by evaluating the query expression
Rather, a view definition causes the saving of an
expression to be substituted into queries using
the view.
Silberschatz, Korth and Sudarshan 3.178 Database System Concepts
View Definition
A relation that is not of the conceptual model but is
made visible to a user as a virtual relation is called a
view.
A view is defined using the create view statement which
has the form
create view v as < query expression >
where <query expression> is any legal SQL expression.
The view name is represented by v.
Once a view is defined, the view name can be used to
refer to the virtual relation that the view generates.
Silberschatz, Korth and Sudarshan 3.179 Database System Concepts
Example Queries
A view consisting of branches and their customers
Find all customers of the Perryridge branch
create view all_customer as
(select branch_name, customer_name
from depositor, account
where depositor.account_number =
account.account_number )
union
(select branch_name, customer_name
from borrower, loan
where borrower.loan_number = loan.loan_number )
select customer_name
from all_customer
where branch_name = 'Perryridge'
Silberschatz, Korth and Sudarshan 3.180 Database System Concepts
Uses of Views
Hiding some information from some users
Consider a user who needs to know a
customers name, loan number and branch
name, but has no need to see the loan
amount.
Define a view
(create view cust_loan_data as
select customer_name,
borrower.loan_number, branch_name
from borrower, loan
where borrower.loan_number =
loan.loan_number )
Grant the user permission to read
cust_loan_data, but not borrower or loan
Predefined queries to make writing of other queries easier
Common example: Aggregate queries used
for statistical analysis of data
Silberschatz, Korth and Sudarshan 3.181 Database System Concepts
Processing of Views
When a view is created
the query expression is stored in the
database along with the view name
the expression is substituted into any query
using the view
Views definitions containing views
One view may be used in the expression
defining another view
A view relation v
1
is said to depend directly
on a view relation v
2
if v
2
is used in the
expression defining v
1
A view relation v
1
is said to depend on view
relation v
2
if either v
1
depends directly to v
2
or there is a path of dependencies from v
1
to v
2
A view relation v is said to be recursive if it
depends on itself.
Silberschatz, Korth and Sudarshan 3.182 Database System Concepts
View Expansion
A way to define the meaning of views defined in terms of other
views.
Let view v
1
be defined by an expression e
1
that may itself contain
uses of view relations.
View expansion of an expression repeats the following
replacement step:
repeat
Find any view relation v
i
in e
1
Replace the view relation v
i
by the expression defining v
i
until no more view relations are present in e
1
As long as the view definitions are not recursive, this loop will
terminate
Silberschatz, Korth and Sudarshan 3.183 Database System Concepts
183
Database Design
Introduction
Design issues : Pitfalls and goodness of a design
Feasibility study: Costs-Benefits
Evaluating benefits of Information system
Identifying user requirements.
Types of Database users and their inputs for study.
Requirement analysis.
Designing systems with UML.
E-R Diagrams to UML notations.
Higher level views through packages.
Silberschatz, Korth and Sudarshan 3.184 Database System Concepts
184
Database design : Introduction
Database design can be discussed in two aspects;
Technical aspect.
Commercial aspect.
Technical aspect of the design will consider the ER model and
related design aspect.Whereas the commercial aspect will
consider the Feasibility study , in terms of Costs incurred and
benefits derived, by introducing the proposed Information
system.
For Technical aspect of the database design, We shall consider
the Relational model and related design issues.
The use of an attribute or entity set to represent an object.
Whether a real-world concept is best expressed by an entity set or a
relationship set.
The use of a strong or weak entity set.
Silberschatz, Korth and Sudarshan 3.185 Database System Concepts
185
RDBMS Design issues
A database consists of multiple relations
Information about an enterprise is broken up into parts, with each
relation storing one part of the information
E.g.: account : stores information about accounts
depositor : stores information about which customer
owns which account
customer : stores information about customers
Each Relational database schema consists of a number of relation
schemas.
Ex. The Relational database schema for Banking Enterprise, consists
of many Relation schemas ( Entity-Relationships).
So far we have assumed that Entity - attributes are grouped to form
a relation schema by using the common sense of database
designer or by mapping a schema defined by ER model.
We still need some formal measure of why one grouping of
attributes into a relation schema may be better than another.
In general, the goal of schema design is to reduce various
anomalies.
Silberschatz, Korth and Sudarshan 3.186 Database System Concepts
186
Pitfalls in Relational Database Design
Relational database design requires that we find a
good collection of relation schemas. A bad design
may lead to
Repetition of Information.
Inability to represent certain information.
Design Goals:
Avoid redundant data
Ensure that relationships among attributes are
represented
Facilitate the checking of updates for violation of
database integrity constraints.
Silberschatz, Korth and Sudarshan 3.187 Database System Concepts
187
Normalization
When Various database designs are available then it is important to find
out the criterion , based on which one can say that one particular
design is better than other.
One particular design can be told to be better than other, if there are
minimum anomalies in the database.
The basic objective of normalization is to reduce the various anomalies
in the database.
Normalization can be looked upon as a process of analyzing the given
relation schemas based on their FDs and primary keys to achieve the
desirable properties of ;
Minimizing redundancy
Minimizing the insertion, deletion, and update anomalies.
Unsatisfactory relation schemas that do not meet certain conditions
the normal form tests are decomposed into smaller relation schemas
that meet the tests and hence possess the desirable properties.
Thus, the normalization procedure provides database designers with;
A formal framework for analyzing relation schemas based on their keys and
on the functional dependencies among their attributes.
A series of normal form tests that can be carried out on individual relation
schemas so that the relational database can be normalized to any desired
degree.
Silberschatz, Korth and Sudarshan 3.188 Database System Concepts
188
Relational Database Design
RDBMS design issues Pitfalls and Normalization.
First Normal Form
Pitfalls in Relational Database Design
Functional Dependencies
Decomposition
Boyce-Codd Normal Form
Third Normal Form
Multivalued Dependencies and Fourth Normal Form
Overall Database Design Process