The Relational Data Model (Based On Chapter 5)
The Relational Data Model (Based On Chapter 5)
The Relational Data Model (Based On Chapter 5)
(Based on Chapter 5)
1
1. Relational Model Concepts
6
R is also called the intension of a relation
r is also called the extension of a relation
Let S1 = {0,1}
Let S2 = {a,b,c}
Let R be a subset-of S1 X S2
for example: r(R) = {<0.a> , <0,b> , <1,c> }
7
DEFINITION SUMMARY
Table Relation
Column Attribute/Domain
Row Tuple
Values in a column Domain
Table Definition Schema of Relation
Populated Table Extension
8
Figure 7.1 The attributes and tuples of a relation STUDENT.
9
2 Characteristics of Relations
10
Values in a tuple: All values are considered atomic
(indivisible). A special null value is used to represent
values that are unknown or inapplicable to certain
tuples.
Notation:
- We refer to component values of a tuple t by
t[Ai] = vi (the value of attribute Ai for tuple t).
- Similarly, t[Au, Av, ..., Aw] refers to the subtuple of t
containing the values of attributes Au, Av, ..., Aw,
respectively.
11
Figure 7.2 The relation STUDENT from Figure 7.1, with a different
order of tuples
12
3 Relational Integrity Constraints
• Key constraints
• Entity integrity constraints,
• Referential integrity constraints
13
3.1 Key Constraints
Superkey of R: A set of attributes SK of R such
that no two tuples in any valid relation instance
r(R) will have the same value for SK. That is,
for any distinct tuples t1 and t2 in r(R), t1[SK]
<> t2[SK].
14
Example: The CAR relation schema:
CAR(State, Reg#, SerialNo, Make, Model, Year)
has two keys Key1 = {State, Reg#}, Key2 = {SerialNo},
which are also superkeys. {SerialNo, Make} is a superkey
but not a key.
15
Figure 7.4 The CAR relation with two candidate keys: LicenseNumber
and EngineSerialNumber.
16
Figure 7.5 Schema diagram for the COMPANY relational database
schema; the primary keys are underlined.
17
Figure 7.5 Schema diagram for the COMPANY relational database
schema; the primary keys are underlined.
18
Figure 7.6 (continued)
19
3.2 Entity Integrity
Relational Database Schema: A set S of relation schemas
that belong to the same database. S is the name of the
database.
S = {R1, R2, ..., Rn}
Entity Integrity: The primary key attributes PK of each relation
schema R in S cannot have null values in any tuple of r(R). This is
because primary key values are used to identify the individual
tuples.
t[PK] <> null for any tuple t in r(R)
Note: Other attributes of R may be similarly constrained to disallow
null values, even though they are not members of the primary key.
20
3.3 Referential Integrity
A constraint involving two relations (the previous constraints
involve a single relation).
Used to specify a relationship among tuples in two relations: the
referencing relation and the referenced relation.
Tuples in the referencing relation R1 have attributes FK (called
foreign key attributes) that reference the primary key attributes
PK of the referenced relation
R2. A tuple t1 in R1 is said to reference a tuple t2 in R2 if t1[FK] =
t2[PK].
A referential integrity constraint can be displayed in a relational
database schema as a directed arc from R1.FK to R2.
21
Figure 7.7 Referential integrity constraints displayed on the COMPANY relational
database schema diagram.
22
Figure 7.6 One possible relational database state corresponding to the
company schema.
23
4 Update Operations on Relations
- INSERT a tuple.
- DELETE a tuple.
- MODIFY a tuple.
- Integrity constraints should not be violated by the update
operations.
- Several update operations may have to be grouped together.
- Updates may propagate to cause other updates
automatically. This may be necessary to maintain integrity
constraints.
- 24
In case of integrity violation, several actions can be
taken:
- cancel the operation that causes the violation
(REJECT optiom)
- perform the operation but inform the user of the
violation
- trigger additional updates so the violation is
corrected (CASCADE option, SET NULL option)
- execute a user-specified error-correction routine
25
5 The Relational Algebra
28
PROJECT operation (denoted by ):
- Keeps only certain attributes (columns) from a relation R
specified in an attribute list L
- Form of operation: L(R)
- Resulting relation has only those attributes of R specified in
L
Example: FNAME,LNAME,SALARY(EMPLOYEE)
- The PROJECT operation eliminates duplicate tuples in the
resulting relation so that it remains a mathematical set (no
duplicate elements)
29
Example: SEX,SALARY(EMPLOYEE)
30
Sequences of operations:
- Several operations can be combined to form a relational
algebra expression (query)
FNAME,LNAME,SALARY ( DNO=4(EMPLOYEE) )
- Alternatively, we specify explicit intermediate relations for
each step:
DEPT4_EMPS <- DNO=4(EMPLOYEE)
R <- FNAME,LNAME,SALARY(DEPT4_EMPS)
-
31
Attributes can optionally be renamed in the
resulting left-hand-side relation (this may
be required for some operations that will be
presented later):
DEPT4_EMPS <- DNO=4(EMPLOYEE)
R(FIRSTNAME,LASTNAME,SALARY)
<-
FNAME,LNAME,SALARY(DEPT4_EMPS)
32
Figure 7.8 Results of SELECT and PROJECT operations.
(a) (DNO=4 AND SALARY>25000) OR (DNO=5 AND SALARY>30000)(EMPLOYEE).
(b) LNAME, FNAME, SALARY(EMPLOYEE). (c) SEX, SALARY(EMPLOYEE).
33
Class Number – CS 304
Class Name -
DBMS
Instructor –
Sanjay Madria
36
Figure 7.11 Illustrating the set operations union, intersection and difference.
(a) Two union compatible relations.
(b) STUDENT INSTRUCTOR. (c) STUDENT INSTRUCTOR
(d) STUDENT - INSTRUCTOR (e) INSTRUCTOR - STUDENT
37
CARTESIAN PRODUCT
39
5.3 JOIN Operations
THETA JOIN: Similar to a CARTESIAN PRODUCT followed by a
SELECT.
The condition c is called a join condition.
R(A1, A2, ..., Am, B1, B2, ..., Bn) <-R1(A1, A2, ..., Am) X c R2 (B1,
B2, ..., Bn)
Here c can be <, >, =, <=, >=
EQUIJOIN: The join condition c includes one or more equality
comparisons involving attributes from R1 and R2. That is, c is of
the form:
(Ai=Bj) AND ... AND (Ah=Bk); 1<i,h<m, 1<j,k<n
In the above EQUIJOIN operation:
Ai, ..., Ah are called the join attributes of R1
Bj, ..., Bk are called the join attributes of R2
40
Example of using EQUIJOIN:
41
NATURAL JOIN (*):
In an EQUIJOIN R <- R1 X c R2, the join attribute of R2 appear
redundantly in the result relation R. In a NATURAL JOIN, the
redundant join attributes of R2 are eliminated from R. The
equality condition is implied and need not be specified.
R <- R1 *(join attributes of R1),(join attributes of R2) R2
42
If the join attributes have the same names in both
relations, they need not be specified and we can
write R <- R1 * R2.
43
Figure 7.13 Illustrating the JOIN operation.
44
Figure 7.14 An illustration of the NATURAL JOIN operation.
(a) PROJ_DEPT PROJECT * DEPT.
(b) DEPT_LOCS DEPARTMENT * DEPT_LOCATIONS.
45
Note: In the original definition of NATURAL JOIN, the
join attributes were required to have the same names in
both relations.
There can be a more than one set of join attributes with a
different meaning between the same two relations. For
example:
JOIN ATTRIBUTES RELATIONSHIP
EMPLOYEE.SSN= EMPLOYEE manages
DEPARTMENT.MGRSSN the DEPARTMENT
EMPLOYEE.DNO= EMPLOYEE works for
DEPARTMENT.DNUMBER the DEPARTMENT
46
A relation can have a set of join attributes to join it with
itself :
JOIN ATTRIBUTES RELATIONSHIP
EMPLOYEE(1).SUPERSSN= EMPLOYEE(2) supervises
EMPLOYEE(2).SSN EMPLOYEE(1)
- One can think of this as joining two distinct copies of the
relation, although only one relation actually exists
- In this case, renaming can be useful
47
Figure 7.15 Illustrating the division operation.
(a) Dividing SSN_PNOS by SMITH_PNOS. (b) T R S.
48
Complete Set of Relational Algebra Operations:
- All the operations discussed so far can be described as a
sequence of only the operations SELECT, PROJECT, UNION,
SET DIFFERENCE, and CARTESIAN PRODUCT.
- Hence, the set { , , U, - , X } is called a complete set of
relational algebra operations. Any query language equivalent to
these operations is called relationally complete.
- For database applications, additional operations are needed
that were not part of the original relational algebra. These
include:
1. Aggregate functions and grouping.
2. OUTER JOIN.
49
5.4 Additional Relational Operations
AGGREGATE FUNCTIONS
- Functions such as SUM, COUNT, AVERAGE, MIN, MAX
are often applied to sets of values or sets of tuples in database
applications
<grouping attributes> F<function list> (R)
- The grouping attributes are optional
Example 1: Retrieve the average salary of all employees (no grouping
needed):
R(AVGSAL) <- F AVERAGE SALARY (EMPLOYEE)
50
Example 2: For each department, retrieve the department
number, the number of employees, and the average salary
(in the department):
R(DNO,NUMEMPS,AVGSAL) <-
DNO F COUNT SSN, AVERAGE SALARY (EMPLOYEE)
DNO is called the grouping attribute in the above example
51
Figure 7.16 An illustration of the AGGREGATE FUNCTION operation.
(a) R(DNO, NO_OF_EMPLOYEES, AVERAGE_SAL) DNO COUNT SSN, AVERAGE
SALARY (EMPLOYEE). (b) DNO COUNT SSN, AVERAGE SALARY(EMPLOYEE).
(C) COUNT SSN, AVERAGE SALARY(EMPLOYEE).
52
OUTER JOIN
- In a regular EQUIJOIN or NATURAL JOIN operation,
tuples in R1 or R2 that do not have matching tuples in the other
relation do not appear in the result
- Some queries require all tuples in R1 (or R2 or both) to
appear in the result
- When no matching tuples are found, nulls are placed for the
missing attributes
-
53
LEFT OUTER JOIN: R1 X R2 lets every tuple in R1
appear in the result
- RIGHT OUTER JOIN: R1 X R2 lets every tuple in
R2 appear in the result
- FULL OUTER JOIN: R1 X R2 lets every tuple in R1
or R2 appear in the result
54
Figure 7.18 The LEFT OUTER JOIN operation.
55
RENAME Operator
S(R) – Renaming R as S
(B1, B2,…Bn) (R) – Renaming attributes of R as Bi’s.
56
Some Queries
Q. Retrieve the SSNs of all the employees who either work in
dept. 5 or supervise an employee who works in dept 5.
Dept-Emps DNO = 5 (Employee)
Result1 SSN (Dept-Emps)
Result 2(SSN) SuperSSN (Dept-Emps)
Result = Result 1 Result 2
57
Q. Find for each female employee, a list of names of her dependents.
Female-Emp SEX = F (Employee)
Empname FNAME, LNAME, SSN (Female-Emp)
Emp-dep Empname Dependent
Actual-dep SSN = ESSN (Emp-dep)
Result FNAME, LNAME, Dependent-name (Actual-dep)
Q. Find the name of the manager of each department.
Dept-mgr Department MGRSSN = SSN (Employee)
Result DNAME, LNAME, FNAME (Dept-mgr)
58
• List names of managers who have atleast
one dependent
• Find names of employees who have no
dependents
• List names of all employees with two or
more dependents
59
Figure 7.12 An illustration of the CARTESIAN PRODUCT operation.
60
Figure 7.12 (continued)
61
The Insert Operation
1. Insert <‘Cecilia’, ‘F”, “Kolonsky’, null, ‘1960-04-
05’, ‘6357 Windy Lane, Katy, TX’, F, 28000, null,
4> into EMPLOYEE
This insertion violates the entity integrity constraint
(null for the primary key SSN), so it is rejected.
2. Insert <‘Alicia’, ‘J’, ‘Zelaya’, 999887777, 1960-
04-05’, ‘6357 Windy Lane, Katy, TX’, F, 28000,
987654321’, 4> into EMPLOYEE
This insertion violates the key constraint because
another tuple with the same SSN Value already
exists in the EMPLOYEE relation, and so it is
rejected.
62
The Insert Operation (contd.)
3. Insert <‘Cecilia’, ‘F’, ‘Kolonsky’, ‘67768989’, 1960-
04-05’, ‘6357 Windswept, katy, TX’, F, 28000,
‘987654321’, 7> into EMPLOYEE
This insertion violates the referential integrity
constraint specified on DNO because no
DEPARTMENT tuple exists with DNUMBER = 7
4. Insert <‘Cecilia’, ‘F’, ‘Kolonsky’, ‘67768989’, 1960-
04-05’, ‘6357 Windy Lane, Katy, TX’, F, 28000, null,
4> into EMPLOYEE.
This insertion satisfies all constraints, so it is
acceptable.
63
The Delete Operation
1. Delete the WORKS_ON tuple with ESSN =
‘999887777’ and PNO = 10.
This deletion is acceptable
64
The Delete Operation (contd.)
3. Delete the EMPLOYEE tuple with SSN= ‘333445555’
This deletion will result in even worse referential integrity
violations, because the tuple involved is referenced by tuples
from the EMPLOYEE, DEPARTMENT, WORKS_ON and
DEPENDENT relations.
• Options
• Reject deletion
• Propagate deletion by deleting other tuples
• Modify the referencing attributes that cause the violations
• Or combination of three above
65
The Update Operation
1. Update the SALARY of the EMPLOYEE tuple
with SSN = ‘999887777; to 28000
Acceptable
2. Update the DNO of the EMPLOYEE tuple with
SSN = ‘999887777’ to 1
Acceptable
3. Update the DNO of the EMPLOYEE tuple
with SSN = ‘999887777’ to 7
Unacceptable, because it violates referential
integrity.
66
The Update Operation (contd.)
4. Update the SSN of the EMPLOYEE tuple with
SSN = ‘999887777’ to ‘987654321’
Unacceptable, because it violates primary key
and referential integrity constraints.
67