Formal Relational Query Languages

Download as ppt, pdf, or txt
Download as ppt, pdf, or txt
You are on page 1of 26

Chapter 6: Formal Relational

Query Languages
Outline

• Relational Algebra
• Tuple Relational Calculus
• Domain Relational Calculus
Relational Algebra
• Procedural language
• Six basic operators
• select: 
• project: 
• union: 
• set difference: –
• Cartesian product: x
• rename: 
• The operators take one or two relations as inputs
and produce a new relation as a result.
Select Operation
• Notation:  (r)
p
• p is called the selection predicate
• Defined as:

p(r) = {t | t  r and p(t)}

Where p is a formula in propositional calculus consisting of terms connected


by :  (and),  (or),  (not)
Each term is one of:
<attribute> op <attribute> or <constant>
where op is one of: =, , >, . <. 

• Example of selection:

 dept_name=“Physics”(instructor)
Project Operation
• Notation:  (r )
A1 , A2 ,, Ak

where A1, A2 are attribute names and r is a relation


name.
• The result is defined as the relation of k columns
obtained by erasing the columns that are not listed
• Duplicate rows removed from result, since relations
are sets
• Example: To eliminate the dept_name attribute of
instructor
ID, name, salary (instructor)
Union Operation
• Notation: r  s
• Defined as:
r  s = {t | t  r or t  s}
• For r  s to be valid.
1. r, s must have the same arity (same number of attributes)
2. The attribute domains must be compatible (example: 2nd
column
of r deals with the same type of values as does the 2nd
column of s)
• Example: to find all courses taught in the Fall 2009 semester, or
in the Spring 2010 semester, or in both
course_id ( semester=“Fall” Λ year=2009 (section)) 

course_id ( semester=“Spring” Λ year=2010 (section))


Set Difference Operation
• Notation r – s
• Defined as:
r – s = {t | t  r and t  s}

• Set differences must be taken between compatible relations.


• r and s must have the same arity
• attribute domains of r and s must be compatible

• Example: to find all courses taught in the Fall 2009 semester, but not in
the Spring 2010 semester

course_id ( semester=“Fall” Λ year=2009 (section)) −

course_id ( semester=“Spring” Λ year=2010 (section))


Set-Intersection Operation
• Notation: r  s
• Defined as:
• r  s = { t | t  r and t  s }
• Assume:
• r, s have the same arity
• attributes of r and s are compatible
• Note: r  s = r – (r – s)
Cartesian-Product Operation
• Notation r x s
• Defined as:
r x s = {t q | t  r and q  s}

• Assume that attributes of r(R) and


s(S) are disjoint. (That is, R  S =
).
• If attributes of r(R) and s(S) are not
disjoint, then renaming must be
used.
Rename Operation
• Allows us to name, and therefore to refer to, the results
of relational-algebra expressions.
• Allows us to refer to a relation by more than one name.
• Example:
 x (E)

returns the expression E under the name X


• If a relational-algebra expression E has arity n, then
 x ( A1 , A2 ,..., An ) ( E )
returns the result of expression E under the name X,
and with the
attributes renamed to A1 , A2 , …., An .
Formal Definition
• A basic expression in the relational algebra consists of
either one of the following:
• A relation in the database
• A constant relation
• Let E1 and E2 be relational-algebra expressions; the
following are all relational-algebra expressions:
• E1  E2
• E1 – E2
• E1 x E2
• p (E1), P is a predicate on attributes in E1
• s(E1), S is a list consisting of some of the attributes in E1
•  x (E1), x is the new name for the result of E1
Tuple Relational Calculus
Tuple Relational Calculus
• A nonprocedural query language, where each query is of the
form
{t | P (t ) }
• It is the set of all tuples t such that predicate P is true for t
• t is a tuple variable, t [A ] denotes the value of tuple t on
attribute A
• t  r denotes that tuple t is in relation r
• P is a formula similar to that of the predicate calculus
Predicate Calculus Formula
1. Set of attributes and constants
2. Set of comparison operators: (e.g., , , ,
, , )
3. Set of connectives: and (), or (v)‚ not ()
4. Implication (): x  y, if x if true, then y is
true
x  y x v y
5. Set of quantifiers:
 t r (Q (t )) ”there exists” a tuple in t in relation r
such that predicate Q (t ) is true
 t r (Q (t )) Q is true “for all” tuples t in relation r
Example Queries
• Find the ID, name, dept_name, salary for
instructors whose salary is greater than $80,000
{t | t  instructor  t [salary ]  80000}
Notice that a relation on schema (ID, name, dept_name, salary) is
implicitly defined by the query
 As in the previous query, but output only the ID attribute value

{t |  s instructor (t [ID ] = s [ID ]  s [salary ]  80000)}

Notice that a relation on schema (ID) is implicitly defined by


the query
Example Queries
• Find the names of all instructors whose
department is in the Watson building
{t | s  instructor (t [name ] = s [name ]
 u  department (u [dept_name ] = s[dept_name] “
 u [building] = “Watson” ))}

 Find the set of all courses taught in the Fall 2009 semester, or in
the Spring 2010 semester, or both

{t | s  section (t [course_id ] = s [course_id ] 


s [semester] = “Fall”  s [year] = 2009
v u  section (t [course_id ] = u [course_id ] 
u [semester] = “Spring”  u [year] = 2010 )}
Example Queries
 Find the set of all courses taught in the Fall 2009 semester, and in
the Spring 2010 semester

{t | s  section (t [course_id ] = s [course_id ] 


s [semester] = “Fall”  s [year] = 2009
 u  section (t [course_id ] = u [course_id ] 
u [semester] = “Spring”  u [year] = 2010 )}

 Find the set of all courses taught in the Fall 2009 semester, but not in
the Spring 2010 semester

{t | s  section (t [course_id ] = s [course_id ] 


s [semester] = “Fall”  s [year] = 2009
  u  section (t [course_id ] = u [course_id ] 
u [semester] = “Spring”  u [year] = 2010 )}
Universal Quantification
• Find all students who have taken all courses offered in the Biology department
• {t |  r  student (t [ID] = r [ID]) 
( u  course (u [dept_name]=“Biology” 
 s  takes (t [ID] = s [ID ] 
s [course_id] = u [course_id]))}
Safety of Expressions
• It is possible to write tuple calculus expressions that
generate infinite relations.
• For example, { t |  t r } results in an infinite
relation if the domain of any attribute of relation r
is infinite
• To guard against the problem, we restrict the set of
allowable expressions to safe expressions.
• An expression {t | P (t )} in the tuple relational
calculus is safe if every component of t appears in
one of the relations, tuples, or constants that
appear in P
• NOTE: this is more than just a syntax condition.
• E.g. { t | t [A] = 5  true } is not safe --- it defines an infinite set
with attribute values that do not appear in any relation or
tuples or constants in P.
Safety of Expressions (Cont.)
• Consider again that query to find all students who have taken all courses offered in the Biology
department
• {t |  r  student (t [ID] = r [ID]) 
( u  course (u [dept_name]=“Biology” 
 s  takes (t [ID] = s [ID ] 
s [course_id] = u [course_id]))}
• Without the existential quantification on student, the above query would be unsafe if the Biology
department has not offered any courses.
Domain Relational Calculus
Domain Relational Calculus
• A nonprocedural query language equivalent in
power to the tuple relational calculus
• Each query is an expression of the form:

{  x , x , …, x  | P (x , x , …, x )}
1 2 n 1 2 n

• x1, x2, …, xn represent domain variables


• P represents a formula similar to that of the predicate
calculus
Example Queries
• Find the ID, name, dept_name, salary for instructors
whose salary is greater than $80,000
• {< i, n, d, s> | < i, n, d, s>  instructor  s  80000}
• As in the previous query, but output only the ID
attribute value
• {< i> | < i, n, d, s>  instructor  s  80000}
• Find the names of all instructors whose department is
in the Watson building
{< n > |  i, d, s (< i, n, d, s >  instructor
  b, a (< d, b, a>  department  b =
“Watson” ))}
Example Queries
 Find the set of all courses taught in the Fall 2009 semester, or in
the Spring 2010 semester, or both
{<c> |  a, s, y, b, r, t ( <c, a, s, y, b, r, t >  section 
s = “Fall”  y = 2009 )
v  a, s, y, b, r, t ( <c, a, s, y, b, r, t >  section ] 
s = “Spring”  y = 2010)}
This case can also be written as
{<c> |  a, s, y, b, r, t ( <c, a, s, y, b, r, t >  section 
( (s = “Fall”  y = 2009 ) v (s = “Spring”  y = 2010))}
 Find the set of all courses taught in the Fall 2009 semester, and in
the Spring 2010 semester

{<c> |  a, s, y, b, r, t ( <c, a, s, y, b, r, t >  section 


s = “Fall”  y = 2009 )
  a, s, y, b, r, t ( <c, a, s, y, b, r, t >  section ] 
s = “Spring”  y = 2010)}
Safety of Expressions
The expression:
{  x1, x2, …, xn  | P (x1, x2, …, xn )}

is safe if all of the following hold:


1.All values that appear in tuples of the expression are
values from dom (P ) (that is, the values appear either in
P or in a tuple of a relation mentioned in P ).
2.For every “there exists” subformula of the form  x (P1(x
)), the subformula is true if and only if there is a value of
x in dom (P1) such that P1(x ) is true.
3.For every “for all” subformula of the form x (P1 (x )),
the subformula is true if and only if P1(x ) is true for all
values x from dom (P1).
Universal

Quantification
Find all students who have taken all courses offered in the Biology
department
• {< i > |  n, d, tc ( < i, n, d, tc >  student 
( ci, ti, dn, cr ( < ci, ti, dn, cr >  course  dn =“Biology”
  si, se, y, g ( <i, ci, si, se, y, g>  takes ))}
• Note that without the existential quantification on student, the
above query would be unsafe if the Biology department has not
offered any courses.

You might also like