Unit 2

Relational Query Languages
Relational query languages use relational algebra to break the user requests
and instruct the DBMS to execute the requests. It is the language by which user
communicates with the database. These relational query languages can be
procedural or non-procedural.
Prepared by : Dr. Ahmad Jamal

Procedural Query Language
A procedural query language will have set of queries instructing the DBMS to
perform various transactions in the sequence to meet the user request. For
example, get_CGPA procedure will have various queries to get the marks of
student in each subject, calculate the total marks, and then decide the CGPA
based on his total marks. This procedural query language tells the database
what is required from the database and how to get them from the database.
Relational algebra is a procedural query language.

Non-Procedural Query Language
Non-procedural queries will have single query on one or more tables to get
result from the database. For example, get the name and address of the student
with particular ID will have single query on STUDENT table. Relational
Calculus is a non procedural language which informs what to do with the
tables, but doesn’t inform how to accomplish this.
These query languages basically will have queries on tables in the database. In
the relational database, a table is known as relation. Records / rows of the table
are referred as tuples. Columns of the table are also known as attributes. All
these names are used interchangeably in relational database.

Relational Algebra
Relational algebra is a procedural query language. It takes one or more
relations / tables and performs the operation and produce the result. This result
is also considered as a new table or relation. Suppose we have to retrieve
student name, address and class for the given ID. What a relational algebra will
do in this case is, it filters the name, address and class from the STUDENT
table for the input ID. In mathematical terms, relational algebra has produced a
subset of STUDENT table for the given ID.
Relational algebra will have operators to indicate the operations. This algebra
can be applied on single relation – called unary or can be applied on two tables
– called binary. While applying the operations on the relation, the resulting
subset of relation is also known as new relation. There can be multiple steps
involved in some of the operations.

Relational Algebra in DBMS has 6 fundamental operations.

Select (σ)

Project (∏)

Rename (ρ)

Cartesian product (X)

Union (U)

Set-difference (-)

Select (σ)
Select (σ) – This is a unary relational operation. This operation pulls the
horizontal subset (subset of rows) of the relation that satisfies the conditions.
This can use operators like <, >, <=, >=, = and != to filter the data from the
relation. It can also use logical AND, OR and NOT operators to combine the
various filtering conditions. This operation can be represented as below:
σ p (r)
Where σ is the symbol for select operation, r represents the relation/table, and p
is the logical formula or the filtering conditions to get the subset.

Select (σ)
Examples
σSTD_NAME = “James” (STUDENT)
It selects the record/tuple from the STUDENT table with Student name as
‘James’
σdept_id = 20 AND salary>=10000 (EMPLOYEE) –
Selects the records from EMPLOYEE table with department ID = 20 and

employees whose salary is more than 10000.

Project (∏)
Project (∏) – This is a unary operator and is similar to select operation above. It
creates the subset of relation based on the conditions specified. Here, it selects
only selected columns/attributes from the relation- vertical subset of relation.
The select operation above creates subset of relation but for all the attributes in
the relation. It is denoted as below:
∏a1, a2, a3 (r)
Where ∏ is the operator for projection, r is the relation and a1, a2, a3 are the
attributes of the relations which will be shown in the resultant subset.

Project (∏)
Examples
∏std_name, address, course (STUDENT)
This will select all the records from STUDENT table but only selected columns –
std_name, address and course. Suppose we have to select only these 3
columns for particular student then we have to combine both project and select
operations.
∏STD_ID, address, course (σ STD_NAME = “James”(STUDENT))
This selects the record for ‘James’ and displays only std_ID, address and his
course columns. Here we can see two unary operators are combined, and it has
two operations performing. First it selects the tuple from STUDENT table for
‘James’. The resultant subset of STUDENT is also considered as intermediary
relation. But it is temporary and exists till the end of this operation. It then filters
the 3 columns from this temporary relation.
Rename (ρ)
Rename (ρ) – This is a unary operator used to rename the tables and columns
of a relation. When we perform self join operation, we have to differentiate two
same tables. In such case rename operator on tables comes into picture. When
we join two or more tables and if those tables have same column names, then it
is always better to rename the columns to differentiate them. This occurs when
we perform Cartesian product operation.
ρ R(E)
Where ρ is the rename operator, E is the existing relation name, and R is the
new relation name.

Rename (ρ)
Examples
ρ STUDENT (STD_TABLE)
Renames STD_TABLE table to STUDENT
ρ STD_ID, STD_NAME, STD_ADDRESS(STUDENT)
It will rename the columns in the order the names appear in the table

Cartesian product (X): – This is a binary operator. It combines the tuples of two
relations into one relation.
RXS
Where R and S are two relations and X is the operator. If relation R has m
tuples and relation S has n tuples, then the resultant relation will have mn
tuples. For example, if we perform cartesian product on EMPLOYEE (5 tuples)
and DEPT relations (3 tuples), then we will have new tuple with 15 tuples.

Example
EMPLOYEE X DEPT
This operator will simply create a pair between the tuples of each table. i.e.;
each employee in the EMPLOYEE table will be mapped with each department
in DEPT table.

Example

Union (U)
Union (U) – It is a binary operator, which combines the tuples of two relations. It
is denoted by
RUS
Where R and S are the relations and U is the operator.

Union (U)
Example
DESIGN_EMPLOYEE U TESTING_EMPLOYEE
Where DESIGN_EMPLOYEE and TESTING_EMPLOYEE are two relations.

Union (U)
Example

How Union is different from Cartesian product

Cartesian product combines the attributes of two relations into one relation
whereas Union combines the tuples of two relations into one relation.

In Union, both relations should have same number of columns. Suppose we
have to list the employees who are working for design and testing
department. Then we will do the union on employee table. Since it is union on
same table it has same number of attributes. Cartesian product does not
concentrate on number of attribute or rows. It blindly combines the attributes.

In Union, both relations should have same types of attributes in same order.
In the above example, since union is on employee relation, it has same type
of attribute in the same order.

Set-difference (-)
Set-difference (-) – This is a binary operator. This operator creates a new

relation with tuples that are in one relation but not in other relation. It is denoted
by ‘-‘symbol.
R–S
Where R and S are the relations.

Set-difference (-)
DESIGN_EMPLOYEE −TESTING_EMPLOYEE
In case we want to retrieve the employees who are working in Design

department but not in testing.

Relational Calculus
Relational Calculus in database management system (DBMS) is all about
"What you want ?". Relational calculus does not tell us how to get the results
from the Database, but it just cares about what we want.

Tuple Relational Calculus (TRC)
Tuple Relational Calculus in DBMS uses a tuple variable (t) that goes to each
row of the table and checks if the predicate is true or false for the given row.
Depending on the given predicate condition, it returns the row or part of the row.
The Tuple Relational Calculus expression Syntax

{t \| P(t)}
Where t is the tuple variable that runs over every Row, and P(t) is the predicate
logic expression or condition.

Tuple Relational Calculus (TRC)
Example
Question 1: Write a TRC query to get all the data of customers whose zip code is 12345.
Solution:
{t \| Customer(t) ∧ t[Zipcode] = 12345 }

Workflow of query - The tuple variable "t" will go through every tuple of the
Customer table. Each row will check whether the Cust_Zipcode is 12345 or not
and only return those rows that satisfies the Predicate expression condition.
The TRC expression can be read as "Return all the tuple which belongs to the
Customer Table and whose Zipcode is equal to 12345."
Result:

Domain Relational Calculus (DRC)
Domain Relational Calculus uses domain Variables to get the column values
required from the database based on the predicate expression or condition.
Example:
{<x1,x2,x3,x4...> \| P(x1,x2,x3,x4...)}
where,
<x1,x2,x3,x4...> are domain variables used to get the column values required,
and P(x1,x2,x3...) is predicate expression or condition.

DRC Example
Question 2: Write a DRC query to get the data of all customers with Zip code
12345.
Solution: {<x1,x2,x3> \| <x1,x2> ∈ Customer ∧ x3 = 12345 }

Workflow of Query: In the above query x1,x2,x3 (ordered) refers to the attribute
or column which we need in the result, and the predicate condition is that the
first two domain variables x1 and x2 should be present while matching the
condition for each row and the third domain variable x3 should be equal to
12345.
Result:

SQL3
SQL3 was accepted as the new standard for SQL in 1999, after more than 7
years of debate. Basically, SQL3 includes data definition and management
techniques from Object-Oriented dbms, OO-dbms, while maintaining the
relational dbms platform. Based on this merger of concepts and techniques,
DBMSs that support SQL3 are called Object-Relational or or-dbms'.
The most central data modelling notions included in SQL3 are:


Classification hierarchies

Embedded structures that support composite attributes

Collection data-types (sets, lists/arrays, and multi-sets) that can be used for
multi-valued attribute types

Large OBject types, LOBs, within the DB, as opposed to requiring external
storage, and

User defined data-types and functions (UDT/UDF) that can be used to define
complex structures and derived attribute value calculations, among many
other function extensions. Prepared by : Dr. Ahmad Jamal
DDL and DML constructs

Open source DBMS
An open source database allows users to create a system based on their
unique requirements and business needs. It is free and can also be shared. The
source code can be modified to match any user preference.
Enlisted below are the most popular Enterprise-Grade Open Source DBMS that
are used worldwide.

Altibase

MySQL

PostgreSQL

Maria DB

MongoDB

Cassandra

SQLite

Cubrid

SQL Server
Commercial Database
A commercial database is one created for commercial purposes only and it's
available at a price. Unlike open source databases, commercial databases can
only be viewed or modified by authorized users.
Oracle, DB2, Splunk etc.

Relational Database Design
Relational database design (RDD) models information and data into a set of
tables with rows and columns. Each row of a relation/table represents a record,
and each column represents an attribute of data. The Structured Query
Language (SQL) is used to manipulate relational databases.
The design of a relational database is composed of four stages, where the data
are modeled into a set of related tables. The stages are:

Define relations/attributes

Define primary keys

Define relationships

Normalization

Key
Keys in DBMS are used to uniquely identify any record or row of data from the
table. It is also used to establish and identify relationships between tables.
Keys in DBMS are of different types eg: Super key, Candidate key, Primary Key,
Foreign key, etc.

Primary key
It is the first key used to identify one and only one instance of an entity uniquely.
An entity can contain multiple keys, as we saw in the PERSON table. The key
which is most suitable from those lists becomes a primary key.

Candidate key
A candidate key is an attribute or set of attributes that can uniquely identify a
tuple.
Except for the primary key, the remaining attributes are considered a candidate
key. The candidate keys are as strong as the primary key.

Super Key
Super key is an attribute set that can uniquely identify a tuple. A super key is a
superset of a candidate key.
In the above EMPLOYEE table, for(EMPLOEE_ID, EMPLOYEE_NAME), the name of two

employees can be the same, but their EMPLYEE_ID can't be the same. Hence, this
combination can also be a key.
Foreign key

Foreign keys are the column of the table used to point to the primary key of another table.

Every employee works in a specific department in a company, and employee and department
are two different entities. So we can't store the department's information in the employee
table. That's why we link these two tables through the primary key of one table.

We add the primary key of the DEPARTMENT table, Department_Id, as a new attribute in the
EMPLOYEE table.

In the EMPLOYEE table, Department_Id is the foreign key, and both the tables are related.

Domain
A domain is a unique set of values that can be assigned to an attribute in a
database. For example, a domain of strings can accept only string values.
The data type defined for a column in a database is called a database domain.
This data type can either be a built-in type (such as an integer or a string) or a
custom type that defines data constraints.
Example
Let's say that we have a table, that stores student records. Now, suppose there
is a column, or attribute to store the students' contact numbers. The column for
contact numbers should only expect numeric values, usually of the type INT,
which is the domain for that attribute.

Domain Integrity Constraints
Domain Constraints are user-defined columns that assist the user in entering
values that are appropriate for the data type. If it detects an incorrect input, it
informs the user that the column is not properly filled. Or, to put it another way,
it's an attribute that describes all of the potential values for the attribute, such as
integer, character, date, time, string, and so on.
Types of Domain Constraints

1: NOT NULL
2: Check

Data Dependency
Data Dependency is the relationship between the data stored in the tables.
It has the following types:

Functional Dependency

Fully-Functional Dependency

Transitive Dependency

Multivalued Dependency

Partial Dependency

If the information stored in a table can uniquely determine another information in

the same table, then it is called Functional Dependency. Consider it as an
association between two attributes of the same relation.
If P functionally determines Q, then

P -> Q

Example
In the above table, EmpName is functionally dependent on EmpID because

EmpName can take only one value for the given value of EmpID:
EmpID -> EmpName

Fully-functionally Dependency
An attribute is fully functional dependent on another attribute, if it is Functionally
Dependent on that attribute and not on any of its proper subset.
For example, an attribute Q is fully functional dependent on another attribute P,

if it is Functionally Dependent on P and not on any of the proper subset of P.
{EmpID, ProjectID} Â -> (Days)

Transitive Dependency
When an indirect relationship causes functional dependency it is called

Transitive Dependency.
If P -> Q and Q -> R is true, then P-> R is a transitive dependency.

Multivalued Dependency
When existence of one or more rows in a table implies one or more other rows
in the same table, then the Multi-valued dependencies occur.
If a table has attributes P, Q and R, then Q and R are multi-valued facts of P.
It is represented by double arrow − →→
Example:
P->->Q
Q→→R
In the above case, Multivalued Dependency exists only if Q and R are

independent attributes. Prepared by : Dr. Ahmad Jamal
Multivalued Dependency occurs when two attributes in a table are independent of
each other but are dependent on the third attribute.
Partial Dependency
Partial Dependency occurs when a nonprime attribute is functionally dependent
on part of a candidate key.
In the above table, we have partial dependency; let us see how:
The prime key attributes are StudentID and ProjectNo.

As stated, the non-prime attributes i.e. StudentName and ProjectName should be
functionally dependent on part of a candidate key, to be Partial Dependent.
The StudentName can be determined by StudentID that makes the relation Partial
Dependent.
The ProjectName can be determined by ProjectID, which that the relation Partial
Dependent. Prepared by : Dr. Ahmad Jamal
Normalization
Normalization is the process of organizing the data and the attributes of a
database. It is performed to reduce the data redundancy in a database and to
ensure that data is stored logically. Data redundancy in DBMS means having
the same data but at multiple places. It is necessary to remove data redundancy
because it causes anomalies in a database which makes it very hard for a
database administrator to maintain it.
Normal Forms

First Normal Form (1NF)
A relation is in 1NF if every attribute is a single-valued attribute or it does not

contain any multi-valued or composite attribute, i.e., every attribute is an atomic
attribute. If there is a composite or multi-valued attribute, it violates the 1NF. To
solve this, we can create a new row for each of the values of the multi-valued
attribute to convert the table into the 1NF.

Here, the Employee Phone Number is a multi-valued attribute. So, this relation is not in 1NF.
To convert this table into 1NF, we make new rows with each Employee Phone Number as a new row as
In next slide:

1NF

Second Normal Form (2NF)
The normalization of 1NF relations to 2NF involves the elimination of partial
dependencies. A partial dependency in DBMS exists when any non-prime
attributes, i.e., an attribute not a part of the candidate key, is not fully
functionally dependent on one of the candidate keys.
For a relational table to be in second normal form, it must satisfy the following
rules:

The table must be in first normal form.

It must not contain any partial dependency.

If a partial dependency exists, we can divide the table to remove the partially
dependent attributes and move them to some other table where they fit in
well.
Table Name: EmployeeProjectDetail
In the above table, the prime attributes of the table are Employee Code and Project ID. We
have partial dependencies in this table because Employee Name can be determined by
Employee Code and Project Name can be determined by Project ID. Thus, the above
relational table violates the rule of 2NF.

To remove partial dependencies from this table and normalize it into second
normal form, we can decompose the EmployeeProjectDetail table into the
following three tables:

Third Normal Form (3NF)
The normalization of 2NF relations to 3NF involves the elimination of transitive
dependencies in DBMS.
For a relational table to be in third normal form, it must satisfy the following
rules:

The table must be in the second normal form.

No non-prime attribute is transitively dependent on the primary key.

For each functional dependency X -> Z at least one of the following conditions
hold:
1: X is a super key of the table.
2: Z is a prime attribute of the table.
If a transitive dependency exists, we can divide the table to remove the

transitively dependent attributes and place them to a new table along with a
copy of the determinant.
If a transitive dependency exists, we can divide the table to remove the
transitively dependent attributes and place them to a new table along with a
copy of the determinant.
The above table is not in 3NF because it has Employee Code -> Employee City transitive
dependency because:
Employee Code -> Employee Zipcode
Employee Zipcode -> Employee City
Also, Employee Zipcode is not a super key and Employee City is not a prime attribute.
To remove transitive dependency from this table and normalize it into the third normal form,
we can decompose the <EmployeeDetail> table into the following two tables:
Thus, we’ve converted the <EmployeeDetail>
table into 3NF by decomposing it into
<EmployeeDetail> and <EmployeeLocation>
tables as they are in 2NF and they don’t have
any transitive dependency.
Boyce-Codd Normal Form (BCNF)
Boyce-Codd Normal Form(BCNF) is an advanced version of 3NF as it contains
additional constraints compared to 3NF.
For a relational table to be in Boyce-Codd normal form, it must satisfy the

following rules:
The table must be in the third normal form.

For every non-trivial functional dependency X -> Y, X is the superkey of the
table. That means X cannot be a non-prime attribute if Y is a prime attribute.
A superkey is a set of one or more attributes that can uniquely identify a row in
a database table.

The above table satisfies all the normal forms till 3NF, but it violates the rules of BCNF
because the candidate key of the above table is {Employee Code, Project ID}. For the non-
trivial functional dependency, Project Leader -> Project ID, Project ID is a prime attribute
but Project Leader is a non-prime attribute. This is not allowed in BCNF.
To convert the given table into BCNF, we decompose it into three tables:
Thus, we have converted the
<EmployeeProjectLead> table into BCNF by
decomposing it into <EmployeeProject> and
<ProjectLead> tables. Prepared by : Dr. Ahmad Jamal
Armstrong’s Axioms
William Armstrong in 1974 suggested a few rules related to functional
dependency. They are called RAT rules.
1. Reflexivity: If A is a set of attributes and B is a subset of A, then the

functional dependency A → B holds true.
For example, { Employee_Id, Name } → Name
2. Augmentation: If a functional dependency A → B holds true, then appending

any number of the attribute to both sides of dependency doesn't affect the
dependency. It remains true.
For example, X → Y holds true then, ZX → ZY also holds true.

For example, if { Employee_Id, Name } → { Name } holds true then,
{ Employee_Id, Name, Age } → { Name, Age } Prepared by : Dr. Ahmad Jamal
3. Transitivity: If two functional dependencies X → Y and Y → Z hold true, then
X → Z also holds true by the rule of Transitivity.
For example, if { Employee_Id } → { Name } holds true

and { Name } → { Department } holds true,
then { Employee_Id } → { Department } also holds true.

Decomposition
Decomposition in Database Management System is to break a relation into
multiple relations to bring it into an appropriate normal form. It helps to remove
redundancy, inconsistencies, and anomalies from a database. The
decomposition of a relation R in a relational schema is the process of replacing
the original relation R with two or more relations in a relational schema. Each of
these relations contains a subset of the attributes of R and together they include
all attributes of R.
If a relation is not properly decomposed, then it may lead to other problems like
information loss, etc. There are two types of decomposition as shown below:

Rules for Decomposition
Whenever we decompose a relation, there are certain properties that must be

satisfied to ensure no information is lost while decomposing the relations. These
properties are:

1: Lossless Join Decomposition

2: Dependency Preserving

1: Lossless Join Decomposition
A lossless Join decomposition ensures two things:

No information is lost while decomposing from the original relation.


If we join back the sub decomposed relations, the same relation that was
decomposed is obtained.


We can follow certain rules to ensure that the decomposition is a lossless join
decomposition

Let’s say we have a relation R and we decomposed it into R1 and R2, then
the rules are:

The union of attributes of both the sub relations R1 and R2 must contain all
the attributes of original relation R.

R1 ∪ R2 = R

The intersection of attributes of both the sub relations R1 and R2 must not be
null, i.e., there should be some attributes that are present in both R1 and R2.

R1 ∩ R2 ≠ ∅

The intersection of attributes of both the sub relations R1 and R2 must be the
superkey of R1 or R2, or both R1 and R2.

R1 ∩ R2 = Super key of R1 or R2

Lossless vs Lossy Decomposition
In a lossy decomposition, one or more of these conditions would fail and we will
not be able to recover Complete information as present in the original relation.

2: Dependency Preserving
The second property of lossless decomposition is dependency preservation

which says that after decomposing a relation R into R1 and R2, all
dependencies of the original relation R must be present either in R1 or R2 or
they must be derivable using the combination of functional dependencies
present in R1 and R2.

What we learn till now
Conclusion

Decomposition is the process of breaking an original relation into multiple
sub relations.

Decomposition helps to remove anomalies redundancy, and other problems
in a DBMS.

Decomposition can be lossy or lossless.

An ideal decomposition should be lossless join decomposition and
dependency preserving.

Query Processing and Optimization
Query Processing is the activity performed in extracting data from the database.
The activities involved in parsing, validating, execution and optimizing a query is called
Query Processing.
Steps:
The steps involved in query processing and optimization are as follows:

A sequence of primitive operations that can be used to evaluate a query is called
query execution plan or query evaluation plan.

The query execution engine takes a query evaluation plan, executes that plan and
produces the desired output. The different execution plans for a given query can
have different costs based on the number of disks. It is the responsibility of the
system to construct a query evaluation plan which minimizes the cost of query
evaluation. This task is called query optimization.


A query optimization is expressed in high level query language and is
scanned, parsed and validated. The scanner identifies the SQL keywords,
attributes and relation names in the text of the query.

The parser checks the syntax which is used to determine if the query is
formulated according to syntax rules of the query language.

Finally, the query is evaluated by checking that all attributes and relation
names are valid and semantically meaningful in the schema of a particular
database.

An internal representation of the query is then created which is either a tree
or a graph known as query tree or query graph.

If the query written SQL is translated into relational algebra, then its internal
representation is a query tree. Otherwise, if TRC or DRC its internal
representation is a query graph. A graph has many possible execution
strategies and the process of choosing a strategy with minimum cost is
called query optimization.
The query processing and optimization in the DBMS are
explained in the form of a diagram below −

Evaluation of relational algebra expression
SQL queries are decomposed into query blocks. One query block contains a
single SELECT-FROM-WHERE expression, as well as GROUP BY and
HAVING clause (if any). Nested queries are split into separate query blocks.
Select lastname, firstname from employee where salary>(select max(salary)

from employee where deptname =CSE ;
C=(select max(salary) from employee where deptname=CSE); // inner block
Select lastname, firstname from employee where salary>c; //outer block
Where C represents the result returned from the inner block.

The relation algebra for the inner block is
Ģmax(salary) (σdname=CSE(employee))
The relation algebra for the outer blocks is
Πlastname, firstname(σsalary>c(employee))
The query optimizer would then choose an execution or evaluation plan for
each block. Prepared by : Dr. Ahmad Jamal
Materialized evaluation
Evaluate one operation at a time. Evaluate the expression in a bottom-up

manner and stores intermediate results to temporary files.
Store the result of A ⋈ B in a temporary file.

Store the result of C ⋈ D in a temporary file.
Finally, join the results stored in temporary files.
The overall cost=sum of costs of individual operations + cost of writing
intermediate results to disk, cost of writing results to results to temporary files
and reading them back is quite high.
Pipelined evaluation
Evaluate several operations simultaneously. Result of one operation is passed

to the next operation. Evaluate the expression in a bottom-up manner and
don’t store intermediate results to temporary files.
Don’t store the result of A ⋈ B in a temporary file. Instead the result is passed
directly for projection with C and so on.
Query
Equivalence
Query equivalence deals with queries that are equal.
This means the output of the two queries you consider will be the same.
For example,
Consider a table called employee with fields like name, age, salary and
department.
Query 1:
SELECT * FROM (SELECT * FROM EMPLOYEE WHERE AGE>30)
WHERE SALARY>50000;
Query 2:
SELECT * FROM (SELECT * FROM EMPLOYEE WHERE SALARY>50000)
WHERE AGE>30;
These two queries output will be the records of the employees whose salary is
greater than 50000 and age is greater than 30.
“The equivalence rule says that expressions of two forms are the same or
equivalent because both expressions produce the same outputs on any legal
database instance. It means that we can possibly replace the expression of
the first form with that of the second form and replace the expression of the
second form with an expression of the first form. Thus, the optimizer of the
query-evaluation plan uses such an equivalence rule or method for
transforming expressions into the logically equivalent one.”

Joins
Join is a combination of a Cartesian product followed by a selection process. A
Join operation pairs two tuples from different relations, if and only if a given
join condition is satisfied.
Join in DBMS is a binary operation which allows you to combine join product
and selection in one single statement. The goal of creating a join condition is
that it helps you to combine the data from two or more DBMS tables.
Types of Join
There are mainly two types of joins in DBMS:

Inner Joins: Theta, Natural, EQUI

Outer Join: Left, Right, Full

Inner Join
Inner Join is used to return rows from both tables which satisfy the given
condition. It is the most widely used join operation and can be considered as a
default join-type.
Inner Join further divided into three subtypes:

Theta join

Natural join

EQUI join

Theta Join
Theta Join allows you to merge two tables based on the condition represented
by theta. Theta joins work for all comparison operators. It is denoted by
symbol θ. The general case of JOIN operation is called a Theta join.
Syntax:
A ⋈θ B
Theta join can use any conditions in the selection criteria.
Condition can be <,> and =.

STUDENT ⋈Student.Std = Subject.Class SUBJECT

Equijoin
When Theta join uses only equality comparison operator, it is said to be

equijoin.

Natural Join
Natural join does not use any comparison operator. It does not concatenate
the way a Cartesian product does. We can perform a Natural Join only if there
is at least one common attribute that exists between two relations. In addition,
the attributes must have the same name and domain.
Natural join acts on those matching attributes where the values of attributes in
both the relations are same.

Outer Joins
Left Outer Join

(R Left Outer Join S)
All the tuples from the Left relation, R, are included in the resulting relation. If
there are tuples in R without any matching tuple in the Right relation S, then
the S-attributes of the resulting relation are made NULL.

Right Outer Join
All the tuples from the Right relation, S, are included in the resulting relation. If
there are tuples in S without any matching tuple in R, then the R-attributes of
resulting relation are made NULL.

Full Outer Join
All the tuples from both participating relations are included in the resulting
relation. If there are no matching tuples for both relations, their respective
unmatched attributes are made NULL.

Unit 2

Uploaded by

Copyright:

Available Formats

Unit 2

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Unit 2

Uploaded by

Copyright:

Available Formats

Relational Query Languages

Prepared by : Dr. Ahmad Jamal

Prepared by : Dr. Ahmad Jamal

Prepared by : Dr. Ahmad Jamal

Prepared by : Dr. Ahmad Jamal

Prepared by : Dr. Ahmad Jamal

Prepared by : Dr. Ahmad Jamal

σSTD_NAME = “James” (STUDENT)

σdept_id = 20 AND salary>=10000 (EMPLOYEE) –

Selects the records from EMPLOYEE table with department ID = 20 and

Prepared by : Dr. Ahmad Jamal

∏a1, a2, a3 (r)

Prepared by : Dr. Ahmad Jamal

∏STD_ID, address, course (σ STD_NAME = “James”(STUDENT))

Prepared by : Dr. Ahmad Jamal

Renames STD_TABLE table to STUDENT

ρ STD_ID, STD_NAME, STD_ADDRESS(STUDENT)

Prepared by : Dr. Ahmad Jamal

Prepared by : Dr. Ahmad Jamal

Prepared by : Dr. Ahmad Jamal

Prepared by : Dr. Ahmad Jamal

Where R and S are the relations and U is the operator.

Prepared by : Dr. Ahmad Jamal

Where DESIGN_EMPLOYEE and TESTING_EMPLOYEE are two relations.

Prepared by : Dr. Ahmad Jamal

Prepared by : Dr. Ahmad Jamal

Prepared by : Dr. Ahmad Jamal

Set-difference (-) – This is a binary operator. This operator creates a new

Where R and S are the relations.

Prepared by : Dr. Ahmad Jamal

In case we want to retrieve the employees who are working in Design

Prepared by : Dr. Ahmad Jamal

Prepared by : Dr. Ahmad Jamal

The Tuple Relational Calculus expression Syntax

Prepared by : Dr. Ahmad Jamal

Prepared by : Dr. Ahmad Jamal

Prepared by : Dr. Ahmad Jamal

Prepared by : Dr. Ahmad Jamal

Solution: {<x1,x2,x3> \| <x1,x2> ∈ Customer ∧ x3 = 12345 }

Prepared by : Dr. Ahmad Jamal

Prepared by : Dr. Ahmad Jamal

The most central data modelling notions included in SQL3 are:

Prepared by : Dr. Ahmad Jamal

Oracle, DB2, Splunk etc.

Prepared by : Dr. Ahmad Jamal

Prepared by : Dr. Ahmad Jamal

Prepared by : Dr. Ahmad Jamal

Prepared by : Dr. Ahmad Jamal

Prepared by : Dr. Ahmad Jamal

In the above EMPLOYEE table, for(EMPLOEE_ID, EMPLOYEE_NAME), the name of two

Prepared by : Dr. Ahmad Jamal

Prepared by : Dr. Ahmad Jamal

Types of Domain Constraints

Prepared by : Dr. Ahmad Jamal

It has the following types:

Prepared by : Dr. Ahmad Jamal

If the information stored in a table can uniquely determine another information in

If P functionally determines Q, then

Prepared by : Dr. Ahmad Jamal