5 normalizationDBMS
5 normalizationDBMS
5 normalizationDBMS
The Data reflects the assignment of employees to projects .An the employee can be
assigned to more than one project.
Example: Darlene M.Smithson (EMP_NUM =112) has been assigned to two project
Amber wave and star flight.
The structure of data set in the above table has the following deficiencies
1. The project number (PROJ_NUM) is intended to be a primary key or at least a
part of primary key, but it contains nulls.
2. The table entries invite data inconsistencies. example , The
JOB_CLASS_VALUE “ Elect engineer “ might be entered as “Elect .Engg”.
3. The table displays data redundancies .Those redundancies yield the following
anomalies :
❖ Update Anomalies : Modifying JOB CLASS for Employee number 105
requires many alterations , one for each EMP_NUM=105
1
DATABASE MANAGEMENT SYSTEMS
Functional Dependency
The attribute B is fully functionally dependent on the attribute A if each value of A determines
one and only one value of B.
Example : PROJ_NUM ? PROJ_NAME (read as PROJ_NUM functionally determines
PROJ_NAME)
In this case, the attribute PROJ_NUM is known as the determinant attribute and the attribute
PROJ_NAME is known as the dependent attribute.
Fully Functional Dependency: If the attribute B is functionally dependent on a composite key A but
not on any subset of that composite key, the attribute B is fully functionally dependent on A.
2
DATABASE MANAGEMENT SYSTEMS
The below table contain multivalued attributes .so the above relation is not in 1NF
Table 1:
3
DATABASE MANAGEMENT SYSTEMS
Converting to 2NF is done only when the 1NF has a composite primary key .If the 1NF has a single
attribute primary key , then the table is automatically in 2NF.The 1NF to 2NF conversion is simple
starting with the 1NF format example above figure 1NF
Step 1 : Write each key component on a separate line ;
• Write each key component on a separate line , then write the original key on the last line
• Example :
PROJ_NUM
EMP_NUM
PROJ_NUM EMP_NUM
• Each component will become the key in a new table i.e, the original table is now divided into
three tables (PROJECT,EMPLOYEE and ASSIGNMENT )
4
DATABASE MANAGEMENT SYSTEMS
The results of step 1 and step are shown in below figure .Here Most of the anomalies have been
eliminated .For example ,if we want to add ,change or delete a PROJECT record , we need to go to
only to the PROJECT table and make the change to only one record
FIGURE 2NF
NOTE : When a table’s primary key is composed of several attributes then the partial dependency
can exist .
A table whose primary key consists of only a single attribute is automatically in 2NF once it is in
1NF
THIRD NORMAL FORM :
A relation is in 3NF if and only if it is in 2NF and no transitive dependencies exist .
A transitive dependency is a functional dependency between two non-key attributes
CONVERSION TO THIRD NORMAL FORM:
The 2NF to 3NF conversion is simple starting with the 2NF format example shown in above figure
2NF
The anomalies resulted in Database organization shown in figure 2NF are easily eliminated by
completing the following three steps :
Step 1:Identify Each New Determinant :
• For every transitive dependency , write its determinant as a PK for a new Table .
• A determinant is any attribute whose value determines other values within the row .
• For example ,The figure 2NF shows only one table that contains a transitive dependency .
Therefore write the determinant for this transitive dependency as : JOB_CLASS
Step 2 : Identify the Dependent Attributes :
Identify the attributes that are dependent on each determinant identified in step 1 and identify the
dependency
In this example : JOB_CLASS → CHG_HOUR
5
DATABASE MANAGEMENT SYSTEMS
Name the table to reflect its contents and function .In this case table name is JOB
STEP 3: Remove the Dependent Attribute from Transitive Dependencies :
Eliminate all dependent attributes in the transitive relationships from each of the tables that have such
a transitive relationship .
In this example , eliminate CHG_HOUR from the EMPLOYEE table
i.e.,
EMP_NUM → EMP_NAME , JOB_CLASS
JOB_CLASS remains in the EMPLOYEE table to serve as a foreign key (FK).
After the 3NF conversion has been completed , the database contains four tables :
PROJECT (PROJ_NUM , PROJ_NAME)
EMPLOYEE (EMP_NUM , EMP_NAME ,JOB_CLASS )
JOB (JOB_CLASS ,CHG_HOUR)
ASSIGNMENT(PROJ_NUM,EMP_NUM ,ASSIGN_HOURS)
6
DATABASE MANAGEMENT SYSTEMS
The data entries in the above table are inappropriate because they duplicate existing records yet there
is no violation of either the entity integrity or referential integrity. This “Multiple duplicate records “
problem was created when the JOB_CODE attribute was added as the PK .In any case , if JOB_CODE
is to be the surrogate PK , we still must ensure the existence of unique values in the
JOB_DESCRIPTION through the use of a unique index
Higher Normal Forms :
The BOYCE-CODD NORMAL FORM(BCNF):
When a relation has more than one candidate key, anomalies may present even though the relation is
in 3NF .
A relation is said to be in BCNF if and only if every determinant in the table is a candidate key.
That is When a table contains only one candidate key, the 3NF and BCNF are equivalent .
7
DATABASE MANAGEMENT SYSTEMS
STEP 1: Modify the relation by making determinant as the component of the primary key
as illustrated below
Revised Table :
Step 2:If we examine above modified table , we will discover that revised table has a partial
functional dependency , so as a second step decompose the table to eliminate the Partial functional
dependency as shown below
9
DATABASE MANAGEMENT SYSTEMS
TO represent the M:N relationship between EMPLOYEE and PROJECT , we might think that two
1:M relationships could be used –An can be assigned to many projects , and each project can have
many employees assigned to it .
Incorrect M:N Relationship representation
10
DATABASE MANAGEMENT SYSTEMS
DENORMALIZATION :
Denormalization: Denormalization is the process of combining two or more tables into a single table
by using denormalization process. We can reduce the data in memory space or storage space. In
general, Denormalization may partition a relation into several physical records, may combine
attributes from several relations together into one physical record, or may do combination of both.
The problem with normalization is that as tables are decomposed to
conform to normalization requirements, the number of database tables expands. Therefore, in order to
generate information data must be put together from various tables. Joining a large number of tables
takes additional input output (I/O) operations and processing logic, thereby reducing the system speed
The database design process could, in some cases, introduce some small degree of redundant data in
the model. This, in effect, creates “denormalized” relations.
Common Denormalization Examples
Case Example Rationale and Controls
Redundant data Storing ZIP ,CITY attributes in • Avoid extra join operations
the CUSTOMER table when ZIP • program can validate city (drop
determines CITY down box) based on the zip code
Pre-aggregated Storing the student grade point • Avoid extra join operations
data (also average(STU_GPA) aggregate
derived data) value in the STUDENT table • Program computes the GPA every
when this can be calculated from time a grade is entered or updated
the ENROLL and COURSE
tables
• STU_GPA can be updated only via
administrative routine .
11
DATABASE MANAGEMENT SYSTEMS
In this case there is enough storage space , the designer’s choices could be narrowed down to
• Store the data in permanent denormalized table
• Create temporary denormalized table from the permanent normalized tables .The denormalized
tables exist only as long as it takes to generate the report .
12