DBMS Module-IV
DBMS Module-IV
DBMS Module-IV
Module -4
Functional Dependencies & Normalization for Relational
Databases
Functional Dependence:
- The attribute B is functionally dependent on A if A determines B. “A determines B”
indicates that knowing the value of attribute A means that we can look up (determine)
the value of attribute B. In this case, attribute A is the called the determinant.
- For example, knowing the PROD_CODE in the PRODUCT table in Figure 1 means
we can look up the product’s description, product price and so on. The notation we use
for “A determines B” is:
AB if A determines B
A B, C, D if A determines B, C & D
PROD_CODE PROD_DESCRIPT
- Normalization
- Normalization is a technique used to design table structures in which data redundancies
are minimized/eliminated.
- Normalization works through a series of stages called normal forms. The first three stages
are described as first normal form (1NF), second normal form (2NF), and third normal
form (3NF).
- From a structural point of view, 2NF is better than 1NF, and 3NF is better than 2NF.
- For most business database design purposes, 3NF is as high as we need to go in the
normalization process.
- You should not assume that the highest level of normalization is always the most
desirable. Generally, the higher the normal form, the more joins are required to produce a
specified output and the more slowly the database responds to end user demands.
Therefore, you will occasionally be expected to denormalize some table structures to
avoid performance degradation.
- Denormalization is the reverse process of normalization and it produces a lower normal
form; that is, 3NF can be denormalized to 2NF and 2NF can be denormalized to 1NF.
- The table in Figure 2 contains information about projects and employees working on
these projects. This table does not conform to the Relational Database Model we learned
in Chapter 5 for the following reasons:
3. All attributes are dependent on the primary key. Note that the primary key
determines any other attribute in the table.
- Note that all tables conforming to the Relational Model we learned in Chapter 5 are in
1NF by default.
- Figure 4 below displays the conversion of the table in Figure 2 to 1NF by satisfying the above
three conditions.
- Note that the table in Figure 4 satisfies the three conditions. The first condition is satisfied
because the table has a composite primary key consisting of the key attributes PROJ_NUM
and EMP_NUM. The second condition is satisfied because the table no longer contains
repeating groups. The third condition is satisfied because any attribute is dependent on the
primary key. For example, EMP_NAME is dependent on the primary key PROJ_NUM,
EMP_NUM because if we know PROJ_NUM and EMP_NUM (15 and 103 for example), we
can determine EMP_NAME (J. Arbough).
- Dependencies between attributes for the table in Figure 4 can be identified with the help of
the dependency diagram as shown in Figure 5 below:
3. The arrows below the dependency diagram indicate less desirable dependencies:
- A table is said to be in second normal form (2NF) if it satisfies the following conditions:
1. It is in 1NF.
2. It includes no partial dependencies; that is, no attribute is dependent on a portion of the
primary key.
- Let us convert the table in Figure 4 from 1NF to 2NF. To do that, follow these steps:
1. Write each key attribute (recall from Chapter 2 that a key attribute is an attribute
that is part of the key) on a separate line, then write the primary key on the last
line:
PROJ_NUM
EMP_NUM
PROJ_NUM EMP_NUM
The attributes on each line will become keys in a new table. In other words, the
original table is now split into three tables. We’ll call these tables PROJECT,
EMPLOYEE and ASSIGN respectively.
2. The attribute(s) on each line will form the primary key for the new tables. The
rest of the attributes for each table can be determined from the dependency
diagram:
- Note that the new tables are in 2NF because each table is in 1NF and none of the tables
contain partial dependencies. The dependency diagram for the new tables is shown in Figure
6:
- Note that the conversion to 2NF did not eliminate the transitive dependency (the conversion
to 3NF will eliminate it). This dependency causes data redundancy if multiple employees
have the same JOB_CLASS and it will therefore create the data anomalies we learned in
Chapter 1.
- Because a partial dependency applies only to tables with composite primary keys, a table
whose primary key consists of only a single attribute must automatically be in 2NF if it is in
1NF.
- To eliminate the transitive dependency, the table must be converted to 3NF. A table is said to
be in 3NF if it satisfies the following two conditions:
1. It it is in 2NF.
2. It contains no transitive dependencies.
- Note that in Figure 6, the EMPLOYEE table is the one that contains transitive dependency:
- To eliminate the transitive dependency, simply move the attributes causing data redundancy
(namely JOB_CLASS and CHG_HOUR) into a new table called JOB. Then in the
EMPLOYEE table, define a foreign key that will be used to link it to the JOB table. In this
case, the foreign key will be JOB_CLASS. The result of the conversion to 3NF will be the
following tables:
- Note that the conversion to 3NF has eliminated the original EMPLOYEE table’s transitive
dependency; the tables are now said to be in third normal form (3NF).
A B C D
A, B C, D
CB
- Note that the table structure in Figure 7 has no partial dependencies nor does it contain
transitive dependencies (Note that C B indicates that a nonkey attribute determines a
key attribute and this dependency is neither partial nor transitive).
- Therefore, the table structure in Figure 7 is in 3NF but not in BCNF because of the
dependency C B.
- To convert the table structure in Figure 7 to BCNF, follow the following procedure:
A C D
C B
- The table structure above is in 3NF but not BCNF because CLASS_CODE is a
determinant but not a superkey key. Why? To convert this table structure, we apply the
above procedure by creating two tables. The first table will contain STU_ID,
CLASS_CODE and ENROLL_GRADE (with STU_ID and CLASS_CODE as the primary
key). The second table will contain the attributes CLASS_CODE and STAFF_ID with
CLASS_CODE as the primary key. The result of the conversion to BCNF is shown in
Figure 10 below.
CLASS_CODE STAFF_ID