Normalization
Normalization is the process of organizing data in a database to reduce
redundancy and dependency and to improve data consistency. Primary keys play a
central role in this: they ensure that every row in a table is uniquely
identified, so that rows cannot be confused with one another or lost. In
database design, there are several normal forms, each defined in terms of the
keys of a table and the dependencies on them.
A table is in first normal form (1NF) if each column contains only atomic
values and each row is uniquely identified. For example, consider a table that
lists customers and their phone numbers:
ID  Name     Phone Numbers
1   John     555-1234, 555-5678
2   Jane     555-9876
3   Michael  555-5555
This violates 1NF because the Phone Numbers column contains repeating
groups.
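To normalize this table to 1NF, we can split the Phone Numbers column into
separate rows, one phone number per row:
ID  Name     Phone Number
1   John     555-1234
1   John     555-5678
2   Jane     555-9876
3   Michael  555-5555
Because the ID now repeats, each row is identified by the ID together with the
phone number, or by a separate primary key column added for that purpose.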
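
The split into atomic rows can be sketched in a few lines of Python. This is
only an illustration: the function name to_1nf is made up for this example,
and it assumes that John's original row held both numbers in one
comma-separated field, which is inferred from the normalized rows.

```python
def to_1nf(rows):
    """Split a comma-separated phone list into one row per phone number."""
    atomic_rows = []
    for customer_id, name, phones in rows:
        for phone in phones.split(","):
            atomic_rows.append((customer_id, name, phone.strip()))
    return atomic_rows


# Assumption: row 1 stored both of John's numbers in a single field.
unnormalized = [
    (1, "John", "555-1234, 555-5678"),
    (2, "Jane", "555-9876"),
    (3, "Michael", "555-5555"),
]

normalized = to_1nf(unnormalized)
for row in normalized:
    print(row)
```

After the split, the pair (ID, phone number) is unique for every row, so it
can serve as a composite key.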
Suppose a single table stores all the data related to college students: each
student's name and ID together with their branch name and branch code. Storing
data in this way leads to the following problems:
1. If you want to “insert” the name and ID of a new student, you cannot do so
until you have their branch name and branch code. This is called an
Insertion Anomaly.
2. If you want to “delete” a student, the branch name and code stored in that
row are deleted as well. If no other row holds them, the branch name and
code cannot be recovered. This is called a Deletion Anomaly.
3. If you want to “update” the branch name stored in John's row from CS to
Civil, the same change must also be made in Brat's row, because the branch
name is stored once per student. A single fact therefore has to be updated
in multiple places. This is called an Update Anomaly.
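
The three anomalies can be reproduced with a small in-memory SQLite table. The
sketch below is illustrative only: the column names, the branch codes B01 and
B02, the EE branch, and the student Mira are hypothetical, since the original
table is not reproduced here. Only John, Brat, and the CS and Civil branch
names come from the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student ("
    " student_id INTEGER PRIMARY KEY, name TEXT NOT NULL,"
    " branch_name TEXT NOT NULL, branch_code TEXT NOT NULL)"
)
# Hypothetical sample data: codes B01/B02 and the student Mira are invented.
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?)",
    [(1, "John", "CS", "B01"), (2, "Brat", "CS", "B01"), (3, "Mira", "EE", "B02")],
)

# Insertion anomaly: a student without branch data cannot be stored.
try:
    conn.execute("INSERT INTO student (student_id, name) VALUES (4, 'New')")
    insert_failed = False
except sqlite3.IntegrityError:
    insert_failed = True

# Update anomaly: renaming one branch has to touch every student row in it.
cursor = conn.execute(
    "UPDATE student SET branch_name = 'Civil' WHERE branch_code = 'B01'"
)
rows_touched = cursor.rowcount

# Deletion anomaly: removing the only EE student also removes branch B02.
conn.execute("DELETE FROM student WHERE name = 'Mira'")
remaining_codes = {
    code for (code,) in conn.execute("SELECT DISTINCT branch_code FROM student")
}

print(insert_failed, rows_touched, remaining_codes)
```

Storing branches in their own table, referenced from the student table by
branch code, would avoid all three effects.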
For example, if a student has multiple branch names, they must be converted
to a single-valued attribute, so that each row holds exactly one branch name.
Both of the tables above are in 1NF. The Functional Dependencies are
Customer ID -> Product Name and Product Name -> Price, neither of which is a
Partial Dependency. In addition, the price of the Fan is no longer stored
multiple times, so the Data Redundancy is removed.
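
Whether a functional dependency holds in a given set of rows can be checked
mechanically. The following is a minimal sketch, not a full dependency
analysis: the helper holds and all sample rows are hypothetical, apart from
the Fan product mentioned in the text.

```python
def holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # The same left-hand side must always map to the same right-hand side.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample data for the two decomposed tables.
orders = [
    {"customer_id": 1, "product_name": "Fan"},
    {"customer_id": 2, "product_name": "Fan"},
    {"customer_id": 3, "product_name": "Lamp"},
]
products = [
    {"product_name": "Fan", "price": 40},
    {"product_name": "Lamp", "price": 15},
]

print(holds(orders, ["customer_id"], ["product_name"]))
print(holds(products, ["product_name"], ["price"]))
print(holds(orders, ["product_name"], ["customer_id"]))
```

Note that a check like this can only refute a dependency for the data at
hand; whether a dependency is meant to hold in general is a design decision.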
Now, let’s see how to convert a table into the Third Normal Form (3NF).
Suppose there is a table containing the data of students as follows:
To convert this into 3NF, we decompose the relation into two tables
as below.
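
As a hedged illustration of this kind of decomposition, the sketch below
splits a table on a transitive dependency and checks that joining the parts
gives back the original rows. The columns student_id, name, zip_code, and city
and all values are hypothetical and are not the student table from the text;
the assumed dependencies are student_id -> zip_code and zip_code -> city.

```python
def project(rows, columns):
    """Project rows onto columns, dropping duplicates and keeping order."""
    seen = set()
    result = []
    for row in rows:
        key = tuple(row[c] for c in columns)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(columns, key)))
    return result


# Hypothetical table with the transitive dependency student_id -> zip_code -> city.
students = [
    {"student_id": 1, "name": "Ann", "zip_code": "Z1", "city": "Springfield"},
    {"student_id": 2, "name": "Ben", "zip_code": "Z1", "city": "Springfield"},
    {"student_id": 3, "name": "Cal", "zip_code": "Z2", "city": "Riverton"},
]

# Decompose: the city moves to a table keyed by zip_code.
student_table = project(students, ["student_id", "name", "zip_code"])
zip_table = project(students, ["zip_code", "city"])

# Joining the two tables on zip_code reproduces the original rows.
city_by_zip = {row["zip_code"]: row["city"] for row in zip_table}
rejoined = [{**s, "city": city_by_zip[s["zip_code"]]} for s in student_table]

print(zip_table)
print(rejoined == students)
```

Each city is now stored once per zip code instead of once per student.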
A table is in Boyce-Codd Normal Form (BCNF) if both of the following hold:
It is in 3NF.
For every non-trivial Functional Dependency A -> B in the table, A is a
Super Key.
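
The super key condition can be tested by computing attribute closures. The
sketch below assumes the functional dependencies are given explicitly as
pairs of attribute tuples; the function names and the student, course, and
teacher schema are hypothetical and chosen only to show a relation that fails
the condition.

```python
def closure(attrs, fds):
    """Return the set of attributes determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(attributes, fds):
    """Return the non-trivial FDs whose left side is not a super key."""
    violations = []
    for lhs, rhs in fds:
        if set(rhs) <= set(lhs):
            continue  # trivial dependency, always allowed
        if closure(lhs, fds) != set(attributes):
            violations.append((lhs, rhs))
    return violations


# Hypothetical schema: each teacher teaches exactly one course.
attributes = ("student", "course", "teacher")
fds = [
    (("student", "course"), ("teacher",)),
    (("teacher",), ("course",)),
]

print(bcnf_violations(attributes, fds))
```

Here the dependency teacher -> course is reported, because teacher alone does
not determine student and so is not a super key.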
Let’s see how to identify whether a table is in BCNF using an example.
Suppose there is a table named ‘Customer Service’ which contains data about
the Product, Customer, and Seller.
We can observe here that despite Normalization up to 3NF, there is still
Data Redundancy in the table.
1. Since all the values in the columns are atomic, the Table is
in 1NF.
2. The dependency Customer Name -> Seller Name does not hold, because one
customer may purchase from more than one seller. Thus, there is no
Partial Dependency, i.e. Seller Name (a non-prime attribute) does not
depend on Customer Name (a subset of the Candidate Key). Thus, the
Table is in 2NF.
3. Functional Dependencies present in the table are: