Lecture 5 Examples

Download as pdf or txt
Download as pdf or txt
You are on page 1of 20

IE 3425 Eng’g.

Database
Lecture 5 Examples
Topics: Normalization

© Professor Zeid, Northeastern University


Dependencies

 A dependency is a constraint that applies to or defines the


relationship between attributes.
 It occurs in a database when information stored in the same
database table uniquely determines other information stored
in the same table.
 You can also describe this as a relationship where knowing
the value of one attribute (or a set of attributes) is enough to
tell you the value of another attribute (or set of attributes) in
the same table.
 The desired dependency in database design is that a table
attribute should depend on it’s PK or CPK.
Types of Dependencies

 Three types of non-PK dependencies exist:


1. Partial dependency: exists only if the table has a CPK where part
of CPK determines another table attribute
2. Transitive dependency: a non-PK attribute determines a table
attribute
3. Non-Key dependency: a non-key attribute determines a PK or
part of a CPK
 Class discussion
 These are undesired dependencies; why? See next slide
Data Redundancy
THINK: What is Data Redundancy? Can you define it?

 Partial and transitive dependencies mean that there is data


redundancy
 Data redundancy leads to anomalies or problems in table updating,
deleting, or searching records
 Class discussion
 See example 1
Normalization
 Normalization is a formal way of checking the fields to ensure
they are in the right table or to see if we might need to
restructure restructure or create additional tables to help keep
the data accurate
 The initial idea of normalization was first proposed by E. F.
Codd in 1970 and has been a cornerstone of relational database
design since then

The goals of normalization are to:


 Be able to characterize the level of redundancy in a relational
schema
 Provide mechanisms for transforming schemas in order to
remove redundancy
Normalization
 Normalization should be part of the database design process.
However, it is difficult to separate the normalization process from
the ERD so the two techniques should be used concurrently.
 Use an entity relation diagram (ERD) to provide the big picture, or
macro view, of an organization’s data requirements and
operations. This is created through an iterative process that
involves identifying relevant entities, their attributes and their
relationships.
 Relational schema shows table attributes, PKs, FKs, connectivity.
 Neither ERD or relational schema shows if table attributes depend
on each other or not.
 Normalization procedure focuses on characteristics of specific
entities and detecting and solving the attribute dependency
problem.
Normalization

 We write a dependency as follows:


 Y→X reads “ Y determines X” (Y is the determinant of X)
 Y → X, Z reads “Y determines X and Z”
 Dependencies based on PK or CPK are good and desirable (by design)
 Dependencies based on non-PK or non-CPK are bad ones and should be
avoided. Class discussion: what about derived attributes?
Table Types
 Table type is known as table Normal Form (NF) when considering types of dependencies
 Four types of table NF exists:
 Worst → 1NF, 2NF, 3NF, BCNF ← Best
best type
------------------------------------------------>
worst type
<------------------------------------------------
 1NF: has at least partial dependency (part of CPK → attribute). If table does not have
CPK, then it is automatically 2NF at least
 2NF: has at least transitive dependency (non-key attribute → non-key attribute), but no
partial dependency
 3NF: has non-key attribute → a PK or part of a CPK
 BCNF: has no dependency at all
 Ideally, we only want minimal redundancy for PK to FK. Everything else should be derived
from other tables.
Normalization steps
 Sketch the dependency diagram for each DB table to
start the normalization process
 We perform 3 steps to normalize a DB table:
1. Show PK or CPK attribute
2. Write dependent attributes using step 1
3.Break all transitive dependencies and leave a copy of
determinant in original table
 Class discussion:
 Whatis a dependency diagram?
 How many dependency diagrams does a DB have?
 See Example 1
Example 1: Normalization – Step 1

invoice_number invoice_line product_code product_name customer_number invoice_total

Transitive Dependency

Partial Dependency

THINK: Which of these are good dependencies?


Step 1: Show PK or CPK attributes separately:

invoice_number

invoice_line
Class discussion
invoice_number invoice_line

THINK: What is/are the PKs and what is the CPK?


Example 1: Normalization – Step 2
invoice_number

invoice_line
This step eliminates
partial dependency
invoice_number invoice_line

Step 2: Write dependent attributes next to boxes of Step 1:


No dependency,
invoice_number customer_number invoice_line DROP IT!

Partial dependency
invoice_number invoice_line product_code product_name invoice_total (customer_number)
removed
Transitive Dependency
Example 1: Normalization – Step 3

invoice_number invoice_line product_code product_name invoice_total

Transitive Dependency

Step 3: Break all transitive dependencies to become new tables and leave a copy of determinant
in original table:

invoice_number customer_number invoice table , 3NF

product_code product_name Final


product table , 3NF
Design

invoice_number invoice_line product_code invoice_total line table , 3NF

Class discussion: Is this a good design of invoice table?


BCNF
 Special case of 3NF
 3NF may allow non-key dependencies, but BCNF has no
dependency
 Only in rare occasions that a 3NF is not a BCNF
 Process (Steps) to convert 3NF to BCNF:
1. Switch attributes: make the one that determines PK
a PK. This creates a partial dependency
2. Remove partial dependency
 Class discussion:
 Think of a 3NF table that is not a BCNF
 See Example 2
Example 2: Normalization – BCNF

Table is 3NF,
A B C D E F G
not BCNF
Non-key Dependency
1. Switch attributes

A E C D B F G

Partial Dependency
2. Remove partial dependency No dependency,
THINK: When may a
Final
A DROP IT! Design 3NF table not be a
BCNF?
E B

A E C D F G
Example 3:Additional 3NF Normalization
Example
Consider a table in the following format:

What problems do you see with the


table format?

Note: the naming convention for example 3 is different from the one that has been discussed
and adopted in this course
Example 3: Step 1 Conversion to 1NF

Eliminate repeating groups: each row


should define a single entity

Identify the Primary Key: PK must uniquely identify any attribute value

All attributes are dependent on PROJ_NUM + EMP_NUM


Example 3: Step 1 Conversion to 1NF
Dependency Diagram

1NF

Transitive
Partial
dependency
dependency
Partial
dependencies
Example 3: Step 2 Conversion to 2NF
Dependency Diagram
Eliminate partial dependency

2NF
Example 3: Step 2 Conversion to 2NF
Dependency Diagram

3NF
Example 4: Additional BCNF
Normalization Example

The table is in 3NF


and it contains a
non-key attribute
that determines a
key attribute

BCNF

Note: the naming convention for example 4 is different from the one that has been discussed
and adopted in this course

You might also like