SS2 TERM 1
The term data model can be used in two related senses:
It is a description of the objects represented by a computer system together with their
properties and relationships; these are typically real-world objects, such as products,
suppliers, customers, orders etc.
It is a collection of concepts and rules used in defining data models, e.g. the
relational model uses relations and tuples, while the network model uses records, sets and
fields.
Definition:- A data model is a conceptual representation of the data structures that are required by a
database.
The data structures include the data objects, the associations between data objects, and the rules
that govern operations on the objects. As the name implies, the data model focuses on what data
is required and how it should be organized, rather than on what operations will be performed on
the data.
A data model is independent of hardware or software constraints. Rather than representing the
data as a database would see it, the data model focuses on representing the data as the user sees it
in the real world. The data model serves as a bridge between the concepts that make up real-world
events and processes and the physical representation of those concepts in a database. A
common analogy is that a data model is equivalent to an architect’s building plans.
Data models are often used as an aid to communication between the business people
who define the requirements for a computer system and the technical people who define the design in
response to those requirements. They are used to show the data needed and created by business
processes.
A data model can be thought of as a diagram or flowchart that illustrates the relationships
between data. Although capturing all the possible relationships in a data model can be very
time-intensive, it is an important step and should not be rushed. Well-documented models allow
stakeholders to identify errors and make changes before any programming code has been written.
Data modellers often use multiple models to view the same data and ensure that all
processes, entities, relationships and data flows have been identified. A data model is based on data,
data relationships, data semantics and data constraints. It provides the details of the information to be
stored, and is of primary use where the final product is the generation of computer software code
for an application or the preparation of a functional specification to aid a computer software
make-or-buy decision.
There are three different types of data models produced while progressing from requirements to
the actual database to be used for the information system. They are:
Conceptual data model:- The data requirements are initially recorded as a conceptual data
model, which is a set of technology-independent specifications about the data and is used to
discuss initial requirements with the business stakeholders.
Conceptual Data Model
A conceptual data model identifies the highest-level relationships between the different entities.
Features of conceptual data model include:
Includes the important entities and the relationships among them.
No attribute is specified.
No primary key is specified.
The figure below is an example of a conceptual data model.
Conceptual Data Model
From the figure above, we can see that the only information shown via the conceptual data
model is the entities that describe the data and the relationships between those entities. No other
information is shown through the conceptual data model.
Logical data model:- The conceptual data model is then translated into a logical data model. This
is where the documentation of the data that can be implemented in the database takes place.
Implementation of one conceptual data model may require multiple logical data models.
A logical data model describes the data in as much detail as possible, without regard to how it
will be physically implemented in the database. Features of a logical data model include:
Includes all entities and relationships among them.
All attributes for each entity are specified.
The primary key for each entity is specified.
Foreign keys (keys identifying the relationship between different entities) are specified.
Normalization occurs at this level.
The steps for designing the logical data model are as follows:
Specify primary keys for all entities.
Find the relationships between different entities.
Find all attributes for each entity.
Resolve many-to-many relationships.
Normalization.
The figure below is an example of a logical data model.
Logical Data Model
Comparing the logical data model shown above with the conceptual data model diagram, we see
the main differences between the two:
In a logical data model, primary keys are present, whereas in a conceptual data model, no
primary key is present.
In a logical data model, all attributes are specified within an entity. No attributes are specified in
a conceptual data model.
Relationships between entities are specified using primary keys and foreign keys in a logical
data model. In a conceptual data model, the relationships are simply stated, not specified, so we
simply know that two entities are related, but we do not specify what attributes are used for this
relationship.
Physical data model:- The last step is to transform the logical data model into a physical data
model. Here the data is organized into tables, accounting for access, performance and storage
details.
A physical data model represents how the model will be built in the database. A physical database
model shows all table structures, including column names, column data types, column constraints,
primary keys, foreign keys, and relationships between tables. Features of a physical data model
include:
Specification of all tables and columns.
Foreign keys are used to identify relationships between tables.
Denormalization may occur based on user requirements.
Physical considerations may cause the physical data model to be quite different from the logical
data model.
The physical data model will be different for different RDBMSs. For example, the data type for a column
may be different between MySQL and SQL Server.
The steps for physical data model design are as follows:
Convert entities into tables.
Convert relationships into foreign keys.
Convert attributes into columns.
Modify the physical data model based on physical constraints / requirements.
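The conversion steps above can be sketched with SQLite (via Python's built-in sqlite3 module). The customer/order entities, columns and types below are invented purely for illustration; they are not from this text.

```python
import sqlite3

# Hypothetical customer/order entities, used only to illustrate the steps:
# entities -> tables, attributes -> columns, relationship -> foreign key.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

cur.execute("""
    CREATE TABLE customer (
        cust_id INTEGER PRIMARY KEY,   -- primary key from the logical model
        name    TEXT NOT NULL          -- attribute converted to a typed column
    )""")
cur.execute("""
    CREATE TABLE customer_order (
        order_id  INTEGER PRIMARY KEY,
        cust_id   INTEGER NOT NULL REFERENCES customer(cust_id),  -- relationship as FK
        placed_on TEXT                 -- physical type choices vary per RDBMS
    )""")

# Confirm both tables exist in the catalogue.
tables = [r[0] for r in cur.execute(
    "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")]
```

The same logical model ported to another RDBMS would keep the entities and keys but might use different column types, as the text notes.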
The figure below is an example of a physical data model.
Physical Data Model
Comparing the physical data model shown above with the logical data model diagram, we see the
main differences between the two:
Entity names are now table names.
Attributes are now column names.
Data type for each column is specified.
Data types can be different depending on the actual database being used.
Hierarchical model:- This model organizes data in a tree structure in which each record has a
single parent, so it supports only one-to-many relationships.
[Figure: Hierarchical model, showing academic reports and financial reports as child nodes]
Network model:- This model is similar to the hierarchical model, except that each child
may have more than one parent. The model offers many-to-many relationships, as against
the hierarchical model, which is one-to-many.
Network model
Relational model:- This is the most common of all the models. It represents all
the data in the database in two-dimensional tables. Each row in the table is called
a tuple, while each column represents an attribute. A tuple represents a record
and an attribute represents a field.
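A quick sketch of rows-as-tuples using Python's built-in sqlite3 module; the product table and its values are invented for illustration.

```python
import sqlite3

# A one-row product relation; sqlite3 hands each row back as a plain
# Python tuple, matching the model's row = tuple, column = attribute idea.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE product (product_id INTEGER, name TEXT, unit_price REAL)")
cur.execute("INSERT INTO product VALUES (1, 'Widget', 2.5)")

row = cur.execute("SELECT product_id, name, unit_price FROM product").fetchone()
# row is the tuple (the record); each position holds one attribute (field) value.
```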
Object-oriented model:- This is similar to a relational database model, but
objects classes and inheritance are directly supported in database schemes and in
the query language.
Star schema:- This is the simplest style of data warehouse schema. The star
schema consists of a few “fact tables” (possibly only one, justifying the name)
referencing any number of “dimension tables”.
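A minimal star schema can be sketched in SQLite; the fact and dimension tables below are assumptions chosen only to illustrate the shape, not tables from this text.

```python
import sqlite3

# One fact table referencing two dimension tables; all names are invented.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT)")
cur.execute("CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, day TEXT)")
cur.execute("""
    CREATE TABLE fact_sales (
        product_id INTEGER REFERENCES dim_product(product_id),
        date_id    INTEGER REFERENCES dim_date(date_id),
        quantity   INTEGER,
        amount     REAL
    )""")
cur.execute("INSERT INTO dim_product VALUES (1, 'Widget')")
cur.execute("INSERT INTO dim_date VALUES (20240101, '2024-01-01')")
cur.execute("INSERT INTO fact_sales VALUES (1, 20240101, 3, 7.5)")

# A typical star-schema query joins the fact table out to its dimensions.
total = cur.execute("""
    SELECT SUM(f.amount)
    FROM fact_sales f
    JOIN dim_product p ON p.product_id = f.product_id
""").fetchone()[0]
```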
DATA STRUCTURE DIAGRAM:- A data structure diagram (DSD) is a diagram and
data model used to describe conceptual data models by providing graphical notations
which document entities, their relationships, and the constraints that bind them. The
basic graphic elements of DSDs are boxes, representing entities, and arrows, representing
relationships. DSDs are most useful for documenting complex entities. DSDs are an
extension of the ER-model. In DSDs, attributes are specified inside the entity boxes rather
than outside of them, while relationships are drawn as boxes composed of attributes which
specify the constraints that bind entities together. The ER-model does not provide a way to
specify the constraints between relationships and is therefore cumbersome when
representing entities with several attributes. The ER-model focuses on the relationships
between different entities, while DSDs focus on the relationships of the elements
within an entity and enable users to fully see the links and relationships between each
entity.
ENTITY-RELATIONSHIP MODEL:-This is a model that allows us to describe the
data involved in a real world enterprise in terms of objects and their relationships and it
is widely used to develop an initial database design. It provides useful concepts that
allow us to move from an informal description of what users want for their database to a
more detailed and precise description that can be implemented in a DBMS. This model is
an abstract conceptual data model or semantic data model used in software engineering
to represent structured data.
GEOGRAPHIC DATA MODEL:- This is a data model in a geographic information
system, which is a mathematical construct for representing geographic objects or surfaces
as data. Examples of this model are:
The vector data model represents geography as collections of points,
lines, and polygons.
The raster data model represents geography as cell matrices that store
numeric values.
The triangulated irregular network (TIN) data model represents geography as
sets of contiguous, non-overlapping triangles.
GENERIC DATA MODEL:- These are generalizations of conventional data models.
They define standardized general relation types. They were developed as an approach to
solve some shortcomings of the conventional data models. For example, two modellers
will usually produce different conventional data models of the same domain. This can lead to
difficulty in bringing the models of different people together and is an obstacle for data
exchange and data integration. Invariably, however, this difference is attributable to
different levels of abstraction in the models and differences in the kinds of facts that can
be instantiated (the semantic expression capabilities of the models). The modellers need to
communicate and agree on certain elements which are to be rendered more concrete in
order to make the differences less significant.
SEMANTIC DATA MODEL:- This is a technique in software engineering to define the
meaning of data within the context of its interrelationships with other data. A semantic
data model is an abstraction which defines how the stored symbols relate to the real
world. It is sometimes called a conceptual data model.
DATA MODELING
Data modelling is the analysis of data objects that are used in a business or other context and the
identification of the relationships among these data objects. Data modelling is a first step in
doing object-oriented programming.
Data modelling is the formalization and documentation of existing processes and events that
occur during application software design and development.
Identify the keys for each entity:- Keys are used to look up a row of data. Keys describe the
minimum amount of data necessary to identify a particular thing. Often a computer-generated
number is used for the key, for example an employee ID number.
Identify the attributes for each entity:- Attributes are the information which needs to be kept
about an entity. For example, the attributes of a worker may include the worker’s name and e-mail
address.
Note:- Data models are expressed in an ERD and a data element dictionary (DED). The ERD is a
schematic representation of the database, while the DED is the text representation. The two must be
combined to get a clear picture of the data model.
DATA MODELLING APPROACHES
The following are the data modelling approaches
Semantic modelling
Relational modelling
Entity-Relationship modelling
Binary modelling
Semantic modelling:- This is modelling that uses the concept of type. A type is the
collection of a certain number of properties into a unit or component. The properties
are also considered as types.
Example: Student registering for a course.
type registration = student, course
type student = name, student ID, street address, city, state, zip code
type course = course name, course number, day-of-week, time
[Figure: aggregation diagram with Registration placed above Student and Course]
Note:
Aggregation is represented in the model diagram by placing it above the
properties.
Base types such as name, student ID, address and so on are not represented in the
graphical notation.
Relational modelling:- This uses the concept of a mathematical relation, which forms the
basis for the data structure in the relational model. A relation is visualized as a
two-dimensional table with rows and columns containing only atomic values. The
example of student registration is defined in relational modelling as follows:
Registration (student ID, course number)
Student (name, student ID, street address, city, state, zip code)
Course (course name, course number, day-of-week, time)
The relation registration has two attributes, forming two table columns. The number of rows in each
table depends on the actual data stored. Each row is uniquely identified by the values of the columns
in bold type.
Entity-relationship modelling:- This uses the concepts of entity type, attribute type
and relationship type to form a complete ER-model, as shown below, using the
example of a student registering for a course.
[Figure: ER diagram for student registration, with Student and Course entities and attributes
such as name, student ID, street address, city, state, zip code, course name and course number]
Binary modelling:- This separates the object types into lexical and non-lexical
object types. Lexical object types are those that can be used as names for other
object types or as references to other object types. Non-lexical object types are
named object types, or those referred to by other object types. The relationship
between a lexical and a non-lexical object type is called a bridge type. A relationship
between two non-lexical object types is called an idea type. Graphical constraints may
be imposed on an information structure diagram of the binary model. The uniqueness
constraint and the totality constraint are among the imposable constraints. In a binary
model it must always be possible to refer uniquely to a non-lexical object type; that
is, each non-lexical object type must be referable. The information structure diagram for the
student registration example becomes more complex than with the other modelling
approaches.
Note:- Dotted circles denote lexical object types and closed circles represent non-lexical
object types. Bridge types indicate the relationships between student and course. Uniqueness
constraints are indicated by “u” and totality constraints by “v”.
The diagram below illustrates binary modelling.
[Figure: binary-model information structure diagram for the student registration example,
with uniqueness constraints marked “u” and totality constraints marked “v”]
MODELLING METHODOLOGIES
Data models represent information areas of interest. There are several ways of creating data
models, but these two methodologies stand out:
Bottom-up methodology
Top-down methodology
Bottom-up methodology:- These are methodologies that usually start with existing data structures:
forms, fields on application screens, or reports.
Top-down methodology:- These are methodologies in which models are created in an abstract way
by getting information from people who know the subject area.
Note:- The most common method for building data models for relational databases is the
entity-relationship model.
The data model is one part of the conceptual design process. The other, typically, is the
functional model. The data model focuses on what data should be stored in the database,
while the functional model deals with how the data is processed. To put this in the
context of the relational database, the data model is used to design the relational tables.
The functional model is used to design the queries which will access and perform
operations on those tables.
A data model instance is produced by applying a data model theory to create a practical
data model instance for some particular application.
When standard data models are in use, they enable easier and faster information sharing, because
heterogeneous organizations have a standard vocabulary and pre-negotiated semantics, formats, and
quality standards for exchanged data. This standardization has an impact on software architecture, as
solutions that vary from the standard may cause data-sharing issues and problems if data is out of
compliance with the standard.
The more effective standard models have developed in the banking, insurance, pharmaceutical
and automotive industries, to reflect the stringent standards applied to customer information
gathering, customer privacy, consumer safety, or just in time manufacturing.
Typically these use the popular relational model of database management, but some use the
hierarchical model, especially those used in manufacturing or mandated by governments, e.g.,
the DIN codes specified by Germany. While the format of the standard may have
implementation trade-offs, the underlying goal of these standards is to make sharing of data
easier.
The most complex data models known are in military use, and consortia such as NATO tend to
require strict standards of their members' equipment and supply databases. However, they
typically do not share these with non-NATO competitors, and so calling these 'standard' in the
same sense as commercial software is probably not very appropriate.
An emerging area of standard data model is in the identity card arena, where a vast number of
security engineering solutions for public spaces, e.g., airports, other public transport, hospitals,
are expected soon to rely on a standard data model for identifying the card holder/user of the
facility. This may contain biometric information or other data that would be standardized across
an entire trade bloc, e.g., the European Union or the North American Free Trade Agreement
(NAFTA). This raises many privacy and carceral state concerns. These are discussed more
deeply in an article on standard user models.
NORMALIZATION
Normalization is a process in which an initial DB design is transformed, or decomposed, into a
different, but equivalent, design. The resulting schema is equivalent to the original one in the
sense that no information is lost when going from one to the other.
The normalization procedure consists of a sequence of projections; that is, some attributes are
extracted from one table to form a new one. In other words, tables are split up vertically. The
decomposition is lossless only if you can restore the original table by joining its projections.
Through such non-loss decompositions it is possible to transform an original schema into a
resulting one that satisfies certain conditions, known as normal forms:
Tables that contain redundant data can suffer from update anomalies, which can introduce
inconsistencies into a database.
The rules associated with the most commonly used normal forms are those of the first (1NF),
second (2NF), and third (3NF) normal forms.
Various types of update anomalies, such as insertion, deletion, and modification anomalies, can be
found in tables that break the rules of 1NF, 2NF, and 3NF; such tables are likely to contain
redundant data and suffer from update anomalies.
Normalization is a technique for producing a set of tables with desirable properties that support the
requirements of a user or company.
A major aim of relational database design is to group columns into tables so as to minimize data
redundancy and reduce the file storage space required by the base tables.
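The projection-and-join idea can be sketched in plain Python: split a table into two projections, rejoin them, and confirm the decomposition was lossless. The StaffBranch-style sample rows are invented for illustration.

```python
# A lossless decomposition in plain Python: split a table into two
# projections, then rejoin them and check nothing was lost or invented.
staff_branch = [
    # (staff_no, name, branch_no, branch_address)
    ("S1", "Ann",  "B01", "12 High St"),
    ("S2", "Bola", "B01", "12 High St"),
    ("S3", "Carl", "B02", "3 Market Rd"),
]

# Projection 1: staff attributes plus the branch number.
staff = {(s, n, b) for (s, n, b, _) in staff_branch}
# Projection 2: branch attributes only.
branch = {(b, a) for (_, _, b, a) in staff_branch}

# A natural join on branch_no reconstructs the original table.
rejoined = {(s, n, b, a) for (s, n, b) in staff for (b2, a) in branch if b == b2}

lossless = rejoined == set(staff_branch)
```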
The First Normal Form (1NF) addresses the structure of an isolated table.
The Second (2NF), Third (3NF), and Boyce-Codd (BCNF) Normal Forms address
one-to-one and one-to-many relationships.
The Fourth (4NF) and Fifth (5NF) Normal Forms deal with many-to-many
relationships.
These Normal Forms form a hierarchy in such a way that a schema in a higher normal form
automatically fulfils all the criteria for all of the lower Normal Forms.
The Fifth Normal Form is the ultimate normal form with respect to projections and joins -- it is
guaranteed to be free of anomalies that can be eliminated by taking projections.
Functional dependency
The Second and Third Normal Forms address dependencies among attributes, specifically
between key and non-key fields.
By definition, a key uniquely determines a record: Knowing the key determines the values of all
the other attributes in the table row, so that given a key, the values of all the other attributes in
the row are fixed.
This kind of relationship can be formalized as follows. Let X and Y be attributes (or sets of
attributes) of a given relation. Then Y is functionally dependent on X if, whenever two
records agree on their X-values, they must also agree on their Y-values. In this case, X is called
the determinant and Y is called the dependent. Since for any X there must be a single Y, this
relationship represents a single-valued functional dependency. If the set of attributes in the
determinant is the smallest possible (in the sense that after dropping one or more of the attributes
from X, the remaining set of attributes no longer uniquely determines Y), then the
dependency is called irreducible.
Note that functional dependency is a semantic relationship: It is the business logic of the
problem domain, represented by the relation, which determines whether a certain X determines
Y.
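The definition can be turned into a small checker: Y is functionally dependent on X exactly when no two rows agree on X but disagree on Y. The sample staff rows and column names are invented for illustration.

```python
def holds_fd(rows, x, y):
    """Return True if the FD x -> y holds in rows (a list of dicts)."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in x)
        val = tuple(row[a] for a in y)
        # Two rows agreeing on X but disagreeing on Y breaks the FD.
        if key in seen and seen[key] != val:
            return False
        seen[key] = val
    return True

staff = [
    {"staff_no": "S1", "branch_no": "B01", "tel": "555-1000"},
    {"staff_no": "S2", "branch_no": "B01", "tel": "555-1000"},
    {"staff_no": "S3", "branch_no": "B02", "tel": "555-2000"},
]

fd1 = holds_fd(staff, ["branch_no"], ["tel"])  # branch_no determines tel
fd2 = holds_fd(staff, ["tel"], ["staff_no"])   # fails: two staff share one tel
```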
Note:- Every field in a record must depend on The Key (1NF), the Whole Key (2NF), and
Nothing But The Key (3NF).
Delete the columns you just moved from the original table, except for the determinant,
which will serve as a foreign key.
The original table may be renamed to maintain semantic meaning.
Logical database design (it will be easy to comprehend and suitable for further development)
Each attribute (column) must be a fact about the key, the whole key, and nothing but the key
Keys in tables
Primary key
The primary key is an attribute or a set of attributes that uniquely identify a specific instance of an
entity. Every entity in the data model must have a primary key whose values uniquely identify instances
of the entity.
To qualify as a primary key for an entity, an attribute must have the following properties:
* It must have a non-null value for each instance of the entity
* The value must be unique for each instance of an entity
* The values must not change or become null during the life of each entity instance
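A sketch of the first two properties being enforced, using Python's built-in sqlite3 module; the employee table and values are invented for illustration.

```python
import sqlite3

# SQLite rejects both a duplicate and a null primary-key value.
# (NOT NULL is stated explicitly: SQLite's legacy behaviour otherwise
# permits NULL in some declared primary keys.)
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE employee (emp_id TEXT PRIMARY KEY NOT NULL, name TEXT)")
cur.execute("INSERT INTO employee VALUES ('01267', 'Clark')")

errors = []
try:
    cur.execute("INSERT INTO employee VALUES ('01267', 'Duplicate')")
except sqlite3.IntegrityError:
    errors.append("duplicate key rejected")
try:
    cur.execute("INSERT INTO employee VALUES (NULL, 'No key')")
except sqlite3.IntegrityError:
    errors.append("null key rejected")
```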
Composite Keys
Sometimes more than one attribute is required to uniquely identify an entity. A primary key that is
made up of more than one attribute is known as a composite key.
Artificial Keys
An artificial key is one that has no meaning to the business or organization. Artificial keys are
permitted when
No attribute has all the primary key properties, or
The primary key is large and complex.
Foreign Keys
A foreign key is an attribute that completes a relationship by identifying the parent entity.
Foreign keys provide a method for maintaining integrity in the data (called referential integrity)
and for navigating between different instances of an entity. Every relationship in the model must
be supported by a foreign key.
This property simplifies data access because developers and users can be certain of the type of
data contained in a given column. It also simplifies data validation. Because all values are from
the same domain, the domain can be defined and enforced with the Data Definition Language
(DDL) of the database software.
DATA INTEGRITY
Data integrity means, in part, that you can correctly and consistently navigate and manipulate the
tables in the database. There are two basic rules to ensure data integrity; entity integrity and
referential integrity.
The entity integrity rule states that the value of the primary key, can never be a null value (a null
value is one that has no value and is not the same as a blank). Because a primary key is used to
identify a unique row in a relational table, its value must always be specified and should never
be unknown. The integrity rule requires that insert, update, and delete operations maintain the
uniqueness and existence of all primary keys.
The referential integrity rule states that if a relational table has a foreign key, then every value of
the foreign key must either be null or match the values in the relational table in which that
foreign key is a primary key.
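Both halves of the referential integrity rule can be sketched with SQLite, which enforces foreign keys once the foreign_keys pragma is switched on; the branch/staff tables are illustrative.

```python
import sqlite3

# Referential integrity: a foreign-key value must be NULL or match a
# primary-key value in the parent table.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforcement is off by default in SQLite
cur = conn.cursor()
cur.execute("CREATE TABLE branch (branch_no TEXT PRIMARY KEY)")
cur.execute("""CREATE TABLE staff (
    staff_no TEXT PRIMARY KEY,
    branch_no TEXT REFERENCES branch(branch_no))""")
cur.execute("INSERT INTO branch VALUES ('B001')")

cur.execute("INSERT INTO staff VALUES ('S1', 'B001')")  # matching FK: allowed
cur.execute("INSERT INTO staff VALUES ('S2', NULL)")    # null FK: allowed
rejected = False
try:
    cur.execute("INSERT INTO staff VALUES ('S3', 'B999')")  # no such branch
except sqlite3.IntegrityError:
    rejected = True
```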
CREATING TABLES IN NORMAL FORMS
Example: The following are the paper forms used for recording data about EMPLOYEES and
their QUALIFICATIONS in a certain company.
Employee Number: 01267          Employee Name: Clark
Department Number: 05           Department Name: Auditing          Department Location: HQ

Qualification                   Year
Bachelor of Art                 1970
Master of Art                   1973
Doctor of Philosophy            1976
Question: Create a table for the form and normalize the table.
Solutions:
Step 1: Create a table for the form
EMPLOYEE TABLE
Employee Number | Employee Name | Dept. Number | Dept. Name | Dept. Location
01267           | Clark         | 05           | Auditing   | HQ

QUALIFICATION TABLE
Employee Number | Qualification Description | Qualification Year
01267           | BA Art                    | 1970
Note:- In the tables above we have removed the “repeating group” of qualification data
(consisting of qualification descriptions and years) to its own table. We hold the employee number in
the second table to serve as a cross-reference to the first table, because we need to know to
whom each of the qualifications belongs. With the Qualification table there is no limit on the
number of qualifications that any given employee may have.
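The normalized design can be sketched with Python's sqlite3 module: with the repeating group in its own table, any number of qualifications can be stored per employee.

```python
import sqlite3

# The 1NF split above: employee facts in one table, the repeating
# group of qualifications in another, keyed back by employee number.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("""CREATE TABLE employee (
    emp_no TEXT PRIMARY KEY, emp_name TEXT,
    dept_no TEXT, dept_name TEXT, dept_location TEXT)""")
cur.execute("""CREATE TABLE qualification (
    emp_no TEXT REFERENCES employee(emp_no),
    description TEXT, year INTEGER,
    PRIMARY KEY (emp_no, description))""")

cur.execute("INSERT INTO employee VALUES ('01267','Clark','05','Auditing','HQ')")
cur.executemany("INSERT INTO qualification VALUES (?,?,?)", [
    ("01267", "Bachelor of Art", 1970),
    ("01267", "Master of Art", 1973),
    ("01267", "Doctor of Philosophy", 1976),
])

# Any number of qualifications per employee is now possible.
n = cur.execute(
    "SELECT COUNT(*) FROM qualification WHERE emp_no = '01267'").fetchone()[0]
```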
Insertion anomalies
1. To insert the details of a new member of staff (staffNo, name, position and salary) located at
a given branch into the StaffBranch table, we must also enter the correct details for that branch
(branchNo, branchAddress and telNo). For example, to insert the details of a new member of
staff at branch B002, we must enter the correct details of branch B002 so that the branch details
are consistent with values for branch B002 in other records of the StaffBranch table. The data
shown in the StaffBranch table is also shown in the Staff and Branch tables. These tables do not
have redundant data and do not suffer from this potential inconsistency, because for each staff
member we only enter the appropriate branch number into the Staff table. In addition, the details
of branch B002 are recorded only once in the database, as a single record in the Branch table.
2. To insert details of a new branch that currently has no members of staff into the StaffBranch
table, it’s necessary to enter NULLs into the staff-related columns, such as staffNo. However, as
staffNo is the primary key for the StaffBranch table, attempting to enter nulls for staffNo
violates entity integrity, and is not allowed. The design of the tables shown in Staff and Branch
avoids this problem because new branch details are entered into the Branch table separately from
the staff details. The details of staff ultimately located at a new branch can be entered into the
Staff table at a later date.
Deletion anomalies
If we delete a record from the StaffBranch table that represents the last member of staff located
at a branch, the details about that branch are also lost from the database. For example, if we
delete the record for staff Art Peters (S0415) from the StaffBranch table, the details relating to
branch B003 are lost from the database. The design of the tables that separate the Staff and
Branch table avoids this problem because branch records are stored separately from staff records
and only the column branchNo relates the two tables. If we delete the record for staff Art Peters
(S0415) from the Staff table, the details on branch B003 in the Branch table remain unaffected.
Modification anomalies
If we want to change the value of one of the columns of a particular branch in the StaffBranch
table, for example the telephone number for branch B001, we must update the records of all staff
located at that branch (row 1 and 2). If this modification is not carried out on all the appropriate
records of the StaffBranch table, the database will become inconsistent. In this example, branch
B001 would have different telephone numbers in different staff records.
The above examples illustrate that the Staff and Branch tables have more desirable properties
than the StaffBranch table.
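A sketch of why the decomposed design avoids the modification anomaly: the telephone number is stored in exactly one Branch row, so a single UPDATE corrects it for every member of staff. The sample data is invented for illustration.

```python
import sqlite3

# Decomposed Staff and Branch tables: the branch telephone number
# lives in one row, so one UPDATE fixes it for all staff at the branch.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE branch (branch_no TEXT PRIMARY KEY, tel_no TEXT)")
cur.execute("CREATE TABLE staff (staff_no TEXT PRIMARY KEY, branch_no TEXT)")
cur.execute("INSERT INTO branch VALUES ('B001', '555-1000')")
cur.executemany("INSERT INTO staff VALUES (?, ?)",
                [("S1", "B001"), ("S2", "B001")])

# One row changes, yet every staff member sees the new number.
cur.execute("UPDATE branch SET tel_no = '555-9999' WHERE branch_no = 'B001'")
tels = [r[0] for r in cur.execute("""
    SELECT b.tel_no FROM staff s
    JOIN branch b ON b.branch_no = s.branch_no""")]
```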
STAFF TABLE
BRANCH TABLE
The StaffBranch table has redundant data; the details of a branch are repeated for every
member of staff, for example rows 1 and 2, and rows 3 and 4, on branchNo, branchAddress and
telNo.
In contrast, the branch information appears only once for each branch in the Branch table,
and only the branch number (branchNo) is repeated in the Staff table, to represent where
each member of staff is located.
The Second normal form (2NF)
2NF ONLY applies to tables with composite primary keys (primary keys made up of more than one column).
A table is in 2NF if it is in 1NF and the values of each non-primary-key column can be
worked out from the values in ALL the columns that make up the primary key.
Equivalently, a table is in 2NF if each non-key column (one that is not part of any candidate key)
depends on ALL of every candidate key, NOT on a subset of any candidate key.
A 2NF violation occurs when there is a functional dependency (FD) in which part of a key
(instead of the whole key) determines a non-key column. An FD with a single-column
left-hand side (LHS) cannot violate 2NF.
For example, the TempStaffAllocation table in the following figure is NOT in 2NF, because
branchAddress depends on branchNo only, not on both staffNo AND branchNo (staffNo and
branchNo together are the candidate key, serve as the primary key, and form a composite key
because it has more than one column). Similarly, the values in the name and position columns
depend on staffNo ONLY, not on both staffNo and branchNo. Only the hoursPerWeek column
depends on both staffNo and branchNo, which is what we want. In other words, we must avoid
partial dependencies on the candidate keys.
Third normal form (3NF)
A table is in 3NF if it is in 1NF and 2NF and all non-primary-key columns can be worked
out from only the primary key column(s) and no other columns.
At this level, the combined definition of 2NF and 3NF is: a table is in 3NF if each non-key
column depends on all candidate keys, whole candidate keys and nothing but
candidate keys.
For 2NF we remove partial dependencies, and for 3NF we remove transitive
dependencies.
For example, the StaffBranch table is not in 3NF.
The formal definition of 3NF is: a table that is in 1NF and 2NF and in which no
non-primary-key column is transitively dependent on the primary key.
For example, consider a table with columns A, B, and C. If B is functionally dependent on A
(A → B) and C is functionally dependent on B (B → C), then C is transitively
dependent on A via B (provided that A is not functionally dependent on B or C).
If a transitive dependency exists on the primary key, the table is not in 3NF.
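The transitive dependency in the example can be made concrete with a few rows (invented for illustration), where staff_no determines branch_no and branch_no determines branch_address.

```python
# staff_no -> branch_no and branch_no -> branch_address, so
# branch_address is transitively dependent on staff_no.
rows = [
    ("S1", "B01", "12 High St"),
    ("S2", "B01", "12 High St"),   # address stored again: redundancy
    ("S3", "B02", "3 Market Rd"),
]

# Before the 3NF split, branch B01's address is stored twice.
addresses_stored = [a for (_, _, a) in rows]

# After splitting the branch facts into their own relation,
# each branch's address is stored exactly once.
branch = {(b, a) for (_, b, a) in rows}

n_before = len(addresses_stored)  # stored address values in the flat table
n_after = len(branch)             # distinct branch rows after the split
```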
BASIC CONSTRUCT OF ER-MODEL
The ER-model views the real world as a construct of entities and association between entities.
[Figure: Customer entity with attributes Name, Phone No, Cust ID and Contact Add drawn as ovals]
Attributes are shown as oval containing the name of the attribute as in the diagram above.
[Figure: a relationship diamond labelled “buys”]
Named relationships are used to make the ERDs more readable. Relationship names do not
show up in the final database, unlike the entity names.
Cardinality:- The cardinality of a relationship constrains the number of instances of one entity
type that can be associated with a single instance of the other entity type.
Types of cardinality
There are three fundamental types of cardinality in ERDs. They are,
One-to-one
One-to-many
Many-to-many
Example of ER-Diagram
Unit price
Product ID Product
Supplied Cardinality
by
Relationship
Phone No
Name Attribute
Supplier
Address
Cust ID Product ID
Name
Associative Entities
Associative entities:- These are also known as intersection entities. They are used to associate two
or more entities in order to reconcile a many-to-many relationship.
To transform a relationship into an entity on an ERD we use a special symbol called an
associative entity. The notation for an associative entity is a relationship diamond nested inside
an entity rectangle.
A many-to-many relationship without attributes can be transformed into an associative entity
with attributes. From the example above we have the following transformation,
(Diagram: a Customer entity with the attributes Cust ID and Name joined by a ‘‘places’’ relationship to an Order entity.)
From the diagram above, transform the ‘‘contain’’ relationship into an associative entity using
the procedure described earlier.
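The associative-entity idea maps directly onto a junction table in SQL. A sketch with invented Order, Product and OrderLine tables (the relationship's own attribute, quantity, lives on the associative table):

```python
import sqlite3

# OrderLine is the associative table that resolves the many-to-many
# "contain" relationship between Order and Product.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE "Order" (orderID TEXT PRIMARY KEY);
CREATE TABLE Product (productID TEXT PRIMARY KEY);
CREATE TABLE OrderLine (
    orderID   TEXT REFERENCES "Order"(orderID),
    productID TEXT REFERENCES Product(productID),
    quantity  INTEGER,
    PRIMARY KEY (orderID, productID)   -- one line per order/product pair
);
INSERT INTO "Order" VALUES ('O1');
INSERT INTO Product VALUES ('P1');
INSERT INTO Product VALUES ('P2');
INSERT INTO OrderLine VALUES ('O1', 'P1', 2);
INSERT INTO OrderLine VALUES ('O1', 'P2', 1);
""")
total = con.execute(
    "SELECT SUM(quantity) FROM OrderLine WHERE orderID = 'O1'").fetchone()[0]
```

One order can now contain many products, and one product can appear in many orders.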
Difference between logical and physical data models
A logical data model captures general information about entities, their attributes and the
relationships between them; a physical data model describes how that design is implemented in
a specific database.
APPLICATION PACKAGE – MICROSOFT ACCESS
Access is a powerful database management program that can be used for storing,
organizing, retrieving and reporting information. For example, it can be used to manage financial
records, personnel data, inventory information, and personal and professional contacts.
A database is any collection of information that is organized for quick retrieval. For example,
banks use databases to manage customers’ accounts, schools use databases to maintain students’
records, and businesses use databases for payroll, sales and financial records.
Most databases are computer based. A document such as a telephone book can be
considered a manual database because it allows for quick retrieval of information and it is
organized alphabetically.
Computer databases are both flexible and fast. They can save a lot of time compared to manual
databases.
LOADING ACCESS
Click the Start button
Click All Programs
Click Microsoft Office
Click Microsoft Office Access 2007.
Note:- A template is a predefined database consisting of a set of objects, such as tables and forms,
that can be customized with one’s own data.
The Open Recent Database section lists the names of databases you have worked with recently.
The More link shows databases that are not in this list, which can be opened by clicking the link.
The Open command on the Office menu can also be used to open a database. The navigation
pane displays all the objects in the database: tables, forms, queries, reports, macros and
modules.
Note:- The objects displayed in the navigation pane can be narrowed down to just one type.
DATA MODELING
Data modelling is the analysis of data objects that are used in a business or other context and the
identification of the relationships among these data objects. Data modelling is a first step in doing object-
oriented programming.
Data modelling is the formalization and documentation of existing processes and events that occur during
application software design and development. Data modelling techniques and tools capture and translate
complex system designs into easily understood representations of the data flows and processes, creating a
blueprint for construction and/or re-engineering.
Data modellers often use multiple models to view the same data and ensure that all processes, entities,
relationships and data flows have been identified. There are several different approaches to data
modelling, including:
Conceptual Data Modelling - identifies the highest-level relationships between different entities.
Enterprise Data Modelling – similar to conceptual data modelling, but addresses the unique requirements
of a specific business.
Logical Data Modelling - illustrates the specific entities, attributes and relationships involved in a
business function. Serves as the basis for the creation of the physical data model.
Data Integrity
Data integrity refers to the validity of data, meaning data is consistent and correct. In the data
warehousing field, we frequently hear the term, "Garbage In, Garbage Out." If there is no data
integrity in the data warehouse, any resulting report and analysis will not be useful.
In a data warehouse or a data mart, there are three areas where data integrity needs to be
enforced:
Database level
We can enforce data integrity at the database level. Common ways of enforcing data integrity
include:
Referential integrity
The relationship between the primary key of one table and the foreign key of another table must
always be maintained. For example, a row cannot be deleted while a foreign key in another table
still refers to its primary key.
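This can be sketched with Python's `sqlite3` (table names here are made up; note that SQLite only enforces foreign keys once the `foreign_keys` pragma is switched on):

```python
import sqlite3

# Deleting a primary-key row that a foreign key still refers to is rejected.
con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")   # SQLite needs this enabled per connection
con.executescript("""
CREATE TABLE Branch (branchNo TEXT PRIMARY KEY);
CREATE TABLE Staff (
    staffNo  TEXT PRIMARY KEY,
    branchNo TEXT REFERENCES Branch(branchNo)
);
INSERT INTO Branch VALUES ('B001');
INSERT INTO Staff  VALUES ('S1', 'B001');
""")
try:
    con.execute("DELETE FROM Branch WHERE branchNo = 'B001'")
    deleted = True
except sqlite3.IntegrityError:
    deleted = False   # the delete is refused while Staff still points at B001
```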
Primary key / Unique constraint
Primary keys and the UNIQUE constraint are used to make sure every row in a table can be
uniquely identified.
Not NULL vs. NULL-able
Columns declared NOT NULL may not contain a NULL value.
Valid Values
Only allowed values are permitted in the database. For example, if a column can only have
positive integers, a value of '-1' cannot be allowed.
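Both column-level rules above can be sketched together in `sqlite3` (the Stock table is invented): NOT NULL rejects a missing value, and a CHECK constraint restricts the column to valid values.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""
CREATE TABLE Stock (
    productID TEXT PRIMARY KEY,
    quantity  INTEGER NOT NULL CHECK (quantity >= 0)
)""")
con.execute("INSERT INTO Stock VALUES ('P1', 10)")   # valid row is accepted
rejected = []
for row in [("P2", None), ("P3", -1)]:               # a NULL and a negative value
    try:
        con.execute("INSERT INTO Stock VALUES (?, ?)", row)
    except sqlite3.IntegrityError:
        rejected.append(row[0])                      # both inserts are refused
```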
ETL process
For each step of the ETL process, data integrity checks should be put in place to ensure that
source data is the same as the data in the destination. Most common checks include record
counts or record sums.
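A minimal sketch of such a check (the data is invented): after a load step, compare the record count and a column sum between the source and the destination.

```python
# Source rows (productID, quantity) and the rows read back from the target.
source = [("P1", 10), ("P2", 25), ("P3", 5)]
target = list(source)        # pretend this came back from the destination table

count_ok = len(source) == len(target)                          # record counts match
sum_ok = sum(q for _, q in source) == sum(q for _, q in target)  # record sums match
assert count_ok and sum_ok   # both checks pass when nothing was lost in transit
```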
Access level
We need to ensure that data is not altered by any unauthorized means either during the ETL
process or in the data warehouse. To do this, there needs to be safeguards against unauthorized
access to data (including physical access to the servers), as well as logging of all data access
history. Data integrity can only be ensured if there is no unauthorized access to the data.
Data Integrity
Data integrity means, in part, that you can correctly and consistently navigate and manipulate the
tables in the database. There are two basic rules to ensure data integrity; entity integrity and
referential integrity.
The entity integrity rule states that the value of the primary key can never be a null value (a null
value is one that has no value and is not the same as a blank). Because a primary key is used to
identify a unique row in a relational table, its value must always be specified and should never
be unknown. The integrity rule requires that insert, update, and delete operations maintain the
uniqueness and existence of all primary keys.
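The entity integrity rule can be seen in action with `sqlite3` (the Customer table is made up; NOT NULL is declared explicitly because SQLite historically allows NULL in some primary key columns without it):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE Customer (custID TEXT NOT NULL PRIMARY KEY, name TEXT)")
try:
    con.execute("INSERT INTO Customer VALUES (NULL, 'Ada')")
    accepted = True
except sqlite3.IntegrityError:
    accepted = False   # the row is refused: a primary key may never be null
```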
The referential integrity rule states that if a relational table has a foreign key, then every value of
the foreign key must either be null or match the values in the relational table in which that
foreign key is a primary key.