Database Systems Lec 4 PDF
Database Systems Lec 4 PDF
Database Systems Lec 4 PDF
Lecture #4
Topic: Develop Data Models for Business Databases
Instructor: Lecturer Ayesha Naseer
Learning Objectives
• Develop ERDs that are consistent with narrative problems.
• Use transformations to generate alternative ERDs.
• Document design decisions implicit in an ERD.
• Analyze an ERD for common design errors.
• Convert an ERD to a table design using conversion rules.
Guidelines for Analyzing Business Information Needs
Data modeling involves the collection and analysis of business requirements resulting in an ERD to represent the requirements. Business
requirements are rarely well structured.
Rather, as an analyst you will often face an ill-defined business situation in which you need to add structure. You will need to interact with a variety of
stakeholders who sometimes provide competing statements about the database requirements.
In collecting the requirements, you will conduct interviews, review documents and system documentation, and examine existing data. The main goal
when analyzing narrative problem statements is to create an ERD that is consistent with the narrative.
The ERD should not contradict the implied ERD elements in the problem narrative.
Identifying Entity Types
In a narrative, you should look for nouns involving people, things, places, and events as potential entity types. The nouns may appear as subjects or
objects in sentences. For example, the sentence, "Students take courses at the university" indicates that student and course may be entity types.
Determining Attributes
The properties often indicate attributes of entity types. For example, the sentence, "Students choose their major and minor in their first year"
indicates the major and minor may be attributes of student. The sentence, "Courses have a course number, semester, year, and room listed in the
catalog" indicates that course number, semester, year, and room are attributes of course.
Determining Primary Keys
Identification of primary keys is an important part of entity type identification. Ideally, primary keys should be stable and single purpose. "Stable"
means that a primary key should never change after it has been assigned to an entity. "Single purpose" means that a primary key attribute should
have no purpose other than entity identification. Typically, good choices for primary keys are integer values automatically generated by a DBMS.
Guidelines for Analyzing Business Information Needs
Adding Relationships
Relationships often appear as verbs connecting nouns previously identified as entity types.
For example, the sentence, "Students enroll in courses each semester" indicates a relationship between
students and courses.
For relationship cardinality, you should look at the number (singular or plural) of nouns along with other
words that indicate cardinality.
For example, the sentence, "A course offering is taught by an instructor" indicates that there is one
instructor per course offering. You should also look for words such as "collection" and "set" that indicate
a maximum cardinality of more than one.
For example, the sentence, "An order contains a collection of items" indicates that an order is related to
multiple items.
Minimum cardinality can be indicated by words such as "optional" and "required." In the absence of
indication of minimum cardinality, the default should be mandatory.
Summary of Analysis Guidelines
Information Requirements - The Water Utility Database
The water utility database should support the recording of water usage and billing for water usage. To support these
functions, the database should contain data about customers, rates, water usage, and bills. Other functions such as
payment processing and customer service are omitted from this description for brevity. The following list describes the
data requirements
in more detail.
• Customer data include a unique customer number, a name, a billing address, a type (commercial or residential), an
applicable rate, and a collection (one or more) of meters.
• Meter data include a unique meter number, an address, a size, and a model. The meter number is engraved on the
meter before it is placed in service. A meter is associated with one customer at a time.
• An employee periodically reads each meter on a scheduled date. When a meter is read, a meter-reading document is
created containing a unique meter reading number, an employee number, a meter number, a timestamp (includes date
and time), and a consumption level. When a meter is first placed in service, there are no associated readings for it.
• A rate includes a unique rate number, a description, a fixed dollar amount, a consumption threshold, and a variable
amount (dollars per cubic foot). Consumption up to the threshold is billed at the fixed amount. Consumption greater
than the threshold is billed at the variable amount. Customers are assigned rates using a number of factors such as
customer type, address, and adjustment factors. Many customers can be assigned the same rate. Rates are typically
proposed months before approved and associated with customers.
• The water utility bills are based on customers' most recent meter readings and applicable rates. A bill consists of a
heading part and a list of detail lines. The heading part contains a unique bill number, a customer number, a preparation
date, a payment due date, and a date range for the consumption period. Each detail line contains a meter number, a
water consumption level, and an amount. The water consumption level is computed by subtracting the consumption
levels in the two most recent meter readings. The amount is computed by multiplying the consumption level by the
customer's rate.
Analysis of the Information Requirements for
the Water Utility Database
Refinements to an ERD
Data modeling is usually an iterative or repetitive process. You construct a preliminary data model and then refine
it many times. In refining a data model, you should generate feasible alternatives and evaluate them according to
user requirements. You typically need to gather additional information from users to evaluate alternatives. This
process of refinement and evaluation may continue many times for large databases.
1. Transforming Attributes into Entity Types
A common refinement is to transform an attribute into an entity type. When the database should contain more
than just the identifier of an entity, this transformation is useful. This transformation involves the addition of an
entity type and a 1-M relationship. In the water utility ERD, the Reading entity type contains the EmpNo attribute.
If other data about an employee are needed, EmpNo can be expanded into an entity type and 1-M relationship as
in following Figure.
Refinements to an ERD
2. Splitting Compound Attributes
Another common refinement is to split compound attributes into smaller attributes. A
compound attribute contains multiple kinds of data. For example, the Customer entity
type has an address attribute containing data about a customer's street, city, state, and
postal code. Splitting compound attributes can facilitate search of the embedded data.
Splitting the address attribute as shown in Following Figure, supports searches by
street, city, state, and postal code.
Refinements to an ERD
3. Expanding Entity Types
A third transformation is to make an entity type into two entity types and a relationship. This
transformation can be useful to record a finer level of detail about an entity. For example,
rates in the water utility database apply to all levels of consumption beyond a fixed level. It
can be useful to have a more complex rate structure in which the variable amount depends
on the consumption level. Figure 6.5 shows a transformation to the Rate entity type to
support a more complex rate structure. The RateSet entity type represents a set of rates
approved by the utility's governing commission. The primary key of the Rate entity type
borrows from the RateSet entity type. Identification dependency is not required when
transforming an entity type into two entity types and a relationship. In this situation,
identification dependency is useful, but in other situations, it may not be appropriate.
Refinements to an ERD
4. Transforming a Weak Entity into a Strong Entity
A fourth transformation is to make a weak entity into a strong entity and change the
associated identifying relationships into non identifying relationships. This
transformation can make it easier to reference an entity type after conversion to a
table design. After conversion, a reference to a weak entity will involve a combined
foreign key with more than one column. This transformation is most useful for
associative entity types, especially associative entity types representing M-way
relationships.
Refinements to an ERD
5. Adding History
A fifth transformation is to add historical details to a data model. Historical details may be necessary for legal
requirements as well as strategic reporting requirements. This transformation can be applied to attributes and
relationships. When applied to attributes, the transformation is similar to the attribute to entity type transformation. For
example, to maintain a history of employee titles, the EmpTitle attribute is replaced with an entity type and a 1-M
relationship. The new entity type typically contains a version number as part of its primary key and borrows from the
original entity type for the remaining part of its primary key, as shown in following Figure. The beginning and ending
dates indicate the effective dates for the change. When applied to a relationship, this transformation typically involves
changing a 1-M relationship into an associative entity type and a pair of identifying 1-M relationships. The following
Figure depicts the transformation of the 1 -M Uses relationship into an associative entity type with attributes for the
version number and effective dates.
Refinements to an ERD
A sixth transformation is to make an entity type into a generalization hierarchy. This transformation
should be used sparingly because the generalization hierarchy is a specialized modeling tool. If there are
multiple attributes that do not apply to all entities and there is an accepted classification of entities, a
generalization hierarchy may be useful.
For example, water utility customers can be classified as commercial or residential. The attributes
specific to commercial customers (TaxPayerlD and EnterpriseZone) do not apply to residential customers
and vice versa. In following Figure, the attributes specific to commercial and residential customers have
been moved to the subtypes.
Summary of Transformations
Detecting Common Design Errors
Misplaced and Missing Relationships
In a large ERD, it is easy to connect the wrong entity types or omit a necessary relationship. You can connect the
wrong entity types if you do not consider all of the queries that a database should support. For example in
following Figure , if Customer is connected directly to Reading instead of being connected to Meter, the control of
a meter cannot be established unless the meter has been read for the customer. Queries that involve meter
control cannot be answered except through consideration of meter readings.
If the requirements do not directly indicate a relationship, you should consider indirect implications to detect
whether a relationship is required. For example, the requirements for the water utility database do not directly
indicate the need for a relationship from Bill to Reading. However, careful consideration of the consumption
calculation reveals the need for a relationship. The Includes relationship connects a bill to its most recent meter
readings, thus supporting the consumption calculation.
Detecting Common Design Errors
Incorrect Cardinalities
The typical error involves the usage of a 1-M relationship instead of an M-N relationship. This
error can be caused by an omission in the requirements. For example, if the requirements
just indicate that work assignments involve a collection of employees, you should not assume
that an employee can be related to just one work assignment. You should gather additional
requirements to determine if an employee can be associated with multiple work
assignments.
Other incorrect cardinality errors that you should consider are reversed cardinalities (1-M
relationship should be in the opposite direction) and errors on a minimum cardinality. The
error of reversed cardinality is typically an oversight.
Overuse of Specialized Data Modeling Constructs
Generalization hierarchies and M-way associative entity types are specialized data modeling
constructs. A typical novice mistake is to use them inappropriately. You should not use
generalization hierarchies just because an entity can exist in multiple states.
For example, the requirement that a project task can be started, in process, or complete does
not indicate the need for a generalization hierarchy. If there is an established classification
and specialized attributes and relationships for subtypes, a generalization hierarchy is an
appropriate tool.
Detecting Common Design Errors
Redundant Relationships
Cycles in an ERD may indicate redundant relationships. A cycle involves a collection of relationships arranged in a loop starting and ending with the same
entity type. For example in Figure 6.10, there is a cycle of relationships connecting Customer, Bill, Reading, and Meter. In a cycle, a relationship is redundant if
it can be derived from other relationships.
For the SentTo relationship, the bills associated with a customer can be derived from the relationships Uses, ReadBy, and Includes. In the opposite direction,
the customer associated with a bill can be derived from the Includes, ReadBy, and Uses relationships. Although a bill can be associated with a collection of
readings, each associated reading must be associated with the same customer. Because the SentTo relationship can be derived, it is removed in the final ERD.
Converting an ERD to Relational Tables
1. Entity Type Rule: Each entity type (except subtypes) becomes a table. The primary key of the entity
type (if not weak) becomes the primary key o f the table. The attributes of the entity type become
columns in the table. This rule should be used first before the relationship rules.
2. 1-M Relationship Rule: Each 1-M relationship becomes a foreign key in the table corresponding to
the child entity type (the entity type near the Crow's Foot symbol). If the minimum cardinality on the
parent side of the relationship is one, the foreign key cannot accept null values.
3. M-N Relationship Rule: Each M-N relationship becomes a separate table. The primary key of the
table is a combined key consisting of the primary keys of the entity types participating in the M-N
relationship.
4. Identification Dependency Rule: Each identifying relationship (denoted by a solid relationship line)
adds a component to a primary key. The primary key of the table corresponding to the weak entity
consists of (i) the underlined local key (if any) in the weak entity and (ii) the primary key(s) of the entity
type(s) connected by identifying relationship(s).
5. Generalization Hierarchy Rule: Each entity type of a generalization hierarchy becomes a table. The
columns of a table are the attributes of the corresponding entity type plus the primary key o f the
parent entity type. For each table representing a subtype, define a foreign key constraint that
references the table corresponding to the parent entity type.
6. 1-1 Relationship Rule: Each 1-1 relationship is converted into two foreign keys. If the relationship is
optional with respect to one of the entity types, the corresponding foreign key may be dropped to
eliminate null values.
Converting an ERD to Relational Tables
Applying Rule 1 &2
Converting an ERD to Relational Tables
Applying Rule 3
Converting an ERD to Relational Tables
Identification Dependency Rule
Converting an ERD to Relational Tables
Self Referencing
Converting an ERD to Relational Tables
Generalization Hierarchy Rule
Converting an ERD to Relational Tables
Converting 1-1 Relationships
Comprehensive Conversion Example
Review Concepts
• Identifying entity types and attributes in a narrative.
* Criteria for primary keys: stable and single purpose.
• Identifying relationships in a narrative.
- Transformations to add detail to an ERD: attribute to entity type, expanding an entity
type, adding history.
• Splitting an attribute to standardize information content.
. Changing a weak entity to a strong entity to remove combined foreign keys after
conversion.
• Adding a generalization hierarchy to avoid null values.
• Common design errors: misplaced relationships, missing relationships, incorrect cardinalities,
overuse of generalization hierarchies, overuse of associative entity types representing
M-way relationships, and redundant relationships.
• Basic rules to convert entity types and relationships.
• Specialized conversion rules to convert optional 1-M relationships, generalization hierarchies,
and 1-1 relationships.
Resources
Chapter 6, Michael V. Mannion., Database Design, Application Development, and
Administration, , Mc Graw Hill Publishers, 3rd edition,
Chapter 7, Avi Silberschatz, Henry F. Korth, S. Sudarshan, Database Systems
Concepts, Mc Graw Hill Publishers, 7th edition
Chapter 3, Elmasri and Navathe. Fundamentals of Database Systems. Addison-
Wesley, 7th edition, 2007.