Physical Database Design: Chapter Four
Physical Database Design: Chapter Four
Physical Database Design: Chapter Four
1
1. Introduction
We have established that there are three levels of database
design:
Conceptual design: producing a data model which accounts
for the relevant entities and relationships within the target
application domain;
Logical design: ensuring, via normalization procedures and
the definition of integrity rules, that the stored database
will be non-redundant and properly connected;
Physical design: will be the focus of this chapter
2
the purpose of physical database design
1. How to map the logical database design to a physical database design.
2. How to design enterprise constraints for target DBMS.
3. How to select appropriate file organizations based on analysis of
transactions.
4. When to use secondary indexes to improve performance.
5. How to estimate the size of the database
6. How to design security mechanisms to satisfy user requirements.
7. How to design procedures and triggers.
3
Steps in physical database design
1. Translate logical data model for target DBMS
1.1. Design base relation
1.2. Design representation of derived data
1.3. Design enterprise constraint
2. Design physical representation
2.1. Analyze transactions
2.2. Choose file organization
2.3. Choose indexes
2.4. Estimate disk space and system requirement
3. Design user view
4. Design security mechanisms
5. Consider controlled redundancy
6. Monitor and tune the operational system
4
1. Translate logical data model for target DBMS
This phase is the translation of the global logical data model
to produce a relational database schema in the target DBMS.
Knowledge of the DBMS includes:
how to create base relations
whether the system supports:
• definition of Primary key
• definition of Foreign key
• definition of Alternate key(Unique keys)
• definition of Domains
• Referential integrity constraints
• definition of enterprise level constraints
Remark: The implementation of the physical model is dependent on the target DBMS
since some has more facilities than the other in defining database definitions. 5
1.1. Design base relation
Designing base relation involves identification of all necessary
requirements about a relation starting from the name up to the
referential integrity constraints.
For each relation, we need to define:
• The name of the relation;
• A list of simple attributes in brackets;
• The PK and, where appropriate.
• A list of any derived attributes and how they should be
computed;
• Referential integrity constraints for any FKs identified.
For each attribute, need to define:
Its domain, consisting of a data type, length, and any constraints
on the domain;
An optional default value for the attribute;
Whether the attribute can hold nulls.
Whether the attribute can be derived , if do how it should be
computed
6
1.2. Design representation of derived data
While analyzing the requirement of users, we may encounter that
there are some attributes holding data that will be derived from
existing or other attributes. A decision on how to represent any
derived data present in the global logical data model in the target
DBMS should be devised.
Most of the time derived attributes are not expressed in the logical
model but will be included in the data dictionary. Whether to store
derived attributes in a base relation or calculate them when required is
a decision to be made by the designer considering the performance
impact.
Option selected is based on:
Additional cost to store the derived data and keep it consistent
with operational data from which it is derived;
Cost to calculate it each time it is required.
7
1.3. Design enterprise constraint
Data in the database is not only subjected to constraints on the
database and the data model used but also with some
enterprise dependent constraints.
One need to know the functionalities of the DBMS since in
designing the enterprise constraints for the target DBMS some
DBMS provide more facilities than others.
2. Design physical representation
This phase is the level for determining the optimal file organizations to
store the base relations and the indexes that are required to achieve
acceptable performance;
Number of factors that may be used to measure efficiency:
Transaction throughput: number of transactions processed in
given time interval.
Response time: elapsed time for completion of a single
transaction.
Disk storage: amount of disk space required to store database files.
8
2.1. Analyze transactions
The objective here is to understand the functionality of the
transactions that will run on the database and to analyze the
important transactions.
Attempt to identify performance criteria, e.g.:
Transactions that run frequently and will have a significant
impact on performance;
Transactions that are critical to the business;
Times during the day/week when there will be a high demand
made on the database (called the peak load).
Use this information to identify the parts of the database that may
cause performance problems.
To select appropriate file organizations and indexes, also need to
know high level functionality of the transactions, such as:
Attributes that are updated in an update transaction;
Criteria used to restrict tuples that are retrieved in a query. 9
To help identify which transactions to investigate, can use:
Transaction/relation cross-reference matrix, showing
relations that each transaction accesses, and/or
Transaction usage map, indicating which relations are
potentially heavily used.
10
2.3. Choose indexes
The objective here is to determine whether adding indexes will
improve the performance of the system.
One approach is to keep tuples unordered and create as many
secondary indexes as necessary.
Another approach is to order tuples in the relation by specifying
a primary or clustering index.
• In this case, choose the attribute for ordering or clustering
the tuples as:
– Attribute that is used most often for join operations -
this makes join operation more efficient, or
– Attribute that is used most often to access the tuples in
a relation in order of that attribute.
12
2.4. Estimate disk space and system requirement
The objective here is to estimate the amount of disk space
that will be required by the database.
Purpose is to answer the following questions:
If system already exists: is there adequate storage?
If procuring new system: what storage will be required?
13
4. Design security mechanisms
To design the security measures for the database as specified
by the users.
System security – Authentication
Data security-authorizations
15
6. Monitoring and Tuning the operational system
The objective here is to monitor operational system and improve
performance of system to correct inappropriate design decisions
or reflect changing requirements.
Importance of monitoring and tuning the operational system
Avoids procurement of additional hardware
Down size the hardware configuration….> less and cheaper
hardware ..> less expensive maintenance.
Faster response time and high throughput >> more
productive
Faster response time .>>good staff moral, customer
satisfaction
16