Physical Database Design: Chapter Four

Download as pdf or txt
Download as pdf or txt
You are on page 1of 16

Chapter Four

Physical Database Design

topic of physical database design :


• What are implemented within a particular DBMS
• What techniques for storing and finding data exist

1
1. Introduction
 We have established that there are three levels of database
design:
 Conceptual design: producing a data model which accounts
for the relevant entities and relationships within the target
application domain;
 Logical design: ensuring, via normalization procedures and
the definition of integrity rules, that the stored database
will be non-redundant and properly connected;
 Physical design: will be the focus of this chapter

2
the purpose of physical database design
1. How to map the logical database design to a physical database design.
2. How to design enterprise constraints for target DBMS.
3. How to select appropriate file organizations based on analysis of
transactions.
4. When to use secondary indexes to improve performance.
5. How to estimate the size of the database
6. How to design security mechanisms to satisfy user requirements.
7. How to design procedures and triggers.

3
 Steps in physical database design
1. Translate logical data model for target DBMS
1.1. Design base relation
1.2. Design representation of derived data
1.3. Design enterprise constraint
2. Design physical representation
2.1. Analyze transactions
2.2. Choose file organization
2.3. Choose indexes
2.4. Estimate disk space and system requirement
3. Design user view
4. Design security mechanisms
5. Consider controlled redundancy
6. Monitor and tune the operational system

4
1. Translate logical data model for target DBMS
 This phase is the translation of the global logical data model
to produce a relational database schema in the target DBMS.
 Knowledge of the DBMS includes:
 how to create base relations
 whether the system supports:
• definition of Primary key
• definition of Foreign key
• definition of Alternate key(Unique keys)
• definition of Domains
• Referential integrity constraints
• definition of enterprise level constraints
Remark: The implementation of the physical model is dependent on the target DBMS
since some has more facilities than the other in defining database definitions. 5
1.1. Design base relation
 Designing base relation involves identification of all necessary
requirements about a relation starting from the name up to the
referential integrity constraints.
 For each relation, we need to define:
• The name of the relation;
• A list of simple attributes in brackets;
• The PK and, where appropriate.
• A list of any derived attributes and how they should be
computed;
• Referential integrity constraints for any FKs identified.
 For each attribute, need to define:
 Its domain, consisting of a data type, length, and any constraints
on the domain;
 An optional default value for the attribute;
 Whether the attribute can hold nulls.
 Whether the attribute can be derived , if do how it should be
computed

6
1.2. Design representation of derived data
 While analyzing the requirement of users, we may encounter that
there are some attributes holding data that will be derived from
existing or other attributes. A decision on how to represent any
derived data present in the global logical data model in the target
DBMS should be devised.

 Most of the time derived attributes are not expressed in the logical
model but will be included in the data dictionary. Whether to store
derived attributes in a base relation or calculate them when required is
a decision to be made by the designer considering the performance
impact.
 Option selected is based on:
 Additional cost to store the derived data and keep it consistent
with operational data from which it is derived;
 Cost to calculate it each time it is required.

7
1.3. Design enterprise constraint
 Data in the database is not only subjected to constraints on the
database and the data model used but also with some
enterprise dependent constraints.
 One need to know the functionalities of the DBMS since in
designing the enterprise constraints for the target DBMS some
DBMS provide more facilities than others.
2. Design physical representation
 This phase is the level for determining the optimal file organizations to
store the base relations and the indexes that are required to achieve
acceptable performance;
 Number of factors that may be used to measure efficiency:
 Transaction throughput: number of transactions processed in
given time interval.
 Response time: elapsed time for completion of a single
transaction.
 Disk storage: amount of disk space required to store database files.
8
2.1. Analyze transactions
 The objective here is to understand the functionality of the
transactions that will run on the database and to analyze the
important transactions.
 Attempt to identify performance criteria, e.g.:
 Transactions that run frequently and will have a significant
impact on performance;
 Transactions that are critical to the business;
 Times during the day/week when there will be a high demand
made on the database (called the peak load).
 Use this information to identify the parts of the database that may
cause performance problems.
 To select appropriate file organizations and indexes, also need to
know high level functionality of the transactions, such as:
 Attributes that are updated in an update transaction;
 Criteria used to restrict tuples that are retrieved in a query. 9
 To help identify which transactions to investigate, can use:
 Transaction/relation cross-reference matrix, showing
relations that each transaction accesses, and/or
 Transaction usage map, indicating which relations are
potentially heavily used.

2.2. Choose file organization


 The objective here is to determine an efficient file organization for
each base relation .
 File organizations include Heap, Hash, Indexed Sequential office
Access Method (ISAM), B+-Tree, and Clusters. /reading assignment/
 Most DBMSs provide little or no option to select file organization.
 However, they prove the user with an option to select an index for
every relation

10
2.3. Choose indexes
 The objective here is to determine whether adding indexes will
improve the performance of the system.
 One approach is to keep tuples unordered and create as many
secondary indexes as necessary.
 Another approach is to order tuples in the relation by specifying
a primary or clustering index.
• In this case, choose the attribute for ordering or clustering
the tuples as:
– Attribute that is used most often for join operations -
this makes join operation more efficient, or
– Attribute that is used most often to access the tuples in
a relation in order of that attribute.

 If ordering attribute chosen is on the primary key of a relation,


index will be a primary index; otherwise, index will be a clustering 11
index.
Guidelines for Choosing Indexes
1. Do not index small relations.
2. Index PK of a relation if it is not a key of the file organization.
3. Add secondary index to a FK if it is frequently accessed.
4. Add secondary index to any attribute that is heavily used as asecondary
key.
5. Add secondary index on attributes that are involved in:
 selection or join criteria; ORDER BY; GROUP BY;
 and other operations involving sorting (such as UNION or DISTINCT).
6. Add secondary index on attributes involved in built-in functions.
7. Add secondary index on attributes that could result in an index only plan.
8. Avoid indexing an attribute or relation that is frequently updated.
9. Avoid indexing an attribute if the query will retrieve a significant
proportion of the tuples in the relation.
10. Avoid indexing attributes that consist of long character strings.

12
2.4. Estimate disk space and system requirement
 The objective here is to estimate the amount of disk space
that will be required by the database.
 Purpose is to answer the following questions:
 If system already exists: is there adequate storage?
 If procuring new system: what storage will be required?

3. Design user view


 To design the user views that was identified during the
Requirements Collection and Analysis stage of the relational
database application development lifecycle.
 Define views in DDL to provide user views identified in data
model Map onto objects in physical data model

13
4. Design security mechanisms
 To design the security measures for the database as specified
by the users.
 System security – Authentication
 Data security-authorizations

5. Consider the Introduction of Controlled Redundancy


 The objective here is to determine whether introducing
redundancy in a controlled manner by relaxing the normalization
rules will improve the performance of the system
 This is sometimes known as denormalization (merging of
relations)
 Result of normalization is a logical database design that is
structurally consistent and has minimal redundancy.
 However, sometimes a normalized database design does not
provide maximum processing efficiency.
14
 Consider denormalization in following situations, specifically
to speed up frequent or critical transactions:
 Step 1 Combining 1:1 relationships
 Step 2 Duplicating non-key attributes in 1:* relationships
to reduce joins
 Step 3 Duplicating foreign key attributes in 1:*
relationships to reduce joins

15
6. Monitoring and Tuning the operational system
 The objective here is to monitor operational system and improve
performance of system to correct inappropriate design decisions
or reflect changing requirements.
 Importance of monitoring and tuning the operational system
 Avoids procurement of additional hardware
 Down size the hardware configuration….> less and cheaper
hardware ..> less expensive maintenance.
 Faster response time and high throughput >> more
productive
 Faster response time .>>good staff moral, customer
satisfaction

16

You might also like