Er Model
Er Model
Er Model
Design of a Database
Design phases:
Requirement Analysis
Conceptual Database Design
Talk to people and figure out what they want Do the design Many tools/modeling techniques
Actual database tables in relational model, or OO model or XML model Here only relational model.
What are the entities and relationships in the enterprise? What information about these entities and relationships should we store in the database? What are the integrity constraints or business rules that hold? A database `schema in the ER Model can be represented pictorially (ER diagrams). Can map an ER diagram into a relational schema.
Entity-Relationship Model
Entity
Sets Relationship Sets Mapping Constraints Keys E-R Diagram Extended E-R Features Design Issues Design of an E-R Database Schema Reduction of an E-R Schema to Tables
Entity Sets
An entity is an object that exists and is distinguishable from other objects. Example: specific person, company, event, plant Entities are described using attributes Example: people have names and addresses An entity set is a set of entities of the same type that share the same properties. Example: set of all persons, companies, trees, holidays
Entity Sets customer and loan customer-id customer- customer- customerloan- amount
name street city number
Attributes
An entity is represented by a set of attributes, that is descriptive properties possessed by all members of an entity set. Example: customer = (customer-id, customer-name, customer-street, customer-city) loan = (loan-number, amount) Domain the set of permitted values for each attribute Keys: Minimal set of attributes whose values uniquely identify an entity in the set
Candidate Keys: all sets of attributes that can potentially be a key. Primary Key: One of the candidate keys is chosen to be a primary key.
Relationship Sets
account entity
A relationship set is a mathematical relation among n 2 entities, each taken from entity sets {(e1, e2, en) | e1 E1, e2 E2, , en En}
where (e1, e2, , en) is a relationship
Example:
There can be multiple relationship sets between the same two entities. A relationship must be uniquely identified by the participating entities.
Descriptive Attributes
Descriptive attributes: used to record information about the relationship When was the last time that the customer accessed his/her account.
E-R Diagrams
Rectangles represent entity sets. Diamonds represent relationship sets. Lines link attributes to entity sets and entity sets to relationship
sets.
Ellipses represent attributes Underline indicates primary key attributes (coming up)
Ternary Relationships
Ternary relationships - used to record associations between three entity sets. Example: Each branch has several jobs that can be worked on by
For this we need to record the association between employees, branches and jobs.
Constraints in ER
Key
Key Constraints
Consider depositor relationship: A customer can deposit into many accounts; an account can have many depositors. Compare with: Each department has at most one Manager
Contrast with: Each customer can be the borrower on one loan. However, each loan can have many borrowers. The restriction that each customer can be borrower on one loan => Key Constraint
Key Constraint II
Relationship set like borrower - sometimes said to be one-to-many Relationship set between customers and accounts -> many-to-many
Additional Restriction: a loan may be borrowed by only one customer -> one-to-one
Means same thing! Implies that customer entity participates in the borrower relationship set only once.
Participation Constraints
Total participation (indicated by double/thick line): every
entity in the entity set participates in at least one relationship in the relationship set E.g. participation of loan in borrower is total every loan must have a customer associated to it via borrower Partial participation: some entities may not participate in any relationship in the relationship set E.g. participation of customer in borrower is partial Not every customer has a loan
Keys
A super key of an entity set is a set of one or more attributes whose values uniquely determine each entity. A candidate key of an entity set is a minimal super key
Although several candidate keys may exist, one of the candidate keys is selected to be the primary key.
Assumption so far:
Attributes associated with an entity contain a key (to uniquely identify the entities)
Employees can purchase policies to cover their dependents. We need to record information about policies:
Who is covered, Who owns the policy
Dont really care about the dependents beyond that If employee quits, policy is deleted and coverage for dependents stopped! An entity set that does not have a primary key is referred to as a weak entity set.
Weak entity is uniquely identified by a conjunction of some of its attributes and the primary key of another entity - Identifying entity set
Restrictions:
it must relate to the identifying entity set via a one-tomany relationship set from the identifying to the weak entity set It must have total participation in the identifying relationship set.
We depict a weak entity set by double rectangles. We underline the discriminator of a weak entity set with a dashed line. payment-number discriminator of the payment entity set Primary key for payment (loan-number, paymentnumber)
Design choices:
Should a concept be modeled as an entity or an attribute? Should a concept be modeled as an entity or a relationship? Identifying relationships: Binary or ternary? Aggregation? A lot of data semantics can (and should) be captured. But some constraints cannot be captured in ER diagrams.
Remember attribute values are atomic (cannot be broken down further) Should address be an attribute of Employees or an entity (connected to Employees by a relationship)? Depends upon the use of address information, and the semantics of the data:
If we have several addresses per employee, address must be an entity (since attributes cannot be set-valued). If address is to be shared by many employees, address should be an entity. If the structure (city, street, etc.) is important, e.g., we want to retrieve employees in a given city, address must be modeled as an entity (since attribute values are atomic).
Works_In2 does not did ssn lot budget allow an employee to Departments Works_In2 Employees work in a department for two or more periods. Similar to the problem of wanting to record several addresses for an name dname ssn lot did budget employee: we want to record several values of Works_In3 Departments Employees the descriptive attributes for each instance of this Duration to from relationship.
name
First ER diagram OK if name a manager gets a ssn separate discretionary Employees budget for each dept. What if a manager gets a discretionary budget name ssn that covers all managed depts? Employees
since lot
dbudget
dname did budget Departments
Manages2
Redundancy of dbudget, which is stored for each dept managed by the manager.
Consider Figure 1 - Whatname lotit does ssn depict? Employees Additional constraints:
Dependents
Beneficiary A policy cannot be owned jointly by two employees Policies Every policy must be owned by some employee policyid cost Dependents is a weak entity set - uniquely identified by policyId
Purchaser
Binary vs Ternary
Better Solution
name ssn Employees lot Covers pname age Dependents
Policies
policyid name ssn lot cost pname age Dependents Employees
Purchaser
Beneficiary
Policies
policyid
cost
in C++ or Java, attributes are inherited If we declare A ISA B, every A entity is also considered to be a B entity.
Overlap Constraints: Can Joe be both an employee and a customer? (Allowed/Disallowed) Does every employee entity also have to be an officer or teller or secretary entity? (Yes/No) Reasons for using ISA:
Aggregation
name
ssn
Employees
lot
Used when we have to model a relationship involving (entitity sets and) a pid relationship set.
Monitors
until
started_on
pbudget
since did
dname
budget
Aggregation allows Sponsors Departments Projects us to treat a relationship set as an entity set for Aggregation vs. ternary relationship: purposes of Monitors is a distinct relationship, participation in (other) relationships.with a descriptive attribute.
Also, can say that each sponsorship is monitored by at most one employee.
Yields a high-level description of data to be stored Constructs are expressive, close to the way people think about their applications.
Basic constructs: entities, relationships, and attributes (of entities and relationships). Some additional constructs: weak entities, ISA hierarchies, and aggregation. Note: There are many variations on ER model.
Summary of ER (Contd.)
Several kinds of integrity constraints can be expressed in the ER model: key constraints, participation constraints, and overlap/covering constraints for ISA hierarchies. Some foreign key constraints are also implicit in the definition of a relationship set.
Some constraints (notably, functional dependencies) cannot be expressed in the ER model. Constraints play an important role in determining the best database design for an enterprise.
Summary of ER (Contd.)
ER design is subjective. There are often many ways to model a given scenario! Analyzing alternatives can be tricky, especially for a large enterprise. Common choices include:
Entity vs. attribute, entity vs. relationship, binary or n-ary relationship, whether or not to use ISA hierarchies, and whether or not to use aggregation.
Ensuring good database design: resulting relational schema should be analyzed and refined further. FD information and normalization techniques are especially useful.
UML
UML:
Unified Modeling Language UML has many components to graphically model different aspects of an entire software system UML Class Diagrams correspond to E-R Diagram, but several differences.
Entity sets are shown as boxes, and attributes are shown within the box, rather than as separate ellipses in E-R diagrams. Binary relationship sets are represented in UML by just drawing a line connecting the entity sets. The relationship set name is written adjacent to the line. The role played by an entity set in a relationship set may also be specified by writing the role name on the line, adjacent to the entity set. The relationship set name may alternatively be written in a box, along with attributes of the relationship set, and the box is connected, using a dotted line, to the line depicting the relationship set. Non-binary relationships cannot be directly represented in UML -they have to be converted to binary relationships.
Cardinality constraints are specified in the form l..h, where l denotes the minimum and h the maximum number of relationships an entity can participate in. Beware: the positioning of the constraints is exactly the reverse of the positioning of constraints in E-R diagrams. The constraint 0..* on the E2 side and 0..1 on the E1 side means that each E2 entity can participate in at most one relationship, whereas each E1 entity can participate in many relationships; in other words, the relationship is many to one from E2 to E1. Single values, such as 1 or * may be written on edges; The single value 1 on an edge is treated as equivalent to 1..1, while * is equivalent to 0..*.