Lec4

Download as pdf or txt
Download as pdf or txt
You are on page 1of 47

Database Systems I

CMPT 354 Summer 2024


Zhengjie Miao

22 May 2024
Assignment 1 Setup
• Due May 26 May 29
• Please start now, if not already
• Problem 1: either local radb or RATest
• RATest is ready (with occasionally server deployment issues)
• Post any issues you encounter on Piazza
• But you are still able to work on this problem without the tools
• Will have a more detailed document tonight or tomorrow

2
Outline
• Relational Model continued

• E/R Basics: Entities & Relationships

• Advanced E/R Concepts

3
Recap: Relational Model
• A database is a collection of relations (or tables)
• Each relation has a set of attributes (or columns)
• Each attribute has a name and a domain (or type)
• Each relation contains a set of tuples (or rows)

4
Keys
• A set of attributes 𝐾 is a key for a relation 𝑅 if
• In no instance of 𝑅 will two different tuples agree on all attributes of 𝐾
• That is, 𝐾 can serve as a “tuple identifier”
• No proper subset of 𝐾 satisfies the above condition
• That is, 𝐾 is minimal
• Example: User (uid, name, age, pop)
• uid is a key of User
• age is not a key (not an identifier)
• {uid, name} is not a key (not minimal)

5
Schema vs. Instance
uid name age pop
142 Bart 10 0.9
123 Milhouse 10 0.2
857 Lisa 8 0.7
456 Ralph 8 0.3

• Is name a key of User?


• Yes? Seems reasonable for this instance
• No! User names are not unique in general
• Key declarations are part of the schema

6
More examples of keys
• Member (uid, gid)
• {uid, gid}
• A key can contain multiple attributes

7
More examples of keys
• Member (uid, gid)
• {uid, gid}
• A key can contain multiple attributes
• Address (street_address, city, state, zip)
• {street_address, city, state}
• {street_address, zip}
• A relation can have multiple keys!
• We typically pick one as the “primary” key, and underline all its attributes, e.g.,
Address (street_address, city, state, zip)

8
Use of keys
• More constraints on data, fewer mistakes
• Look up a row by its key value
• Many selection conditions are “key = value”
• “Pointers” to other rows (often across tables)
• Example: Member (uid, gid)
• uid is a key of User
• gid is a key of Group
• A Member row “links” a User row with a Group row
• Many join conditions are “key = key value stored in another table”

9
Outline
• Relational Model continued

• E/R Basics: Entities & Relationships

• Advanced E/R Concepts

10
Database design
• Understand the real-world domain being modeled
• Specify it using a database design model
• More intuitive and convenient for schema design
• But not necessarily implemented by DBMS
• A few popular ones:
• Entity/Relationship (E/R) model
• Object Definition Language (ODL)
• UML (Unified Modeling Language)
• Translate specification to the data model of DBMS
• Relational, XML, object-oriented, etc.
• Create DBMS schema
11
Motivation
• How to figure out this database design?
User Group
uid name age pop gid name
142 Bart 10 0.9 abc Book Club
… … … … … …

Member
uid gid
142 dps
… …
• What tables to create?
• Which attributes should be added to each table?
• What are the relationships between the tables?
12
But what about ORM?
• Automatic object-relational mappers are made popular by rapid
Web development frameworks
• For example, with Python SQLAlchemy:
• You declare Python classes and their relationships
• It automatically converts them into database tables
• If you want, you can just work with Python objects, and never need to be aware of
the database schema or write SQL
• But you still need designer discretion in all but simple cases
• Each language/library has its own syntax for creating schema and
for querying/modifying data
• Quirks and limitations cause portability problems
• They are not necessarily easier to learn than SQL
13
History of E/R Model
• E/R Model (Entity-Relationship Modeling)
• Codd wrote a long letter criticizing paper
• Many people suggested him to give up this idea

Dr. Peter Chen

• Why not build RDBMS based on E/R Model?


• No query language proposed
• Relational DBMS in the 1970’s
14
Entity-relationship (E/R) model
• Historically and still very popular
• Concepts applicable to other design models as well
• Can think of as a “watered-down” object-oriented design model
• Primarily a design model — not directly implemented by DBMS
• Designs represented by E/R diagrams
• We use the style of E/R diagram covered by the GMUW book; there are
other styles/extensions
• Very similar to UML diagrams

15
Database Design Process

1. Requirements Analysis 2. Conceptual Design 3. Logical, Physical, Security, etc.

1. Requirements analysis
• What data is going to be stored?
• What are we going to do with the data?
• Who should access the data?

Technical and non-technical


people are involved
17
Database Design Process

1. Requirements Analysis 2. Conceptual Design 3. Logical, Physical, Security, etc.

2. Conceptual Design
• A high-level description of the database
• Sufficiently precise that technical people can understand it
• But, not so precise that non-technical people can’t participate

This is where E/R fits in


18
Database Design Process

1. Requirements Analysis 2. Conceptual Design 3. Logical, Physical, Security, etc.

3. More:
• Logical Database Design
• Physical Database Design
• Security Design

19
Database Design Process

1. Requirements Analysis 2. Conceptual Design 3. Logical, Physical, Security, etc.

E/R Model & Diagrams used

uid gid
Users IsMemberOf Groups
name name

E/R is a visual syntax for DB design, which is precise


enough for technical points but abstracted enough
20
Entities and Entity Sets
• Entity: a “thing,” like an object
• Entity set: a collection of things of the same type, like a
relation of tuples or a class of objects
• These are what is shown in E/R diagrams - as rectangles
• Eg: A specific person or product

Users Groups

21
Attributes
• Attributes: properties of entities or relationships, like
attributes of tuples or objects
• Represented by ovals attached to an entity set

uid gid
Users Groups
name name

22
Keys
• A key of an entity set is represented by underlining all
attributes in the key
• A key is a set of attributes whose values can belong to at most
one entity in an entity set—like a key of a relation
• Every entity set must have a key
• Denote by underlining.

uid gid
Users Groups
name name

23
Relationships
• Relationship: an association among entities
• Relationship set: a set of relationships of the same type (among
same entity sets)
• Represented as a diamond

uid gid
Users IsMemberOf Groups
name name

24
What is a Relationship exactly?
• Relationship: an association among entities
• A mathematical definition:
• Let A, B be sets B=
A= 1 a
• A={1,2,3}, B={a,b,c,d} b
2
c
3
d

25
What is a Relationship exactly?
• Relationship: an association among entities
• A mathematical definition:
• Let A, B be sets B=
A= 1 a
• A={1,2,3}, B={a,b,c,d} b
2
• A x B (the cross-product) is the set of all pairs (a,b)
c
• A ´ B = {(1,a), (1,b), (1,c), (1,d), (2,a), (2,b), (2,c), (2,d), (3,a), 3
(3,b), (3,c), (3,d)} d

26
What is a Relationship exactly?
• Relationship: an association among entities
• A mathematical definition:
• Let A, B be sets B=
A= 1 a
• A={1,2,3}, B={a,b,c,d} b
2
• A x B (the cross-product) is the set of all pairs (a,b)
c
• A ´ B = {(1,a), (1,b), (1,c), (1,d), (2,a), (2,b), (2,c), (2,d), (3,a), 3
(3,b), (3,c), (3,d)} d
• We define a relationship to be a subset of A x B
• R = {(1,a), (2,c), (2,d), (3,b)}

27
What is a Relationship exactly?
User Group User × Group
uid name age pop gid name uid u.name age pop gid g.name
123 Milhouse 10 0.2 abc Book Club 123 Milhouse 10 0.2 abc Book Club
857 Lisa 8 0.7
× gov Student Government 123 Milhouse 10 0.2 gov Student Government
… … … … dps Dead Putting Society 123 Milhouse 10 0.2 dps Dead Putting Society
… … 857 Lisa 8 0.7 abc Book Club
857 Lisa 8 0.7 gov Student Government
857 Lisa 8 0.7 dps Dead Putting Society
… … … … … …
uid gid
Users IsMemberOf Groups
name name

Member uid gid


123 gov
A relationship between entity sets User and Group
857 abc
• A subset of all possible combinations
857 gov
• with tuples uniquely identified by keys
… …
28
Attributes of relationships
• Example: a user belongs to a group since a particular date
uid gid
Users IsMemberOf Groups
name name

fromDate
• Where do the dates go?
• With Users?
• But a user can join multiple groups on different dates
• With Groups?
• But different users can join the same group on different dates
• With IsMemberOf!

29
Outline
• Relational Model continued

• E/R Basics: Entities & Relationships

• Advanced E/R Concepts

30
More on relationships
• There could be multiple relationship sets between the same entity
sets
• Example: Users IsMemberOf Groups; Users Likes Groups
• In a relationship set, each relationship is uniquely identified by the
entities it connects
• Example: Between Bart and “Dead Putting Society”, there can be at most
one IsMemberOf relationship and at most one Likes relationship
• What if Bart joins DPS, leaves, and rejoins? How can we modify the design
to capture historical membership information?
• Make an entity set of MembershipRecords

31
Multiplicity of relationships
• 𝐸 and 𝐹: entity sets
• Many-many: Each entity in 𝐸 is related to 0 or more entities in 𝐹 and vice
versa
• Example: Users IsMemberOf Groups

• Many-one: Each entity in 𝐸 is related to 0 or 1 entity in 𝐹, but each entity


in 𝐹 is related to 0 or more in 𝐸
• Example: Groups IsOwnedBy Users

• One-one: Each entity in 𝐸 is related to 0 or 1 entity in 𝐹 and vice versa


• Example:
Users IsLinkedTo TwitterUsers

• “One” (0 or 1) is represented by an arrow


• “Exactly one” is represented by a rounded arrow
32
Roles in relationships
• An entity set may participate more than once in a relationship set
• May need to label edges to distinguish roles
• Examples
• Users may be parents of others; label needed
• Users may be friends of each other; label not needed

parent

IsFriendOf Users IsParentOf

child

33
𝑛-ary relationships
• Example: a user must have an initiator in
member
order to join a group
Users IsMemberOf Groups

initiator
Rule for interpreting an arrow into entity set 𝐸 in an 𝑛-ary
relationship:
• Pick one entity from each other entity set (excluding only 𝐸);
together they can relate to ≤ one entity in 𝐸
• Exercise: hypothetically, member
what do these two
Users IsMemberOf Groups
arrows imply?
initiator 34
𝑛-ary versus binary relationships
• Can we model 𝑛-ary relationships using just binary relationships,
like this?
Users IsMemberOf Groups

InitiatesFor
member initiator
O N G!
IsInitiatedBy
WR
• No; for example:
• Ralph is in both abc and gov
• Lisa has served as initiator in both abc and gov
• Ralph was initiated by Lisa in abc, but not by her in gov

35
Next: two special relationships
… is part of/belongs to …

… is a kind of …

http://blogs.library.duke.edu/renovation/files/2012/08/Rubenstein-Library-First-Floor-Floorplan.jpg 36
http://www.sharky-jones.com/Sharkyjones/Artwork/taxonomy%20artwork/Class1.jpg
Weak entities & supporting relationships
Sometimes, an entity’s identity depends on some others’
• The key of a weak entity set 𝐸 comes not completely from its own
attributes, but from the keys of one or more other entity sets
• 𝐸 must link to them via many-one or one-one relationship sets
• Example: Rooms inside Buildings are partly identified by Buildings’ name
• A weak entity set is drawn
𝐸
as a double rectangle
• The relationship sets through which
it obtains its key are called supporting
relationship sets, drawn as double diamonds

37
Weak entity set examples
• Seats in rooms in building
number name
Rooms In Buildings
capacity year

In

number
Seats
L/R?

• Why must double diamonds be many-one/one-one?


• With many-many, we would not know which entity provides the key value!

38
Remodeling 𝑛-ary relationships
• An 𝑛-ary relationship set can be replaced by a weak entity set
(called a connecting entity set) and 𝑛 binary relationship sets
member

Users IsMemberOf Groups

initiator

Users Member Memberships Group Groups

Initiator
How would you capture the multiplicity
constraint for IsMemberOf now?
Make Initiator regular instead of supporting
39
ISA relationships
• Similar to the idea of subclasses in object-oriented programming:
subclass = special case, fewer entities, and possibly more
properties
• Represented as a triangle (direction is important)
• Example: paid users are users, but they also get avatars (yay!)
uid gid
Users IsMemberOf Groups
name name

ISA fromDate

avatar PaidUsers Automatically “inherits” key, attributes,


relationships
40
Summary of E/R concepts
• Entity sets
• Keys
• Weak entity sets
• Relationship sets
• Attributes of relationships
• Multiplicity
• Roles
• Binary versus 𝑛-ary relationships
• Modeling 𝑛-ary relationships with weak entity sets and binary relationships
• ISA relationships

41
Case study 1
• Design a database representing cities, counties, and states
• For each state, record name and capital (city)
• For each county, record name, acreage, and state it’s in
• For each city, record name, population, and county/state it belongs to
• Assume the following:
• Names of states are unique
• Names of counties are only unique within a state
• Names of cities are only unique within a county
• A city is always in a single county
• A county is always in a single state

42
Case study 1: first design
name name
Cities In States
population capital

county_name

county_acreage

• County acreage information is repeated for every


city in the county
FRedundancy is bad (why?)
• State capital should really be a city
FShould “reference” entities through explicit
relationships
43
Case study 1: second design
name
Cities
population

In IsCapitalOf

name
Counties In States name
acreage

• Technically, nothing in this design prevents a city in


state 𝑋 from being the capital of another state 𝑌,
but oh well…

44
Case study 2
• Design a database consistent with the following:
• A station has a unique name and an address, and is either an express
station or a local station
• A train has a unique number and an engineer, and is either an express train
or a local train
• A local train can stop at any station
• An express train only stops at express stations
• A train can stop at a station for any number of times during a day
• Train schedules are the same every day

45
Case study 2: first design
number name

engineer Trains StopsAt Stations address

E/L? time E/L?

• Nothing in this design prevents express trains from stopping at


local stations
FWe should capture as many constraints as possible
• A train can stop at a station only once during a day
FWe should not introduce unintended constraints
46
Case study 2: second design
time
number name
Trains LocalTrainStops Stations
engineer address

ISA ISA

LocalTrains LocalStations

ExpressTrains ExpressStations
time No double-diamonds here
because train number + time
ExpressTrainStops uniquely determine a stop

Is the extra complexity worth it?


47
What’s Next
• Next class:
• More about Database design
• Assignment 1 due 5/29

48

You might also like