Lec4
Lec4
Lec4
22 May 2024
Assignment 1 Setup
• Due May 26 May 29
• Please start now, if not already
• Problem 1: either local radb or RATest
• RATest is ready (with occasionally server deployment issues)
• Post any issues you encounter on Piazza
• But you are still able to work on this problem without the tools
• Will have a more detailed document tonight or tomorrow
2
Outline
• Relational Model continued
3
Recap: Relational Model
• A database is a collection of relations (or tables)
• Each relation has a set of attributes (or columns)
• Each attribute has a name and a domain (or type)
• Each relation contains a set of tuples (or rows)
4
Keys
• A set of attributes 𝐾 is a key for a relation 𝑅 if
• In no instance of 𝑅 will two different tuples agree on all attributes of 𝐾
• That is, 𝐾 can serve as a “tuple identifier”
• No proper subset of 𝐾 satisfies the above condition
• That is, 𝐾 is minimal
• Example: User (uid, name, age, pop)
• uid is a key of User
• age is not a key (not an identifier)
• {uid, name} is not a key (not minimal)
5
Schema vs. Instance
uid name age pop
142 Bart 10 0.9
123 Milhouse 10 0.2
857 Lisa 8 0.7
456 Ralph 8 0.3
6
More examples of keys
• Member (uid, gid)
• {uid, gid}
• A key can contain multiple attributes
7
More examples of keys
• Member (uid, gid)
• {uid, gid}
• A key can contain multiple attributes
• Address (street_address, city, state, zip)
• {street_address, city, state}
• {street_address, zip}
• A relation can have multiple keys!
• We typically pick one as the “primary” key, and underline all its attributes, e.g.,
Address (street_address, city, state, zip)
8
Use of keys
• More constraints on data, fewer mistakes
• Look up a row by its key value
• Many selection conditions are “key = value”
• “Pointers” to other rows (often across tables)
• Example: Member (uid, gid)
• uid is a key of User
• gid is a key of Group
• A Member row “links” a User row with a Group row
• Many join conditions are “key = key value stored in another table”
9
Outline
• Relational Model continued
10
Database design
• Understand the real-world domain being modeled
• Specify it using a database design model
• More intuitive and convenient for schema design
• But not necessarily implemented by DBMS
• A few popular ones:
• Entity/Relationship (E/R) model
• Object Definition Language (ODL)
• UML (Unified Modeling Language)
• Translate specification to the data model of DBMS
• Relational, XML, object-oriented, etc.
• Create DBMS schema
11
Motivation
• How to figure out this database design?
User Group
uid name age pop gid name
142 Bart 10 0.9 abc Book Club
… … … … … …
Member
uid gid
142 dps
… …
• What tables to create?
• Which attributes should be added to each table?
• What are the relationships between the tables?
12
But what about ORM?
• Automatic object-relational mappers are made popular by rapid
Web development frameworks
• For example, with Python SQLAlchemy:
• You declare Python classes and their relationships
• It automatically converts them into database tables
• If you want, you can just work with Python objects, and never need to be aware of
the database schema or write SQL
• But you still need designer discretion in all but simple cases
• Each language/library has its own syntax for creating schema and
for querying/modifying data
• Quirks and limitations cause portability problems
• They are not necessarily easier to learn than SQL
13
History of E/R Model
• E/R Model (Entity-Relationship Modeling)
• Codd wrote a long letter criticizing paper
• Many people suggested him to give up this idea
15
Database Design Process
1. Requirements analysis
• What data is going to be stored?
• What are we going to do with the data?
• Who should access the data?
2. Conceptual Design
• A high-level description of the database
• Sufficiently precise that technical people can understand it
• But, not so precise that non-technical people can’t participate
3. More:
• Logical Database Design
• Physical Database Design
• Security Design
19
Database Design Process
uid gid
Users IsMemberOf Groups
name name
Users Groups
21
Attributes
• Attributes: properties of entities or relationships, like
attributes of tuples or objects
• Represented by ovals attached to an entity set
uid gid
Users Groups
name name
22
Keys
• A key of an entity set is represented by underlining all
attributes in the key
• A key is a set of attributes whose values can belong to at most
one entity in an entity set—like a key of a relation
• Every entity set must have a key
• Denote by underlining.
uid gid
Users Groups
name name
23
Relationships
• Relationship: an association among entities
• Relationship set: a set of relationships of the same type (among
same entity sets)
• Represented as a diamond
uid gid
Users IsMemberOf Groups
name name
24
What is a Relationship exactly?
• Relationship: an association among entities
• A mathematical definition:
• Let A, B be sets B=
A= 1 a
• A={1,2,3}, B={a,b,c,d} b
2
c
3
d
25
What is a Relationship exactly?
• Relationship: an association among entities
• A mathematical definition:
• Let A, B be sets B=
A= 1 a
• A={1,2,3}, B={a,b,c,d} b
2
• A x B (the cross-product) is the set of all pairs (a,b)
c
• A ´ B = {(1,a), (1,b), (1,c), (1,d), (2,a), (2,b), (2,c), (2,d), (3,a), 3
(3,b), (3,c), (3,d)} d
26
What is a Relationship exactly?
• Relationship: an association among entities
• A mathematical definition:
• Let A, B be sets B=
A= 1 a
• A={1,2,3}, B={a,b,c,d} b
2
• A x B (the cross-product) is the set of all pairs (a,b)
c
• A ´ B = {(1,a), (1,b), (1,c), (1,d), (2,a), (2,b), (2,c), (2,d), (3,a), 3
(3,b), (3,c), (3,d)} d
• We define a relationship to be a subset of A x B
• R = {(1,a), (2,c), (2,d), (3,b)}
27
What is a Relationship exactly?
User Group User × Group
uid name age pop gid name uid u.name age pop gid g.name
123 Milhouse 10 0.2 abc Book Club 123 Milhouse 10 0.2 abc Book Club
857 Lisa 8 0.7
× gov Student Government 123 Milhouse 10 0.2 gov Student Government
… … … … dps Dead Putting Society 123 Milhouse 10 0.2 dps Dead Putting Society
… … 857 Lisa 8 0.7 abc Book Club
857 Lisa 8 0.7 gov Student Government
857 Lisa 8 0.7 dps Dead Putting Society
… … … … … …
uid gid
Users IsMemberOf Groups
name name
fromDate
• Where do the dates go?
• With Users?
• But a user can join multiple groups on different dates
• With Groups?
• But different users can join the same group on different dates
• With IsMemberOf!
29
Outline
• Relational Model continued
30
More on relationships
• There could be multiple relationship sets between the same entity
sets
• Example: Users IsMemberOf Groups; Users Likes Groups
• In a relationship set, each relationship is uniquely identified by the
entities it connects
• Example: Between Bart and “Dead Putting Society”, there can be at most
one IsMemberOf relationship and at most one Likes relationship
• What if Bart joins DPS, leaves, and rejoins? How can we modify the design
to capture historical membership information?
• Make an entity set of MembershipRecords
31
Multiplicity of relationships
• 𝐸 and 𝐹: entity sets
• Many-many: Each entity in 𝐸 is related to 0 or more entities in 𝐹 and vice
versa
• Example: Users IsMemberOf Groups
parent
child
33
𝑛-ary relationships
• Example: a user must have an initiator in
member
order to join a group
Users IsMemberOf Groups
initiator
Rule for interpreting an arrow into entity set 𝐸 in an 𝑛-ary
relationship:
• Pick one entity from each other entity set (excluding only 𝐸);
together they can relate to ≤ one entity in 𝐸
• Exercise: hypothetically, member
what do these two
Users IsMemberOf Groups
arrows imply?
initiator 34
𝑛-ary versus binary relationships
• Can we model 𝑛-ary relationships using just binary relationships,
like this?
Users IsMemberOf Groups
InitiatesFor
member initiator
O N G!
IsInitiatedBy
WR
• No; for example:
• Ralph is in both abc and gov
• Lisa has served as initiator in both abc and gov
• Ralph was initiated by Lisa in abc, but not by her in gov
35
Next: two special relationships
… is part of/belongs to …
… is a kind of …
http://blogs.library.duke.edu/renovation/files/2012/08/Rubenstein-Library-First-Floor-Floorplan.jpg 36
http://www.sharky-jones.com/Sharkyjones/Artwork/taxonomy%20artwork/Class1.jpg
Weak entities & supporting relationships
Sometimes, an entity’s identity depends on some others’
• The key of a weak entity set 𝐸 comes not completely from its own
attributes, but from the keys of one or more other entity sets
• 𝐸 must link to them via many-one or one-one relationship sets
• Example: Rooms inside Buildings are partly identified by Buildings’ name
• A weak entity set is drawn
𝐸
as a double rectangle
• The relationship sets through which
it obtains its key are called supporting
relationship sets, drawn as double diamonds
37
Weak entity set examples
• Seats in rooms in building
number name
Rooms In Buildings
capacity year
In
number
Seats
L/R?
38
Remodeling 𝑛-ary relationships
• An 𝑛-ary relationship set can be replaced by a weak entity set
(called a connecting entity set) and 𝑛 binary relationship sets
member
initiator
Initiator
How would you capture the multiplicity
constraint for IsMemberOf now?
Make Initiator regular instead of supporting
39
ISA relationships
• Similar to the idea of subclasses in object-oriented programming:
subclass = special case, fewer entities, and possibly more
properties
• Represented as a triangle (direction is important)
• Example: paid users are users, but they also get avatars (yay!)
uid gid
Users IsMemberOf Groups
name name
ISA fromDate
41
Case study 1
• Design a database representing cities, counties, and states
• For each state, record name and capital (city)
• For each county, record name, acreage, and state it’s in
• For each city, record name, population, and county/state it belongs to
• Assume the following:
• Names of states are unique
• Names of counties are only unique within a state
• Names of cities are only unique within a county
• A city is always in a single county
• A county is always in a single state
42
Case study 1: first design
name name
Cities In States
population capital
county_name
county_acreage
In IsCapitalOf
name
Counties In States name
acreage
44
Case study 2
• Design a database consistent with the following:
• A station has a unique name and an address, and is either an express
station or a local station
• A train has a unique number and an engineer, and is either an express train
or a local train
• A local train can stop at any station
• An express train only stops at express stations
• A train can stop at a station for any number of times during a day
• Train schedules are the same every day
45
Case study 2: first design
number name
ISA ISA
LocalTrains LocalStations
ExpressTrains ExpressStations
time No double-diamonds here
because train number + time
ExpressTrainStops uniquely determine a stop
48