Lecture2-Conceptual Modelling
Lecture2-Conceptual Modelling
Lecture2-Conceptual Modelling
• At instance level: members form a balanced tree (all the branches have the same length)
• All parent members have at least one child member, and a child belongs exactly
to one parent
Unbalanced Hierarchies
• At schema level: only one path where all parent-child relationships are many-
to-one, but some are optional (e.g., not all branches have agencies and not all
agencies have ATMs)
• Schema level
• At instance level: Path length from the leaves to the same parent can be
different for different members
Alternative Hierarchies
• At schema level: Multiple nonexclusive hierarchies that share at least the leaf
level and account for the same analysis criterion
• Users must choose only one of the alternative paths for their analysis
• At instance level: Members form graph (child has more than one parent
• Fiscal year, here, starts in Feb
Generalised vs. Alternative Hierarchies
• Both hierarchies
– Share some levels
– Use one analysis criterion
• A child member
– Related to only one path in generalized hierarchies
– Related to all paths in alternative hierarchies and users must choose
one for analysis
Parallel Hierarchies
• Dimension has several hierarchies accounting for different analysis criteria
• Two different types
– Parallel independent hierarchies
– Parallel dependent hierarchies
• Parallel independent hierarchies
– Composed of disjoint hierarchies, i.e., hierarchies that do not share levels
Parallel Hierarchies
• Parallel dependent hierarchies
• Composed of several hierarchies that account for different analysis criteria and
share some levels
Alternative vs. Parallel Hierarchies
• Both hierarchies
– Share some levels
– May include several hierarchies
• Criterion
– Only one for alternative hierarchies
– Several for parallel hierarchies
• Combining hierarchies
– Meaningless for alternative hierarchies
– Useful for parallel hierarchies (e.g., Sales for stores in city A that
belongs to sales district B).
• Reusing aggregated measures for common levels
– Can be done for alternative hierarchies
– Cannot be done for parallel hierarchies
Alternative vs. Parallel Hierarchies
• Aggregated measure for the Month
level can be reused between both
paths
• Traversing the Calendar hierarchy
from a specific day in the Time level
will end up in the same year
independently of which path is used.
• Employees may work in several sections. Salary of employee refers to the sum
of the salaries paid in each section.
• Distributing factor determines how measures are divided between several
parent members (e.g., percentage of time spent in each section)
• Distributing factor is not always known
– Percentage of time that an employee works in a section must be added to schema
• Sometimes this distribution is impossible to specify
– E.g., participation of customer in joint account
• Distributing factor can be approximated by considering the total number of
parent members with which the child member is associated
– If an employee works in three sections, 1/3 of the value of the measure
aggregated for each one
MultiDim Conceptual Schema of the
Northwind DW
Conceptual Design of the Northwind DW
• Conceptual schema is based on the queries and on the table summarising requirements
• Source data are organized into orders: must transform orders data into sales facts
during the ETL process
• Measures Quantity, UnitPrice, Discount, SalesAmount obtained directly from the
sources; Freight is produced in the ETL; NetAmount is derived from the data cube
(calculated from the other measures)
• Aggregate functions also specified, following the requirements
• Orders are associated with different time instants: Time dimension participates in the
Sales fact with roles OrderDate, DueDate, and ShippedDate (not indicated in the
requirements table)
• Time dimension contains four aggregation levels.
• Dimension Product and parent level Category follows from query 3(a)
• Many-to-many relationship between Employee and City, discovered analysing the
content of the source database in the requirements phase
• For HR analysis (queries 4(a) to 4(c)), we need to analyse sales by employee
supervisors, a recursive hierarchy Supervision in dimension Employee
• The dimensions Customer and Supplier share the geographic hierarchy starting with
City. This information is incomplete in the operational database → data for State,
Region, Country, and Continent must be obtained from an external source.
Summary
• What you need to take away from this lecture:
1. Be able to translate requirements into a conceptual
multidimensional model
2. Be able to read and create a conceptual
multidimensional model
3. Be able to distinguish between the different types of
hierarchies
• Further Reading: Chapter 4 and Chapter 10
(Sections 1 to 4) of Textbook
Tutorial
• Exercise 1: Consider a time dimension composed of two
alternative hierarchies: (a) day, month, quarter, and year and (b)
day, month, bimester (2 months), and year. Design the conceptual
schema of this dimension and show examples of instances.