DimensionalityModeling 2023
DimensionalityModeling 2023
DimensionalityModeling 2023
Modelling
Advanced databases
Dr David Hamill
Content
Dimensionality Modelling
• Definitions
• Schemas
• Dimensional Model vs Entity Relationship
• Dimensional Modeling Stage of Kimball’s Business Dimensional Lifecycle
Definitions – Dimensional Models
• Dimensionality Modelling (DM): a logical design
technique to present data in a standard, intuitive form
to allow high-performance access.
• A DM forms a ‘star-like’ structure, called a star schema
or star join.
Definitions - Keys
• A DM is composed of one table with
composite primary keys, called a fact
table, and a set of smaller tables called
dimension tables.
• Simple (non-composite) primary key: Each
dimension table has one of these that
corresponds to one component from the
composite key in the fact table.
Definitions - Keys
• Natural keys are replaced with surrogate
keys. Meaning that all dimension and fact
table joins are done using the surrogate
keys instead.
• This is so the warehouse can
independently decide how to store data
instead of using the OLTP indexes.
Definitions - Facts
• Star schema’s are logical structures containing
fact tables in the center, surrounded by
denormalized dimension tables.
• Facts are generated by events occurring in the
past.
• Most data is contained in the fact table.
• Once inserted the facts should be read-only.
• Useful fact tables are usually numerical, or
additive.
Definitions
– Dimension tables
• Dimension tables usually contain
descriptive textual information.
• Dimension attributes are used as
constraints in data warehouse queries.
• Star schemas can be used to speed up
query performance by denormalizing
reference information into a single
dimension table.
Snowflake schema
• Snowflake schema: a variant of star
schema that has a fact table at the center,
surrounded by normalized dimension
tables.
• Starflake schema: a hybrid structure that
contains a mixture of star and snowflake
dimension tables.
Dimensional Modelling Advantages
Star, Snowflake, and starflake representations have the following advantages:
• Efficiency
• Handling of changing requirements
• All dimensions are equivalent to access to the fact table
• Extensibility
• Adding new facts
• Adding dimensions
• Added dimensional attributes
• Breaking down to lower levels of granularity
• Ability to model business situations
• Predictable query processing
Comparing DM and ER models
Entity Relationship Dimensional Model
• A single ER model normally • Multiple DMs are associated
decomposes to multiple DMs. through ‘shared’ dimension
• Used for identifying relations tables.
among entities to remove • Attraction is the high
redundancy. performance of ad-hoc user
queries.
Dimensional Modeling Stage of Kimball’s
Business Dimensional Lifecycle
• This stage can result in the creation of a DM for a data mart or to
dimensionalize a schema for an OLTP.
• Starts by defining high-level dimensional model, which progressively gains
detail and is achieved in a two-phased approach:
1. Creation of a high-level DM – this has 4 steps