Ba 01 PDF

Download as pdf or txt
Download as pdf or txt
You are on page 1of 28

Business Analytics By: Hasan Sahily

Chapter 1
Introduction to Business Analytics

An Overview of Business Intelligence, Analytics, and Decision Support


Business Intelligence: A Managerial Perspective on Analytics (3rd Edition)
Copyright © 2014 Pearson Education,

The Business Environment


 The environment in which organizations operate today is becoming more
and more complex, creating Opportunities and problems. Example:
globalization.
 Business environment factors: markets, consumer demands, technology,
and societal…
Business Environment Factors

Factor Description
Markets Strong competition
Expanding global markets
growing electronic markets on the Internet
Innovative marketing methods
Opportunities for outsourcing with IT support
Need for real-time, on-demand transactions
Consumer Desire for customization
demand Desire for quality, diversity of products, and
speed of delivery
Customers getting powerful and less loyal
Technology More innovations, new products, and new
services
Increasing information overload
Social networking, HTML5, CSS, Bootstrap,
we services, ..
Societal Growing government regulations and
deregulation
Workforce more diversified, older, and
composed of more women
Security concerns and terrorist attacks
Necessity of legislation
Increasing social responsibility of companies
Greater emphasis on sustainability

1 | Page
Business Analytics By: Hasan Sahily

2 | Page
Business Analytics By: Hasan Sahily

Closing the Strategy Gap


One of the major objectives of computerized decision support is to facilitate
closing the gap between the current performance of an organization and its
desired performance, as expressed in its mission, objectives, and goals, and
the strategy to achieve them.

A Framework for Business Intelligence (BI/BA)


 BI is an evolution of decision support concepts over time
 BI systems are enhanced with additional visualizations, alerts, and
performance measurement capabilities

What is Business Intelligence?


 Business Intelligence (BI) applications are decision support systems that
provide information used to:
 Support internal enterprise users in the assessment, enhancement
and optimization of organizational performance & operation.
 Deliver critical business information to end-users about value chain
constituencies such as customers and supply-chain partners.
 End-users can utilize the BI tools to "drill-down" and "slice and dice" to
gain a better understanding of transactional and operational information; for
example, Star Schema and OLAP are usually used for provide this
functionality.

BI - How It All Works Together

Business Intelligence Software

3 | Page
Business Analytics By: Hasan Sahily

A set of tools that allow users to access enterprise data via reports, Online
Analytical Processing (OLAP) cubes, graphs/charts, ad-hoc queries and
dashboards, in addition to extracting new pattern of knowledge out of the
enterprise database. They allow users to view the data from all levels of the
enterprise and provide them with information necessary to make timely,
well-informed business decisions.

SOME COMPONENTS IN THE BI TOOLKIT


• Reports
• Cubes : provide Multi-dimensional view of data and interactive that give
the user the ability to change what is viewable on the fly
• Charts & Graphs
• Dashboards: graphical displays that provides business users with updates
on key metrics, drilldown and rollup capabilities allowing users to start
with summary data and dive in to the details

COMPONENTS IN A BI ARCHITECTURE
• Data warehouse stores the data. Originally, the data warehouse included
only historical data that was organized and summarized, so end users
could easily view or manipulate it. Today, some data warehouses include
access to current data as well, so they can provide real-time decision
support
• Business analytics are the tools that help the user transform data into
knowledge (e.g., queries, data/text mining tools, etc.)
• Business Performance Management (BPM), provides enterprises tools
they need to better manage their operations
• User Interface (i.e., dashboards) provide a comprehensive graphical view
of corporate performance measures, trends, and exceptions.
TRANSACTION PROCESSING VERSUS ANALYTIC
PROCESSING
Online Transaction processing Online analytic processing
systems (OLTP) (OLAP) systems
Handles updates Are involved in extracting
(add/edit/delete) to information from data stored by
operational databases OLTP systems
 ATM withdrawal  Routine sales reports by
transaction, sales order product, by region, by sales
person, by …

4 | Page
Business Analytics By: Hasan Sahily

entry via an ecommerce  Often built on top of a data


site – updates DBs warehouse where the data is not
 OLTP – handles routine transactional
on-going business  Main goal is to provide correct
 ERP, SCM, CRM information in a timely manner
systems generate and store
data in OLTP systems

BI - ALIGNMENT WITH BUSINESS STRATEGY


 To be successful, BI must be aligned with the company’s business strategy
 BI changes the way a company conducts business by improving business
processes, and transforming decision making to a more data/fact/information
driven activity
 BI should help execute the business strategy

5 | Page
Business Analytics By: Hasan Sahily

Chapter 2
Business Intelligence and data
warehouse
BI Framework
• Derive data from OLTP
• Equipped with an ETL tool that select, extract, transform, integrate, and Load the data
the Data warehouse / Data mart.
• Dashboards, scoreboards, OLAP, data/text mining, reports.

Characteristics of data warehouse “DW”


* DW is a relational DB designed for query and analysis and not for transaction
processing.
* Main characteristics of DW: subject oriented, integrated, non volatile, time variant.

1- Subject oriented: DW concentrates on a specific subject at a time. If DW contains a


big amount of data concerning different subjects then it is preferable to decompose them
into subjects by building small data warehouse called “data mart”.
2- Integrated: Data comes from different heterogeneous systems. They must be
transformed and integrated into a unique format. Then they will be loaded into the DW.
Ex: Cobol, Access DB, Oracle, SQL Server, Pervasive DB, My SQL, … and many
others.
3- Non volatile: Where we load data into DW, we are not allowed to change them. Data
is read-only (non volatile).
6 | Page
Business Analytics By: Hasan Sahily

4- Time variant: To make analysis and prediction, DW must have historical up-to-date
data. DW focuses on change over time (time variant).

Operational and Analytical System:


DW
Transaction Processing System
Transaction Processing System
TPS
TPS
OLAP

Transaction Processing System


TPS

* Operational systems (called online transaction processing OLTP or transaction


processing system TPS): Systems that run daily operations.
* Analytical systems like OLAP (online analytical processing) make analysis of data and
help in making decisions, (Mid and long range).
DW is the source for OLAP that is used for planning, forecasting, and managing the
organization.

Types of analysis:
* Simple queries and reports: used by functional managers who use predefined queries
to use for instance monthly sales, expenses…?
* Statistical analysis: provide summary information and find the relationship between
different data.
* Data mining analysis: analyze big amount of data and tries to find hidden new pattern
from existing ones.
Ex: what item should be offered as a set?

The Users of Business Intelligence


One of the great benefits of BI is that it can support the data needs of the entire business.
This support comes from the many different ways that users can consume BI data

7 | Page
Business Analytics By: Hasan Sahily

1. Executives and business decision makers look at the business from a high level,
performing limited analysis. They use focus on the Strategic descions that helps in
making long term decisions that is vital to the organization. They use statistical models
to forecast. These systems use data mining tools and techniques.
2. Analysts perform complex, detailed data analysis. They are considered as the middle
managers. They use Tactical systems that require fast decision making using OLAP
engine. They manage data through some aggregation techniques such as (sum, average,
count, min, max…). It applies cube analysis using OLAP operators such as drill down,
rollup, slice, dice.
3. Information and Line workers need static reports or limited analytic power. They do
not need analytic capabilities. BI is presented to them as part of their job. They mainly
focus on Operational systems (called production systems or OLTP : Manage data
through daily transactions in RDBMS. Decisions at this level are predetermined and
predefined.

8 | Page
Business Analytics By: Hasan Sahily

Volume of information

Strategic
systems

Tactical
systems

Operational
systems

Data Information Knowledge

9 | Page
Business Analytics By: Hasan Sahily

Strategic, Tactical & Functional Benefits of Business Intelligence

Analysis of Data in DW
• OLTP (online transaction processing): Capturing and storing data from ERP, CRM,
POS, … The main focus is on efficiency of routine tasks
• OLAP (Online analytical processing): Converting data into information for decision
support
o Data cubes, drill-down / rollup, slice & dice, …
o Requesting ad hoc reports
o Conducting statistical and other analyses
o Developing multimedia-based applications

10 | P a g e
Business Analytics By: Hasan Sahily

11 | P a g e
Business Analytics By: Hasan Sahily

Why company need data warehouse?


In a big company, systems are divided according to domain of activity (finance,
commercial, marketing, manufacturing…).
It is necessary to have all of these data from the different systems available for decisions
makers (ex: top management).
It is necessary that we store all information associated to all systems concerning a certain
subject in one database.
We usually use a middleware that is able to make interface with the different systems to
load the data into the DW.

* DW helps in data analysis which uses a set of techniques that allow extracting
information from historical archive and using some aggregation functions (sum, average
…) and statistical formulas.

* DW helps in data mining that extract information from DW and uses some techniques
to describe the present situation and predict the future

Some examples where DW is useful:


* Banking domain: for a bank, it is important to regroup the information concerning the
customers so that managers can take decisions about the credit demands of any of them.
* Marketing campaign: where a company wants to make a campaign to increase sales.
* Supermarkets or big distribution companies: It is important to regroup the sales
information to determine the most success products, detects the purchasing habits and
preferences of customers according to some criteria, such as: area, branch …

Data Integration and the Extraction, Transformation, and Load (ETL) Process
Data is frequently stored in different locations and formats (Relational databases, XML,
Excel,….). Data might be inconsistent or missing values (dirty data).
Consolidation is the process of reading data from source system, cleansing these data and
making it consistent as much as possible. The process of data consolidation is often called
Extraction, Transformation, and Loading (ETL).
Data integration: includes data access, data association, and change capture.
o Enterprise application integration (EAI): a tool that push data from source
systems into a data warehouse
o Enterprise information integration (EII): a tool that provides real-time data
integration from a variety of sources (Relational databases, XML, Excel, Web
services, etc. )
• The ETL process extracts data from the various source systems
• Data is then transformed to make it consistent and improve data quality
• The consolidated, consistent, and cleaned data is then loaded into a data repository
• Developing the ETL process often consumes 80% of the development time

12 | P a g e
Business Analytics By: Hasan Sahily

Packaged Transient
application data source

Data
warehouse

Legacy
Extract Transform Cleanse Load
system

Data mart
Other internal
applications

Exploration
User 2 User 1
level

Exploration Analysis
Using OLAP Using DM

Fusion level DW
(Merging):
Transform
Mediator
and load

Transformation
and loading

Monitor Wrapper Monitor Wrapper Monitor Wrapper


Extraction
level

Operational Operational Operational


DB 1 DB 2 DB 3

1- Extract:
data are extracted from operational DBs using “source monitor” software that uses
push or pull techniques. Push: detect the update done on the operational DBs and
send them to DW. Pull: monitor will be active periodically to take the updates
done on operational DBs toward the DW. There are 2 kinds of logical extraction:
• A full extraction: Data are extracted completely from the source system.
• Incremental Extraction: Only the changed data since last extraction will be
extracted. It extracts a much smaller volume of data techniques used are:
1- Time stamps: specifies the time and data that a row was modified (insert,
update, delete).
2- Partitioning: tables in operational DB are partitioned .ex: partition for each
month then it is easy to identify the updated data related to a certain
partitions.
3- Trigger: After each DML statement (insert, update , delete) that is executed
on the table , trigger updates time stamp with the current times and tells

13 | P a g e
Business Analytics By: Hasan Sahily

source monitor about the update in case of using push techniques . Pull
techniques uses the time stamp to identify the updated rows after a certain
period of time.
There are 2 physical extraction methods:
1- Online extraction: the data is extracted directly from the source system.
2- Offline extraction: the data is stored in a staging (intermediate system). It
prevents the direct access to the source system so that no effect on the
performance of the operational system. Download of data from operational
(source) system into staging system is done during the night at low
operational period.

2- Transport/Transform/Fusion(merge) level:
Transport of data from different heterogeneous sources and integrate them into a
single DW. Transposrtaion of data is done either using flat file where we store
data and then transport to DW using FTP or using distributed queries to transport
each single table at a time.
Transform of heterogeneous data into a single format understood by the DW.

3- Load: Data will be validated before loaded into the DW to preserve the coherence and
consistency.

4- Explore: After loading of data, user can make a complex query to explore some facts,
data analysis help the decision makers to detect new facts. Using OLAP and data mining
techniques.

Middle ware:
It is a data collection technique. It is able to detect the changes done on the source
(operational) systems and then transform and load these data into a staging database or
directly into DW.
Middle ware insures integrity and coherence between DW and operational systems. It
makes update on DW to include all modifications done on the operational systems.
Middle ware contains three software programs that help in performing its functions.
1- Source monitor: for every source of data (operational) DB there are a source monitor,
software that detect any modification done of the source system. Source monitor follows
push or pull method.
* Push: Any update is detecting automatically and send towards the DW. This process
happens with the aid of trigger and the stamp technique.
* Pull: DW formulates a query asking for the changes happened after the last successful
extraction. Ex: changed rows that are the stamp data of last extraction.

2- Source wrapper (adapter): The monitor is connected to a source wrapper adapter that
is responsible of translating the queries and the data from the source systems (operational
DBs) toward the DW and vice-versa.

14 | P a g e
Business Analytics By: Hasan Sahily

3- Mediator: data are received from the heterogeneous sources in different formats.
These data need transformation recognizing, and filtering to adapt with the DW model.
Data will be integrated and merged and might need some selections and aggregations.
Mediator is responsible of these functions.

Flat Files Monitor Adapter

Mediator
DW

Oracle
Monitor Adapter
DB

Server DB Monitor Adapter


(External)

• Issues affecting the purchase of an ETL tool


o Data transformation tools are expensive
o Data transformation tools may have a long learning curve
• Important criteria in selecting an ETL tool
o Ability to read from and write to an unlimited number of data
sources/architectures
o Automatic capturing and delivery of metadata
o A history of conforming to open standards
o An easy-to-use interface for the developer and the functional user

15 | P a g e
Business Analytics By: Hasan Sahily

Traditionnel versus Active DW

Data Model:

Modelling by subject: For easy usage of data in DW, we have to classify data by subject
rather than by application. It is advisable to have a Meta data that describe every attribute
in the DW (source, calculation, formula...).

Data mart:
• A data mart is a repository of data gathered from operational data and other sources
that is designed to serve a particular community of knowledge workers.
• The data may derive from an enterprise-wide database or data warehouse or be more
specialized.
• The emphasis of a data mart is on meeting the specific demands of a particular group
of knowledge users in terms of analysis, content, presentation, and ease-of-use. Users
of a data mart can expect to have data presented in terms that are familiar.
• Focus on a single subject, a departmental small-scale “DW” that stores only
limited/relevant data
• Dependent data mart : A subset that is created directly from a data warehouse
• Independent data mart: A small data warehouse designed for a strategic business unit
or a department
• Established in a simple infra structure.
• Implementation is easy and quick.

16 | P a g e
Business Analytics By: Hasan Sahily

Multidimensional cube Model and On-Line Analytical Processing (OLAP)

Cubes are the structures in which data is stored, organized and summarized in
multidimensional structure. It supports DSS and acts as a multi-dimensional data
modeling that facilitates the analysis of data with different dimensions. Multidimensional
cube consists of a set of dimensions and summary data (aggregated) in cells.Users access
data in the cubes by navigating through various dimensions.

The data cube is a great tool for the


presentation of the results. It permits to
present results while rotating the cube
Time Dimension

according to the desired dimensions. We


14/3/2005
could make drill down, roll up, slice, and Sup
13/3/2005
dice according to some dimensions. 12/3/2005
Jamal
s io
n
en
At a given time, only the selected slice of a Sup
Samir r di
m
TV TV Radio Radio
lie
cube would be active. Sony Sharp Aiwa Sony u pp
S

Product dimension

Roll up, Drill down, Dice, and Slice are very powerful visual operators for analysis.
They are activated by simple click on the axis of dimensions and selecting the attributes
of calculated value. The two dimensional visualization is presented according to different
formats.
User could select different graph styles and calculation formula to present the calculated
values.
17 | P a g e
Business Analytics By: Hasan Sahily

- Result could be shown by row or column style.


- Result could be rotated according to a certain dimensions.

Ex: Sale cube (Date, Product, Supplier, Amt)


Date
Supplier

x Quantity Price

Product
Dimension
Axe Date

2005
Dimension
Axe Supplier
2004 500 S2

S1
2003

P1 P2 P3

Dimension
Axe Product

The figure presents a data cube with 3dimensions: Data, Product, Supplier.
Measure unit based on sum (Q*price)

View of cube:
A cube with n dimensions represents 2N different views. These views could be organized
in a tree view.

18 | P a g e
Business Analytics By: Hasan Sahily

Product, Supplier, Date

Product, Supplier Product, Date Supplier, Date

Product Supplier Date

Example: Slicing Operations on a Simple Tree-Dimensional Data Cube

A 3-dimensional
OLAP cube with Sales volumes of
slicing a specific Product
operations on variable Time
and Region
e
m
Ti

Product
Geography

Cells are filled


with numbers
Sales volumes of
representing a specific Region
sales volumes on variable Time
and Products

Sales volumes of
a specific Time on
variable Region
and Products

19 | P a g e
Business Analytics By: Hasan Sahily

Cube analysis is based on the hierarchy and the 4 OLAP operators (Drill down,
rollup, Slice, Dice)

Hierarchy Drill down Rollup


(granularity)
navigating among Apply analysis on a
Hierarchy is a logical levels of data ranging more general
structure that uses from the most dimension (Roll up
ordered level as a mean summarized (up) to the analysis).
to define data most detailed (down)
aggregation and Minimize the cube
navigational drill down Perform the analysis on dimensions by
or rollup path. more detailed replacing smaller
dimensions (Drill down dimensions by larger
For example, Location analysis) one, as go from details
is expressed in Area, Example go from toward the global.
City, Country. country to city to Area. Example go from Area
to city to country.
hierarchy can be used to As going from global
aggregate data from towards details.
Area to city, and from
city to country level.

We can drill down or


rollup the dimension
based on the hierarchy
of Area , City, Country.
Slice of the cube: Dice of the cube: Pivot - used to change
a subset of a a slice on more than the dimensional
multidimensional array. two dimensions orientation of a report
Selection of a portion of Selection some of the or an ad hoc query-
the cube, filter the columns from the cube, page display
dimension (axis) of a Location(L#, Area,
cube according to some City, Country)
criteria under the Where select City, Area.
clause in the select (wheras country is not
statement. (some of the selected)
rows)
EX: retrieve only TV; EX:
where Product= ‘TV’; Select Country,
Product, Sum (Amt)
FROM sales, Supplier
Where Sales.S_NO =
Supplier.S_NO

20 | P a g e
Business Analytics By: Hasan Sahily

Group by cube
(Country, Product);
Drill down from
country to city;
System replaces
country by city.

21 | P a g e
Business Analytics By: Hasan Sahily

Dimensional Model
multidimensional thinking: Humans tend to think in a multidimensional way, even if
they don’t realize it. For example, managers want to see sales by month by product for
Middle East Area.

• “What” we want to see (sales in this case) is called a measure . Ex: money, sales
volume, inventory profit, actual versus forecast. Measures and KPIs are stored in a
single fact table. They are almost always numeric and are often additive (Dollar sales,
unit sales, profit, expenses, and more).
• Key Performance Indicators (KPIs) are a special type of measure used to evaluate
the “health” of the value. KPIs often contain target numbers. For example, the
Customer satisfaction is measured through rating, generated profits, number of
repeated purchases. Sales of this year vs. last year. Expenditures vs revenues, ROI
(return on investment).
• How we want to see the data (month, product, and Middle East Area) is called a
dimension. Example: products, salespeople, market segments, business units,
geographical locations, distribution channels, country, or industry. A dimension is an
entity (table) defined in its entirety with a single primary key
o Dimensions are made up of attributes and may or may not include
hierarchies(Year – Semester – Quarter – Month – Day)
o Product Category – Product Subcategory - Product
• Attributes are individual values that make up dimensions. For example, a Product
dimension may have a Part Number attribute, a size attribute, a color attribute, a
manufacturer attribute, and more.
• Hierarchies contains attributes in a hierarchical structure to assist user analysis.
Hierarchies helps in drill down and rollup. For example:
o A Time dimension may have a Month attribute, a Year attribute, and so forth.
Time hierarchy might be to go from Year to Quarter to Month to Week to Day
o A Geography dimension may have a Country attribute, a Region attribute, a
City attribute, and so on

Data are represented in DW in a form of star or snowflake schema


o Star schema: Contain a fact table surrounded by and connected to several
dimension tables
o Snowflakes schema: An extension of star schema where the diagram resembles
a snowflake in shape

22 | P a g e
Business Analytics By: Hasan Sahily

Star Schema Snowflake Schema


Dimension Dimension Dimension Dimension
TIME PRODUCT MONTH BRAND
Quarter Brand M_Name Brand
... ... ... Dimension Dimension ...
DATE PRODUCT
Date LineItem
Fact Table
SALES Dimension ... ... Dimension
QUARTER CATEGORY
UnitsSold
Q_Name Category
... Fact Table
... SALES ...
UnitsSold
Dimension Dimension
PEOPLE GEOGRAPHY ...
Division Country
... ... Dimension Dimension
PEOPLE STORE
Division LocID
... ... Dimension
LOCATION
State
...

Star Schema vs Snow Flake Schema

Star Schema Snow Flake


Relational schema have a central table called Same as star schema but the
fact table dimension table is normalized
* considered as the central table. and hierarchies are broken into
* contains measure units separate tables that helps in
*contains a FK for each referenced dimension drill down the data from
table (to join the fact with the dimension topmost to the lowest
tables). granularities.
- advantage: less redundancy
Each dimension table has attributes that - disadvantage: slow down the
represent the various hierarchies that are not roll up aggregation because it
normalized because attributes in one table needs more joints between the
define different granularities. hierarchies.

Every dimension has a primary key


dimension Table does not have any parent table
Hierarchies granularities stored in the
dimension table itself to speed up roll up
aggregation
- Disadvantage: a lot of redundancy which lead
to waste of disk storage.

Steps to design the DW schema


• You will be given the details of the Transaction processing system.
• You have to just focus on the management requirements.
o Define the what (measure and KPI), how (dimensions), build the schema, and
then implement using Qlikview.

23 | P a g e
Business Analytics By: Hasan Sahily

Exercise 1
A bank is interested to build a DW to specify the total money withdrawal from
each machine that are located in different areas in different cities at different
dates. Smallest unit of time should be day and might be asked to submit
reports according to week, month or year.

a) Define the dimensions and hierarchies


b) Draw the star schema
c) Convert the star schema to snowflake schema
a-
What do you want to analyse? (Measures)
Total withdrawal money. Aggregate function: Sum
Expression sum(money)

How do you want to analyse? (Dimensions)


• Calendar (day, week, month year)
• Location(machine, area, city

Dimension Key atrributes Hierarchy


Calendar cid DayWeekMonthYear Yes
Location lid Machine area city Yes

Fact:
Fact (cid, lid, money)

b- Star schema:

Calendar
Fact Location
cid
1 n cid n 1 Lid
Day
lid Machine
Week
money Area
Month
city
Year

c- Snow Flake schema:

Calendar
cid 1
Day
Week Fact n
Month n
year cid
Lid
n location Area City
money 1 n 1 1
Lid aid cityid
Machine Area city
Aid cityid

Exercise 2
24 | P a g e
Business Analytics By: Hasan Sahily

The University Computer Lab’s director keeps track of the lab usage, measured by
the number of students using the lab. There are two types of labs: Finance and IT.
The director assigns you the task of developing a data warehouse to keep track of
the labs statistics. The main requirements for the data warehouse are the following:
• Show the total number of users using different labs by different time periods (week,
month, and year).
• Show usage numbers of labs by time periods, by major, and by degree programs.
• Compare usage numbers by labs, and by different time periods.
(a) Define measures, dimensions, and fact
(b) Draw star schema
Solution
What do you want to analyse? (Measures)
Count of students. Aggregate function: Count
Average of students. Aggregate function: Average
Expression count(student), avg(student)

How do you want to analyse? (Dimensions)


• Calendar (week, month year)
• lab(lab)
• major(major)
• degree(degree)
Dimension Key atrributes Hierarchy
Calendar cid WeekMonthYear Yes
Lab lid Lab
major Mid Major
Degree Did Degree
Fact:
Fact (cid, lid, mid, did, count of student)

c- Star Schema:

calendar
cid 1
Degre
Week Fact
Month n 1
Cid n Did
year Lid Degree
n
Did
Mid n
Number of student
Lab 1 Average of student 1 Major
Lid mid
lab major

25 | P a g e
Business Analytics By: Hasan Sahily

Exercise 3
A large company with multiple branches will gather every night in a data
warehouse (DW) the information of sales of a day in order to prepare sales reports.
The transaction processing system has an application about customers and another
application about employees.
The DW has to provide the turnover of sales of a product, per date, customer,
vendor, and all possible amounts of turnover.
 A sale corresponds to one and only one product. Products are grouped by product
family.
 The sale is conducted by one of the vendors of sale service specializing in the
product.
 The sales week is the week number within the year.
Questions:
(a) Define measures, dimensions, and fact
(b) Draw star schema
(c) Enumerate 4 types of decisions that manager can take after the analysis of data
warehouse.

Solution
What do you want to analyse? (Measures)
Sum of sales. Aggregate function: sum
Expression sum(sales)

How do you want to analyse? (Dimensions)


• Calendar (week, month year)
• product(product, category)
• customer(customer)
• vendor(vendor)
Dimension Key attributes Hierarchy
Calendar cid WeekMonthYear Yes
product Pid Product  category Yes
Customer cuid Customer
Vendor Vid vendor
Fact:
Fact (cid, pid, cuid, vid, sales)

b- Star Schema:

26 | P a g e
Business Analytics By: Hasan Sahily

calendar
cid 1
customer
Week Fact cuid
Month n 1
Cid n customer
year pid
n
Cuid
prodcut 1 vid n
Pid sales
1 vendor
Prodcut
vid
category
vendor

4 types of decisions
• Sale promotion and discount for un-sold products
• Incentive for top sale customers
• Sales according to week to update stock amounts accordingly
• Stop sale of un-desired products

27 | P a g e
Business Analytics By: Hasan Sahily

Exercise 4
The general manager of a Hotel-network located in different countries, decided to
analyze the preferences of customers. They collected the following information:
• Hotel: Name, number of stars, city, region, country, provided services such as
restaurant, sauna, Jacuzzi, internet, sportive halls, Room type (double, single),
cost of room per season.
• Client: Name, type of room, reserved days, check-in and check-out dates ordered
services, nationality, Number of persons per room.
They ask for the following parameters: in monthly basis
• Number of client
• Percentage of clients
• Occupation rate of room
• Monthly income
Questions:
1- Determine the axes need for analysis.
2- Determine the hierarchy by for each axe (dimension)
3- Identify the necessary measures
4- Draw the star schema

Axes Hierarchy
(Dim)
Hotel Hotel-NameCityRegionCountry
Room Room type
Date Monthyear

Fact Measures
Income Number of clients: Sum (persons) per room type per month
%of clients: Avg (persons) per room type per month
Room Occupation Rate: Avg (persons) per room
Monthly Income: Sum (bill amount) per room type per month.
Star schema:
Hotel Income Room
H_ID 1 n H_ID 1 R_ID
n
H-Name R_ID R_type
City Dat_ID n
Region Number of clients
Country percentage of clients
Room occupation rate
Monthly income
Date
Dat_ID
1 Month
year

28 | P a g e

You might also like