Business Intelligence: Data Warehouse


Business Intelligence

Data warehouse
What is Business Intelligence?

Business Intelligence (BI) is a set of processes, architectures, and
technologies that convert raw data into meaningful information
that drives profitable business actions. It is a suite of software
and services that transforms data into actionable intelligence and
knowledge.
The process of collecting, organizing, and analyzing
business data and turning it into useful and
actionable information is commonly referred to as
Business Intelligence.
What is Data Warehousing?

Data Warehousing (DW) is the process of collecting and
managing data from varied sources to provide meaningful
business insights. A data warehouse is typically used to connect
and analyze business data from heterogeneous sources. The
data warehouse is the core of the BI system, which is built for
data analysis and reporting.
What Is BI Architecture?

Business intelligence architecture is a
term used to describe standards and
policies for organizing data with the
help of computer-based techniques
and technologies that create business
intelligence systems used for online
data visualization, reporting, and
analysis.
A solid BI architecture framework consists of:

• Collection of data
• Data integration
• Storage of data
• Data analysis
• Distribution of data
• Reaction based on insights
What is OLAP?
Online Analytical Processing (OLAP) is a category of software
that allows users to analyze information from multiple
database systems at the same time. It is a technology that
enables analysts to extract and view business data from
different points of view.

OLAP (Online Analytical Processing) is the technology behind
many Business Intelligence (BI) applications. OLAP is a
powerful technology for data discovery, including capabilities
for limitless report viewing, complex analytical calculations,
and predictive “what if” scenario (budget, forecast) planning.
OLAP stands for Online Analytical Processing. It is used to analyze
information from multiple database systems at one time, for example
in sales analysis and forecasting, market research, and budgeting.
A data warehouse is an example of an OLAP system.
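Viewing the same data "from different points of view" can be
illustrated with a small roll-up over different dimensions. This is
only a minimal sketch in plain Python, not how an OLAP engine is
implemented; the sales records, the dimension names, and the roll_up
helper are all hypothetical.

```python
from collections import defaultdict

# Hypothetical sales records: (region, product, quarter, amount).
SALES = [
    ("North", "Laptop", "Q1", 1200),
    ("North", "Phone", "Q1", 800),
    ("South", "Laptop", "Q1", 900),
    ("South", "Phone", "Q2", 700),
    ("North", "Laptop", "Q2", 1500),
]

# Position of each dimension inside a record (illustrative names).
DIMENSIONS = {"region": 0, "product": 1, "quarter": 2}


def roll_up(records, *dims):
    """Sum the amount measure, grouped by the chosen dimensions."""
    totals = defaultdict(int)
    for record in records:
        key = tuple(record[DIMENSIONS[d]] for d in dims)
        totals[key] += record[3]
    return dict(totals)


# The same facts, viewed along two different sets of dimensions.
by_region = roll_up(SALES, "region")
by_product_quarter = roll_up(SALES, "product", "quarter")
print(by_region)
print(by_product_quarter)
```

Each call groups the same facts by a different combination of
dimensions, which is the idea behind viewing data from several
perspectives.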
OLTP stands for Online Transaction Processing. It is used for
maintaining online transactions and record integrity in multi-access
environments. An OLTP system manages a very large number of short
online transactions, for example at an ATM.
Sr. No. | Key               | OLAP                                           | OLTP
1       | Basic             | It is used for data analysis.                  | It is used to manage a very large number of short online transactions.
2       | Database type     | It uses a data warehouse.                      | It uses a traditional DBMS.
3       | Data modification | It is mainly used for reading data.            | It manages all insert, update, and delete transactions.
4       | Response time     | Processing is a little slow.                   | In milliseconds.
5       | Normalization     | Tables in an OLAP database are not normalized. | Tables in an OLTP database are normalized.
For example, consider three different applications labeled A, B, and C.
The information stored in these applications is Gender, Date, and
Balance. However, each application stores its data in a different way.
In Application A, the gender field stores logical values like M or F.
In Application B, the gender field is a numerical value.
In Application C, the gender field is stored in the form of a
character value.
The same is the case with Date and Balance.
However, after the transformation and cleaning process, all this data is
stored in a common format in the Data Warehouse.
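The cleaning step described above can be sketched as a lookup per
source application. This is an illustrative sketch only: the concrete
encodings for applications B and C (1/0 and "male"/"female") are
assumptions, since the slides do not give the actual values, and
standardize_gender is a hypothetical helper.

```python
# Assumed source encodings for each application (hypothetical values
# for B and C; only A's "M"/"F" comes from the example above).
GENDER_MAPS = {
    "A": {"M": "Male", "F": "Female"},
    "B": {1: "Male", 0: "Female"},
    "C": {"male": "Male", "female": "Female"},
}


def standardize_gender(app, raw_value):
    """Map an application-specific gender code to one common format."""
    return GENDER_MAPS[app].get(raw_value, "Unknown")


# Rows as they might arrive from the three applications.
source_rows = [("A", "M"), ("B", 0), ("C", "female"), ("A", "X")]
standardized = [standardize_gender(app, value) for app, value in source_rows]
print(standardized)
```

Values that do not match any known code are mapped to "Unknown"
rather than dropped, which is one possible design choice among
several.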
What is Data Mart?
A Data Mart is focused on a single functional area of an
organization and contains a subset of the data stored in a Data
Warehouse. A Data Mart is a condensed version of a Data
Warehouse and is designed for use by a specific
department, unit, or set of users in an organization, for example
Marketing, Sales, HR, or Finance. It is often controlled by a
single department in an organization.
Dependent data mart with operational data store: a three-level architecture
Dependent Data Mart
Independent Data Mart
Hybrid Data Mart
Data Warehouse - Schemas

A schema is defined as a logical description of a
database in which fact and dimension tables are joined
in a logical manner. A Data Warehouse is maintained in
the form of a Star, Snowflake, or Fact Constellation
(Galaxy) schema.

• Star Schema
• Snowflake Schema
• Galaxy Schema
Star Schema
In a Star Schema in a data warehouse, the center of the star
holds one fact table, with a number of associated dimension
tables around it. It is known as a star schema because its structure
resembles a star. The Star Schema data model is the simplest type of
Data Warehouse schema.
• Every dimension in a star schema is represented
by only one dimension table.
• The dimension table should contain the set of
attributes.
• The dimension table is joined to the fact table
using a foreign key.
• The dimension tables are not joined to each other.
• The fact table contains keys and measures.
• The Star schema is easy to understand and
provides optimal disk usage.
• The dimension tables are not normalized. For
instance, in the above figure, Country_ID does not
have a Country lookup table as an OLTP design
would have.
• The schema is widely supported by BI tools.
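The points above can be sketched with an in-memory SQLite database.
This is a minimal illustration under assumed names: the tables
dim_product, dim_store, and fact_sales, their columns, and the sample
rows are all hypothetical and do not come from the figure.

```python
import sqlite3

star_db = sqlite3.connect(":memory:")
star_db.executescript("""
-- Dimension tables: one table per dimension, not normalized
-- (country is stored directly in dim_store, with no lookup table).
CREATE TABLE dim_product (
    product_id INTEGER PRIMARY KEY, product_name TEXT, category TEXT);
CREATE TABLE dim_store (
    store_id INTEGER PRIMARY KEY, city TEXT, country TEXT);

-- Fact table: foreign keys to the dimensions plus the measures.
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    store_id INTEGER REFERENCES dim_store(store_id),
    units_sold INTEGER,
    revenue REAL);

INSERT INTO dim_product VALUES (1, 'Laptop', 'Computers'), (2, 'Phone', 'Mobile');
INSERT INTO dim_store VALUES (10, 'Pune', 'India'), (20, 'Lyon', 'France');
INSERT INTO fact_sales VALUES
    (1, 10, 3, 3000.0), (2, 10, 5, 2500.0), (1, 20, 2, 2100.0);
""")

# A typical star-schema query: one join from the fact table to a dimension.
revenue_by_country = star_db.execute("""
    SELECT s.country, SUM(f.revenue)
    FROM fact_sales AS f
    JOIN dim_store AS s ON s.store_id = f.store_id
    GROUP BY s.country
    ORDER BY s.country
""").fetchall()
print(revenue_by_country)
```

Because each dimension sits in a single table, the query needs only
one join per dimension it uses.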
Fact table
In data warehousing, a fact table consists of
the measurements, metrics or facts of a
business process. It is located at the center of
a star schema or a snowflake schema
surrounded by dimension tables. Where
multiple fact tables are used, these are
arranged as a fact constellation schema. 
A Snowflake Schema in a data warehouse is a logical
arrangement of tables in a multidimensional
database such that the ER diagram resembles a
snowflake shape. A Snowflake Schema is an
extension of a Star Schema, and it adds additional
dimensions. The dimension tables are normalized,
which splits data into additional tables.
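Normalizing a dimension can be sketched as follows, again with
SQLite. The table and column names (dim_country, dim_store,
fact_sales) and the sample rows are hypothetical; the point is only
that the country attribute moves into its own lookup table, so the
query needs one more join.

```python
import sqlite3

snow_db = sqlite3.connect(":memory:")
snow_db.executescript("""
-- The store dimension is normalized: country lives in its own table.
CREATE TABLE dim_country (
    country_id INTEGER PRIMARY KEY, country_name TEXT);
CREATE TABLE dim_store (
    store_id INTEGER PRIMARY KEY,
    city TEXT,
    country_id INTEGER REFERENCES dim_country(country_id));
CREATE TABLE fact_sales (
    store_id INTEGER REFERENCES dim_store(store_id),
    revenue REAL);

INSERT INTO dim_country VALUES (1, 'India'), (2, 'France');
INSERT INTO dim_store VALUES (10, 'Pune', 1), (11, 'Mumbai', 1), (20, 'Lyon', 2);
INSERT INTO fact_sales VALUES (10, 3000.0), (11, 1500.0), (20, 2100.0);
""")

# Reaching the country name now takes two joins instead of one.
snowflake_revenue = snow_db.execute("""
    SELECT c.country_name, SUM(f.revenue)
    FROM fact_sales AS f
    JOIN dim_store AS s ON s.store_id = f.store_id
    JOIN dim_country AS c ON c.country_id = s.country_id
    GROUP BY c.country_name
    ORDER BY c.country_name
""").fetchall()
print(snowflake_revenue)
```

Each country name is stored once in dim_country, at the cost of the
extra join.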
A Galaxy Schema contains two or more fact tables that
share dimension tables between them. The
schema is viewed as a collection of stars, hence
the name Galaxy Schema.

• The dimensions in this schema are split into separate
dimensions based on the various levels of hierarchy.
• For example, if geography has four levels of hierarchy
(region, country, state, and city), then the Galaxy schema should
have four dimensions.
• It is also possible to build this type of schema by splitting
one star schema into several star schemas.
• The dimensions in this schema are large and need to be
built based on the levels of hierarchy.
• This schema is helpful for aggregating fact tables for better
understanding.
ETL – Extract, Transform, Load

ETL is short for extract, transform, load:
three database functions that are combined into one
tool to pull data out of one database and place it into
another database.

Oracle has introduced an ETL tool known
as Oracle Warehouse Builder (OWB). It is a graphical
environment that is used to build and manage the data
integration process.
Extract is the process of reading data from a database. In this stage,
the data is collected, often from multiple and different types of sources.
Transform is the process of converting the extracted data from its
previous form into the form it needs to be in so that it can be placed
into another database. Transformation occurs by using rules or lookup
tables or by combining the data with other data.
Load is the process of writing the data into the target database.
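The three stages can be sketched with the Python standard library.
This is a minimal illustration, not how any particular ETL tool
works: the CSV source, the gender lookup table, the customers target
table, and the function names are all hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical source data, standing in for a file or source database.
SOURCE_CSV = (
    "customer,gender,balance\n"
    "Asha,F,1200.50\n"
    "Ravi,M,300\n"
    "Mei,f, 75.25\n"
)

GENDER_LOOKUP = {"M": "Male", "F": "Female"}  # lookup table used in transform


def extract(text):
    """Extract: read the raw rows from the source."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: apply rules and lookups to reach the target format."""
    return [
        (
            row["customer"].strip(),
            GENDER_LOOKUP.get(row["gender"].strip().upper(), "Unknown"),
            float(row["balance"]),
        )
        for row in rows
    ]


def load(conn, records):
    """Load: write the transformed records into the target database."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customers "
        "(customer TEXT, gender TEXT, balance REAL)"
    )
    conn.executemany("INSERT INTO customers VALUES (?, ?, ?)", records)
    conn.commit()


target = sqlite3.connect(":memory:")
load(target, transform(extract(SOURCE_CSV)))
loaded = target.execute(
    "SELECT customer, gender, balance FROM customers ORDER BY customer"
).fetchall()
print(loaded)
```

Keeping the stages as separate functions mirrors the description
above and lets each stage be tested or replaced on its own.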
List of ETL Tools
ETL tools include the following.
• Open Text Integration Center
• Relational Junction ETL Manager (Sesame Software)
• CloverETL
• Informatica PowerCenter
• Talend Studio for Data Integration
• Oracle Warehouse Builder (OWB)
• Oracle Data Integrator (ODI)
• Data Migrator (IBI)
• Cognos Data Manager
• IBM InfoSphere Warehouse Edition
• SQL Server Integration Services (SSIS)
• IBM InfoSphere Information Server
• Pervasive Data Integrator
• Pentaho Data Integration
• Adeptia Integration Server
• SAS Data Management
• Centerprise Data Integrator
• Syncsort DMX
• Sagent Data Flow
• QlikView Expressor
• SAP Data Services
• Elixir Repertoire for Data ETL
HOW ETL WORKS
Flat-file

A flat-file database is a database stored in a file called a flat
file. Records follow a uniform format, and there are no
structures for indexing or recognizing relationships
between records. The file is simple. A flat file can be a plain
text file or a binary file.
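A small sketch of what this means in practice, assuming a
pipe-delimited plain text layout: the field names, the sample
records, and the helper functions are hypothetical. Because the file
has no index, a lookup has to scan every record.

```python
# Hypothetical flat file: one record per line, fields separated by "|".
FLAT_FILE_TEXT = "1|Asha|Pune\n2|Ravi|Delhi\n3|Mei|Lyon\n"
FIELDS = ("id", "name", "city")  # the uniform record format (assumed)


def parse_flat_file(text):
    """Split each line into fields; every record has the same layout."""
    return [dict(zip(FIELDS, line.split("|"))) for line in text.splitlines() if line]


def find_by_city(records, city):
    """No index exists, so the lookup is a linear scan over all records."""
    return [record for record in records if record["city"] == city]


records = parse_flat_file(FLAT_FILE_TEXT)
matches = find_by_city(records, "Lyon")
print(matches)
```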
Data Warehouse Architecture

A data warehouse architecture is a method of defining the overall
architecture of data communication, processing, and presentation that exists
for end-client computing within the enterprise. Each data warehouse is
different, but all are characterized by standard vital components.

Data Warehouse applications are designed to support users' ad hoc data
requirements, an activity recently dubbed online analytical processing (OLAP). These
include applications such as forecasting, profiling, summary reporting, and trend
analysis.

An ad hoc report is a report created for one-time use. ... Ad hoc reporting differs
from structured reporting in many ways. Structured reports use a large volume
of data and are produced using a formalized reporting template. Ad hoc reports are
generated as needed, in a visual format relevant to the audience.
• Defining Business Requirements (or Requirements Gathering)
• Setting Up Your Physical Environments
• Introducing Data Modeling
• Choosing Your Extract, Transform, Load (ETL) Solution
• Online Analytic Processing (OLAP) Cube
• Creating the Front End
