1. Delivery Process 2. System Process


1. Delivery Process
2. System Process

Anahory
Delivery Process
The process that delivers a data warehouse has to be fundamentally
different from the traditional waterfall method
The issue with DW projects is that
it is difficult to complete the tasks and deliverables in the strict,
ordered fashion demanded by a waterfall method,
because requirements are rarely fully understood and are expected
to change over time
Knock-on effect
Architectures, designs, and build components cannot be completed
until the requirements are completed, which can lead to constant
requirement iteration without delivery, i.e. "paralysis by analysis"
Steps in DW delivery method
1. IT Strategy
- DWs are strategic investments
- They require business processes to be redesigned in order to generate the
projected benefits
- If there is no overall IT strategy that includes a DW, it is
difficult to procure and
retain funding for the project
2. Business case
Identify the projected business benefits that should be derived
from using the data warehouse
Benefits may or may not be quantifiable, e.g. $5000 savings per
annum
Projected benefits should be clearly stated
DWs that do not have a clear business case tend to suffer from
credibility problems at some stage during the delivery
process
3. Education and Prototyping
Organizations will experiment with the concept of data analysis and
educate themselves on the value of a data warehouse
In some instances the data warehouse may be the first large-scale
client-server solution being implemented within the organization and will
require
- new skills
- new experience
- new hardware
A prototyping activity on a small scale can further the education process
as long as
1. The prototype addresses a clearly defined technical objective
2. The prototype can be thrown away once the feasibility of the concept has
been shown
3. The activity addresses a small subset of the eventual data content of the
DW
4. The activity time scale is not critical: it is seen as a timeboxed effort to
come to grips with the new technologies being considered
4. Business Requirements
To produce a set of production-quality deliverables that can grow into the
full solution
- The overall requirements should be understood
- The overall system architecture should be in place
20% of the time within the business requirements phase should be
spent on understanding the longer-term requirements
Determine the logical model for information within the DW
Determine the source systems that provide the data
Determine the business rules to be applied to the data
Determine the query profiles for the immediate requirement
Determine which aspects of the data may not be available from the
existing operational systems
It is probably not feasible to populate the DW with that data:
manual processes that supplement the data captured by the extract and load
process are generally unreliable
5. Technical blueprint
Delivers an overall architecture that satisfies the longer-term
requirements
Defines the components that must be implemented in the
short term in order to derive any business profit
The blueprint must identify
- Overall system architecture
- Server and data mart architecture
- Essential components of the DB design
- Data retention strategy
- Backup and recovery strategy
- Capacity plan for hardware and infrastructure (LAN, WAN)

A detailed design of the DB is not produced at this stage
Significant components are identified and sized
6. Building the Vision
The first production deliverable is produced
It is the smallest component of the DW that adds business benefit
Ex: this stage builds the major infrastructure components for extracting and
loading data, but limits them to the extraction and load of one or two
data sources, with minimal history
7. History Load
The remainder of the required history is loaded into the DW
New entities would not be added to the DW
Additional physical tables would be created to store the increased data volumes
Ex: the Building the Vision phase has delivered a retail sales analysis DW
with 3 months' worth of history
- Business users can analyze recent trends and address short-term sales issues
- It does not provide sufficient data to identify annual or seasonal sales
trends
- The next step could be to backload two years' worth of sales history from
archive tape, which allows business users to analyze trends year on
year
- Data volumes become larger
- Operational management issues become complex, disk failures increase
dramatically, and load processes take much longer to execute
- For this reason the activity of backloading history is carried out in
a separate phase
8. Ad hoc Query
Configure an ad hoc query tool to operate against the DW
End-user access tools are capable of automatically generating the DB
query that answers any question posed by the user.
Users will typically pose questions in terms that they are familiar with.
Ex: "Sales by store last week" is converted into a DB query by the access tool,
which is aware of the structure of information within the DW
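
The "sales by store last week" example can be sketched as follows. This is a minimal illustration, not how any particular access tool works: the star schema (sales_fact, store_dim), the MEASURES and DIMENSIONS metadata, and the sample rows are all assumptions made for the example.

```python
import sqlite3

# Hypothetical metadata an access tool might hold about the DW structure.
MEASURES = {"sales": "SUM(f.amount)"}
DIMENSIONS = {"store": "s.store_name"}


def build_query(measure: str, dimension: str) -> str:
    """Turn a business question such as 'sales by store' into SQL."""
    dim_col = DIMENSIONS[dimension]
    return (
        f"SELECT {dim_col} AS {dimension}, {MEASURES[measure]} AS {measure} "
        "FROM sales_fact f JOIN store_dim s ON s.store_id = f.store_id "
        "WHERE f.sale_date BETWEEN ? AND ? "
        f"GROUP BY {dim_col} ORDER BY {dim_col}"
    )


def answer(conn, measure, dimension, start, end):
    """Run the generated query for the requested date range."""
    return conn.execute(build_query(measure, dimension), (start, end)).fetchall()


# Tiny in-memory warehouse with assumed sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE store_dim (store_id INTEGER PRIMARY KEY, store_name TEXT);
    CREATE TABLE sales_fact (store_id INTEGER, sale_date TEXT, amount REAL);
    INSERT INTO store_dim VALUES (1, 'Leeds'), (2, 'York');
    INSERT INTO sales_fact VALUES
        (1, '2024-03-04', 100.0), (1, '2024-03-05', 50.0),
        (2, '2024-03-06', 75.0), (2, '2024-02-20', 999.0);
""")

# "Sales by store last week", with last week assumed to be 4-10 March 2024.
rows = answer(conn, "sales", "store", "2024-03-04", "2024-03-10")
```

The user never sees table or column names; only the metadata dictionaries need to change when the DW structure changes.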
9. Automation
Operational management processes are fully automated within the data
warehouse. These include
1. Extracting and loading the data from a variety of source systems
2. Transforming the data into a form suitable for analysis
3. Backing up, restoring and archiving data
4. Generating aggregations from predefined definitions within the DW
5. Monitoring query profiles, and determining the appropriate
aggregations to maintain system performance
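
Item 4 above, generating aggregations from predefined definitions, might look like the sketch below. The AGGREGATIONS dictionary, the sales_fact table and its columns are hypothetical; a real DW would keep such definitions in its own metadata.

```python
import sqlite3

# Hypothetical predefined definitions: summary table -> grouping columns.
AGGREGATIONS = {
    "sales_by_store": ["store_id"],
    "sales_by_store_day": ["store_id", "sale_date"],
}


def rebuild_aggregations(conn: sqlite3.Connection) -> None:
    """Drop and recreate every summary table from its definition."""
    for table, group_cols in AGGREGATIONS.items():
        cols = ", ".join(group_cols)
        conn.execute(f"DROP TABLE IF EXISTS {table}")
        conn.execute(
            f"CREATE TABLE {table} AS "
            f"SELECT {cols}, SUM(amount) AS total_amount, COUNT(*) AS row_count "
            f"FROM sales_fact GROUP BY {cols}"
        )
    conn.commit()


# Assumed sample fact data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_fact (store_id INTEGER, sale_date TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales_fact VALUES (?, ?, ?)",
    [(1, "2024-03-04", 100.0), (1, "2024-03-04", 50.0), (2, "2024-03-05", 75.0)],
)
rebuild_aggregations(conn)
by_store = conn.execute(
    "SELECT store_id, total_amount, row_count FROM sales_by_store ORDER BY store_id"
).fetchall()
```

Because the summaries are derived purely from the definitions, a scheduled job can rerun this after each load without manual intervention.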
10. Extending Scope
The DW is extended to address a new set of business requirements
Loading of additional data sources into the DW, or adding new data marts
11. Requirement Evolution
Requirements are never static
Business requirements will constantly change during the life of the DW
The process should support this and allow these changes to be reflected
within the system
Data Warehouse delivery process
[Figure: the delivery process stages: IT strategy; Education; Business case analysis; Technical blueprint; Business requirements; Building the vision; History load; Ad hoc query; Automation; Extending scope; Requirement evolution]
Accessing the DW
1. Do not design the DW around a specific tool or tool type
2. To fully understand the user requirements and round them out, you
must gain an understanding of the business
3. Make sure that period information is captured by department, group
and any other organizational divisions
4. It is imperative to get the level of detail at which data must be stored
correct. If this decision is made incorrectly, the DW must be completely
reorganized at some future date
System Process
Data warehouses must be architected to support 3 major factors
1. Populating the warehouse
2. Day-to-day management of the warehouse
3. Ability to cope with requirements evolution
1. Populating the warehouse
- Extracting the data from the source systems
- Cleaning it up
- Making it available for analysis
- Typically done on a daily basis after the close of the business day
2. Day-to-day management
- Different from the management of an operational system
- Volumes are larger and require active management such as
creating/deleting summaries, or rolling data on/off the archive
- In essence, the goal is to satisfy business requirements
3. Ability to cope with requirement evolution
- Tends to be one of the more complex aspects of a DW
- Requires the architecture to be structured to cope with future changes in
query profiles
- Must also cope with the evolution of completely new subject areas
Typical Process Flow within DW
1. Extract and load the data
2. Clean and transform the data into a form that can cope
with large data volumes and provide good query
performance
3. Back up and archive data
4. Manage queries and direct them to the appropriate data
sources
1. Extract and Load process
-Extracting data from the sources
-Loading into the DB
-Stripping out any detail that is there to support the
operational systems rather than the business requirements
-Adding more context
-Reconciling data with the other data sources
a. Controlling the process
Mechanisms that determine when to
- start extracting the data
- run the transformations
- run the consistency checks
b. When to initiate extraction
Start extracting data from data sources when it represents the same
snapshot of time as all other data sources
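
A controlling mechanism for this rule could be sketched as below: extraction starts only when every source reports the same snapshot. The source names and the idea that each source exposes a "snapshot date" are assumptions for illustration.

```python
from datetime import date


def ready_to_extract(snapshots: dict[str, date], business_date: date) -> bool:
    """True only when every source holds a snapshot for business_date."""
    return bool(snapshots) and all(d == business_date for d in snapshots.values())


def lagging_sources(snapshots: dict[str, date], business_date: date) -> list[str]:
    """Names of the sources that do not yet represent business_date."""
    return sorted(name for name, d in snapshots.items() if d != business_date)


# Hypothetical snapshot dates reported by three source systems.
snapshots = {
    "orders": date(2024, 3, 4),
    "stock": date(2024, 3, 4),
    "billing": date(2024, 3, 3),   # end-of-day processing not finished yet
}
business_date = date(2024, 3, 4)

ready = ready_to_extract(snapshots, business_date)
waiting_on = lagging_sources(snapshots, business_date)
```

Here the extract would be held back until the billing system has closed the same business day as the others.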
c. Loading the data
- Do not execute consistency checks until all the data sources have been
loaded into the temporary data store
- Expect the effort required to clean up the source systems to increase
exponentially with the number of overlapping data sources
2. Clean and Transform Data
1. Clean and transform the loaded data into a structure that speeds up
queries
2. Partition the data in order to speed up queries, optimize hardware
performance and simplify management of the DW
3. Create aggregations to speed up the common queries
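
Partitioning (step 2 above) can be illustrated with a small sketch that splits fact rows into one partition per calendar month. The row layout and the sales_fact_YYYY_MM naming scheme are assumptions, and a real DW would normally rely on the partitioning features of its database rather than application code.

```python
from collections import defaultdict


def partition_by_month(rows):
    """Group fact rows into one partition per calendar month of sale_date."""
    partitions = defaultdict(list)
    for row in rows:
        month = row["sale_date"][:7]  # 'YYYY-MM' prefix of an ISO date
        partitions[f"sales_fact_{month.replace('-', '_')}"].append(row)
    return dict(partitions)


# Assumed sample fact rows.
rows = [
    {"sale_date": "2024-01-15", "amount": 10},
    {"sale_date": "2024-01-20", "amount": 5},
    {"sale_date": "2024-02-01", "amount": 7},
]
partitions = partition_by_month(rows)
partition_names = sorted(partitions)
```

A query for February then touches only its own partition, and an old month can be archived or dropped as a single unit.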
Clean and transform the data
Make sure data is consistent within itself
Make sure that data is consistent with other data within the same source
Make sure that data is consistent with other data in other source systems
Make sure the data is consistent with the information already in the
warehouse
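
The four levels of consistency listed above can be sketched as one check per level. The order record layout, the reference set of known stores (standing in for another source system) and the set of already loaded order ids (standing in for the warehouse) are all hypothetical.

```python
from collections import Counter


def find_inconsistencies(rows, known_stores, loaded_order_ids):
    """Return (order_id, failed_check) pairs for every failed check."""
    problems = []
    id_counts = Counter(row["order_id"] for row in rows)
    for row in rows:
        oid = row["order_id"]
        # 1. Consistent within itself: amount equals quantity * unit price.
        if row["quantity"] * row["unit_price"] != row["amount"]:
            problems.append((oid, "self"))
        # 2. Consistent within the same source: order ids are unique.
        if id_counts[oid] > 1:
            problems.append((oid, "same_source"))
        # 3. Consistent with other sources: the store exists elsewhere.
        if row["store_id"] not in known_stores:
            problems.append((oid, "other_source"))
        # 4. Consistent with the warehouse: the order is not loaded already.
        if oid in loaded_order_ids:
            problems.append((oid, "warehouse"))
    return problems


# Assumed sample data; prices and amounts are integers (e.g. pence).
rows = [
    {"order_id": 1, "store_id": 10, "quantity": 2, "unit_price": 500, "amount": 1000},
    {"order_id": 2, "store_id": 10, "quantity": 1, "unit_price": 300, "amount": 350},
    {"order_id": 3, "store_id": 99, "quantity": 1, "unit_price": 100, "amount": 100},
    {"order_id": 4, "store_id": 10, "quantity": 1, "unit_price": 100, "amount": 100},
]
problems = find_inconsistencies(rows, known_stores={10}, loaded_order_ids={4})
```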
Transforming into effective structures
-Convert the source data in the temporary data store into a structure
that is designed to balance query performance and operational cost
3. Backup and Archive Process
Data in the data warehouse is backed up regularly to ensure that the DW can
always be recovered from data loss and from software and hardware failures
In archiving, older data is removed from the system in a format that
allows it to be quickly restored if required
It is common to archive the data as a flat file extract, where the file is in a
format that allows the data to be fast-loaded directly into the relevant fact
and dimension tables
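
A minimal sketch of the flat file approach, assuming a single sales_fact table and CSV as the archive format (both are illustrative choices, not something the method prescribes):

```python
import csv
import io
import sqlite3


def archive_before(conn, cutoff, out):
    """Write rows older than cutoff to a CSV flat file, then delete them."""
    rows = conn.execute(
        "SELECT store_id, sale_date, amount FROM sales_fact "
        "WHERE sale_date < ? ORDER BY sale_date",
        (cutoff,),
    ).fetchall()
    csv.writer(out).writerows(rows)
    conn.execute("DELETE FROM sales_fact WHERE sale_date < ?", (cutoff,))
    conn.commit()
    return len(rows)


def restore(conn, src):
    """Bulk-load an archived flat file straight back into the fact table."""
    rows = [(int(s), d, float(a)) for s, d, a in csv.reader(src)]
    conn.executemany("INSERT INTO sales_fact VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)


# Assumed sample data in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_fact (store_id INTEGER, sale_date TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales_fact VALUES (?, ?, ?)",
    [(1, "2023-12-30", 100.0), (1, "2024-01-02", 50.0), (2, "2024-01-03", 75.0)],
)

archive = io.StringIO(newline="")  # stands in for a file on archive media
archived = archive_before(conn, "2024-01-01", archive)
remaining = conn.execute("SELECT COUNT(*) FROM sales_fact").fetchone()[0]

archive.seek(0)
restored = restore(conn, archive)
total = conn.execute("SELECT COUNT(*) FROM sales_fact").fetchone()[0]
```

The archive keeps the same column order as the fact table, which is what lets the restore be a plain bulk insert with no transformation step.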
4. Query Management process
Manages the queries and speeds them up by directing each query to the
most effective data source
Ensures that system resources are used in an efficient way
Does not generally operate during the regular load of information
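
Directing a query to the most effective data source can be sketched as choosing the smallest table that still contains every dimension the query needs. The CATALOGUE of tables, their dimensions and their row counts is hypothetical.

```python
# Hypothetical catalogue: table -> (dimensions available, approximate rows).
CATALOGUE = {
    "sales_fact": ({"store", "product", "day"}, 10_000_000),
    "sales_by_store_day": ({"store", "day"}, 50_000),
    "sales_by_store": ({"store"}, 200),
}


def route(dimensions: set[str]) -> str:
    """Pick the smallest table that covers all requested dimensions."""
    candidates = [
        (row_count, name)
        for name, (dims, row_count) in CATALOGUE.items()
        if dimensions <= dims
    ]
    if not candidates:
        raise ValueError(f"no data source covers {sorted(dimensions)}")
    return min(candidates)[1]


store_only = route({"store"})
store_and_day = route({"store", "day"})
by_product = route({"product"})
```

Only queries that need detail absent from every summary fall through to the full fact table.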
