Testing 2
Characteristics of a Data Warehouse:
• Subject-oriented
• Integrated
• Non-volatile
• Time-variant
Data Warehouse – the ETL process
[Diagram: source systems (Legacy DB, CRM/ERP DB, Finance DB) are Extracted, Transformed, and Loaded into the data warehouse.]
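The extract-transform-load flow above can be sketched as a minimal script. The table and column names here are hypothetical, and real pipelines would use an ETL tool rather than hand-written Python; this only illustrates the three stages.

```python
import sqlite3

# Hypothetical example: extract rows from a "legacy" source table,
# transform them (trim names, uppercase country codes), and load
# them into a warehouse staging table.
src = sqlite3.connect(":memory:")
src.execute("CREATE TABLE customers (id INTEGER, name TEXT, country TEXT)")
src.executemany("INSERT INTO customers VALUES (?, ?, ?)",
                [(1, "  Alice ", "us"), (2, "Bob", "de")])

dw = sqlite3.connect(":memory:")
dw.execute("CREATE TABLE stg_customers (id INTEGER, name TEXT, country TEXT)")

# Extract
rows = src.execute("SELECT id, name, country FROM customers").fetchall()
# Transform
rows = [(i, name.strip(), country.upper()) for (i, name, country) in rows]
# Load
dw.executemany("INSERT INTO stg_customers VALUES (?, ?, ?)", rows)

print(dw.execute("SELECT * FROM stg_customers").fetchall())
# → [(1, 'Alice', 'US'), (2, 'Bob', 'DE')]
```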
Testing a DW – Resources Involved
Resources involved: tools and tasks (shown along a timeline)
• Review mapping docs
• Write SQL in your favorite editor
• Run tests
• Dump results to a file
• Compare results manually or with a compare tool
• Report defects and issues
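The "dump results and compare" step can be done with the standard library; a minimal sketch using difflib on two hypothetical result dumps:

```python
import difflib

# Hypothetical dumps of source and target query results, one row per line.
src_dump = ["1|Alice", "2|Bob", "3|Carol"]
tgt_dump = ["1|Alice", "2|Bobby", "3|Carol"]

# Keep only added/removed lines from the unified diff (skip file headers).
diff = [line for line in difflib.unified_diff(src_dump, tgt_dump, lineterm="")
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))]
print(diff)   # ['-2|Bob', '+2|Bobby']
```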
Unit testing involves the following:
• Check the mapping of fields present at the staging level.
• Check for duplication of values generated by the sequence generator.
• Check the correctness of surrogate keys, which uniquely identify rows in the database.
• Check the data type constraints of the fields present in the staging and core levels.
• Check that string columns are left- and right-trimmed.
• Etc.
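Several of these unit checks can be automated with plain SQL. A minimal sketch using Python's built-in sqlite3 and a hypothetical staging table, covering the duplicate-key and trim checks:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE stg_orders (order_sk INTEGER, customer TEXT)")
con.executemany("INSERT INTO stg_orders VALUES (?, ?)",
                [(1, "Alice"), (2, " Bob"), (2, "Carol")])

# Surrogate-key duplication check: does any key appear more than once?
dupes = con.execute(
    "SELECT order_sk, COUNT(*) FROM stg_orders "
    "GROUP BY order_sk HAVING COUNT(*) > 1").fetchall()
print("duplicate keys:", dupes)          # [(2, 2)]

# Trim check: rows where a string column still has leading/trailing spaces.
untrimmed = con.execute(
    "SELECT order_sk FROM stg_orders WHERE customer <> TRIM(customer)"
).fetchall()
print("untrimmed rows:", untrimmed)      # [(2,)]
```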
Database Testing vs. Data Warehouse Testing
• Database testing: smaller in scale.
• Data warehouse testing: large scale, voluminous data.
Ø Structure validation
1. Validate the source and target table structures against the corresponding mapping doc.
2. The source data type and target data type should be the same.
3. The lengths of data types in both source and target should be equal.
4. Verify that data field types and formats are specified.
5. The source data type length should not be less than the target data type length.
6. Validate the column names in the table against the mapping doc.
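Steps 1 and 6 can be scripted once the mapping doc is available in machine-readable form. A sketch assuming the expected columns and types have been loaded into a hypothetical Python dict:

```python
import sqlite3

# Hypothetical mapping doc: expected target columns and their types.
mapping_doc = {"id": "INTEGER", "name": "TEXT", "amount": "REAL"}

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE tgt_sales (id INTEGER, name TEXT, amount REAL)")

# PRAGMA table_info returns (cid, name, type, notnull, dflt_value, pk).
actual = {row[1]: row[2] for row in
          con.execute("PRAGMA table_info(tgt_sales)")}

missing = set(mapping_doc) - set(actual)
type_mismatch = {c for c in mapping_doc
                 if c in actual and actual[c] != mapping_doc[c]}
print("missing columns:", missing)        # set()
print("type mismatches:", type_mismatch)  # set()
```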
Ø Constraint validation
Ensure that the constraints defined for each table are as expected.
Ø Duplicate check
1. Validate that the unique key, the primary key, and any other columns that should be unique per the business requirements do not contain duplicate rows.
2. Check whether duplicate values exist in any column that is built by extracting from multiple source columns and combining them into one.
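Check 2 is a GROUP BY ... HAVING query on the combined column. A sketch with a hypothetical target table whose phone column was combined from two source columns (STD code and number):

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Hypothetical target where phone = STD code + number, combined from
# two source columns into one.
con.execute("CREATE TABLE tgt_contacts (cust_id INTEGER, phone TEXT)")
con.executemany("INSERT INTO tgt_contacts VALUES (?, ?)",
                [(1, "040-1234"), (2, "040-1234"), (3, "011-9876")])

# Duplicate check on the combined column.
dupes = con.execute(
    "SELECT phone, COUNT(*) FROM tgt_contacts "
    "GROUP BY phone HAVING COUNT(*) > 1").fetchall()
print(dupes)   # [('040-1234', 2)]
```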
Complete Data Validation
(using minus and intersect)
1. To validate the complete data set between the source and target tables, a minus query is the best solution.
2. We need to run both source-minus-target and target-minus-source.
3. If the minus queries return any rows, those should be considered mismatching rows.
4. We also need to find the matching rows between source and target using an intersect statement.
5. The count returned by intersect should match the individual counts of the source and target tables.
6. If the minus queries return 0 rows but the intersect count is less than the source count or the target count, we can conclude that duplicate rows exist.
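The minus/intersect technique can be run end to end in SQLite, which spells minus as EXCEPT. A sketch with hypothetical source and target tables containing one mismatching row:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE src (id INTEGER, val TEXT)")
con.execute("CREATE TABLE tgt (id INTEGER, val TEXT)")
con.executemany("INSERT INTO src VALUES (?, ?)", [(1, "a"), (2, "b"), (3, "c")])
con.executemany("INSERT INTO tgt VALUES (?, ?)", [(1, "a"), (2, "b"), (3, "x")])

# Source MINUS target (EXCEPT in SQLite) and the reverse.
s_minus_t = con.execute("SELECT * FROM src EXCEPT SELECT * FROM tgt").fetchall()
t_minus_s = con.execute("SELECT * FROM tgt EXCEPT SELECT * FROM src").fetchall()
print(s_minus_t)   # [(3, 'c')]  -> mismatching rows
print(t_minus_s)   # [(3, 'x')]

# Matching rows via INTERSECT; compare its count to the table counts.
matches = con.execute(
    "SELECT COUNT(*) FROM (SELECT * FROM src INTERSECT SELECT * FROM tgt)"
).fetchone()[0]
print(matches)     # 2
```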
Some useful test scenarios
1. Verify that the extraction process did not extract duplicate data from the source. (This usually matters in repeatable processes where at point zero we need to extract all data from the source file, but during subsequent intervals we only need to capture the modified and new rows.)
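One common way to implement (and test) this incremental capture is a high-water mark on a last-modified column. A sketch with a hypothetical modified_at column, showing that the second extract picks up only the new row:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE src (id INTEGER, val TEXT, modified_at TEXT)")
con.executemany("INSERT INTO src VALUES (?, ?, ?)",
                [(1, "a", "2024-01-01"), (2, "b", "2024-01-05")])

# Point zero: full extract; remember the high-water mark.
full = con.execute("SELECT id FROM src").fetchall()
watermark = con.execute("SELECT MAX(modified_at) FROM src").fetchone()[0]

# Next interval: only new/modified rows should be extracted.
con.execute("INSERT INTO src VALUES (3, 'c', '2024-01-09')")
delta = con.execute(
    "SELECT id FROM src WHERE modified_at > ?", (watermark,)).fetchall()
print(full)    # [(1,), (2,)]
print(delta)   # [(3,)]  -- no re-extracted duplicates
```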
Data cleanness
Unnecessary columns should be deleted before loading into the staging area.
Example 1: Suppose the telephone number and STD code are in different columns, and the requirement says they should be in one column. With the help of an expression transformation, we concatenate the values into one column.
Example 2: If a column holds a name but carries extra spaces, we have to trim the spaces before loading into the staging area. With the help of an expression transformation, the spaces are trimmed.
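Both examples are simple expression transformations. Sketched here in plain Python rather than an ETL tool's expression editor, with hypothetical column names:

```python
# Hypothetical source rows: (name, std_code, phone_number)
rows = [("  Alice ", "040", "1234567"), ("Bob", "011", "9876543")]

cleaned = []
for name, std, number in rows:
    cleaned.append((
        name.strip(),          # Example 2: trim extra spaces
        f"{std}-{number}",     # Example 1: concatenate STD code + number
    ))
print(cleaned)
# [('Alice', '040-1234567'), ('Bob', '011-9876543')]
```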