SCD Type 2: Slowly Changing Dimension Type 2 Is A Model Where The Whole History Is Stored in The

Download as docx, pdf, or txt
Download as docx, pdf, or txt
You are on page 1of 4
At a glance
Powered by AI
The document describes a Datastage job that implements Slowly Changing Dimension Type 2 (SCD Type 2) on a customer dimension table.

SCD Type 2 is a model where the full history of a dimension attribute is stored. A new record is created for an attribute change and the old and new records can be distinguished using fields like effective date and current indicator.

The job reads new customer data from a file, looks up existing customers to identify changes, updates the current flag on old records and inserts new records with new attribute values, effective date and version number.

SCD Type 2

Slowly changing dimension Type 2 is a model where the whole history is stored in the database. An additional dimension record is created and the segmenting between the old record values and the new (current) value is easy to extract and the history is clear. The fields 'effective date' and 'current indicator' are very often used in that dimension and the fact table usually stores dimension key and version number.

SCD 2 implementation in Datastage


The job described and depicted below shows how to implement SCD Type 2 in Datastage. It is one of many possible designs which can implement this dimension. For this example, we will use a table with customers data (it's name is D_CUSTOMER_SCD2) which has the following structure and data:

D_CUSTOMER dimension table before loading CUST_ID CUST_ NAME CUST_ CUST_ CUST_ REC_ GROUP_ID TYPE_ID COUNTRY_ID VERSION S C S C C C S S S S PL FI CD IT SK US RU PL FI GB REC_ EFFDT REC_ CURRENT_IND

DRBOUA7 Dream Basket EL ETIMAA5 ETL tools info BI FAMMFA0 Fajatso FICILA0 First Pactonic FRDXXA2 Frasir GAMOPA9 Ganpa LTD. GLMFIA6 Glasithklini FD FD EL FD FD TC BN

1 2006-10-01 Y 1 2006-09-29 Y 1 2006-09-27 Y 1 2006-09-25 Y 1 2006-09-23 Y 1 2006-09-21 Y 1 2006-09-19 Y 1 2006-09-17 Y 1 2006-09-15 Y 1 2006-09-13 Y

GGMOPA9 GG electronics EL GLMPEA9 Globiteleco GONDWA5 Goli Airlines

Datastage SCD2 job design

The most important facts and stages of the CUST_SCD2 job processing: The dimension table with customers is refreshed daily and one of the data sources is a text file. For the purpose of this example the CUST_ID=ETIMAA5 differs from the one stored in the database and it is the only record with changed data. It has the following structure and data: SCD 2 - Customers file extract:

There is a hashed file (Hash_NewCust) which handles a lookup of the new data coming from the text file. A T001_Lookups transformer does a lookup into a hashed file and maps new and old values to separate columns. SCD 2 lookup transformer

A T002_Check_Discrepacies_exist transformer compares old and new values of records and passes through only records that differ. SCD 2 check discrepancies transformer

A T003 transformer handles the UPDATE and INSERT actions of a record. The old record is updated with current indictator flag set to no and the new record is inserted with current indictator flag set to yes, increased record version by 1 and the current date. SCD 2 insert-update record transformer

ODBC Update stage (O_DW_Customers_SCD2_Upd) - update action 'Update existing rows only' and the selected key columns are CUST_ID and REC_VERSION so they will appear in the constructed where part of an SQL statement. ODBC Insert stage (O_DW_Customers_SCD2_Ins) - insert action 'insert rows without clearing' and the key column is CUST_ID.
D_CUSTOMER dimension table after Datawarehouse refresh CUST_ID CUST_ NAME CUST_ CUST_ CUST_ REC_ GROUP_ID TYPE_ID COUNTRY_ID VERSION EL FD FD EL FD EL FD TC BN S C S C C C S S S S C PL FI CD IT SK US RU PL FI GB ES REC_ EFFDT REC_ CURRENT_IND

DRBOUA7 Dream Basket FAMMFA0 Fajatso FICILA0 First Pactonic FRDXXA2 Frasir GAMOPA9 Ganpa LTD. GGMOPA9 GG electronics GLMFIA6 Glasithklini GLMPEA9 Globiteleco GONDWA5 Goli Airlines

1 2006-10-01 Y 1 2006-09-29 N 1 2006-09-27 Y 1 2006-09-25 Y 1 2006-09-23 Y 1 2006-09-21 Y 1 2006-09-19 Y 1 2006-09-17 Y 1 2006-09-15 Y 1 2006-09-13 Y 2 2006-12-02 Y

ETIMAA5 ETL tools info BI

ETIMAA5 ETL-Tools.info BI

You might also like