DataStage FAQ

DATASTAGE FAQ’s & TUTORIAL’s TOPIC INDEX
1. DATASTAGE QUESTIONS.............................................................................2
2. DATASTAGE FAQ from GEEK INTERVIEW QUESTIONS......................13
3. DATASTAGE FAQ.........................................................................................26
4. TOP 10 FEATURES IN DATASTAGE HAWK.............................................30
5. DATASTAGE NOTES....................................................................................32
6. DATASTAGE TUTORIAL.............................................................................43
............................................................................................................................................44
About DataStage.................................................................................................................44
Client Components.............................................................................................................44
DataStage Designer. ..........................................................................................................44
DataStage Director.............................................................................................................44
DataStage Manager............................................................................................................44
DataStage Administrator....................................................................................................44
DataStage Manager Roles..................................................................................................44
Server Components............................................................................................................45
DataStage Features.............................................................................................................45
Types of jobs......................................................................................................................45
DataStage NLS...................................................................................................................46
JOB.....................................................................................................................................46
Built-In Stages – Server Jobs.............................................................................................46
Aggregator. ........................................................................................................................46
Hashed File. .......................................................................................................................46
UniVerse. ...........................................................................................................................47
UniData..............................................................................................................................47
ODBC.................................................................................................................................47
Sequential File. ..................................................................................................................47
Folder Stage. ......................................................................................................................48
Transformer........................................................................................................................48
Container............................................................................................................................48
IPC Stage............................................................................................................................48
Link Collector Stage...........................................................................................................49
Link Partitioner Stage.........................................................................................................49
Server Job Properties..........................................................................................................49
Containers...........................................................................................................................50
Local containers. ...............................................................................................................50
Shared containers. .............................................................................................................50
Job Sequences.....................................................................................................................50
7. LEARN FEATURES OF DATASTAGE.........................................................51
8. INFORMATICA vs DATASTAGE:................................................................93
☻Page 1 of 243☻
9. BEFORE YOU DESIGN YOUR APPLICATION........................................104
10. DATASTAGE 7.5x1 GUI FEATURES.........................................................111
11. DATASTAGE & DWH INTERVIEW QUESTIONS...................................115
12. DATASTAGE ROUTINES............................................................................129
13. SET_JOB_PARAMETERS_ROUTINE........................................................197
DATASTAGE QUESTIONS
1. What is the flow of loading data into fact & dimensional tables?
A) Fact table - Table with Collection of Foreign Keys corresponding to the Primary Keys
in Dimensional table. Consists of fields with numeric values.
Dimension table - Table with Unique Primary Key.
Load - Data should be first loaded into dimensional table. Based on the primary key
values in dimensional table, the data should be loaded into Fact table.
2. What is the default cache size? How do you change the cache size if needed?
A. Default cache size is 256 MB. We can increase it by going into Datastage
Administrator and selecting the Tunable Tab and specify the cache size over there.
3. What are types of Hashed File?

A) Hashed File is classified broadly into 2 types.
a) Static - Sub divided into 17 types based on Primary Key Pattern.
b) Dynamic - sub divided into 2 types
i) Generic ii) Specific.
☻Page 2 of 243☻
Dynamic files do not perform as well as a well, designed static file, but do perform better
than a badly designed one. When creating a dynamic file you can specify the following
Although all of these have default values)
By Default Hashed file is "Dynamic - Type Random 30 D"
4. What does a Config File in parallel extender consist of?

A) Config file consists of the following.
a) Number of Processes or Nodes.
b) Actual Disk Storage Location.
5. What is Modulus and Splitting in Dynamic Hashed File?

A. In a Hashed File, the size of the file keeps changing randomly.
If the size of the file increases it is called as "Modulus".
If the size of the file decreases it is called as "Splitting".
6. What are Stage Variables, Derivations and Constants?

A. Stage Variable - An intermediate processing variable that retains value during read
and doesn’t pass the value into target column.
Derivation - Expression that specifies value to be passed on to the target column.
Constant - Conditions that are either true or false that specifies flow of data with a link.
7. Types of views in Datastage Director?

There are 3 types of views in Datastage Director
a) Job View - Dates of Jobs Compiled.
b) Log View - Status of Job last run
c) Status View - Warning Messages, Event Messages, Program Generated Messages.
8. Types of Parallel Processing?

A) Parallel Processing is broadly classified into 2 types.
a) SMP - Symmetrical Multi Processing.
b) MPP - Massive Parallel Processing.
9. Orchestrate Vs Datastage Parallel Extender?

A) Orchestrate itself is an ETL tool with extensive parallel processing capabilities and
running on UNIX platform. Datastage used Orchestrate with Datastage XE (Beta version
of 6.0) to incorporate the parallel processing capabilities. Now Datastage has purchased
Orchestrate and integrated it with Datastage XE and released a new version Datastage 6.0
i.e Parallel Extender.
10. Importance of Surrogate Key in Data warehousing?

A) Surrogate Key is a Primary Key for a Dimension table. Most importance of using it is
it is independent of underlying database. i.e. Surrogate Key is not affected by the changes
going on with a database.
11. How to run a Shell Script within the scope of a Data stage job?
A) By using "ExcecSH" command at Before/After job properties.
☻Page 3 of 243☻
12. How to handle Date conversions in Datastage? Convert a mm/dd/yyyy format to
yyyy-dd-mm?
A) We use a) "Iconv" function - Internal Conversion.
b) "Oconv" function - External Conversion.
Function to convert mm/dd/yyyy format to yyyy-dd-mm is

Oconv(Iconv(Filedname,"D/MDY[2,2,4]"),"D-MDY[2,2,4]")
13 How do you execute datastage job from command line prompt?

A) Using "dsjob" command as follows.
dsjob -run -jobstatus projectname jobname
14. Functionality of Link Partitioner and Link Collector?

Link Partitioner: It actually splits data into various partitions or data flows using various
partition methods.
Link Collector: It collects the data coming from partitions, merges it into a single data
flow and loads to target.
15. Types of Dimensional Modeling?

A) Dimensional modeling is again sub divided into 2 types.
a) Star Schema - Simple & Much Faster. Denormalized form.
b) Snowflake Schema - Complex with more Granularity. More normalized form.
16. Differentiate Primary Key and Partition Key?

Primary Key is a combination of unique and not null. It can be a collection of key values
called as composite primary key. Partition Key is a just a part of Primary Key. There are
several methods of partition like Hash, DB2, and Random etc. While using Hash partition
we specify the Partition Key.
17. Differentiate Database data and Data warehouse data?

A) Data in a Database is
a) Detailed or Transactional
b) Both Readable and Writable.
c) Current.
18. Containers Usage and Types?

Container is a collection of stages used for the purpose of Reusability.
There are 2 types of Containers.
a) Local Container: Job Specific
b) Shared Container: Used in any job within a project.
19. Compare and Contrast ODBC and Plug-In stages?

ODBC: a) Poor Performance.
b) Can be used for Variety of Databases.
c) Can handle Stored Procedures.
☻Page 4 of 243☻
Plug-In: a) Good Performance.
b) Database specific. (Only one database)
c) Cannot handle Stored Procedures.
20. Dimension Modelling types along with their significance

Data Modelling is Broadly classified into 2 types.
a) E-R Diagrams (Entity - Relatioships).
b) Dimensional Modelling.
Q 21 What are Ascential Dastastage Products, Connectivity

Ans:
Ascential Products
Ascential DataStage
Ascential DataStage EE (3)
Ascential DataStage EE MVS
Ascential DataStage TX
Ascential QualityStage
Ascential MetaStage
Ascential RTI (2)
Ascential ProfileStage
Ascential AuditStage
Ascential Commerce Manager
Industry Solutions
Connectivity
Files
RDBMS
Real-time
PACKs
EDI
Other
Q 22 Explain Data Stage Architecture?

Data Stage contains two components,
Client Component.
Server Component.
Client Component:
 Data Stage Administrator.
 Data Stage Manager
 Data Stage Designer
 Data Stage Director
Server Components:
 Data Stage Engine
 Meta Data Repository
☻Page 5 of 243☻
 Package Installer
Data Stage Administrator:

Used to create the project.
Contains set of properties
We can set the buffer size (by default 128 MB)

We can increase the buffer size.
We can set the Environment Variables.
In tunable we have in process and inter-process
In-process—Data read in sequentially
Inter-process— It reads the data as it comes.
It just interfaces to metadata.
Data Stage Manager:

We can view and edit the Meta data Repository.
We can import table definitions.
We can export the Data stage components in .xml or .dsx format.
We can create routines and transforms
We can compile the multiple jobs.
☻Page 6 of 243☻
Data Stage Designer:
We can create the jobs. We can compile the job. We can run the job. We can
declare stage variable in transform, we can call routines, transform, macros, functions.
We can write constraints.
Data Stage Director:

We can run the jobs.
We can schedule the jobs. (Schedule can be done daily, weekly, monthly, quarterly)
We can monitor the jobs.
We can release the jobs.
Q 23 What is Meta Data Repository?

Meta Data is a data about the data.
It also contains
 Query statistics
 ETL statistics
 Business subject area
 Source Information
 Target Information
 Source to Target mapping Information.
Q 24 What is Data Stage Engine?
☻Page 7 of 243☻
It is a JAVA engine running at the background.
Q 25 What is Dimensional Modeling?

Dimensional Modeling is a logical design technique that seeks to present the data
in a standard framework that is, intuitive and allows for high performance access.
Q 26 What is Star Schema?

Star Schema is a de-normalized multi-dimensional model. It contains centralized fact
tables surrounded by dimensions table.
Dimension Table: It contains a primary key and description about the fact table.
Fact Table: It contains foreign keys to the dimension tables, measures and aggregates.
Q 27 What is surrogate Key?

It is a 4-byte integer which replaces the transaction / business / OLTP key in the
dimension table. We can store up to 2 billion record.
Q 28 Why we need surrogate key?

It is used for integrating the data may help better for primary key.
Index maintenance, joins, table size, key updates, disconnected inserts and
partitioning.
Q 29 What is Snowflake schema?

It is partially normalized dimensional model in which at two represents least one
dimension or more hierarchy related tables.
Q 30 Explain Types of Fact Tables?

Factless Fact: It contains only foreign keys to the dimension tables.
Additive Fact: Measures can be added across any dimensions.
Semi-Additive: Measures can be added across some dimensions. Eg, % age, discount
Non-Additive: Measures cannot be added across any dimensions. Eg, Average
Conformed Fact: The equation or the measures of the two fact tables are the same under
the facts are measured across the dimensions with a same set of measures.
Q 31 Explain the Types of Dimension Tables?

Conformed Dimension: If a dimension table is connected to more than one fact table,
the granularity that is defined in the dimension table is common across between the fact
tables.
Junk Dimension: The Dimension table, which contains only flags.
Monster Dimension: If rapidly changes in Dimension are known as Monster Dimension.
De-generative Dimension: It is line item-oriented fact table design.
Q 32 What are stage variables?

Stage variables are declaratives in Transformer Stage used to store values. Stage variables
are active at the run time. (Because memory is allocated at the run time).
Q 33 What is sequencer?
It sets the sequence of execution of server jobs.
☻Page 8 of 243☻
Q 34 What are Active and Passive stages?
Active Stage: Active stage model the flow of data and provide mechanisms for combining
data streams, aggregating data and converting data from one data type to another. Eg,
Transformer, aggregator, sort, Row Merger etc.
Passive Stage: A Passive stage handles access to Database for the extraction or writing of
data. Eg, IPC stage, File types, Universe, Unidata, DRS stage etc.
Q 35 What is ODS?
Operational Data Store is a staging area where data can be rolled back.
Q 36 What are Macros?

They are built from Data Stage functions and do not require arguments.
A number of macros are provided in the JOBCONTROL.H file to facilitate getting
information about the current job, and links and stages belonging to the current job. These
can be used in expressions (for example for use in Transformer stages), job control
routines, filenames and table names, and before/after subroutines.
These macros provide the functionality of using the DSGetProjectInfo, DSGetJobInfo,

DSGetStageInfo, and DSGetLinkInfo functions with the DSJ.ME token as the JobHandle
and can be used in all active stages and before/after subroutines. The macros provide the
functionality for all the possible InfoType arguments for the DSGet…Info functions. See
the Function call help topics for more details.
The available macros are: DSHostName

DSProjectName
DSJobStatus
DSJobName
DSJobController
DSJobStartDate
DSJobStartTime
DSJobStartTimestamp
DSJobWaveNo
DSJobInvocations
DSJobInvocationId
DSStageName
DSStageLastErr
DSStageType
DSStageInRowNum
DSStageVarList
DSLinkRowCount
DSLinkLastErr
DSLinkName
1) Examples
2) To obtain the name of the current job:
☻Page 9 of 243☻
3) MyName = DSJobName
To obtain the full current stage name:
MyName = DSJobName : ″ .″ : DSStageName
Q 37 What is keyMgtGetNextValue?
It is a Built-in transform it generates Sequential numbers. Its input type is literal string &
output type is string.
Q 38 What are stages?

The stages are either passive or active stages.
Passive stages handle access to databases for extracting or writing data. Active
stages model the flow of data and provide mechanisms for combining data streams,
aggregating data, and converting data from one data type to another.
Q 39 What index is created on Data Warehouse?

Bitmap index is created in Data Warehouse.
Q 40 What is container?
A container is a group of stages and links. Containers enable you to simplify and
modularize your server job designs by replacing complex areas of the diagram with a
single container stage. You can also use shared containers as a way of incorporating server
job functionality into parallel jobs.
DataStage provides two types of container:
• Local containers. These are created within a job and are only accessible by that
job. A local container is edited in a tabbed page of the job’s Diagram window.
• Shared containers. These are created separately and are stored in the Repository
in the same way that jobs are. There are two types of shared container
Q 41 What is function? ( Job Control – Examples of Transform Functions )
Functions take arguments and return a value.
 BASIC functions: A function performs mathematical or string manipulations on
the arguments supplied to it, and return a value. Some functions have 0 arguments;
most have 1 or more. Arguments are always in parentheses, separated by commas,
as shown in this general syntax:
FunctionName (argument, argument)
 DataStage BASIC functions: These functions can be used in a job control
routine, which is defined as part of a job’s properties and allows other jobs to be
run and controlled from the first job. Some of the functions can also be used for
getting status information on the current job; these are useful in active stage
expressions and before- and after-stage subroutines.
To do this ... Use this function ...

Specify the job you want to control DSAttachJob
Set parameters for the job you want to control DSSetParam
☻Page 10 of 243☻
Set limits for the job you want to control DSSetJobLimit
Request that a job is run DSRunJob
Wait for a called job to finish DSWaitForJob
Gets the meta data details for the specified link DSGetLinkMetaData
Get information about the current project DSGetProjectInfo
Get buffer size and timeout value for an IPC or Web Service DSGetIPCStageProps
stage
Get information about the controlled job or current job DSGetJobInfo
Get information about the meta bag properties associated with DSGetJobMetaBag
the named job
Get information about a stage in the controlled job or current DSGetStageInfo
job
Get the names of the links attached to the specified stage DSGetStageLinks
Get a list of stages of a particular type in a job. DSGetStagesOfType
Get information about the types of stage in a job. DSGetStageTypes
Get information about a link in a controlled job or current job DSGetLinkInfo
Get information about a controlled job’s parameters DSGetParamInfo
Get the log event from the job log DSGetLogEntry

Get a number of log events on the specified subject from the DSGetLogSummary
job log
Get the newest log event, of a specified type, from the job log DSGetNewestLogId
Log an event to the job log of a different job DSLogEvent

Stop a controlled job DSStopJob
Return a job handle previously obtained from DSAttachJob DSDetachJob
Log a fatal error message in a job's log file and aborts the job. DSLogFatal
Log an information message in a job's log file. DSLogInfo
Put an info message in the job log of a job controlling current DSLogToController
job.
Log a warning message in a job's log file. DSLogWarn
Generate a string describing the complete status of a valid DSMakeJobReport
attached job.
Insert arguments into the message template. DSMakeMsg
Ensure a job is in the correct state to be run or validated. DSPrepareJob
Interface to system send mail facility. DSSendMail
☻Page 11 of 243☻
Log a warning message to a job log file. DSTransformError
Convert a job control status or error code into an explanatory DSTranslateCode
text message.
Suspend a job until a named file either exists or does not exist. DSWaitForFile
Checks if a BASIC routine is cataloged, either in VOC as a DSCheckRoutine

callable item, or in the catalog space.
Execute a DOS or Data Stage Engine command from a DSExecute
before/after subroutine.
Set a status message for a job to return as a termination DSSetUserStatus
message when it finishes
Q 42 What is Routines?
Routines are stored in the Routines branch of the Data Stage Repository, where you can
create, view or edit. The following programming components are classified as routines:
Transform functions, Before/After subroutines, Custom UniVerse functions, ActiveX
(OLE) functions, Web Service routines
Q 43 What is data stage Transform?

Q 44 What is Meta Brokers?
Q 45 What is usage analysis?
Q 46 What is job sequencer?
Q 47 What are different activities in job sequencer?
Q 48 What are triggers in data Stages? (conditional, unconditional, otherwise)
Q 49 Are u generated job Reports? S
Q 50 What is plug-in?
Q 51 Have u created any custom transform? Explain? (Oconv)
☻Page 12 of 243☻
DATASTAGE FAQ from GEEK INTERVIEW QUESTIONS
Question: Dimension Modeling types along with their significance
Answer:
Data Modelling is broadly classified into 2 types.
A) E-R Diagrams (Entity - Relatioships).
B) Dimensional Modelling.
Question: Dimensional modelling is again sub divided into 2 types.
Answer:
A) Star Schema - Simple & Much Faster. Denormalized form.
B) Snowflake Schema - Complex with more Granularity. More normalized form.
Question: Importance of Surrogate Key in Data warehousing?
Answer:
Surrogate Key is a Primary Key for a Dimension table. Most importance of using it is, it is
independent of underlying database, i.e. Surrogate Key is not affected by the changes
going on with a database.
Question: Differentiate Database data and Data warehouse data?
Answer:
Data in a Database is
A) Detailed or Transactional
B) Both Readable and Writable.
C) Current.
Question: What is the flow of loading data into fact & dimensional tables?
Answer:
☻Page 13 of 243☻
Fact table - Table with Collection of Foreign Keys corresponding to the Primary Keys in
Dimensional table. Consists of fields with numeric values.
Dimension table - Table with Unique Primary Key.
Load - Data should be first loaded into dimensional table. Based on the primary key
values in dimensional table, then data should be loaded into Fact table.
Question: Orchestrate Vs Datastage Parallel Extender?
Answer:
Orchestrate itself is an ETL tool with extensive parallel processing capabilities and
running on UNIX platform. Datastage used Orchestrate with Datastage XE (Beta version
of 6.0) to incorporate the parallel processing capabilities. Now Datastage has purchased
Orchestrate and integrated it with Datastage XE and released a new version Datastage 6.0
i.e. Parallel Extender.
Question: Differentiate Primary Key and Partition Key?
Answer:
Primary Key is a combination of unique and not null. It can be a collection of key values
called as composite primary key. Partition Key is a just a part of Primary Key. There are
several methods of partition like Hash, DB2, Random etc...While using Hash partition we
specify the Partition Key.
Question: What are Stage Variables, Derivations and Constants?

Answer:
Stage Variable - An intermediate processing variable that retains value during read and
doesn’t pass the value into target column.
Constraint - Conditions that are either true or false that specifies flow of data with a link.
Derivation - Expression that specifies value to be passed on to the target column.
Question: What is the default cache size? How do you change the cache size if
needed?
Answer:
Default cache size is 256 MB. We can increase it by going into Datastage Administrator
and selecting the Tunable Tab and specify the cache size over there.
Question: What is Hash file stage and what is it used for?

Answer:
Used for Look-ups. It is like a reference table. It is also used in-place of ODBC, OCI
tables for better performance.
Question: What are types of Hashed File?

Answer:
Hashed File is classified broadly into 2 types.
A) Static - Sub divided into 17 types based on Primary Key Pattern.
B) Dynamic - sub divided into 2 types
i) Generic
ii) Specific
Default Hased file is "Dynamic - Type Random 30 D"
☻Page 14 of 243☻
Question: What are Static Hash files and Dynamic Hash files?
Answer:
As the names itself suggest what they mean. In general we use Type-30 dynamic Hash
files. The Data file has a default size of 2GB and the overflow file is used if the data
exceeds the 2GB size.
Question: What is the Usage of Containers? What are its types?

Answer:
Container is a collection of stages used for the purpose of Reusability.
There are 2 types of Containers.
A) Local Container: Job Specific
B) Shared Container: Used in any job within a project.
Question: Compare and Contrast ODBC and Plug-In stages?

Answer:
ODBC PLUG-IN
Poor Performance Good Performance
Can be used for Variety of Databases Database Specific (only one database)
Can handle Stored Procedures Cannot handle Stored Procedures
Question: How do you execute datastage job from command line prompt?
Answer:
Using "dsjob" command as follows.
dsjob -run -jobstatus projectname jobname
Question: What are the command line functions that import and export the DS jobs?
Answer:
 dsimport.exe - imports the DataStage components.
 dsexport.exe - exports the DataStage components.
Question: How to run a Shell Script within the scope of a Data stage job?
Answer:
By using "ExcecSH" command at Before/After job properties.
Question: What are OConv () and Iconv () functions and where are they used?
Answer:
IConv() - Converts a string to an internal storage format
OConv() - Converts an expression to an output format.
Question: How to handle Date convertions in Datastage? Convert mm/dd/yyyy

format to yyyy-dd-mm?
Answer:
We use
a) "Iconv" function - Internal Convertion.
b) "Oconv" function - External Convertion.
Function to convert mm/dd/yyyy format to yyyy-dd-mm is
Oconv(Iconv(Filedname,"D/MDY[2,2,4]"),"D-MDY[2,2,4]")
☻Page 15 of 243☻
Question: Types of Parallel Processing?
Answer:
Parallel Processing is broadly classified into 2 types.
a) SMP - Symmetrical Multi Processing.
b) MPP - Massive Parallel Processing.
Question: What does a Config File in parallel extender consist of?

Answer:
Config file consists of the following.
a) Number of Processes or Nodes.
b) Actual Disk Storage Location.
Question: Functionality of Link Partitioner and Link Collector?

Answer:
Link Partitioner: It actually splits data into various partitions or data flows using various
Partition methods.
Link Collector: It collects the data coming from partitions, merges it into a single data
flow and loads to target.
Question: What is Modulus and Splitting in Dynamic Hashed File?

Answer:
In a Hashed File, the size of the file keeps changing randomly.
If the size of the file increases it is called as "Modulus".
If the size of the file decreases it is called as "Splitting".
Question: Types of views in Datastage Director?

Answer:
There are 3 types of views in Datastage Director
a) Job View - Dates of Jobs Compiled.
b) Log View - Status of Job last Run
c) Status View - Warning Messages, Event Messages, Program Generated Messages.
Question: Did you Parameterize the job or hard-coded the values in the jobs?
Answer:
Always parameterized the job. Either the values are coming from Job Properties or from a
‘Parameter Manager’ – a third part tool. There is no way you will hard–code some
parameters in your jobs. The often Parameterized variables in a job are: DB DSN name,
username, password, dates W.R.T for the data to be looked against at.
Question: Have you ever involved in updating the DS versions like DS 5.X, if so tell
us some the steps you have taken in doing so?
Answer:
Yes.
The following are some of the steps:
Definitely take a back up of the whole project(s) by exporting the project as a .dsx file
☻Page 16 of 243☻
See that you are using the same parent folder for the new version also for your old jobs
using the hard-coded file path to work.
After installing the new version import the old project(s) and you have to compile them all
again. You can use 'Compile All' tool for this.
Make sure that all your DB DSN's are created with the same name as old ones. This step is
for moving DS from one machine to another.
In case if you are just upgrading your DB from Oracle 8i to Oracle 9i there is tool on DS
CD that can do this for you.
Do not stop the 6.0 server before the upgrade, version 7.0 install process collects project
information during the upgrade. There is NO rework (recompilation of existing
jobs/routines) needed after the upgrade.
Question: How did you handle reject data?

Answer:
Typically a Reject-link is defined and the rejected data is loaded back into data warehouse.
So Reject link has to be defined every Output link you wish to collect rejected data.
Rejected data is typically bad data like duplicates of Primary keys or null-rows where data
is expected.
Question: What are other Performance tunings you have done in your last project to
increase the performance of slowly running jobs?
Answer:
 Staged the data coming from ODBC/OCI/DB2UDB stages or any database on the
server using Hash/Sequential files for optimum performance also for data recovery in
case job aborts.
 Tuned the OCI stage for 'Array Size' and 'Rows per Transaction' numerical values
for faster inserts, updates and selects.
 Tuned the 'Project Tunables' in Administrator for better performance.
 Used sorted data for Aggregator.
 Sorted the data as much as possible in DB and reduced the use of DS-Sort for
better performance of jobs.
 Removed the data not used from the source as early as possible in the job.
 Worked with DB-admin to create appropriate Indexes on tables for better
performance of DS queries.
 Converted some of the complex joins/business in DS to Stored Procedures on DS
for faster execution of the jobs.
 If an input file has an excessive number of rows and can be split-up then use
standard logic to run jobs in parallel.
 Before writing a routine or a transform, make sure that there is not the
functionality required in one of the standard routines supplied in the sdk or ds utilities
categories.
 Constraints are generally CPU intensive and take a significant amount of time to
process. This may be the case if the constraint calls routines or external macros but if it
is inline code then the overhead will be minimal.
 Try to have the constraints in the 'Selection' criteria of the jobs itself. This will
eliminate the unnecessary records even getting in before joins are made.
☻Page 17 of 243☻
 Tuning should occur on a job-by-job basis.
 Use the power of DBMS.
 Try not to use a sort stage when you can use an ORDER BY clause in the
database.
 Using a constraint to filter a record set is much slower than performing a SELECT
… WHERE….
 Make every attempt to use the bulk loader for your particular database. Bulk
loaders are generally faster than using ODBC or OLE.
Question: Tell me one situation from your last project, where you had faced problem
and How did u solve it?
Answer:
1. The jobs in which data is read directly from OCI stages are running extremely
slow. I had to stage the data before sending to the transformer to make the jobs run
faster.
2. The job aborts in the middle of loading some 500,000 rows. Have an option either
cleaning/deleting the loaded data and then run the fixed job or run the job again from
the row the job has aborted. To make sure the load is proper we opted the former.
Question: Tell me the environment in your last projects

Answer:
Give the OS of the Server and the OS of the Client of your recent most project
Question: How did u connect with DB2 in your last project?
Answer:
Most of the times the data was sent to us in the form of flat files. The data is dumped and
sent to us. In some cases were we need to connect to DB2 for look-ups as an instance then
we used ODBC drivers to connect to DB2 (or) DB2-UDB depending the situation and
availability. Certainly DB2-UDB is better in terms of performance as you know the native
drivers are always better than ODBC drivers. 'iSeries Access ODBC Driver 9.00.02.02' -
ODBC drivers to connect to AS400/DB2.
Question: What are Routines and where/how are they written and have you written
any routines before?
Answer:
Routines are stored in the Routines branch of the DataStage Repository, where you can
create, view or edit.
The following are different types of Routines:
1. Transform Functions
2. Before-After Job subroutines
3. Job Control Routines
Question: How did you handle an 'Aborted' sequencer?

Answer:
In almost all cases we have to delete the data inserted by this from DB manually and fix
the job and then run the job again.
Question: What are Sequencers?
☻Page 18 of 243☻
Answer:
Sequencers are job control programs that execute other jobs with preset Job parameters.
Question: Read the String functions in DS

Answer:
Functions like [] -> sub-string function and ':' -> concatenation operator
Syntax:
string [ [ start, ] length ]
string [ delimiter, instance, repeats ]
Question: What will you in a situation where somebody wants to send you a file and
use that file as an input or reference and then run job.
Answer:
• Under Windows: Use the 'WaitForFileActivity' under the Sequencers and then run
the job. May be you can schedule the sequencer around the time the file is expected to
arrive.
• Under UNIX: Poll for the file. Once the file has start the job or sequencer
depending on the file.
Question: What is the utility you use to schedule the jobs on a UNIX server other
than using Ascential Director?
Answer:
Use crontab utility along with dsexecute() function along with proper parameters passed.
Question: Did you work in UNIX environment?

Answer:
Yes. One of the most important requirements.
Question: How would call an external Java function which are not supported by
DataStage?
Answer:
Starting from DS 6.0 we have the ability to call external Java functions using a Java
package from Ascential. In this case we can even use the command line to invoke the Java
function and write the return values from the Java program (if any) and use that files as a
source in DataStage job.
Question: How will you determine the sequence of jobs to load into data warehouse?
Answer:
First we execute the jobs that load the data into Dimension tables, then Fact tables, then
load the Aggregator tables (if any).
Question: The above might raise another question: Why do we have to load the
dimensional tables first, then fact tables:
Answer:
As we load the dimensional tables the keys (primary) are generated and these keys
(primary) are Foreign keys in Fact tables.
☻Page 19 of 243☻
Question: Does the selection of 'Clear the table and Insert rows' in the ODBC stage
send a Truncate statement to the DB or does it do some kind of Delete logic.
Answer:
There is no TRUNCATE on ODBC stages. It is Clear table blah blah and that is a delete
from statement. On an OCI stage such as Oracle, you do have both Clear and Truncate
options. They are radically different in permissions (Truncate requires you to have alter
table permissions where Delete doesn't).
Question: How do you rename all of the jobs to support your new File-naming
conventions?
Answer:
Create an Excel spreadsheet with new and old names. Export the whole project as a dsx.
Write a Perl program, which can do a simple rename of the strings looking up the Excel
file. Then import the new dsx file probably into a new project for testing. Recompile all
jobs. Be cautious that the name of the jobs has also been changed in your job control jobs
or Sequencer jobs. So you have to make the necessary changes to these Sequencers.
Question: When should we use ODS?

Answer:
DWH's are typically read only, batch updated on a schedule
ODS's are maintained in more real time, trickle fed constantly
Question: What other ETL's you have worked with?

Answer:
Informatica and also DataJunction if it is present in your Resume.
Question: How good are you with your PL/SQL?

Answer:
On the scale of 1-10 say 8.5-9
Question: What versions of DS you worked with?

Answer:
DS 7.5, DS 7.0.2, DS 6.0, DS 5.2
Question: What's the difference between Datastage Developers...?

Answer:
Datastage developer is one how will code the jobs. Datastage designer is how will design
the job, I mean he will deal with blue prints and he will design the jobs the stages that are
required in developing the code
Question: What are the requirements for your ETL tool?

Answer:
Do you have large sequential files (1 million rows, for example) that need to be compared
every day versus yesterday?
If so, then ask how each vendor would do that. Think about what process they are going to
do. Are they requiring you to load yesterday’s file into a table and do lookups?
☻Page 20 of 243☻
If so, RUN!! Are they doing a match/merge routine that knows how to process this in
sequential files? Then maybe they are the right one. It all depends on what you need the
ETL to do.
If you are small enough in your data sets, then either would probably be OK.
Question: What are the main differences between Ascential DataStage and
Informatica PowerCenter?
Answer:
Chuck Kelley’s Answer: You are right; they have pretty much similar functionality.
However, what are the requirements for your ETL tool? Do you have large sequential files
(1 million rows, for example) that need to be compared every day versus yesterday? If so,
then ask how each vendor would do that. Think about what process they are going to do.
Are they requiring you to load yesterday’s file into a table and do lookups? If so, RUN!!
Are they doing a match/merge routine that knows how to process this in sequential files?
Then maybe they are the right one. It all depends on what you need the ETL to do. If you
are small enough in your data sets, then either would probably be OK.
Les Barbusinski’s Answer: Without getting into specifics, here are some differences you
may want to explore with each vendor:
• Does the tool use a relational or a proprietary database to store its Meta data and
scripts? If proprietary, why?
• What add-ons are available for extracting data from industry-standard ERP,
Accounting, and CRM packages?
• Can the tool’s Meta data be integrated with third-party data modeling and/or
business intelligence tools? If so, how and with which ones?
• How well does each tool handle complex transformations, and how much external
scripting is required?
• What kinds of languages are supported for ETL script extensions?
Almost any ETL tool will look like any other on the surface. The trick is to find out which
one will work best in your environment. The best way I’ve found to make this
determination is to ascertain how successful each vendor’s clients have been using their
product. Especially clients who closely resemble your shop in terms of size, industry, in-
house skill sets, platforms, source systems, data volumes and transformation complexity.
Ask both vendors for a list of their customers with characteristics similar to your own that
have used their ETL product for at least a year. Then interview each client (preferably
several people at each site) with an eye toward identifying unexpected problems, benefits,
or quirkiness with the tool that have been encountered by that customer. Ultimately, ask
each customer – if they had it all to do over again – whether or not they’d choose the same
tool and why? You might be surprised at some of the answers.
Joyce Bischoff’s Answer: You should do a careful research job when selecting products.
You should first document your requirements, identify all possible products and evaluate
each product against the detailed requirements. There are numerous ETL products on the
☻Page 21 of 243☻
market and it seems that you are looking at only two of them. If you are unfamiliar with
the many products available, you may refer to www.tdan.com, the Data Administration
Newsletter, for product lists.
If you ask the vendors, they will certainly be able to tell you which of their product’s
features are stronger than the other product. Ask both vendors and compare the answers,
which may or may not be totally accurate. After you are very familiar with the products,
call their references and be sure to talk with technical people who are actually using the
product. You will not want the vendor to have a representative present when you speak
with someone at the reference site. It is also not a good idea to depend upon a high-level
manager at the reference site for a reliable opinion of the product. Managers may paint a
very rosy picture of any selected product so that they do not look like they selected an
inferior product.
Question: How many places u can call Routines?

Answer:
Four Places u can call
1. Transform of routine
a. Date Transformation
b. Upstring Transformation
2. Transform of the Before & After Subroutines
3. XML transformation
4. Web base transformation
Question: What is the Batch Program and how can generate?

Answer: Batch program is the program it's generate run time to maintain by the Datastage
itself but u can easy to change own the basis of your requirement (Extraction,
Transformation, Loading) .Batch program are generate depends your job nature either
simple job or sequencer job, you can see this program on job control option.
Question: Suppose that 4 job control by the sequencer like (job 1, job 2, job 3, job 4 )
if job 1 have 10,000 row ,after run the job only 5000 data has been loaded in target
table remaining are not loaded and your job going to be aborted then.. How can
short out the problem?
Answer:
Suppose job sequencer synchronies or control 4 job but job 1 have problem, in this
condition should go director and check it what type of problem showing either data type
problem, warning massage, job fail or job aborted, If job fail means data type problem or
missing column action .So u should go Run window ->Click-> Tracing->Performance or
In your target table ->general -> action-> select this option here two option
(i) On Fail -- Commit , Continue
(ii) On Skip -- Commit, Continue.
First u check how much data already load after then select on skip option then
continue and what remaining position data not loaded then select On Fail ,
Continue ...... Again Run the job defiantly u gets successful massage
Question: What happens if RCP is disable?
☻Page 22 of 243☻
Answer:
In such case OSH has to perform Import and export every time when the job runs and the
processing time job is also increased...
Question: How do you rename all of the jobs to support your new File-naming
conventions?
Answer: Create a Excel spreadsheet with new and old names. Export the whole project as
a dsx. Write a Perl program, which can do a simple rename of the strings looking up the
Excel file. Then import the new dsx file probably into a new project for testing.
Recompile all jobs. Be cautious that the name of the jobs has also been changed in your
job control jobs or Sequencer jobs. So you have to make the necessary changes to these
Sequencers.
Question: What will you in a situation where somebody wants to send you a file and
use that file as an input or reference and then run job.
Answer: A. Under Windows: Use the 'WaitForFileActivity' under the Sequencers and
then run the job. May be you can schedule the sequencer around the time the file is
expected to arrive.
B. Under UNIX: Poll for the file. Once the file has start the job or sequencer depending on
the file
Question: What are Sequencers?
Answer: Sequencers are job control programs that execute other jobs with preset Job
parameters.
Answer: In almost all cases we have to delete the data inserted by this from DB manually
and fix the job and then run the job again.
Question34: What is the difference between the Filter stage and the Switch stage?
Ans: There are two main differences, and probably some minor ones as well. The two
main differences are as follows.
1) The Filter stage can send one input row to more than one output link. The
Switch stage can not - the C switch construct has an implicit break in every case.
2) The Switch stage is limited to 128 output links; the Filter stage can have a
theoretically unlimited number of output links. (Note: this is not a challenge!)
Question: How can i achieve constraint based loading using datastage7.5.My target
tables have inter dependencies i.e. Primary key foreign key constraints. I want my primary
key tables to be loaded first and then my foreign key tables and also primary key tables
should be committed before the foreign key tables are executed. How can I go about it?
☻Page 23 of 243☻
Ans:1) Create a Job Sequencer to load you tables in Sequential mode
In the sequencer Call all Primary Key tables loading Jobs first and followed by Foreign
key tables, when triggering the Foreign tables load Job trigger them only when Primary
Key load Jobs run Successfully ( i.e. OK trigger)
2) To improve the performance of the Job, you can disable all the constraints on the tables
and load them. Once loading done, check for the integrity of the data. Which does not
meet raise exceptional data and cleanse them.
This only a suggestion, normally when loading on constraints are up, will drastically
performance will go down.
3) If you use Star schema modeling, when you create physical DB from the model, you
can delete all constraints and the referential integrity would be maintained in the ETL
process by referring all your dimension keys while loading fact tables. Once all
dimensional keys are assigned to a fact then dimension and fact can be loaded together. At
the same time RI is being maintained at ETL process level.
Question: How do you merge two files in DS?
Ans: Either use Copy command as a Before-job subroutine if the metadata of the 2 files
are same or create a job to concatenate the 2 files into one, if the metadata is different.
Question: How do you eliminate duplicate rows?
Ans: Data Stage provides us with a stage Remove Duplicates in Enterprise edition. Using
that stage we can eliminate the duplicates based on a key column.
Question: How do you pass filename as the parameter for a job?
Ans: While job development we can create a parameter 'FILE_NAME' and the value can
be passed while
Ans: In almost all cases we have to delete the data inserted by this from DB manually and
fix the job and then run the job again.
Question: Is there a mechanism available to export/import individual DataStage ETL

jobs from the UNIX command line?
Ans: Try dscmdexport and dscmdimport. Won't handle the "individual job" requirement. You can only export full
projects from the command line.
You can find the export and import executables on the client machine usually someplace like: C:\Program
Files\Ascential\DataStage.
Question: Diff. between JOIN stage and MERGE stage.

Answer:
☻Page 24 of 243☻
JOIN: Performs join operations on two or more data sets input to the stage and then
outputs the resulting dataset.
MERGE: Combines a sorted master data set with one or more sorted updated data sets.
The columns from the records in the master and update data set s are merged so that the
out put record contains all the columns from the master record plus any additional
columns from each update record that required.
A master record and an update record are merged only if both of them have the same
values for the merge key column(s) that we specify .Merge key columns are one or more
columns that exist in both the master and update records.
Question: Advantages of the DataStage?

Answer:
Business advantages:
• Helps for better business decisions;

• It is able to integrate data coming from all parts of the company;
• It helps to understand the new and already existing clients;
• We can collect data of different clients with him, and compare them;
• It makes the research of new business possibilities possible;
• We can analyze trends of the data read by him.
Technological advantages:
• It handles all company data and adapts to the needs;

• It offers the possibility for the organization of a complex business intelligence;
• Flexibly and scalable;
• It accelerates the running of the project;
• Easily implementable.
☻Page 25 of 243☻
DATASTAGE FAQ
1. What is the architecture of data stage?
Basically architecture of DS is client/server architecture.
Client components & server components
Client components are 4 types they are

1. Data stage designer
2. Data stage administrator
3. Data stage director
4. Data stage manager
Data stage designer is user for to design the jobs
Data stage manager is used for to import & export the project to view & edit the contents
of the repository.
Data stage administrator is used for creating the project, deleting the project & setting
the environment variables.
Data stage director is use for to run the jobs, validate the jobs, scheduling the jobs.
Server components
☻Page 26 of 243☻
DS server: runs executable server jobs, under the control of the DS director, that extract,
transform, and load data into a DWH.
DS Package installer: A user interface used to install packaged DS jobs and plug-in;
Repository or project: a central store that contains all the information required to build
DWH or data mart.
2. What r the stages u worked on?
3. I have some jobs every month automatically delete the log details what r the steps
u have to take for that
We have to set the option autopurge in DS Adminstrator.
4. I want to run the multiple jobs in the single job. How can u handle.
In job properties set the option ALLOW MULTIPLE INSTANCES.
5. What is version controlling in DS?
In DS, version controlling is used for back up the project or jobs.

This option is available in DS 7.1 version onwards.
Version controls r of 2 types.
1. VSS- visual source safe
2. CVSS- concurrent visual source safe.
VSS is designed by Microsoft but the disadvantage is only one user can access at a time,
other user can wait until the first user complete the operation.
CVSS, by using this many users can access concurrently. When compared to VSS, CVSS
cost is high.
6. What is the difference between clear log file and clear status file?
Clear log--- we can clear the log details by using the DS Director. Under job menu
clear log option is available. By using this option we can clear the log details of
particular job.
Clear status file---- lets the user remove the status of the record associated with all stages
of selected jobs.(in DS Director)
7. I developed 1 job with 50 stages, at the run time one stage is missed how can u
identify which stage is missing?
By using usage analysis tool, which is available in DS manager, we can find out the what r
the items r used in job.
☻Page 27 of 243☻
8. My job takes 30 minutes time to run, I want to run the job less than 30 minutes?
What r the steps we have to take?
By using performance tuning aspects which are available in DS, we can reduce time.
Tuning aspect
In DS administrator : in-process and inter process

In between passive stages : inter process stage
OCI stage : Array size and transaction size
And also use link partitioner & link collector stage in between passive stages
9. How to do road transposition in DS?
Pivot stage is used to transposition purpose. Pivot is an active stage that maps sets of
columns in an input table to a single column in an output table.
10. If a job locked by some user, how can you unlock the particular job in DS?
We can unlock the job by using clean up resources option which is available in DS
Director. Other wise we can find PID (process id) and kill the process in UNIX server.
11. What is a container? How many types containers are available? Is it possible to
use container as look up?
A container is a group of stages and links. Containers enable you to simplify and
modularize your server job designs by replacing complex areas of the diagram with a
single container stage.
• Local containers. These are created within a job and are only accessible by that job
only.
• Shared containers. These are created separately and are stored in the Repository in the
same way that jobs are. Shared containers can use any job in the project.
Yes we can use container as look up.
12. How to deconstruct the shared container?
To deconstruct the shared container, first u have to convert the shared container to local
container. And then deconstruct the container.
13. I am getting input value like X = Iconv(“31 DEC 1967”,”D”)? What is the X
value?
X value is Zero.
Iconv Function Converts a string to an internal storage format.It takes 31 dec 1967 as zero
and counts days from that date(31-dec-1967).
☻Page 28 of 243☻
14. What is the Unit testing, integration testing and system testing?
Unit testing: As for Ds unit test will check the data type mismatching,
Size of the particular data type, column mismatching.
Integration testing: According to dependency we will put all jobs are integrated in to one
sequence. That is called control sequence.
System testing: System testing is nothing but the performance tuning aspects in Ds.
15. What are the command line functions that import and export the DS jobs?
Dsimport.exe ---- To import the DataStage components

Dsexport.exe ---- To export the DataStage components
16. How many hashing algorithms are available for static hash file and dynamic hash
file?
Sixteen hashing algorithms for static hash file.

Two hashing algorithms for dynamic hash file( GENERAL or SEQ.NUM)
17. What happens when you have a job that links two passive stages together?
Obviously there is some process going on. Under covers Ds inserts a cut-down
transformer stage between the passive stages, which just passes data straight from one
stage to the other.
18. What is the use use of Nested condition activity?
Nested Condition. Allows you to further branch the execution of a sequence depending on
a condition.
19. I have three jobs A,B,C . Which are dependent on each other? I want to run A
& C jobs daily and B job runs only on Sunday. How can u do it?
First you have to schedule A & C jobs Monday to Saturday in one sequence.
Next take three jobs according to dependency in one more sequence and schedule that job
only Sunday.
☻Page 29 of 243☻
TOP 10 FEATURES IN DATASTAGE HAWK
The IILive2005 conference marked the first public presentations of the functionality in the
WebSphere Information Integration Hawk release. Though it's still a few months away I
am sharing my top Ten things I am looking forward to in DataStage Hawk:
1) The metadata server. To borrow a simile from that judge on American Idol "Using
MetaStage is kind of like bathing in the ocean on a cold morning. You know it's good for
you but that doesn't stop it from freezing the crown jewels." MetaStage is good for ETL
projects but none of the projects I've been on has actually used it. Too much effort
required to install the software, setup the metabrokers, migrate the metadata, and learn
how the product works and write reports. Hawk brings the common repository and
improved metadata reporting and we can get the positive effectives of bathing in sea water
without the shrinkage that comes with it.
2) QualityStage overhaul. Data Quality reporting can be another forgotten aspect of data
integration projects. Like MetaStage the QualityStage server and client had an additional
install, training and implementation overhead so many DataStage projects did not use it. I
am looking forward to more integration projects using standardisation, matching and
survivorship to improve quality once these features are more accessible and easier to use.
3) Frictionless Connectivity and Connection Objects. I've called DB2 every rude name
under the sun. Not because it's a bad database but because setting up remote access takes
me anywhere from five minutes to five weeks depending on how obscure the error
message and how hard it is to find the obscure setup step that was missed during
installation. Anything that makes connecting to database easier gets a big tick from me.
4) Parallel job range lookup. I am looking forward to this one because it will stop people
asking for it on forums. It looks good, it's been merged into the existing lookup form and
seems easy to use. Will be interested to see the performance.
☻Page 30 of 243☻
5) Slowly Changing Dimension Stage. This is one of those things that Informatica were
able to trumpet at product comparisons, that they have more out of the box DW support.
There are a few enhancements to make updates to dimension tables easier, there is the
improved surrogate key generator, there is the slowly changing dimension stage and
updates passed to in memory lookups. That's it for me with DBMS generated keys, I'm
only doing the keys in the ETL job from now on! DataStage server jobs have the hash file
lookup where you can read and write to it at the same time, parallel jobs will have the
updateable lookup.
6) Collaboration: better developer collaboration. Everyone hates opening a job and being
told it is locked. "Bloody whathisname has gone to lunch, locked the job and now his
password protected screen saver is up! Unplug his PC!" Under Hawk you can open a
readonly copy of a locked job plus you get told who has locked the job so you know
whom to curse.
7) Session Disconnection. Accompanied by the metallic cry of "exterminate!

exterminate!" an administrator can disconnect sessions and unlock jobs.
8) Improved SQL Builder. I know a lot of people cross the street when they see the SQL
Builder coming. Getting the SQL builder to build complex SQL is a bit like teaching a
monkey how to play chess. What I do like about the current SQL builder is that it
synchronises your SQL select list with your ETL column list to avoid column mismatches.
I am hoping the next version is more flexible and can build complex SQL.
9) Improved job startup times. Small parallel jobs will run faster. I call it the death of a
thousand cuts, your very large parallel job takes too long to run because a thousand
smaller jobs are starting and stopping at the same time and cutting into CPU and memory.
Hawk makes these cuts less painful.
10) Common logging. Log views that work across jobs, log searches, log date constraints,
wildcard message filters, saved queries. It's all good. You no longer need to send out a
search party to find an error message.
That's my top ten. I am also hoping the software comes in a box shaped like a hawk and
makes a hawk scream when you open it. A bit like those annoying greeting cards. Is there
any functionality you think Hawk is missing that you really want to see?
☻Page 31 of 243☻
DATASTAGE NOTES
DataStage Tips:
1. Aggregator stage does not support more than one source, if you try to do this you
will get error, “The destination stage cannot support any more stream input links”.
2. You can give N number input links to transformer stage, but you can give
sequential file stage as reference link. You can give only one sequential file stage
as primary link and number other links as reference link. If you try to give
sequential file stage as reference link you will get error as, “The destination stage
cannot support any more stream input links” because reference link represent a
lookup table, but sequential file does not use as lookup table, Hashed file can be
use as lookup table.
Sequential file stage:

• Sequential file stage is provided by datastage to access data from sequential file.
(Text file)
• The access mechanism of a sequential file is sequence order.
• We cannot use a sequential file as a lookup.
• The problem with sequential file we cannot directly ‘filter rows’ and query is not
supported.
Update actions in sequential file:

• Over write existing file (radio button).
• Append to existing file (radio button).
• Backup existing file (check box).
Hashed file stage:

• Hashed file is used to store data in hash file.
• A hash file is similar to a text file but the data will be organized using ‘hashing
algorithm’.
• Basically hashed file is used for lookup purpose.
☻Page 32 of 243☻
• The retrieval of data in hashed file faster because it uses ’hashing algorithm’.
Update actions in Hashed file:

• Clear file before waiting
• Backup existing file.
• Sequential file (all are check boxes).
DATABASE Stages:
ODBC Stage:
ODBC Stage “Stage” Page:
You can use an ODBC stage to extract, write, or aggregate data. Each ODBC stage can
have any number of inputs or outputs. Input links specify the data you are writing. Output
links specify the data you are extracting and any aggregations required. You can specify
the data on an input link using an SQL statement constructed by DataStage, a generated
query, a stored procedure, or a user-defined SQL query.
• GetSQLInfo: is used to get quote character and schema delimiters of your data
source. Optionally specify the quote character used by the data source. By default, this
is set to " (double quotes). You can also click the Get SQLInfo button to connect to
the data source and retrieve the Quote character it uses. An entry of 000 (three zeroes)
specifies that no quote character should be used.
Optionally specify the schema delimiter used by the data source. By default this is set
to. (period) but you can specify a different schema delimiter, or multiple schema
delimiters. So, for example, identifiers have the form
Node:Schema.Owner;TableName you would enter :.; into this field. You can also click
the Get SQLInfo button to connect to the data source and retrieve the Schema
delimiter it uses.
• NLS tab: You can define a character set map for an ODBC stage using the NLS
tab of the ODBC Stage
The ODBC stage can handle the following SQL Server data types:
• GUID
• Timestamp
• SmallDateTime
ODBC Stage “Input” Page:
☻Page 33 of 243☻
• Update action. Specifies how the data is written. Choose the option you want
from the drop-down list box:
 Clear the table, then insert rows. Deletes the contents of the table and adds
the new rows.
 Insert rows without clearing. Inserts the new rows in the table.
 Insert new or update existing rows. New rows are added or, if the insert fails,
the existing rows are updated.
 Replace existing rows completely. Deletes the existing rows, then adds the
new rows to the table.
 Update existing rows only. Updates the existing data rows. If a row with the
supplied key does not exist in the table then the table is not updated but a warning
is logged.
 Update existing or insert new rows. The existing data rows are updated or, if
this fails, new rows are added.
 Call stored procedure. Writes the data using a stored procedure. When you
select this option, the Procedure name field appears.
 User-defined SQL. Writes the data using a user-defined SQL statement. When
you select this option, the View SQL tab is replaced by the Enter SQL tab.
• Create table in target database. Select this check box if you want to
automatically create a table in the target database at run time. A table is created based
on the defined column set for this stage. If you select this option, an additional tab,
Edit DDL, appears. This shows the SQL CREATE statement to be used for table
generation.
• Transaction Handling. This page allows you to specify the transaction handling
features of the stage as it writes to the ODBC data source. You can choose whether to
use transaction grouping or not, specify an isolation level, the number of rows written
before each commit, and the number of rows written in each operation.
 Isolation Levels: Read Uncommitted, Read Committed,
Repeatable Read, Serializable, Versioning, and Auto-Commit.
 Rows per transaction field. This is the number of rows written
before the data is committed to the data table. The default value is 0, that is, all the
rows are written before being committed to the data table.
 Parameter array size field. This is the number of rows written
at a time. The default is 1, that is, each row is written in a separate operation.
ODBC Stage “Output” Page:
==
PROCESSING Stages:
TRANSFORMER Stage:
☻Page 34 of 243☻
Transformer stages do not extract data or write data to a target database. They are used to
handle extracted data, perform any conversions required, and pass data to another
Transformer stage or a stage that writes data to a target data table.
Transformer stages can have any number of inputs and outputs. The link from the main
data input source is designated the primary input link. There can only be one primary
input link, but there can be any number of reference inputs.
Input Links
The main data source is joined to the Transformer stage via the primary link, but the stage
can also have any number of reference input links.
A reference link represents a table lookup. These are used to provide information that
might affect the way the data is changed, but do not supply the actual data to be changed.
Reference input columns can be designated as key fields. You can specify key expressions
that are used to evaluate the key fields. The most common use for the key expression is to
specify an equi-join, which is a link between a primary link column and a reference link
column. For example, if your primary input data contains names and addresses, and a
reference input contains names and phone numbers, the reference link name column is
marked as a key field and the key expression refers to the primary link’s name column.
During processing, the name in the primary input is looked up in the reference input. If the
names match, the reference data is consolidated with the primary data. If the names do not
match, i.e., there is no record in the reference input whose key matches the expression
given, all the columns specified for the reference input are set to the null value.
Where a reference link originates from a UniVerse or ODBC stage, you can look up
multiple rows from the reference table. The rows are specified by a foreign key, as
opposed to a primary key used for a single-row lookup.
Output Links
You can have any number of output links from your Transformer stage.
You may want to pass some data straight through the Transformer stage unaltered, but it’s
likely that you’ll want to transform data from some input columns before outputting it
from the Transformer stage.
You can specify such an operation by entering a BASIC expression or by selecting a

transform to apply to the data. DataStage has many built-in transforms, or you can define
your own custom transforms that are stored in the Repository and can be reused as
required.
The source of an output link column is defined in that column’s Derivation cell within the
Transformer Editor. You can use the Expression Editor to enter expressions or transforms
☻Page 35 of 243☻
in this cell. You can also simply drag an input column to an output column’s Derivation
cell, to pass the data straight through the Transformer stage.
In addition to specify derivation details for individual output columns, you can also
specify constraints that operate on entire output links. A constraint is a BASIC expression
that specifies criteria that data must meet before it can be passed to the output link. You
can also specify a reject link, which is an output link that carries all the data not output on
other links, that is, columns that have not met the criteria.
Each output link is processed in turn. If the constraint expression evaluates to TRUE for
an input row, the data row is output on that link. Conversely, if a constraint expression
evaluates to FALSE for an input row, the data row is not output on that link.
Constraint expressions on different links are independent. If you have more than one
output link, an input row may result in a data row being output from some, none, or all of
the output links.
For example, if you consider the data that comes from a paint shop, it could include
information about any number of different colors. If you want to separate the colors into
different files, you would set up different constraints. You could output the information
about green and blue paint on LinkA, red and yellow paint on LinkB, and black paint on
LinkC.
When an input row contains information about yellow paint, the LinkA constraint
expression evaluates to FALSE and the row is not output on LinkA. However, the input
data does satisfy the constraint criterion for LinkB and the rows are output on LinkB.
If the input data contains information about white paint, this does not satisfy any
constraint and the data row is not output on Links A, B or C, but will be output on the
reject link. The reject link is used to route data to a table or file that is a “catch-all” for
rows that are not output on any other link. The table or file containing these rejects is
represented by another stage in the job design.
Before-Stage and After-Stage Routines
Because the Transformer stage is an active stage type, you can specify routines to be
executed before or after the stage has processed the data. For example, you might use a
before-stage routine to prepare the data before processing starts. You might use an after-
stage routine to send an electronic message when the stage has finished.
Specifying the Primary Input Link
The first link to a Transformer stage is always designated as the primary input link.
However, you can choose an alternative link to be the primary link if necessary. To do
this:
1. Select the current primary input link in the Diagram window.
2. Choose Convert to Reference from the Diagram window shortcut menu.
☻Page 36 of 243☻
3. Select the reference link that you want to be the new primary input link.
4. Choose Convert to Stream from the Diagram window shortcut menu.
==
AGGREGATOR Stage:
Aggregator stages classify data rows from a single input link into groups and compute
totals or other aggregate functions for each group. The summed totals for each group are
output from the stage via an output link.
If you want to aggregate the input data in a number of different ways, you can have
several output links, each specifying a different set of properties to define how the input
data is grouped and summarized.
==
FOLDER Stage:
Folder stages are used to read or write data as files in a directory located on the DataStage
server.
The folder stages can read multiple files from a single directory and can deliver the files to
the job as rows on an output link. The folder stage can also write rows of data as files to a
directory. The rows arrive at the stage on an input link.
Note: The behavior of the Folder stage when reading folders that contain other folders is
undefined.
In an NLS environment, the user running the job must have write permission on the folder
so that the NLS map information can be set up correctly.
Folder Stage Input Data
The properties are as follows:

• Preserve CRLF. When Preserve CRLF is set to Yes field marks are not converted to
newlines on write. It is set to Yes by default.
☻Page 37 of 243☻
The Columns tab defines the data arriving on the link to be written in files to the
directory. The first column on the Columns tab must be defined as a key, and gives the
name of the file. The remaining columns are written to the named file, each column
separated by a newline. Data to be written to a directory would normally be delivered in a
single column.
Folder Stage Output Data
The properties are as follows:

• Sort order. Choose from Ascending, Descending, or None. This specifies the order in
which the files are read from the directory.
• Wildcard. This allows for simple wildcarding of the names of the files found in the
directory. Any occurrence of * (asterisk) or … (three periods) is treated as an instruction
to match any or no characters.
• Preserve CRLF. When Preserve CRLF is set to Yes newlines are not converted to field
marks on read. It is set to Yes by default.
• Fully qualified. Set this to yes to have the full path name of each file written in the key
column instead of just the file name.
The Columns tab defines a maximum of two columns. The first column must be marked
as the Key and receives the file name. The second column, if present, receives the contents
of the file.
==
IPC Stage:
An inter-process (IPC) stage is a passive stage which provides a communication channel

between DataStage processes running simultaneously in the same job. It allows you to
design jobs that run on SMP systems with great performance benefits. To understand the
benefits of using IPC stages, you need to know a bit about how DataStage jobs actually
run as processes, see “DataStage Jobs and Processes”.
The output link connecting IPC stage to the stage reading data can be opened as soon as
the input link connected to the stage writing data has been opened.
You can use Inter-process stages to join passive stages together. For example you could
use them to speed up data transfer between two data sources:
In this example the job will run as two processes, one handling the communication from
sequential file stage to IPC stage, and one handling communication from IPC stage to
ODBC stage. As soon as the Sequential File stage has opened its output link, the IPC stage
☻Page 38 of 243☻
can start passing data to the ODBC stage. If the job is running on a multi processor
system, the two processor can run simultaneously so the transfer will be much faster.
Defining IPC Stage Properties
The Properties tab allows you to specify two properties for the IPC stage:
• Buffer Size. Defaults to 128 Kb. The IPC stage uses two blocks of memory; one block
can be written to while the other is read from. This property defines the size of each block,
so that by default 256 Kb is allocated in total.
• Timeout. Defaults to 10 seconds. This gives time limit for how long the stage will wait
for a process to connect to it before timing out. This normally will not need changing, but
may be important where you are prototyping multi-processor jobs on single processor
platforms and there are likely to be delays.
==
LINK PARTITIONER Stage:
The Link Partitioner stage is an active stage which takes one input and allows you to
distribute partitioned rows to up to 64 output links. The stage expects the output links to
use the same meta data as the input link.
Partitioning your data enables you to take advantage of a multi-processor system and have
the data processed in parallel. It can be used in conjunction with the Link Collector stage
to partition data, process it in parallel, then collect it together again before writing it to a
single target. To really understand the benefits you need to know a bit about how
DataStage jobs are run as processes, see “DataStage Jobs and Processes”.
In order for this job to compile and run as intended on a multi-processor system you must
have inter-process buffering turned on, either at project level using the DataStage
Administrator, or at job level from the Job Properties dialog box.
Before-Stage and After-Stage Subroutines
The General tab on the Stage page contains optional fields that allow you to define
routines to use which are executed before or after the stage has processed the data.
☻Page 39 of 243☻
• Before-stage subroutine and Input Value. Contain the name (and value) of a
subroutine that is executed before the stage starts to process any data. For example,
you can specify a routine that prepares the data before processing starts.
• After-stage subroutine and Input Value. Contain the name (and value) of a
subroutine that is executed after the stage has processed the data. For example, you
can specify a routine that sends an electronic message when the stage has finished.
Choose a routine from the drop-down list box. This list box contains all the routines
defined as a Before/After Subroutine under the Routines branch in the Repository. Enter
an appropriate value for the routine’s input argument in the Input Value field.
If you choose a routine that is defined in the Repository, but which was edited but not
compiled, a warning message reminds you to compile the routine when you close the Link
Partitioner Stage dialog box.
A return code of 0 from the routine indicates success, any other code indicates failure and
causes a fatal error when the job is run.
If you installed or imported a job, the Before-stage subroutine or Afterstage subroutine

field may reference a routine that does not exist on your system. In this case, a warning
message appears when you close the Link Partitioner Stage dialog box. You must install
or import the “missing” routine or choose an alternative one to use.
Defining Link Partitioner Stage Properties
The Properties tab allows you to specify two properties for the Link Partitioner stage:
• Partitioning Algorithm. Use this property to specify the method the stage uses to
partition data. Choose from:
 Round-Robin. This is the default method. Using the round-robin method the stage
will write each incoming row to one of its output links in turn.
 Random. Using this method the stage will use a random number generator to
distribute incoming rows evenly across all output links.
 Hash. Using this method the stage applies a hash function to one or more input
column values to determine which output link the row is passed to.
 Modulus. Using this method the stage applies a modulus function to an integer
input column value to determine which output link the row is passed to.
• Partitioning Key. This property is only significant where you have chosen a
partitioning algorithm of Hash or Modulus. For the Hash algorithm, specify one or
more column names separated by commas. These keys are concatenated and a hash
function applied to determine the destination output link. For the Modulus algorithm,
specify a single column name which identifies an integer numeric column. The value
of this column value determines the destination output link.
Defining Link Partitioner Stage Input Data
☻Page 40 of 243☻
The Link Partitioner stage can have one input link. This is where the data to be partitioned
arrives.
The Inputs page has two tabs: General and Columns.
• General. The General tab allows you to specify an optional description of the
stage.
• Columns. The Columns tab contains the column definitions for the data on the
input link. This is normally populated by the meta data of the stage connecting on the
input side. You can also Load a column definition from the Repository, or type one in
yourself (and Save it to the Repository if required). Note that the meta data on the
input link must be identical to the meta data on the output links.
Defining Link Partitioner Stage Output Data
The Link Partitioner stage can have up to 64 output links. Partitioned data flows along
these links. The Output Name drop-down list on the Outputs pages allows you to select
which of the 64 links you are looking at.
The Outputs page has two tabs: General and Columns.
• General. The General tab allows you to specify an optional description of the
stage.
• Columns. The Columns tab contains the column definitions for the data on the
input link. You can Load a column definition from the Repository, or type one in
yourself (and Save it to the Repository if required). Note that the meta data on the
output link must be identical to the meta data on the input link. So the meta data is
identical for all the output links.
==
LINK COLLECTOR Stage:
The Link Collector stage is an active stage which takes up to 64 inputs and allows you to
collect data from this links and route it along a single output link. The stage expects the
output link to use the same meta data as the input links.
The Link Collector stage can be used in conjunction with a Link Partitioner stage to
enable you to take advantage of a multi-processor system and have data processed in
parallel. The Link Partitioner stage partitions data, it is processed in parallel, then the Link
Collector stage collects it together again before writing it to a single target. To really
understand the benefits you need to know a bit about how DataStage jobs are run as
processes, see “DataStage Jobs and Processes”.
☻Page 41 of 243☻
In order for this job to compile and run as intended on a multi-processor system you must
have inter-process buffering turned on, either at project level using the DataStage
Administrator, or at job level from the Job Properties dialog box.
The Properties tab allows you to specify two properties for the Link Collector stage:
• Collection Algorithm. Use this property to specify the method the stage uses to
collect data. Choose from:
 Round-Robin. This is the default method. Using the round-robin method the stage
will read a row from each input link in turn.
 Sort/Merge. Using the sort/merge method the stage reads multiple sorted inputs
and writes one sorted output.
• Sort Key. This property is only significant where you have chosen a collecting
algorithm of Sort/Merge. It defines how each of the partitioned data sets are known to
be sorted and how the merged output will be sorted. The key has the following format:
Columnname {sortorder] [,Columnname [sortorder]]...

Columnname specifies one (or more) columns to sort on.
sortorder defines the sort order as follows:
In an NLS environment, the collate convention of the locale may affect the sort order. The
default collate convention is set in the DataStage Administrator, but can be set for
individual jobs in the Job Properties dialog box.
Ascending Order Descending Order

A d
asc dsc
ascending descending
A D
ASC DSC
ASCENDING DESCENDING
For example:
FIRSTNAME d, SURNAME D
☻Page 42 of 243☻
Specifies that rows are sorted according to FIRSTNAME column and SURNAME column
in descending order.
The Link Collector stage can have up to 64 input links. This is where the data to be
collected arrives. The Input Name drop-down list on the Inputs page allows you to select
which of the 64 links you are looking at.
The Link Collector stage can have a single output link.
DATASTAGE TUTORIAL
1. About DataStage 14. Hashed File

2. Client 15. UniVerse
Components 16. UniData.
3. DataStage 17. ODBC
Designer. 18. Sequential File
4. DataStage 19. Folder Stage
Director 20. Transformer
5. DataStage 21. Container
Manager 22. IPC Stage
6. DataStage 23. Link Collector
Administrator Stage
7. DataStage 24. Link Partitioner
Manager Roles Stage
8. Server 25. Server Job
Components Properties
9. DataStage 26. Containers
Features 27. Local containers
10. Types of Jobs 28. Shared containers
11. DataStage NLS
12. JOB 29. Job Sequences
13. Aggregator
☻Page 43 of 243☻
About DataStage
DataStage is a tool set for designing, developing, and running

applications that populate one or more tables in a data
warehouse or data mart. It consists of client and server
components.
Client Components
DataStage Designer.
A design interface used to create DataStage applications

(known as jobs). Each job specifies the data sources, the
transforms required, and the destination of the data. Jobs
are compiled to create executables that are scheduled by
the Director and run by the Server.
DataStage Director.
A user interface used to validate, schedule, run, and

monitor DataStage jobs.
DataStage Manager.
A user interface used to view and edit the contents of the

Repository.
DataStage Administrator
A user interface used to configure DataStage
DataStage Manager Roles
• Import table or stored procedure definitions

• Create table or stored procedure definitions, data
elements, custom transforms, server job routines,
mainframe routines, machine profiles, and plug-ins
There are also more specialized tasks that can only be

performed from the DataStage Manager. These include:
• Perform usage analysis queries.

• Report on Repository contents.
• Importing, exporting and packaging DataStage jobs.
☻Page 44 of 243☻
Server Components
There are three server components which are installed on a

server:
• Repository. A central store that contains all the

information required to build a data mart or data
warehouse.
• DataStage Server. Runs executable jobs that extract,
transform, and load data into a data warehouse.
• DataStage Package Installer. A user interface used
to install packaged DataStage jobs and plug-ins.
DataStage Features
Extracts data from any number or types of database
Handles all the meta data definitions required to define your

data warehouse.
Aggregates data. You can modify SQL SELECT statements

used to extract data.
Transforms data. DataStage has a set of predefined transforms

and functions you can use to convert your data.
Loads the data warehouse
Types of jobs
There are three basic types of DataStage job:
• Server jobs. These are compiled and run on the

DataStage server. A server job will connect to
databases on other machines as necessary, extract
data, process it, then write the data to the target data
warehouse.
• Parallel jobs. These are available only if you have
Enterprise Edition installed. Parallel jobs are compiled
and run on a DataStage UNIX server, and can be run
in parallel on SMP, MPP, and cluster systems.
• Mainframe jobs. These are available only if you have
Enterprise MVS Edition installed. A mainframe job is
compiled and run on the mainframe. Data extracted by
such jobs is then loaded into the data warehouse.
☻Page 45 of 243☻
There are two other entities that are similar to jobs in the way
they appear in the DataStage Designer, and are handled by it.
These are:
Shared containers.
These are reusable job elements. They typically comprise

a number of stages and links. Copies of shared containers
can be used in any number of server jobs and edited as
required.
Job Sequences.
A job sequence allows you to specify a sequence of

DataStage jobs to be executed, and actions to take
depending on results
DataStage NLS
• Process data in a wide range of languages

• Accept data in any character set into most DataStage
fields
• Use local formats for dates, times, and money (Server
Jobs)
• Sort data according to local rules
JOB
A job consists of stages linked together which describe the

flow of data from a data source to a final data warehouse.
Built-In Stages – Server Jobs

Aggregator.
Aggregator stages are active stages that classify data rows

from a single input link into groups and compute totals or
other aggregate functions for each group. The summed totals
for each group are output from the stage via an output link.
Hashed File.
Extracts data from or loads data into databases that

contain hashed files. Also acts as an intermediate stage for
quick lookups.
Hashed File stages represent a hashed file, i.e., a file that uses
a hashing algorithm for distributing records in one or more
groups on disk. You can use a Hashed File stage to extract or
☻Page 46 of 243☻
write data, or to act as an intermediate file in a job. The
primary role of a Hashed File stage is as a reference table
based on a single key field.
Each Hashed File stage can have any number of inputs or

outputs. Input links specify the data you are writing. Output
links specify the data you are extracting.
UniVerse.
• Extracts data from or loads data into UniVerse

databases.
UniData.
• Extracts data from or loads data into UniData

databases.
ODBC.
Extracts data from or loads data into databases that support

the industry standard Open Database Connectivity API. This
stage is also used as an intermediate stage for aggregating
data.
ODBC stages are used to represent a database that supports

the industry standard Open Database Connectivity API. You
can use an ODBC stage to extract, write, or aggregate data.
Each ODBC stage can have any number of inputs or outputs.

Input links specify the data you are writing. Output links
specify the data you are extracting and any aggregations
required.
Sequential File.
Extracts data from or loads data into "flat files" in the

Windows NT file system.
Sequential File stages are used to extract data from, or write

data to, a text file in the server file system. The text file can be
created or exist on any drive that is either local or mapped to
the server. Each Sequential File stage can have any number of
inputs or outputs.
☻Page 47 of 243☻
Folder Stage.
Folder stages are used to read or write data as files in a

directory located on the DataStage server.
Folder stages are used to read or write data as files in a

directory.
The folder stages can read multiple files from a single

directory and can deliver the files to the job as rows on an
output link. By default, the file content is delivered with
newlines converted to char(254) field marks. The folder stage
can also write rows of data as files to a directory. The rows
arrive at the stage on an input link.
Transformer.
Receives incoming data, transforms it in a variety of ways,

and outputs it to another stage in the job.
Transformer stages do not extract data or write data to a target

database. They are used to handle extracted data, perform any
conversions required, and pass data to another Transformer
stage or a stage that writes data to a target data table.
Transformer stages in server jobs can have any number of

inputs and outputs. The link from the main data input source
is designated the primary input link. There can only be one
primary input link, but there can be any number of reference
inputs.
Container.
Represents a group of stages and links. The group is replaced

by a single Container stage in the Diagram window.
IPC Stage.
Provides a communication channel between DataStage

processes running simultaneously in the same job. It allows
you to design jobs that run on SMP systems with great
performance benefits
.An inter-process (IPC) stage is a passive stage which

provides a communication channel between DataStage
processes running simultaneously in the same job. It allows
you to design jobs that run on SMP systems with great
☻Page 48 of 243☻
performance benefits. To understand the benefits of using IPC
stages, you need to know a bit about how DataStage jobs
actually run as processes, see Chapter 2 of the Server Job
Developer's Guide for information.
The output link connecting IPC stage to the stage reading data
can be opened as soon as the input link connected to the stage
writing data has been opened.
You can use Inter-process stages to join passive stages

together. For example you could use them to speed up data
transfer between two data sources
Link Collector Stage.
Takes up to 64 inputs and allows you to collect data from

these links and route it along a single output link.
The Link Collector stage is an active stage which takes up to

64 inputs and allows you to collect data from these links and
route it along a single output link. The stage expects the
output link to use the same meta data as the input links
Link Partitioner Stage.
Takes one input and allows you to distribute partitioned

rows to up to 64 output links.
The Link Partitioner stage is an active stage which takes

one input and allows you to distribute partitioned rows to
up to 64 output links. The stage expects the output links to
use the same meta data as the input link.
• Container Input and Container Output. Represent

the interface that links a container stage to the rest of
the job design.
Server Job Properties
The Job Properties dialog box appears. The dialog box

differs depending on whether it is a server job, a mainframe
job, or a job sequence.
A server job has up to six pages: General, Parameters, Job

control, NLS, Performance, and Dependencies. Note that the
NLS page is not available if you open the dialog box from the
Manager, even if you have NLS installed.
☻Page 49 of 243☻
Containers
A container is a group of stages and links. Containers enable

you to simplify and modularize your server job designs by
replacing complex areas of the diagram with a single
container stage. You can also use shared containers as a way
of incorporating server job functionality into parallel jobs.
Local containers.
• These are created within a job and are only accessible

by that job. A local container is edited in a tabbed
page of the job’s Diagram window.
Shared containers.
• These are created separately and are stored in the

Repository in the same way that jobs are. There are
two types of shared container:
Job Sequences
DataStage provides a graphical Job Sequencer which allows

you to specify a sequence of server or parallel jobs to run. The
sequence can also contain control information, for example,
you can specify different courses of action to take depending
on whether a job in the sequence succeeds or fails. Once you
have defined a job sequence, it can be scheduled and run
using the DataStage Director. It appears in the DataStage
Repository and in the DataStage Director client as a job.
☻Page 50 of 243☻
LEARN FEATURES OF DATASTAGE
DATASTAGE:
DataStage has the following features to aid the
design and processing required to build a data
warehouse:
Uses graphical design tools. With simple
point-and-click techniques you can draw a
scheme to represent your processing
requirements.
Extracts data from any number or type of
database.
Handles all the metadata definitions required
to define your data warehouse. You can
view and modify the table definitions at
any point during the design of your
application.
Aggregates data. You can modify SQL SELECT
statements used to extract data.
Transforms data. DataStage has a set of
predefined transforms and functions you
can use to convert your data. You can
☻Page 51 of 243☻
easily extend the functionality by defining
your own transforms to use.
Loads the data warehouse.
COMPONENTS OF DATASTAGE:
DataStage consists of a number of client and
server components. DataStage has four client
components
1. DataStage Designer. A design interface

used to create DataStage applications
(known as jobs). Each job specifies the
data sources, the transforms required, and
the destination of the data. Jobs are
compiled to create executables that are
scheduled by the Director and run by the
Server (mainframe jobs are transferred
and run on the mainframe).
2. DataStage Director. A user interface
used to validate, schedule, run, and
monitor DataStage server jobs and parallel
jobs.
3. DataStage Manager. A user interface
used to view and edit the contents of the
Repository.
4. DataStage Administrator. A user
interface used to perform administration
tasks such as setting up DataStage users,
creating and moving projects, and setting
up purging criteria.
SERVER COMPONENTS:
There are three server components:
1. Repository. A central store that contains
all the information required to build a
data mart or data warehouse.
2. DataStage Server. Runs executable jobs
that extract, transform, and load data into
a data warehouse.
3. DataStage Package Installer. A user
interface used to install packaged
DataStage jobs and plug-ins.
☻Page 52 of 243☻
DATASTAGE PROJECTS:
You always enter DataStage through a DataStage
project. When you start a DataStage client you
are prompted to attach to a project. Each project
contains:
• DataStage jobs.
• Built-in components. These are predefined
components used in a job.
• User-defined components. These are
customized components created using the
DataStage Manager. Each user-defined
component performs a specific task in a
job.
DATASTAGE JOBS:
There are three basic types of DataStage job:
1. Server jobs. These are compiled and run
on the DataStage server. A server job will
connect to databases on other machines as
necessary, extract data, process it, then
write the data to the target
datawarehouse.
2. Parallel jobs. These are compiled and run
on the DataStage server in a similar way to
server jobs, but support parallel processing
on SMP, MPP, and cluster systems.
3. Mainframe jobs. These are available only
if you have Enterprise MVS Edition
installed. A mainframe job is compiled and
run on the mainframe. Data extracted by
such jobs is then loaded into the data
warehouse.
SPECIAL ENTITIES:
• Shared containers. These are reusable
job elements. They typically comprise a
number of stages and links. Copies of
shared containers can be used in any
number of server jobs or parallel jobs and
edited as required.
• Job Sequences. A job sequence allows
you to specify a sequence of DataStage
☻Page 53 of 243☻
jobs to be executed, and actions to take
depending on results.
TYPES OF STAGES:
• Built-in stages. Supplied with DataStage
and used for extracting aggregating,
transforming, or writing data. All types of
job have these stages.
• Plug-in stages. Additional stages that can
be installed in DataStage to perform
specialized tasks that the built-in stages do
not support Server jobs and parallel jobs
can make use of these.
• Job Sequence Stages. Special built-in
stages which allow you to define
sequences of activities to run. Only Job
Sequences have these.
DATASTAGE NLS:
DataStage has built-in National Language
Support (NLS). With NLS installed, DataStage can
do the following:
• Process data in a wide range of languages
• Accept data in any character set into most
DataStage fields
• Use local formats for dates, times, and
money (server jobs)
• Sort data according to local rules
To load a data mart or data warehouse, you must

do the following:
• Set up your project
• Create a job
• Develop the job
• Edit the stages in the job
• Compile the job
• Run the job
☻Page 54 of 243☻
SETTING UP YOUR PROJECT:
Before you create any DataStage jobs, you must
set up your project by entering information about
your data. This includes the name and location of
the tables or files holding your data and a
definition of the columns they contain.
Information is stored in table definitions in the
Repository.
STARTING THE DATASTAGE DESIGNER:

To start the DataStage Designer, choose Start →
Programs → Ascential DataStage →
DataStage Designer. The Attach to Project
dialog box appears:
TO CONNECT TO A PROJECT:
1. Enter the name of your host in the Host
system field. This is the name of the
system where the DataStage Server
components are installed.
2. Enter your user name in the User name
field. This is your user name on the server
system.
3. Enter your password in the Password
field.
4. Choose the project to connect to from the
Project drop-down list box.
5. Click OK. The DataStage Designer window
appears with the New dialog box open,
ready for you to create a new job:
☻Page 55 of 243☻
CREATING A JOB:
Jobs are created using the DataStage Designer.
For this example, you need to create a server
job, so double-click the New Server Job icon.
Choose File → Save to save the job.

The Create new job dialog box appears:
☻Page 56 of 243☻
DEFINING TABLE DEFINITIONS:
For most data sources, the quickest and simplest
way to specify a table definition is to import it
directly from your data source or data
warehouse.
IMPORTING TABLE DEFINITIONS:

1. In the Repository window of the DataStage
Designer, select the Table Definitions
branch, and choose Import ✎✯✤✢✣
Table Definitions… from the shortcut
menu. The Import Metadata (ODBC
Tables) dialog box appears:
2. Choose data Source Name from the DSN

drop-down list box.
3. Click OK. The updated Import Metadata (

ODBC Tables) dialog box displays all the
files for the chosen data source name:
☻Page 57 of 243☻
4. Select project.EXAMPLE1 from the
Tables list box, where project is the name
of your DataStage project.
5. Click OK. The column information from
EXAMPLE1 is imported into DataStage.
6. A table definition is created and is stored
under the Table Definitions → ODBC →
DSNNAME branch in the Repository. The
updated DataStage Designer window
displays the new table definition entry in
the Repository window.
DEVELOPING A JOB:
Jobs are designed and developed using the
Designer. The job design is developed in the
Diagram window (the one with grid lines). Each
data source, the data warehouse, and each
processing step is represented by a stage in the
job design. The stages are linked together to
show the flow of data.
For Example we can develop a job with the

following three stages:
A Universe stage to represent EXAMPLE1 (the

data source).
• A Transformer stage to convert the data in the
DATE column from an YYYY-MM-DD date in
internal date format to a string giving just year
and month (YYYY-MM).
• A Sequential File stage to represent the file
created at run time (the data warehouse in this
example).
☻Page 58 of 243☻
Adding Stages:
Stages are added using the tool palette. This
palette contains icons that represent the
components you can add to a job. The palette
has different groups to organize the tools
available.
To add a stage:
1. Click the stage button on the tool palette that
represents the stage type you want to add.
2. Click in the Diagram window where you want
the stage to be positioned. The stage appears in
the Diagram window as a square. You can also
drag items from the palette to the Diagram
window.
We recommend that you position your stages as
follows:
Data sources on the left
Data warehouse on the right
Transformer stage in the center
When you add stages, they are automatically
assigned default names. These names are based
on the type of stage and the number of the item
in the Diagram window. You can use the default
names in the example.
Once all the stages are in place, you can link
them together to show the flow of data.
Linking Stages
You need to add two links:
☻Page 59 of 243☻
• One between the Universe and Transformer
stages
• One between the Transformer and Sequential
File stages
Links are always made in the direction the data

will flow, that is, usually left to right. When you
add links, they are assigned default names. You
can use the default names in the example.
To add a link:
1. Right-click the first stage, hold the mouse
button down and drag the link to the transformer
stage. Release the mouse button.
2. Right-click the Transformer stage and drag the
link to the Sequential File stage.
The following screen shows how the Diagram
window looks when you have added the stages
and links:
Editing the Stages
Your job design currently displays the stages and

the links between them. You must edit each
stage in the job to specify the data to use and
what to do with it. Stages are edited in the job
design by double-clicking each stage in turn.
Each stage type has its own editor.
Editing the UniVerse Stage
The data source (EXAMPLE1) is represented by a

UniVerse stage. You must specify the data you
☻Page 60 of 243☻
want to extract from this file by editing the
stage.
Double-click the stage to edit it. The UniVerse

Stage dialog box appears:
This dialog box has two pages:

• Stage. Displayed by default. This page
contains the name of the stage you are editing.
The General tab specifies where the file is found
and the connection type.
• Outputs. Contains information describing the
data flowing from the stage. You edit this page to
describe the data you want to extract from the
file. In this example, the output from this stage
goes to the Transformer stage. To edit the
Universe stage:
1. Check that you are displaying the General tab
on the Stage page.
Choose localuv from the Data source name
drop-down list. Localuv is where EXAMPLE1 is
copied to during installation.
The remaining parameters on the General and
Details tabs are used to enter logon details and
describe where to find the file. Because
EXAMPLE1 is installed in localuv, you do not
have to complete these fields, which are
disabled.
2. Click the Outputs tab. The Outputs page
appears:
☻Page 61 of 243☻
The Outputs page contains the name of the link
the data flows along and the following four tabs:
• General. Contains the name of the table to use
and an optional description of the link.
• Columns. Contains information about the
columns in the table.
• Selection. Used to enter an optional SQL
SELECT clause (an Advanced procedure).
• View SQL. Displays the SQL SELECT statement
used to extract the data.
3. Choose dstage.EXAMPLE1 from the
Available tables drop-down list.
4. Click Add to add dstage.EXAMPLE1 to the
Table names field.
5. Click the Columns tab. The Columns tab
appears at the front of the dialog box. You must
specify the columns contained in the file you
want to use. Because the column definitions are
stored in a table definition in the Repository, you
can load them directly.
6. Click Load…. The Table Definitions window
appears with then UniVerse ✎ localuv branch
highlighted.
7. Select dstage.EXAMPLE1. The Select
Columns dialog box appears, allowing you to
select which column definitions you want to load.
8. In this case you want to load all available
columns definitions, so just click OK. The column
definitions specified in the table definition are
copied to the stage. The Columns tab contains
definitions for the four columns in EXAMPLE1:
☻Page 62 of 243☻
9. You can use the Data Browser to view the
actual data that is to be output from the
UniVerse stage. Click the View Data… button to
open the Data Browser window.
11. Choose File → Save to save your job design

so far.
Editing the Transformer Stage
The Transformer stage performs any data

conversion required before the data is output to
another stage in the job design. In this example,
the Transformer stage is used to convert the
data in the DATE column from an YYYYMM-DD
☻Page 63 of 243☻
date in internal date format to a string giving just
the year and month (YYYY-MM).
There are two links in the stage:
• The input from the data source (EXAMPLE1)
• The output to the Sequential File stage
To enable the use of one of the built-in
DataStage transforms, you will assign data
elements to the DATE columns input and output
from the Transformer stage. A DataStage data
element defines more precisely the kind of data
that can appear in a given column. In this
example, you assign the Date data element to
the input column, to specify the date is input to
the transform in internal format, and the
MONTH.TAG data element to the output column,
to specify that the transform produces a string of
the format YYYY-MM. Double-click the
Transformer stage to edit it. The Transformer
Editor appears:
1. Working in the upper-left pane of the

Transformer Editor, select the input columns that
you want to derive output columns from. Click on
the CODE, DATE, and QTY columns while holding
down the Ctrl key.
2. Click the left mouse button again and, keeping
it held down, drag the selected columns to the
output link in the upper-right pane. Drop the
columns over the Column Name field by
releasing the mouse button. The columns
appear in the top pane and the associated
metadata appears in the lower-right pane:
☻Page 64 of 243☻
3. In the Data element field for the
DSLink3.DATE column, select Date from the
drop-down list.
4. In the SQL type field for the DSLink4 DATE
column, select Char from the drop-down list.
5. In the Length field or the DSLink4 DATE
column, enter 7.
6. In the Data element field for the DSLink4
DATE column, select MONTH.TAG from the
drop-down list. Next you will specify the
transform to apply to the input DATE column to
produce the output DATE column. You do this in
the upper right pane of the Transformer Editor.
7. Double-click the Derivation field for the
DSLink4 DATE column. The Expression Editor box
appears. At the moment, the box contains the
text DSLink3.DATE, which indicates that the
output is directly derived from the input DATE
column. Select the text DSLink3 and delete it by
pressing the Delete key.
☻Page 65 of 243☻
10. Select the MONTH.TAG transform. It
appears in the Expression Editor box with the
argument field [%Arg1%] highlighted.
11. Right-click to open the Suggest Operand
menu again. This time, select Input Column. A
list of available input columns appears:
12. Select DSLink3.DATE. This then becomes

the argument for the transform.
☻Page 66 of 243☻
13. Click OK to save the changes and exit the
Transformer Editor. Once more the small icon
appears on the output link from the transformer
stage to indicate that the link now has column
definitions associated with it.
Editing the Sequential File Stage

The data warehouse is represented by a
Sequential File stage. The data to be written to
the data warehouse is already specified in the
Transformer stage. However, you must enter the
name of a file to which the data is written when
the job runs. If the file does not exist, it is
created. Double-click the stage to edit it. The
Sequential File Stage dialog box appears:
This dialog box has two pages:

• Stage. Displayed by default. This page
contains the name of the stage you are editing
and two tabs. The General tab specifies the line
termination type, and the NLS tab specifies a
character set map to use with the stage (this
appears if you have NLS installed).
• Inputs. Describes the data flowing into the
stage. This page only appears when you have an
input to a Sequential File stage. You do not need
to edit the column definitions on this page,
because they were all specified in the
Transformer stage.
☻Page 67 of 243☻
To edit the Sequential File stage:
1. Click the Inputs tab. The Inputs page

appears. This page contains:
• The name of the link. This is automatically set
to the link name used in the job design.
• General tab. Contains the pathname of the
file, an optional description of the link, and
update action choices. You can use the default
settings for this example, but you may want to
enter a file name (by default the file is named
after the input link).
• Format tab. Determines how the data is
written to the file. In this example, the data is
written using the default settings that is, as a
comma-delimited file.
• Columns tab. Contains the column definitions
for the data you want to extract. This tab
contains the column definitions specified in the
Transformer stage’s output link.
2. Enter the pathname of the text file you want to
create in the File name field, for example,
seqfile.txt. By default the file is placed in the
server project directory (for example,
c:\Ascential\DataStage\Projects\datastage) and is
named after the input link, but you can enter, or
browse for, a different directory.
3. Click OK to close the Sequential File Stage
dialog box.
4. Choose File ✎ Save to save the job design.
The job design is now complete and ready to be
compiled.
Compiling a Job
When you finish your design you must compile it

to create an executable job. Jobs are compiled
using the Designer. To compile the job, do one of
the following:
• Choose File → Compile.
• Click the Compile button on the toolbar.
The Compile Job window appears:
☻Page 68 of 243☻
Running a Job
Executable jobs are scheduled by the DataStage

Director and run by the DataStage Server. You
can start the Director from the Designer by
choosing Tools → Run Director.
When the Director is started, the DataStage
Director window appears with the status of all
the jobs in your project:
Highlight your job in the Job name column. To

run the job, choose Job → Run Now or click the
Run button on the toolbar. The Job Run
Options dialog box appears and allows you to
specify any parameter values and to specify any
job run limits. In this case, just click Run. The
status changes to Running. When the job is
complete, the status changes to Finished.
Choose File → Exit to close the DataStage
Director window.
☻Page 69 of 243☻
Developing a Job
The DataStage Designer is used to create and

develop DataStage jobs. A DataStage job
populates one or more tables in the target
database. There is no limit to the number of jobs
you can create in a DataStage project.
A job design contains:

• Stages to represent the processing steps
required
• Links between the stages to represent the flow
of data
There are three different types of job in

DataStage:
• Server jobs. These are available if you have
installed Server. They run on the DataStage
Server, connecting to other data sources as
necessary.
• Mainframe jobs. These are available only if
you have installed Enterprise MVS Edition.
Mainframe jobs are uploaded to a mainframe,
where they are compiled and run.
• Parallel jobs. These are available only if you
have installed the Enterprise Edition. These run
on DataStage servers that are SMP, MPP, or
cluster systems. There are two other entities
that are similar to jobs in the way they appear
in the DataStage Designer, and are handled by it.
These are:
• Shared containers. These are reusable job
elements. They typically comprise a number of
stages and links. Copies of shared containers can
be used in any number of server jobs and parallel
jobs and edited as required.
• Job Sequences. A job sequence allows you to
specify a sequence of DataStage server or
parallel jobs to be executed, and actions to take
depending on results.
STAGES:
A job consists of stages linked together which

describe the flow of data from a data source to a
☻Page 70 of 243☻
data target (for example, a final data
warehouse).
A stage usually has at least one data input and/or
one data output. However, some stages can
accept more than one data input, and output to
more than one stage. The different types of job
have different stage types. The stages that are
available in the DataStage Designer depend on
the type of job that is currently open in the
Designer.
Server Job Stages
DataStage offers several built-in stage types for

use in server jobs. These are used to represent
data sources, data targets, or conversion stages.
These stages are either passive or active stages.
A passive stage handles access to databases for
the extraction or writing of data. Active stages
model the flow of data and provide mechanisms
for combining data streams, aggregating data,
and converting data from one data type to
another.
As well as using the built-in stage types, you can
also use plug-in stages for specific operations
that the built-in stages do not support. The
Palette organizes stage types into different
groups, according to function:
• Database
• File
• PlugIn
• Processing
• Real Time
Stages and links can be grouped in a shared

container. Instances of the shared container can
then be reused in different server jobs. You can
also define a local container within a job, this
groups stages and links into a single unit, but can
only be used within the job in which it is defined.
Each stage type has a set of predefined and
editable properties. These properties are viewed
or edited using stage editors.
☻Page 71 of 243☻
At this point in your job development you need to
decide which stage types to use in your job
design. The following built-in stage types are
available for server jobs:
☻Page 72 of 243☻
Mainframe Job Stages
☻Page 73 of 243☻
DataStage offers several built-in stage types for
use in mainframe jobs. These are used to
represent data sources, data targets, or
conversion stages.
The Palette organizes stage types into different
groups, according to function:
• Database
• File
• Processing
Each stage type has a set of predefined and
editable properties. Some stages can be used as
data sources and some as data targets. Some
can be used as both. Processing stages read
data from a source, process it andwrite it to a
data target target. These properties are viewed
or edited usingstage editors. A stage editor
exists for each stage type and At this point in
your job development you need to decide which
stage types to use in your job design.
☻Page 74 of 243☻
Parallel jobs Processing Stages
☻Page 75 of 243☻
SERVER JOBS:
☻Page 76 of 243☻
When you design a job you see it in terms of
stages and links. When it is compiled, the
DataStage engine sees it in terms of processes
that are subsequently run on the server. How
does the DataStage engine define a process? It is
here that the distinction between active and
passive stages becomes important. Actives
stages, such as the Transformer and Aggregator
perform processing tasks, while passive stages,
such as Sequential file stage and ODBC stage,
are reading or writing data sources and provide
services to the active stages. At its simplest,
active stages become processes. But the
situation becomes more complicated where you
connect active stages together and passive
stages together.
☻Page 77 of 243☻
Single Processor and Multi-Processor
Systems
The default behavior when compiling DataStage

jobs is to run all adjacent active stages in a
single process. This makes good sense when you
are running the job on a single processor system.
When you are running on a multi-processor
system it is better to run each active stage in a
separate process so the processes can be
distributed among available processors and run
in parallel. The enhancements to server jobs at
Release 6 of DataStage make it possible for you
to stipulate at design time that jobs should be
compiled in this way. There are two ways of
doing this:
• Explicitly – by inserting IPC stages between
connected active stages.
• Implicitly – by turning on inter-process row
buffering either project wide (using the
DataStage Administrator) or for individual jobs
(in the Job Properties dialog box)
The IPC facility can also be used to produce
multiple processes where passive stages are
directly connected. This means that an operation
reading from one data source and writing to
another could be divided into a reading process
and a writing process able to take advantage of
multiprocessor systems.
☻Page 78 of 243☻
Partitioning and Collecting
☻Page 79 of 243☻
With the introduction of the enhanced multi-
processor support at Release6, there are
opportunities to further enhance the
performance of server jobs by partitioning data.
The Link Partitioner stage allows you to partition
data you are reading so it can be processed by
individual processors running on multiple
processors. The Link Collector stage allows you
to collect partitioned data together again for
writing to a single data target. The following
diagram illustrates how you might use the Link
Partitioner and Link Collector stages within a job.
Both stages are active, and you should turn on
inter-process row buffering at project or job level
in order to implement process boundaries.
Aggregator Stages
Aggregator stages classify data rows from a

single input link into groups and compute totals
or other aggregate functions for each group. The
summed totals for each group are output from
the stage via an output link.
Using an Aggregator Stage
If you want to aggregate the input data in a

number of different ways, you can have several
output links, each specifying a different set of
properties to define how the input data is
grouped and summarized.
When you edit an Aggregator stage, the
Aggregator Stage dialog box appears:
☻Page 80 of 243☻
This dialog box has three pages:
• Stage. Displays the name of the stage you are
editing. This page has a General tab which
contains an optional description of the stage and
names of before- and after-stage routines
• Inputs. Specifies the column definitions for the
data input link.
• Outputs. Specifies the column definitions for
the data output link.
Defining Aggregator Input Data
Data to be aggregated is passed from a previous

stage in the job design and into the Aggregator
stage via a single input link. The properties of
this link and the column definitions of the data
are defined on the Inputs page in the
Aggregator Stage dialog box.
☻Page 81 of 243☻
Note: The Aggregator stage does not preserve
the order of input rows, even when the incoming
data is already sorted.
The Inputs page has the following field and two
tabs:
• Input name. The name of the input link to the
Aggregator stage.
• General. Displayed by default. Contains an
optional description of the link.
• Columns. Contains a grid displaying the
column definitions for the data being written to
the stage, and an optional sort order.
Column name: The name of the column.
Sort Order: Specifies the sort order. This
field is blank by default, that is, there is no
sort order. Choose Ascending for
ascending order, Descending for
descending order, or Ignore if you do not
want the order to be checked.
Key: Indicates whether the column is part of
the primary key.
SQL type: The SQL data type.
Length: The data precision. This is the length
for CHAR data and the maximum length for
VARCHAR data.
Scale: The data scale factor.
Nullable: Specifies whether the column can
contain null values.
Display: The maximum number of characters
required to display the column data.
Data element: The type of data in the
column.
Description: A text description of the
column.
Defining Aggregator Output Data

When you output data from an Aggregator stage,
the properties of output links and the column
definitions of the data are defined on the
Outputs page in the Aggregator Stage dialog
box.
☻Page 82 of 243☻
The Outputs page has the following field and
two tabs:
• Output name. The name of the output link.
Choose the link to edit from the Output name
drop-down list box. This list box displays all the
output links from the stage.
• General. Displayed by default. Contains an
optional description of the link.
• Columns. Contains a grid displaying the
column definitions for the data being output
from the stage. The grid has the following
columns:
Column name. The name of the column.
Group. Specifies whether to group by the
data in the column.
Derivation. Contains an expression
specifying how the data is aggregated.
This is a complex cell, requiring more than
one piece of information. Double-clicking
the cell opens the Derivation
Transformer Stages
Transformer stages do not extract data or write

data to a target database. They are used to
handle extracted data, perform any conversions
required, and pass data to another Transformer
stage or a stage that writes data to a target data
table.
Using a Transformer Stage
☻Page 83 of 243☻
Transformer stages can have any number of
inputs and outputs. The link from the main data
input source is designated the primary input link.
There can only be one primary input link, but
there can be any number of reference inputs.
When you edit a Transformer stage, the
Transformer Editor appears. An example
Transformer stage is shown below. In this
example, metadata has been defined for the
input and the output links.
Link Area
The top area displays links to and from the

Transformer stage, showing their columns and
the relationships between them. The link area is
where all column definitions, key expressions,
and stage variables are defined. The link area is
divided into two panes; you can drag the splitter
bar between them to resize the panes relative to
one another. There is also a horizontal scroll bar,
allowing you to scroll the view left or right. The
left pane shows input links, the right pane shows
output links. The input link shown at the top of
the left pane is always the primary link. Any
subsequent links are reference links. For all types
of link, key fields are shown in bold. Reference
link key fields that have no expression defined
are shown in red (or the color defined in Tools
‰ Options), as are output columns that have no
derivation defined.
☻Page 84 of 243☻
Within the Transformer Editor, a single link may
be selected at any one time. When selected, the
link’s title bar is highlighted, and arrowheads
indicate any selected columns.
Metadata Area
The bottom area shows the column metadata for

input and output links. Again this area is divided
into two panes: the left showing input link
metadata and the right showing output link
metadata. The metadata for each link is shown
in a grid contained within a tabbed page. Click
the tab to bring the required link to the front.
That link is also selected in the link area.
If you select a link in the link area, its metadata
tab is brought to the front automatically. You can
edit the grids to change the column metadata on
any of the links. You can also add and delete
metadata.
Input Links
The main data source is joined to the

Transformer stage via the primary link, but the
stage can also have any number of reference
input links.
A reference link represents a table lookup. These

are used to provide information that might affect
the way the data is changed, but do not supply
the actual data to be changed. Reference input
columns can be designated as key fields. You can
specify key expressions that are used to evaluate
the key fields. The most common use for the
key expression is to specify an equi-join, which is
a link between a primary link column and a
reference link column. For example, if your
primary input data contains names and
addresses, and a reference input contains names
and phone numbers, the reference link name
column is marked as a key field and the key
expression refers to the primary link’s name
column. During processing, the name in the
primary input is looked up in the reference input.
If the names match, the reference data is
☻Page 85 of 243☻
consolidated with the primary data. If the names
do not match, i.e., there is no record in the
reference input whose key matches the
expression given, all the columns specified for
the reference input are set to the null value.
Output Links
You can have any number of output links from

your Transformer stage. You may want to pass
some data straight through the Transformer
stage unaltered, but it’s likely that you’ll want to
transform data from some input columns before
outputting it from the Transformer stage. You
can specify such an operation by entering a
BASIC expression or by selecting a transform to
apply to the data. DataStage has many built-in
transforms, or you can define your own custom
transforms that are stored in the Repository and
can be reused as required. The source of an
output link column is defined in that column’s
Derivation cell within the Transformer Editor.
You can use the Expression Editor to enter
expressions or transforms in this cell. You can
also simply drag an input column to an output
column’s Derivation cell, to pass the data
straight through the Transformer stage. In
addition to specifying derivation details for
individual output columns, you can also specify
constraints that operate on entire output links. A
constraint is a BASIC expression that specifies
criteria that data must meet before it can be
passed to the output link. You can also specify a
reject link, which is an output link that carries all
the data not output on other links, that is,
columns that have not met the criteria. Each
output link is processed in turn. If the constraint
expression evaluates to TRUE for an input row,
the data row is output on that link. Conversely, if
a constraint expression evaluates to FALSE for an
input row, the data row is not output on that link.
Constraint expressions on different links are
independent. If you have more than one output
link, an input row may result in a data row being
output from some, none, or all of the output
links. For example, if you consider the data that
☻Page 86 of 243☻
comes from a paint shop, it could include
information about any number of different colors.
If you want to separate the colors into different
files, you would set up different constraints. You
could output the information about green and
blue paint on Link A, red and yellow paint on Link
B, and black paint on Link C. When an input
row contains information about yellow paint, the
Link A constraint expression evaluates to FALSE
and the row is not output on Link A. However,
the input data does satisfy the constraint
criterion for Link B and the rows are output on
Link B. If the input data contains information
about white paint, this does not satisfy any
constraint and the data row is not output on
Links A, B or C, but will be output on the reject
link. The reject link is used to route data to a
table or file that is a “catch-all” for rows that are
not output on any other link. The table or file
containing these rejects is represented by
another stage in the job design.
Inter-Process Stages
An inter-process (IPC) stage is a passive stage

which provides a communication channel
between DataStage processes running
simultaneously in the same job. It allows you to
design jobs that run on SMP systems with great
performance benefits. To understand the
benefits of using IPC stages, you need to know a
bit about how DataStage jobs actually run as
processes,
In this example the job will run as two processes,

one handling the communication from sequential
file stage to IPC stage, and one handling
☻Page 87 of 243☻
communication from IPC stage to ODBC stage. As
soon as the Sequential File stage has opened its
output link, the IPC stage can start passing data
to the ODBC stage. If the job is running on a
multi-processor system, the two processor can
run simultaneously so the transfer will be much
faster. You can also use the IPC stage to
explicitly specify that connected active stages
should run as separate processes. This is
advantageous for performance on multi-
processor systems. You can also specify this
behavior implicitly by turning inter process row
buffering on, either for the whole project via
DataStage Administrator, or individually for a job
in its Job Properties dialog box.
Using the IPC Stage
When you edit an IPC stage, the InterProcess

Stage dialog box appears.
☻Page 88 of 243☻
This dialog box has three pages:
• Stage. The Stage page has two tabs, General
and Properties. The General page allows you
to specify an optional description of the page.
The Properties tab allows you to specify stage
properties.
• Inputs. The IPC stage can only have one input
link. the Inputs page displays information about
that link.
• Outputs. The IPC stage can only have one
output link. The Outputs page displays
information about that link.
Defining IPC Stage Properties
The Properties tab allows you to specify two

properties for the IPC stage:
• Buffer Size. Defaults to 128 Kb. The IPC stage
uses two blocks of memory; one block can be
written to while the other is read from. This
property defines the size of each block, so that
by default 256 Kb is allocated in total.
• Timeout. Defaults to 10 seconds. This gives
time limit for how long the stage will wait for a
process to connect to it before timing out. This
normally will not need changing, but may be
important where you are prototyping multi-
processor jobs on single processor platforms and
there are likely to be delays.
Defining IPC Stage Input Data
☻Page 89 of 243☻
The IPC stage can have one input link. This is
where the process that is writing connects.
The Inputs page has two tabs: General and
Columns.
• General. The General tab allows you to
specify an optional description of the stage.
• Columns. The Columns tab contains the
column definitions for the data on the input link.
This is normally populated by the metadata of
the stage connecting on the input side. You can
also Load a column definition from the
Repository, or type one in yourself (and Save it
to the Repository if required). Note that the
metadata on the input link must be identical to
the metadata on the output link.
Defining IPC Stage Output Data
The IPC stage can have one output link. This is

where the process that is reading connects.
The Outputs page has two tabs: General and
Columns.
• General. The General tab allows you to
specify an optional description of the stage.
• Columns. The Columns tab contains the
column definitions for the data on the input link.
This is normally populated by the metadata of
the stage connecting on the input side. You can
also Load a column definition from the
Repository, or type one in yourself (and Save it
to the Repository if required). Note that the
metadata on the output link must be identical to
the metadata on the input link.
Link Partitioner Stage:
The Link Partitioner stage is an active stage

which takes one input and allows you to
distribute partitioned rows to up to 64 output
links. The stage expects the output links to use
the same metadata as the input link. Partitioning
your data enables you to take advantage of a
multi-processor system and have the data
processed in parallel. It can be used in
☻Page 90 of 243☻
conjunction with the Link Collector stage to
partition data, process it in parallel, and then
collect it together again before writing it to a
single target.
In order for this job to compile and run as

intended on a multi-processor system you must
have inter-process buffering turned on, either at
project level using the DataStage Administrator,
or at job level from the Job Properties dialog
box.
Defining Link Partitioner Stage Properties

properties for the Link Partitioner stage:
• Partitioning Algorithm. Use this property to

specify the method the stage uses to partition
data. Choose from:
– Round-Robin. This is the default method.
Using the round-robin method the stage will write
each incoming row to one of its output links in
turn.
– Random. Using this method the stage will use
a random number generator to distribute
incoming rows evenly across all output links.
– Hash. Using this method the stage applies a
hash function to one or more input column
values to determine which output link the row is
passed to.
– Modulus. Using this method the stage applies
a modulus function to an integer input column
value to determine which output link the row is
passed to.
☻Page 91 of 243☻
• Partitioning Key. This property is only
significant where you have chosen a partitioning
algorithm of Hash or Modulus. For the Hash
algorithm, specify one or more column names
separated by commas. These keys are
concatenated and a hash function applied to
determine the destination output link. For the
Modulus algorithm, specify a single column name
which identifies an integer numeric column. The
value of this column value determines the
destination output link.
Link Collector Stages
The Link Collector stage is an active stage which

takes up to 64 inputs and allows you to collect
data from this links and route it along a single
output link. The stage expects the output link to
use the same metadata as the input links. The
Link Collector stage can be used in conjunction
with a Link Partitioner stage to enable you to
take advantage of a multi-processor system and
have data processed in parallel. The Link
Partitioner stage partitions data, it is processed
in parallel, then the Link Collector stage collects
it together again before writing it to a single
target.
The following diagram illustrates how the Link

Collector stage can be used in a job in this way.
In order for this job to compile and run as

intended on a multi-processor system you must
☻Page 92 of 243☻
have inter-process buffering turned on, either at
project level using the Data Stage Administrator,
or at job level from the Job Properties dialog
box.
Defining Link Collector Stage Properties

properties for the Link Collector stage:
• Collection Algorithm. Use this property to
specify the method the stage uses to collect
data. Choose from:
– Round-Robin. This is the default method.
Using the round-robin method the stage will read
a row from each input link in turn.
– Sort/Merge. Using the sort/merge method the
stage reads multiple sorted inputs and writes one
sorted output.
• Sort Key. This property is only significant
where you have chosen a collecting algorithm of
Sort/Merge. It defines how each of the partitioned
data sets are known to be sorted and how the
merged output will be sorted. The key has the
following format: Column name {sort order]
[,Column name [sort order]]...
INFORMATICA vs DATASTAGE:
Features Informatica DataStage

System Requirement
- Platform Support Win NT/UNIX Win NT/UNIX/More Platforms
Deployment Facility
- Ability to handle initial Yes, No,
deployment, major releases, My experience has been Ascential has done a good job
minor releases and patches that INFA is definitely in recent releases.
with equal ease easier to implement
initially and upgrade.
Transformations
☻Page 93 of 243☻
- No of available 58 28,
transformation functions DS has many more canned
transformation functions than
28.
- Support for looping the Supports for comparing Does not support
source row (For While Loop) immediate previous
record
- Slowly Changing Dimension Full history, recent Supports only through Custom
values, Current & Prev scripts. Does not have a
values wizard to do this.
DS has a component called
ProfileStage that handles this
type of comparison. You'll want
to use it judiciously in your
production processing because
it does take extra resources to
use it but I have
found it to be very useful.
- Time Dimension generation Does not support. Does not support.
- Rejected Records Can be captured Cannot be captured in
separate file.
DS absolutely has the ability to
capture rejected records in a
separate file. That's a pretty
basic capability and I don't
know of any ETL tool
that can't do it...
- Debugging Facility Not Supported. Supports basic debugging
facilities for testing.
Application Integration
Functionality
- Support for real Time Not Available Not Available,
Data Exchange The 7.x version of DS has a
component to handle real-time
data exchange. I think it is
called RTE.
- Support for CORBA/XML Does not support Does not support
Metadata
- Ability to view & navigate Does Not Support Job sessions can be monitored
metadata on the web using Informatica
Classes.
This is completely not true. DS
has a very strong metadata
component (MetaStage) that
works not only with DS, but
also has plug-ins to work
with modeling tools (like
ERWin) and BI tools (like
Cognos). This is one
of their strong suits (again,
IMHO).
- Ability to Customize views of Supports Not Available,
metadata for different users Also not true - MetaStage
(DBA Vs allows publishing of metadata
☻Page 94 of 243☻
Business user). in HTML format for different
types of users. It is completely
customizable.
- Metadata repository can be Yes No. But the proprietary meta
stored in RDBMS data can be moved to a
RDBMS using the DOC Tool
Support And Maintenance

- Command line operation Pmcmd -server interface Not Available
for command line
- Ability to maintain versions Yes No,
of mappings Not true - this has been a weak
spot for DS in past releases,
but the7.x version of DS has a
good versioning tool.
Job Controlling & Scheduling

- Alerts like sending mails Supported Does not support directly (no
option). But possible
to call custom programs after
the job get executed)
1) System Requirement
1.1 Platform Support
1.1.1 Informatica: Win NT/ Unix
1.1.2 DataStage: Win NT/ Unix/More platforms.
2) Deployment facility
2.1. Ability to handle initial deployment, major releases, minor
releases and patches with equal ease
2.1.1.Informatica:. Yes
2.1.2.DataStage: No
My experience has been that INFA is definitiely easier to
implementinitially and upgrade. Ascential has done a good job in
recent releases
to improve, but IMHO INFA still does this better.
3) Transformations
3.1. No of available transformation functions
3.1.1.Informatica:. 58
3.1.2.DataStage: 28
DS has many more canned transformation functions than 28. I'm
not surewhat leads you to this number, but I'd recheck it if I were
you.
3.2. Support for looping the source row (For While Loop)
3.2.1.Informatica:. Supports for comparing immediate previous
record
3.2.2.DataStage: Does not support
3.3. Slowly Changing Dimension

3.3.1.Informatica:. Supports Full history, recent values, Current &
Prev values.
3.3.2.DataStage: Supports only through Custom scripts. Does not
☻Page 95 of 243☻
have a
wizard to do this
DS has a component called ProfileStage that handles this type
ofcomparison. You'll want to use it judisciously in your production
processing because it does take extra resources to use it but I have
found it to be very useful.
3.4. Time Dimension generation

3.4.1.Informatica:. Does not support.
3.5. Rejected Records

3.5.1.Informatica:. Can be captured
3.5.2.DataStage: Cannot be captured in separate file
DS absolutely has the ability to capture rejected records in a
separatefile. That's a pretty basic capability and I don't know of any
ETL tool
that can't do it...
3.5. Debugging Facility

3.5.1.Informatica:. Not Supported
3.5.2.DataStage: Supports basic debugging facilities for testing.
4) Application Integration Functionality

4.1. Support for real Time
Data Exchange
4.1.1..Informatica:. Not Available
4.1.2.DataStage: Not Available.
The 7.x version of DS has a component to handle real-time
dataexchange. I have not personnaly used it yet, but you should
look into
it. I think it is called RTE.
4.2. Support for CORBA/XML

4.1.1..Informatica:. Does not support
5) Metadata
5.1. Ability to view & navigate metadata on the web
5.1.1..Informatica:. Does not support
5.1.2.DataStage: Job sessions can be monitored using Informatica
Classes
This is completely not true. DS has a very strong metadata

component(MetaStage) that works not only with DS, but also has
plug-ins to work
with modeling tools (like ERWin) and BI tools (like Cognos). This is
one
of their strong suits (again, IMHO).
5.1. Ability to Customize views of metadata for different users (DBA

Vs
Business user)
5.1.1..Informatica:. Supports.
5.1.2.DataStage: Not Available
Also not true - MetaStage allows publishing of metadata in HTML
☻Page 96 of 243☻
format for different types of users. It is completely customizable.
5.1. Metadata repository can be stored in RDBMS

5.1.1..Informatica:. Yes
5.1.2.DataStage: No. But the proprietary meta data can be moved
to a
RDBMS using the DOC Tool
6) Support And Maintenance

6.1. Command line operation
6.1.1..Informatica:. Pmcmd -server interface for command line
6.1.2.DataStage: Not Available
6.2. Ability to maintain versions of mappings

6.1.1..Informatica:. Yes
6.1.2.DataStage: No
Not true - this has been a weak spot for DS in past releases, but
the7.x version of DS has a good versioning tool.
7) Job Controlling & Scheduling

7.1. Alerts like sending mails
7.1.1..Informatica:. Supported.
7.1.2.DataStage: Does not support directly ( no option). But
possible
to call custom programs after the job get executed)
Further mistakes in your comparison, mainly from a DataStage

based angle as my experience is with that product:
 Both DataStage and Informatica support XML. DataStage

comes with XML input, transformation and output stages.
 Both products have an unlimited number of transformation
functions since you can easily write your own using the
command interface.
 Both products have options for integrating with ERP
systems such as SAP, PeopleSoft and Seibel but these
come at a significant extra cost. You may need to evaluate
these. SAP is a reseller of DataStage for SAP BW,
PeopleSoft bundles DataStage in its EPM products.
 DataStage has some very good debugging facilities
including the ability to step through a job link by link or row
by row and watch data values as a job executes. Also
server side tracing.
 DataStage 7.x releases have intelligent assistants
(wizards) for creating the template jobs for each type of
slowly changing dimension table loads. The DataStage
Best Practices course also provides training in DW loading
with SCD and surrogate key techniques.
 Ascential and Informatica both have robust metadata
management products. Ascential MetaStage comes
bundled free with DataStage Enterprise and manages
metadata via a hub and spoke architecture. It can import
metadata from a wide range of databases and modelling
tools and has a high degree of interaction with DataStage
☻Page 97 of 243☻
for operational metadata. Informatica SuperGlue was
released last year and is rated more highly by Gartner in
the metadata field. It integrates closely with PowerCenter
products. They both support multiple views (business and
technical) of metadata plus the functions you would expect
such as impact analysis, semantics and data lineage.
 DataStage can send emails. The sequence job has an
email stage that is easy to configure. DataStage 7.5 also
has new mobile device support so you can administer your
DataStage jobs via a palm pilot. There are also 3rd party
web based tools that let you run and review jobs over a
browser. I found it easy to send sms admin messages
from a DataStage Unix server.
 DataStage has a command line interface. The dsjob
command can be used by any scheduling tool or from the
command line to run jobs and check the results and logs of
jobs.
 Both products integrate well with Trillium for data quality,
DataStage also integrate with QualityStage for data quality.
This is the preferred method of address cleansing and
fuzzy matching.
Milind - I've got to ask - where are you getting your information
from??? I have done ETL tool comparisons for several clients over
the past 7 or so years. They are both good tools with different
strengths so it really depends on what your organizations needs /
priorities are as to which one is "better". I have spent much more
time in the past couple of years on DS than INFA so I don't feel I
can speak to the changes INFA has made lately, but I know you
have incorrect info about DS.
I am currently working with a client on DS v7.1. I've made a few

comments below for the more glaring inaccuracies or topics where I
have up-to-date experience. I suggest you re-research and perhaps
do a proof-of-concept with each vendor.
FYI - I don't know if you have looked at the Parallel Extender

component of DS 7.x, but it is a terrific capability if you have
challenges with meeting availability requirements. It is one of the
most impressive changes Ascential has made lately (IMHO).
Gartner has vendor reports on Ascential and Informatica.

They also have a magic quadrant that lists both DataStage
and Informatica as the clear market leaders. I don't think you
can go wrong with either product, it comes down to whether
you can access experts in these products for your project and
what options you have for training. I think if you go into a
major project with either product and you don't have an expert
on your team it can go badly wrong.
Further mistakes in your comparison, mainly from a

DataStage based angle as my experience is with that product:
☻Page 98 of 243☻
- Both DataStage and Informatica support XML.
DataStage comes with XML input, transformation
and output stages.
- Both products have an unlimited number of
transformation functions since you can easily write
your own using the command interface.
- Both products have options for integrating with ERP
systems such as SAP, PeopleSoft and Seibel but these
come at a significant extra cost. You may need to
evaluate these. SAP is a reseller of DataStage for SAP
BW, PeopleSoft bundles DataStage in its EPM
products.
- DataStage has some very good debugging facilities
including the ability to step through a job link by link
or row by row and watch data values as a job executes.
Also server side tracing.
- DataStage 7.x releases have intelligent assistants
(wizards) for creating the template jobs for each type
of slowly changing dimension table loads. The
DataStage Best Practices course also provides training
in DW loading with SCD and surrogate key
techniques.
- Ascential and Informatica both have robust metadata
management products. Ascential MetaStage comes
bundled free with DataStage Enterprise and manages
metadata via a hub and spoke architecture. It can
import metadata from a wide range of databases and
modelling tools and has a high degree of interaction
with DataStage for operational metadata. Informatica
SuperGlue was released last year and is rated more
highly by Gartner in the metadata field. It integrates
closely with PowerCenter products. They both
support multiple views (business and technical) of
metadata plus the functions you would expect such as
impact analysis, semantics and data lineage.
- DataStage can send emails. The sequence job has an
email stage that is easy to configure. DataStage 7.5
also has new mobile device support so you can
administer your DataStage jobs via a palm pilot.
There are also 3rd party web based tools that let you
run and review jobs over a browser. I found it easy to
send sms admin messages from a DataStage Unix
server.
- DataStage has a command line interface. The dsjob
command can be used by any scheduling tool or from
the command line to run jobs and check the results
and logs of jobs.
☻Page 99 of 243☻
- Both products integrate well with Trillium for data
quality, DataStage also integrate with QualityStage for
data quality. This is the preferred method of address
cleansing and fuzzy matching.
How Should We Implement A Slowly Changing

Dimension?
Currently, our data warehouse has only Type 1 Slowly

Changing Dimensions (SCD). That is to say we overwrite the
dimension record with every update. The problem with that is
when data changes, it changes for all history while this is
valid for data entry corrections, it may not be valid for all
data. An acceptable example could be Customer Date of
Birth. If the date of birth was changed, chances are the reason
was that their data was incorrect.
However, if the Customer address were changed, this may

and probably does mean the customer moved. If we simply
overwrite the address then all sales for that customer will
belong to the new address. Suppose the customer moved
from Florida to Ohio. If we were trying to track sales patterns
by region, all of the customer’s purchase that were made in
Florida would now appear to have been made in Ohio.
Type 1 Slowly Changing Dimension

Customer Dimension
CODE
ID CustKey Name DOB City State
1001 BS001 Bob Smith 6/8/1961 Tampa FL
1002 LJ004 Lisa Jones 10/15/1954 Miami FL
Customer Dimension After Edits
CODE
1001 BS001 Bob Smith 6/8/1961 Dayton OH
In the example above, the DOB change doesn’t affect any

dimensional reporting facts. However, the City, State change
would have an affect. Now all sales for Bob Smith would
appear to come from Dayton, Ohio rather than from Tampa,
Florida.
The solution we have chosen for solving this problem is to
☻Page 100 of 243☻

implement a Type 2 slowly changing dimension. A Type 2
SCD records a separate row each time a value is changed in
the dimension. In our case, we are declaring that we will only
create a new dimension record when certain columns are
changed. In the example above, we would not record a new
record for the DOB change but we would for the address
change.
Type 2 Slowly Changing Dimension

Customer Dimension
CODE
ID CustKey Name DOB City St Curr
Effective Date
Y 5/1/2004
Y 5/2/2004
Customer Dimension After Edits
CODE
Effective Date
1001 BS001 Bob Smith 6/8/1961 Tampa FL N
5/1/2004
1002 LJ004 Lisa Jones 10/15/1954 Miami FL Y
5/2/2004
1003 BS001 Bob Smith 6/8/1961 Dayton OH Y
5/27/2004
As you can see, there are two dimension records for Bob
Smith now. They both have the same CustKey values, but the
have different ID values. All future fact table rows will use
the new ID to link to the Customer dimension. This is
accomplished by the use of the Current Flag. The ETL
process looks only at the current flag when recording new
orders. However, in the case of an update to an order the
Effective Date must be used to determine which customer the
update applies to.
The primary issue with Type 2 SCD is the volume of data

grows exponentially as more changes are tracked. This can
impact performance in a star schema. The principle behind
the star schema design is that while facts are few columns,
they have many rows but they only have to perform single
level joins to resolve their dimensions. The assumption is that
☻Page 101 of 243☻

the dimensions have lots of columns but relatively few
rows. This allows for very fast joining of data.
Conforming Dimensions
For the purposes of this discussion conforming dimensions

only need a brief definition. Conforming dimensions are a
feature of star schemas that allow facts to share dimensional
data. A conforming dimension occurs when two dimensions
share the same keys. Often they have different
attributes. The goal is to ensure that any fact table can link to
the conforming dimension and consume its data so long as the
dimension is relevant.
Conforming Dimension
Customer Dimension
CODE
Billing Dimension
CODE
ID Bill2Ky Name Account Type Credit
Limit CustKey
1001 9211 Bob Smith Credit $10,000
BS001
1002 23421 Lisa Jones Cash $100
LJ004
In the example above, we could use the ID from the Customer

dimension in a fact and in the future a link to the Billing
dimension could be established without having to reload the
data.
We are considering a slight modification to the standard Type

2 SCD. The idea is to maintain two dimensions one as a Type
1 and one as a Type 2. The problem with this is we lose the
ability to use conforming dimensions.
Type 2 and Type 1 Slowly Changing Dimension

Customer Dimension Type 1
☻Page 102 of 243☻

CODE
Effective Date
Y 5/1/2004
Y 5/2/2004
CODE
Effective Date
N 5/1/2004
Y 5/2/2004
Y 5/27/2004
As you can see, the current ID for Bob Smith in the Type 1
SCD is 1001, while it is 1003 in the Type 2 SCD. This is not
conforming.
Our solution is to create a composite key for the Type 2 SCD.
Type 2 and Type 1 Slowly Changing Dimension

CODE
ID CustKey Name DOB City St
CODE
ID SubKey CustKey Name DOB City
St Curr Eff Date
1001 001 BS001 Bob Smith 6/8/1961 Tampa
FL N 5/1/2004
1002 001 LJ004 Lisa Jones 10/15/1957 Miami
FL Y 5/2/2004
1001 002 BS001 Bob Smith 6/8/1961
Dayton OH Y 5/27/2004
☻Page 103 of 243☻

In the example above, the Type 1 and the Type 2 dimensions
conform on the ID level. If a fact needs the historical data it
will consume both the ID and the SubKey.
BEFORE YOU DESIGN YOUR

APPLICATION
You must assess your data. Data Stage jobs can be quite
complex and so it is advisable to consider the following
before starting a job:
• The number and type of data sources. You will

need a stage for each data source you want to access.
For each different type of data source you will need a
different type of stage.
• The location of the data. Is your data on a networked
disk or a tape? You may find that if your data is on a
tape, you will need to arrange for a custom stage to
extract the data.
• Whether you will need to extract data from a

mainframe source. If this is the case, you will need
Enterprise MVS Edition installed and you will use
mainframe jobs that actually run on the mainframe.
• The content of the data. What columns are in your

data? Can you import the table definitions, or will
you need to define them manually? Are definitions
of the data items consistent between data sources?
• The data warehouse. What do you want to store in
the data warehouse and how do you want to store
it?
☻Page 104 of 243☻

To assign a null value to a variable, use this syntax:
variable = @NULL
To assign a character string containing only the character

used to represent the null value to a variable, use this
syntax:
variable = @NULL.STR
Errors that occur as the files are loaded into Oracle are
recorded in the sqlldr log file.
Rejected rows are written to the bad file. The main reason for
rejected rows is an integrity constraint in the target table; for
example, null values in NOT NULL columns, nonunique values in
UNIQUE columns, and so on. The bad file is in the same format
as the input data file
• String operators for:

• Concatenating strings with Cat or :
• Extracting sub strings with [ ]
Hello. ′ : ′ My Name is ′ : X : ″ . What’s yours?″
... evaluates to:
″ Hello. My name is Tarzan. What’s yours?″
Field Function: Returns delimited substrings in a string
Returns delimited substrings in a string
MyString = "London+0171+NW2+AZ"
SubString = Field(Mystring, "+", 2, 2)
* returns "0171+NW2"
A=′ 12345′
A[3]=1212
The result is 121212.
MyString = "1#2#3#4#5"
String = Fieldstore (MyString, "#", 2, 2, "A#B")
* above results in: "1#a#B#4#5"
Operator Relation Example
☻Page 105 of 243☻

Eq or = Equality X=Y
Ne or # or >< or <> Inequality X # Y, X <> Y
Lt or < Less than X<Y
Gt or > Greater than X>Y
Le or <= or =< or #> Less than or equal to X <= Y
Ge or >= or => or #< Greater than or equal to X >= Y
You cannot use relational operators to test for a null value.

Use the IsNull function instead.
Tests if a variable contains a null value.
MyVar = @Null ;* sets variable to null value

If IsNull(MyVar * 10) Then
* Will be true since any arithmetic involving a null value
* results in a null value.
End
IF Operator:
Assigns a value that meets the specified conditions
• Return A or B depending on value in Column1:

If Column1 > 100 Then "A" Else "B"
Function MyTransform(Arg1, Arg2, Arg3)

* Then and Else clauses occupying a single line each:
If Arg1 Matches "A..."
Then Reply = 1
Else Reply = 2
* Multi-line clauses:
If Len(arg1) > 10 Then
Reply += 1
Reply = Arg2 * Reply
End Else
Reply += 2
Reply = (Arg2 - 1) * Reply
End
* Another style of multiline clauses:
If Len(Arg1) > 20
Then
Reply += 2
Reply = Arg3 * Reply
End
Else
Reply += 4
Reply = (Arg3 - 1) * Reply
End
Return(Reply)
Calls a subroutine. Not available in expressions.
☻Page 106 of 243☻

Syntax
Call subroutine [ ( argument [ ,argument ] … ) ]
Subroutine MyRoutineA(InputArg, ErrorCode)

ErrorCode = 0 ;* set local error code
* When calling a user-written routine that is held in the
* DataStage Repository, you must add a "DSU." Prefix.
* Be careful to supply another variable for the called
* routine's 2nd argument so as to keep separate from our
own.
Call DSU.MyRoutineB("First argument", ErrorCodeB)
If ErrorCodeB <> 0 Then
... ;* called routine failed - take action
Endif
Return
Special DataStage BASIC Subroutines
DataStage provides some special DataStage subroutines

for use in a before/after subroutines or custom transforms.
You can:
• Log events in the job's log file using DSLogInfo,

DSLogWarn, DSLogFatal, and
DSTransformError
• Execute DOS or DataStage Engine commands using
DSExecute
All the subroutines are called using the Call statement.
Logs an information message in a job's log file.
Syntax
Call DSLogInfo (Message, CallingProgName)
Example
Call DSLogInfo("Transforming: ":Arg1,
"MyTransform")
Example
Call DSLogInfo("Transforming: ":Arg1,
"MyTransform")
Date( ) :
Returns a date in its internal system format.
☻Page 107 of 243☻

This example shows how to turn the current date in
internal form into a string representing the next day:
Tomorrow = Oconv(Date() + 1, "D4/YMD") ;*

"1997/5/24"
Ereplace Function:
Formats data for output.:
Replaces one or more instances of a substring.
Syntax
Ereplace (string, substring, replacement [ ,number [ ,begin] ]

)
MyString = "AABBCCBBDDBB"
NewString = Ereplace(MyString, "BB", "")
* The result is "AACCDD"
= FMT("1234567", "14R2") X = "1234567.00"

X = FMT("1234567", "14R2$,")X = X = " $1,234,567.00"
FMT("12345", "14*R2$,")
X = FMT("1234567", "14L2") X = "1234567.00"
X = FMT("0012345", "14R") X = "0012345"
X = FMT("0012345", "14RZ") X = "12345"
X = FMT("00000", "14RZ") X=""
X = FMT("12345", "14'0'R") X = "00000000012345"
X = FMT("ONE TWO THREE", "10T") X = "ONE TWO ":T:"THREE "
X = FMT("ONE TWO THREE", "10R") X = "ONE TWO TH":T:"REE "
X = FMT("AUSTRALIANS", "5T") X = "AUSTR":T:"ALIAN":T:"S "
X = FMT("89", "R#####") X = " 89"
X = FMT("6179328323", "L###-#######") X = "617-9328323"
X = FMT("123456789", "L#3-#3-#3") X = "123-456-789"
X = FMT("123456789", "R#5") X = "56789"
X = FMT("67890", "R#10") X = " 67890"
X = FMT("123456789", "L#5") X = "12345"
X = FMT("12345", "L#10") X = "12345 "
X = FMT("123456", "R##-##-##") X = "12-34-56"
X = FMT("555666898", "20*R2$,") X = "*****$555,666,898.00"
X = FMT("DAVID", "10.L") X = "DAVID....."
X = FMT("24500", "10R2$Z") X = " $24500.00"
X = FMT("0.12345678E1", "9*Q") X = "*1.2346E0"
☻Page 108 of 243☻

X = FMT("233779", "R") X = "233779"
Date Conversions
The following examples show the effect of various D (Date)
conversion codes.
Conversion Expression Internal Value

X = Iconv("31 DEC 1967", "D") X=0
X = Iconv("27 MAY 97", "D2") X = 10740
X = Iconv("05/27/97", "D2/") X = 10740
X = Iconv("27/05/1997", "D/E") X = 10740
X = Iconv("1997 5 27", "D YMD") X = 10740
X = Iconv("27 MAY 97", "D X = 10740
DMY[,A3,2]")
X = Iconv("5/27/97", "D/MDY[Z,Z,2]") X = 10740
X = Iconv("27 MAY 1997", "D X = 10740
DMY[,A,]")
X = Iconv("97 05 27", "DYMD[2,2,2]") X = 10740
Date Conversions
The following examples show the effect of various D (Date)
conversion codes.
Conversion Expression External Value

X = Oconv(0, "D") X = "31 DEC 1967"
X = Oconv(10740, "D2") X = "27 MAY 97"
X = Oconv(10740, "D2/") X = "05/27/97"
X = Oconv(10740, "D/E") X = "27/05/1997"
X = Oconv(10740, "D-YJ") X = "1997-147"
X = Oconv(10740, "D2*JY") X = "147*97"
X = Oconv(10740, "D YMD") X = "1997 5 27"
X = Oconv(10740, "D X = "MAY 97"
MY[A,2]")
X = Oconv(10740, "D X = "27 MAY 97"
DMY[,A3,2]")
X = Oconv(10740, X = "5/27/97"
"D/MDY[Z,Z,2]")
X = Oconv(10740, "D X = "27 MAY 1997"
DMY[,A,]")
X = Oconv(10740, X = "97 05 27"
"DYMD[2,2,2]")
☻Page 109 of 243☻

X = Oconv(10740, "DQ") X = "2"
X = Oconv(10740, "DMA") X = "MAY"
X = Oconv(10740, "DW") X = "2"
X = Oconv(10740, "DWA") X = "TUESDAY"
OpenSeq ".\ControlFiles\File1" To PathFvar Locked

FilePresent = @True
End Then
FilePresent = @True
End Else
FilePresent = @False
End
Example
This example shows how a before/after routine must be

declared as a subroutine at DataStage release 2. The
DataStage Manager will automatically ensure this when
you create a new before/after routine.
Subroutine MyRoutine(InputArg, ErrorCode)

* Users can enter any string value they like when using
* MyRoutine from within the Job Designer. It will appear
* in the variable named InputArg.
* The routine controls the progress of the job by setting
* the value of ErrorCode, which is an Output argument.
* Anything non-zero will stop the stage or job.
ErrorCode = 0 ;* default reply
* Do some processing...
...
Return
MyStr = Trim(" String with whitespace ")

* ...returns "String with whitespace"
MyStr = Trim("..Remove..redundant..dots....", ".")
* ...returns "Remove.redundant.dots"
MyStr = Trim("Remove..all..dots....", ".", "A")
* ...returns "Removealldots"
MyStr = Trim("Remove..trailing..dots....", ".", "T")
* ...returns "Remove..trailing..dots"
This list groups BASIC functionality under tasks to help

you find the right statement or function to use:
• Compiler Directives
• Declaration
• Job Control/Job Status
☻Page 110 of 243☻

• Program Control
• Sequential Files Processing
• String Verification and Formatting
• Substring Extraction and Formatting
• Data Conversion
• Data Formatting
• Locales
Function MyTransform(Arg1)
Begin Case
Case Arg1 = 1
Reply = "A"
Case Arg1 = 2
Reply = "B"
Case Arg1 > 2 And Arg1 < 11
Reply = "C"
Case @True ;* all other values
Call DSTransformError("Bad arg":Arg1, "MyTransform"
Reply = ""
End Case
Return(Reply)
DATASTAGE 7.5x1 GUI FEATURES
New and Expanded Functionality to aid DataStage users

in job design and debugging.
• New Stored Procedure Stage:

A new stored procedure stage allows users to easily use
Oracle stored procedures written in PL/SQL via OCI. The
Stored Procedure Stage supports input and output
parameters making it easier to get information back from
☻Page 111 of 243☻

a stored procedure. It can return a result set via output
parameters and can return more than one row if the
procedure uses cursors. The stage can also execute a
stored function and returns status information from the
procedure.
• HTML Job Reporting from the Designer:

A detailed printable HTML format job report can be generated
for the currently open joy or shared container. The report can
be produced using the new menu option in Designer: File ->
Generate Report. The final HTML report can be customized by
applying different XSL style sheets to the generated XML file.
• Changes to File & Directory Browser Form:

The old style File & Directory browser form has been replaced
with one modeled on the standard Windows 2K browser. The
new browser provides enhanced functionality on directory
navigation (tree-oriented) and file selection, it has filtering as
well as saving and restoring capabilities for the last viewed file
list.
• Ability to globally set Annotation properties:

The Annotation properties dialog is presented in the Tools ->
Options dialog and the settings are saved in the registry per
user. The Annotation stage always defaults to the saved
settings.
• Ability to unset Environment Variables when a Job

Runs ($UNSET):
A special value $UNSET was introduced where there is a need
for a user defined environment variable to explicitly unset the
Unix environment variable to indicate false.
A dialog that provides information about what all the special

environment variable values are and what they are for is
available by double-clicking at:
- Job properties dialog, Parameters tab, when editing the
Default Value cell for a job parameter defined as an
environment variable.
- Admin Client, Environment dialog, when editing a value
cell.
Article-II:
• Transformer “Cancel” operation:
If the Cancel button or <ESC> key are pressed from the main
Transformer dialog and changes have been made, then a
confirmation message box is displayed, to check that the user
wants to quit without saving the changes. If no changes have
been made, no confirmation message is displayed.
☻Page 112 of 243☻

• Multi-Client Manager:
The previously unsupported “Client Switcher” tool has been
enhanced and integrated into the DataStage Client. This tool
allows the users to install and switch between multiple different
versions of the client. Switching between them also changes
the desktop shortcuts and the Start Menu group to point to
another installed DataStage client.
Enterprise Edition:
• Complex Flat File Stage:
A new Parallel Complex Flat File stage has been added to read
or write files that contain complex structures (for example
groups, arrays, redefines, occurs depending on, etc.). Arrays
from complex source can be passed as-is or optionally flattened
or normalized.
• Parallel Job Runtime Message Handling:

When DataStage parallel jobs are run they can generate a large
number of messages that are logged and may be viewed in the
Director Client.
Note: When the Local Run button is disabled, you cannot view
log information from the Director. Someone authorized to do
so can enable the Local Run via the Apiary.
Message Handlers allow the user to customize the severity of

individual messages and can be applied at project of job level.
Messages can be suppressed from the log (Information and
Warning messages only), promoted (from Information to
Warning) or demoted (from warning to Information). A
message handler management tool (available from DS Manager
and Director) provides options to edit, add or delete message
handlers. A new Director option allows message handling to
be enabled/disabled for the current job.
• Visual Cues in Designer – Designer time job validation:

For parallel jobs (including parallel shared containers) and job
sequences, errors that would occur during compilation are
optionally presented on the canvas without requiring the user to
explicitly compile the job. If there are potential problems with
the stage that would cause a compilation error, a warning
triangle icon (Visual Cue) is shown on the top of the stage.
When the user hovers the mouse over a stage with the Visual
Cue, a tool tip is displayed. The Visual Cues can be turned off
via a toolbar button (a ‘tick’ image).
• Additional properties for Parallel Job Stages:
☻Page 113 of 243☻

- File Name Column (optional) – add a column to the stage
output that contains the name of the file that the record is
sourced from. Available on Sequential File and File Set
Stages.
- Source Name Column (optional) – adds a column to the
stage output that contains the name of the source that the
record is sourced from. Available on External Source
Stage.
- Row Number Column (optional) – adds a column to the
stage output that contains the row number of the record.
Available on Sequential File, File Set and External Source
Stages.
- Read First Rows (optional) – constrains the stage to only
read the specified number of rows from each file.
Available on Sequential File Stage.
- First Line is Column Names (mandatory) – on reading this
tells the stage to ignore the first line since it contains
column names. On writing it causes the first line written to
be the column names. Available on Sequential File Stage.
• View Data functionality on the Source & Target custom

stages:
- View data support was added to custom parallel stages for
both source and targets.
- “Show file” had been replaced with “View Data” for
Parallel Job, Sequential File and File Set stages.
• New Parallel Job Advanced Developer’s Guide:

A new Parallel Job Advanced Developer’s Guide gives
DataStage Enterprise Edition users information on efficient job
design, stage usage, performance turning, and more. It also
documents all of the parallel environment variables available
for use.
☻Page 114 of 243☻

DATASTAGE & DWH INTERVIEW
QUESTIONS
COMPANY: TCS (DataStage)
1. Tell about yourself?

2. Types of Stages? Examples
3. What are active stages and passive stages?
4. Can you filter data in hashed file? (No)
5. Difference between sequential and hashed file?
6. How do you populate time dimension?
7. Can we use target hashed file as lookup? (Yes)
8. What is Merge Stage?
9. What is your role?
10. What is Job Sequencer?
11. What are stages in sequences?
12. How do you pass parameters?
13. What parameters you used in your project?
14. What are log tables?
15. What is job controlling?
16. Facts and dimension tables?
17. Confirmed dimensions?
18. Time dimension contains what data? (numeric data)
19. Difference between OLTP and OLAP?
20. Difference between star schema and snow flake schema?
21. What are hierarchies? Examples?
22. What are materialized views?
☻Page 115 of 243☻

23. What is aggregation?
24. What is surrogate key? Is it used for both fact and
dimension tables?
25. Why do you go for oracle sequence generator rather than
datastage routine?
26. Flow of data in datastage?
27. Initial loading and incremental loading?
28. What is SCD? Types?
29. How do you develop SCD type2 in your project?
30. How do you load dimension data and fact data? Which is
first?
31. Any idea about shell scripting and UNIX?
32. Difference between oracle function and procedure?
33. Difference between unique and primary key?
34. Difference between union and union all?
35. What is minus operator?
COMPANY: ACCENTURE (Datastage)
1. What is audit table?

2. If there is a large hash file and a smaller oracle table and if
you are looking up from transformer in different jobs which
will be faster?
3. Tell me about SCD’s?
4. How did you implement SCD in your project?
5. Do a business people need to know the surrogate key?
6. What are derivations in transformer?
7. How do you use surrogate key in reporting?
8. Logs view in datastage, logs in Informatica which is clear?
9. Have you used audit table in your project?
10. What is keen? Have you used it in your project?
11. While developing your project what are the considerations
you take first like performance or space?
12. What is job scheduler? Have you used it? How did you do?
13. Have you used datastage parallel extender?
14. What is the Link Partitioner and link collector stage?
15. How does pivot stage work?
16. What is surrogate key? What is the importance of it? How
did you implement it in your project?
17. Totally how many jobs did you developed and how many
lookups did you use totally?
18. How do constraint in transformer work?
19. How will you declare a constraint in datastage?
☻Page 116 of 243☻

20. How will you handle rejected data?
21. Where the data stored in datastage?
22. Give me some performance tips in datastage?
23. Can we use sequential file as a lookup?
24. How does hash file stage lookup?
25. Why can’t we use sequential file as a lookup?
26. What is data warehouse?

27. What is ‘Star-Schema’?
28. What is ‘Snowflake-Schema’?
29. What is difference between Star-Schema and Snowflake-
Schema?
30. What is mean by surrogate key?
31. What is ‘Conformed Dimension’?
32. What is Factless Fact Table?
33. When will we use connected and unconnected lookup?
34. Which cache supports connected and unconnected lookup?
35. What is the difference between SCD Type2 and SCD
Type3?
36. Draw the ETL Architecture?
37. Draw the DWH Architecture?
38. What is materialized view?
39. What is procedure?
40. What is Function?
41. What is the difference between procedure and function?
42. What is trigger?
43. What are types of triggers?
COMPANY: SATYAM (Datastage)
1. Tell me about yourself?

2. What are the client components?
3. About administrator? With this, what do you do in your
project?
4. What is you project and explain the process?
5. Informational dimensions?
6. Measures?
7. What is data mart size and data warehouse size?
8. Fact table? Dimension table?
9. Data Mart?
10. How do you clear source files?
11. Pivot Stage?
12. How do you find a link, if not found?
13. Difference between transformer and routine?
14. How do you secure your project?
15. How do you handle errors? Exception handlers?
16. How do you know, how many rows rejected?
17. How do you manage surrogate key in datastage?
18. What is lookup?
19. Aggregator Stage?
20. Universe Stage?
☻Page 117 of 243☻

21. How do you merge two tables in datastage?
22. What is export and import?
23. What are Integration testing, unit testing, performance
testing?
24. UAT testing? (User Acceptance Testing)
25. Local, development, preproduction, production server?
COMPANY: SYNTEL, Mumbai (DataStage – Telephonic

Interview).
Basic DWH:
1. Tell me about your current project?

2. What is your role or job profile in the project?
3. What is your Job profile?
4. What is dimesion and fact?
5. What are types of dimensions?
6. What are confirmed dimensions?
7. What are generated dimensions?
8. What are slowly changing dimensions?
9. How many data marts in your project?
10. What is data mart name in your project?
11. What is the size of your data mart?
12. What is factless fact table? Give example.
13. How many fact tables are used in the project?
14. What is your fact table name in your project?
15. How many dimension tables used in the project?
16. What are the names of the dimension tables?
☻Page 118 of 243☻

17. What is Schema? Types? Explain Star-Schema and
Snowflake Schema with difference. Which schema you
used in your project? Why?
18. Why star-schema called as star-schema? Give example.
19. How frequently and from where you get the data as source?
20. What is difference between data mart and data warehouse?
21. What is composite key?
22. What is surrogate key? When you will go for it?
23. What is dimensional modeling?
24. What are SCD and SGT? Difference between them?
Example of SGT from your project.
25. How do you rate yourself in data warehouse?
26. What is the status of your current project?
DataStage:
27. How do you import your source and targets? What are the
types of sources and targets?
28. What is Active Stages and Passive Stages means in
datastage?
29. What is difference between Informatica and DataStage?
Which do you think is best?
30. What are the stages you used in your project?
31. Whom do you report?
32. What is orchestrate? Difference between orchestrate and
datastage?
33. What is parallel extender? Had you work on this?
34. What do you mean by parallel processing?
35. What is difference between Merge Stage and Join Stage?
36. What is difference between Copy Stage and Transformer
Stage?
37. What is difference between ODBC Stage and OCI Stage?
38. What is difference between Lookup Stage and Join Stage?
39. What is difference between Change Capture Stage and
Difference Stage?
40. What is difference between Hashed file and Sequential
File?
41. What are different Joins used in Join Stage?
42. How you decide when to go for join stage and lookup
stage?
43. What is partition key? Which key is used in round robin
partition?
44. How do you handle SCD in datastage?
45. What are Change Capture Stage and Change Apply Stages?
46. How many streams to the transformer you can give?
47. What is primary link and reference link?
48. What is routine? What is before and after subroutines?
These are run after/before job or stage?
49. Had you write any subroutines in your project?
50. What is Config File? Each job having its own config file or
one is needed?
☻Page 119 of 243☻

51. What is Node?
52. What is IPC Stage? What it increase performance?
53. What is Sequential buffer?
54. What are Link Partioner and Link Collector?
55. What are the performance tunning you have done in your
project?
56. Did you done scheduling? How? Can you schedule a job at
the every end date of month? How?
57. What is job sequence? Had you run any jobs?
58. What is status view? Why you clear this? If you clear the
status view what internally done?
59. What is hashed file? What are the types of hashed file?
Which you use? What is default? What is main advantage
of hashed file? Difference between them. (static and
dynamic)
60. What are containers? Give example from your project.
61. Had you done any hardware configuration while running
parallel jobs?
62. What are operators in parallel jobs?
63. What are parameters and parameter file?
64. Can you use variables? In which stages?
65. How do you convert columns to rows and rows to columns
in datastage? (Using Pivot Stage).
66. What is Pivot Stage?
67. What is execution flow of constraints, derivations and
variables in transformer stage? What are these?
68. How do you eliminate duplicates in datastage? Can you use
hash file for it?
69. If 1st and 8th record is duplicate then which will be skipped?
Can you configure it?
70. How do you import and export datastage jobs? What is the
file extension? (See each component while importing and
exporting).
71. How do you rate yourself in DataStage?
72. Explain DataStage Architecture?
73. What is repository? What are the repository items?
74. What is difference between routine and transform?
75. I have 10 tables with four key column values, in this
situation lookup is necessary, but which type of lookup is
used? Either OCBC or Hashed file lookup? Why?
76. When you write the routines?
77. In one project how many shared containers are created?
78. How do you protect your project?
79. What is the complex situation you faced in DataStage?
80. How will you move hashed file from one location to
another location?
81. How will you create static hashed file?
82. How many Jobs you have done in your project? Explain
one of complex Job.
☻Page 120 of 243☻

COMPANY: KANBAY, Pune (DataStage – Personal
Interview)
1. All about company details, project details, and client

details, sample data of your source?
2. DataStage Architecture?
3. System variable, what are system variables used your
project?
4. What are the different datastage functions used in your
project?
5. Difference between star schema and snow flake schema?
6. What is confirmed, degenerated and junk dimension?
7. What are confirmed facts?
8. Different type of facts and their examples?
9. What are approaches in developing data warehouse?
10. Different types of hashed files?
11. What are routines and transforms? How you used in your
project?
12. Difference between Data Mart and Data Warehouse?
13. What is surrogate key? How do you generate it?
14. What are environment variables and global variables?
15. How do you improve the performance of the job?
16. What is SCD? How do you developed SCD type1 and SCD
type2?
17. Why do you go for oracle sequence to generate surrogate
key rather than datastage routines?
18. How do you generate surrogate key in datastage?
19. What is job sequence?
20. What are plug-ins?
21. How much data you can get every day?
22. What is the biggest table and size in your schema or in your
project?
23. What is the size of data warehouse (by loading data)?
24. How do you improve the performance of the hashed file?
25. What is IPC Stage?
26. What are the different types of stages and used in your
project?
27. What are the operations you can do in IPC Stage and
transformer stage?
28. What is merge stage? How do you merge two flat files?
29. I have two table, in one table contains 100 records and
other table contains 1000 records which table is the master
table? Why?
30. I have one job from one flat file. I have to load data to
database, 10 lakhs records are there, after loading 9 lakhs
job is aborted? How do you load remaining records?
31. Which data your project contains?
32. What is the source in your project?
☻Page 121 of 243☻

COMPANY: IBM, Bangalore (DataStage – Telephonic
Interview)
1. Tell me about your educational and professional

background?
2. What is team size? What is your role in that?
3. What is fact less fact table? As it don’t have facts then
what’s the purpose of using it? Had you used in your
project?
4. How many jobs you have done in your project?
5. You handled different complex logic jobs in your project or
not?
6. Out of all jobs you have done, what is most complex job u
feel? Explain it?
7. You do yourself the complex logic or someone give you
the specifications and you convert them to datastage?
8. What are the sources you used in your project?
9. What are the stages you used in your project?
10. What is difference between ODBC and ORACLE OCI
stage?
11. As you told, if your sources are flat files and ORACLE
OCI then why you need ODBC in your project rather than
ORACLE OCI stage?
12. What difference between sequential file and hashed file?
13. Can you use sequential file as source to hashed file? Have
you done it? What error it will give?
14. Why hashed file improve the performance?
15. How do you sort your data in jobs?
16. Had you use sort stage in your job? (sort stage is parallel
stage, be sure that you are using server jobs only, then he
will ask Q.12)
17. Can aggregator and transformer stage used for sorting data?
How
18. If I have two sources to aggregator stage and oracle as
target, I can sort data in aggregator but if I don’t want to
use aggregator to sort data then how you will do it?
19. Why we use surrogate key in data warehouse? How it will
improve the performance? Where it will store? How do you
handle your surrogate key in your project? Where we use
mostly surrogate key?
☻Page 122 of 243☻

20. How many input links you can give to transformer?
21. Can you give more than one source to transformer? (If you
say “No” he will ask what error it will give when you try to
do this?)
22. Definition of Slowly Changing Dimensions? Types?
23. If a company maintaining type1 SCD, now the company
decided to change there plan to maintain type2 SCD, e.g.
customer table, so what are the changes to do in customer
table? (Whether you have to change the structure of the
table, if it is under type3 right? Or no changes? How do
you implement this?)
24. How many dimensions in your project? What are they?
25. What are the facts in your fact table?
26. Are all these facts are specific (related) to all dimensions?
27. How do you get system date in oracle?
28. What is a dual table in oracle?
29. What is the use of UNION in oracle? If I write query select
* from EMP UNION select * from dept, is it executed
well?
30. I have a query select * from EMP table group by dept; is
this query executed? If no what is the error?
MORE QUESTIONS ON DATASTAGE:

1. What are the difficulties faced in using DataStage?
2. What are the constraints in using DataStage?
3. How do you eliminate duplicate rows?
4. How do we do the automation of dsjobs?
5. What are XML files? How do you read data from XML
files and which stage to be used?
6. How do you catch bad rows from OCI stage?
7. Why do you use SQL LOADER or OCI STAGE?
8. How do you populate source files?
9. How do you pass filename as the parameter for a job?
10. How do you pass the parameter to the job sequence if the
job is running at night?
11. What happens if the job fails at night?
12. What is SQL tuning? How do you do it?
13. What is project life cycle and how do you implement it?
14. How will you call external function or subroutine from
datastage?
15. How do you track performance statistics and enhance it?
16. How do you do oracle 4 way inner join if there are 4 oracle
input files?
17. Explain your last project and your role in it?
18. What are the often used Stages or stages you worked with
in your last project?
19. How many jobs have you created in your last project?
☻Page 123 of 243☻

20. How do you merge two files in DS?
21. What is DS Manager used for - did u use it?
22. What is DS Director used for - did u use it?
23. What is DS Administrator used for - did u use it?
24. What is DS Designer used for - did u use it?
25. Explain the differences between Oracle8i/9i?
26. Do you know about INTEGRITY/QUALITY stage?
27. Do you know about METASTAGE?
28. Difference between Hashfile and Sequential File?
29. What is iconv and oconv functions?
30. How can we join one Oracle source and Sequential file?
31. How can we implement Slowly Changing Dimensions in
DataStage?
32. How can we implement Lookup in DataStage Server jobs?
33. What are all the third party tools used in DataStage?
34. What is the difference between routine and transform and
function?
35. What are the Job parameters?
36. How can we improve the performance of DataStage jobs?
37. How can we create Containers?
38. What about System variables?
39. What is difference between operational data stage (ODS) &
data warehouse?
40. How do you fix the error "OCI has fetched truncated data"
in DataStage?
41. How to create batches in Datastage from command prompt
42. How do you eliminate duplicate rows?
43. Suppose if there are million records, did you use OCI? If
not then what stage do you prefer?
44. What is the order of execution done internally in the
transformer with the stage editor having input links on the lft
hand side and output links?
45. I want to process 3 files in sequentially one by one, how I
can do that. While processing the files it should fetch files
automatically.
Datastage:
1. How to create a flat file job… (steps)

2. Is there any tool by ascential to pull the metadata
from various sources
3. What if definition of a table changes...what impact
will it have on ur job...
4. how to use debugger
5. how u schedule a DS job via unix script
6. Any third party tools for scheduling the jobs
7. how to use hash file...how to create Hash file...
8. Aggregator Transformations..
9. pre sql post sql..How to use these...truncate table
10. what is the use of administrator is used
11. what was the most complex mapping u hv
developed using datastage
☻Page 124 of 243☻

12. how much exp u hv on DS
13. if table definition has been changed in manager
will it automatically propogate into JOb
14. Can out link from one active stage can become
inout link in another active stage
15. Can u use a sequential file as a reference
file...difference between a sequential file and a hash file A)NO
16. What diffrent options are there to see a table
definition..
17. What all products of ascential u r aware of
18. What is the advantage of using OCI stage as
compared to ODBC stage
19. Normalizer Transformation..
20. what steps will you take to increase performance in
Datastage for large volumes of data
21. what are bridge tables
22. Types of Indexes
23. Table Partitioning
24. Types of schemas,explain
25. How do you do requirements gathering in case of
non-availability of the personnel and thereafter the project plan
26. How do you take care of unknown values for the
primary key for dimension?
27. Factless fact tables
28. Overview of Datastage projects
29. Link Partitioner/ Collector
Data stage:
1. What is ETL Architecture?

2. Explain your project Architecture?
3. How many Data marts and how many facts and
dimensions Available in your project?
4. What is the size of your Data mart?
5. How many types of loading-techniques are
Available?
6. Before going to design jobs in Data stage what are
preceding-steps in Data stage?
7. What is the Architecture of Data stage?
8. What is the Main difference between different
client components in Data stage?
9. What are the different stage you have worked on
it?
10. Can I call procedures in to datastage.if so How to
call store- procedures in Data stage?
11. What is the difference between sequential file and
hashfile? Can we use sequential file as a lookup? Can we put
filter conditions on sequential file?
12. Differences between DRS Stage and ODBC?
Which one is the best for performance?
☻Page 125 of 243☻

13. What are the different performance tunning aspects
are there in Data stage?
14. How do you remove the duplicates in flat-file?
15. What is the difference between Interprocess and
inprocess? Which one is the best?
16. What is CRC32? On which situation go for
CRC32?
17. What is a pivotstage? Can u explain on scenario
which situation used in your project?
18. What is row-spliter and row-merger can I use
separately is it possible to do it?
19. If one user locked the resource? How to release the
particular Job?
20. What is a version-controll in data stage?
21. What is the difference between clearlog-file?
Clearstage-file?
22. How to scehudle jobs with out using Data stage?
23. What is the difference between Static-hash and
dynamichashfile?
24. How to do error handling in data stage?
25. What is the difference between Active stage and
passive stage? What are the Active and passive stages?
26. How to set Environment variables in datastge?
27. What is job controlled routinue? How set job
parameter in Data stage?
28. How to release a job?
29. How to do Auto-purge in Data stage?
30. What is the difference between Datastge7.1 and
7.5?
DATA WAREHOUSING QUESTIONS:
1. What are the different Dimensional

modeling Techniques are Available?
2. What is the Difference between
Star-schema and snow-flake-schema? When we go for
star and snow-flake?
3. What are the types of dimension
and facts are in DW?
4. What is the life cycle of Data
warehousing project?
5. What is a Data-model?
6. What is the Difference between
Top-down Approach and Bottom-up Approach?
7. What is a factless-fact Table?
8. What is a confirmed-dimension?
9. What is a junk-dimension?
10. What is a cleansing?
11. Tell me about your current project?
☻Page 126 of 243☻

12. What is your role or job profile in the
project?
13. What is your Job profile?
14. What is dimesion and fact?
15. What are types of dimensions?
16. What are confirmed dimensions?
17. What are generated dimensions?
18. What are slowly changing dimensions?
19. How many data marts in your project?
20. What is data mart name in your
project?
21. What is the size of your data mart?
22. What is factless fact table? Give
example.
23. How many fact tables are used in the
project?
24. What is your fact table name in your
project?
25. How many dimension tables used in
the project?
26. What are the names of the dimension
tables?
27. What is Schema? Types? Explain Star-
Schema and Snowflake Schema with difference. Which schema
you used in your project? Why?
28. Why star-schema called as star-
schema? Give example.
29. How frequently and from where you
get the data as source?
30. What is difference between data mart
and data warehouse?
31. What is composite key?
32. What is surrogate key? When you will
go for it?
33. What is dimensional modeling?
34. What are SCD and SGT? Difference
between them? Example of SGT from your project.
35. How do you rate yourself in data
warehouse?
36. What is the status of your current
project?
37. What is data warehouse?
38. What is ‘Star-Schema’?
39. What is ‘Snowflake-Schema’?
40. What is difference between Star-
Schema and Snowflake-Schema?
41. What is mean by surrogate key?
42. What is ‘Conformed Dimension’?
43. What is Factless Fact Table?
44. When will we use connected and
unconnected lookup?
☻Page 127 of 243☻

45. Which cache supports connected and
unconnected lookup?
46. What is the difference between SCD
Type2 and SCD Type3?
47. Draw the ETL Architecture?
48. Draw the DWH Architecture?
DWH FAQ:
Conformed dimension:
• A dimension table connects to more than one fact
table. We present this same dimension table in both
schemes and we refer to dimension table as conformed
dimension.
Conformed fact:
• Definitions of measurements (facts) are highly
consistent we call them as conformed fact.
Junk dimension:
• It is convenient grouping of random flags and
aggregates to get them out of a fact table and into a useful
dimensional framework.
Degenerated dimension:
☻Page 128 of 243☻

• Usually occur in line item oriented fact table designs.
Degenerate dimensions are normal, expected and useful.
• The degenerated dimension key should be the actual
production order of number and should set in the fact
table without a join to anything.
Time dimension:
• It contains a number of useful attributes for describing
calendars and navigating.
• An exclusive time dimension is required because the
SQL date semantics and functions cannot generate several
important features, attributes required for analytical
purposes.
• Attributes like week days, week ends, holidays,
physical periods cannot be generated by SQL statements.
Fact less fact table:
• Fact table which do not have any facts are called fact
less fact table.
• They may consist of keys; these two kinds of fact
tables do not have any facts at all.
• The first type of fact less fact table records an ‘event’.
• Many event tracking tables in dimensional data
warehouses turn out to be factless.
Ex: A student tracking system that details each ‘student
attendance’ event each day.
• The second type of fact less fact table is coverage. The
coverage tables are frequently needed when a primary fact
table in dimensional DWH is sparse.
Ex: The sales fact table that records the sales of products
in stores on particular days under each promotion
condition
Types of facts:
• Additive: facts involved in the calculations for
deriving summarized data.
• Semi additive: facts that involved in the calculations
at a particular context of time.
• Non additive: facts that cannot involved in the
calculations at every point of time.
DATASTAGE ROUTINES
BL:
☻Page 129 of 243☻

BOT v2.3.0 Returns BLANK if passed value is NOT
NULL or BLANK, after trimming spaces
DataIn = "":Trim(Arg1)
If IsNull(DataIn) or DataIn = "" Then

Ans = ""
End Else
Ans = DataIn
End
CheckFileRecords:
Function CheckFileRecords(Arg1,Arg2)
vParamFile = Arg1 : "/" : Arg2

vCountVal = 0
OpenSeq vParamFile To FileVar Else

Call DSLogWarn("Cannot open ":vParamFile ,
"Cannot Open ParamFile")
End
Loop
ReadSeq Dummy From FileVar Else Exit ;* at

end-of-file
vCountVal = vCountVal + 1
Repeat
CloseSeq FileVar
Ans=vCountVal
Return (vCountVal)
CheckFileSizes:
DIR =
"/interface/dashboard/dashbd_dev_dk_int/Source/"
FNAME = "GLEISND_OC_02_20040607_12455700.csv"
*CMD = "ll -tr ":DIR:"|grep ":FNAME

CMD = "cmp -s ":DIR:"|grep ":FNAME
Call DSExecute("UNIX", CMD, Output,

SystemReturnCode)
Ans = Output
CheckIdocsSent:
☻Page 130 of 243☻

Checks If Idoc delivery job actually sent any Idocs to SAP.
This routine will atempt to read the DataStage Director

log for the job name specified as an argument.
If the job has a fatal error with "No link file", the routine
will copy the IDOC link file(s) into the interface error
folder.
In case the fatal error above is not found the routine
aborts the job.
A simple log of which runs produce error link file is

maintained in the module's log directory.
$INCLUDE DSINCLUDE JOBCONTROL.H

vRoutineName = "CheckIdocsSent"
Ans = "Ok"
If System(91) Then
OsType = 'NT'
OsDelim = '\'
NonOsDelim = '/'
Move = 'move '
End Else
OsType = 'UNIX'
OsDelim = '/'
NonOsDelim = '\'
Move = 'mv -f '
End
vJobHandle = DSAttachJob(JobName,
DSJ.ERRFATAL)
vLastRunStart = DSGetJobInfo(vJobHandle,
DSJ.JOBSTARTTIMESTAMP)
vLastRunEnd = DSGetJobInfo(vJobHandle,
DSJ.JOBLASTTIMESTAMP)
* Get the delivery log for the last run

vLogSummary = DSGetLogSummary ( vJobHandle,
DSJ.LOGANY, vLastRunStart, vLastRunEnd, 500)
vLogSummary = Change(vLogSummary,@FM,'')
* Manipulate vLogSummary within routine to return

status
PosOfStr =
Index(Downcase(vLogSummary),"sent",1)
vLogMsg = vLogSummary[PosOfStr,20]
* Now work out Status

If PosOfStr = 0 then
Status = 'NOT SENT'
vLogMsg = ''
☻Page 131 of 243☻

end else
Status = 'SENT'
vLogMsg = vLogSummary[PosOfStr,20]
end
Ans = Status
vErr = DSDetachJob(vJobHandle)
Call DSLogInfo("Job " : JobName : "
Detached" , vRoutineName)
***** Make a log entry to keep track of how often

the pack doesn't work ********
vMessageToWrite = Fmt(Module_Run_Parm, "12'

'L") : Fmt(Status, "10' 'L") : " - " : vLogMsg
vIdocLogFilePath =
Interface_Root_Path_Parm: OsDelim : "logs" :
OsDelim : "IdocSentLog.log"
******** Open the log file

OPENSEQ vIdocLogFilePath TO vIdocLogFile
Then
Call DSLogInfo("IdocSentLog Open" ,
vRoutineName)
** Label to return to if file is created

FileCreated:
*** Write the log entry

vIsLastRecord = @Null
Loop Until vIsLastRecord Do

READSEQ vRecord From vIdocLogFile
Then
*Call DSLogInfo("Record Read - " :
vRecord , vRoutineName)
End Else
*Call DSLogInfo("End of file
reached " , vRoutineName)
vIsLastRecord = @True
End
Repeat
WRITESEQ vMessageToWrite To vIdocLogFile

Then
Call DSLogInfo("Log entry created : "
: vMessageToWrite, vRoutineName)
End Else
Call DSLogFatal("Cannot write to " :
vIdocLogFilePath, vRoutineName)
End
End Else
Call DSLogInfo("Could not open file -
" : vIdocLogFilePath , vRoutineName)
Call DSLogInfo("Creating new file - " :
vIdocLogFilePath , vRoutineName)
☻Page 132 of 243☻

CREATE vIdocLogFile ELSE Call
DSLogFatal("Could not create file - " :
WEOFSEQ vIdocLogFile
WRITESEQ Fmt("Module Run", "12' 'L") :
Fmt("Status", "10' 'L") : " " : "Message" To
vIdocLogFile Else ABORT
Call DSLogInfo("Log file created : " :
GOTO FileCreated
End
**** Abort the delivery sequence and write error

message to the log. ************
If Status = 'NOT SENT' Then
Call DSLogInfo("No Idocs were actually
sent to SAP - Trying to clean up IDOC Link Files:
", vRoutineName)
vIdocSrcLinkPath =
Field(Interface_Root_Path_Parm, OsDelim, 1, 4) :
OsDelim : "dsproject" : OsDelim :
Field(Interface_Root_Path_Parm, OsDelim, 4, 1)
vIdocTgtLinkPath =
Interface_Root_Path_Parm: OsDelim : "error"
OsCmd = Move : " " : vIdocSrcLinkPath :
OsDelim : JobName : ".*.lnk " :
vIdocTgtLinkPath : OsDelim
Call DSExecute(OsType, OsCmd, OsOutput,
OsStatus)
If OsStatus <> 0 Then
Call DSLogWarn("Error when trying to
move link file(s)", vRoutineName)
LogMessMoveFail = 'The move command
(':OsCmd:') returned status
':OsStatus:':':@FM:OsOutput
Call DSLogWarn(LogMessMoveFail,
vRoutineName)
Call DSLogFatal("Cleaning up of IDOC
Link Files failed", vRoutineName)
End
Else
LogMessMoveOK = "Link files were
moved to " : vIdocTgtLinkPath
Call DSLogInfo(LogMessMoveOK,
vRoutineName)
LogMessRetry = "Job " : JobName : "
is ready to be relaunched."
Call DSLogInfo(LogMessRetry,
vRoutineName)
End
End Else
Call DSLogInfo("Delivery job log
indicates run OK ", vRoutineName)
End
ClearMappingTable:
☻Page 133 of 243☻

SUBROUTINE ClearMappingTable
(Clear_Mapping_Table, Errorcode)
Error Code = 0 ;* set this to non-zero to

stop the stage/job
**If Clear_Mapping_Table_Parm = 'Y' Then

EXECUTE "CLEARFILE Vendor_Map_HF.GEN"
**End Else
**End
ComaDotRmv:
DataIn = "":(Arg1)

Ans = ""
End Else
DataIn = Ereplace(DataIn, ".", "")
DataIn = Ereplace(DataIn, ",", "")
Ans = DataIn
End
CopyFiles:
Move files from one directory to another
Function
CopyofFiles(sourceDir,SourceFileMask,TargetDir,Ta
rgetFileMask,Flags)
RoutineName = "CopyFiles"
If SourceDir = '' Then SourceDir = '.'

If TargetDir = '' Then TargetDir = '.'
If SourceFileMask = '' Or SourceDir =

TargetDir Then Return(0)
! If SourceDir # '.' Then

! OpenPath SourceDir To Fv Else
! Call
DSU.DSMkDir(MkStatus,SourceDir,'','777')
! End
! End
! If TargetDir # '.' Then
! OpenPath TargetDir To Fv Else
! Call
DSU.DSMkDir(MkStatus,TargetDir,'','777')
! End
! End
If System(91) Then
OsType = 'NT'
☻Page 134 of 243☻

OsDelim = '\'
NonOsDelim = '/'
Copy = 'copy '
Flag = Flags
End Else
OsType = 'UNIX'
OsDelim = '/'
NonOsDelim = '\'
Copy = 'cp -f '
End
If Flags <> "" then Flag = NonOsDelim:Flags

Else Flag = ""
SourceWorkFiles =
Trims(Convert(',',@FM,SourceFileMask))
SourceFileList =
Splice(Reuse(SourceDir),OsDelim,SourceWorkFiles)
Convert NonOsDelim To OsDelim In

SourceFileList
TargetWorkFiles =
Trims(Convert(',',@FM,TargetFileMask))
TargetFileList =
Splice(Reuse(TargetDir),OsDelim,TargetWorkFiles)
Convert NonOsDelim To OsDelim In

TargetFileList
OsCmd = Copy:' ' : Flag : " "

:SourceFileList:' ':TargetFileList
Call DSLogInfo('Copying ': SourceFileList: '

to ':TargetFileList,RoutineName)
Call DSExecute(OsType,OsCmd,OsOutput,OsStatus)
If OsStatus Then
Call DSLogWarn('The Copy command
':OsStatus:':':@FM:OsOutput, RoutineName)
End Else
Call DSLogInfo('Files
moved...','DSMoveFiles')
End
Ans = OsStatus
CopyofComareROWS:
Function
copyofcompareRows(Column_Name,Column_Value)
vJobName=DSGetJobInfo(DSJ.ME, DSJ.JOBNAME)
vStageName=DSGetStageInfo(DSJ.ME, DSJ.ME,
DSJ.STAGENAME)
☻Page 135 of 243☻

vCommonName=CheckSum(vJobName) :
CheckSum(vStageName) : CheckSum(Column_Name)
Common /vCommonName/ LastValue
vLastValue=LastValue
vNewValue=Column_Value
If vNewValue<>vLastValue Then Ans=1 Else

Ans=0
LastValue=vNewValue
CopyOfZSTPKeyLookup
Check if key passed exists in file passed
Arg1: Hash file to look in
Arg2: Key to look for
Arg3: Number of file to use "1" or "2"
* Routine to look to see if the key passed exists

in the file passed
* If so, then the non-key field from the file is
returned
* If not found, "***Not Found***" is returned
*
* The routine requires the UniVerse file named to
have been created previously
*
EQUATE RoutineName TO 'ZSTPKeyLookup'
* Call DSLogInfo("Routine started",RoutineName)
Common /ZSTPkeylookup/ Init1, SeqFile1,

Init2, SeqFile2, RetVal, msgtext
Ans = 0
If NOT(Init1) And Arg3 = "1" Then

* Not initialised. Therefore open file
Init1 = 1
Open Arg1 TO SeqFile1 Then Clearfile
SeqFile1
Else
Call DSLogInfo("Open failed
1",RoutineName)
msgtext = "Cannot open ZSTP creation
control file ":Arg1
Call DSLogFatal(msgtext,RoutineName)
Ans = -1
End
End
If NOT(Init2) And Arg3 = "2" Then
☻Page 136 of 243☻

* Not initialised. Therefore open file
Init2 = 1
Open Arg1 TO SeqFile2 Then Clearfile
SeqFile2
Else
Call DSLogInfo("Open failed
2",RoutineName)
msgtext = "Cannot open ZSTP creation
control file ":Arg1
Call DSLogFatal(msgtext,RoutineName)
Ans = -1
End
End
* Read the file to get the data for the key

passed, if not found, return "***Not Found***"
If Arg3 = "1"
Then
Read RetVal From SeqFile1, Arg2 Else
RetVal = "***Not Found***"
End
Else
Read RetVal From SeqFile2, Arg2 Else
RetVal = "***Not Found***"
End
Ans = RetVal
Create12CharTS:
Function Create12CharTS(JobName)
vJobHandle = DSAttachJob(JobName, DSJ.ERRFATAL)
vJobStartTime = DSGetJobInfo(vJobHandle,
vDate = Trim(vJobStartTime, "-","A")

vDate = Trim(vDate, ":","A")
vDate = Trim(vDate, " ", "A")
vDate = vDate[1,12]
Ans=vDate
CreateEmptyFile:
Function CreateEmptyFile(Arg1,Arg2)
*Create Empty File
☻Page 137 of 243☻

End
WeofSeq FileVar
CloseSeq FileVar
Ans="1"
Datetrans:
DateVal
Function Datetrans(DateVal)
Function DeleteFiles(SourceDir,FileMask,Flags)
* Function ReverseDate(DateVal)
* Date mat be in the form of DD.MM.YY i.e.
01.10.03
* convert to YYYYMMDD SAP format
Ans = "20" : DateVal[7,2] : DateVal[4,2] :

DateVal[1,2]
DeleteFiles:
RoutineName = "DeleteFiles"
If SourceDir = '' Then SourceDir = '.'
If FileMask = '' SourceDir = '' Then Return(0)
If System(91) Then
OsType = 'NT'
OsDelim = '\'
NonOsDelim = '/'
Delete = 'del '
End Else
OsType = 'UNIX'
OsDelim = '/'
NonOsDelim = '\'
Delete = 'rm ' : Flags : ' '
End
WorkFiles = Trims(Convert(',',@FM,FileMask))
FileList =
Splice(Reuse(SourceDir),OsDelim,WorkFiles)
Convert NonOsDelim To OsDelim In FileList
OsCmd = Delete :' ' : FileList
☻Page 138 of 243☻

Call DSLogInfo('Deleting
':FileList,RoutineName)
If OsStatus Then
Residx= Index(OsOutput,"non-existent",1)
if Index(OsOutput,"non-existent",1) = 0
then
Call DSLogInfo('The Delete command
(':Residx:OsCmd:') returned status
':OsStatus:':':@FM:OsOutput,RoutineName)
End
Else
Call DSLogInfo('No Files matched Wild
Card - Delete was not required...',RoutineName)
OsStatus = 0
End
End Else
deleted...',RoutineName)
End
Ans = OsStatus
DisconnectNetworkDrive:
Map a Network Drive on a Windows Server:
Function Disconnectnetworkdrive(Drive_Letter)
RoutineName = "MapNetworkDrive"
If Drive_Letter = '' Then Return(0)
OsType = 'NT'
OsDelim = '\'
NonOsDelim = '/'
Copy = 'copy '
OsCmd = 'net use ' : Drive_Letter : ":

/delete"
Call DSLogInfo('Disconnecting Network Drive: '

: OsCmd,RoutineName)
If OsStatus Then
Call DSLogWarn('The Copy command
':OsStatus:':':@FM:OsOutput, RoutineName)
End Else
Call DSLogInfo('Drive: ' : Drive_Letter :
'Disconnected ',RoutineName)
End
☻Page 139 of 243☻

Ans = OsStatus
DosCmd:
Move files from one directory to another:
Function DosCmd(Cmd)
RoutineName = "DosCmd"
If System(91) Then
OsType = 'NT'
OsDelim = '\'
NonOsDelim = '/'
End Else
OsType = 'UNIX'
OsDelim = '/'
NonOsDelim = '\'
End
OsCmd = Cmd
Call DSLogInfo("CMD = " : Cmd,RoutineName)

If OsStatus Then
Call DSLogWarn('The command (':OsCmd:')
returned status ':OsStatus:':':@FM:OsOutput,
RoutineName)
End Else
Call DSLogInfo('The command (':OsCmd:') was
successfull
':OsStatus:':':@FM:OsOutput,RoutineName)
End
Ans = OsStatus : " - " : OsOutput
DSMoveFiles:
Move files from one directory to another:
f SourceDir = '' Then SourceDir = '.'

If TargetDir = '' Then TargetDir = '.'
If FileMask = '' Or SourceDir = TargetDir Then

Return(0)
If System(91) Then
OsType = 'NT'
OsDelim = '\'
NonOsDelim = '/'
Move = 'move '
End Else
OsType = 'UNIX'
OsDelim = '/'
NonOsDelim = '\'
Move = 'mv -f '
End
☻Page 140 of 243☻

WorkFiles = Trims(Convert(',',@FM,FileMask))
FileList =
Splice(Reuse(SourceDir),OsDelim,WorkFiles)
Convert NonOsDelim To OsDelim In FileList
OsCmd = Move:' ' : FileList: ' ':TargetDir
Call DSLogInfo('Moving ':FileList: ' to

':TargetDir,'DSMoveFiles')
If OsStatus Then
Call DSLogInfo('The move command
':OsStatus:':':@FM:OsOutput,'DSMoveFiles')
End Else
moved...','DSMoveFiles')
End
Ans = OsStatus
Routine Name:ErrorMgmtDummy:
Value: The Value to Be Mapped

FieldName: The Name of the source field that the Value is
contained in
Format: The name of the Hash file containing the mapping
data
Default: The Default value to return if value is not found
Msg: ny text you want to store against an error
SeverityInd: The Error Severity Indicator: I-Information,
W-Warning, E-Error, F-Fatal
ErrorLogInd: An Indicator to indicate of errors should be
logged (Note this is not yet implemented)
HashFileLocation: A Hashfile could be either local to the
Module or Generic. Enter "G" for Generic "L" for Local
* FUNCTION
Map(Value,FieldName,Format,Default,Msg,ErrorLogIn
d)
*
* Executes a lookup against a hashed file using a
key
*
* Input Parameters : Arg1: Value =
The Value to be Mapped or checked
* Arg2: FieldName =
The Name of the field that is either the Target
of the Derivation or the sourceField that value
is contained in
* Arg3: Format =
The name of the Hash file containing the mapping
data
☻Page 141 of 243☻

* Arg4: Default =
The Default value to return if value is not found
* Arg5: Msg =
Any text you want stored against an error
* Arg6: SeverityInd =
An Indicator to the servity Level
* Arg7: ErrorLogInd =
An Indicator to indicate if errors should be
logged
* Arg8: HashfileLocation =
An Indicator to indicate of errors should be
*
* Return Values: If the Value is not found,
return value is: -1. or the Default value if that
is supplied
* If Format Table not found,
return value is: -2
*
*
*
RoutineName = 'Map'
Common /HashLookup/ FileHandles(100),

FilesOpened
Common /TicketCommon/ Ticket_Group,
Ticket_Sequence, Set_Key, Mod_Root_Path,
Generic_Root_Path, Chk_Hash_File_Name,
Mod_Run_Num
DEFFUN
LogToHashFile(ModRunNum,Ticket_Group,Ticket_Seque
nce,Set_Key,Table,FieldName,Key,Error,Text,Severi
tyInd) Calling 'DSU.LogToHashFile'
If (Ans = "-1" or Ans = "-2" or UpCase(Ans)=

"BLOCKED") and ErrorLogInd = "Y" Then
Ret_Code=LogToHashFile(Mod_Run_Num,Ticket
_Group,Ticket_Sequence,Set_Key,Table,FieldName,Ch
k_Value,Ans,Msg,SeverityInd)
End
RETURN(Ans)
FileExists:
Move files from one directory to another
Function File Exits(Filename)
Routine Name = "File Exists"
File Found = @TRUE
OPENSEQ FileName TO aFile ON ERROR STOP

"Cannot open file (":FileName:")" THEN
☻Page 142 of 243☻

CLOSESEQ aFile
END ELSE
FileFound = @FALSE ;* file not found
END
Ans = FileFound
FileSize:
Returns the size of a file
Function FileSize(FileName)
RoutineName = "FileSize"
FileSize = -99
OPENSEQ FileName TO aFile ON ERROR STOP

"Cannot open file (":FileName:")" THEN
status FileInfo from aFile else stop
FileSize=Field(FileInfo,@FM,4)
* FileSize=FileInfo
CLOSESEQ aFile
END ELSE
FileSize = -999
END
Ans = FileSize
FindExtension:
FunctionFindExtesion(Arg1)
File_Name=Arg1
* Gets rid of the extension part of the filename

LengthofFileName = Len(File_Name)
Extension = Index(File_Name, ".", 1)
If Extension <> 0 Then
LengthofExtension = LengthofFileName - Extension
+ 1
File_Extension=File_Name[Extension,LengthofExtens
ion]
End Else
End
Ans = File_Extension
FindFileSuffix:
Function FindFileSuffix(Arg1)
File_Name=Arg1

☻Page 143 of 243☻

MyLenRead=Index(File_Name, ".", 1) - 1
File_Name=File_Name[0,MyLenRead]
End Else
End
* Gets the timestamp. Doesn't handle the case

where there are suffix types and timestamp only
contains 5 digits without "_" inbetween
If Index(File_Name, "_", 6) = 0 Then
MyLenRead=Index(File_Name, "_", 4) + 1
MyTimestamp =
File_Name[MyLenRead,Len(File_Name)-1]
End Else
MyTimestamp =
Field(File_Name,"_",5):"_":Field(File_Name,"_",6)
End
TimestampEndPos = Index(File_Name,MyTimestamp,1)
+ Len(MyTimestamp)
MySuffix = File_Name[TimestampEndPos + 1,
Len(File_Name)]
Ans = MySuffix
FindTimeStamp:
Function FindTimeStamp(Arg1)
File_Name=Arg1

MyLenRead=Index(File_Name, ".", 1) - 1
File_Name=File_Name[0,MyLenRead]
End Else
End
* Gets the timestamp. Doesn't handle the case

where there are suffix types and timestamp only
contains 5 digits without "_" inbetween
If Index(File_Name, "_", 6) = 0 Then
MyLenRead=Index(File_Name, "_", 4) + 1
Timestamp =
File_Name[MyLenRead,Len(File_Name)-1]
End Else
Timestamp =
Field(File_Name,"_",5):"_":Field(File_Name,"_",6)
End
Ans = Timestamp
formatCharge:
Function FormatCharge(Arg1)
☻Page 144 of 243☻

vCharge=Trim(Arg1, 0, "L")
vCharge=vCharge/100
vCharge=FMT(vCharge,"R2")
Ans=vCharge
formatGCharge:
Ans=1
vLength=Len(Arg1)
vMinus=If Arg1[1,1]='-' Then 1 Else 0
If Arg1='0.00' Then
Ans=Arg1
End
Else
If vMinus=1 Then
vString=Arg1[2,vLength-1]
vString='-':Trim(vString, '0','L')
End
else
vString=Trim(Arg1, '0','L')
end
Ans=vString
End
FTPFile:
Script_Path: he path to where the Unix Script file

lives
File_Path: The Value to Be Mapped
File_Name: The Name of the source field that the
Value is contained in
IP_Address: The name of the Hash file containing the
mapping data
User_ID: The Default value to return if value is not found

Password: Any text you want to store against an error
Target_Path: The target path where the ifle is to saved on the
target server
* FUNCTION
FTPFile(Script_Path,File_Path,File_Name,IP_Addres
s, User_ID,Password,Target_Path)
*
*
☻Page 145 of 243☻

RoutineName = 'FTPFile'
OsCmd = Script_Path : "/ftp_put.sh":"

":File_Path:" ":File_Name:" ":IP_Address:"
":User_ID:" ":Password:" ":Target_Path :"
":Script_Path
Call DSLogInfo('Ftp ':File_Name: ' to ' :

IP_Address : ' ' :Target_Path,'FTPFile')
Call DSLogInfo('Ftp Script =
':Script_Path,'FTPFile')
Call DSExecute("UNIX",OsCmd,OsOutput,OsStatus)
If OsStatus Then
Call DSLogInfo('The FTP command (':OsCmd:')
returned status
End Else
Call DSLogInfo('Files FTPd...':
'(':OsCmd:')','FTPFile')
End
Ans = OsStatus
RETURN(Ans)
FTPmget:
* FUNCTION
FTPFile(Script_Path,Source_Path,File_Wild_Card,IP
_Address, User_ID,Password,Target_Path)
*
*
RoutineName = 'FTPmget'
OsCmd = Script_Path:"/ftp_Mget.sh":"
":Source_Path:" ":File_Wild_Card:" ":IP_Address:"
":User_ID:" ":Password:" ":Target_Path:"
":Script_Path
*OsCmd = Script_Path : "/test.sh"
Call DSLogInfo('Ftp ':File_Wild_Card: ' From '

: IP_Address : ' ' :Source_Path : ' to '
:Target_Path,RoutineName)
Call DSExecute("UNIX",OsCmd,OsOutput,OsStatus)
If OsStatus Then
Call DSLogInfo('The FTP command (':OsCmd:')
returned status
End Else
☻Page 146 of 243☻

Call DSLogInfo('Files FTPd...':
'(':OsCmd:')',RoutineName)
End
Ans = OsStatus
RETURN(Ans)
Concatenate All Input Arguments to Output using TAB

character Concatenate All
Routine="GBIConcatItem"
t = Char(009)
If ISNULL(IND) THEN Pattern = ""

ELSE Pattern = IND [1,1]
If ISNULL(VKORG) THEN Pattern = Pattern :
t ELSE Pattern = Pattern : t : VKORG [1,4]
If ISNULL(VTWEG) THEN Pattern = Pattern :
t ELSE Pattern = Pattern : t : VTWEG [1,2]
If ISNULL(SPART) THEN Pattern = Pattern :
t ELSE Pattern = Pattern : t : SPART [1,2]
If ISNULL(WERKS) THEN Pattern = Pattern :
t ELSE Pattern = Pattern : t : WERKS [1,4]
If ISNULL(AUART) THEN Pattern = Pattern :
t ELSE Pattern = Pattern : t : AUART [1,4]
If ISNULL(FKDAT) THEN Pattern = Pattern :
t ELSE Pattern = Pattern : t : FKDAT [1,8]
If ISNULL(KUNAG) THEN Pattern = Pattern :
t ELSE Pattern = Pattern : t : KUNAG [1,10]
If ISNULL(KUNRE) THEN Pattern = Pattern :
t ELSE Pattern = Pattern : t : KUNRE [1,10]
If ISNULL(MATNR) THEN Pattern = Pattern :
t ELSE Pattern = Pattern : t : MATNR [1,18]
If ISNULL(PSTYV) THEN Pattern = Pattern :
t ELSE Pattern = Pattern : t : PSTYV [1,4]
If ISNULL(KWMENG) THEN Pattern = Pattern :
t ELSE Pattern = Pattern : t : KWMENG [1,15]
If ISNULL(XBLNR) THEN Pattern = Pattern :
t ELSE Pattern = Pattern : t : XBLNR [1,16]
If ISNULL(VGPOS) THEN Pattern = Pattern :
t ELSE Pattern = Pattern : t : VGPOS [1,6]
If ISNULL(FKARA) THEN Pattern = Pattern :
t ELSE Pattern = Pattern : t : FKARA [1,4]
If ISNULL(ZOR_DT_PCODE) THEN Pattern = Pattern :
t ELSE Pattern = Pattern : t : ZOR_DT_PCODE
[1,8]
If ISNULL(ZAWB) THEN Pattern = Pattern :
t ELSE Pattern = Pattern : t : ZAWB [1,16]
If ISNULL(LGORT) THEN Pattern = Pattern :
t ELSE Pattern = Pattern : t : LGORT [1,4]
If ISNULL(VKAUS) THEN Pattern = Pattern :
t ELSE Pattern = Pattern : t : VKAUS [1,3]
If ISNULL(VKBUR) THEN Pattern = Pattern :
t ELSE Pattern = Pattern : t : VKBUR [1,4]
If ISNULL(VKGRP) THEN Pattern = Pattern :
t ELSE Pattern = Pattern : t : VKGRP [1,3]
☻Page 147 of 243☻

If ISNULL(ZLSCH) THEN Pattern = Pattern :
t ELSE Pattern = Pattern : t : ZLSCH [1,1]
If ISNULL(ZTERM) THEN Pattern = Pattern :
t ELSE Pattern = Pattern : t : ZTERM [1,4]
If ISNULL(KURSK) THEN Pattern = Pattern :
t ELSE Pattern = Pattern : t : KURSK [1,9]
If ISNULL(TAXM1) THEN Pattern = Pattern :
t ELSE Pattern = Pattern : t : TAXM1 [1,1]
If ISNULL(VRKME) THEN Pattern = Pattern :
t ELSE Pattern = Pattern : t : VRKME [1,3]
If ISNULL(ARKTX) THEN Pattern = Pattern :
t ELSE Pattern = Pattern : t : ARKTX [1,40]
If ISNULL(KTGRM) THEN Pattern = Pattern :
t ELSE Pattern = Pattern : t : KTGRM [1,2]
If ISNULL(ZZTAXCD) THEN Pattern = Pattern :
t ELSE Pattern = Pattern : t : ZZTAXCD [1,2]
If ISNULL(LAND2) THEN Pattern = Pattern :
t ELSE Pattern = Pattern : t : LAND2 [1,3]
If ISNULL(NAME1) THEN Pattern = Pattern :
t ELSE Pattern = Pattern : t : NAME1 [1,35]
If ISNULL(PSTLZ) THEN Pattern = Pattern :
t ELSE Pattern = Pattern : t : PSTLZ[1,10]
If ISNULL(ORT01) THEN Pattern = Pattern :
t ELSE Pattern = Pattern : t : ORT01 [1,35]
If ISNULL(KOSTL) THEN Pattern = Pattern :
t ELSE Pattern = Pattern : t : KOSTL[1,10]
If ISNULL(WAERS) THEN Pattern = Pattern :
t ELSE Pattern = Pattern : t : WAERS [1,5]
If ISNULL(KUNRG) THEN Pattern = Pattern :
t ELSE Pattern = Pattern : t : KUNRG [1,10]
If ISNULL(KUNWE) THEN Pattern = Pattern :
t ELSE Pattern = Pattern : t : KUNWE [1,10]
Ans = Pattern
GBIConcatItem:
Concatenate All Input Arguments to Output using TAB
character:
Routine="GBIConcatItem"
t = Char(009)
If ISNULL(IND) THEN Pattern = ""

ELSE Pattern = IND [1,1]
If ISNULL(KNUMV) THEN Pattern = Pattern : t
ELSE Pattern = Pattern : t : KNUMV [1,16]
If ISNULL(KPOSN) THEN Pattern = Pattern : t
ELSE Pattern = Pattern : t : KPOSN [1,6]
If ISNULL(KSCHL) THEN Pattern = Pattern : t
ELSE Pattern = Pattern : t : KSCHL [1,4]
If ISNULL(KBETR) THEN Pattern = Pattern : t
ELSE Pattern = Pattern : t : KBETR [1,11]
If ISNULL(KWERT) THEN Pattern = Pattern : t
ELSE Pattern = Pattern : t : KWERT [1,13]
If ISNULL(WAERS) THEN Pattern = Pattern : t
ELSE Pattern = Pattern : t : WAERS [1,5]
☻Page 148 of 243☻

If ISNULL(KAWRT) THEN Pattern = Pattern : t
ELSE Pattern = Pattern : t : KAWRT [1,15]
If ISNULL(KHERK) THEN Pattern = Pattern : t
ELSE Pattern = Pattern : t : KHERK [1,1]
Ans = Pattern
GCMFConvert:
Receive GCMF string and change known strings to
required values:

Ans = ""
End Else
DataIn = Ereplace (DataIn,"$B$","")
DataIn = Ereplace (DataIn,"NULL","")
DataIn = Ereplace (DataIn,"<","<")
DataIn = Ereplace (DataIn,">",">")
DataIn = Ereplace
(DataIn,""",'"')
DataIn = Ereplace
(DataIn,"'","'")
DataIn = Ereplace (DataIn,"&amp","&")
DataIn = Ereplace (DataIn,"&#124","|")
Ans = DataIn
End
GCMFFormating:
*
* FUNCTION GCMFFormating(Switch, All_Row)
*
* Replaces some special characters when creating
the GCMF file
*
* Input Parameters : Arg1: Switch = Step to
change.
* Arg2: All_Row = Row
containing the GCMF Record.
*
DataIn=Trim(All_Row)
If Switch=1 Then
If IsNull(DataIn) or DataIn= "" Then
Ans = "$B$"
End
Else
DataInFmt = Ereplace (DataIn ,"&",
"&")
DataInFmt = Ereplace (DataInFmt ,"'",
"'")
DataInFmt = Ereplace (DataInFmt ,'"',
""")
Ans = DataInFmt
End
☻Page 149 of 243☻

End
Else
If Switch=2 Then
DataInFmt = Ereplace (DataIn ,">",
">")
DataInFmt = Ereplace (DataInFmt ,"<",
"<")
Ans = DataInFmt
End
Else
* Final Replace, After the Merge of all
GCMF segments
DataInFmt = Ereplace (DataIn ,"|",
"&#124")
Ans = DataInFmt
End
End
GeneralCounter:
COMMON /Counter/ OldParam, TotCount
NextId = Identifier
IF UNASSIGNED(OldParam) Then
OldParam = NextId
TotCount = 0
END
IF NextId = OldParam THEN

TotCount += 1
END ELSE
OldParam = NextId
TotCount = 1
END
Ans = TotCount
GetNextCustomerNumber:
Sequence number generator.

Routine to get the next sequence number to use for a
customer from a file, and save the usde value in the file.
The routine argument is the name associated with the

super group that the customer is being created in.
The routine uses a file to store the next available number.

It reads the number, then increments and stores the value
in common, writing the next value back to file each time.
☻Page 150 of 243☻

* Routine to generate the next customer number.
The argument is a string used to
* identify the super group for the customer.
*
* The routine uses a UniVerse file to store the
next number to use. This
* value is stored in a record named after the
supplied argument. The
* routine reads the number, then increments and
stores the value
* in common storage, writing the next value back
to file each time.
*
* Declare shared memory storage.

Common /CustSequences/ Initialized, NextVal,
SeqFile
EQUATE RoutineName TO 'GetNextCustomerNumber'
If NOT(Initialized) Then
* Not initialised. Attempt to open the
file.
Initialized = 1
Open "IOC01_SUPER_GRP_CTL_HF" TO SeqFile
Else
Call DSLogFatal("Cannot open customer
number allocation control file",RoutineName)
Ans = -1
End
End
* Read the named record from the file.
Readu NextVal From SeqFile, Arg1 Else
Call DSLogFatal("Cannot find super group
in customer number allocation control
file",RoutineName)
Ans = -1
End
Ans = NextVal
* Increment the sequence value, and write

back to file.
NextVal = NextVal + 1
If Len(NextVal) < 10 then NextVal =
Substrings("0000000000",1,10-
Len(NextVal)):NextVal
Writeu NextVal On SeqFile, Arg1 Else
Call DSLogFatal("Update to customer number
allocation control file failed",RoutineName)
Ans = -1
End
GetNextErrorTableID:
Sequence number generator in a concurrent environment.
Routine to generate a sequential number.
☻Page 151 of 243☻

The routine argument is the name associated with the
sequence.
The routine uses a file to store the next available number.

It reads the number from the file on each invocation; a
lock on the file prevents concurrent access.
* Routine to generate a sequential number. The

argument is a string used to
* identify the sequence.
*
* NOTE: This routine uses locking to allow
multiple processes to access the
* same sequence.
*
* The routine uses a UniVerse file to store the
next number to use. This
* value is stored in a record named after the
supplied argument. The
* routine always attempts to read the number from
the file, so that the
* record for the sequence becomes locked. It
increments and stores the
* value in common storage, writing the next value
back to file each
* time. Writing back this value frees the lock.
*

Common /ErrorTableSequences/ Initialized,
NextVal, SeqFile
EQUATE RoutineName TO 'GetNextErrorTableID'
* Not initialised. Attempt to open the
file.
Initialized = 1
Open "ErrorTableSequences" TO SeqFile Else
* Open failed. Create the sequence
file.
EXECUTE "CREATE.FILE
ErrorTableSequences 2 1 1"
Open "ErrorTableSequences" TO SeqFile
Else Ans = -1
End
End
* Read the named record from the file.

* This obtains the lock (waiting if
necessary).
Readu NextVal From SeqFile, Table_Name Else
NextVal = 1
End
Ans = NextVal
NextVal = NextVal + 1
☻Page 152 of 243☻

* Increment the sequence value, and write
back to file.
* This releases the lock.
Write NextVal On SeqFile, Table_Name Else Ans
= -1
GetNextModSeqNo:
Gets the Next Mod Run Code from an Initialised
Sequence
This routine gets the next Mod Run Number in a
squenced that was initialised,.
The arguments are Mod_Code_Parm and

Supplier_ID_Parm which combined form the key for this
instance of a sequence
GetParameterArray:
* GetParameterArray(Arg1)
* Decription: Get parameters
* Written by:
* Notes:
* Bag of Tricks Version 2.3.0 Release Date 2001-
10-01
* Arg1 = Path and Name of Parameter File
*
* Result = ( <1> = Parameter names, <2> =
Parameter values)
*
-------------------------------------------------
-----------
DEFFUN FileFound(A) Calling 'DSU.FileFound'
cBlank = ''
cName = 1
cValue = 2
vParamFile = Arg1
aParam = cBlank
vParamCnt = 0
vCurRoutineName = 'Routine:
GetParameterArray'
vFailed = @FALSE
Done = @FALSE
IF vParamFile AND FileFound(vParamFile)

Then
OPENSEQ vParamFile TO hParamFile Then
Loop
READSEQ vLineRaw FROM hParamFile
ON ERROR
☻Page 153 of 243☻

Call DSLogWarn('Error from
':vParamFile:'; Status =
':STATUS(),vCurRoutineName)
CLOSE hParamFile
vFailed = @TRUE
Done = @TRUE
End Then
vLine = TRIM(vLineRaw)
vFirstChar = LEFT(vLine,1)
vRemark = LEFT(vLine,4)
IF NOT(vFirstChar = cBlank OR
vFirstChar = '#' OR vFirstChar = '*' OR
vFirstChar = '"' OR vFirstChar = "'" OR
vFirstChar = ';' OR vFirstChar = ':' OR
vFirstChar = '[' OR vRemark = 'REM ') THEN
vParamCnt += 1 ; * Add
to any parameter array passed as an argument
aParam<1,vParamCnt> =
TRIM(FIELD(vLine,'=',cName))
aParam<2,vParamCnt> =
FIELD(vLine,'=',cValue)
END
END ELSE
Done = @TRUE
END
Until Done Do Repeat
CLOSE hParamFile
End Else
Call DSLogWarn('Error from
':vParamFile:'; Status =
':STATUS(),vCurRoutineName)
vFailed = @TRUE
End
End Else
vFailed = @TRUE
End
Call DSLogInfo("Values loaded from file:

":vParamFile:@AM:aParam, vCurRoutineName)
If vFailed Then
Ans = "ERROR"
End Else
Ans = aParam
End
LastDayofMonth:
Returns the Last Day of the Month
Deffun DSRMessage(A1,A2) Calling

"*DataStage*DSR_MESSAGE"
Equate TransformName To "ConvertMonth"
* Check the format of the input value.

If IsNull(Arg1) or (Len(Arg1) < 6) Then
Ans = ""
GoTo ExitLastDayMonth
☻Page 154 of 243☻

End
InYear = Substrings(Arg1,1,4)
InMonth = Substrings(Arg1,5,2)
If InMonth < 1 Or InMonth > 12 Then

Ans = ""
GoTo ExitLastDayMonth
End
* Generate the required output, depending on the

Action argument.
Begin Case
Case InMonth = "1"
* Internal date for first day of month.
OutDt ="31"
Case InMonth = "2"

if mod(Num(InYear),4)<>0 then
OutDt = "28"
end
if mod(Num(InYear),4)=0 then
OutDt = "29"
end
Case InMonth = "3"

OutDt = "31"
Case InMonth = "4"

OutDt = "30"
Case InMonth = "5"

OutDt ="31"
Case InMonth = "6"

OutDt ="30"
Case InMonth = "7"

OutDt ="31"
Case InMonth = "8"

OutDt ="31"
Case InMonth = "9"

OutDt ="30"
Case InMonth = "10"

OutDt ="30"
Case InMonth = "11"
☻Page 155 of 243☻

OutDt ="31"
Case InMonth = "12"

OutDt ="31"
End Case
Ans=OutDt:"-":InMonth:"-":InYear
ExitLastDayMonth:
LogToErrorFile:
Logs errors to an error hashed file
* FUNCTION
LogToErrorFile(Table,Field_Name,Check_Value,Error
_Number,Error_Text_1, Error_Text_2,
Error_Text_3,Additional_Message)
*
*
* Places the current Writes Error Messages to a
Hash File
*
* Input Parameters : Arg1: Table
= The name of Control table being checked
* Arg2: Field_Name
= The name of the Field that is in error
* Arg3: Check_Value
= The value used to look up in the Hash file to
get try and get a look up match
* Arg4: Error_Number
= The error number returned
* Arg5: Error_Text_1
= First error message argument. Used to build the
default error message
= Second error message argument. Used to build
the default error message
= Thrid error message argument. Used to build the
default error message
* Arg8: Additional_Message
= Any text that could be stored against an error
*
RoutineName = "LogToErrorFile"

FilesOpened
Common /TicketErrorCommon/ ModRunID,
TicketFileID, TicketSequence, TicketSetKey,
JobStageName, ModRootPath
☻Page 156 of 243☻

Ans = "ERROR"
If System(91) Then
OsType = 'NT'
OsDelim = '\'
NonOsDelim = '/'
Move = 'move '
End Else
OsType = 'UNIX'
OsDelim = '/'
NonOsDelim = '\'
Move = 'mv -f '
End
JobName = DSGetJobInfo (DSJ.ME , DSJ.JOBNAME)
Path = ModRootPath : OsDelim :"error" :

OsDelim
FileName = "ErrorLog_HF." : ModRunID
PathFile = Path : FileName
Call DSLogInfo(Path:"-- checking --" :
PathFile,RoutineName)
vMessage = "INLOG Error Log = " : PathFile

*Call DSLogInfo(vMessage, RoutineName )
vMessage = "INLOG Error Log Data = " :

ModRunID : "|" : TicketFileID : "|" :
TicketSequence : "|" : TicketSetKey : "|" : Table
: "|" : Field_Name : "|" : Check_Value : "|" :
Error_Number : "|" : Additional_Message
*Call DSLogInfo(vMessage, RoutineName )
Key = JobName : JobStageName : ModRunID:

TicketFileID : TicketSequence : TicketSetKey :
Table : Field_Name
Err_Rec = ""
Err_Rec <1> = JobName
Err_Rec <2> = JobStageName
Err_Rec <3> = ModRunID
Err_Rec <4> = TicketFileID
Err_Rec <5> = TicketSequence
Err_Rec <6> = TicketSetKey
Err_Rec <7> = Table
Err_Rec <8> = Field_Name
Err_Rec <9> = Check_Value
Err_Rec <10> = Error_Number
Err_Rec <11> = Error_Text_1
Err_Rec <14> = Additional_Message
* Attempt to find the table name in our

cache.
Locate FileName in FilesOpened Setting POS
Then
☻Page 157 of 243☻

Write Err_Rec To FileHandles(POS), Key Then
TAns = 0
End Else
TAns = -1
End
End Else
* Table is not in cache of opened tables,
so open it.
Openpath PathFile To FileHandles(POS)
Then
FilesOpened<-1> = FileName
Write Err_Rec To FileHandles(POS),
Key Then TAns = 0
Else
TAns = -1
End
End Else
TAns = -2
End
End
Ans = "ERROR"
Return(Ans)
LogToHashFile:
* FUNCTION
LogToHashFile(ModRunNum,TGrp,TSeg,SetKey,Table,Fi
eldNa,KeyValue,Error,Msg,SeverityInd)
*
*
* Places the current Writes Error Messages to a
Hah File
*
* Input Parameters : Arg1: ModRunNum =
The unique number allocated to a run of an Module
* Arg2: Ticket_Group =
The Ticket Group Number of the Current Row
* Arg3: Ticket_Sequence =
The Ticket Sequence Number of the Current Row
* Arg4: Set_Key = A
Key to identify a set of rows e.g. an Invoice
Number to a set of invoice lines
* Arg5: Table =
The name of Control table being checked
* Arg6: FieldNa =
The name of the Field that is in error
* Arg7: KeyValue =
The value used to look up in the Hash file to get
try and get a look up match
* Arg8: Error =
The error number returned
* Arg9: Msg =
Any text that could be stored against an error
An Indicator to state the error severity level
☻Page 158 of 243☻

RoutineName = "LogToHashFile"

FilesOpened
Ticket_Sequence, Set_Key, Job_Stage_Name,
Mod_Root_Path, Generic_Root_Path,
Chk_Hash_File_Name, Mod_Run_Num
TAns = 0
If System(91) Then
OsType = 'NT'
OsDelim = '\'
NonOsDelim = '/'
Move = 'move '
End Else
OsType = 'UNIX'
OsDelim = '/'
NonOsDelim = '\'
Move = 'mv -f '
End
JobName = DSGetJobInfo (DSJ.ME , DSJ.JOBNAME)

* StageName = DSGetStageInfo (DSJ.ME,DSJ.ME,
DSJ.STAGENAME)
Path = Mod_Root_Path : OsDelim :"error" :

OsDelim
FileName = "ErrorLog_HF." : Mod_Run_Num
PathFile = Path : FileName
*Message = "INLOG Error Log = " : PathFile

*Call DSLogInfo(Message, RoutineName )
*Message = "INLOG Error Log Data = " :

ModRunNum : "|" : TGrp : "|" : TSeq : "|" :
Set_Key : "|" : Table : "|" : FieldNa : "|" :
KeyValue : "|" : Error : "|" : Msg
Key = JobName : Job_Stage_Name : ModRunNum:

TGrp : TSeq : SetKey : Table : FieldNa
Err_Rec = ""
Err_Rec <1> = JobName
Err_Rec <2> = Job_Stage_Name
Err_Rec <3> = ModRunNum
Err_Rec <4> = TGrp
Err_Rec <5> = TSeq
Err_Rec <6> = SetKey
Err_Rec <7> = Table
Err_Rec <8> = FieldNa
Err_Rec <9> = KeyValue
☻Page 159 of 243☻

Err_Rec <10> = Error
Err_Rec <11> = Msg
Err_Rec <12> = SeverityInd

cache.
Locate FileName in FilesOpened Setting POS
Then
Write Err_Rec To FileHandles(POS), Key Then
TAns = 0
End Else
TAns = -1
End
End Else
so open it.
Openpath PathFile To FileHandles(POS)
Then
FilesOpened<-1> = FileName
Write Err_Rec To FileHandles(POS),
Key Then TAns = 0
Else
TAns = -1
End
End Else
TAns = -2
End
End
Ans = TAns
RETURN(Ans)
MandatoryFieldCheck: Check whether the field name

passed is mandatory
Routine to check to see if the passed field is populated,

and if not, to check to see if it is mandatory. If the field
contains "?", then it is handled as if it is blank.
The routine uses a control table containing process name,

field name, group name and exclusion flag to control
mandatory or not.
The routine arguments are the field name, the field, the
group key, whether this is the first mandatory check for
the record, and the process name when the first check flag
is "Y".
A variable kept in memory (Mandlist) is used to record

the mandatory check failures.
☻Page 160 of 243☻

When the passed field name is "Getmand", no processing
is performed except to return the Mandlist field.
* Routine to check whether the passed field is

filled, and if not, whether it is mandatory.
*
* The routine uses a UniVerse file
"MANDATORY_FIELD_HF" which contains the mandatory
field controls
*
* Arg1 Field name to be checked (literal)
* Arg2 Field value
* Arg3 Group name
* Arg4 1st call for record
* Arg5 The process name on the first call
(this is saved in storage for subsequent calls)
*

Common /Mandatory/ Initialized, SeqFile,
DataIn, GroupIn, GroupV, Mandlist, ProcessIn,
ProcessV
EQUATE RoutineName TO 'MandatoryFieldCheck'
* Call DSLogInfo("Routine
started":Arg1,RoutineName)
Initialized = 1
* Call DSLogInfo("Initialisation
Started",RoutineName)
Open "MANDATORY_FIELD_HF" TO SeqFile Else
Call DSLogFatal("Cannot open Mandatory
field control file",RoutineName)
Ans = -1
End
* Call DSLogInfo("Initialisation
Complete",RoutineName)
End
If Arg4 = "Y"
Then
Mandlist = ""
ProcessIn = "":Trim(Arg5)
If IsNull(ProcessIn) or ProcessIn = ""
Then ProcessV = " "
Els
e ProcessV = ProcessIn
End
If Arg1 = "Getmand" Then Ans = Mandlist

Else
GroupIn = "":Trim(Arg3)
☻Page 161 of 243☻

If IsNull(GroupIn) or GroupIn = "" Then
GroupV = " "
Else
GroupV = GroupIn
If IsNull(DataIn) or DataIn = "" or DataIn

= "?"
Then
*
* Field is blank - check for mandatory
*
* Call DSLogInfo(Arg1:" blank - checking
whether mandatory",RoutineName)
*
mystring = ProcessV:Arg1:GroupV:"X"
Read Retval From SeqFile, mystring
then
* Call DSLogInfo(Arg1:" Group
specifically excluded",RoutineName)
Ans = 0
end
else
mystring = ProcessV:Arg1:GroupV
Read Retval From SeqFile, mystring
then
* Call DSLogInfo(Arg1:" Group
specifically included",RoutineName)
Ans = 1
end
else
mystring = ProcessV:Arg1:"ALL"
Read Retval From SeqFile,
mystring
then
* Call DSLogInfo(Arg1:" Global
mandatory",RoutineName)
Ans = 1
end
else
* Call DSLogInfo(Arg1:" blank -
not mandatory",RoutineName)
Ans = 0
end
end
end
End
Else
Ans = 0
* Call DSLogInfo(Arg1:" Not
blank",RoutineName)
End
If Ans = 1
Then
If Mandlist = ""
Then Mandlist = Arg1
Else Mandlist = Mandlist:",":Arg1
end
End
☻Page 162 of 243☻

Map:(Routinue Name)
* FUNCTION
d)
*
key
*
The Value to Be Mapped
* Arg2: FieldName =
The Name of the field that is either the Target
of the Derivation or the sourceField that value
is contained in
* Arg3: Format =
The name of the Hash file containing the mapping
data
* Arg4: Default =
The Default value to return if value is not found
* Arg5: Msg =
logged
* Arg8: HashfileLocation =
An Indicator to indicate of errors should be
*
is supplied
return value is: -2
*
*
*
RoutineName = 'Map'

FilesOpened
Ticket_Sequence, Set_Key, Job_Stage_Name,
Mod_Root_Path, Generic_Root_Path,
Chk_Hash_File_Name, Mod_Run_Num
*Message = "Map Job Stage Name ==>" :

Job_Stage_Name
* Call DSLogInfo(Message, RoutineName )
*Message = "Map Mod Root Path ==>" :

Mod_Root_Path
☻Page 163 of 243☻

*Message = "Generic Root Path ==>" :

Generic_Root_Path
*Message = "Map Chk_Hash_File_Name ==>" :

Chk_Hash_File_Name
*Message = "Map Mod_Run_Num ==>" : Mod_Run_Num

DEFFUN
*
If Len(Chk_Hash_File_Name) = 3 And
HashFileLocation = "G" Then Format_Extn =
Chk_Hash_File_Name Else Format_Extn = Mod_Run_Num
[1,5]
If System(91) Then
OsType = 'NT'
OsDelim = '\'
NonOsDelim = '/'
Move = 'move '
End Else
OsType = 'UNIX'
OsDelim = '/'
NonOsDelim = '\'
Move = 'mv -f '
End
ColumnPosition = 0
PositionReturn = 0
Table = Format
If HashFileLocation = "G" then

PathFormat = Generic_Root_Path :
OsDelim :"format" : OsDelim: Format : "_HF." :
Format_Extn
End Else
PathFormat = Mod_Root_Path :
OsDelim :"format" : OsDelim : Format : "_HF." :
Format_Extn
End
If IsNull(Value) then Chk_Value = "" Else

Chk_Value = Value
*Message = "Map PathFormat ==>" :

PathFormat
☻Page 164 of 243☻

*Message = "Value ==>" : Value
*Message = "Format ==>" : Format

*Message = "Default ==>" : Default

*Message = "ErrorLogInd ==>" : ErrorLogInd

* Set the Default Answer for if a value is

not found
Begin Case
Case UpCase(Default) = "NODEF"
Default_Ans = "-1"
Case Default = "PASS"

NumFields = Dcount(Chk_Value, "|")
If NumFields > 1 Then
Default_Ans =
Field(Chk_Value,"|",2)
*Message = "Num Fields > 1 Default_Ans ==>"

: Default_Ans : "#" : Chk_Value
End Else
Default_Ans = Chk_Value
*Message = "Num Fields NG 0 Default_Ans

==>" : Default_Ans : "#" : Chk_Value
End
Case @TRUE
If UpCase(Field(Default,"|",1)) <> "BL"
Then Default_Ans = Default Else Default_Ans = -1
End Case
* Determine if we are returning one column or

entire row.
If Num(ColumnPosition) then
ColumnPosition = Int(ColumnPosition)
If ColumnPosition > 0 and ColumnPosition
< 99999 Then
PositionReturn = 1
End
End

cache.
Locate Format in FilesOpened Setting POS Then
☻Page 165 of 243☻

Read Rec From FileHandles(POS), Chk_Value
Then
If PositionReturn Then Ans =
Rec<ColumnPosition> Else Ans = Rec
End Else
Ans = Default_Ans
End
End Else
so open it.
Openpath PathFormat To FileHandles(POS)
Then
FilesOpened<-1> = Format
Read Rec From FileHandles(POS),
Chk_Value Else
Rec = Default_Ans
End
If PositionReturn And Rec <>
Default_Ans Then
Ans = Rec<ColumnPosition>
End Else
Ans = Rec
End
End Else
Ans = "-2"
End
End
If UpCase(Field(Default,"|",1)) = "BL" and

Ans <> -2 Then
If Chk_Value = "" then
Ans = Field(Default,"|",2)
End
End
*Message = "Outside LOGGING" :

Mod_Run_Num: "|" : Ticket_Group : "|" :
Ticket_Sequence "|" : Set_Key : "|" : Table : "|"
: FieldName : "|" : Chk_Value : "|" : Msg
*Message = "OUTSIDE PASS Trans

Default_Ans ==>" : Default_Ans : " Ans ==> " :
Ans
LogPass = "N"
If (Default = "PASS" and Default_Ans <> Ans)
then LogPass = "Y"
If LogPass = "Y"
Then
*Message = "PASS Trans Default_Ans
==>" : Default_Ans : " Ans ==> " : Ans
End
☻Page 166 of 243☻

If (Ans = "-1" or Ans = "-2" or UpCase(Ans)=
"BLOCKED" or LogPass = "Y" or SeverityInd = "I")
and ErrorLogInd = "Y" Then
*Message = "Write to Log Ans==> " :

Ans : " ErrorInd==> " : ErrorLogInd
*Message = "LOGGING" : Mod_Run_Num: "|" :

Ticket_Group : "|" : Ticket_Sequence : "|" :
Table : "|" : FieldName : "|" : Chk_Value : "|" :
Ans
_Group,Ticket_Sequence,Set_Key,Table,FieldName,Ch
k_Value,Ans,Msg,SeverityInd)
End
RETURN(Ans)
OutputJobStats: Outputs the job link statistics
hJob = DSAttachJob(JobName, DSJ.ERRFATAL)
Start_TS = DSGetJobInfo (hJob,

End_TS = DSGetJobInfo
(hJob,DSJ.JOBLASTTIMESTAMP)
Elapsed_Secs_Cnt = DSGetJobInfo
(hJob,DSJ.JOBELAPSED)
Job_Term_Status = DSGetJobInfo
(hJob,DSJ.JOBINTERIMSTATUS)
User_Status = DSGetJobInfo
(hJob,DSJ.USERSTATUS)
ErrCode = DSDetachJob(hJob)
Ans = Start_TS : "|" : End_TS : "|" :

Elapsed_Secs_Cnt : "|" : Job_Term_Status : "|" :
User_Status
Pattern:
Routine="Pattern"
Var_Len = len(Value)
Pattern = Value
For i = 1 To Var_Len
If Num(Value [i,1]) Then
Pattern [i,1] = "n"
☻Page 167 of 243☻

end
Else
If Alpha(Value [i,1]) Then
Pattern[i,1] = "a"
end
Else
Pattern[i,1] = Value [i,1]
end
end
Next i
Ans = Pattern
Checks a passed field to see if it matches the

pattern which is also passed.:
The input field is checked to see if it conforms

to the format that is also passed as a second
parameter.
The result of the routine is True is the pattern

matches the required format, and false if it does
not.
If the second parameter is empty, then true is

returned.
Equate TransformName To "PatternMatchCheck"
Begin Case
Case Arg2 = "" ;* No pattern - so return

true
Ans = 1
Case Arg3 = "" ;* Only 1 pattern passed

Ans = Arg1 Matches Arg2
Case 1 ;* All other cases

Ans = Arg1 Matches Arg2 : CHAR(253) :
Arg3
End Case
PrepareJob:

Job_Handle = DSAttachJob (Job_Name, DSJ.ERRWARN)
ErrCode1=DSPrepareJob(Job_Handle)
ErrCode2 = DSDetachJob(Job_Handle)
Ans= ErrCode2
RangeCheck:
* FUNCTION
d)
☻Page 168 of 243☻

*
key
*
The Value to be checked
* Arg2: MinValue =
The Min Value allowed
* Arg3: MaxValue =
The Max Value allowed
* Arg4: FieldName =
The Name of the Source field being checked
* Arg5: Msg =
logged
*
return value is -1. else the value supplied is
returned
*
*
*
RoutineName = 'RangeChk'

Ticket_Sequence,Set_Key, Mod_Root_Path,
Generic_Root_Path, Chk_Hash_File_Name,
Mod_Run_Num
DEFFUN
Table = "Min: " : MinValue "to Max: " :

MaxValue
Msg1 = ""
Msg2 = ""
Msg3 = ""
Msg4 = ""
Ans = ""
If Num (Value) = 0 then
Msg1 = "-Value is not a number"
Ans = -2
End

Msg2 = "-MinValue is not a number"
Ans = -2
End

Msg3 = "-MaxValue is not a number"
Ans = -2
☻Page 169 of 243☻

End
If Ans <> -2 Then

If Value < MinValue Or Value > MaxValue Then
Msg4 = "-Value is outside the Range"
Ans = -1
End
End
OutputMsg = Msg : Msg1 : Msg2 : Msg3: Msg4
*Call DSLogInfo(OutputMsg, RoutineName )
If Ans <> -1 and Ans <> -2 then Ans = Value
If (Ans = "-1" or Ans = "-2") and ErrorLogInd

= "Y" Then
_Group,Ticket_Sequence,Set_Key,Table,FieldName,Va
lue,Ans,OutputMsg,SeverityInd)
End
RETURN(Ans)
ReadParameter:
Read parameter value from configuration file
*
* Function : ReadParameter - Read parameter value from
configuration file
* Arg : ParameterName
(default=JOB_PARAMETER)
* DefaultValue (default='')
* Config file (default=@PATH/config.ini)
* Return : Parameter value from config file
Function
Readparameters(parametersname,Defaultvalue,Config
File)
* Function : ReadParameter - Read parameter value

from configuration file
* Arg : ParameterName
(default=JOB_PARAMETER)
* DefaultValue (default='')
* Config file
(default=@PATH/config.ini)
* Return : Parameter value from config file
*
If ParameterName = "" Then ParameterName =

"JOB_PARAMETER"
☻Page 170 of 243☻

If ConfigFile = "" Then ConfigFile =
@PATH:"/config.ini"
ParameterValue = DefaultValue
OpenSeq ConfigFile To fCfg

Else Call DSLogFatal("Error opening file
":ConfigFile, "ReadParameter")
Loop
While ReadSeq Line From fCfg
If Trim(Field(Line,'=',1)) = ParameterName
Then
ParameterValue = Trim(Field(Line,'=',2))
Exit
End
Repeat
CloseSeq fCfg
Ans = ParameterValue
RETURN(Ans)
ReturnNumber:
String=Arg1
Slen=Len(String)
Scheck=0
Rnum=""
For Scheck = 1 to Slen
Schar=Substrings(String,Scheck,1)
If NUM(Schar) then
Rnum=Rnum:Schar
End
Next Outer
Ans=Rnum
ReturnNumbers:
length=0
length=LEN(Arg1);
length1=1;
Outer=length;
postNum=''
counter=1;
For Outer = length to 1 Step -1
Arg2=Arg1[Outer,1]
If NUM(Arg2)
then
☻Page 171 of 243☻

length2=counter-1
if length2 = 0
then
length2=counter
postNum=RIGHT(Ar
g1,length2)
END
else
postNum=RIGHT(Ar
g1,counter)
END
END
counter=counter+1
Next Outer
Ans=postNum
ReverseDate:
Function ReverseDate(DateVal)
* Function ReverseDate(DatelVal)
* Date mat be in the form of DDMMYYYY i.e.
01102003 or DMMYYYY 1102003
If Len(DateVal) = 7 then
NDateVal = "0" : DateVal
End Else
NDateVal = DateVal
End
Ans = NDateVal[5,4] : NDateVal[3,2] :

NDateVal[1,2]
RunJob:
The routine runs a job. Job parameters may be supplied.

The result is a dynamic array containing the job status,
and row count information for each link. The routine
UtilityGetRunJobInfo can be used to interpret this result.
As well as the job name and job parameters, the routine

parameters allow the job warning limit and row count
limit to be set.
Format of returned dynamic array:
Status<1>=Jobname=FinishStatus
Status<2>=Jobname
Status<3>=JobStartTimeStamp
Status<4>=JobStopTimeStamp
☻Page 172 of 243☻

Status<5>=LinkNames (value mark @VM delimited)
Status<6>=RowCount (value mark @VM delimited)
FunctionRunJob(Arg1,Arg2,Arg3,Arg4)
* Demonstrate how to run a job within the GUI

development enviroment. Arguments may
* be passed in. The result is a dynamic array
with the resulting status and run
* statistics (row counts for every link on every
stage in the job)
*
Equate RoutineName To 'RunJob'

Equate RunJobName to Arg1
Equate Params To Arg2
Equate RowLimit To Arg3
Equate WarnLimit To Arg4
Dim Param(100,2) ;* Limited to max of 100

parameters
Deffun DSRMessage(A1, A2, A3) Calling

Deffun DSRTimestamp Calling "DSR_TIMESTAMP"
JobHandle = ''
Info = ''
ParamCount = Dcount(Params,'|')
If RowLimit = '' Then RowLimit = 0
If WarnLimit = '' Then WarnLimit = 0
For ParamNum = 1 to ParamCount

Param(ParamNum,1) =
Field(Field(Params,'|',ParamNum),'=',1)
Param(ParamNum,2) =
Next ParamNum
JobStartTime = DSRTimestamp()
JobHandle = DSAttachJob(RunJobName,
DSJ.ERRFATAL)
* Prepare the job

ErrorCode = DSPrepareJob(JobHandle)
Message = DSRMessage('DSTAGE_TRX_I_0014',
'Attaching job for processing - %1 - Status of
Attachment = %2', RunJobName:@FM:JobHandle )
Call DSLogInfo(Message, RoutineName)
LimitErr = DSSetJobLimit(JobHandle,
DSJ.LIMITROWS, RowLimit)
☻Page 173 of 243☻

DSJ.LIMITWARN, WarnLimit)
* Need to check if error occurred.

ListOfParams = DSGetJobInfo(JobHandle,
DSJ.PARAMLIST)
ListCount = Dcount(ListOfParams,',')
For ParamNum = 1 To ParamCount
'Setting Job Param - %1 Setting to %2',
Param(ParamNum,1):@FM:Param(ParamNum,2))
ErrCode = DSSetParam(JobHandle,
Param(ParamNum,1),Param(ParamNum,2))
Next ParamNum
ErrCode = DSRunJob(JobHandle, DSJ.RUNNORMAL)

ErrCode = DSWaitForJob(JobHandle)
Status = DSGetJobInfo(JobHandle,
DSJ.JOBSTATUS)
JobEndTime = DSRTimestamp()
If Status = DSJS.RUNFAILED Then
Message =
DSRMessage( 'DSTAGE_TRX_E_0020', 'Job Failed:
%1', RunJobName)
Call DSLogWarn(Message, RoutineName)
End
* Retrieve more information about this job run.
'Getting job statistics', '' )
StageList =
DSGetJobInfo(JobHandle,DSJ.STAGELIST)
'List of Stages=%1', StageList )
StageCount = Dcount(StageList, ',') ; *

Count number of active stages.
Info<1> = RunJobName
Info<2> = JobStartTime ;* StartTime
(Timestamp format)
Info<3> = JobEndTime ;* Now/End (Timestamp
format)
FOR Stage = 1 To StageCount

* Get links on this stage.
LinkNames =
DSGetStageInfo(JobHandle,Field(StageList,',',Stag
e),DSJ.LINKLIST)
Message =
DSRMessage( 'DSTAGE_TRX_I_0018', 'LinkNames for
Stage.%1 = %2',
Field(StageList,',',Stage):@FM:LinkNames)
☻Page 174 of 243☻

LinkCount = Dcount(LinkNames,',')
For StageLink = 1 To LinkCount
* Get Rowcount For this linkname
RowCount =
DSGetLinkInfo(JobHandle,Field(StageList,',',Stage
),Field(LinkNames,',',StageLink),DSJ.LINKROWCOUNT
)
Message =
DSRMessage( 'DSTAGE_TRX_I_0019', 'RowCount for
%1.%2=%3',
Field(StageList,',',Stage):@FM:Field(LinkNames,',
',StageLink):@FM:RowCount)
Info<4,-1> =
Field(StageList,',',Stage):'.':Field(LinkNames,',
',StageLink)
Info<5,-1> = RowCount
Next StageLink
Next Stage
Message = DSRMessage( 'DSTAGE_TRX_I_0020',

'RunJob Status=%1', Info )
Ans = RunJobName:'=':Status:@FM:Info
RunJobAndDetach:
The routine runs a job. Job parameters may be

supplied. The job is detached from so tht others
may be started immediately and the control job
finish.
As well as the job name and job parameters, the

routine parameters allow the job warning limit
and row count limit to be set.
FunctionRunDetachJob(Arg1,Arg2,Arg3,Arg4)
* Run a job, and detach from it so that this job

can end
*
Equate RoutineName To 'RunJobAndDetach'

Equate RunJobName To Arg1
Equate Params To Arg2
Equate RowLimit To Arg3
Equate WarnLimit To Arg4
☻Page 175 of 243☻

Dim Param(100,2) ;* Limited to max of 100
parameters
Deffun DSRMessage(A1, A2, A3) Calling

Deffun DSRTimestamp Calling "DSR_TIMESTAMP"
JobHandle = ''
Info = ''
ParamCount = Dcount(Params,'|')
If RowLimit = '' Then RowLimit = 0
If WarnLimit = '' Then WarnLimit = 0
For ParamNum = 1 to ParamCount

Param(ParamNum,1) =
Param(ParamNum,2) =
Next ParamNum
* Attach to the job

JobHandle = DSAttachJob(RunJobName,
DSJ.ERRWARN)
If JobHandle = 0
Then Call DSLogInfo("Job ":RunJobName:" not
started - attach failed",RoutineName)
Else
* Prepare the job

ErrorCode = DSPrepareJob(JobHandle)
'Attaching job for processing - %1 - Status of
Attachment = %2', RunJobName:@FM:JobHandle )
DSJ.LIMITROWS, RowLimit)
DSJ.LIMITWARN, WarnLimit)
* Need to check if error occurred.

ListOfParams = DSGetJobInfo(JobHandle,
DSJ.PARAMLIST)
ListCount = Dcount(ListOfParams,',')
For ParamNum = 1 To ParamCount
Message =
DSRMessage('DSTAGE_TRX_I_0015', 'Setting Job
Param - %1 Setting to %2',
Param(ParamNum,1):@FM:Param(ParamNum,2))
ErrCode = DSSetParam(JobHandle,
Param(ParamNum,1),Param(ParamNum,2))
Next ParamNum
ErrCode = DSRunJob(JobHandle,
DSJ.RUNNORMAL)
ErrCode = DSDetachJob(JobHandle)
End
☻Page 176 of 243☻

Ans = 0
RunShellCommandReturnStatus:
Function RunShellcommandreturnstatus(Command)
Call DSLogInfo('Running
command:':Command,'RunShellCommandReturnStatus')
Call DSExecute('UNIX',Command,Ans,Ret)
Call DSLogInfo('Output from

command:':Ans,'RunShellCommandReturnStatus')
Return(Ret)
SegKey:
Segment_Num: An Integer number representing the

order number of the Segment in the IDoc
Segment_Parm: A Segment Parameter containing a
string of Y's and N's in order of Segment_Num
denoting of the segment should be written to in
this Module
Key: The Value to Be Mapped
ErrorLogInd: An Indicator to indicate of errors
should be logged (Note this is not yet
implemented)
Function
Seqkey(Segment_Num,segmentparam,key,ErrorLogInd)
* FUNCTION SegKey(Value,ErrorLogInd)
*
key
*
* Input Parameters : Arg1: Segment_Num
* Arg2: Segment_Parm
* Arg1: Key = An
ordered Pip separated set of Seqment Primary Key
Fields
* Arg2: ErrorLogInd = An
Indicator to indicate of errors should be logged
(Note this is not yet implemented)
*
is supplied
return value is: -2
*
*
☻Page 177 of 243☻

*
RoutineName = 'SegKey'
BlankFields = ""
CRLF = Char(13) : Char(10)
Message = "IN Seg Key" : Segment_Num :

"|" : Segment_Parm : "|" : Key : "|" :
ErrorLogInd : "|"
* Determine if this segment should output
Write_Ind = Field(Segment_Parm,"|",Segment_Num)
If Write_Ind = "Y" then
* Count how many keys
NumKeys = Dcount(Key,"|")
* Make a list of any keys that are missing
Blank_Key_Cnt = 0
ReturnKey = ""
For i = 1 to NumKeys
Key_Part = Field(Key,"|",i)
if Key_Part = "" Then
Blank_Key_Cnt = Blank_Key_Cnt + 1
BlankFields<Blank_Key_Cnt> = i
end
ReturnKey = ReturnKey : Key_Part
Next i
If Blank_Key_Cnt > 0 and ErrorLogInd = "Y"

then
Message = "Error in Segment Key: ":
Segment_Num : " There are " : Blank_Key_Cnt : "
Missing Key Parts " : "The Missing Key Parts
are" : BlankFields
End
If Blank_Key_Cnt > 0 then
Ans = "Invalid_Key"
End Else
Ans = ReturnKey
End
End
Else
Ans = "Invalid_Key"
End
☻Page 178 of 243☻

SetDSParamsFromFile:
A before job subroutine to set Job parameters

from an external flat file
Input Arg should be of the form:

ParamDir,ParamFile
If ParamDir is not supplied, the routine assumes
the Project directory
If ParamFile is not supplied, the routine assumes
the Job Name (this could be dangerous)
The routine will abort the job if anything
doesn't go to plan
Note: a lock is placed to stop the same job from

running another instance of this routine. The
second instance will have to wait for the routine
to finish before being allowed to proceed. The
lock is released however the routine terminates
(normal, abort...)
The parameter file should contain non-blank lines

of the form
ParName = ParValue
White space is ignored.
The Routine may be invoked via the normal Before

Job Subroutine setting, or from within the 'Job
Properties->Job Control' window by entering "Call
DSU.SetParams('MyDir,MyFile',ErrorCode)"
For Andrew Webb's eyes only -
The routine could be made to work off a hashed

file, or environment variables quite easily.
It is not possible to create Job Parameters on-

the-fly because they are referenced within a Job
via an EQUATE of the form
JobParam%%1 = STAGECOM.STATUS<7,1>
JobParam%%2 = STAGECOM.STATUS<7,2> etc
This is then compiled up....So forget it!
Subroutinues
SetDsparmsformfile(inputArg,Errorcode)
$INCLUDE DSINCLUDE DSD_STAGE.H

$INCLUDE DSINCLUDE DSD.H
$INCLUDE DSINCLUDE DSD_RTSTATUS.H
Equ SetParams To 'SetDSParamsFromFile'
☻Page 179 of 243☻

ErrorCode = 0 ; * set
this to non-zero to stop the stage/job
JobName = Field(STAGECOM.NAME,'.',1,2)
ParamList =
STAGECOM.JOB.CONFIG<CONTAINER.PARAM.NAMES>
If ParamList = '' Then
Call DSLogWarn('Parameters may not be
externally derived if the job has no parameters
defined.',SetParams)
Return
End
Call DSLogInfo("SetDSParmsFromFile inputarg

>" : InputArg : "<", SetParms)
ArgList = Trims(Convert(',',@FM,InputArg))
ParamDir = ArgList<1>
If ParamDir = '' Then
ParamDir = '.'
End
ParamFile = ArgList<2>
If ParamFile = '' Then
ParamFile = JobName
End
If System(91) Then
Delim = '\'
End Else
Delim = '/'
End
ParamPath = ParamDir:Delim:ParamFile
Call DSLogInfo('Setting Job Parameters from

external source ':ParamPath,SetParams)
Call DSLogInfo(JobName:' -
':ParamList,SetParams)
OpenSeq ParamPath To ParamFileVar On Error

ErrorCode = 1
Call DSLogFatal('File open error on
':ParamPath:'. Status = ':Status(),SetParams)
End Else
Call DSLogWarn('File ':ParamPath:' not
found - using default parameters.',SetParams)
Return
End
StatusFileName =
FileInfo(DSRTCOM.RTSTATUS.FVAR,1)
Readvu LockItem From DSRTCOM.RTSTATUS.FVAR,
JobName, 1 On Error
Call DSLogFatal('File read error for
':JobName:' on ':StatusFileName:'. Status =
':Status(),SetParams)
ErrorCode = 1
☻Page 180 of 243☻

Return
End Else
Call DSLogFatal('Failed to read
':JobName:' record from
':StatusFileName,SetParams)
ErrorCode = 2
Return
End
StatusId = JobName:'.':STAGECOM.WAVE.NUM
Readv ParamValues From
DSRTCOM.RTSTATUS.FVAR, StatusId, JOB.PARAM.VALUES
On Error
Release DSRTCOM.RTSTATUS.FVAR, JobName
On Error Null
ErrorCode = 1
':StatusId:' on ':StatusFileName:'. Status =
Return
End Else
On Error Null
ErrorCode = 2
':StatusId:' record from
Return
End
Loop
ReadSeq ParamData From ParamFileVar On
Error
Release DSRTCOM.RTSTATUS.FVAR,
JobName On Error Null
ErrorCode = 4
Call DSLogFatal('File read error on
Return
End Else
Exit
End
Convert '=' To @FM In ParamData
ParamName = Trim(ParamData<1>)
Del ParamData<1>
ParamValue =
Convert(@FM,'=',TrimB(ParamData))
Locate(ParamName,ParamList,1;ParamPos)
Then
If
Index(UpCase(ParamName),'PASSWORD',1) = 0
Then Call DSLogInfo('Parameter
"':ParamName:'" set to
"':ParamValue:'"',SetParams)
☻Page 181 of 243☻

Else Call DSLogInfo('Parameter
"':ParamName:'" set but not displayed on
log',SetParams)
End
Else
Call DSLogWarn('Parameter
':ParamName:' does not exist in Job
':JobName,SetParams)
Continue
End
ParamValues<1,ParamPos> = ParamValue
Repeat
Writev ParamValues On
On Error
On Error Null
ErrorCode = 5
Call DSLogFatal('File write error for
Return
End Else
On Error Null
ErrorCode = 6
Call DSLogFatal('Unable to write
':StatusId:' record on ':StatusFileName:'. Status
= ':Status(),SetParams)
Return
End
Release DSRTCOM.RTSTATUS.FVAR, JobName On
Error Null
STAGECOM.JOB.STATUS<JOB.PARAM.VALUES> =
ParamValues
setParamsForFileSplit:
Using values from a control file this routine will run a job
multiple times loading the specified number of rows for
each job run.
Function setParamsForFileSplit:
(ControlFilename,Jobname)
*************************************************
**********************
* Nick Bond....
*
* This routine retrieves values from a control
file and passes them as paramters to *
* a job which is run once for each record in the
control file. *
*
*
☻Page 182 of 243☻

*************************************************
**********************
Equate Routine TO 'setParamsForFileSplit'
Call DSLogInfo('Starting Routine ',

Routine)
vFileName = ControlFileName
vJobName = JobName
vRecord = 1
******** Open Control File and retrieve split

values.
Call DSLogInfo('Opening File: ':vFileName,

Routine)
OPEN vFileName TO vFILE ELSE Call
DSLogFatal("Can't open file: ":vFileName,
Routine)
Call DSLogInfo('File is open: ':vFileName,
Routine)
******** Start loop which gets parameters from

control file and runs job.
Loop
** Check record exists for record id

READ vStart FROM vFILE, vRecord
Then
Call DSLogInfo('Loop Started:

':vFileName, Routine)
Call DSLogInfo('Control File ID:

':vRecord, Routine)
READV vStart FROM vFILE, vRecord, 4
Then
READV vStop FROM vFILE, vRecord, 5
Then
Call DSLogInfo('Load Records: ':vStart:
' to ' :vStop, Routine)
End
End
** Set Job Parameters and Run Job.
vNewFile = 'SingleInvoice':vRecord
vJobHandle = DSAttachJob(vJobName,
DSJ.ERRFATAL)
ErrCode = DSSetParam(vJobHandle,
'StartID', vStart)
'StopID', vStop)
☻Page 183 of 243☻

'newfile', vNewFile )
ErrCode = DSRunJob(vJobHandle,
DSJ.RUNNORMAL)
ErrCode = DSWaitForJob(vJobHandle)
vRecord = vRecord+1
End
Else
** If record is empty leave loop
GoTo Label1
End
Repeat
******** End of Loop
Label1:
Call DSLogInfo('All records have been
processed', Routine)
Ans = vStart : ', ' : vStop
SetUserStatus:
Function Setuserstatus(Arg1)
Call DSSetUserStatus(Arg1)
Ans=Arg1
SMARTNumberConversion
Converts numbers in format 1234,567 to format 1234.57
Function SMARTNUMBERconversion(arg1)
INP = CONVERT(",",".",Arg1) ; * Commas to

decimal point
WRK = ICONV(INP,"MD33") ; *
convert to internal to 3 decimal places
Ans = OCONV(WRK,"MD23") ; *
convert to external t 2 decimal places
TicketErrorCommon
Required to use the "LogToErrorFile" Routine. This

stores variables used by the routine in shared memory:
* FUNCTION
TicketErrorCommon(Mod_Run_ID,Ticket_Group,Ticket_
Sequence,Ticket_Set_Key,Job_Stage_Name,Mod_Root_P
ath)
*
* Places the current Row Ticket in Common
☻Page 184 of 243☻

*
* Input Parameters : Arg1: Mod_Run_ID
= The unique number allocated to a run of an
Module
* Arg2: Ticket_File_ID
= The File ID assigned to the source of the
Current Row
* Arg3: Ticket_Sequence
= The Ticket Sequence Number of the Current Row
* Arg4: Ticket_Set_Key
= Identifies a set of rows e.g. an Invoice number
to set of invoice lines
* Arg5: Job_Stage_Name
= The Name of the Stage in the Job you want
recorded in the error log
* Arg6: Mod_Root_Path
= Root of the module - used for location of error
hash file
*
* Don't Return Ans but need to keep the compiler
happy
Ans = ""
RoutineName = 'ErrorTicketCommon'

TicketFileID, TicketSequence, SetKey,
ModRunID = Mod_Run_ID
TicketFileID = Ticket_File_ID
TicketSequence = Ticket_Sequence
SetKey = Ticket_Set_Key
JobStageName = Job_Stage_Name
ModRootPath = Mod_Root_Path
RETURN(Ans)
TVARate:
Function TvaRate(mtt_Base,mtt_TVA)
BaseFormated = "":(Mtt_Base)
TvaFormated = "":(Mtt_TVA)
If IsNull(BaseFormated) or BaseFormated = "0" or

BaseFormated= "" Then
Ans = 0
End Else
TvaFormated = Ereplace(TvaFormated, ".", "")

TvaFormated = Ereplace(TvaFormated, ",", "")
BaseFormated = Ereplace(BaseFormated, ".", "")

BaseFormated = Ereplace(BaseFormated, ",", "")
Ans = Ereplace(TvaFormated/BaseFormated, ".", "")

End
☻Page 185 of 243☻

TVATest:
Function Tvatest(Mtt_TVA,Dlco)
Country = TRIM(Dlco):";"
TestCountry =
Count("AT;BE;CY;CZ;DE;DK;EE;ES;FI;GB;GR;HU;IE;IT;
LT;LU;LV;MT;NL;PL;PT;SE;SI;SK;", Country)
Begin Case
Case Mtt_TVA <> 0
Reply = "B3"
Case Mtt_TVA = 0 And Dlco = "FR" And TestCountry
= 0
Reply = "A1"
Case Mtt_TVA = 0 And Dlco <> "FR" And TestCountry
= 1
Reply = "E6"
Case Mtt_TVA = 0 And Dlco <> "FR" And TestCountry
= 0
Reply = "E7"
Case @True
Reply = "Error"
End Case
Ans = Reply
UnTarFile:
Function Untarfile(Arg1)
DIR =
"/interface/dashboard/dashbd_dev_dk_int/Source/"
FNAME = "GLEISND_OC_02_20040607_12455700.csv"
*CMD = "ll -tr ":DIR:"|grep ":FNAME
*CMD = "cmp -s ":DIR:"|grep ":FNAME
CMD = "tar -xvvf ":DIR:FNAME
*--------------------------------
*---syntax= tar -xvvf myfile.tar
*---------------------------------
Call DSExecute("UNIX", CMD, Output,

SystemReturnCode)
Ans = Output
UtilityMessageToControllerLog
Write an informational message to the log of the

controlling job
☻Page 186 of 243☻

This routine takes a user defined message and
displays it in the job log of the controlling
sequence as an informational message.
The routine should be used sparingly in

production jobs to avoid degrading the
performance.
The return value of the function is always 1.:
Function UtilityMessageToControllerLog(Arg1)
* Write an informational message to the log of

the controlling job.
*
* This function is mainly intended for
development purposes, but can be used
* within a production environment for tracing
data values. The user should
* use this function cautiously, as it will cause
a decrease in performance
* if called often.
*
$include DSINCLUDE JOBCONTROL.H
Equate RoutineName To
"UtilityMessageToControllerLog"
InputMsg = Arg1
If Isnull(InputMsg) Then
InputMsg = " "
End
Call DSLogToController(InputMsg)
Ans = 1
UTLPropagateParms:
Routine allows a job to inherit parameter values from Job

Control.
This routine allows a job to inherit parameter values from

Job Control by listing the parameters of child job and
thereafter find the parameter in the parent job, getting
value and setting parameter value in child job.
Input Argument : Job handle (set by using DSAttachJob

in Job Control)
☻Page 187 of 243☻

Output : If a parameter is not found the routine returns 3,
otherwise 0.
Function UTLprapagateparam(Handle)
#include DSINCLUDE JOBCONTROL.H

Equ Me To 'UTLJobRun'
Ans = 0
ParentJobName =
DSGetJobInfo(DSJ.ME,DSJ.JOBNAME)
ChildParams =
Convert(',',@FM,DSGetJobInfo(Handle,DSJ.PARAMLIST
))
ParamCount = Dcount(ChildParams,@FM)
If ParamCount Then
ParentParams =
Convert(',',@FM,DSGetJobInfo(DSJ.ME,DSJ.PARAMLIST
))
Loop
ThisParam = ChildParams<1>
Del ChildParams<1>
*** Find job parameter in parent job
and set parameter in child job to value of
parent.
Locate(ThisParam,ParentParams;ParamPo
s) Then
ThisValue =
DSGetParamInfo(DSJ.ME,ThisParam,DSJ.PARAMVALUE)
ParamStatus =
DSSetParam(Handle,ThisParam,ThisValue)
Call DSLogInfo ("Setting:
":ThisParam:" To: ":ThisValue,
"UTLPropagateParms")
End
Else
*** If the parameter is not found
in parent job:
*** - write a warning to log file.
*** - return code changed to 3.
Call DSLogWarn ("Parameter :
":ThisParam:" does not exist in ":ParentJobName,
"UTLPropagateParms")
Ans = 3
End
While ChildParams # '' Do Repeat
End
Return(Ans)
UTLRunReceptionJob:
This routines allows generic starting of reception jobs

without creating specific Reception Processing Sequence.
☻Page 188 of 243☻

This routines allows generic starting of reception jobs
without creating specific Reception Processing Sequence.
- Determines job to launch (sequence or elementary job)
- Attaches job
- Propagates parameters using routine
UTLPropagateParms.
- Runs job and takes action upon result (any warning
will lead to a return code NOT OK)
Obligatory parameters in input are :

- Country_Parm
- Fileset_Name_Type_Parm
- Abort_Msg_Parm
- Module_Run_Parm
Function
Utilrunrece[pationjob(countryparam,fileset_name_typepa
ram,modulerunparam,Abort_msg_param)
$INCLUDE DSINCLUDE DSJ_XFUNCS.H

EQU Time$$ Lit "Oconv(Time(), 'MTS:'):': '"
Ans = -3
vRecJobNameBase = Country_Parm : "_" :

Fileset_Name_Type_Parm : "_Reception"
*************************************************
**************************************
***
###################
***
*************************************************
**************************************
*** Define job to launch -
Sequence or Job (START) ***
***
***
L$DefineSeq$START:
summary$<1,-1> = Time$$:Convert(@VM, " ",
DSMakeMsg("DSTAGE_JSG_M_0057\%1 (JOB %2)
started", "ReceptionJob":@FM:vRecJobNameBase))
** If Sequential Job exists - start Sequential
Job.
vJobSuffix = "_Seq"
vRecJobName = vRecJobNameBase : vJobSuffix
GoTo L$AttachJob$START
L$DefineJob$START:
** If no Sequential Job - start Elementary Job
vJobSuffix = "_Job"
☻Page 189 of 243☻

vRecJobName = vRecJobNameBase : vJobSuffix
GoTo L$AttachJob$START
L$ErrNoJob$START:
** If no job found - warn and end job
Msg = DSMakeMsg("No job found to attach" :
vRecJobNameBase : "_Seq or _Job", "")
MsgId = "@ReceptionJob"
GoTo L$ERROR
L$AttachJob$START:
Call DSLogInfo(DSMakeMsg("Checking presence of
" : vRecJobName : " for " : Module_Run_Parm, ""),
"")
jbRecepJob = vRecJobName
hRecepJob = DSAttachJob(jbRecepJob,
DSJ.ERRNONE)
If (Not(hRecepJob)) Then
AttachErrorMsg$ = DSGetLastErrorMsg()
If AttachErrorMsg$ = "(DSOpenJob) Cannot
find job " : vRecJobName Then
If vJobSuffix = "_Seq" Then GoTo
L$DefineJob$START
Else
GoTo L$ErrNoJob$START
End
End
Msg = DSMakeMsg("DSTAGE_JSG_M_0001\Error
calling DSAttachJob(%1)<L>%2",
jbRecepJob:@FM:AttachErrorMsg$)
MsgId = "@ReceptionJob"; GoTo L$ERROR
GoTo L$ERROR
End
If hRecepJob = 2 Then
GoTo L$RecepJobPrepare$START
End
***
***
*** Define job to launch -
Sequence or Job (END) ***
*************************************************
**************************************
***
###################
***
*************************************************
**************************************
*** Setup , Run and Wait for
Reception Job (START) ***
***
***
L$RecepJobPrepare$START:
*** Activity "ReceptionJob": Setup, Run and Wait
for job
hRecepJob = DSPrepareJob(hRecepJob)
If (Not(hRecepJob)) Then
☻Page 190 of 243☻

calling DSPrepareJob(%1)<L>%2",
jbRecepJob:@FM:DSGetLastErrorMsg())
End
started", "ReceptionJob":@FM:vRecJobName))
GoTo L$PropagateParms$START
L$PropagateParms$START:
*** Activity "PropagateParms": Propagating
parameters from parent job to child job using
separate routine.
DSMakeMsg("DSTAGE_JSG_M_0058\%1 (ROUTINE %2)
started",
"PropagateParms":@FM:"DSU.UTLPropagateParms"))
RtnOk =
DSCheckRoutine("DSU.UTLPropagateParms")
If (Not(RtnOk)) Then
Msg = DSMakeMsg("DSTAGE_JSG_M_0005\BASIC
routine is not cataloged: %1",
"DSU.UTLPropagateParms")
MsgId = "@PropagateParms"; GoTo L$ERROR
End
Call 'DSU.UTLPropagateParms'(rPropagateParms,
hRecepJob)
DSMakeMsg("DSTAGE_JSG_M_0064\%1 finished, reply=
%2", "PropagateParms":@FM:rPropagateParms))
IdAbortRact%%Result1%%1 = rPropagateParms
IdAbortRact%%Name%%3 = "DSU.UTLPropagateParms"
*** Checking result of routine. If <> 0 then
abort processing.
If (rPropagateParms <> 0)
Then GoTo L$ABORT
GoTo L$RecepJobRun$START
L$RecepJobRun$START:
ErrCode = DSRunJob(hRecepJob, DSJ.RUNNORMAL)
If (ErrCode <> DSJE.NOERROR) Then
calling DSRunJob(%1), code=%2[E]",
jbRecepJob:@FM:ErrCode)
End
ErrCode = DSWaitForJob(hRecepJob)
GoTo L$RecepJob$FINISHED
***
***
*** Setup , Run and Wait for
Reception Job (END) ***
*************************************************
**************************************
☻Page 191 of 243☻

***
###################
***
*************************************************
**************************************
*** Verification of result
of Reception Job (START) ***
***
***
L$RecepJob$FINISHED:
jobRecepJobStatus = DSGetJobInfo(hRecepJob,
DSJ.JOBSTATUS)
jobRecepJobUserstatus =
DSGetJobInfo(hRecepJob, DSJ.USERSTATUS)
DSMakeMsg("DSTAGE_JSG_M_0063\%1 finished, status=
%2[E]", "ReceptionJob":@FM:jobRecepJobStatus))
IdRecepJob%%Result2%%5 = jobRecepJobUserstatus
IdRecepJob%%Result1%%6 = jobRecepJobStatus
IdRecepJob%%Name%%7 = vRecJobName
Dummy = DSDetachJob(hRecepJob)
bRecepJobelse = @True
If (jobRecepJobStatus = DSJS.RUNOK)
Then GoTo L$SeqSuccess$START; bRecepJobelse =
@False
If bRecepJobelse Then GoTo L$SeqFail$START
***
***
*** Verification of result
of Reception Job (END) ***
*************************************************
**************************************
***
###################
***
*************************************************
**************************************
*** Definition of actions to
take on failure or success (START) ***
***
***
L$SeqFail$START:
*** Sequencer "Fail": wait until inputs ready
Call DSLogInfo(DSMakeMsg("Routine SEQUENCER -
Control End Sequence Reports a FAIL on Reception
Job", ""), "@Fail")
GoTo L$ABORT
L$SeqSuccess$START:
*** Sequencer "Success": wait until inputs ready
Call DSLogInfo(DSMakeMsg("Routine SEQUENCER -
Control End Sequence Reports a SUCCESS on
Reception Job", ""), "@Success")
GoTo L$FINISH
☻Page 192 of 243☻

***
***
*** Definition of actions to
take on failure or success (END) ***
*************************************************
**************************************
***
###################
***
L$ERROR:
Call
DSLogWarn(DSMakeMsg("DSTAGE_JSG_M_0009\Controller
problem: %1", Msg), MsgId)
DSMakeMsg("DSTAGE_JSG_M_0052\Exception raised:
%1", MsgId:", ":Msg))
bAbandoning = @True
GoTo L$FINISH
L$ABORT:
DSMakeMsg("DSTAGE_JSG_M_0056\Sequence failed",
""))
Call DSLogInfo(summary$,
"@UTLRunReceptionJob")
Call DSLogWarn("Unrecoverable errors in
routine UTLRunReceptionJob, see entries above",
Ans = -3
GoTo L$EXIT
*************************************************
*
L$FINISH:
If bAbandoning Then GoTo L$ABORT
DSMakeMsg("DSTAGE_JSG_M_0054\Sequence finished
OK", ""))
Call DSLogInfo(summary$,
Ans = 0
ValidateField:
Checks the length and data type of a value. Also checks

value is a valid date if the type is Date. Any errors are
logged to the Error Hash File
Field_Value: The value from the field being validated

Field_Name: The name of the field being validated
Length: The maximum length of the field being validated
Data_Type: The data type expected - possible values
(Numeric, Alpha, Date, Char)
☻Page 193 of 243☻

Date_Format: If Data_Type is 'Date' Then the format
must be specified. The syntax for this is the same as for
the Iconv function. i.e "D/YMD[4,2,2]" for a date in the
format 2004/12/23

vRoutineName = 'ValidateField'
DEFFUN
LogToErrorFile(Table,Field_Name,Check_Value,Error
_Number,Text_1,Text_2, Text_3, Message) Calling
"DSU.LogToErrorFile"
Common /HashLookup/ FileHandles(100), FilesOpened
TicketFileID, TicketSequence, TicketSetKey,
Ans = "START"
vData_Type = Downcase(Data_Type)
BEGIN CASE
******** Check the arguments
* Value being checked is null
CASE isNull(Field_Value)
Call DSTransformError("The value being checked is
Null - Field_Name = " : Field_Name,
vRoutineName)
* Argument for the data type is not valid
CASE vData_Type <> "char" AND vData_Type <>
"alpha" AND vData_Type <> "numeric" AND
vData_Type <> "date"
Call DSTransformError("The value " : Data_Type :
" is not a valid data type for routine: ",
vRoutineName)
* Length is not a number
CASE Not(Num(Length))
Call DSTransformError("The length supplied is not
a number : Field Checked " : Field_Name,
vRoutineName)
CASE vData_Type = "date" And (Date_Format = "" OR
isNull(Date_Format))
END CASE
*********
******** Check The Values

*** Check the data type of supplied value ***
If vData_Type = 'numeric' Then
If Num(Field_Value) Then
vErr = 'OK'
End Else
vErr = LogToErrorFile("No
Table",Field_Name,Field_Value,'10002','Text1','Te
xt2','Text3','Value provided is not numeric')
Ans = ' [Not Numeric]'
End
End Else
If vData_Type = 'alpha' Then
☻Page 194 of 243☻

If Alpha(Field_Value) Then
vErr = 'OK'
End Else
xt2','Text3','Value provided is not alpha')
Ans = ' [Not Alpha]'
End
End Else
If vData_Type = 'date' Then

vErr = Iconv(Field_Value,Date_Format)
vErr = Status()
If vErr <> 0 Then
xt2','Text3','Value provided is not a valid date
for mask ':Date_Format)
Ans = ' [Invalid Date]'
End
End Else
End
End
End
*** Check the length of the supplied value ***

If Len(Field_Value) <= Length Then
vErr = 'OK'
End Else
xt2','Text3','Value provided is not the correct
length')
Ans = Ans : ' [Length Error]'
End
Ans = Ans
VatCheckSG:
Function VatcheckSg(Arg1)
String=Arg1
Slen=Len(String)
Scheck=0
CharCheck=0
For Scheck = 1 to Slen
Schar=Substrings(String,Scheck,1)
If NUM(Schar) <> 1 then
CharCheck=CharCheck+1
end
☻Page 195 of 243☻

Next
Ans=CharCheck
WriteParmFile:
Function writeparamfile(Arg1,Arg2,arg3,arg4)
Arg1; File Path

Arg2: File Name
Arg3: Parameter Name
Arg4: Parameter Value

vParamName = Arg3
vParamValue = Arg4
If Arg4 = -256 Then

vParamValue = ""
End

End
Loop
ReadSeq Dummy From FileVar Else Exit ;* at
end-of-file
Repeat
MyLine= vParamName : "=" : vParamValue

*Write New Error File
WriteSeqF MyLine To FileVar Else
Call DSLogFatal("Cannot write to ": FileVar ,
"Cannot write to file")
End
WeofSeq FileVar
CloseSeq FileVar
Ans=MyLine
WriteSeg:
* FUNCTION SegKey(Value,ErrorLogInd)
*
key
*
* Input Parameters : Arg1: Segment_Num
* Arg2: Segment_Parm
*
* Return Values: If the Segment should be
written return value is "Y"
* If If not return value is "N"
*
☻Page 196 of 243☻

*
*
RoutineName = 'WriteSeg'
* Determine if this segment should output
Write_Ind = Field(Segment_Parm,"|",Segment_Num)
If Write_Ind = "Y" then

Ans = "Y"
End
Else
Ans = "N"
End
SET_JOB_PARAMETERS_ROUTINE
InputArg……………..Arguments.
ErrorCode…………Arguments.
Routinuename: SetDSParamsFromFile
$INCLUDE DSINCLUDE DSD_STAGE.H
$INCLUDE DSINCLUDE DSD.H
$INCLUDE DSINCLUDE DSD_RTSTATUS.H
Equ SetParams To 'SetDSParamsFromFile'
ErrorCode = 0 ; * set
this to non-zero to stop the stage/job
JobName = Field(STAGECOM.NAME,'.',1,2)
ParamList =
STAGECOM.JOB.CONFIG<CONTAINER.PARAM.NAMES>
If ParamList = '' Then
Call DSLogWarn('Parameters may not be
externally derived if the job has no parameters
defined.',SetParams)
Return
End
Call DSLogInfo("SetDSParmsFromFile inputarg

>" : InputArg : "<", SetParms)
ArgList = Trims(Convert(',',@FM,InputArg))
ParamDir = ArgList<1>
If ParamDir = '' Then
ParamDir = '.'
End
ParamFile = ArgList<2>
If ParamFile = '' Then
ParamFile = JobName
☻Page 197 of 243☻

End
If System(91) Then
Delim = '\'
End Else
Delim = '/'
End
ParamPath = ParamDir:Delim:ParamFile
Call DSLogInfo('Setting Job Parameters from

external source ':ParamPath,SetParams)
Call DSLogInfo(JobName:' -
':ParamList,SetParams)
OpenSeq ParamPath To ParamFileVar On Error

ErrorCode = 1
Call DSLogFatal('File open error on
End Else
Call DSLogWarn('File ':ParamPath:' not
found - using default parameters.',SetParams)
Return
End
End Else
Call StatusFileName =
FileInfo(DSRTCOM.RTSTATUS.FVAR,1)
Readvu LockItem From DSRTCOM.RTSTATUS.FVAR,
JobName, 1 On Error
':JobName:' on ':StatusFileName:'. Status =
ErrorCode = 1
ReturnDSLogFatal('Failed to read
':JobName:' record from
ErrorCode = 2
Return
End
StatusId = JobName:'.':STAGECOM.WAVE.NUM
Readv ParamValues From
On Error
On Error Null
ErrorCode = 1
Return
End Else
On Error Null
☻Page 198 of 243☻

ErrorCode = 2
':StatusId:' record from
Return
End
Loop
ReadSeq ParamData From ParamFileVar On
Error
Release DSRTCOM.RTSTATUS.FVAR,
JobName On Error Null
ErrorCode = 4
Call DSLogFatal('File read error on
Return
End Else
Exit
End
Convert '=' To @FM In ParamData
ParamName = Trim(ParamData<1>)
Del ParamData<1>
ParamValue =
Convert(@FM,'=',TrimB(ParamData))
Locate(ParamName,ParamList,1;ParamPos)
Then
If
Index(UpCase(ParamName),'PASSWORD',1) = 0
Then Call DSLogInfo('Parameter
"':ParamName:'" set to
"':ParamValue:'"',SetParams)
Else Call DSLogInfo('Parameter
"':ParamName:'" set but not displayed on
log',SetParams)
End
Else
Call DSLogWarn('Parameter
':ParamName:' does not exist in Job
':JobName,SetParams)
Continue
End
ParamValues<1,ParamPos> = ParamValue
Repeat
Writev ParamValues On
On Error
On Error Null
ErrorCode = 5
Call DSLogFatal('File write error for
Return
End Else
☻Page 199 of 243☻

On Error Null
ErrorCode = 6
Call DSLogFatal('Unable to write
':StatusId:' record on ':StatusFileName:'. Status
= ':Status(),SetParams)
Return
End
Release DSRTCOM.RTSTATUS.FVAR, JobName On

Error Null
STAGECOM.JOB.STATUS<JOB.PARAM.VALUES> =
ParamValues
vnput Arg should be of the form:
ParamDir,ParamFile
If ParamDir is not supplied, the routine assumes
the Project directory
If ParamFile is not supplied, the routine assumes
the Job Name (this could be dangerous)
The routine will abort the job if anything
doesn't go to plan
Note: a lock is placed to stop the same job from

running another instance of this routine. The
second instance will have to wait for the routine
to finish before being allowed to proceed. The
lock is released however the routine terminates
(normal, abort...)
The parameter file should contain non-blank lines

of the form
ParName = ParValue
White space is ignored.
The Routine may be invoked via the normal Before

Job Subroutine setting, or from within the 'Job
Properties->Job Control' window by entering "Call
DSU.SetParams('MyDir,MyFile',ErrorCode)"
For Andrew Webb's eyes only -
The routine could be made to work off a hashed

file, or environment variables quite easily.
It is not possible to create Job Parameters on-

the-fly because they are referenced within a Job
via an EQUATE of the form
JobParam%%1 = STAGECOM.STATUS<7,1>
JobParam%%2 = STAGECOM.STATUS<7,2> etc
This is then compiled up....So forget it!
Tokens were replaced below as follows:
☻Page 200 of 243☻

* IdV0S0%%Result2%%1 <=
FR_PARIS_End_to_End_Processing_SAct.$UserStatus
FR_PARIS_End_to_End_Processing_SAct.$JobStatus
* IdV0S0%%Name%%3 <=
FR_PARIS_End_to_End_Processing_SAct.$JobName
Set_Job_Parameters_Routine.$ReturnValue
* IdV0S2%%Name%%6 <= Set_Job_Parameters_Routine.
$RoutineName
FR_PARIS_Control_Start_Processing_SAct.
$UserStatus
FR_PARIS_Control_Start_Processing_SAct.$JobStatus
* IdV0S57%%Name%%10 <=
FR_PARIS_Control_Start_Processing_SAct.$JobName
FR_PARIS_Control_End_Processing_SAct.$UserStatus
FR_PARIS_Control_End_Processing_SAct.$JobStatus
* IdV0S61%%Name%%13 <=
FR_PARIS_Control_End_Processing_SAct.$JobName
* IdV0S72%%Result1%%14 <= Abort_RAct.$ReturnValue
* IdV0S72%%Name%%16 <= Abort_RAct.$RoutineName
*
*** [Generated at 2005-07-07 09:41:15 - 7.1.0.8]
$INCLUDE DSINCLUDE DSJ_XFUNCS.H
EQU Time$$ Lit "Oconv(Time(), 'MTS:'):': '"
****************************************
* Graphical Sequencer generated code for Job
FR_PARIS_Control_Seq
****************************************
seq$V0S10$count = 0
seq$V0S43$count = 0
seq$V0S44$count = 0
handle$list = ""
id$list = ""
abort$list = ""
b$Abandoning = @False
b$AllStarted = @False
summary$restarting = @False
*** Sequence start point
summary$ =
DSMakeMsg("DSTAGE_JSG_M_0048\Summary of sequence
run", "")
If summary$restarting Then
DSMakeMsg("DSTAGE_JSG_M_0049\Sequence restarted
after failure", ""))
End Else
DSMakeMsg("DSTAGE_JSG_M_0051\Sequence started",
""))
End
GoSub L$V0S2$START
b$AllStarted = @True
GoTo L$WAITFORJOB
☻Page 201 of 243☻

*************************************************
*
L$V0S0$START:
*** Activity
"FR_PARIS_End_to_End_Processing_SAct": Initialize
job
started",
"FR_PARIS_End_to_End_Processing_SAct":@FM:"FR_PAR
IS_End_to_End_Processing_Seq"))
Call DSLogInfo(DSMakeMsg("SEQUENCE - START
End_to_End_Processing_Seq", ""),
"@FR_PARIS_End_to_End_Processing_SAct")
jb$V0S0 =
"FR_PARIS_End_to_End_Processing_Seq":'.':
(Module_Run_Parm)
h$V0S0 = DSAttachJob(jb$V0S0, DSJ.ERRNONE)
If (Not(h$V0S0)) Then
msg$ = DSMakeMsg("DSTAGE_JSG_M_0001\Error
jb$V0S0:@FM:DSGetLastErrorMsg())
msg$id =
"@FR_PARIS_End_to_End_Processing_SAct"; GoTo
L$ERROR
End
h$V0S0 = DSPrepareJob(h$V0S0)
msg$id =
L$ERROR
End
L$V0S0$PREPARED:
p$V0S0$1 = (Project_Parm)
err$code = DSSetParam(h$V0S0, "Project_Parm",
p$V0S0$1)
If (err$code <> DSJE.NOERROR) Then
calling DSSetParam(%1), code=%2[E]",
"Project_Parm":@FM:err$code)
msg$id =
L$ERROR
End
p$V0S0$2 = (Module_Parm)
err$code = DSSetParam(h$V0S0, "Module_Parm",
p$V0S0$2)
"Module_Parm":@FM:err$code)
msg$id =
L$ERROR
End
p$V0S0$3 = (Run_Parm)
☻Page 202 of 243☻

err$code = DSSetParam(h$V0S0, "Run_Parm",
p$V0S0$3)
"Run_Parm":@FM:err$code)
msg$id =
L$ERROR
End
p$V0S0$4 = (Module_Run_Parm)
err$code = DSSetParam(h$V0S0,
"Module_Run_Parm", p$V0S0$4)
"Module_Run_Parm":@FM:err$code)
msg$id =
L$ERROR
End
p$V0S0$5 = (Data_Object_Parm)
"Data_Object_Parm", p$V0S0$5)
"Data_Object_Parm":@FM:err$code)
msg$id =
L$ERROR
End
p$V0S0$6 = (Interface_Parm)
"Interface_Parm", p$V0S0$6)
"Interface_Parm":@FM:err$code)
msg$id =
L$ERROR
End
p$V0S0$7 = (Interface_Root_Path_Parm)
"Interface_Root_Path_Parm", p$V0S0$7)
"Interface_Root_Path_Parm":@FM:err$code)
msg$id =
L$ERROR
End
p$V0S0$8 = (Generic_Root_Path_Parm)
"Generic_Root_Path_Parm", p$V0S0$8)
☻Page 203 of 243☻

"Generic_Root_Path_Parm":@FM:err$code)
msg$id =
L$ERROR
End
p$V0S0$9 = (Business_Process_Parm)
"Business_Process_Parm", p$V0S0$9)
"Business_Process_Parm":@FM:err$code)
msg$id =
L$ERROR
End
p$V0S0$10 = (Company_Parm)
err$code = DSSetParam(h$V0S0, "Company_Parm",
p$V0S0$10)
"Company_Parm":@FM:err$code)
msg$id =
L$ERROR
End
p$V0S0$11 = (Country_Parm)
err$code = DSSetParam(h$V0S0, "Country_Parm",
p$V0S0$11)
"Country_Parm":@FM:err$code)
msg$id =
L$ERROR
End
p$V0S0$12 = (SAP_Client_Parm)
"SAP_Client_Parm", p$V0S0$12)
"SAP_Client_Parm":@FM:err$code)
msg$id =
L$ERROR
End
p$V0S0$13 = (Source_System_Parm)
"Source_System_Parm", p$V0S0$13)
"Source_System_Parm":@FM:err$code)
☻Page 204 of 243☻

msg$id =
L$ERROR
End
p$V0S0$14 = (Fileset_Type_Name_Parm)
"Fileset_Name_Type_Parm", p$V0S0$14)
"Fileset_Name_Type_Parm":@FM:err$code)
msg$id =
L$ERROR
End
p$V0S0$15 = (Request_Type_Parm)
"Request_Type_Parm", p$V0S0$15)
"Request_Type_Parm":@FM:err$code)
msg$id =
L$ERROR
End
p$V0S0$16 = (Timestamp_Parm)
"Timestamp_Parm", p$V0S0$16)
"Timestamp_Parm":@FM:err$code)
msg$id =
L$ERROR
End
p$V0S0$17 = (Fileset_Parm)
err$code = DSSetParam(h$V0S0, "Fileset_Parm",
p$V0S0$17)
"Fileset_Parm":@FM:err$code)
msg$id =
L$ERROR
End
p$V0S0$18 = (File_Parm)
err$code = DSSetParam(h$V0S0, "File_Parm",
p$V0S0$18)
"File_Parm":@FM:err$code)
msg$id =
L$ERROR
End
☻Page 205 of 243☻

p$V0S0$19 = (File_Name_Parm)
"File_Name_Parm", p$V0S0$19)
"File_Name_Parm":@FM:err$code)
msg$id =
L$ERROR
End
p$V0S0$20 = (ISDB_Database_Parm)
"ISDB_Database_Parm", p$V0S0$20)
"ISDB_Database_Parm":@FM:err$code)
msg$id =
L$ERROR
End
p$V0S0$21 = (ISDB_User_Parm)
"ISDB_User_Parm", p$V0S0$21)
"ISDB_User_Parm":@FM:err$code)
msg$id =
L$ERROR
End
p$V0S0$22 = (ISDB_Password_Parm)
"ISDB_Password_Parm", p$V0S0$22)
"ISDB_Password_Parm":@FM:err$code)
msg$id =
L$ERROR
End
p$V0S0$23 = (CSDB_Database_Parm)
"CSDB_Database_Parm", p$V0S0$23)
"CSDB_Database_Parm":@FM:err$code)
msg$id =
L$ERROR
End
p$V0S0$24 = (CSDB_User_Parm)
"CSDB_User_Parm", p$V0S0$24)
☻Page 206 of 243☻

"CSDB_User_Parm":@FM:err$code)
msg$id =
L$ERROR
End
p$V0S0$25 = (CSDB_Password_Parm)
"CSDB_Password_Parm", p$V0S0$25)
"CSDB_Password_Parm":@FM:err$code)
msg$id =
L$ERROR
End
p$V0S0$26 = (Retest_Parm)
err$code = DSSetParam(h$V0S0, "Retest_Parm",
p$V0S0$26)
"Retest_Parm":@FM:err$code)
msg$id =
L$ERROR
End
p$V0S0$27 = (Versions_Keep_Cnt_Parm)
"Versions_Keep_Cnt_Parm", p$V0S0$27)
"Versions_Keep_Cnt_Parm":@FM:err$code)
msg$id =
L$ERROR
End
p$V0S0$28 = (Parm_File_Comma_Parm)
"Parm_File_Comma_Parm", p$V0S0$28)
"Parm_File_Comma_Parm":@FM:err$code)
msg$id =
L$ERROR
End
p$V0S0$29 = (FTP_Server)
err$code = DSSetParam(h$V0S0, "FTP_Server",
p$V0S0$29)
"FTP_Server":@FM:err$code)
☻Page 207 of 243☻

msg$id =
L$ERROR
End
p$V0S0$30 = (FTP_Target_Path)
"FTP_Target_Path", p$V0S0$30)
"FTP_Target_Path":@FM:err$code)
msg$id =
L$ERROR
End
p$V0S0$31 = (FTP_Port)
err$code = DSSetParam(h$V0S0, "FTP_Port",
p$V0S0$31)
"FTP_Port":@FM:err$code)
msg$id =
L$ERROR
End
p$V0S0$32 = (FTP_User)
err$code = DSSetParam(h$V0S0, "FTP_User",
p$V0S0$32)
"FTP_User":@FM:err$code)
msg$id =
L$ERROR
End
p$V0S0$33 = (FTP_Password)
err$code = DSSetParam(h$V0S0, "FTP_Password",
p$V0S0$33)
"FTP_Password":@FM:err$code)
msg$id =
L$ERROR
End
p$V0S0$34 = (Abort_Msg_Parm)
"Abort_Msg_Parm", p$V0S0$34)
"Abort_Msg_Parm":@FM:err$code)
msg$id =
L$ERROR
End
☻Page 208 of 243☻

p$V0S0$35 = (Load_ISDB_Rejects_Parm)
"Load_ISDB_Rejects_Parm", p$V0S0$35)
"Load_ISDB_Rejects_Parm":@FM:err$code)
msg$id =
L$ERROR
End
p$V0S0$36 = (Load_ISDB_Source_Parm)
"Load_ISDB_Source_Parm", p$V0S0$36)
"Load_ISDB_Source_Parm":@FM:err$code)
msg$id =
L$ERROR
End
p$V0S0$37 = (Source_Delimiter_Parm)
"Source_Delimiter_Parm", p$V0S0$37)
"Source_Delimiter_Parm":@FM:err$code)
msg$id =
L$ERROR
End
p$V0S0$38 = (KTGRM_Header_Default_Parm)
"KTGRM_Header_Default_Parm", p$V0S0$38)
"KTGRM_Header_Default_Parm":@FM:err$code)
msg$id =
L$ERROR
End
p$V0S0$39 = (VTWEG_Header_Default_Parm)
"VTWEG_Header_Default_Parm", p$V0S0$39)
"VTWEG_Header_Default_Parm":@FM:err$code)
msg$id =
L$ERROR
End
p$V0S0$40 = (SPART_Header_Default_Parm)
"SPART_Header_Default_Parm", p$V0S0$40)
☻Page 209 of 243☻

"SPART_Header_Default_Parm":@FM:err$code)
msg$id =
L$ERROR
End
p$V0S0$41 = (WAERS_Local_Default_Parm)
"WAERS_Local_Default_Parm", p$V0S0$41)
"WAERS_Local_Default_Parm":@FM:err$code)
msg$id =
L$ERROR
End
p$V0S0$42 = (WAERS_Foreign_Default_Parm)
"WAERS_Foreign_Default_Parm", p$V0S0$42)
"WAERS_Foreign_Default_Parm":@FM:err$code)
msg$id =
L$ERROR
End
p$V0S0$43 = (KURSK_Default_Parm)
"KURSK_Default_Parm", p$V0S0$43)
"KURSK_Default_Parm":@FM:err$code)
msg$id =
L$ERROR
End
p$V0S0$44 = (AUART_Header_Default_Parm)
"AUART_Header_Default_Parm", p$V0S0$44)
"AUART_Header_Default_Parm":@FM:err$code)
msg$id =
L$ERROR
End
p$V0S0$45 = (ZLSCH_Header_Default_Parm)
"ZLSCH_Header_Default_Parm", p$V0S0$45)
"ZLSCH_Header_Default_Parm":@FM:err$code)
☻Page 210 of 243☻

msg$id =
L$ERROR
End
p$V0S0$46 = (ZTERM_Header_Default_Parm)
"ZTERM_Header_Default_Parm", p$V0S0$46)
"ZTERM_Header_Default_Parm":@FM:err$code)
msg$id =
L$ERROR
End
p$V0S0$47 = (WERKS_Header_Default_Parm)
"WERKS_Header_Default_Parm", p$V0S0$47)
"WERKS_Header_Default_Parm":@FM:err$code)
msg$id =
L$ERROR
End
p$V0S0$48 = (VKORG_Header_Default_Parm)
"VKORG_Header_Default_Parm", p$V0S0$48)
"VKORG_Header_Default_Parm":@FM:err$code)
msg$id =
L$ERROR
End
p$V0S0$49 = (TAXM1_Line_Default_Parm)
"TAXM1_Line_Default_Parm", p$V0S0$49)
"TAXM1_Line_Default_Parm":@FM:err$code)
msg$id =
L$ERROR
End
p$V0S0$50 = (VRKME_Line_Default_Parm)
"VRKME_Line_Default_Parm", p$V0S0$50)
"VRKME_Line_Default_Parm":@FM:err$code)
msg$id =
L$ERROR
End
☻Page 211 of 243☻

p$V0S0$51 = (VKGRP_Line_Default_Parm)
"VKGRP_Line_Default_Parm", p$V0S0$51)
"VKGRP_Line_Default_Parm":@FM:err$code)
msg$id =
L$ERROR
End
p$V0S0$52 = (VKBUR_Line_Default_Parm)
"VKBUR_Line_Default_Parm", p$V0S0$52)
"VKBUR_Line_Default_Parm":@FM:err$code)
msg$id =
L$ERROR
End
p$V0S0$53 = (LGORT_Line_Default_Parm)
"LGORT_Line_Default_Parm", p$V0S0$53)
"LGORT_Line_Default_Parm":@FM:err$code)
msg$id =
L$ERROR
End
p$V0S0$54 = (KOSTL_Line_Default_Parm)
"KOSTL_Line_Default_Parm", p$V0S0$54)
"KOSTL_Line_Default_Parm":@FM:err$code)
msg$id =
L$ERROR
End
p$V0S0$55 = (ZZTAXCD_Default_Parm)
"ZZTAXCD_Default_Parm", p$V0S0$55)
"ZZTAXCD_Default_Parm":@FM:err$code)
msg$id =
L$ERROR
End
p$V0S0$56 = (KBETR_Revenue_Default_Parm)
"KBETR_Revenue_Default_Parm", p$V0S0$56)
☻Page 212 of 243☻

"KBETR_Revenue_Default_Parm":@FM:err$code)
msg$id =
L$ERROR
End
p$V0S0$57 = (KSCHL_Revenue_Default_Parm)
"KSCHL_Revenue_Default_Parm", p$V0S0$57)
"KSCHL_Revenue_Default_Parm":@FM:err$code)
msg$id =
L$ERROR
End
p$V0S0$58 = (KSCHL_Surcharge_Default_Parm)
"KSCHL_Surcharge_Default_Parm", p$V0S0$58)
"KSCHL_Surcharge_Default_Parm":@FM:err$code)
msg$id =
L$ERROR
End
p$V0S0$59 = (Reception_Job_Name_Parm)
"Reception_Job_Name_Parm", p$V0S0$59)
"Reception_Job_Name_Parm":@FM:err$code)
msg$id =
L$ERROR
End
p$V0S0$60 = (Invocation_Parm)
"Invocation_Parm", p$V0S0$60)
"Invocation_Parm":@FM:err$code)
msg$id =
L$ERROR
End
p$V0S0$61 = (GUNZIP_Path_Parm)
"GUNZIP_Path_Parm", p$V0S0$61)
"GUNZIP_Path_Parm":@FM:err$code)
☻Page 213 of 243☻

msg$id =
L$ERROR
End
p$V0S0$62 = (Run_Error_Mgmt_Parm)
"Run_Error_Mgmt_Parm", p$V0S0$62)
"Run_Error_Mgmt_Parm":@FM:err$code)
msg$id =
L$ERROR
End
p$V0S0$63 = "Y"
"Run_Move_and_Tidy_Parm", p$V0S0$63)
"Run_Move_and_Tidy_Parm":@FM:err$code)
msg$id =
L$ERROR
End
p$V0S0$64 = (Update_Status_In_DB_Parm)
"Update_Status_In_DB_Parm", p$V0S0$64)
"Update_Status_In_DB_Parm":@FM:err$code)
msg$id =
L$ERROR
End
p$V0S0$65 = (Run_Reconciliation_Parm)
"Run_Reconciliation_Parm", p$V0S0$65)
"Run_Reconciliation_Parm":@FM:err$code)
msg$id =
L$ERROR
End
err$code = DSRunJob(h$V0S0, DSJ.RUNNORMAL)
jb$V0S0:@FM:err$code)
msg$id =
L$ERROR
End
handle$list<-1> = h$V0S0
id$list<-1> = "V0S0"
☻Page 214 of 243☻

Return
*************************************************
*
L$V0S0$FINISHED:
job$V0S0$status = DSGetJobInfo(h$V0S0,
DSJ.JOBSTATUS)
job$V0S0$userstatus = DSGetJobInfo(h$V0S0,
DSJ.USERSTATUS)
%2[E]",
"FR_PARIS_End_to_End_Processing_SAct":@FM:job$V0S
0$status))
IdV0S0%%Result2%%1 = job$V0S0$userstatus
IdV0S0%%Result1%%2 = job$V0S0$status
IdV0S0%%Name%%3 =
"FR_PARIS_End_to_End_Processing_Seq"
rpt$V0S0 = DSMakeJobReport(h$V0S0, 1, "CRLF")
dummy$ = DSDetachJob(h$V0S0)
If b$Abandoning Then GoTo L$WAITFORJOB
b$V0S0else = @True
If (job$V0S0$status = DSJS.RUNOK) Then GoSub
L$V0S61$START; b$V0S0else = @False
If b$V0S0else Then GoSub L$V0S44$START
GoTo L$WAITFORJOB
*************************************************
*
L$V0S10$START:
*** Sequencer "Run_Jobs_Seq": wait until inputs
ready
seq$V0S10$count += 1
If seq$V0S10$count < 1 Then Return
GoSub L$V0S57$START
Return
*************************************************
*
L$V0S2$START:
*** Activity "Set_Job_Parameters_Routine": Call
routine
started",
"Set_Job_Parameters_Routine":@FM:"DSU.SetDSParams
FromFile"))
Call DSLogInfo(DSMakeMsg("ROUTINE -
Set_Job_Parameters_Routine", ""),
"@Set_Job_Parameters_Routine")
rtn$ok =
DSCheckRoutine("DSU.SetDSParamsFromFile")
If (Not(rtn$ok)) Then
msg$ = DSMakeMsg("DSTAGE_JSG_M_0005\BASIC
"DSU.SetDSParamsFromFile")
msg$id = "@Set_Job_Parameters_Routine"; GoTo
L$ERROR
End
☻Page 215 of 243☻

Call 'DSU.SetDSParamsFromFile'(p$V0S2$1,
r$V0S2)
%2", "Set_Job_Parameters_Routine":@FM:r$V0S2))
IdV0S2%%Result1%%4 = r$V0S2
IdV0S2%%Name%%6 = "DSU.SetDSParamsFromFile"
If (r$V0S2 = 0) Then GoSub L$V0S10$START
Return
*************************************************
*
L$V0S43$START:
*** Sequencer "Success": wait until inputs ready
If seq$V0S43$count < 3 Then Return
Call DSLogInfo(DSMakeMsg("SEQUENCER - Control
Sequence Reports a SUCCESS on all Stages", ""),
"@Success")
Return
*************************************************
*
L$V0S44$START:
*** Sequencer "Fail": wait until inputs ready
If seq$V0S44$count > 0 Then Return
Call DSLogInfo(DSMakeMsg("SEQUENCER - Control
Sequence Reports at least one Stage FAILED", ""),
"@Fail")
GoSub L$V0S72$START
Return
*************************************************
*
L$V0S57$START:
*** Activity
"FR_PARIS_Control_Start_Processing_SAct":
Initialize job
started",
"FR_PARIS_Control_Start_Processing_SAct":@FM:"FR_
PARIS_Control_Start_Processing_Seq"))
Call DSLogInfo(DSMakeMsg("SEQUENCE - START
Control_Start_Processes_Seq", ""),
"@FR_PARIS_Control_Start_Processing_SAct")
jb$V0S57 =
"FR_PARIS_Control_Start_Processing_Seq":'.':
(Module_Run_Parm)
msg$id =
"@FR_PARIS_Control_Start_Processing_SAct"; GoTo
L$ERROR
End
☻Page 216 of 243☻

msg$id =
L$ERROR
End
L$V0S57$PREPARED:
p$V0S57$1)
msg$id =
L$ERROR
End
p$V0S57$2)
msg$id =
L$ERROR
End
p$V0S57$3)
msg$id =
L$ERROR
End
msg$id =
L$ERROR
End
☻Page 217 of 243☻

msg$id =
L$ERROR
End
msg$id =
L$ERROR
End
msg$id =
L$ERROR
End
msg$id =
L$ERROR
End
msg$id =
L$ERROR
End
p$V0S57$10)
msg$id =
L$ERROR
End
☻Page 218 of 243☻

p$V0S57$11)
msg$id =
L$ERROR
End
msg$id =
L$ERROR
End
msg$id =
L$ERROR
End
msg$id =
L$ERROR
End
msg$id =
L$ERROR
End
☻Page 219 of 243☻

msg$id =
L$ERROR
End
p$V0S57$17)
msg$id =
L$ERROR
End
p$V0S57$18)
msg$id =
L$ERROR
End
msg$id =
L$ERROR
End
msg$id =
L$ERROR
End
☻Page 220 of 243☻

msg$id =
L$ERROR
End
msg$id =
L$ERROR
End
msg$id =
L$ERROR
End
msg$id =
L$ERROR
End
msg$id =
L$ERROR
End
p$V0S57$26)
msg$id =
L$ERROR
End
☻Page 221 of 243☻

msg$id =
L$ERROR
End
msg$id =
L$ERROR
End
p$V0S57$29)
msg$id =
L$ERROR
End
msg$id =
L$ERROR
End
p$V0S57$31)
msg$id =
L$ERROR
End
p$V0S57$32)
☻Page 222 of 243☻

msg$id =
L$ERROR
End
p$V0S57$33)
msg$id =
L$ERROR
End
msg$id =
L$ERROR
End
msg$id =
L$ERROR
End
msg$id =
L$ERROR
End
☻Page 223 of 243☻

msg$id =
L$ERROR
End
msg$id =
L$ERROR
End
msg$id =
L$ERROR
End
msg$id =
L$ERROR
End
msg$id =
L$ERROR
End
msg$id =
L$ERROR
End
☻Page 224 of 243☻

msg$id =
L$ERROR
End
msg$id =
L$ERROR
End
msg$id =
L$ERROR
End
msg$id =
L$ERROR
End
msg$id =
L$ERROR
End
☻Page 225 of 243☻

msg$id =
L$ERROR
End
msg$id =
L$ERROR
End
msg$id =
L$ERROR
End
msg$id =
L$ERROR
End
msg$id =
L$ERROR
End
☻Page 226 of 243☻

msg$id =
L$ERROR
End
msg$id =
L$ERROR
End
msg$id =
L$ERROR
End
msg$id =
L$ERROR
End
msg$id =
L$ERROR
End
msg$id =
L$ERROR
End
☻Page 227 of 243☻

msg$id =
L$ERROR
End
msg$id =
L$ERROR
End
id$list<-1> = "V0S57"
Return
*************************************************
*
L$V0S57$FINISHED:
DSJ.JOBSTATUS)
DSJ.USERSTATUS)
%2[E]",
"FR_PARIS_Control_Start_Processing_SAct":@FM:job$
V0S57$status))
IdV0S57%%Name%%10 =
"FR_PARIS_Control_Start_Processing_Seq"
rpt$V0S57 = DSMakeJobReport(h$V0S57, 1,
"CRLF")
b$V0S57else = @True
GoTo L$WAITFORJOB
*************************************************
*
L$V0S61$START:
*** Activity
"FR_PARIS_Control_End_Processing_SAct":
Initialize job
started",
☻Page 228 of 243☻

"FR_PARIS_Control_End_Processing_SAct":@FM:"FR_PA
RIS_Control_End_Processing_Seq"))
Call DSLogInfo(DSMakeMsg("SEQUENCE START -
Control_End_Processing_Seq", ""),
"@FR_PARIS_Control_End_Processing_SAct")
jb$V0S61 =
"FR_PARIS_Control_End_Processing_Seq":'.':
(Module_Run_Parm)
msg$id =
"@FR_PARIS_Control_End_Processing_SAct"; GoTo
L$ERROR
End
msg$id =
L$ERROR
End
L$V0S61$PREPARED:
p$V0S61$1)
msg$id =
L$ERROR
End
p$V0S61$2)
msg$id =
L$ERROR
End
p$V0S61$3)
msg$id =
L$ERROR
☻Page 229 of 243☻

End
msg$id =
L$ERROR
End
msg$id =
L$ERROR
End
msg$id =
L$ERROR
End
msg$id =
L$ERROR
End
msg$id =
L$ERROR
End
☻Page 230 of 243☻

msg$id =
L$ERROR
End
p$V0S61$10)
msg$id =
L$ERROR
End
p$V0S61$11)
msg$id =
L$ERROR
End
msg$id =
L$ERROR
End
msg$id =
L$ERROR
End
☻Page 231 of 243☻

msg$id =
L$ERROR
End
msg$id =
L$ERROR
End
msg$id =
L$ERROR
End
p$V0S61$17)
msg$id =
L$ERROR
End
p$V0S61$18)
msg$id =
L$ERROR
End
msg$id =
L$ERROR
End
☻Page 232 of 243☻

msg$id =
L$ERROR
End
msg$id =
L$ERROR
End
msg$id =
L$ERROR
End
msg$id =
L$ERROR
End
msg$id =
L$ERROR
End
☻Page 233 of 243☻

msg$id =
L$ERROR
End
p$V0S61$26)
msg$id =
L$ERROR
End
msg$id =
L$ERROR
End
msg$id =
L$ERROR
End
p$V0S61$29)
msg$id =
L$ERROR
End
☻Page 234 of 243☻

msg$id =
L$ERROR
End
p$V0S61$31)
msg$id =
L$ERROR
End
p$V0S61$32)
msg$id =
L$ERROR
End
p$V0S61$33)
msg$id =
L$ERROR
End
msg$id =
L$ERROR
End
msg$id =
L$ERROR
End
☻Page 235 of 243☻

msg$id =
L$ERROR
End
msg$id =
L$ERROR
End
msg$id =
L$ERROR
End
msg$id =
L$ERROR
End
msg$id =
L$ERROR
End
☻Page 236 of 243☻

msg$id =
L$ERROR
End
msg$id =
L$ERROR
End
msg$id =
L$ERROR
End
msg$id =
L$ERROR
End
msg$id =
L$ERROR
End
☻Page 237 of 243☻

msg$id =
L$ERROR
End
msg$id =
L$ERROR
End
msg$id =
L$ERROR
End
msg$id =
L$ERROR
End
msg$id =
L$ERROR
End
msg$id =
L$ERROR
End
☻Page 238 of 243☻

msg$id =
L$ERROR
End
msg$id =
L$ERROR
End
msg$id =
L$ERROR
End
msg$id =
L$ERROR
End
msg$id =
L$ERROR
End
☻Page 239 of 243☻

msg$id =
L$ERROR
End
msg$id =
L$ERROR
End
msg$id =
L$ERROR
End
p$V0S61$60 = (Run_Reconciliation_Parm)
"Run_Reconciliation_Parm", p$V0S61$60)
"Run_Reconciliation_Parm":@FM:err$code)
msg$id =
L$ERROR
End
p$V0S61$61 = (Run_Error_Mgmt_Parm)
"Run_Error_Mgmt_Parm", p$V0S61$61)
"Run_Error_Mgmt_Parm":@FM:err$code)
msg$id =
L$ERROR
End
p$V0S61$62 = (Update_Status_In_DB_Parm)
"Update_Status_In_DB_Parm", p$V0S61$62)
"Update_Status_In_DB_Parm":@FM:err$code)
☻Page 240 of 243☻

msg$id =
L$ERROR
End
p$V0S61$63 = (Load_ISDB_Cmn_Fmt_Parm)
"Load_ISDB_Cmn_Fmt_Parm", p$V0S61$63)
"Load_ISDB_Cmn_Fmt_Parm":@FM:err$code)
msg$id =
L$ERROR
End
msg$id =
L$ERROR
End
id$list<-1> = "V0S61"
Return
*************************************************
*
L$V0S61$FINISHED:
DSJ.JOBSTATUS)
DSJ.USERSTATUS)
%2[E]",
"FR_PARIS_Control_End_Processing_SAct":@FM:job$V0
S61$status))
IdV0S61%%Name%%13 =
"FR_PARIS_Control_End_Processing_Seq"
rpt$V0S61 = DSMakeJobReport(h$V0S61, 1,
"CRLF")
b$V0S61else = @True
GoTo L$WAITFORJOB
*************************************************
*
L$V0S72$START:
*** Activity "Abort_RAct": Call routine
☻Page 241 of 243☻

started",
"Abort_RAct":@FM:"DSX.UTILITYABORTTOLOG"))
rtn$ok =
DSCheckRoutine("DSX.UTILITYABORTTOLOG")
If (Not(rtn$ok)) Then
msg$ = DSMakeMsg("DSTAGE_JSG_M_0005\BASIC
"DSX.UTILITYABORTTOLOG")
msg$id = "@Abort_RAct"; GoTo L$ERROR
End
Call 'DSX.UTILITYABORTTOLOG'(r$V0S72,
p$V0S72$1)
%2", "Abort_RAct":@FM:r$V0S72))
IdV0S72%%Result1%%14 = r$V0S72
IdV0S72%%Name%%16 = "DSX.UTILITYABORTTOLOG"
Return
*************************************************
*
L$WAITFORJOB:
If handle$list = "" Then GoTo L$FINISH
handle$ = DSWaitForJob(handle$list)
If handle$ = 0 Then handle$ = handle$list<1>
Locate handle$ In handle$list Setting index$
Then
id$ = id$list<index$>
b$Abandoning = abort$list<index$>
Del id$list<index$>; Del
handle$list<index$>; Del abort$list<index$>
Begin Case
Case id$ = "V0S0"
GoTo L$V0S0$FINISHED
Case id$ = "V0S57"
Case id$ = "V0S61"
End Case
End
* Error if fall though
handle$list = ""
calling DSWaitForJob(), code=%1[E]", handle$)
msg$id = "@Coordinator"; GoTo L$ERROR
*************************************************
*
L$ERROR:
Call
DSLogWarn(DSMakeMsg("DSTAGE_JSG_M_0009\Controller
problem: %1", msg$), msg$id)
DSMakeMsg("DSTAGE_JSG_M_0052\Exception raised:
%1", msg$id:", ":msg$))
abort$list = Ifs(handle$list, Str(1:@FM,
DCount(handle$list, @FM)), "")
b$Abandoning = @True
GoTo L$WAITFORJOB
L$ABORT:
☻Page 242 of 243☻

DSMakeMsg("DSTAGE_JSG_M_0056\Sequence failed",
""))
Call DSLogInfo(summary$, "@Coordinator")
Call
DSLogFatal(DSMakeMsg("DSTAGE_JSG_M_0013\Sequence
job will abort due to previous unrecoverable
errors", ""), "@Coordinator")
*************************************************
*
L$FINISH:
If b$Abandoning Then GoTo L$ABORT
If Not(b$AllStarted) Then Return
DSMakeMsg("DSTAGE_JSG_M_0054\Sequence finished
OK", ""))
Call DSLogInfo(summary$, "@Coordinator")
L$EXIT: Return To L$EXIT
☻Page 243 of 243☻

DataStage FAQ

Uploaded by

Copyright:

Available Formats

DataStage FAQ

Uploaded by

Document Information

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

DataStage FAQ

Uploaded by

Copyright:

Available Formats

DATASTAGE FAQ’s & TUTORIAL’s TOPIC INDEX

3. What are types of Hashed File?

By Default Hashed file is "Dynamic - Type Random 30 D"

4. What does a Config File in parallel extender consist of?

5. What is Modulus and Splitting in Dynamic Hashed File?

6. What are Stage Variables, Derivations and Constants?

7. Types of views in Datastage Director?

8. Types of Parallel Processing?

9. Orchestrate Vs Datastage Parallel Extender?

10. Importance of Surrogate Key in Data warehousing?

Function to convert mm/dd/yyyy format to yyyy-dd-mm is

13 How do you execute datastage job from command line prompt?

14. Functionality of Link Partitioner and Link Collector?

15. Types of Dimensional Modeling?

16. Differentiate Primary Key and Partition Key?

17. Differentiate Database data and Data warehouse data?

18. Containers Usage and Types?

19. Compare and Contrast ODBC and Plug-In stages?

20. Dimension Modelling types along with their significance

Q 21 What are Ascential Dastastage Products, Connectivity

Q 22 Explain Data Stage Architecture?

Data Stage Administrator:

We can set the buffer size (by default 128 MB)

Data Stage Manager:

Data Stage Director:

Q 23 What is Meta Data Repository?

Q 24 What is Data Stage Engine?

Q 25 What is Dimensional Modeling?

Q 26 What is Star Schema?

Q 27 What is surrogate Key?

Q 28 Why we need surrogate key?

Q 29 What is Snowflake schema?

Q 30 Explain Types of Fact Tables?

Q 31 Explain the Types of Dimension Tables?

Q 32 What are stage variables?

Q 36 What are Macros?

These macros provide the functionality of using the DSGetProjectInfo, DSGetJobInfo,

The available macros are: DSHostName

2) To obtain the name of the current job:

To obtain the full current stage name:

MyName = DSJobName : ″ .″ : DSStageName

Q 38 What are stages?

Q 39 What index is created on Data Warehouse?

To do this ... Use this function ...

Get a list of stages of a particular type in a job. DSGetStagesOfType

Get information about the types of stage in a job. DSGetStageTypes

Get information about a link in a controlled job or current job DSGetLinkInfo

Get information about a controlled job’s parameters DSGetParamInfo

Get the log event from the job log DSGetLogEntry

Log an event to the job log of a different job DSLogEvent

Log an information message in a job's log file. DSLogInfo

Ensure a job is in the correct state to be run or validated. DSPrepareJob

Interface to system send mail facility. DSSendMail

Checks if a BASIC routine is cataloged, either in VOC as a DSCheckRoutine

Q 43 What is data stage Transform?

Question: What are Stage Variables, Derivations and Constants?

Question: What is Hash file stage and what is it used for?

Question: What are types of Hashed File?

Question: What is the Usage of Containers? What are its types?

Question: Compare and Contrast ODBC and Plug-In stages?