Transformation
A transformation is a repository object that reads data, modifies it, and passes it on. Transformations in a mapping represent the operations that the Integration Service performs on the data.
Active Transformations:
A transformation is called an active transformation if it performs any of the following actions:
Change the number of rows: For example, the filter transformation is active because it
removes the rows that do not meet the filter condition. All multi-group transformations
are active because they might change the number of rows that pass through the
transformation.
Change the transaction boundary: The transaction control transformation is active because it defines commit and rollback points for a transaction.
Change the row type: Update strategy is active because it flags the rows for insert, delete,
update or reject.
Passive Transformations:
Transformations that do not change the number of rows passing through them and that maintain the transaction boundary and row type are called passive transformations.
Connected Transformations:
Transformations that are connected to other transformations in a mapping are called connected transformations.
Unconnected Transformations:
An unconnected transformation is not connected to other transformations in the mapping; it is called from within another transformation and returns a value to that transformation.
Source Qualifier Transformation
The source qualifier transformation is an active, connected transformation used to represent the rows that the Integration Service reads when it runs a session. You need to connect the source qualifier transformation to the relational or flat file definition in a mapping. The source qualifier transformation converts the source data types to the Informatica native data types, so you should not alter the data types of the ports in the source qualifier transformation.
Joins: You can join two or more tables from the same source database. By default the
sources are joined based on the primary key-foreign key relationships. This can be
changed by explicitly specifying the join condition in the "user-defined join" property.
Filter rows: You can filter the rows from the source database. The integration service
adds a WHERE clause to the default query.
Sorting input: You can sort the source data by specifying the number of sorted ports. The Integration Service adds an ORDER BY clause to the default SQL query.
Distinct rows: You can get distinct rows from the source by choosing the "Select
Distinct" property. The Integration Service adds a SELECT DISTINCT statement to the
default SQL query.
Custom SQL Query: You can write your own SQL query to do calculations.
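As an illustration (the EMPLOYEES source and its column names are assumed here, not taken from a specific example in this document), if you connect the EMPLOYEE_ID, NAME and SALARY ports, set the Source Filter to SALARY > 1000, set the Number of Sorted Ports to 1 and enable Select Distinct, the default query generated by the Integration Service would look roughly like this:
SELECT DISTINCT EMPLOYEES.EMPLOYEE_ID, EMPLOYEES.NAME, EMPLOYEES.SALARY
FROM EMPLOYEES
WHERE EMPLOYEES.SALARY > 1000
ORDER BY EMPLOYEES.EMPLOYEE_ID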
Source qualifier transformation properties:
SQL Query: Specifies a custom query which replaces the default query.
User-Defined Join: Condition used for joining multiple sources.
Source Filter: Specifies the filter condition the Integration Service applies when querying rows.
Number of Sorted Ports: Used for sorting the source data.
Tracing Level: Sets the amount of detail included in the session log when you run a session containing this transformation.
Select Distinct: Selects only unique rows from the source.
Pre-SQL: Pre-session SQL commands to run against the source database before the Integration Service reads the source.
Post-SQL: Post-session SQL commands to run against the source database after the Integration Service writes to the target.
Output is Deterministic: Specify only when the source output does not change between session runs.
Output is Repeatable: Specify only when the order of the source output is the same between session runs.
Note: For flat file source definitions, all the properties except the Tracing level will be disabled.
For relational sources, the Integration Service generates a query for each source qualifier transformation when it runs a session. To view the default query that is generated, follow the steps below:
Go to the Properties tab and select the "SQL Query" property. Then open the SQL Editor, select the ODBC data source, and enter the username and password.
Click Generate SQL.
Click Cancel to exit.
You can write your own SQL query rather than relying on the default query for performing calculations.
Note: You can generate the SQL query only if the output ports of the source qualifier transformation are connected to another transformation in the mapping. The generated SQL query contains only the columns or ports that are connected to downstream transformations.
Specifying the "Source Filter, Number Of Sorted Ports and Select Distinct" properties:
Follow the below steps for specifying the filter condition, sorting the source data and for
selecting the distinct rows.
Joins:
The source qualifier transformation can be used to join sources from the same database. By default it joins the sources based on the primary key-foreign key relationships. To join heterogeneous sources, use the Joiner transformation.
A foreign-key is created on the department_id column of the employees table, which references
the primary-key column, department_id, of the departments table.
Follow the steps below to see the default join:
Create only one source qualifier transformation for both the employees and departments.
Go to the properties tab of the source qualifier transformation, select the "SQL QUERY"
property and generate the SQL query.
There might be cases where there is no relationship between the sources. In that case, we need to override the default join by specifying the join condition in the "User Defined Join" property. Using this property, we can also specify outer joins. The join conditions entered here are database specific.
As an example, if we want to join the employees and departments table on the manager_id
column, then in the "User Defined Join" property specify the join condition as
"departments.manager_id=employees.manager_id". Now generate the SQL and observe the
WHERE clause.
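A rough sketch of the query generated with this user-defined join (the column list is shortened for readability) could look like the following; the join condition is appended to the WHERE clause:
SELECT employees.employee_id, employees.manager_id, departments.department_name
FROM employees, departments
WHERE departments.manager_id=employees.manager_id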
You can add Pre-SQL and Post-SQL commands. The Integration Service runs the Pre-SQL commands against the source database before it reads the source, and the Post-SQL commands after it writes to the target.
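For example, assuming the source database contains a staging table STG_EMPLOYEES and an audit table ETL_AUDIT (both hypothetical, used only for illustration), you could configure:
Pre-SQL: DELETE FROM STG_EMPLOYEES WHERE LOAD_DATE < SYSDATE - 7
Post-SQL: UPDATE ETL_AUDIT SET LAST_RUN = SYSDATE WHERE JOB_NAME = 'EMP_LOAD'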
Filter Transformation
Filter transformation is an active and connected transformation used to filter rows based on a filter condition. A filter condition of FALSE will drop all the records from processing and load nothing to the target. FALSE can also be written as ZERO (0).
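For example, assuming the filter transformation has ports named DEPT_ID and SALARY (hypothetical names), a filter condition such as the one below passes only the rows that satisfy it, while a condition of FALSE or 0 passes no rows at all:
DEPT_ID = 10 AND SALARY > 2000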
You can configure the following tabs in the filter transformation:
Transformation: You can enter the name and description of the transformation.
Ports: Create new ports and configure them
Properties: You can specify the filter condition to filter the rows. You can also configure
the tracing levels.
Metadata Extensions: Specify the metadata details like name, datatype etc.
Configure the port properties on the Ports tab of the filter transformation. Follow the below guidelines for better performance:
Use the filter transformation as close as possible to the sources in the mapping. This will
reduce the number of rows to be processed in the downstream transformations.
In case of relational sources, if possible use the source qualifier transformation to filter
the rows. This will reduce the number of rows to be read from the source.
Note: The input ports to the filter transformation must come from a single transformation. You cannot connect ports from more than one transformation to the filter.
Sorter Transformation
Sorter transformation is an active and connected transformation used to sort data. The data can be sorted in ascending or descending order by specifying the sort key. You can specify one or more ports as the sort key and configure each sort key port to sort in ascending or descending order. You can also configure the order in which the Integration Service applies the sort key ports when sorting the data.
The sorter transformation is used to sort the data from relational or flat file sources. The sorter
transformation can also be used for case-sensitive sorting and can be used to specify whether the
output rows should be distinct or not.
Case Sensitive: The integration service considers the string case when sorting the data.
The integration service sorts the uppercase characters higher than the lowercase
characters.
Work Directory: The integration service creates temporary files in the work directory
when it is sorting the data. After the integration service sorts the data, it deletes the
temporary files.
Distinct Output Rows: The integration service produces distinct rows in the output when
this option is configured.
Tracing Level: Configures the amount of detail to be logged in the session log file.
Null Treated Low: Enable this property to treat null values as lower than any other value when performing the sort. When disabled, the Integration Service treats null values as higher than any other value.
Sorter Cache Size: The Integration Service uses the sorter cache size property to determine the amount of memory it can allocate to perform the sort operation.
Use the sorter transformation before the aggregator and joiner transformation and sort the data
for better performance.
Aggregator Transformation
Aggregator transformation is an active and connected transformation used to perform calculations such as sums, averages and counts on groups of data. The Integration Service stores the group values and row data in the aggregate cache. The Aggregator transformation provides more functionality than SQL GROUP BY, since you can use conditional clauses to filter rows.
Aggregate Cache: The integration service stores the group values in the index cache and
row data in the data cache.
Aggregate Expression: You can enter expressions in the output port or variable port.
Group by Port: This tells the integration service how to create groups. You can configure
input, input/output or variable ports for the group.
Sorted Input: This option can be used to improve session performance. You can use this option only when the input to the aggregator transformation is sorted on the group by ports.
Aggregator transformation properties:
Cache Directory: Directory where the Integration Service creates the index and data cache files.
Tracing Level: Amount of detail displayed in the session log for this transformation.
Sorted Input: Indicates input data is already sorted by groups. Select this option only if the input to the Aggregator transformation is sorted.
Aggregator Data Cache Size: Default cache size is 2,000,000 bytes. The data cache stores row data.
Aggregator Index Cache Size: Default cache size is 1,000,000 bytes. The index cache stores group by ports data.
Transformation Scope: Specifies how the Integration Service applies the transformation logic to incoming data.
Group By Ports:
The integration service performs aggregate calculations and produces one row for each group. If
you do not specify any group by ports, the integration service returns one row for all input rows.
By default, the Integration Service returns the last row received for each group along with the result of the aggregation. By using the FIRST function, you can instruct the Integration Service to return the first row of the group.
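Example: assuming DEPT_ID is the group by port and SALARY and EMP_NAME are input ports (hypothetical names), SUM(SALARY) returns the total salary for each department, and FIRST(EMP_NAME) returns the employee name from the first row of each group instead of the last.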
Aggregate Expressions:
This transformation offers even more functionality than SQL’s group by statements since one
can apply conditional logic to groups within the aggregator transformation. Many different
aggregate functions can be applied to individual output ports within the transformation. One is
also able to code nested aggregate functions as well. Below is a list of these aggregate functions:
AVG
COUNT
FIRST
LAST
MAX
MEDIAN
MIN
PERCENTILE
STDDEV
SUM
VARIANCE
Conditional clauses
You can reduce the number of rows processed in the aggregation by specifying a conditional
clause.
Example: SUM(salary, salary >1000)
This will include only the salaries which are greater than 1000 in the SUM calculation.
Non-aggregate functions
You can also use non-aggregate functions in aggregator transformation.
Example: IIF( SUM(sales) <20000, SUM(sales),0)
Note: By default, the Integration Service treats null values as NULL in aggregate functions. You
can change this by configuring the integration service.
Sorted Input:
You can improve the performance of aggregator transformation by specifying the sorted input.
The Integration Service assumes all the data is sorted by group and performs aggregate calculations as it reads rows for a group. If you specify the sorted input option without actually sorting the data, the Integration Service fails the session.
Router Transformation
Router transformation is an active and connected transformation. It is similar to the filter transformation, which is used to test a condition and filter data. In a filter transformation, you can specify only one condition, and the transformation drops the rows that do not satisfy the condition. In a router transformation, however, you can specify more than one condition, and it routes the data that meets each condition. Use a router transformation if you need to test the same input data against multiple conditions.
The router transformation has input and output groups. You need to configure these groups.
Input group: The designer copies the input port properties to create a corresponding set of output ports for each output group.
Output groups: Router transformation has two output groups. They are user-defined
groups and default group.
User-defined groups: Create a user-defined group to test a condition based on the incoming
data. Each user-defined group consists of output ports and a group filter condition. You can
create or modify the user-defined groups on the groups tab. Create one user-defined group for
each condition you want to specify.
Default group: The designer creates the default group when you create at least one user-defined group. You cannot edit or delete the default group. The default group does not have a group filter condition. If all the conditions evaluate to FALSE, the Integration Service passes the row to the default group.
Specify the group filter condition on the groups tab using the expression editor. You can enter
any expression that returns a single value. The group filter condition returns TRUE or FALSE
for each row that passes through the transformation.
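For example, assuming the input rows carry a dept_id port (a hypothetical name), you could create two user-defined groups with the group filter conditions dept_id = 10 and dept_id = 20; rows with any other department id are passed to the default group.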
Use a router transformation to test multiple conditions on the same input data. If you use more than one filter transformation, the Integration Service needs to process the input for each filter transformation. With a router transformation, the Integration Service processes the input data only once, thereby improving performance.
Union Transformation
Union transformation is an active and connected transformation. It is a multi-input group transformation used to merge data from multiple pipelines into a single pipeline. Basically, it merges data from multiple sources just like the UNION ALL set operator in SQL. The union transformation does not remove any duplicate rows.
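In SQL terms (a rough analogy, not a query the Integration Service actually runs, and with hypothetical table names), the union transformation behaves like UNION ALL rather than UNION:
SELECT employee_id, name FROM employees_us
UNION ALL
SELECT employee_id, name FROM employees_uk
The UNION ALL operator keeps duplicate rows, just like the union transformation; the UNION operator would remove them, which the union transformation does not do.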
The following rules and guidelines should be followed when using a union transformation in a mapping:
Union transformation contains only one output group and can have multiple input groups.
The input groups and output groups should have matching ports. The datatype, precision
and scale must be same.
Union transformation does not remove duplicates. To remove the duplicate rows use
sorter transformation with "select distinct" option after the union transformation.
The union transformation does not generate transactions.
You cannot connect a sequence generator transformation to the union transformation.
Note: The ports tab displays the groups and ports you create. You cannot edit the port or group information in the ports tab. To make changes, use the groups tab and the group ports tab.
Union is an active transformation because it combines two or more data streams into one.
Though the total number of rows passing into the Union is the same as the total number of rows
passing out of it, and the sequence of rows from any given input stream is preserved in the
output, the positions of the rows are not preserved, i.e. row number 1 from input stream 1 might
not be row number 1 in the output stream. Union does not even guarantee that the output is
repeatable.
Sequence Generator Transformation
Sequence generator transformation is a passive and connected transformation used to generate numeric sequence values. The following properties can be configured:
Start Value: The start value of the generated sequence that you want the Integration Service to use if you enable the Cycle option. If you select Cycle, the Integration Service cycles back to this value when it reaches the end value.
Increment By: Difference between two consecutive values from the NEXTVAL port.
Default value is 1. Maximum value you can specify is 2,147,483,647.
End Value: Maximum sequence value the integration service generates. If the integration
service reaches this value during the session and the sequence is not configured to cycle,
the session fails. Maximum value is 9,223,372,036,854,775,807.
Current Value: Current Value of the sequence. This value is used as the first value in the
sequence. If cycle option is configured, then this value must be greater than or equal to
start value and less than end value.
Cycle: The integration service cycles through the sequence range.
Number of Cached Values: Number of sequential values the integration service caches
at a time. Use this option when multiple sessions use the same reusable generator. Default
value for non-reusable sequence generator is 0 and reusable sequence generator is 1000.
Maximum value is 9,223,372,036,854,775,807.
Reset: The Integration Service generates values based on the original current value for each session. Otherwise, the Integration Service updates the current value to reflect the last-generated value for the session plus one.
Tracing level: The level of detail to be logged in the session log file.
The sequence generator transformation contains only two output ports: NEXTVAL and CURRVAL.
NEXTVAL Port:
You can connect the NEXTVAL port to multiple transformations to generate unique values for each row in the transformation. The NEXTVAL port generates the sequence numbers based on the Current Value and Increment By properties. If the sequence generator is not configured to cycle, the NEXTVAL port generates sequence numbers up to the configured End Value.
The sequence generator transformation generates a block of numbers at a time. Once the block of numbers is used, it generates the next block of sequence numbers. As an example, let's say you connected the NEXTVAL port to two targets in a mapping; the Integration Service generates a block of numbers (e.g. 1 to 10) for the first target and then another block of numbers (e.g. 11 to 20) for the second target.
If you want the same sequence values to be generated for more than one target, connect the sequence generator to an expression transformation and connect the expression transformation ports to the targets. Another option is to create a sequence generator transformation for each target.
CURRVAL Port:
The CURRVAL is the NEXTVAL plus the Increment By value. You rarely connect the
CURRVAL port to other transformations. When a row enters a transformation connected to the
CURRVAL port, the integration service passes the NEXTVAL value plus the Increment By
value. For example, when you configure the Current Value=1 and Increment By=1, then the
integration service generates the following values for NEXTVAL and CURRVAL ports.
NEXTVAL CURRVAL
---------------
1 2
2 3
3 4
4 5
5 6
If you connect only the CURRVAL port without connecting the NEXTVAL port, then the
integration service passes a constant value for each row.
RANK Transformation
Rank transformation is an active and connected transformation. The rank transformation is used to select the top or bottom rank of data, that is, the largest or smallest numeric or string values. The Integration Service caches the input data and then performs the rank calculations.
Cache Directory: Directory where the integration service creates the index and data
cache files.
Top/Bottom: Specify whether you want to select the top or bottom rank of data.
Number of Ranks: Specify the number of rows you want to rank.
Case-Sensitive String Comparison: Specifies whether string comparisons are case sensitive when ranking string values.
Tracing Level: Amount of detail to be logged in the session log file.
Rank Data Cache Size: The data cache size default value is 2,000,000 bytes. You can
set a numeric value, or Auto for the data cache size. In case of Auto, the Integration
Service determines the cache size at runtime.
Rank Index Cache Size: The index cache size default value is 1,000,000 bytes. You can
set a numeric value, or Auto for the index cache size. In case of Auto, the Integration
Service determines the cache size at runtime.
Transaction Control Transformation
Transaction control transformation is an active and connected transformation which lets you commit or roll back transactions during the execution of a mapping. In Informatica PowerCenter, you can define transactions at the following levels:
Mapping level: Use the transaction control transformation to define the transactions.
Session level: You can specify the "Commit Type" option in the session properties tab. The different options of "Commit Type" are Target, Source and User Defined. If you have used the transaction control transformation in the mapping, then the "Commit Type" will always be "User Defined".
When you run a session, the Integration Service evaluates the expression for each row in the transaction control transformation. When it evaluates the expression as commit, it commits all the rows in the transaction to the target(s). When it evaluates the expression as rollback, it rolls back all the rows in the transaction from the target(s).
When you have a flat file as the target, the Integration Service creates an output file each time it commits a transaction. You can dynamically name the target flat files. Look at the example for creating flat files dynamically - Dynamic flat file creation.
You can configure the following components in the transaction control transformation:
Transformation Tab: You can rename the transformation and add a description.
Ports Tab: You can create input/output ports
Properties Tab: You can define the transaction control expression and tracing level.
Metadata Extensions Tab: You can add metadata information.
You can enter the transaction control expression in the Transaction Control Condition option in
the properties tab. The transaction control expression uses the IIF function to test each row
against the condition. Use the following syntax for the expression
Syntax: IIF(condition, value1, value2)
Example:
IIF(dept_id=10, TC_COMMIT_BEFORE,TC_ROLLBACK_BEFORE)
Use the following built-in variables in the expression editor of the transaction control transformation: TC_CONTINUE_TRANSACTION, TC_COMMIT_BEFORE, TC_COMMIT_AFTER, TC_ROLLBACK_BEFORE, TC_ROLLBACK_AFTER.
If the transaction control expression evaluates to a value other than commit, rollback, or continue, the Integration Service fails the session.
Use the following rules and guidelines when you create a mapping with a Transaction Control
transformation:
If the mapping includes an XML target, and you choose to append or create a new
document on commit, the input groups must receive data from the same transaction
control point.
Transaction Control transformations connected to any target other than relational, XML,
or dynamic MQSeries targets are ineffective for those targets.
You must connect each target instance to a Transaction Control transformation.
You can connect multiple targets to a single Transaction Control transformation.
You can connect only one effective Transaction Control transformation to a target.
You cannot place a Transaction Control transformation in a pipeline branch that starts
with a Sequence Generator transformation.
If you use a dynamic Lookup transformation and a Transaction Control transformation in
the same mapping, a rolled-back transaction might result in unsynchronized target data.
A Transaction Control transformation may be effective for one target and ineffective for
another target. If each target is connected to an effective Transaction Control
transformation, the mapping is valid.
Either all targets or none of the targets in the mapping should be connected to an effective
Transaction Control transformation.
Update Strategy Transformation
Update strategy transformation is an active and connected transformation used to flag rows for insert, update, delete or reject.
When you want to maintain the history of the source in the target table, then for every change in a source record you insert a new record in the target table.
When you want an exact copy of the source data to be maintained in the target table, then if the source data changes you have to update the corresponding records in the target.
The design of the target table decides how to handle the changes to existing rows. In Informatica, you can set the update strategy at two different levels:
Session Level: Configuring at session level instructs the integration service to either treat
all rows in the same way (Insert or update or delete) or use instructions coded in the
session mapping to flag for different database operations.
Mapping Level: Use the update strategy transformation to flag rows for insert, update, delete or reject.
You have to flag each row for insert, update, delete or reject. The constants and their numeric equivalents for each database operation are DD_INSERT (0), DD_UPDATE (1), DD_DELETE (2) and DD_REJECT (3).
You have to flag rows by assigning the constant numeric values using the update strategy
expression. The update strategy expression property is available in the properties tab of the
update strategy transformation.
Each row is tested against the condition specified in the update strategy expression and a constant value is assigned to it. A sample expression is shown below:
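For illustration, assuming the mapping has a lookup port lkp_employee_id (a hypothetical name) that is NULL when the row does not yet exist in the target:
IIF(ISNULL(lkp_employee_id), DD_INSERT, DD_UPDATE)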
Mostly IIF and DECODE functions are used to test for a condition in update strategy
transformation.
Update strategy transformation is used mostly with the lookup transformation. The row from the source qualifier is compared with the row from the lookup transformation to determine whether it already exists or is a new record. Based on this comparison, the row is flagged for insert or update using the update strategy transformation.
If you place an update strategy before an aggregator transformation, the way the aggregator
transformation performs aggregate calculations depends on the flagging of the row. For example,
if you flag a row for delete and then later use the row to calculate the sum, then the integration
service subtracts the value appearing in this row. If it’s flagged for insert, then the aggregator
adds its value to the sum.
Important Note:
Update strategy works only when we have a primary key on the target table. If there is no
primary key available on the target table, then you have to specify a primary key in the target
definition in the mapping for update strategy transformation to work.
Lookup Transformation in Informatica
Lookup transformation is used to look up data in a flat file, relational table, view or synonym.
Lookup is a passive or active transformation and can be used in both connected and unconnected modes. From Informatica version 9 onwards, the lookup can be configured as an active transformation. The lookup transformation can return a single row or multiple rows.
You can import the definition of lookup from any flat file or relational database or even from a
source qualifier. The integration service queries the lookup source based on the ports, lookup
condition and returns the result to other transformations or target in the mapping.
Get a Related Value: You can get a value from the lookup table based on the source
value. As an example, we can get the related value like city name for the zip code value.
Get Multiple Values: You can get multiple rows from a lookup table. As an example,
get all the states in a country.
Perform Calculation. We can use the value from the lookup table and use it in
calculations.
Update Slowly Changing Dimension tables: Lookup transformation can be used to
determine whether a row exists in the target or not.
You can configure the lookup transformation in the following types of lookup:
Flat File or Relational lookup: You can perform the lookup on the flat file or relational
database. When you create a lookup using flat file as lookup source, the designer invokes
flat file wizard. If you used relational table as lookup source, then you can connect to the
lookup source using ODBC and import the table definition.
Pipeline Lookup: You can perform lookup on application sources such as JMS, MSMQ
or SAP. You have to drag the source into the mapping and associate the lookup
transformation with the source qualifier. Improve the performance by configuring
partitions to retrieve source data for the lookup cache.
Connected or Unconnected lookup: A connected lookup receives source data, performs
a lookup and returns data to the pipeline. An unconnected lookup is not connected to
source or target or any other transformation. A transformation in the pipeline calls the
lookup transformation with the :LKP expression. The unconnected lookup returns one
column to the calling transformation.
Cached or Uncached Lookup: You can improve the performance of the lookup by caching the lookup source. If you cache the lookup source, you can use a dynamic or static cache. By default, the lookup cache is static and the cache does not change during the session. If you use a dynamic cache, the Integration Service inserts or updates rows in the cache. You can look up values in the cache to determine whether they exist in the target and then mark the row for insert or update in the target.
Cache the lookup transformation: This queries the lookup source once and stores the data in the cache. Whenever a row enters the lookup, the lookup retrieves the data from the cache rather than querying the lookup source again. This greatly improves lookup performance.
Restrict ORDER BY columns: By default, the Integration Service generates an ORDER BY clause on all ports in the lookup transformation. Override this default ORDER BY clause to include only the required ports in the lookup.
Persistent Cache: If your lookup source is not going to change at all (for example, countries or zip codes), use a persistent cache.
Prefer Static Cache over Dynamic Cache: If you use a dynamic cache, the lookup may update the cache. Updating the lookup cache adds overhead, so avoid a dynamic cache where it is not required.
Restrict the number of lookup ports: Make sure that you include only the required ports in the lookup transformation. Unnecessary ports make the lookup take longer when querying the lookup source and building the lookup cache.
Sort the flat file lookups: If the lookup source is a flat file, using the sorted input option
improves the performance.
Indexing the columns: If you have used any columns in the lookup condition, creating an index on these columns (in the case of relational lookups) improves the performance of querying the lookup source.
Database level tuning: For relational lookups you can improve the performance by doing some
tuning at database level.
Lookup Transformation is Active from Informatica version 9.x
One of the changes made in Informatica version 9 was allowing the lookup transformation to be configured as an active transformation. An active lookup transformation can return all the matching rows.
When creating the lookup transformation, you have to specify whether it returns multiple rows or not. Once you make the lookup transformation an active transformation, you cannot change it back to a passive transformation. The "Lookup Policy on Multiple Match" property value becomes "Use All Values". This property becomes read-only and you cannot change it.
As an example, for each country you can configure the lookup transformation to return all the states in that country. You can cache the lookup table to improve performance. If you configure the lookup transformation for caching, the Integration Service caches all the rows from the lookup source and caches all rows for a lookup key by the key index.
Follow the below guidelines when you configure the lookup transformation to return multiple
rows:
You can cache all the rows from the lookup source for cached lookups.
You can customize the SQL override for both cached and uncached lookups that return multiple rows.
You cannot use dynamic cache for Lookup transformation that returns multiple rows.
You cannot return multiple rows from an unconnected Lookup transformation.
You can configure multiple Lookup transformations to share a named cache if the Lookup transformations have matching caching properties and "Lookup Policy on Multiple Match" settings.
Lookup transformation that returns multiple rows cannot share a cache with a Lookup
transformation that returns one matching row for each input row.
Lookup transformation properties (with the lookup types they apply to):
Lookup SQL Override (Relational): Overrides the default SQL query generated by the lookup transformation. Use this option when lookup cache is enabled.
Lookup Table Name (Pipeline, Relational): You can choose a source, target or source qualifier as the lookup table name. This is the lookup source which will be used to query or cache the data. If you have overridden the SQL query, you can ignore this option.
Lookup Source Filter (Relational): You can filter the rows looked up in the cache based on the value of data in the lookup ports. Works only when lookup cache is enabled.
Lookup Caching Enabled (Flat File, Pipeline, Relational): When lookup cache is enabled, the Integration Service queries the lookup source once and caches the entire data. Caching the lookup source improves the performance. If caching is disabled, the Integration Service queries the lookup source for each row. The Integration Service always caches flat file and pipeline lookups.
Lookup Policy on Multiple Match (Flat File, Pipeline, Relational): Which row to return when the lookup transformation finds multiple rows that match the lookup condition. Report Error: reports an error and does not return a row. Use Last Value: returns the last row that matches the lookup condition. Use All Values: returns all matched rows. Use Any Value: returns the first value that matches the lookup condition.
Lookup Condition (Flat File, Pipeline, Relational): You can define the lookup condition in the condition tab. The lookup condition is displayed here.
Connection Information (Relational): Specifies the database that contains the lookup table.
Source Type (Flat File, Pipeline, Relational): Indicates the lookup source type: flat file, relational table or source qualifier.
Tracing Level (Flat File, Pipeline, Relational): Sets the amount of detail to be included in the session log.
Lookup Cache Directory Name (Flat File, Pipeline, Relational): Specifies the directory used to build the lookup cache files.
Lookup Cache Persistent (Flat File, Pipeline, Relational): Use when the lookup source data does not change at all (examples: zip codes, countries, states). The lookup caches the data once and uses the cache across multiple session runs.
Lookup Data Cache Size and Lookup Index Cache Size (Flat File, Pipeline, Relational): Cache sizes of the lookup data and lookup index.
Dynamic Lookup Cache (Flat File, Pipeline, Relational): Indicates whether to use a dynamic lookup cache. Inserts or updates rows in the lookup cache as it passes rows to the target table.
Output Old Value On Update (Flat File, Pipeline, Relational): Use with dynamic caching enabled. When you enable this property, the Integration Service outputs old values out of the lookup/output ports. When the Integration Service updates a row in the cache, it outputs the value that existed in the lookup cache before it updated the row based on the input data. When the Integration Service inserts a row in the cache, it outputs null values.
Update Dynamic Cache Condition (Flat File, Pipeline, Relational): An expression that indicates whether to update the dynamic cache. Create an expression using lookup ports or input ports. The expression can contain input values or values in the lookup cache. The Integration Service updates the cache when the condition is true and the data exists in the cache. Use with dynamic caching enabled. Default is true.
Cache File Name Prefix (Flat File, Pipeline, Relational): Use with persistent lookup cache. Specifies the file name prefix to use with persistent lookup cache files.
Recache From Lookup Source (Flat File, Pipeline, Relational): The Integration Service rebuilds the lookup cache from the lookup source.
Insert Else Update (Flat File, Pipeline, Relational): Use with dynamic caching enabled. Applies to rows entering the Lookup transformation with the row type of insert.
Update Else Insert (Flat File, Pipeline, Relational): Use with dynamic caching enabled. Applies to rows entering the Lookup transformation with the row type of update.
Datetime Format (Flat File): Specifies the date format for the date fields in the file.
Thousand Separator (Flat File): Specifies the thousand separator for the port.
Decimal Separator (Flat File): Specifies the decimal separator for the port.
Case-Sensitive String Comparison (Flat File): The Integration Service uses case-sensitive string comparisons when performing lookups on string columns.
Null Ordering (Flat File, Pipeline): Specifies how to sort null data.
Sorted Input (Flat File, Pipeline): Indicates whether the lookup source data is in sorted order or not.
Lookup Source is Static (Flat File, Pipeline, Relational): The lookup source does not change during a session.
Pre-build Lookup Cache (Flat File, Pipeline, Relational): Allows the Integration Service to build the lookup cache before the Lookup transformation receives the data. The Integration Service can build multiple lookup cache files at the same time to improve performance.
Subsecond Precision (Relational): Specifies the subsecond precision for datetime ports.
The lookup transformation can be used in both connected and unconnected mode. A connected lookup receives input values directly from the pipeline and returns values to downstream transformations, while an unconnected lookup is called from another transformation with a :LKP expression and returns a single column to the calling transformation.
Relational lookups:
When you want to use a relational table as a lookup source in the lookup transformation, you have to connect to the lookup source using an ODBC connection and import the table definition as the structure for the lookup transformation. You can use the following options for relational lookups:
You can override the default SQL query and write your own customized SQL to add a WHERE clause or query multiple tables (an illustrative override is shown after this list).
You can sort null data based on the database support.
You can perform case-sensitive comparison based on the database support.
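As an illustrative sketch of the override option mentioned above (table and column names are assumptions, not taken from this document), a customized lookup query could add a WHERE clause to restrict the rows that are cached:
SELECT employee_id, department_id
FROM employees
WHERE status = 'ACTIVE'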
Flat file lookups:
When you want to use a flat file as a lookup source in the lookup transformation, select the flat file definition in the repository or import the source when you create the transformation. When you import the flat file lookup source, the designer invokes the flat file wizard. You can use the following options for flat file lookups:
You can use indirect files as lookup sources by configuring a file list as the lookup file
name.
You can use sorted input for the lookup.
You can sort null data high or low.
You can use case-sensitive string comparison with flat file lookups.
For flat file lookup source, you can improve the performance by sorting the flat files on the
columns which are specified in the lookup condition. The condition columns in the lookup
transformation must be treated as a group for sorting the flat file. Sort the flat file on the
condition columns for optimal performance.
Normalizer Transformation
The Normalizer transformation is an active and connected transformation. The Normalizer transformation is used in place of the Source Qualifier transformation when you wish to read data from a COBOL copybook source. A Normalizer transformation is also used to convert column-wise data to row-wise data, similar to the transpose feature of MS Excel. You can use this feature if your source is a COBOL copybook file or a relational database table. The Normalizer transformation converts columns to rows and also generates an index (generated key) for each converted row.
The Normalizer is an active transformation that allows you to read data from COBOL files. Every COBOL source definition is associated with a Normalizer transformation by default. Use the Normalizer transformation to convert a single input record into multiple output records; this is known as horizontal pivoting.
Normalizer Transformation Uses
The Normalizer can be used to deal with COBOL sources that contain repeating groups (OCCURS clauses) and with relational sources where repeating columns need to be pivoted into separate rows.
Generated keys: At the end of a session, the Integration Service updates the generated key value in the Normalizer transformation to the last value generated for the session plus one. The maximum generated key value is 9,223,372,036,854,775,807.
Restart the generated key sequence: When you restart the generated key sequence, the
Integration Service starts the generated key sequence at 1 the next time it runs a session.
When you restart the generated key sequence, the generated key start value does not change in
the Normalizer transformation until you run a session. When you run the session, the Integration
Service overrides the sequence number value on the Ports tab.
When you reset or restart the generated key sequence, the reset or restart affects the generated key sequence values the next time you run a session; it does not change the current generated key sequence values in the Normalizer transformation. When you reset or restart the generated key sequence, the option is enabled for every session until you disable the option.