PDF Export
Talend Components
2022-03-17
1. Access
1. Access components
2. Access scenario
2. Amazon Aurora
1. Amazon Aurora components
2. Amazon Aurora scenario
3. Amazon DynamoDB
1. Amazon DynamoDB components
2. Amazon DynamoDB scenario
4. Amazon EMR
1. Amazon EMR components
2. Amazon EMR scenario
5. Amazon EMR distribution
1. Amazon EMR distribution scenario
6. Amazon MySQL
1. Amazon MySQL components
7. Amazon Oracle
1. Amazon Oracle components
8. Amazon Redshift
1. Amazon Redshift components
2. Amazon Redshift scenarios
9. Amazon S3
1. Amazon S3 components
2. Amazon S3 scenarios
10. Amazon SQS
1. Amazon SQS components
2. Amazon SQS scenarios
11. Apache log
1. Apache log component
2. Apache log scenario
12. Archive/Unarchive
1. Archive/Unarchive components
2. Archive/Unarchive scenarios
13. ARFF
1. ARFF components
2. ARFF scenario
14. AS400
1. AS400 components
2. AS400 scenario
15. Avro
1. Avro components
2. Avro scenario
16. Azure Data Lake Store
1. Azure Data Lake Store components
2. Azure Data Lake Store scenarios
17. Azure SQL Data Warehouse
1. Azure SQL Data Warehouse components
18. Azure Storage Blob
1. Azure Storage Blob components
2. Azure Storage Blob scenarios
19. Azure Storage Queue
1. Azure Storage Queue components
20. Azure Storage Table
1. Azure Storage Table components
2. Azure Storage Table scenario
21. Bonita
1. Bonita components
2. Bonita scenarios
22. Box
1. Box components
2. Box scenario
23. Buffer
1. Buffer components
2. Buffer scenarios
24. Business rules
1. Business rules components
2. Business rules scenarios
25. Cassandra
1. Cassandra components
2. Cassandra scenario
26. Change Data Capture
1. Change Data Capture components
2. Change Data Capture scenarios
27. Chart
1. Chart components
2. Chart scenarios
28. Cloud
1. Cloud components
29. CombinedSQL
1. CombinedSQL components
2. CombinedSQL scenario
30. Context
1. Context components
2. Context scenario
31. CosmosDB
1. CosmosDB components
32. Couchbase
1. Couchbase components
2. Couchbase scenario
33. CyberArk
1. CyberArk component
2. CyberArk scenario
34. Data mapping
1. Data mapping components
2. Data mapping scenarios
35. Data Preparation
1. Data Preparation components
2. Data Preparation scenarios
36. Data Quality
1. Address standardization
1. Address standardization components
2. Address standardization scenarios
2. Continuous matching
1. Continuous matching components
2. Continuous matching scenarios
3. Data extraction
1. Data extraction components
2. Data extraction scenarios
4. Data matching
1. Data matching components
2. Data matching scenarios
5. Data privacy
1. Data privacy components
2. Data privacy scenarios
6. Deduplication
1. Deduplication components
2. Deduplication scenarios
7. Email validation
1. Email validation component
2. Email validation scenario
8. Formatting
1. Formatting component
2. Formatting scenario
9. Fuzzy matching
1. Fuzzy matching components
2. Fuzzy matching scenarios
10. Google address standardization
1. Google address standardization components
2. Google address standardization scenarios
11. Identification
1. Identification components
2. Identification scenarios
12. Loqate address standardization
1. Loqate address standardization component
2. Loqate address standardization scenario
13. Matching with machine learning
1. Matching with machine learning components
2. Matching with machine learning scenarios
14. Melissa Data address standardization
1. Melissa Data address standardization components
2. Melissa Data address standardization scenarios
15. Microsoft SQL Server validation
1. Microsoft SQL Server validation components
16. MySQL validation
1. MySQL validation components
2. MySQL validation scenarios
17. Name standardization
1. Name standardization component
2. Name standardization scenario
18. Oracle validation
1. Oracle validation components
19. Pattern validation
1. Pattern validation components
2. Pattern validation scenarios
20. Phone number standardization
1. Phone number standardization component
2. Phone number standardization scenario
21. PostgreSQL validation
1. PostgreSQL validation components
22. QAS address standardization
1. QAS address standardization components
2. QAS address standardization scenarios
23. Reporting
1. Reporting components
2. Reporting scenarios
24. Sampling
1. Sampling component
2. Sampling scenario
25. Standardization
1. Standardization components
2. Standardization scenarios
26. Synonym index
1. Synonym index components
2. Synonym index scenarios
27. Text standardization
1. Text standardization components
2. Text standardization scenarios
28. Uniserv
1. Uniserv components
2. Uniserv scenarios
29. Validation
1. Validation component
2. Validation scenario
37. Data Stewardship
1. Data Stewardship components
2. Data Stewardship scenarios
38. Database utility
1. Database utility components
2. Database utility scenarios
39. Databricks
1. Databricks components
2. Databricks scenarios
40. DB Generic
1. DB Generic components
41. DB2
1. DB2 components
42. DBFS
1. DBFS components
43. Defining Context Groups
1. Defining Context Groups scenarios
44. Delimited
1. Delimited components
2. Delimited scenarios
45. Delta Lake
1. Delta Lake components
2. Delta Lake scenario
46. DotNET
1. DotNET components
2. DotNET scenarios
47. Dropbox
1. Dropbox components
2. Dropbox scenario
48. Dynamic Schema
1. Dynamic Schema component
2. Dynamic Schema scenarios
49. ElasticSearch
1. ElasticSearch components
50. ELT Greenplum
1. ELT Greenplum components
2. ELT Greenplum scenarios
51. ELT Hive
1. ELT Hive components
2. ELT Hive scenarios
52. ELT JDBC
1. ELT JDBC components
2. ELT JDBC scenarios
53. ELT MSSql
1. ELT MSSql components
2. ELT MSSql scenarios
54. ELT MySQL
1. ELT MySQL components
2. ELT MySQL scenarios
55. ELT Netezza
1. ELT Netezza components
2. ELT Netezza scenarios
56. ELT Oracle
1. ELT Oracle components
2. ELT Oracle scenarios
57. ELT PostgreSQL
1. ELT PostgreSQL components
2. ELT PostgreSQL scenarios
58. ELT Sybase
1. ELT Sybase components
2. ELT Sybase scenarios
59. ELT Teradata
1. ELT Teradata components
2. ELT Teradata scenarios
60. ELT Vertica
1. ELT Vertica components
2. ELT Vertica scenarios
61. ESB REST
1. ESB REST components
2. ESB REST scenarios
62. ESB SOAP
1. ESB SOAP components
2. ESB SOAP scenarios
63. EXASolution
1. EXASolution components
2. EXASolution scenario
64. Excel
1. Excel components
2. Excel scenario
65. EXist
1. EXist components
2. EXist scenario
66. Firebird
1. Firebird components
67. Flume
1. Flume components
68. FTP
1. FTP components
2. FTP scenarios
69. FullRow
1. FullRow components
2. FullRow scenario
70. Global variable
1. Global variable components
2. Global variable scenarios
71. Google BigQuery
1. Google BigQuery components
2. Google BigQuery scenarios
72. Google Dataproc
1. Google Dataproc component
73. Google Drive
1. Google Drive components
2. Google Drive scenario
74. Google PubSub
1. Google PubSub components
75. GPG
1. GPG component
2. GPG scenario
76. Greenplum
1. Greenplum components
77. Groovy
1. Groovy components
2. Groovy scenario
78. GS
1. GS components
2. GS scenario
79. HBase
1. HBase components
2. HBase scenario
80. HCatalog
1. HCatalog components
2. HCatalog scenario
81. HDFS
1. HDFS components
2. HDFS scenarios
82. Hive
1. Hive components
2. Hive scenarios
83. HSQLDB
1. HSQLDB components
84. HTTP
1. HTTP component
2. HTTP scenarios
85. Impala
1. Impala components
86. Informix
1. Informix components
87. Ingres
1. Ingres components
2. Ingres scenario
88. Interbase
1. Interbase components
89. Internet (Integration)
1. Internet (Integration) component
2. Internet (Integration) scenarios
90. Jasper
1. Jasper components
2. Jasper scenario
91. Java custom code for Map Reduce
1. Java custom code for Map Reduce component
2. Java custom code for Map Reduce scenario
92. Java custom code for Storm
1. Java custom code for Storm component
2. Java custom code for Storm scenario
93. Java custom code
1. Java custom code components
2. Java custom code scenarios
94. JavaDB
1. JavaDB components
95. JBoss ESB
1. JBoss ESB components
96. JDBC
1. JDBC components
97. JIRA
1. JIRA components
2. JIRA scenarios
98. JMS
1. JMS components
2. JMS scenario
99. JSON
1. JSON components
2. JSON scenarios
100. Kafka
1. Kafka components
2. Kafka scenarios
101. Kerberos
1. Kerberos component
102. Keystore
1. Keystore component
2. Keystore scenario
103. Kinesis
1. Kinesis components
2. Kinesis scenario
104. Kudu
1. Kudu components
2. Kudu scenario
105. LDAP
1. LDAP components
2. LDAP scenarios
106. LDIF
1. LDIF components
2. LDIF scenario
107. Library import
1. Library import component
2. Library import scenario
108. Logs and errors (Integration)
1. Logs and errors (Integration) components
2. Logs and errors (Integration) scenarios
109. Machine Learning
1. Machine Learning components
2. Machine Learning scenarios
110. Mail
1. Mail components
2. Mail scenarios
111. MapRDB
1. MapRDB components
2. MapRDB scenario
112. MapRStreams
1. MapRStreams components
113. Marketo
1. Marketo components
2. Marketo scenarios
114. MarkLogic
1. MarkLogic components
115. MaxDB
1. MaxDB components
116. MDM (Master Data Management)
1. MDM connection and transaction
1. MDM connection and transaction components
2. MDM data processing
1. MDM data processing components
2. MDM data processing scenarios
3. MDM event processing
1. MDM event processing components
2. MDM event processing scenarios
117. MemSQL
1. MemSQL components
2. MemSQL scenario
118. Microsoft CRM
1. Microsoft CRM components
2. Microsoft CRM scenario
119. Microsoft MQ
1. Microsoft MQ components
2. Microsoft MQ scenario
120. MOM
1. MOM components
2. MOM scenarios
121. Mondrian
1. Mondrian component
2. Mondrian scenario
122. MongoDB
1. MongoDB components
2. MongoDB scenarios
123. MQTT
1. MQTT components
124. MS Delimited
1. MS Delimited components
2. MS Delimited scenario
125. MS Positional
1. MS Positional components
2. MS Positional scenario
126. MS XML connectors
1. MS XML connectors components
2. MS XML connectors scenario
127. MSSql
1. MSSql components
2. MSSql scenarios
128. MySQL
1. MySQL components
2. MySQL scenarios
129. NamedPipe
1. NamedPipe components
2. NamedPipe scenario
130. Natural Language Processing
1. Natural Language Processing components
2. Natural Language Processing scenarios
131. Neo4j
1. Neo4j components
2. Neo4j scenarios
132. Netezza
1. Netezza components
133. Netsuite
1. Netsuite components
2. Netsuite scenario
134. Openbravo ERP
1. Openbravo ERP components
135. Oracle
1. Oracle components
2. Oracle scenarios
136. ORC
1. ORC components
137. Orchestration (Integration)
1. Orchestration (Integration) components
2. Orchestration (Integration) scenarios
138. Palo
1. Palo components
2. Palo scenarios
139. ParAccel
1. ParAccel components
140. Parquet
1. Parquet components
141. Petals
1. Petals components
142. POP
1. POP component
2. POP scenario
143. Positional
1. Positional components
2. Positional scenarios
144. PostgresPlus
1. PostgresPlus components
145. PostgreSQL
1. PostgreSQL components
146. Processing (Integration)
1. Processing (Integration) components
2. Processing (Integration) scenarios
147. Properties
1. Properties components
2. Properties scenario
148. Proxy
1. Proxy component
149. RabbitMQ
1. RabbitMQ components
150. Raw
1. Raw components
151. Regex
1. Regex components
2. Regex scenario
152. REST
1. REST component
2. REST scenario
153. Riak
1. Riak components
2. Riak scenario
154. Route
1. Route components
2. Route scenarios
155. RSS
1. RSS components
2. RSS scenarios
156. Salesforce
1. Salesforce components
2. Salesforce scenarios
157. SAP
1. SAP components
2. SAP scenarios
158. SCD
1. SCD components
2. SCD scenario
159. SCDELT
1. SCDELT components
2. SCDELT scenarios
160. SCP
1. SCP components
2. SCP scenario
161. ServiceNow
1. ServiceNow components
162. SingleStore
1. SingleStore components
163. Snowflake
1. Snowflake components
2. Snowflake scenarios
164. SOAP
1. SOAP component
2. SOAP scenarios
165. Socket
1. Socket components
2. Socket scenario
166. Splunk
1. Splunk component
167. SQLite
1. SQLite components
2. SQLite scenarios
168. SQLTemplate
1. SQLTemplate components
2. SQLTemplate scenarios
169. Sqoop
1. Sqoop components
2. Sqoop scenarios
170. SVNLog
1. SVNLog component
2. SVNLog scenario
171. Sybase
1. Sybase components
2. Sybase scenario
172. System
1. System components
2. System scenarios
173. Tachyon
1. Tachyon component
174. tAddLocationFromIP
1. tAddLocationFromIP component
2. tAddLocationFromIP scenario
175. Talend Cloud
1. Talend Cloud components
176. tChangeFileEncoding
1. tChangeFileEncoding component
2. tChangeFileEncoding scenario
177. tCreateTemporaryFile
1. tCreateTemporaryFile component
2. tCreateTemporaryFile scenario
178. Technical
1. Technical components
2. Technical scenarios
179. Teradata
1. Teradata components
2. Teradata scenarios
180. tFileCompare
1. tFileCompare component
2. tFileCompare scenario
181. tFileCopy
1. tFileCopy component
2. tFileCopy scenario
182. tFileDelete
1. tFileDelete component
2. tFileDelete scenario
183. tFileExist
1. tFileExist component
2. tFileExist scenario
184. tFileList
1. tFileList component
2. tFileList scenarios
185. tFileProperties
1. tFileProperties component
2. tFileProperties scenario
186. tFileRowCount
1. tFileRowCount component
2. tFileRowCount scenario
187. tFileTouch
1. tFileTouch component
188. tFixedFlowInput
1. tFixedFlowInput component
2. tFixedFlowInput scenario
189. tMap
1. tMap component
2. tMap scenarios
190. tMemorizeRows
1. tMemorizeRows component
2. tMemorizeRows scenario
191. tMsgBox
1. tMsgBox component
2. tMsgBox scenario
192. tRowGenerator
1. tRowGenerator component
2. tRowGenerator scenario
193. tServerAlive
1. tServerAlive component
2. tServerAlive scenario
194. tSocketTextStreamInput
1. tSocketTextStreamInput component
195. tXMLMap
1. tXMLMap component
2. tXMLMap scenarios
196. VectorWise
1. VectorWise components
197. Vertica
1. Vertica components
198. VtigerCRM
1. VtigerCRM components
199. Webservice
1. Webservice components
2. Webservice scenarios
200. Workday
1. Workday component
201. XML
1. XML components
2. XML scenarios
202. XML connectors
1. XML connectors components
2. XML connectors scenarios
203. XML validation
1. XML validation components
2. XML validation scenarios
204. XMLRPC
1. XMLRPC component
2. XMLRPC scenario
Access
Access components
tAccessCommit Commits a global transaction in one go, using a unique connection, instead of committing on every row or every batch, which improves performance.
tAccessConnection Opens a connection to the specified database that can then be reused in
the subsequent subJob or subJobs.
tAccessOutputBulk Prepares the file which contains the data used to feed the Access database.
tAccessRollback Cancels the transaction commit in the connected database, to avoid involuntarily committing part of a transaction.
tAccessRow Executes the stated SQL query on the specified database.
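As a rough illustration of what the connection, commit and rollback components above automate, here is a minimal plain-Java JDBC sketch of the same pattern: one shared connection, a single commit for the whole batch, and a rollback on error. The UCanAccess JDBC URL, table and column names are assumptions made for the example, not values taken from this guide.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class SharedConnectionTransaction {
    public static void main(String[] args) throws SQLException {
        // tAccessConnection-like step: one connection shared by the whole subJob
        try (Connection conn = DriverManager.getConnection("jdbc:ucanaccess://C:/data/demo.accdb")) {
            conn.setAutoCommit(false);                      // no commit per row or per batch
            try (PreparedStatement ps =
                     conn.prepareStatement("INSERT INTO customers (id, name) VALUES (?, ?)")) {
                for (int i = 1; i <= 1000; i++) {
                    ps.setInt(1, i);
                    ps.setString(2, "customer_" + i);
                    ps.addBatch();
                }
                ps.executeBatch();
                conn.commit();                              // tAccessCommit-like step: commit in one go
            } catch (SQLException e) {
                conn.rollback();                            // tAccessRollback-like step: no partial commit
                throw e;
            }
        }
    }
}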
Access scenario
Inserting data in parent/child tables
Amazon Aurora
Amazon Aurora components
tAmazonAuroraInvalidRows Checks Amazon Aurora database rows against specific Data Quality patterns
(regular expression) or Data Quality rules (business rule). Only MySQL is
supported.
tAmazonAuroraValidRows Checks Amazon Aurora database rows against specific Data Quality patterns
(regular expression) or Data Quality rules (business rule). Only MySQL is
supported.
tAmazonAuroraCommit Commits a global transaction in one go, using a unique connection, instead of committing on every row or every batch, which improves performance.
tAmazonAuroraConnection Opens a connection to an Amazon Aurora database instance that can then be
reused by other Amazon Aurora components.
tAmazonAuroraInput Reads an Amazon Aurora database and extracts fields based on a query.
tAmazonAuroraRollback Rolls back any changes made in the Amazon Aurora database to prevent partial
transaction commit if an error occurs.
Amazon DynamoDB
tDynamoDBLookupInput Executes a database query with a strictly defined order which must
correspond to the schema definition.
tDynamoDBInput Retrieves data from an Amazon DynamoDB table and sends them to the
component that follows for transformation.
Amazon EMR
Amazon EMR components
tAmazonEMRListInstances Lists the details about the instance groups in a cluster on Amazon EMR
(Elastic MapReduce).
tAmazonEMRResize Adds or resizes a task instance group in a cluster on Amazon EMR (Elastic
MapReduce).
Amazon MySQL
tAmazonMysqlCommit Commits a global transaction in one go, using a unique connection, instead of committing on every row or every batch, which improves performance.
tAmazonMysqlConnection Opens a connection to the specified database that can then be reused in the
subsequent subJob or subJobs.
tAmazonMysqlRollback Cancels the transaction commit in the connected database, to avoid involuntarily committing part of a transaction.
tAmazonMysqlRow Executes the stated SQL query on the specified database.
Amazon Oracle
Amazon Oracle components
tAmazonOracleCommit Commits a global transaction in one go, using a unique connection, instead of committing on every row or every batch, which improves performance.
tAmazonOracleConnection Opens a connection to the specified database that can then be reused in the
subsequent subJob or subJobs.
tAmazonOracleRollback Cancels the transaction commit in the connected database, to avoid involuntarily committing part of a transaction.
tAmazonOracleRow Executes the stated SQL query on the specified database.
Amazon Redshift
Amazon Redshift components
tRedshiftBulkExec Loads data into Amazon Redshift from Amazon S3, Amazon EMR cluster,
Amazon DynamoDB, or remote hosts.
tRedshiftConnection Opens a connection to the specified database that can then be reused in
the subsequent subJob or subJobs.
tRedshiftInput Reads data from a database and extracts fields based on a query so that
you may apply changes to the extracted data.
tRedshiftRow Acts on the actual DB structure or on the data (although without handling
data), depending on the nature of the query and the database.
Amazon S3
Amazon S3 components
tS3Configuration Reuses the connection configuration to S3 in the same Job. The Spark
cluster to be used reads this configuration to eventually connect to S3.
tS3Input Reads data from a given S3N system (S3 Native Filesystem).
tS3List Lists the files on Amazon S3 based on the bucket/file prefix settings.
tS3Put Uploads data onto Amazon S3 from a local file or from cache memory via
the streaming mode.
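To give a concrete idea of the operations that tS3Put and tS3List wrap, the sketch below uses the AWS SDK for Java (v1) directly; the bucket name, object key and local path are illustrative assumptions only.

import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.model.S3ObjectSummary;
import java.io.File;

public class S3PutAndList {
    public static void main(String[] args) {
        AmazonS3 s3 = AmazonS3ClientBuilder.defaultClient();   // credentials from the default provider chain
        // tS3Put-like upload of a local file
        s3.putObject("my-bucket", "exports/customers.csv", new File("/tmp/customers.csv"));
        // tS3List-like listing restricted to a key prefix
        for (S3ObjectSummary obj : s3.listObjects("my-bucket", "exports/").getObjectSummaries()) {
            System.out.println(obj.getKey() + " (" + obj.getSize() + " bytes)");
        }
    }
}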
Amazon S3 scenarios
Writing and reading data from S3 (Databricks on AWS)
Writing server-side KMS encrypted data on EMR
Copying an S3 object from one bucket to another
Exchange files with Amazon S3
Listing files with the same prefix from a bucket
Retrieving data from an S3 object in Studio
Tagging S3 objects
Verifying the absence of a bucket, creating it and listing all the S3 buckets
Amazon SQS
tSQSConnection Opens a connection to Amazon Simple Queue Service that can then be reused by
other SQS components.
tSQSInput Retrieves one or more messages, with a maximum limit of ten messages, from an
Amazon SQS (Simple Queue Service) queue.
tSQSMessageChangeVisibility Changes the visibility timeout of a specified message in an Amazon SQS (Simple
Queue Service) queue.
tSQSMessageDelete Deletes a specified message from an Amazon SQS (Simple Queue Service) queue.
tSQSOutput Delivers one or more messages to an Amazon SQS (Simple Queue Service) queue.
tSQSQueueAttributes Gets attributes for a specified Amazon SQS (Simple Queue Service) queue.
tSQSQueueList Iterates over the Amazon SQS (Simple Queue Service) queues in a specified region and lists their URLs.
tSQSQueuePurge Purges messages in an Amazon SQS (Simple Queue Service) queue.
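The following sketch, using the AWS SDK for Java (v1), shows the kind of send, receive and delete calls that tSQSOutput, tSQSInput and tSQSMessageDelete perform; the queue name and message body are assumptions for illustration.

import com.amazonaws.services.sqs.AmazonSQS;
import com.amazonaws.services.sqs.AmazonSQSClientBuilder;
import com.amazonaws.services.sqs.model.Message;
import com.amazonaws.services.sqs.model.ReceiveMessageRequest;

public class SqsSendReceiveDelete {
    public static void main(String[] args) {
        AmazonSQS sqs = AmazonSQSClientBuilder.defaultClient();
        String queueUrl = sqs.getQueueUrl("demo-queue").getQueueUrl();   // queue name is an assumption

        sqs.sendMessage(queueUrl, "hello from the Job");                 // tSQSOutput-like delivery

        // tSQSInput-like retrieval: at most ten messages per call, as the service allows
        ReceiveMessageRequest request = new ReceiveMessageRequest(queueUrl).withMaxNumberOfMessages(10);
        for (Message m : sqs.receiveMessage(request).getMessages()) {
            System.out.println(m.getBody());
            sqs.deleteMessage(queueUrl, m.getReceiptHandle());           // tSQSMessageDelete-like cleanup
        }
    }
}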
Apache log
Apache log component
Archive/Unarchive
Archive/Unarchive components
tFileArchive Creates a new zip, gzip, or tar.gz archive file from one or more files or
folders.
tFileUnarchive Decompresses an archive file for further processing, in one of the following formats: *.tar.gz, *.tgz, *.tar, *.gz and *.zip.
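A minimal plain-Java sketch of what tFileArchive does for the zip case is shown below: every regular file under a source folder is written into a new zip archive. The source and target paths are assumptions for the example.

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public class ZipFolder {
    public static void main(String[] args) throws IOException {
        Path source = Paths.get("/tmp/reports");             // folder to archive (assumed path)
        List<Path> files;
        try (Stream<Path> walk = Files.walk(source)) {
            files = walk.filter(Files::isRegularFile).collect(Collectors.toList());
        }
        try (ZipOutputStream zip =
                 new ZipOutputStream(Files.newOutputStream(Paths.get("/tmp/reports.zip")))) {
            for (Path file : files) {
                zip.putNextEntry(new ZipEntry(source.relativize(file).toString()));
                Files.copy(file, zip);                        // stream the file content into the entry
                zip.closeEntry();
            }
        }
    }
}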
Archive/Unarchive scenarios
Comparing unzipped files
Zipping files using a tFileArchive
ARFF
ARFF components
tFileInputARFF Reads an ARFF file row by row to split the rows up into fields and then sends the fields as defined in the schema to the next component.
tFileOutputARFF Writes an ARFF file that holds data organized according to the defined
schema.
ARFF scenario
Displaying the content of an ARFF file
AS400
AS400 components
tAS400Close Closes the transaction committed in the connected database.
tAS400Commit Commits a global transaction in one go, using a unique connection, instead of committing on every row or every batch, which improves performance.
tAS400Connection Opens a connection to the specified database that can then be reused in
the subsequent subJob or subJobs.
tAS400LastInsertId Obtains the primary key value of the record that was last inserted in an
AS/400 table.
tAS400Rollback Cancels the transaction commit in the connected database, to avoid involuntarily committing part of a transaction.
tAS400Row Executes the stated SQL query on the specified database.
AS400 scenario
Handling data with AS/400
Avro
Avro components
tAvroInput Extracts records from any given Avro format files for other components to
process the records.
tAvroOutput Receives data flows from the processing component placed ahead of it and
writes the data into Avro format files in a given distributed file system.
tAvroStreamInput Listens on a given directory, reads data from Avro files once they are
created and sends this data to the component that follows.
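For reference, reading an Avro container file with the Apache Avro Java API looks roughly like the sketch below, which is the kind of record-by-record extraction tAvroInput performs; the file path and field names are assumptions.

import java.io.File;
import org.apache.avro.file.DataFileReader;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericRecord;

public class ReadAvroFile {
    public static void main(String[] args) throws Exception {
        File input = new File("/tmp/employees.avro");   // path is an assumption
        try (DataFileReader<GenericRecord> reader =
                 new DataFileReader<>(input, new GenericDatumReader<GenericRecord>())) {
            for (GenericRecord record : reader) {
                // field names are illustrative; any schema field can be read this way
                System.out.println(record.get("name") + " / " + record.get("dept"));
            }
        }
    }
}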
Avro scenario
Filtering Avro format employee data
Azure Data Lake Store
Azure Data Lake Store components
tAzureAdlsGen2Input Retrieves data from an ADLS Gen2 file system of an Azure storage account
and passes the data to the subsequent component connected to it through
a Main>Row link.
tAzureAdlsGen2Output Uploads incoming data to an ADLS Gen2 file system of an Azure storage
account in the specified format.
tAzureFSConfiguration Provides authentication information for Spark to connect to a given Azure
file system.
Azure SQL Data Warehouse
Azure SQL Data Warehouse components
tAzureSynapseBulkExec Loads data into an Azure SQL Data Warehouse table from either Azure Blob
Storage or Azure Data Lake Storage.
tAzureSynapseCommit Commits a global transaction in one go instead of committing on every row or every batch, which improves performance.
tAzureSynapseInput Reads data and extracts fields based on a query from an Azure SQL Data
Warehouse database.
tAzureSynapseOutput Writes, updates, makes changes or suppresses entries in an Azure SQL Data
Warehouse database.
tAzureSynapseRollback Cancels the transaction commit in the connected Azure SQL Data Warehouse
database to prevent partial transaction commit if an error occurs.
tAzureSynapseRow Executes an SQL query stated on an Azure SQL Data Warehouse database.
Azure Storage Blob
Azure Storage Blob components
tAzureFSConfiguration Provides authentication information for Spark to connect to a given Azure file system.
tAzureStorageConnection Uses authentication and the protocol information to create a connection to the
Microsoft Azure Storage system that can then be reused by other Azure Storage
components.
tAzureStorageContainerCreate Creates a new storage container used to hold Azure blobs (Binary Large Object) for a
given Azure storage account.
tAzureStorageContainerDelete Automates the removal of a given blob container from the space of a specific storage
account.
tAzureStorageContainerExist Automates the verification of whether a given blob container exists or not within a
storage account.
tAzureStorageContainerList Lists all containers in a given Azure storage account.
tAzureStorageDelete Deletes blobs from a given container for an Azure storage account according to the
specified blob filters.
tAzureStorageGet Retrieves blobs from a given container for an Azure storage account according to the specified filters applied on the virtual hierarchy of the blobs and then writes the selected blobs to a local folder.
tAzureStorageList Lists blobs in a given container according to the specified blob filters.
tAzureStoragePut Uploads local files into a given container for an Azure storage account.
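The container and blob operations listed above roughly correspond to calls of the classic Azure Storage SDK for Java (com.microsoft.azure.storage), sketched below; the connection string, container name, blob name and local path are placeholders, not values from this guide.

import com.microsoft.azure.storage.CloudStorageAccount;
import com.microsoft.azure.storage.blob.CloudBlobClient;
import com.microsoft.azure.storage.blob.CloudBlobContainer;
import com.microsoft.azure.storage.blob.CloudBlockBlob;
import com.microsoft.azure.storage.blob.ListBlobItem;

public class BlobContainerPutAndList {
    public static void main(String[] args) throws Exception {
        String connectionString = "DefaultEndpointsProtocol=https;AccountName=...;AccountKey=...";  // placeholder
        CloudStorageAccount account = CloudStorageAccount.parse(connectionString);  // tAzureStorageConnection-like step
        CloudBlobClient client = account.createCloudBlobClient();

        CloudBlobContainer container = client.getContainerReference("demo");
        container.createIfNotExists();                                              // tAzureStorageContainerCreate-like step

        CloudBlockBlob blob = container.getBlockBlobReference("data/customers.csv");
        blob.uploadFromFile("/tmp/customers.csv");                                  // tAzureStoragePut-like step

        for (ListBlobItem item : container.listBlobs("data/")) {                    // tAzureStorageList-like step
            System.out.println(item.getUri());
        }
    }
}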
Azure Storage Queue
Azure Storage Queue components
tAzureStorageConnection Uses authentication and the protocol information to create a connection to the Microsoft
Azure Storage system that can then be reused by other Azure Storage components.
tAzureStorageQueueDelete Deletes a specified queue permanently under a given Azure storage account.
tAzureStorageQueueInput Retrieves one or more messages from the front of an Azure queue.
tAzureStorageQueueInputLoop Runs an endless loop to retrieve messages from the front of an Azure queue.
tAzureStorageQueueList Returns all queues associated with the given Azure storage account.
Azure Storage Table
Azure Storage Table components
tAzureStorageInputTable Retrieves a set of entities that satisfy the specified filter criteria from an Azure
storage table.
tAzureStorageOutputTable Performs the defined action on a given Azure storage table and inserts,
replaces, merges or deletes entities in the table based on the incoming data
from the preceding component.
Bonita
Bonita components
Bonita scenarios
Executing a Bonita process via a Talend Job
Outputting the process instance UUID over the Row > Main link
Box
Box components
tBoxConnection Creates a Box connection that the other Box components can
reuse.
Box scenario
Uploading and downloading files from Box
Buffer
Buffer components
tBufferOutput Collects data in a buffer in order to access it later, via a webservice for example.
Buffer scenarios
Buffering data to be used as a source system
Buffering data
Buffering output data on the webapp server
Calling a Job exported as Webservice in another Job
Calling a Job with context variables from a browser
Retrieving bufferized data
Returning a value from a child Job to the parent Job
Business rules
tBRMS Applies Drools business rules to an incoming flow and writes the output
data to an XML file.
tRules Uses business rules defined in a Drools file of .xls or .drl format in order to
filter data.
Cassandra
Cassandra components
tCassandraConfiguration Enables the reuse of the connection configuration to a Cassandra server in the
same Job.
tCassandraLookupInput Extracts the desired data from a standard or super column family of a Cassandra
keyspace so as to apply changes to the data.
tCassandraInput Extracts the desired data from a standard or super column family of a Cassandra
keyspace so as to apply changes to the data.
tCassandraOutput Writes data into or deletes data from a column family of a Cassandra keyspace.
tCassandraOutputBulk Prepares an SSTable of large size and processes it according to your needs
before loading this SSTable into a column family of a Cassandra keyspace.
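As an illustration of the write and read operations handled by tCassandraOutput and tCassandraInput, the sketch below uses the DataStax Java driver (3.x series); the contact point, keyspace, table and columns are assumptions for the example.

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;

public class CassandraReadWrite {
    public static void main(String[] args) {
        try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
             Session session = cluster.connect("demo_keyspace")) {
            // tCassandraOutput-like write into a column family (table)
            session.execute("INSERT INTO customers (id, name) VALUES (42, 'Jane Doe')");
            // tCassandraInput-like read of the desired columns
            ResultSet rs = session.execute("SELECT id, name FROM customers");
            for (Row row : rs) {
                System.out.println(row.getInt("id") + " -> " + row.getString("name"));
            }
        }
    }
}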
Cassandra scenario
Handling data with Cassandra
Change Data Capture
Change Data Capture components
tDB2CDC Extracts the changes done to the source operational data and makes them
available to the target system(s) using database CDC views.
tInformixCDC Extracts the data from a source system which has changed since the last
extraction and transports it to another/other system(s).
tIngresCDC (deprecated) Extracts source system data that has changed since the last extraction and
transports it to another/other system(s).
tMSSqlCDC Extracts the changes made to the source operational data and makes them
available to the target system(s) using database CDC views.
tMysqlCDC Extracts only the changes made to the source operational data and makes
them available to the target system(s) using database CDC views.
tOracleCDC Extracts source system data that has changed since the last extraction and
transports it to another/other system(s).
tPostgresqlCDC Addresses data extraction and transportation needs by extracting only the changes made to the source operational data and making them available to the target system(s) using database CDC views.
tSybaseCDC Extracts source system data that has changed since the last extraction and
transports it to another/other system(s).
tTeradataCDC Extracts source system data that has changed since the last extraction and
transports it to another system(s) using the CDC Trigger mode.
Chart
Chart components
tBarChart Generates a bar chart from the input data to ease technical analysis.
tLineChart Reads data from an input flow and transforms the data into a line chart in a
PNG image file to ease technical analysis.
Chart scenarios
Creating a bar chart from the input data
Creating a line chart to ease trend analysis
Cloud
Cloud components
tCloudStop Changes the status of a launched instance on Amazon EC2 (Amazon Elastic
Compute Cloud).
CombinedSQL
CombinedSQL components
tCombinedSQLInput Extracts fields from a database table based on its schema definition.
tCombinedSQLOutput Inserts records from the incoming flow to an existing database table.
CombinedSQL scenario
Filtering and aggregating table columns directly on the DBMS
Context
Context components
tContextDump Copies the context setup of the current Job to a flat file, a database table,
etc., which can then be used by tContextLoad.
Context scenario
Reading data from different MySQL databases using dynamically loaded connection parameters
CosmosDB
CosmosDB components
tCosmosDBSQLAPIInput Retrieves data from a Cosmos database collection through SQL API.
tCosmosDBBulkLoad Imports data files in different formats (CSV, TSV or JSON) into the specified
Cosmos database so that the data can be further processed.
Couchbase
Couchbase components
tCouchbaseDCPInput Queries the documents from the Couchbase database, under the Database
Change Protocol (DCP), a streaming protocol.
tCouchbaseDCPOutput Upserts documents in the Couchbase database based on the incoming flat
data from preceding components, under the Database Change Protocol
(DCP), a streaming protocol.
tCouchbaseOutput Upserts documents in the Couchbase database based on the incoming flat
data from preceding components.
Couchbase scenario
Querying JSON documents from a Couchbase database with a N1QL query
CyberArk
CyberArk component
tCyberarkInput Retrieves the content of a secret object (usually, a password) stored in a
Cyberark vault at runtime. The retrieved content is stored in the after
variable SECRET, which can be referenced by any subsequent components
in the Job. The content can also be passed to the subsequent component
in a column named secret through a Row > Main connection.
CyberArk scenario
Accessing a password-protected file
Data mapping
tHConvertFile Uses Talend Data Mapper structures to perform a conversion from one
representation to another, as a Spark Batch execution.
tHMapFile Runs a Talend Data Mapper map where input and output structures may
differ, as a Spark batch execution.
tHMapInput Runs a Talend Data Mapper map where input and output structures may
differ, as a Spark batch execution, and sends the data for use by a
downstream component.
tHMapRecord Runs a Talend Data Mapper map where input and output structures may
differ, as a Spark streaming execution.
Data Preparation
tDatasetInput Creates a flow with data from a Talend Data Preparation dataset.
Data Quality
Address standardization
Address standardization components
tAddressRowCloud Verifies and formats international addresses in the Cloud by using online
services.
tBatchAddressRowCloud Uses batch processing to parse address data and get formatted addresses
quickly, accurately and without installing any software.
Address standardization scenarios
Editing the mapping of the verification codes from address validation providers to Talend verification levels
Parsing addresses against reference data in the Cloud
Parsing addresses against reference data in the Cloud using batch processing
Continuous matching
Continuous matching components
tMatchIndex Indexes a clean and deduplicated data set in ElasticSearch for continuous
matching purposes.
tMatchIndexPredict Compares a new data set with a lookup data set stored in ElasticSearch,
using tMatchIndex. tMatchIndexPredict outputs unique records and
suspect duplicates in separate files.
Data extraction
Data extraction components
tExtractRegexFields Extracts data and generates multiple columns from a formatted string
using regex matching.
tPatternExtract Outputs all data that match a given pattern. You can then implement any
required operation on the extracted data.
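The idea behind tExtractRegexFields, generating several columns from one formatted string by means of capturing groups, can be sketched with the standard java.util.regex API; the phone-number pattern and the reject handling are illustrative assumptions.

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ExtractRegexFields {
    public static void main(String[] args) {
        // One formatted string split into several columns by capturing groups
        Pattern phone = Pattern.compile("(\\d{3})-(\\d{3})-(\\d{4})");
        String input = "555-123-4567";
        Matcher m = phone.matcher(input);
        if (m.matches()) {
            String areaCode = m.group(1);
            String exchange = m.group(2);
            String line = m.group(3);
            System.out.println(areaCode + " | " + exchange + " | " + line);
        } else {
            System.out.println("rejected: " + input);   // non-matching rows would go to a reject flow
        }
    }
}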
Data matching
Data matching components
tMatchGroup Creates groups of similar data records in any source data including large
volumes of data by using one or several match rules.
tRecordMatching Ensures the data quality of any source data against a reference data source.
Data matching scenarios
Grouping output data in separate flows according to the minimal distance computed in each record
Matching customer data through multiple passes
Matching data through multiple passes using Map/Reduce components
Matching entries using the Q-grams and Levenshtein algorithms
Using a custom matching algorithm to match entries
Using survivorship functions to merge two records and create a master record
Data privacy
Data privacy components
tDataMasking Hides original data with random characters or figures to protect the actual
data while having a functional substitute for occasions when it is not
advisable to show sensitive real data.
tDataShuffling Shuffles the data in an input table to protect the actual data while having a functional data set. Data will remain usable for purposes such as testing and training.
tDataUnmasking Unmasks data masked with the tDataMasking component to retrieve the
original data.
tDuplicateRow Creates duplicates with meaningful data for data quality functional testing
purposes.
tPatternMasking Masks data that follows a specific pattern and can transform the original data in a consistent manner, if needed.
tPatternUnmasking Unmasks data masked with the tPatternMasking component to retrieve the
original data.
Deduplication
Deduplication components
Deduplication scenarios
Email validation
Email validation component
tVerifyEmail Verifies if email addresses comply with specific rules and corrects
addresses that do not match the rules by using the content from specific
columns.
Formatting
Formatting component
tChangeFileEncoding Transforms the character encoding of a given file and generates a new file
with the transformed character encoding.
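What tChangeFileEncoding does can be sketched in a few lines of plain Java: read the bytes with the source character set and write them back with the target one. The file paths and the ISO-8859-1 to UTF-8 conversion are assumptions for the example.

import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class ChangeFileEncoding {
    public static void main(String[] args) throws IOException {
        // Decode with the source encoding, re-encode with the target encoding
        byte[] raw = Files.readAllBytes(Paths.get("/tmp/input_latin1.txt"));
        String text = new String(raw, Charset.forName("ISO-8859-1"));
        Files.write(Paths.get("/tmp/output_utf8.txt"), text.getBytes(StandardCharsets.UTF_8));
    }
}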
Formatting scenario
Fuzzy matching
Fuzzy matching components
tBlockedFuzzyJoin (deprecated) Helps ensure the data quality of any source data against a reference data source.
tFuzzyJoin (deprecated) Joins two tables by doing a fuzzy match on several columns, comparing columns from the
main flow with reference columns from the lookup flow and outputting the main flow data
and/or the rejected data.
tFuzzyMatch Compares a column from the main flow with a reference column from the lookup flow and
outputs the main flow data displaying the distance.
tFuzzyUniqRow Compares columns in the input flow by using a defined matching method and collects the
encountered duplicates.
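Several of the fuzzy-matching components offer the Levenshtein algorithm as one of their matching methods. The sketch below is a plain-Java version of that edit-distance computation, a small distance meaning the two values are likely duplicates; the sample strings are illustrative only.

public class Levenshtein {
    // Classic dynamic-programming edit distance between two strings
    static int distance(String a, String b) {
        int[][] d = new int[a.length() + 1][b.length() + 1];
        for (int i = 0; i <= a.length(); i++) d[i][0] = i;
        for (int j = 0; j <= b.length(); j++) d[0][j] = j;
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                d[i][j] = Math.min(Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1), d[i - 1][j - 1] + cost);
            }
        }
        return d[a.length()][b.length()];
    }

    public static void main(String[] args) {
        System.out.println(distance("Talend", "Talemd"));   // 1: likely the same entry
        System.out.println(distance("Paris", "London"));    // 6: clearly different
    }
}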
Identification
Identification components
tGenKey Generates a functional key from the input columns, by applying different
types of algorithms on each column and grouping the computed results in
one key, then outputs this key with the input columns.
tAddCRCRow Provides a unique ID which helps improve the quality of processed data. CRC stands for Cyclical Redundancy Checking.
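A minimal sketch of the idea behind tAddCRCRow, deriving a CRC checksum from a row's concatenated columns and appending it as an extra column, using the standard java.util.zip.CRC32 class; the sample row and separator are assumptions.

import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

public class AddCrcToRow {
    public static void main(String[] args) {
        String row = "42;Jane;Doe;jane.doe@example.com";   // concatenated columns of one row
        CRC32 crc = new CRC32();
        crc.update(row.getBytes(StandardCharsets.UTF_8));
        System.out.println(row + ";" + crc.getValue());    // row with the CRC appended as a new column
    }
}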
Identification scenarios
Comparing columns and grouping in the output flow duplicate records that have the same functional key
Generating functional keys in the output flow
Adding a surrogate key to a file
Matching with machine learning
Matching with machine learning components
tMatchPairing Enables you to compute pairs of suspect duplicates from any source data
including large volumes in the context of machine learning on Spark.
tMatchPredict Labels suspect records automatically and groups suspect records which
match the label(s) set in the component properties.
Microsoft SQL Server validation
Microsoft SQL Server validation components
tMSSqlValidRows Extracts DB rows that match a given data quality business rule.
MySQL validation
MySQL validation components
tMySQLInvalidRows Checks MySQL database rows against specific Data Quality patterns
(regular expression) or Data Quality rules (business rule).
tMySQLValidRows Checks MySQL database rows against Data Quality patterns (regular
expression).
Name standardization
Name standardization component
Oracle validation
Oracle validation components
tOracleInvalidRows Checks Oracle database rows against specific Data Quality patterns
(regular expression) or Data Quality rules (business rule).
tOracleValidRows Checks Oracle database rows against Data Quality patterns (regular
expression).
Pattern validation
Pattern validation components
tFindRegexlibExpressions Returns a dataset holding information about all of the regular expressions
that match the request sent to the web server.
tLastRegexlibExpressions Returns a dataset holding information about the N most recent regular
expressions added to the library and that match the query at
http://regexlib.com.
tMultiPatternCheck Checks all existing data in multiple columns against a given Java regular
expression.
tPatternCheck Gives two output flows: Matching Data and Non-Matching Data. The first
collects all data that match a given pattern, and the second collects all data
that do not match a given pattern. You can then implement any required
corrections.
PostgreSQL validation
PostgreSQL validation components
tPostgresqlInvalidRows Extracts DB rows that do not match a given data quality pattern so that you can then implement any required correction.
QAS address standardization
QAS address standardization components
tQASAddressRow Corrects any formatting or spelling errors and gives the verification status for each row.
tQASAddressUnknown (deprecated) Gives one output flow, Unknown, which collects all addresses that do not match deliverable results in the QuickAddress data.
tQASAddressVerified (deprecated) Gives three output flows: Verified, Interaction required, and Reject.
tQASBatchAddressRow Corrects any formatting or spelling errors, adds missing data and gives the verification status for each row.
QAS address standardization scenarios
Editing addresses against QAS files and giving the verification status
Editing addresses and giving the verification status
Reporting
Reporting components
tDqReportRun Launches the analyses listed in a report and saves the results in the data quality data mart.
Reporting scenarios
Sampling
Sampling component
Sampling scenario
Standardization
Standardization components
tStandardizeRow Normalizes the incoming data in a separate XML or JSON data flow to
separate or standardize the rule-compliant data from the non-compliant
data.
Standardization scenarios
Synonym index
Synonym index components
tSynonymOutput Creates a Lucene index and feeds it with entries and the related synonyms
it receives.
tSynonymSearch Searches a given index for the reference entries matching the data you
input.
Text standardization
Text standardization components
tTransliterate Converts strings from many languages of the world to a standard set of characters
(Universal Coded Character Set, UCS).
Uniserv
Uniserv components
tUniservBTGeneric (deprecated) Executes a process created with the Uniserv product DQ Batch Suite.
tUniservRTMailOutput (deprecated) Synchronizes the index pool that is used for duplicate search.
tUniservRTMailSearch (deprecated) Searches for duplicate values based on a given input record and adds additional data to each record.
tUniservRTPost (deprecated) Improves the address quality, which is extremely important for CRM and e-business as it is directly related to postage and advertising costs.
Uniserv scenarios
Validation
Validation component
tSchemaComplianceCheck Validates all input rows against a reference schema and rejects the rows that do not comply, to ensure the data quality of any source data.
Validation scenario
Data Stewardship
Data Stewardship components
tDataStewardshipTaskDelete Connects to Talend Data Stewardship and deletes the data stored in campaigns in the
form of tasks.
tDataStewardshipTaskInput Connects to Talend Data Stewardship and retrieves the data stored in campaigns in
the form of tasks.
tDataStewardshipTaskOutput Connects to Talend Data Stewardship and loads data into campaigns in the form of
tasks. The tasks must have the same schema defined in the campaign.
Database utility
Databricks
Databricks components
tDBFSConnection Connects to a given DBFS (Databricks Filesystem) system so that the other
DBFS components can reuse the connection it creates to communicate
with this DBFS.
tDBFSGet Copies files from a given DBFS (Databricks Filesystem) system, pastes them in a user-defined directory and, if need be, renames them.
tDBFSPut Connects to a given DBFS (Databricks Filesystem) system, copies files from a user-defined directory, pastes them in this system and, if need be, renames these files.
Databricks scenarios
Writing and reading data from Azure Data Lake Storage using Spark (Azure Databricks)
Writing and reading data from S3 (Databricks on AWS)
DB Generic
DB Generic components
tDBCDC Extracts only the changes made to the source operational data and makes
them available to the target system(s) using database CDC views.
tDBInvalidRows Checks database rows against specific Data Quality patterns (regular
expression) or Data Quality rules (business rule).
tDBValidRows Checks database rows against Data Quality patterns (regular expression).
tDBColumnList Iterates on all columns of a given database table and lists column names.
tDBCommit Validates the data processed through the Job into the connected database.
tDBConnection Opens a connection to a database to be reused in the subsequent subJob
or subJobs.
tDBLastInsertId Obtains the primary key value of the record that was last inserted in a
database table by a user.
tDBOutputBulk Writes a file with columns based on the defined delimiter and the
standards of the selected database type.
tDBSCDELT Reflects and tracks changes in a dedicated SCD table through SQL queries.
tDBTableList Lists the names of specified database tables using a SELECT statement
based on a WHERE clause.
DB2
DB2 components
tDB2BulkExec Executes the Insert action on the provided data and gains in performance
during Insert operations to a DB2 database.
tDB2Commit Commits a global transaction in one go instead of committing on every row or every batch, which improves performance.
tDB2Connection Opens a connection to the specified database that can then be reused in
the subsequent subJob or subJobs.
tDB2Input Executes a DB query with a strictly defined order which must correspond to
the schema definition. Then tDB2Input passes on the field list to the next
component via a Row > Main link.
tDB2Output Executes the action defined on the table and/or on the data contained in
the table, based on the flow incoming from the preceding component in
the Job.
tDB2Rollback Avoids involuntarily committing part of a transaction.
tDB2Row Acts on the actual DB structure or on the data (although without handling data) depending on the nature of the query and the database. The SQLBuilder tool helps you easily write your SQL statements.
DBFS
DBFS components
tDBFSConnection Connects to a given DBFS (Databricks Filesystem) system so that the other
DBFS components can reuse the connection it creates to communicate
with this DBFS.
tDBFSGet Copies files from a given DBFS (Databricks Filesystem) system, pastes them in a user-defined directory and, if need be, renames them.
tDBFSPut Connects to a given DBFS (Databricks Filesystem) system, copies files from a user-defined directory, pastes them in this system and, if need be, renames these files.
Delimited
Delimited components
tFileStreamInputDelimited Reads data continuously, row by row, to split it into fields, then sends fields
defined in its schema to the next Job component, via a Row > Main link.
tFileInputDelimited Reads a delimited file row by row to split the rows up into fields and then sends the fields as defined in the schema to the next component.
tFileOutputDelimited Outputs the input data to a delimited file according to the defined schema.
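The row-by-row read-and-split behaviour described for tFileInputDelimited can be sketched with plain Java I/O as below; the file path, the semicolon separator and the two schema columns are assumptions for illustration.

import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

public class ReadDelimitedFile {
    public static void main(String[] args) throws IOException {
        // Row-by-row read, split on the field separator, map positions to schema columns
        try (BufferedReader reader = Files.newBufferedReader(Paths.get("/tmp/customers.csv"))) {
            String line;
            while ((line = reader.readLine()) != null) {
                String[] fields = line.split(";", -1);   // -1 keeps trailing empty fields
                String id = fields[0];
                String name = fields[1];
                System.out.println(id + " -> " + name);
            }
        }
    }
}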
Delimited scenarios
Reading data from a delimited file and displaying the output
Reading data from a remote file in streaming mode
Using a pivot column to aggregate data
Utilizing Output Stream to save filtered data to a local file
Writing data in a delimited file
Delta Lake
tDeltaLakeConnection Opens a connection to the specified database that can then be reused in
the subsequent subJob or subJobs.
tDeltaLakeInput Extracts the latest version or a given snapshot of records from the Delta
Lake layer of your Data Lake system and sends the data to the next
component for further processing.
tDeltaLakeOutput Writes records in the Delta Lake layer of your Data Lake system in the
Parquet format.
tDeltaLakeRow Acts on the actual DB structure or on the data (although without handling data), with the SQLBuilder tool helping you easily write your SQL statements.
DotNET
DotNET components
tDotNETInstantiate Invokes the constructor of a .NET object that is intended for later
reuse.
DotNET scenarios
Integrating .Net into Talend Studio: Introduction
Utilizing .NET in Talend
Dropbox
Dropbox components
tDropboxConnection Creates a Dropbox connection to a given account that the other Dropbox
components can reuse.
tDropboxPut Uploads data to Dropbox from either a local file or a given data flow.
Dropbox scenario
Uploading files to Dropbox
Dynamic Schema
ElasticSearch
ElasticSearch components
tElasticSearchConfiguration Enables the reuse of the connection configuration to ElasticSearch in the same
Job.
tElasticSearchLookupInput Executes an ElasticSearch query with a strictly defined order which must correspond to the schema definition.
ELT Greenplum
tELTGreenplumInput Adds as many Input tables as required for the most complicated Insert
statement.
tELTGreenplumMap Uses the tables provided as input to feed the parameter in the built
statement. The statement can include inner or outer joins to be
implemented between tables or between one table and its aliases.
tELTGreenplumOutput Executes the SQL Insert, Update and Delete statements on the Greenplum database.
ELT Hive
tELTHiveInput Replicates the schema of the input Hive table, which the tELTHiveMap component that follows will use.
tELTHiveOutput Works alongside tELTHiveMap to write data into the Hive table.
ELT JDBC
tELTInput Adds as many Input tables as required for the SQL statement to be
executed.
tELTMap Uses the tables provided as input to feed the parameter in the built SQL
statement. The statement can include inner or outer joins to be
implemented between tables or between one table and its aliases.
tELTOutput Carries out the action on the table specified and inserts the data according
to the output schema defined in the ELT Mapper.
ELT MSSql
ELT MSSql components
tELTMSSqlInput Adds as many Input tables as required for the most complicated Insert
statement.
tELTMSSqlMap Uses the tables provided as input to feed the parameter in the built
statement. The statement can include inner or outer joins to be
implemented between tables or between one table and its aliases.
tELTMSSqlOutput Executes the SQL Insert, Update and Delete statements on the MSSql database.
ELT MSSql scenarios
Aggregating Snowflake data using context variables as table and connection names
Aggregating table columns and filtering
Mapping data using a simple implicit join
Mapping data using a subquery
Mapping data using an Alias table
ELT MySQL
tELTMysqlInput Adds as many Input tables as required for the most complicated Insert
statement.
tELTMysqlMap Uses the tables provided as input to feed the parameter in the built
statement. The statement can include inner or outer joins to be
implemented between tables or between one table and its aliases.
tELTMysqlOutput Executes the SQL Insert, Update and Delete statements on the Mysql database.
ELT Netezza
ELT Netezza components
tELTNetezzaInput Allows you to add as many Input tables as required for the most
complicated Insert statement.
tELTNetezzaMap Uses the tables provided as input, to feed the parameter in the built
statement. The statement can include inner or outer joins to be
implemented between tables or between one table and its aliases.
tELTNetezzaOutput Performs the action (insert, update or delete) on data in the specified
Netezza table through the SQL statement generated by the
tELTNetezzaMap component.
ELT Oracle
tELTOracleInput Provides the Oracle table schema that will be used by the tELTOracleMap
component to generate the SQL SELECT statement.
tELTOracleMap Builds the SQL SELECT statement using the table schema(s) provided by
one or more tELTOracleInput components.
tELTOracleOutput Performs the action (insert, update, delete, or merge) on data in the
specified Oracle table through the SQL statement generated by the
tELTOracleMap component.
ELT PostgreSQL
tELTPostgresqlInput Provides the Postgresql table schema that will be used by the
tELTPostgresqlMap component to generate the SQL SELECT statement.
tELTPostgresqlMap Builds the SQL SELECT statement using the table schema(s) provided by
one or more tELTPostgresqlInput components.
tELTPostgresqlOutput Performs the action (insert, update or delete) on data in the specified
Postgresql table through the SQL statement generated by the
tELTPostgresqlMap component.
ELT Sybase
tELTSybaseInput Provides the Sybase table schema that will be used by the tELTSybaseMap
component to generate the SQL SELECT statement.
tELTSybaseMap Builds the SQL SELECT statement using the table schema(s) provided by
one or more tELTSybaseInput components.
tELTSybaseOutput Performs the action (insert, update or delete) on data in the specified
Sybase table through the SQL statement generated by the tELTSybaseMap
component.
ELT Teradata
tELTTeradataInput Provides the Teradata table schema that will be used by the
tELTTeradataMap component to generate the SQL SELECT statement.
tELTTeradataMap Builds the SQL SELECT statement using the table schema(s) provided by
one or more tELTTeradataInput components.
tELTTeradataOutput Performs the action (insert, update or delete) on data in the specified
Teradata table through the SQL statement generated by the
tELTTeradataMap component.
ELT Vertica
ELT Vertica components
tELTVerticaInput Provides the Vertica table schema that will be used by the tELTVerticaMap
component to generate the SQL SELECT statement.
tELTVerticaMap Builds the SQL SELECT statement using the table schema(s) provided by
one or more tELTVerticaInput components.
tELTVerticaOutput Performs the action (insert, update or delete) on data in the specified
Vertica table through the SQL statement generated by the tELTVerticaMap
component.
ESB REST
tRESTClient Interacts with RESTful Web service providers by sending HTTP and HTTPS requests using CXF (JAX-RS) and getting the corresponding responses.
tRESTResponse Returns a specific HTTP status code to the client end as a response to the HTTP and/or HTTPS requests.
ESB SOAP
tESBConsumer Calls the defined method from the invoked Web service and returns the
class as defined, based on the given parameters.
tESBProviderFault Serves a Talend Job cycle result as a Fault message of the Web service in
case of a request response communication style.
EXASolution
EXASolution components
tExasolBulkExec Imports data into an EXASolution database table using the IMPORT
command provided by the EXASolution database in a fast way.
tExasolCommit Validates the data processed through the Job into the connected
EXASolution database.
EXASolution scenario
Importing data into an EXASolution database table from a local CSV file
Excel
Excel components
tFileInputExcel Reads an Excel file row by row to split the rows up into fields using regular expressions and then sends the fields as defined in the schema to the next component.
tFileOutputExcel Writes an MS Excel file with separated data values according to a defined
schema.
Excel scenario
Extracting data from specific Excel cells
EXist
EXist components
tEXistConnection (deprecated) Opens a connection to an eXist database in order that a transaction may be carried out.
tEXistDelete (deprecated) Deletes specified resources from a remote eXist database.
tEXistGet (deprecated) Retrieves selected resources from a remote eXist database to a defined local directory.
tEXistPut (deprecated) Uploads specified files from a defined local directory to a remote eXist database.
tEXistXQuery (deprecated) Queries XML files located on remote databases using local files containing XPath queries
and outputs the results to an XML file stored locally.
tEXistXUpdate (deprecated) Processes XML file records and updates the existing records on the database server.
EXist scenario
Retrieving resources from a remote eXist DB server
Firebird
Firebird components
tFirebirdConnection Opens a connection to the specified database that can then be reused in
the subsequent subJob or subJobs.
tFirebirdOutput Executes the action defined on the table in a Firebird database and/or on
the data contained in the table, based on the flow incoming from the
preceding component in the Job.
tFirebirdRow Executes the stated SQL query on the specified Firebird database.
Flume
Flume components
tFlumeInput Acts as an interface to integrate Flume and the Spark Streaming Job developed with the Studio to continuously read data from a given Flume agent.
tFlumeOutput Acts as an interface to integrate Flume and the Spark Streaming Job developed with the Studio to continuously send data to a given Flume agent.
FTP
FTP components
tFTPFileList Lists all files and folders directly under a specified directory based on a
filemask pattern.
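Listing files under a directory with a filemask, as tFTPFileList does, can be sketched with the Apache Commons Net FTPClient; the host, credentials, directory and .csv filter below are placeholders, not values from this guide.

import org.apache.commons.net.ftp.FTPClient;
import org.apache.commons.net.ftp.FTPFile;

public class FtpFileList {
    public static void main(String[] args) throws Exception {
        FTPClient ftp = new FTPClient();
        ftp.connect("ftp.example.com");
        try {
            ftp.login("user", "password");
            ftp.enterLocalPassiveMode();
            for (FTPFile file : ftp.listFiles("/incoming")) {
                if (file.isFile() && file.getName().endsWith(".csv")) {   // filemask-style filter
                    System.out.println(file.getName() + " (" + file.getSize() + " bytes)");
                }
            }
            ftp.logout();
        } finally {
            ftp.disconnect();
        }
    }
}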
FTP scenarios
Listing and getting files/folders on an FTP directory
Putting files onto an FTP server
Renaming a file located on an FTP server
FullRow
FullRow components
tFileStreamInputFullRow Reads data in a newly-created file row by row and sends the entire rows
within one single field to the next Job component, via a Row > Main link.
tFileInputFullRow Reads a file row by row and sends complete rows of data as defined in the
schema to the next component via a Row link.
FullRow scenario
Reading full rows in a delimited file
Global variable
Global variable components
tGlobalVarLoad Sets variables using the incoming data so that the data can be dynamically
reused by other subJobs.
tSetGlobalVar Facilitates the process of defining global variables.
Google BigQuery
Google BigQuery components
tBigQueryOutputBulk Creates a .txt or .csv file for large volumes of data so that you can process the data according to your needs before transferring it to Google BigQuery.
tBigQuerySQLRow Connects to Google BigQuery and performs queries to select data from
tables row by row or create or delete tables in Google BigQuery.
Google Dataproc
tGoogleDataprocManage Creates or deletes a Dataproc cluster in the Global region on Google Cloud
Platform.
Google Drive
tGoogleDriveConnection Opens a Google Drive connection that can be reused by other Google Drive
components.
tGoogleDriveList Lists all files, or folders, or both files and folders in a specified Google Drive
folder, in the domain, including both Shared Drive and My Drive, and all
shared drives.
tGoogleDrivePut Uploads data from a data flow or a local file to Google Drive.
Google PubSub
tPubSubInput Connects to the Google Cloud PubSub service that transmits messages to
the components that run transformations over these messages.
tPubSubInputAvro Connects to Google Cloud Pub/Sub to receive messages in the Avro format
for the components that run transformations over these messages.
tPubSubOutput Receives messages serialized into byte arrays by its preceding component
and issues these messages into a given PubSub service.
GPG
GPG component
tGPGDecrypt Calls the gpg -d command to decrypt a GnuPG-encrypted file and saves
the decrypted file in the specified directory.
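Since tGPGDecrypt calls the gpg -d command line, the equivalent plain-Java sketch simply launches that command through a ProcessBuilder; the file paths are assumptions, and the example assumes the key passphrase is supplied by the local gpg agent.

import java.io.File;

public class GpgDecrypt {
    public static void main(String[] args) throws Exception {
        // Invoke the gpg command line, decrypting report.txt.gpg into a chosen output file
        ProcessBuilder pb = new ProcessBuilder(
                "gpg", "--batch", "--yes", "-o", "/tmp/report.txt", "-d", "/tmp/report.txt.gpg");
        pb.redirectErrorStream(true);
        pb.redirectOutput(new File("/tmp/gpg.log"));   // capture gpg messages for inspection
        int exitCode = pb.start().waitFor();
        System.out.println("gpg exit code: " + exitCode);
    }
}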
GPG scenario
Decrypting a GnuPG-encrypted file and displaying its content
Greenplum
Greenplum components
tGreenplumCommit Commits a global transaction in one go instead of repeating the operation for every row or every batch, which improves performance.
tGreenplumConnection Opens a connection to the specified database that can then be reused in the
subsequent subJob or subJobs.
tGreenplumGPLoad Bulk loads data into a Greenplum table either from an existing data file, an input
flow, or directly from a data flow in streaming mode through a named-pipe.
tGreenplumOutput Executes the action defined on the table and/or on the data of a table, according
to the input flow from the previous component.
tGreenplumOutputBulk Prepares the file to be used as parameter in the INSERT query to feed the
Greenplum database.
tGreenplumRow Acts on the actual DB structure or on the data (although without handling data),
depending on the nature of the query and the database.
Groovy
Groovy components
tGroovy Broadens the functionality of the Job, using the Groovy language, which is a simplified Java syntax.
tGroovyFile Broadens the functionality of Jobs using the Groovy language which is a
simplified Java syntax.
Groovy scenario
Calling a file which contains Groovy code
GS
GS components
tGoogleCloudConfiguration Provides the connection configuration to Google Cloud Platform for a Spark Job.
tGSConfiguration Provides the connection configuration to Google Cloud Storage for a Spark Job.
tGSBucketCreate Creates a new bucket which you can use to organize data and control access to
data in Google Cloud Storage.
tGSBucketExist Checks the existence of a bucket in Google Cloud Storage so as to perform further
operations on it.
tGSBucketList Retrieves a list of buckets from all projects or one specific project in Google
Cloud Storage.
tGSClose Closes an active connection to Google Cloud Storage in order to release the
occupied resources.
tGSConnection Provides the authentication information for making requests to the Google
Cloud Storage system and enables the reuse of the connection it creates to
Google Cloud Storage.
tGSCopy Copies or moves objects within a bucket or between buckets in Google Cloud
Storage.
tGSDelete Deletes the objects which match the specified criteria in Google Cloud Storage so
as to release the occupied resources.
tGSGet Retrieves objects which match the specified criteria from Google Cloud Storage
and outputs them to a local directory.
tGSList Retrieves a list of objects from Google Cloud Storage one by one.
tGSPut Uploads files from a local directory to Google Cloud Storage so that you can
manage them with Google Cloud Storage.
GS scenario
Managing files with Google Cloud Storage
HBase
HBase components
tHBaseConfiguration Enables the reuse of the connection configuration to HBase in the same
Job.
tHBaseInput Reads data from a given HBase database and extracts columns of selection.
HBase scenario
Exchanging customer data with HBase
HCatalog
HCatalog components
tHCatalogInput Reads data from an HCatalog managed Hive database and sends data to the
component that follows.
tHCatalogLoad Reads data directly from HDFS and writes this data into an established
HCatalog managed table.
tHCatalogOutput Receives data from its incoming flow and writes this data into an HCatalog
managed table.
HCatalog scenario
Managing HCatalog tables on Hortonworks Data Platform
HDFS
HDFS components
tHDFSConfiguration Enables the reuse of the connection configuration to HDFS in the same
Job.
tHDFSCompare Compares two files in HDFS and based on the read-only schema, generates
a row flow that presents the comparison information.
tHDFSConnection Connects to a given HDFS so that the other Hadoop components can reuse
the connection it creates to communicate with this HDFS.
tHDFSCopy Copies a source file or folder into a target directory in HDFS and removes
this source if required.
tHDFSDelete Deletes a file located on a given Hadoop distributed file system (HDFS).
tHDFSGet Copies files from the Hadoop distributed file system (HDFS), pastes them in a
user-defined directory and, if need be, renames them.
tHDFSInput Extracts the data in an HDFS file for other components to process it.
tHDFSList Retrieves a list of files or folders based on a filemask pattern and
iterates on each of them.
tHDFSOutput Writes data flows it receives into a given Hadoop distributed file system
(HDFS).
tHDFSOutputRaw Transfers data of different formats such as hierarchical data in the form of a
single column into a given HDFS file system.
tHDFSProperties Creates a single row flow that displays the properties of a file processed in
HDFS.
tHDFSPut Connects to Hadoop distributed file system to load large-scale files into it
with optimized performance.
tHDFSRowCount Reads a file in HDFS row by row in order to determine the number of rows
this file contains.
HDFS scenarios
Checking the existence of a file in HDFS
Computing data with Hadoop distributed file system
Using HDFS components to work with Azure Data Lake Storage (ADLS)
Iterating on an HDFS directory
Hive
Hive components
tHiveConfiguration Enables the reuse of the connection configuration to Hive in the same Job.
tHiveConnection Establishes a Hive connection to be reused by other Hive components in your Job.
tHiveCreateTable Creates Hive tables that fit a wide range of Hive data formats.
tHiveInput Extracts data from Hive and sends the data to the component that follows.
tHiveLoad Writes data of different formats into a given Hive table or exports data from a Hive
table to a directory.
tHiveOutput Connects to a given Hive database and writes the data it receives into a given Hive table
or a directory in HDFS.
tHiveRow Acts on the actual DB structure or on the data without handling data itself, depending
on the nature of the query and the database.
tHiveWarehouseConfiguration Enables the reuse of the Hive Warehouse Connector connection configuration to Hive in
the same Job.
tHiveWarehouseInput Extracts data from Hive and sends the data to the component that follows using Hive
Warehouse Connector.
tHiveWarehouseOutput Connects to a given Hive database and writes the received data into a given Hive table
or a directory in HDFS using Hive Warehouse Connector.
Hive scenarios
Creating a JDBC Connection to Azure HDInsight Hive
Creating a partitioned Hive table
HSQLDB
HSQLDB components
tHSQLDbInput Executes a DB query with a strictly defined order which must correspond to
the schema definition and then it passes on the field list to the next
component via a Main row link.
tHSQLDbOutput Executes the action defined on the table and/or on the data contained in
the table, based on the flow incoming from the preceding component in
the Job.
tHSQLDbRow Acts on the actual DB structure or on the data (although without handling
data), depending on the nature of the query and the database.
HTTP
HTTP component
tHttpRequest Sends an HTTP request to the server and outputs the response information
locally.
HTTP scenarios
Sending an HTTP request to the server and saving the response information to a local file
Sending a POST request from a local JSON file
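The tHttpRequest entry above sends a request and writes the response locally. A minimal standalone sketch of that round trip with the JDK's java.net.http client (Java 11+); the URL and output file are placeholders, not component settings:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.file.Path;

public class HttpGetSketch {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();

        // Placeholder endpoint; tHttpRequest takes the URL as a setting.
        HttpRequest request = HttpRequest.newBuilder(URI.create("https://example.com/api/data"))
                .GET()
                .build();

        // Save the response body to a local file, as in the first HTTP scenario above.
        HttpResponse<Path> response =
                client.send(request, HttpResponse.BodyHandlers.ofFile(Path.of("response.txt")));
        System.out.println("HTTP status: " + response.statusCode());
    }
}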
Impala
Impala components
tImpalaCreateTable Creates Impala tables that fit a wide range of Impala data formats.
tImpalaInput Executes the select queries to extract the corresponding data and sends
the data to the component that follows.
tImpalaLoad Writes data of different formats into a given Impala table or exports data
from an Impala table to a directory.
tImpalaOutput Executes the action defined on the data contained in the table, based on
the flow incoming from the preceding component in the Job.
tImpalaRow Acts on the actual DB structure or on the data (although without handling
data).
Informix
Informix components
tInformixBulkExec Executes Insert operations in Informix databases.
tInformixCommit Makes a global commit just once instead of committing every row or batch
of rows separately.
tInformixConnection Opens a connection to the specified database that can then be reused in
the subsequent subJob or subJobs.
tInformixOutput Executes the action defined on the table and/or on the data contained in
the table, based on the flow incoming from the preceding component in
the Job.
tInformixOutputBulk Prepares the file to be used as a parameter in the INSERT query used to
feed Informix databases.
tInformixOutputBulkExec Carries out Insert operations in Informix databases using the data
provided.
tInformixRow Acts on the actual DB structure or on the data (although without handling
data) thanks to the SQLBuilder that helps you easily write your SQL
statements.
Ingres
Ingres components
tIngresBulkExec (deprecated) Inserts data in bulk to a table in the Ingres DBMS for performance gain.
tIngresClose (deprecated) Closes the transaction committed in the connected Ingres database.
tIngresCommit (deprecated) Commits in one go, using a unique connection, a global transaction instead of doing that on
every row or every batch and thus provides gain in performance.
tIngresConnection (deprecated) Opens a connection to the specified database that can then be reused in the subsequent
subJob or subJobs.
tIngresInput (deprecated) Reads an Ingres database and extracts fields based on a query.
tIngresOutput (deprecated) Executes the action defined on the table and/or on the data contained in the table, based on
the flow incoming from the preceding component in the Job.
tIngresOutputBulk (deprecated) Prepares the file whose data is inserted in bulk to the Ingres DBMS for performance gain.
tIngresOutputBulkExec (deprecated) Inserts data in bulk to a table in the Ingres DBMS for performance gain.
tIngresRollback (deprecated) Avoids committing part of a transaction involuntarily by canceling the transaction committed
in the connected database.
tIngresRow (deprecated) Acts on the actual DB structure or on the data (although without handling data) using the
SQLBuilder tool to easily write your SQL statements.
Ingres scenario
Loading data to a table in the Ingres DBMS
Interbase
Interbase components
tInterbaseClose (deprecated) Closes the transaction committed in the connected Interbase database.
tInterbaseCommit (deprecated) Commits in one go a global transaction instead of doing that on every row or every batch
and thus provides gain in performance.
tInterbaseConnection (deprecated) Opens a connection to the specified database that can then be reused in the subsequent
subJob or subJobs.
tInterbaseInput (deprecated) Reads an Interbase database and extracts fields based on a query.
tInterbaseOutput (deprecated) Executes the action defined on the table and/or on the data contained in the table, based on
the flow incoming from the preceding component in the Job.
tInterbaseRollback (deprecated) Avoids committing part of a transaction involuntarily by canceling the transaction committed
in the connected Interbase database.
tInterbaseRow (deprecated) Acts on the actual database structure or on the data (although without handling data) using
the SQLBuilder tool to easily write your SQL statements.
Internet (Integration)
tFileFetch Retrieves a file through the given protocol (HTTP, HTTPS, FTP, or
SMB).
Jasper
Jasper components
tJasperOutput Creates a report in rich formats using Jaspersoft's iReport.
tJasperOutputExec Creates a report in rich formats using Jaspersoft's iReport and offers a
performance gain as it functions as a combination of an input component
and a tJasperOutput component.
Jasper scenario
Generating a report against a .jrxml template
tJavaMR Provides an editor that enables you to enter personalized MapReduce code
in order to integrate it into a Talend program.
tJavaStorm (deprecated) Provides a Java code editor that lets you enter the custom Storm code you
want to use in the Storm topology you are designing.
tJava Extends the functionalities of a Talend Job using custom Java commands.
tJavaFlex Provides a Java code editor that lets you enter personalized code in order
to integrate it into a Talend program.
tJavaRow Provides a code editor that lets you enter the Java code to be applied to
each row of the flow.
JavaDB
JavaDB components
tJavaDBOutput Executes the action defined on the table and/or on the data contained in
the table, based on the flow incoming from the preceding component in
the Job.
tJavaDBRow Acts on the actual database structure or on the data (although without
handling data) using the SQLBuilder tool to easily write your SQL
statements.
JBoss ESB
tJBossESBInput Retrieves a message from a JBossESB server to process it as a flow that can
be used in a Talend Job.
tJBossESBOutput Transforms the data used in a Talend Job into a JBossESB message.
JDBC
JDBC components
tJDBCCommit Commits in one go a global transaction instead of doing that on every row
or every batch and thus provides gain in performance.
tJDBCConnection Opens a connection to the specified database that can then be reused in
the subsequent subJob or subJobs.
tJDBCInput Reads any database using a JDBC API connection and extracts fields based
on a query (see the plain-JDBC sketch after this list).
tJDBCOutput Executes the action defined on the data contained in the table, based on
the flow incoming from the preceding component in the Job.
tJDBCRollback Avoids commiting part of a transaction accidentally by canceling the
transaction committed in the connected database.
tJDBCRow Acts on the actual DB structure or on the data (although without handling
data) using the SQLBuilder tool to easily write your SQL statements.
tJDBCTableList Lists the names of a given set of JDBC tables using a select statement
based on a Where clause.
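The tJDBCConnection, tJDBCInput and tJDBCRow entries above all build on the standard JDBC API. A minimal sketch of that read pattern in plain JDBC, under assumed placeholder values for the driver URL, credentials, table and columns:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class JdbcReadSketch {
    public static void main(String[] args) throws Exception {
        // Placeholder URL and credentials; tJDBCConnection holds these as settings.
        String url = "jdbc:postgresql://localhost:5432/sales";

        try (Connection conn = DriverManager.getConnection(url, "user", "secret");
             Statement stmt = conn.createStatement();
             // The column order of the query must match the schema definition,
             // as the tJDBCInput entry notes.
             ResultSet rs = stmt.executeQuery("SELECT id, name, amount FROM orders")) {
            while (rs.next()) {
                System.out.printf("%d %s %s%n",
                        rs.getInt("id"), rs.getString("name"), rs.getBigDecimal("amount"));
            }
        }
    }
}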
JIRA
JIRA components
tJIRAInput Retrieves the issue information based on a JQL query or retrieves the
project information based on a specified project ID from JIRA.
JIRA scenarios
Creating an issue in JIRA application
Retrieving the project information from JIRA application
Updating an issue in JIRA application
JMS
JMS components
JMS scenario
Enqueuing/dequeuing a message on the ActiveMQ server
JSON
JSON components
tFileStreamInputJSON Extracts JSON data from a file, then transfers the data to, for instance, a file
or a database table.
tFileInputJSON Extracts JSON data from a file and transfers the data to a file, a database
table, etc.
tFileOutputJSON Receives data and rewrites it in a JSON structured data block in an output
file.
JSON scenarios
Extracting JSON data from a URL
Extracting JSON data from a file using JSONPath
Extracting JSON data from a file using JSONPath without setting a loop node
Extracting JSON data from a file using XPath
Writing a JSON structured file
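Several of the JSON scenarios above rely on JSONPath queries. As a rough, standalone illustration of such a query, using the Jayway json-path library rather than the components themselves (the document and paths are made-up examples):

import com.jayway.jsonpath.JsonPath;
import java.util.List;

public class JsonPathSketch {
    public static void main(String[] args) {
        // Made-up JSON document; in a Job this would come from a file or a URL.
        String json = "{\"store\":{\"book\":[{\"title\":\"Dune\",\"price\":9.99},"
                    + "{\"title\":\"Solaris\",\"price\":7.50}]}}";

        // JSONPath queries of the kind configured in the JSONPath-based scenarios above.
        List<String> titles = JsonPath.read(json, "$.store.book[*].title");
        Double firstPrice = JsonPath.read(json, "$.store.book[0].price");

        System.out.println(titles + " / first price: " + firstPrice);
    }
}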
Kafka
Kafka components
tKafkaCreateTopic Creates a Kafka topic that the other Kafka components can use.
tKafkaInput Transmits messages you need to process to the components that follow in
the Job you are designing.
Kafka scenarios
Analyzing a Twitter flow in near real-time
Analyzing people's activities using a Storm topology (deprecated)
Kerberos
Kerberos component
Keystore
Keystore component
Keystore scenario
Extracting customer information from a private WSDL file
Kinesis
Kinesis components
tKinesisInput Acts as consumer of an Amazon Kinesis stream to pull messages from this
Kinesis stream.
tKinesisInputAvro Acts as consumer of an Amazon Kinesis stream to pull messages from this
Kinesis stream.
tKinesisOutput Acts as data producer to put data to an Amazon Kinesis stream for real-
time ingestion.
Kinesis scenario
Working with Amazon Kinesis and Big Data Streaming Jobs
Kudu
Kudu components
tKuduConfiguration Enables the reuse of the connection configuration to Cloudera Kudu in the
same Job.
tKuduInput Retrieves data from a Cloudera Kudu table and sends them to the
component that follows for transformation.
Kudu scenario
Writing and reading data from Cloudera Kudu using a Spark Batch Job
LDAP
LDAP components
tLDAPAttributesInput Analyses each object found via the LDAP query and lists a collection of
attributes associated with the object.
tLDAPInput Executes an LDAP query based on the given filter and corresponding to the
schema definition. Then it passes on the field list to the next component
via a Row > Main link.
tLDAPOutput Executes an LDAP query based on the given filter and corresponding to the
schema definition. Then it passes on the field list to the next component
via a Row > Main link.
LDIF
LDIF components
tFileInputLDIF Reads an LDIF file row by row to split them up into fields and sends the
fields as defined in the schema to the next component using a Row
connection.
tFileOutputLDIF Writes or modifies an LDIF file with data separated in respective entries
based on the schema defined, or else deletes content from an LDIF file.
LDIF scenario
Writing data from a database table into an LDIF file
Library import
Library import component
tAssert Generates a boolean evaluation of the Job execution status and provides
the Job status messages to tAssertCatcher.
tDie Triggers the tLogCatcher component for an exhaustive log before killing the
Job.
tFlowMeter Counts the number of rows processed in the defined flow, so this number
can be caught by the tFlowMeterCatcher component for logging purposes.
tLogCatcher Operates as a log function triggered by one of the three: Java exception,
tDie or tWarn, to collect and transfer log data.
tLogRow Displays data or results in the Run console to monitor data processed.
tStatCatcher Gathers the Job processing metadata at the Job level and at the
component level and transfers the log data to the subsequent component
for display or storage.
Machine Learning
tClassify Predicts which class an element belongs to, based on the classifier model generated by a
model training component.
tClassifySVM Predicts which class an element belongs to, based on the classifier model generated by
tSVMModel.
tDecisionTreeModel Analyzes feature vectors usually prepared and provided by tModelEncoder to generate a
classifier model that is used by tPredict to classify given elements.
tGradientBoostedTreeModel Analyzes feature vectors usually prepared and provided by tModelEncoder to generate a
classifier model that is used by tPredict to classify given elements.
tKMeansStrModel Analyzes incoming datasets in near real-time, based on applying the K-Means algorithm.
tMahoutClustering (deprecated) Groups unlabeled numerical data into clusters that can reveal interesting patterns or help
identify abnormal data items in the data set.
tModelEncoder Performs featurization operations to transform data into the format expected by the model
training components such as tLogisticRegressionModel or tKMeansModel.
tNaiveBayesModel Generates a classifier model that is used by tPredict to classify given elements.
tRecommend Recommends products to users known to this model, based on the user-product
recommender model generated by tALSModel.
tSVMModel Generates an SVM-based classifier model that can be used by tPredict to classify given
elements.
Mail
Mail components
tFileInputMail Reads the standard key data of a given MIME or MSG email file.
Mail scenarios
Extracting key fields from an email
Retrieving emails and extracting data from email files
Sending an email on error
Sending an email with attachment in HTML format
MapRDB
MapRDB components
tMapRDBInput Reads data from a given MapRDB database and extracts columns of
selection.
tMapROjaiInput Reads documents from a MapR-DB database to load the data in a given
Job.
MapRDB scenario
Writing candidate data in a MapR-DB OJAI database
MapRStreams
MapRStreams components
tMapRStreamsInputAvro Transmits messages in the Avro format to the Job that runs transformations over
these messages. Only MapR V5.2 onwards is supported by this component.
tMapRStreamsConnection Opens a reusable connection to a given MapR Streams cluster so that the other
MapR Streams components can reuse this connection.
tMapRStreamsCreateStream Creates a MapR Streams stream or topic that the other MapR Streams components
can use.
tMapRStreamsInput Transmits messages to the Job that runs transformations over these messages.
Only MapR V5.2 onwards is supported by this component.
tMapRStreamsOutput Publishes messages into a MapR Streams system. Only MapR V5.2 onwards is
supported by this component.
Marketo
Marketo components
tMarketoBulkExec Imports leads or custom objects into Marketo from a local file in the REST
API mode.
tMarketoCampaign Retrieves campaign records, activity and campaign changes related data
from Marketo.
tMarketoConnection Opens a connection to Marketo that can then be reused by other Marketo
components.
tMarketoInput Retrieves lead records, activity history, lead changes, and custom object
related data from Marketo.
tMarketoListOperation Adds/removes one or more leads to/from a list in Marketo. Also, it helps
you verify the existence of one or more leads in a list in Marketo.
tMarketoOutput Writes lead records or custom object records from the incoming data flow
into Marketo.
Marketo scenarios
Adding a lead record to a Marketo list using SOAP API
Transmitting data with Marketo using REST API
MarkLogic
MarkLogic components
tMarkLogicBulkLoad Imports local files into a MarkLogic server database in bulk mode using the
MarkLogic Content Pump (MLCP) tool.
MaxDB
MaxDB components
tMaxDBRow Acts on the actual DB structure or on the data (although without handling
data), depending on the nature of the query and the database.
tMDMClose Terminates an open MDM server connection after the execution of the
preceding subJob.
tMDMCommit Commits all changes to the database made within the scope of a
transaction in MDM.
tMDMConnection Opens an MDM server connection for convenient reuse in the current Job
or transaction.
tMDMRollback Rolls back any changes made in the database rather than definitively
committing them, for example to prevent partial commits if an error
occurs.
tMDMBulkLoad Uses bulk mode to write XML structured master data into the MDM server.
tMDMDelete Deletes master data records from specific entities in the MDM Hub.
tMDMInput Reads data in an MDM Hub and thus makes it possible to process this data.
tMDMOutput Writes data into or removes data from the MDM server.
tMDMRestInput Reads data through the REST API from the MDM Hub for further processing.
tMDMViewSearch Retrieves the MDM records from an MDM hub by applying filtering criteria
you have created in a specific view and puts out results in XML structure.
tMDMRouteRecord Helps Event Manager to identify the changes you have made on your data
so that correlative actions can be triggered.
tMDMTriggerInput Reads the XML message (Document type) sent by MDM and passes the
information to the component that follows.
tMDMTriggerOutput Receives an XML flow (Document type) from the preceding component in
the Job.
MemSQL
MemSQL components
tMemSQLConnection (deprecated) Opens a connection to the specified database that can then be reused in the subsequent
subJob or subJobs.
tMemSQLInput (deprecated) Executes a DB query with a strictly defined order which must correspond to the schema
definition.
tMemSQLOutput (deprecated) Reads data incoming from the preceding component in the Job and executes the action
defined on a given MemSQL table and/or on the data contained in the table.
tMemSQLRow (deprecated) Acts on the actual database structure or on the data (although without handling data).
MemSQL scenario
Writing data to and reading data from a MemSQL database table
Microsoft CRM
tMicrosoftCrmInput Extracts data from a Microsoft CRM database based on conditions set on
specific columns.
Microsoft MQ
Microsoft MQ components
tMicrosoftMQInput Retrieves the first message in a given Microsoft message queue (only the
String type is supported).
tMicrosoftMQOutput Writes a defined column of given inflow data to a Microsoft message queue
(only the String type is supported).
Microsoft MQ scenario
Writing and fetching queuing messages from Microsoft message queue
MOM
MOM components
MOM scenarios
Asynchronous communication via a MOM server
Transmitting XML files via a MOM server
Mondrian
Mondrian component
tMondrianInput (deprecated) Executes a multi-dimensional expression (MDX) query corresponding to the dataset
structure and schema definition.
Mondrian scenario
Extracting multi-dimensional datasets from a MySQL database (Cross-join tables)
MongoDB
MongoDB components
tMongoDBConfiguration Stores connection information and credentials to be reused by other MongoDB
components.
tMongoDBLookupInput Executes a database query with a strictly defined order which must correspond
to the schema definition.
tMongoDBBulkLoad Imports data files in different formats (CSV, TSV or JSON) into the specified
MongoDB database so that the data can be further processed.
tMongoDBConnection Creates a connection to a MongoDB database and reuse that connection in other
components.
tMongoDBGridFSDelete Automates the delete action over specific files in MongoDB GridFS.
tMongoDBGridFSProperties Obtains information about the properties of given files selected based on a
query.
tMongoDBInput Retrieves records from a collection in the MongoDB database and transfers them
to the following component for display or storage.
tMongoDBOutput Executes the action defined on the collection in the MongoDB database.
MongoDB scenarios
Reading and writing data in MongoDB using a Spark Streaming Job
Writing and reading data from MongoDB using a Spark Batch Job
Creating a collection and writing data to it
Importing data into MongoDB database
Managing files using MongoDB GridFS
Retrieving data from a collection by advanced queries
Upserting records in a collection
Using MongoDB functions to create a collection and write data to it
MQTT
MQTT components
tMQTTInput Acts as a consumer of an MQTT topic to stream messages from this topic.
tMQTTOutput Acts as a publisher to an MQTT topic to stream messages to this topic in real
time.
MS Delimited
MS Delimited components
MS Delimited scenario
Reading a multi structure delimited file
MS Positional
MS Positional components
MS Positional scenario
Reading data from a positional file
MS XML connectors
tFileInputMSXML Reads the data structures (schemas) of a multi-structured XML file and
sends the fields as defined in the different schemas to the next
components using Row connections.
MSSql
MSSql components
tMSSqlCommit Commits in one go, using a unique connection, a global transaction instead
of doing that on every row or every batch and thus provides gain in
performance.
tMSSqlConnection Opens a connection to the specified database that can then be reused in
the subsequent subJob or subJobs.
tMSSqlInput Executes a DB query with a strictly defined order which must correspond to
the schema definition.
tMSSqlLastInsertId Retrieves the last primary keys added by a user to a MSSql table.
tMSSqlOutput Executes the action defined on the table and/or on the data contained in
the table, based on the flow incoming from the preceding component in
the Job.
tMSSqlOutputBulk Prepares the file to be used as parameter in the INSERT query to feed the
MSSql database.
tMSSqlRollback Cancels the transaction commit in the MSSql database and thus avoids
committing part of a transaction involuntarily.
tMSSqlRow Acts on the actual DB structure or on the data (although without handling
data).
tMSSqlTableList Lists the names of a given set of MSSql tables using a select statement
based on a Where clause.
MSSql scenarios
Inserting data into a database table and extracting useful information from it
Retrieving personal information using a stored procedure
MySQL
MySQL components
tMySQLInvalidRows Checks MySQL database rows against specific Data Quality patterns
(regular expression) or Data Quality rules (business rule).
tMysqlColumnList Iterates on all columns of a given Mysql table and lists column names.
tMysqlCommit Commits in one go, using a unique connection, a global transaction instead
of doing that on every row or every batch and thus provides gain in
performance.
tMysqlConnection Opens a connection to the specified MySQL database for reuse in the
subsequent subJob or subJobs.
tMysqlInput Executes a DB query with a strictly defined order which must correspond to
the schema definition.
tMysqlLastInsertId Obtains the primary key value of the record that was last inserted in a
Mysql table by a user.
tMysqlOutputBulk Writes a file with columns based on the defined delimiter and the MySQL or
Aurora standards.
tMysqlOutputBulkExec Executes the Insert action in the specified MySQL or Aurora database.
tMysqlRollback Cancels the transaction commit in the connected MySQL database to avoid
committing part of a transaction involuntarily.
tMysqlRow Executes the stated SQL query on the specified MySQL database.
tMysqlTableList Lists the names of a given set of Mysql tables using a select statement
based on a Where clause.
MySQL scenarios
Checking customer table against a given DQ rule to select customer records
Controlling the data definition language via tMysqlOutput when creating a table
Reading email addresses from a DB table and retrieving specific data
Updating a database table using tMysqlOutput in a Big Data Streaming Job
Writing dynamic columns from a source file to a database
Combining two flows for selective output
Getting the ID for the last inserted record with tMysqlLastInsertId
Inserting a column and altering data using tMysqlOutput
Inserting data in bulk in MySQL database
Inserting data in mother/daughter tables
Inserting transformed data in MySQL database
Iterating on DB tables and deleting their content using a user-defined SQL template
Iterating on a DB table and listing its column names
Removing and regenerating a MySQL table index
Retrieving data in error with a Reject link
Sharing a database connection between a parent Job and child Job
Updating data using tMysqlOutput
Using PreparedStatement objects to query data
Using tMysqlSP to find a State Label using a stored procedure
Writing columns from a MySQL database to an output file using tMysqlInput
NamedPipe
NamedPipe components
NamedPipe scenario
Writing and loading data through a named-pipe
tNLPPredict Uses a classifier model generated by tNLPModel to predict and label the
input text.
tNLPPreprocessing Prepares a text sample and divides it into tokens, which can be words,
numbers or punctuation marks.
Neo4j
Neo4j components
tNeo4jv4Input Reads data from Neo4j version 4.x and sends data in the output flow.
tNeo4jv4Output Receives data from the preceding component and writes the data into a Neo4j version 4.x
database.
tNeo4jv4Row Executes the stated Cypher query onto the specified Neo4J version 4.x database.
tNeo4jBatchOutput Receives data from the preceding component and writes the data into a local Neo4j
database.
tNeo4jBatchOutputRelationship Receives data from the preceding component and writes relationships in bulk into a local
Neo4j database.
tNeo4jImportTool Uses Neo4j Import Tool to create a Neo4j database and import large amounts of data in bulk
from CSV files to this database.
tNeo4jInput Reads data from Neo4j and sends data in the output flow.
tNeo4jOutput Receives data from the preceding component and writes the data into Neo4j.
tNeo4jOutputRelationship Receives data from the preceding component and writes relationships into Neo4j.
tNeo4jRow Executes the stated Cypher query onto the specified Neo4J database.
Neo4j scenarios
Creating nodes with a label using a Cypher query
Importing data from a CSV file to Neo4j and creating relationships using a single Cypher query
Importing data from a CSV file to Neo4j using a Cypher query
Writing information of actors and movies to Neo4j with hierarchical relationship using Neo4j Batch components
Writing data to a Neo4j database and reading specific data from it
Writing family information to Neo4j and creating relationships
Writing information of actors and movies to Neo4j with hierarchical relationship
Netezza
Netezza components
tNetezzaBulkExec Offers gains in performance while carrying out the Insert operations to a
Netezza database.
tNetezzaCommit Validates the data processed through the Job into the connected Netezza
database.
tNetezzaConnection Opens a connection to a Netezza database to be reused in the subsequent
subJob or subJobs.
tNetezzaInput Reads a database and extracts fields from a Netezza database based on a
query.
tNetezzaNzLoad Inserts data into a Netezza database table using Netezza's nzload utility.
tNetezzaRow Executes the SQL query stated onto the specified Netezza database.
Netsuite
Netsuite components
tNetSuiteV2019Connection Creates a connection to a NetSuite SOAP server by leveraging NetSuite v2019 features so
that other NetSuite V2019 components in the Job can reuse the connection.
tNetSuiteV2019Input Invokes the NetSuite SOAP service and retrieves data according to the conditions you
specify by leveraging NetSuite v2019 features.
tNetSuiteV2019Output Invokes the NetSuite SOAP service and inserts, updates, or removes data on the NetSuite
SOAP server by leveraging NetSuite v2019 features.
tNetsuiteConnection (deprecated) Creates a connection to the NetSuite SOAP server so that other NetSuite components in the
Job can reuse the connection.
tNetsuiteInput (deprecated) Invokes the NetSuite SOAP service and retrieves data according to the conditions you
specify.
tNetsuiteOutput (deprecated) Invokes the NetSuite SOAP service and inserts, updates, or removes data on the NetSuite
SOAP server.
Netsuite scenario
Handling data with NetSuite
Openbravo ERP
tOpenbravoERPInput (deprecated) Extracts data from the OpenbravoERP database according to the conditions defined in
specific columns.
Oracle components
tOracleInvalidRows Checks Oracle database rows against specific Data Quality patterns
(regular expression) or Data Quality rules (business rule).
tOracleValidRows Checks Oracle database rows against Data Quality patterns (regular
expression).
tOracleCommit Validates the data processed through the Job into the connected Oracle
database
tOracleConnection Opens a connection to the specified Oracle database for reuse in the
subsequent subJob or subJobs.
tOracleOutputBulk Writes a file with columns based on the defined delimiter and the Oracle
standards.
tOracleRollback Cancels the transaction commit in the connected Oracle database to avoid
committing part of a transaction involuntarily.
tOracleRow Executes the stated SQL query on the specified Oracle database.
tOracleTableList Lists the names of specified Oracle tables using a SELECT statement based
on a WHERE clause.
Oracle scenarios
Checking number format using a stored procedure
Truncating and inserting file data into an Oracle database
Using context parameters when reading a table from an Oracle database
ORC
ORC components
tFileInputORC Extracts records from a given ORC format file and sends the data to the
next component for further processing.
tFileOutputORC Receives records from the processing component placed ahead of it and
writes the records into ORC format files.
Orchestration (Integration)
tCollector Feeds the parallel execution processes with the threads generated by
tPartitioner.
tPartitioner Partitions the input data before tCollector can transfer them to the parallel
execution processes.
tFlowToIterate Reads data line by line from the input flow and stores the data entries in
iterative global variables.
tReplicate Duplicates the incoming schema into two identical output flows.
tRunJob Manages complex Job systems which need to execute one Job after
another.
tSleep Identifies possible bottlenecks using a time break in the Job for testing or
tracking purpose.
tWaitForSqlData Iterates on a given connection for insertion or deletion of rows and triggers
a subJob when a condition linked to SQL data presence is met.
Palo
Palo components
tPaloCheckElements (deprecated) Checks whether the elements present in an incoming data flow exist in a given cube.
tPaloConnection (deprecated) Opens a connection to a Palo Server and allows other components involved in a process to
share the connection for the duration of the process.
tPaloCubeList (deprecated) Retrieves a list of cube details from the given Palo database.
tPaloDatabaseList (deprecated) Lists database names, database types, number of cubes, number of dimensions, database
status and database id from a given Palo server.
tPaloDimensionList (deprecated) Retrieves a list of dimension details from the given Palo database.
tPaloInputMulti (deprecated) Retrieves the stored or calculated values in combination with the element records out of a
cube.
tPaloOutput (deprecated) Takes the input stream and writes it to a given Palo cube.
tPaloOutputMulti (deprecated) Takes the input stream and writes it to a given Palo cube.
tPaloRuleList (deprecated) Lists all rules, formulas, comments, activation status, external IDs from a given cube.
Palo scenarios
Creating a cube in an existing database
Creating a database
Creating a dimension with elements
Creating a rule in a given cube
Rejecting inflow data when the elements to be written do not exist in a given cube
Retrieving detailed cube information from a given database
Retrieving detailed database information from a given Palo server
Retrieving detailed dimension information from a given database
Retrieving detailed rule information from a given cube
Retrieving dimension elements from a given cube
Writing data into a given cube
ParAccel
ParAccel components
tParAccelCommit (deprecated) Commits in one go a global transaction, using a unique connection, instead of doing that on
every row or every batch and thus provides gain in performance.
tParAccelConnection (deprecated) Opens a connection to the specified database that can then be reused in the subsequent
subJob or subJobs.
tParAccelOutput (deprecated) Executes the action defined on the table and/or on the data of a table, according to the
input flow from the previous component.
tParAccelOutputBulk (deprecated) Prepares the file to be used as parameter in the INSERT query to feed the ParAccel database.
Parquet
Parquet components
tFileInputParquet Extracts records from a given Parquet format file and sends the data to the
next component for further processing.
tFileOutputParquet Receives records from the processing component placed ahead of it and
writes the records into Parquet format files.
tFileStreamInputParquet Extracts records from a given Parquet format file for other components to
process the records.
Petals
Petals components
POP
POP component
tPOP Fetches one or more email messages from a server using the POP3 or IMAP
protocol.
POP scenario
Retrieving a selection of email messages from an email server
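tPOP above fetches messages over POP3 or IMAP. A rough sketch of the same retrieval with the standard JavaMail API (host, account and protocol are placeholders; this is not the component's own code):

import java.util.Properties;
import javax.mail.Folder;
import javax.mail.Message;
import javax.mail.Session;
import javax.mail.Store;

public class PopFetchSketch {
    public static void main(String[] args) throws Exception {
        // Placeholder server settings; tPOP exposes these in its Basic settings.
        Properties props = new Properties();
        props.put("mail.store.protocol", "pop3s");

        Session session = Session.getInstance(props);
        Store store = session.getStore();
        store.connect("pop.example.com", "user@example.com", "secret");

        Folder inbox = store.getFolder("INBOX");
        inbox.open(Folder.READ_ONLY);
        for (Message msg : inbox.getMessages()) {
            System.out.println(msg.getSubject());
        }
        inbox.close(false);
        store.close();
    }
}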
Positional
Positional components
tFileStreamInputPositional Listens on a given directory for new files, reads data from them row by row and
extracts fields based on a specific pattern.
tFileInputPositional Reads a positional file row by row to split them up into fields based on a given
pattern and then sends the fields as defined in the schema to the next
component.
tFileOutputPositional Writes a file row by row according to the length and the format of the fields or
columns in a row.
Positional scenarios
Handling a positional file based on a dynamic schema
Reading a Positional file and saving filtered results to XML
PostgresPlus
PostgresPlus components
tPostgresPlusCommit Commits in one go a global transaction, using a unique connection, instead of doing
that on every row or every batch and thus improves performance.
tPostgresPlusConnection Opens a connection to the specified database that can then be reused in the
subsequent subJob or subJobs.
tPostgresPlusInput Executes a DB query with a strictly defined order which must correspond to the
schema definition. Then it passes on the field list to the next component via a Main row
link.
tPostgresPlusOutput Executes the action defined on the table and/or on the data contained in the table,
based on the flow incoming from the preceding component in the job.
tPostgresPlusOutputBulk Prepares the file to be used as parameter in the INSERT query to feed the PostgresPlus
database.
tPostgresPlusRow Acts on the actual DB structure or on the data (although without handling data),
depending on the nature of the query and the database. The SQLBuilder tool helps you
easily write your SQL statements.
PostgreSQL
PostgreSQL components
tPostgresqlInvalidRows Extracts DB rows that do not match a given data quality pattern so that you can
then implement any required correction.
tPostgresqlBulkExec Improves performance while carrying out the Insert operations to a Postgresql
database.
tPostgresqlInput Executes a DB query with a strictly defined order which must correspond to the
schema definition. Then it passes on the field list to the next component via a
Main row link.
tPostgresqlOutput Executes the action defined on the table and/or on the data contained in the
table, based on the flow incoming from the preceding component in the job.
tPostgresqlOutputBulk Prepares the file to be used as a parameter in the INSERT query to feed the
Postgresql database.
tPostgresqlRow Acts on the actual DB structure or on the data (although without handling data),
depending on the nature of the query and the database. The SQLBuilder tool
helps you easily write your SQL statements.
Processing (Integration)
Processing (Integration) components
tCacheOut Persists the input RDDs depending on the specific storage level you define
in order to offer faster access to these datasets later.
tExtractEDIField Reads the EDI structured data from an EDIFACT message file, generates an
XML according to the EDIFACT family and the EDIFACT type, extracts data
by parsing the generated XML using the XPath queries manually defined or
coming from the Repository wizard, and finally sends the data to the next
component via a Row connection.
tExtractRegexFields Extracts data and generates multiple columns from a formatted string
using regex matching.
tTop Sorts data and outputs several rows from the first one of this data.
tTopBy Groups and sorts data and outputs several rows from the first one of the
data in each group.
tWindow Applies a given Spark window on the incoming RDDs and sends the
window-based RDDs to its following component.
tWriteAvroFields Transforms the incoming data into Avro files.
tAggregateSortedRow Aggregates the sorted input data for the output column based on a set of
operations. Each output column is configured with as many rows as required,
the operations to be carried out and the input column from which the data
will be taken for better data aggregation.
tConvertType Converts one Talend Java type to another automatically, thus avoiding
compilation errors.
tExternalSortRow Sorts input data based on one or several columns, by sort type and order,
using an external sort application.
tExtractJSONFields Extracts the desired data from JSON fields based on the JSONPath or XPath
query.
tExtractPositionalFields Extracts data and generates multiple columns from a formatted string
using positional fields.
tExtractXMLField Reads the XML structured data from an XML field and sends the data as
defined in the schema to the following component.
tFilterRow Filters input rows by setting one or more conditions on the selected
columns.
tJoin Performs inner or outer joins between the main data flow and the lookup
flow.
tNormalize Normalizes the input flow following SQL standard to help improve data
quality and thus eases the data update.
tSampleRow Selects rows according to a list of single lines and/or a list of groups of
lines.
tWriteJSONField Transforms the incoming data into JSON fields and transfers them to a file,
a database table, etc.
Properties components
tFileInputProperties Reads a text file row by row and separates the fields according to the model
key = value.
tFileOutputProperties Writes a configuration file, of the type .ini or .properties, containing text
data organized according to the model key = value.
Properties scenario
Reading and matching the keys and the values of different .properties files and outputting the results in a glossary
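Both Properties components above work with the key = value model. The same model can be read with java.util.Properties, as in this small sketch (the file name is a placeholder):

import java.io.FileReader;
import java.util.Properties;

public class PropertiesReadSketch {
    public static void main(String[] args) throws Exception {
        // Placeholder file; tFileInputProperties reads this kind of key = value file row by row.
        Properties props = new Properties();
        try (FileReader reader = new FileReader("job.properties")) {
            props.load(reader);
        }
        // Each key/value pair becomes one entry, e.g. "db.host = localhost".
        props.forEach((key, value) -> System.out.println(key + " = " + value));
    }
}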
Proxy
Proxy component
RabbitMQ
RabbitMQ components
tRabbitMQInput Reads messages from a message queue and passes the messages in the
output flow.
tRabbitMQOutput Receives data from the preceding component as messages and adds the
messages to queues in the specified way.
Raw
Raw components
tFileInputRaw Reads all data in a raw file and sends it to a single output column for
subsequent processing by another component.
tFileOutputRaw Provides data coming from another component, in the form of a single
column of output data.
Regex
Regex components
tFileStreamInputRegex Listens on a given directory for new files, then reads data from these files,
row by row, in order to split the data into fields using regular expressions.
tFileInputRegex Reads a file row by row to split them up into fields using regular
expressions and sends the fields as defined in the schema to the next
component.
Regex scenario
Reading data using a Regex and outputting the result to Positional file
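tFileInputRegex above splits each row into fields with regular expressions. A minimal standalone sketch of that kind of group-based extraction (the sample row and pattern are invented):

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexFieldSketch {
    public static void main(String[] args) {
        // Invented log-style row; in a Job such rows would be read from a file.
        String row = "2022-03-17 10:15:42 WARN disk almost full";

        // Each capturing group becomes one output field, much like a schema column.
        Pattern pattern = Pattern.compile("^(\\S+) (\\S+) (\\w+) (.*)$");
        Matcher matcher = pattern.matcher(row);
        if (matcher.matches()) {
            String date = matcher.group(1);
            String time = matcher.group(2);
            String level = matcher.group(3);
            String message = matcher.group(4);
            System.out.println(date + " | " + time + " | " + level + " | " + message);
        }
    }
}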
REST
REST component
REST scenario
Creating and retrieving data by invoking REST Web service
Riak
Riak components
tRiakBucketList (deprecated) Retrieves a list of buckets from a Riak cluster and iterates on it.
tRiakClose (deprecated) Closes an active connection to a Riak cluster so as to release occupied resources.
tRiakConnection (deprecated) Opens a connection to a Riak cluster and enables the reuse of this connection.
tRiakInput (deprecated) Extracts the desired data from a bucket in a Riak node so as to store or apply changes to
the data.
tRiakKeyList (deprecated) Retrieves a list of keys and iterates on it within a Riak bucket for analysis or
development purposes.
tRiakOutput (deprecated) Receives data from the preceding component and writes data into or deletes data from
a bucket in a Riak cluster.
Riak scenario
Exporting data from a Riak bucket to a local file
Route
Route components
tRouteFault Sends messages from a Data Integration Job to a Mediation Route and
marks the message as a fault.
tRouteInput Accepts messages in a Data Integration Job from a Mediation Route.
Route scenarios
Using shared Data Sources with DB components in Jobs with tRouteInput
Exchanging messages between a Job and a Route
RSS
RSS components
tRSSOutput Creates and writes XML files that hold RSS or Atom
feeds.
RSS scenarios
Creating an ATOM feed XML file
Creating an RSS flow and storing files on an FTP server
Creating an RSS flow that contains metadata
Fetching frequently updated blog entries.
Salesforce
Salesforce components
tSalesforceEinsteinBulkExec Loads data into Salesforce Analytics Cloud from a local file.
tSalesforceEinsteinOutputBulkExec Provides a gain in performance during data operations to the Salesforce Analytics Cloud.
tSalesforceGetDeleted Collects data deleted during a specific period of time from a Salesforce object.
tSalesforceGetServerTimestamp Retrieves the current date of the Salesforce server presented in a timestamp format.
tSalesforceGetUpdated Collects data updated during a specific period of time from a Salesforce object.
tSalesforceOutputBulk Generates the file to be processed by the tSalesforceBulkExec component for bulk
processing.
SAP
SAP components
tELTSAPInput Provides the SAP table schema that will be used by the tELTSAPMap component
to generate the SQL SELECT statement.
tELTSAPMap Builds the SQL SELECT statement using the table schema(s) provided by one or
more tELTSAPInput components.
tSAPADSOInput Retrieves data of an active ADSO (Advanced Data Store Object) from an SAP BW
system on an SAP HANA database.
tSAPBapi Extracts data from or loads data to an SAP server using multiple input/output
parameters or the document type parameter.
tSAPBWInput Executes an SQL query with a strictly defined order which must correspond to
your schema definition.
tSAPCommit Commits a global transaction in one go, using a unique connection, instead of
doing that on every row or every batch and thus provides gain in performance.
tSAPConnection Commits the data of a whole Job to the SAP system in one go, as one transaction.
tSAPDataSourceOutput Writes Data Source objects into an SAP BW Data Source system.
tSAPDataSourceReceiver Retrieves data requests stored on Talend SAP RFC server and related to a specific
Data Source system.
tSAPHanaBulkExec Improves performance while carrying out the Insert operations to an SAP HANA
database.
tSAPHanaInvalidRows Checks SAP HANA database rows against specific Data Quality patterns (regular
expression) or Data Quality rules (business rule).
tSAPHanaUnload Offloads massive data from the SAP HANA database to a third-party system.
tSAPHanaValidRows Checks SAP HANA database rows against specific Data Quality patterns (regular
expression) or Data Quality rules (business rule).
tSAPIDocInput (deprecated) Extracts the IDoc data set that is used for asynchronous transactions between SAP
systems or between an SAP system and another application.
tSAPIDocOutput Uploads an IDoc data set in XML format to an SAP system.
tSAPODPInput Extracts business data from the ERP part of SAP (SAP Business application, SAP on
HANA, SAP R/3, and S4/HANA) through ODP (Operational Data Provisioning).
tSAPHanaCommit Commits in one go, using a unique connection, a global transaction instead of
doing that on every row or every batch and thus provides gain in performance.
tSAPHanaConnection Establishes a SAP HANA connection to be reused by other SAP HANA components
in your Job.
tSAPHanaInput Executes a database query with a defined command which must correspond to
the schema definition.
tSAPHanaOutput Executes the action defined on the table and/or on the data contained in the
table, based on the flow incoming from the preceding component in the Job.
tSAPHanaRow Acts on the actual database structure or on the data (although without handling
data).
SAP scenarios
Connecting to a given SAP R/3 system for listening the creation of IDoc files (deprecated)
Consuming Data Source objects using SSL Transport
Consuming IDocs for processing by tHMap
Exporting data using tSAPHanaUnload
Extracting Data using tSAPInfoCubeInput
Reading data from SAP BW database
Retrieving ADSO data from SAP BW
Retrieving data from SAP through ODP
Retrieving data from an SAP system by calling a BAPI function using document type parameters
Retrieving data from an SAP system by calling a BAPI function using multiple input/output parameters
Aggregating and filtering data in multiple SAP tables
SCD
SCD components
tDB2SCD Addresses Slowly Changing Dimension needs, regularly reading a source of
data and logging the changes into a dedicated SCD table.
tInformixSCD Tracks and shows changes which have been made to Informix SCD dedicated
tables.
tIngresSCD (deprecated) Reflects and tracks changes in a dedicated Ingres SCD table.
tMSSqlSCD Tracks and reflects changes in a dedicated SCD table in a Microsoft SQL Server
or Azure SQL database.
tParAccelSCD (deprecated) Addresses Slowly Changing Dimension needs, regularly reading a source of
data and logging the changes into a dedicated SCD table.
tVerticaSCD Tracks and reflects data changes in a dedicated Vertica SCD table.
SCD scenario
Tracking data changes using Slowly Changing Dimensions (type 0 through type 3)
SCDELT
SCDELT components
tDB2SCDELT Addresses Slowly Changing Dimension needs through SQL queries (server-
side processing mode), and logs the changes into a dedicated DB2 SCD
table.
tJDBCSCDELT Tracks data changes in a source database table using SCD (Slowly
Changing Dimensions) Type 1 method and/or Type 2 method and writes
both the current and historical data into a specified SCD dimension table.
tMysqlSCDELT Reflects and tracks changes in a dedicated MySQL SCD table through SQL
queries.
tOracleSCDELT Reflects and tracks changes in a dedicated Oracle SCD table through SQL
queries.
tPostgresPlusSCDELT Addresses Slowly Changing Dimension needs through SQL queries (server-
side processing mode), and logs the changes into a dedicated PostgresPlus
SCD table.
tPostgresqlSCDELT Addresses Slowly Changing Dimension needs through SQL queries (server-
side processing mode), and logs the changes into a dedicated PostgreSQL
SCD table.
tSybaseSCDELT Addresses Slowly Changing Dimension needs through SQL queries (server-
side processing mode), and logs the changes into a dedicated Sybase SCD
table.
tTeradataSCDELT Addresses Slowly Changing Dimension needs through SQL queries (server-
side processing mode), and logs the changes into a dedicated Teradata
SCD table.
SCDELT scenarios
Tracking data changes in a PostgreSQL table using the tPostgreSQLSCDELT component
Tracking data changes in a Snowflake table using the tJDBCSCDELT component
Tracking data changes using Slowly Changing Dimensions (type 0 through type 3)
SCP
SCP components
tSCPTruncate Removes data from file(s) on the defined SCP server via an SCP
connection.
SCP scenario
Handling a file using SCP
ServiceNow
ServiceNow components
SingleStore
SingleStore components
tSingleStoreBulkExec Loads data from a file into a table of a database connected through JDBC API.
tSingleStoreCommit Commits in one go a global transaction instead of doing that on every row or every
batch and thus provides gain in performance.
tSingleStoreConnection Opens a connection to the specified database that can then be reused in the
subsequent subJob or subJobs.
tSingleStoreInput Reads any database using a JDBC API connection and extracts fields based on a
query.
tSingleStoreOutput Executes the action defined on the data contained in the table, based on the flow
incoming from the preceding component in the Job.
tSingleStoreOutputBulk Prepares the bulk file to be used as a parameter to feed the database connected.
tSingleStoreOutputBulkExec Provides performance gain when loading data from a file into a table of a database
connected through JDBC API.
tSingleStoreRow Acts on the actual DB structure or on the data (although without handling data)
using the SQLBuilder tool to easily write your SQL statements.
tSingleStoreSP Centralizes multiple or complex queries in a database in order to call them easily.
Snowflake
Snowflake components
tSnowflakeConnection Opens a connection to Snowflake that can then be reused by other Snowflake
components.
tSnowflakeInput Reads data from a Snowflake table into the data flow of your Job based on an
SQL query.
tSnowflakeOutput Uses the data incoming from its preceding component to insert, update, upsert
or delete data in a Snowflake table.
tSnowflakeOutputBulk Writes incoming data to files generated in a folder. The folder can be in an
internal Snowflake stage, an Amazon Simple Storage Service (Amazon S3)
bucket, or an Azure container.
tSnowflakeOutputBulkExec Writes incoming data to files generated in a folder and then loads the data into a
Snowflake database table. The folder can be in an internal Snowflake stage, an
Amazon Simple Storage Service (Amazon S3) bucket, or an Azure container.
tSnowflakeRollback Cancels the transaction commit in the Snowflake database to avoid committing
part of a transaction involuntarily.
tSnowflakeRow Executes the SQL command stated onto a specified Snowflake database.
Snowflake scenarios
Aggregating Snowflake data using context variables as table and connection names
Loading Data Using COPY Command
Loading data in a Snowflake table using custom stage path
Querying data in a cloud file through a materialized view and a Snowflake external table
Writing data into and reading data from a Snowflake table
SOAP
SOAP component
tSOAP Calls a method via a Web service in order to retrieve the values of the
parameters defined in the component editor.
SOAP scenarios
Fetching the country name information using a Web service
Using a SOAP message from an XML file to get country name information and saving the information to an XML file
Socket
Socket components
tSocketInput Opens the socket port and listens for the incoming data.
tSocketOutput Sends out the data from the incoming flow to a listening socket
port.
Socket scenario
Passing on data to the listening port
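tSocketInput and tSocketOutput above exchange data over a plain TCP socket. A rough sketch of that listening/sending pair in standard Java (the port and sample rows are arbitrary):

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.ServerSocket;
import java.net.Socket;

public class SocketPairSketch {
    public static void main(String[] args) throws Exception {
        int port = 3333; // arbitrary port; tSocketInput listens on a port like this

        // Listener side, roughly what tSocketInput does: accept a connection and read rows.
        Thread listener = new Thread(() -> {
            try (ServerSocket server = new ServerSocket(port);
                 Socket client = server.accept();
                 BufferedReader in = new BufferedReader(new InputStreamReader(client.getInputStream()))) {
                String line;
                while ((line = in.readLine()) != null) {
                    System.out.println("received: " + line);
                }
            } catch (Exception e) {
                e.printStackTrace();
            }
        });
        listener.start();
        Thread.sleep(200); // give the listener time to start before connecting

        // Sender side, roughly what tSocketOutput does: connect and write rows.
        try (Socket socket = new Socket("localhost", port);
             PrintWriter out = new PrintWriter(socket.getOutputStream(), true)) {
            out.println("id;name;amount");
            out.println("1;Alice;42");
        }
        listener.join();
    }
}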
Splunk
Splunk component
tSplunkEventCollector Sends the event data to Splunk through Splunk HTTP Event
Collector.
SQLite
SQLite components
tSQLiteCommit Commits in one go, using a unique connection, a global transaction instead
of doing that on every row or every batch and thus provides gain in
performance.
tSQLiteOutput Executes the action defined on the table and/or on the data contained in
the table, based on the flow incoming from the preceding component in
the job.
tSQLiteRow Executes the defined query onto the specified database and uses the
parameters bound with the column.
SQLite scenarios
Filtering SQLite data
Updating SQLite rows
SQLTemplate
SQLTemplate components
tSQLTemplate Executes the common database actions or customized SQL statement templates,
for example to drop/create a table.
tSQLTemplateCommit Commits a global action in one go using a single connection, instead of doing so
for every row or every batch of rows separately. This provides a gain in
performance.
tSQLTemplateFilterRows Sets row filters for any given data source, based on a WHERE clause.
tSQLTemplateMerge Merges data into a database table directly on the DBMS by creating and executing
a MERGE statement.
SQLTemplate scenarios
Filtering and aggregating table columns directly on the DBMS
Merging data directly on the DBMS
Sqoop
Sqoop components
tSqoopExport Defines the arguments required by Sqoop for transferring data to a RDBMS.
tSqoopImport Defines the arguments required by Sqoop for writing the data of your
interest into HDFS.
tSqoopImportAllTables Defines the arguments required by Sqoop for writing all of the tables of a
database into HDFS.
tSqoopMerge Performs an incremental import that updates an older dataset with newer
records. The file types of the newer and the older datasets must be the
same.
Sqoop scenarios
Importing a MySQL table to HDFS
Merging two datasets in HDFS
SVNLog
SVNLog component
SVNLog scenario
Retrieving a log message from an SVN repository
Sybase
Sybase components
tSybaseCommit Commits a global transaction in one go, using a single connection, instead of doing so for every row or every batch, and thus provides a gain in performance.
tSybaseInput Executes a DB query with a strictly defined order which must correspond to
the schema definition.
tSybaseIQBulkExec Loads data into a Sybase database table from a flat file or other database
table.
tSybaseOutput Executes the action defined on the table and/or on the data contained in the
table, based on the flow incoming from the preceding component in the job.
tSybaseOutputBulk Prepares the file to be used as a parameter in the INSERT query that feeds the Sybase database.
tSybaseRow Acts on the actual DB structure or on the data (although without handling
data).
Sybase scenario
Bulk-loading data to a Sybase IQ 12 database
System
System components
tRunJob Manages complex Job systems which need to execute one Job after
another.
System scenarios
Calling a Job and passing the parameter needed to the called Job
Displaying remote system information via SSH
Echoing 'Hello World!'
Modifying a variable during a Job execution
Passing a value from a parent Job to a child Job
Propagating the buffered output data from the child Job to the parent Job
Running a list of child Jobs dynamically
Tachyon
Tachyon component
tTachyonConfiguration Defines a connection to the Tachyon storage system and enables the reuse of the configuration in the same Job.
tAddLocationFromIP
tAddLocationFromIP component
tAddLocationFromIP scenario
Identifying a real-world geographic location of an IP
Talend Cloud
tJobLog Collects and shows exception data during the execution of the Job in
Talend Studio or the task in Talend Cloud Management Console.
tChangeFileEncoding
tChangeFileEncoding component
tChangeFileEncoding Transforms the character encoding of a given file and generates a new file
with the transformed character encoding.
tChangeFileEncoding scenario
Transforming the character encoding of a file
tCreateTemporaryFile
tCreateTemporaryFile component
tCreateTemporaryFile scenario
Creating a temporary file and writing data into it
Technical
Technical components
tBoundedStreamInput Provides a data stream for the component to be tested and is suitable for
use in a test case only.
tHashInput Reads data loaded into cache memory by tHashOutput to offer a high-speed data feed, facilitating transactions involving a large amount of data.
tHashOutput Loads data to the cache memory to offer high-speed access, facilitating
transactions involving a large amount of data.
Technical scenarios
Clearing the memory before loading data to it in case an iterator exists in the same subJob
Reading data from the cache memory for high-speed data access
Teradata
Teradata components
tTeradataConfiguration Defines a connection to Teradata and enables the reuse of the connection
configuration in the same Job.
tTeradataLookupInput Executes a database query with a strictly defined order which must
correspond to the schema definition.
tTeradataCommit Commits a global transaction in one go, using a single connection, instead of doing so for every row or every batch, and thus provides a gain in performance.
tTeradataConnection Opens a connection to the specified database that can then be reused in
the subsequent subJob or subJobs.
tTeradataFastLoad Executes a database query according to a strict order which must be the
same as the one in the schema.
tTeradataFastLoadUtility Executes a database query according to a strict order which must be the
same as the one in the schema.
tTeradataInput Executes a DB query with a strictly defined order which must correspond to
the schema definition.
tTeradataMultiLoad Executes a database query according to a strict order which must be the
same as the one in the schema.
tTeradataOutput Executes the action defined on the table and/or on the data contained in the table, based on the flow incoming from the preceding component in the Job.
tTeradataRow Acts on the actual DB structure or on the data (although without handling
data).
tTeradataTPTExec Offers high performance in inserting data from an existing file to a table in
a Teradata database.
tTeradataTPTUtility Writes the incoming data to a file and then loads the data from the file to a
Teradata database.
tTeradataTPump Inserts, updates, or deletes data in the Teradata database with the TPump loading utility, which allows near-real-time data loading into the data warehouse.
Teradata scenarios
Inserting data into a Teradata database table
Loading data into a Teradata database
tFileCompare
tFileCompare component
tFileCompare Compares two files and provides comparison data based on a read-only
schema.
tFileCompare scenario
Comparing unzipped files
tFileCopy
tFileCopy component
tFileCopy scenario
Moving/copying/renaming files in batch
tFileDelete
tFileDelete component
tFileDelete scenario
Deleting files
tFileExist
tFileExist component
tFileExist scenario
Checking for the presence of a file and creating it if it does not exist
tFileList
tFileList component
tFileList scenarios
Finding duplicate files between two folders
Iterating on a file directory
tFileProperties
tFileProperties component
tFileProperties Creates a single row flow that displays the main properties of the
processed file.
tFileProperties scenario
Displaying the properties of a processed file
tFileRowCount
tFileRowCount component
tFileRowCount Opens a file and reads it row by row in order to determine the number of
rows inside.
tFileRowCount scenario
Writing a file to MySQL if the number of its records matches a reference value
tFileTouch
tFileTouch component
tFileTouch Creates an empty file or, if the specified file already exists, updates its date
of modification and of last access while keeping the contents unchanged.
tFixedFlowInput
tFixedFlowInput component
tFixedFlowInput scenario
Buffering output data on the webapp server
tMap
tMap component
tMap Transforms and routes data from single or multiple sources to single or
multiple destinations.
tMap scenarios
Advanced mapping with lookup reload at each row
Converting a UNIX timestamp to a readable date
Mapping data using a filter and a simple explicit join
Mapping with join output tables
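For the "Converting a UNIX timestamp to a readable date" scenario listed above, the work done in the output expression boils down to turning epoch seconds into a formatted date. A plain-Java equivalent of that logic, with an illustrative input value and output pattern rather than the scenario's exact expression:

    // Plain-Java equivalent of converting a UNIX timestamp (in seconds) to a readable date.
    // The sample timestamp and the output pattern are illustrative only.
    import java.time.Instant;
    import java.time.ZoneOffset;
    import java.time.format.DateTimeFormatter;

    public class TimestampDemo {
        public static void main(String[] args) {
            long epochSeconds = 1647475200L; // example UNIX timestamp
            String readable = Instant.ofEpochSecond(epochSeconds)
                    .atZone(ZoneOffset.UTC)
                    .format(DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss"));
            System.out.println(readable); // 2022-03-17 00:00:00
        }
    }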
tMemorizeRows
tMemorizeRows component
tMemorizeRows Memorizes a sequence of rows that passes through and allows the
following component(s) to perform operations of your choice on the
memorized rows.
tMemorizeRows scenario
Retrieving the different ages and lowest age data
tMsgBox
tMsgBox component
tMsgBox Opens a dialog box with an OK button requiring action from the
user.
tMsgBox scenario
'Hello world!' type test
tRowGenerator
tRowGenerator component
tRowGenerator Creates an input flow in a Job for testing purposes, in particular for
boundary test sets.
tRowGenerator scenario
Generating random Java data
tServerAlive
tServerAlive component
tServerAlive scenario
Validating the status of the connection to a remote host
tSocketTextStreamInput
tSocketTextStreamInput component
tXMLMap
tXMLMap component
tXMLMap Transforms and routes data from single or multiple sources to single or
multiple destinations.
tXMLMap scenarios
Mapping and transforming XML data
Restructuring products data using multiple loop elements
VectorWise
VectorWise components
tVectorWiseCommit (deprecated) Commits a global transaction in one go using a single connection instead of doing so on every row or every batch. This provides a gain in performance.
tVectorWiseConnection (deprecated) Opens a connection to the specified database that can then be reused in the subsequent subJob or subJobs.
tVectorWiseInput (deprecated) Executes a DB query with a strictly defined order which must correspond to the schema definition.
tVectorWiseOutput (deprecated) Executes the action defined on the table and/or on the data contained in the table, based on the flow incoming from the preceding component in the Job.
tVectorWiseRow (deprecated) Acts on the actual DB structure or on the data (although without handling data).
Vertica
Vertica components
tVerticaBulkExec Loads data into a Vertica database table from a local file using the Vertica
COPY SQL statement.
tVerticaConnection Opens a connection to the specified database that can then be reused in
the subsequent subJob or subJobs.
tVerticaInput Retrieves data from a Vertica database table based on a SQL query.
tVerticaOutput Inserts, updates, deletes, or copies data from an incoming flow into a
Vertica database table.
tVerticaOutputBulkExec Receives data from a preceding component, writes data into a local file,
and loads data into a Vertica database from the file using the Vertica COPY
SQL statement.
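tVerticaBulkExec and tVerticaOutputBulkExec both rely on Vertica's COPY statement to load a file. A rough sketch of the same load issued through the Vertica JDBC driver; the URL, credentials, table name, and file path are placeholders, not values from any scenario:

    // Hypothetical sketch: bulk-loading a local CSV file into Vertica with COPY ... FROM LOCAL.
    // Assumes the Vertica JDBC driver is on the classpath; all names and paths are placeholders.
    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Statement;

    public class VerticaCopyDemo {
        public static void main(String[] args) throws Exception {
            String url = "jdbc:vertica://verticahost:5433/demo"; // placeholder connection
            try (Connection conn = DriverManager.getConnection(url, "dbadmin", "secret");
                 Statement stmt = conn.createStatement()) {
                // FROM LOCAL streams the client-side file through the JDBC connection.
                stmt.execute("COPY demo_table FROM LOCAL '/tmp/demo_table.csv' "
                        + "DELIMITER ',' ABORT ON ERROR");
            }
        }
    }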
VtigerCRM
VtigerCRM components
Webservice
Webservice components
tRestWebServiceLookupInput Retrieves messages from a REpresentational State Transfer (REST) Web service
provider and gets responses accordingly.
tRestWebServiceOutput Serves as a REpresentational State Transfer (REST) Web service client that
continuously sends HTTP requests to a REST Web service provider in real time and
gets the responses.
tWebService Calls a method via a Web service in order to retrieve the values of the parameters
defined in the component editor.
Webservice scenarios
Getting the sum of two numbers using tWebServiceInput
Getting country names using tWebService
Workday
Workday component
tWorkdayInput Retrieves data of a Workday client based on a query or the Workday client
report.
XML
XML components
tFileStreamInputXML Opens a structured XML file and reads it row by row to split the data into
fields, then sends these fields as defined in the Schema to the next
component.
tEDIFACTtoXML Transforms an EDIFACT message file into the XML format for better
readability to users and compatibility with processing tools.
tExtractXMLField Reads the XML structured data from an XML field and sends the data as
defined in the schema to the following component.
tWriteXMLField Reads an input XML file and extracts the structure to insert it in defined
fields of the output XML file.
XML scenarios
Extracting XML data from a field in a database table
Extracting correct and erroneous data from an XML field in a delimited file
Extracting the structure of an XML file and inserting it into the fields of a database table
Reading an EDIFACT message file and saving it to XML
Transforming XML into HTML using an XSL stylesheet
Transforming stream into HTML using an XSL stylesheet
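tExtractXMLField, used in the "Extracting XML data from a field in a database table" scenario above, reads XML held in a single field and maps parts of it to schema columns, typically through XPath queries. A minimal plain-Java sketch of that extraction step using the standard javax.xml.xpath API; the XML snippet and the XPath expressions are invented:

    // Minimal sketch of extracting values from an XML field with XPath,
    // the kind of mapping tExtractXMLField performs. The XML and queries are invented.
    import java.io.StringReader;
    import javax.xml.xpath.XPath;
    import javax.xml.xpath.XPathFactory;
    import org.xml.sax.InputSource;

    public class XmlFieldDemo {
        public static void main(String[] args) throws Exception {
            String xmlField = "<customer><id>1</id><name>Alice</name></customer>";
            XPath xpath = XPathFactory.newInstance().newXPath();
            String id = xpath.evaluate("/customer/id",
                    new InputSource(new StringReader(xmlField)));
            String name = xpath.evaluate("/customer/name",
                    new InputSource(new StringReader(xmlField)));
            System.out.println(id + " -> " + name); // 1 -> Alice
        }
    }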
XML connectors
tAdvancedFileOutputXML Writes an XML file with separated data values according to an XML tree
structure.
tFileInputXML Reads an XML structured file row by row to split it up into fields and sends the fields as defined in the schema to the next component.
tFileOutputXML Writes an XML file with separated data values according to a defined
schema.
XML validation
tDTDValidator Helps control the data and structure quality of the file to be processed.
tXSDValidator Helps control the data and structure quality of the file or flow to be processed.
XMLRPC
XMLRPC component
tXMLRPCInput Invokes a method through a Web service for the described purpose.
XMLRPC scenario
Guessing the State name from an XMLRPC