Informatica Big Data For Developers

Big Data for Developers

now.informatica.com/Big-Data-for-Developers-1022-Instructor-Led.html

Course Overview
This course applies to software version 10.2.2. Learn to accelerate Big Data
Integration through mass ingestion, incremental loads, transformations, processing of
complex files, and integration of data science using Python. Optimize Big Data system
performance through monitoring, troubleshooting, and best practices, and gain an
understanding of how to reuse application logic for big data use cases.

Objectives
After successfully completing this course, students should be able to:

Mass ingest data to Hive and HDFS
Perform incremental loads in mass ingestion
Integrate with relational databases using SQOOP
Perform transformations across various engines
Execute a mapping using JDBC in Spark mode
Perform stateful computing and windowing
Process complex files
Execute dynamic mappings
Monitor logs and troubleshoot
Monitor logs using REST operations hub
Tune the performance of Spark jobs
Create and interpret PowerCenter Reuse Reports
Import a PowerCenter mapping into the Developer tool
Modify the imported mapping to make it Hadoop-ready
Understand the guidelines and limitations of importing from PowerCenter
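
To make the difference between an initial load and an incremental load concrete, here is a minimal sketch of the general watermark idea in plain Python. It is not Informatica's mass ingestion implementation; the function name `incremental_rows`, the `updated_at` column, and the sample rows are all hypothetical and chosen only for illustration.

```python
from datetime import datetime


def incremental_rows(rows, last_watermark, key="updated_at"):
    """Return the rows changed after last_watermark, plus the new watermark.

    Hypothetical helper: a generic illustration of watermark-based
    incremental loading, not a product API.
    """
    new_rows = [r for r in rows if r[key] > last_watermark]
    # If nothing changed, the watermark stays where it was.
    new_watermark = max((r[key] for r in new_rows), default=last_watermark)
    return new_rows, new_watermark


# Illustrative source table (made-up data).
source = [
    {"id": 1, "updated_at": datetime(2019, 1, 1)},
    {"id": 2, "updated_at": datetime(2019, 1, 5)},
    {"id": 3, "updated_at": datetime(2019, 1, 9)},
]

# Initial load: the lowest possible watermark selects every row.
batch1, wm = incremental_rows(source, datetime.min)

# Incremental load with no new changes: nothing is reloaded.
batch2, wm2 = incremental_rows(source, wm)
```

The only state carried between runs is the watermark, so each run rereads nothing that an earlier run already loaded.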

Target Audience
Developer

Prerequisites

Agenda

Module 1: Informatica Big Data Management Overview
Big Data concepts
Big Data Management features
Benefits of Big Data Management
Big Data Management architecture
Big Data Management developer tasks

Module 2: Ingestion and Extraction
Integrating BDM with Hadoop cluster
Application services of BDM 10.2.2
Hadoop file systems
Ingest data to HDFS and Hive using SQOOP
Mass ingestion to HDFS and Hive – initial load
Mass ingestion to HDFS and Hive – incremental load
Lab: Ingesting data from Oracle (SQOOP) to HDFS
Lab: Ingesting data from Oracle (SQOOP) to Hive
Lab: Creating mapping specifications using a mass ingestion service – full load
Lab: Creating mapping specifications using a mass ingestion service – incremental load

Module 3: Big Data Engine Strategy
BDM engine strategy
Hive engine architecture
MapReduce
Tez
Spark architecture
Blaze architecture
Basic BDM transformations
Lab: Executing a mapping using different BDM transformations in Spark mode

Module 4: Big Data Development Process
Advanced transformations in BDM – Python and update strategy
Hive ACID use case
Stateful computing and windowing
Lab: Python transformation
Lab: Update strategy and JDBC support
Lab: Performing Hive upserts
Lab: Windowing function LAG
Lab: Windowing function LEAD

Module 5: Complex Data Processing
Big data file formats – AVRO, Parquet, JSON
Complex file data types – structs, arrays, maps
Dynamic mapping
Dynamic expression support
Lab: Convert flat file data object to an AVRO file
Lab: Write a JSON file to Parquet file format
Lab: Use complex data types arrays, structs, and maps in a mapping
Lab: Build dynamic mappings using dynamic expressions

Module 6: Monitoring Logs and Troubleshooting
REST operations hub
Spark monitoring
Blaze monitoring
Viewing logs
Troubleshooting
Lab: Monitor mappings using REST operations hub

Module 7: Performance Tuning and Best Practices
Native vs Hadoop mode of execution
Tune performance of Spark jobs
Tune performance of Blaze jobs
List some best practices while working with BDM
Lab: Optimize a flat file
Lab: Test the performance of mapping with precision tuning
Lab: Optimize lookup transformation

Module 8: Introduction to PowerCenter Reuse
Overview of Data Integration Solutions
Transitioning from PowerCenter to Big Data Ecosystem
Steps to migrating to BDM
Native vs Hadoop mode
Transformations on Hadoop mode
Lab: Execute mappings in Native and Hadoop environment
Lab: Transformations

Module 9: PowerCenter Classic Reuse Report
Export PC mappings through CLI
PC Reuse Report Formats
Considerations in using the PC Reuse Utility
Lab: Assess the PowerCenter mapping to execute on BDM

Module 10: BDM Optimization
PowerCenter Classic Import
Mapping Validation
Lab: Import the mapping from PowerCenter 10.2.0 to BDM
Lab: Mapping validation
Lab: Final case study
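
The agenda includes labs on the LAG and LEAD windowing functions. As a rough illustration of what these two functions compute over an ordered window, here is a minimal sketch in plain Python. It is not the Developer tool's expression syntax; the helper names `lag` and `lead` and the sample values are assumptions made for this example.

```python
def lag(values, offset=1, default=None):
    """For each row, return the value `offset` rows earlier, else `default`.

    Hypothetical helper; `offset` is assumed to be a non-negative integer.
    """
    return [values[i - offset] if i - offset >= 0 else default
            for i in range(len(values))]


def lead(values, offset=1, default=None):
    """For each row, return the value `offset` rows later, else `default`."""
    return [values[i + offset] if i + offset < len(values) else default
            for i in range(len(values))]


# Illustrative ordered window (made-up data), for example sales per day.
daily_sales = [100, 120, 90, 150]

previous_day = lag(daily_sales)   # value from the preceding row
next_day = lead(daily_sales)      # value from the following row
```

The first row has no predecessor and the last row has no successor, so those positions receive the default value.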
