Purvi Agrawal

Sr. Data Engineer


[email protected] | +91-7974560651 | Bengaluru, KA, IN

Profile
Sr. Data Engineer with 3 years of experience in Python, PySpark, Databricks, data pipelines, ETL processes, and CI/CD
implementation using Jenkins and Bitbucket. Skilled in data warehousing and deployment automation.

Professional Experience
Sr. Data Engineer, LTIMindtree Jan 2023 – present | BANGALORE, India
Data Migration from MySQL to AWS S3
Migrated historical data from MySQL to AWS S3, improving data processing speed by 35%.

Leveraged Databricks workspace and cluster for robust data processing environments.

Installed and managed MySQL Connector and Boto3 libraries for seamless database and cloud interaction.

Used PySpark with a JDBC connection to MySQL to extract large volumes of historical data daily.

Set up secure AWS credentials using Boto3 for authentication and interaction with AWS S3 services.

Transformed data into Parquet format and loaded it into S3, reducing data transfer costs by 15%.

Reduced data migration process time by 50%, completing daily migrations in 1 hour.

Engineer I, LTIMindtree Dec 2021 – Nov 2022 | BANGALORE, India


IRIS - Instance Reconciliation and Integration Solutions
Directed IRIS tool operations to aggregate, de-duplicate, and normalize middleware/software installation data, improving data
accuracy by 25% and reducing manual reconciliation time by 35 hours/month.


Handled version mapping for 46 products within the IRIS tool.

Implemented CI/CD pipelines using Jenkins, automating build, test, and deployment processes, improving efficiency by 20%.

Integrated Bitbucket repositories with Jenkins for seamless code integration and version control, reducing merge conflicts by
25%.

Implemented automated weekly data quality checks and analysis using AutoSys, ensuring the accuracy of data inputs and
generating comprehensive reports to inform decision-making.


Graduate Engineer Trainee, LTIMindtree Jul 2021 – Nov 2021 | BANGALORE, India
Trained in Big Data concepts including Data Warehousing, ETL, Databricks, Python, and PySpark.

Applied data engineering concepts and skills in hands-on practice projects.

Skills
Tools and Technologies
Python | Databricks | PySpark | SparkSQL | MySQL | Data Warehousing | AWS S3 | AWS EC2 | Airflow

Jenkins | Bitbucket | Release Lifecycle Management (RLM) | Linux (basics)


Projects
Twitter Data Pipeline using Airflow
Extracted data from the Twitter API, performed data transformations with Python and deployed the code on Airflow running
on EC2 instances.
Ensured efficient task automation and scheduling with Airflow. The final transformed data was stored in Amazon S3,
providing a scalable and cost-effective storage solution.
This project demonstrated expertise in API integration, data transformation, deployment, and cloud storage management.

Awards
Shooting Star Award - Team Contributor, LTIMindtree Sep 2023
Recipient of the Shooting Star Award, awarded to the top 1% of employees in the business unit for consistently exceeding
expectations, demonstrating a commitment to excellence, and significantly contributing to team success.

Education
B.Tech in Computer Science & Engineering Apr 2017 – May 2021
Madhav Institute of Technology & Science
