Apple Data Scientist Resume For Project
SUMMARY
Experienced Data Engineer proficient in Python, Spark, Kafka, Flink, SQL, and NoSQL databases. Skilled in building scalable
data pipelines and distributed systems, with a strong commitment to quality work in fast-paced environments.
WORK EXPERIENCE
BigCommerce
Data Engineer Intern June 2023 - August 2023
· Spearheaded and managed a large-scale Snowflake data retrieval pipeline for efficient data warehousing.
· Implemented logistic regression and other predictive models, achieving 86% accuracy in customer retention prediction.
· Leveraged advanced data-mining techniques to process and analyze 3 million data points, extracting critical features for search
indexing and ranking.
Publicis Sapient
Data Engineer - II June 2019 - July 2022
· Engineered complex ETL pipelines using Apache Spark, optimizing data extraction, transformation, and loading from diverse
sources, such as Redshift, S3, Kinesis Streams, and Kafka.
· Utilized distributed computing frameworks such as Apache Spark to manage large-scale data processing tasks on NoSQL
Cassandra DB, enhancing performance and resource utilization by 15%.
· Improved database performance by optimizing SQL queries and tuning indexes, resulting in a 20% reduction in execution time.
· Revamped and maintained real-time data streaming solutions with Apache Spark and GCP Cloud Run in large-scale
infrastructure markets with over 30 million daily customer transactions in BigQuery, resulting in a 15% revenue increase.
· Independently built a data validation module to simplify and automate validation, reducing operational toil by 15%.
PROJECTS
GenAI PDF Summarization and Interaction with ChatGPT
· Designed and developed GenAI PDF-GPT, a web application utilizing OpenAI’s ChatGPT API and Retrieval-Augmented
Generation (RAG) techniques.
· Summarized and extracted key information from uploaded PDFs using ChatGPT’s summarization capabilities.
· Enabled user interaction with the extracted information through a chat interface powered by ChatGPT.
· Leveraged LangChain for efficient information retrieval from PDFs.
AI/ML-powered Personalized Resume Generation Pipeline
· Designed and implemented ResumeFlow, an AI/ML-powered tool that utilizes Large Language Models (LLMs) to
generate personalized resumes and cover letters for users.
· Leveraged expertise in data engineering and software development best practices to integrate OpenAI’s GPT-3 and Google’s
Gemini models for tailored resume generation.
RAG AI-based Search Engine
· Spearheaded development of a RAG-based data pipeline for building multimodal RAG systems across different file types.
· Converted data from every file type into vector embeddings and stored them, enabling low-latency search.
· Led Python FastAPI development, providing efficient data access and retrieval.
EDUCATION
Arizona State University, Tempe, USA Expected May 2024
Master of Science in Computer Science (GPA: 4/4)
TECHNICAL SKILLS