
AMEY SADANAND BHILEGAONKAR

SUMMARY
Experienced Data Engineer proficient in Python, Spark, Kafka, Flink, SQL, and NoSQL databases. Skilled in building scalable
data pipelines and distributed systems, with a strong commitment to quality work in a fast-paced environment.
WORK EXPERIENCE
BigCommerce
Data Engineer Intern June 2023 - August 2023
· Spearheaded and managed a large-scale Snowflake data retrieval pipeline for efficient data warehousing.
· Implemented logistic regression and other predictive models, achieving 86% accuracy in customer retention prediction.
· Leveraged advanced data-mining techniques to process and analyze 3 million data points, extracting critical features for search
indexing and ranking.
Publicis Sapient
Data Engineer - II June 2019 - July 2022
· Engineered complex ETL pipelines using Apache Spark, optimizing data extraction, transformation, and loading from diverse
sources, such as Redshift, S3, Kinesis Streams, and Kafka.
· Utilized distributed computing frameworks such as Apache Spark to manage large-scale data processing tasks on NoSQL
Cassandra DB, enhancing performance and resource utilization by 15%.
· Improved database performance by optimizing SQL queries and tuning indexes, resulting in a 20% reduction in execution time.
· Revamped and maintained real-time data streaming solutions with Apache Spark and GCP Cloud Run for large-scale
infrastructure handling over 30 million daily customer transactions in BigQuery, contributing to a 15% revenue increase.
· Independently built a data validation module that simplified and automated validation checks, reducing operational toil by 15%.
PROJECTS
GenAI PDF Summarization and Interaction with ChatGPT
· Designed and developed GenAI PDF-GPT, a web application utilizing OpenAI’s ChatGPT API and Retrieval-Augmented
Generation (RAG) techniques.
· Summarized and extracted key information from uploaded PDFs using ChatGPT’s summarization capabilities.
· Enabled user interaction with the extracted information through a chat interface powered by ChatGPT.
· Leveraged LangChain for efficient information retrieval from PDFs.
AI/ML-powered Personalized Resume Generation Pipeline
· Designed and implemented ResumeFlow, an AI/ML-powered tool that utilizes Large Language Models (LLMs) to
generate personalized resumes and cover letters for users.
· Leveraged expertise in data engineering and software development best practices to integrate OpenAI’s GPT-3 and Google’s
Gemini models for tailored resume generation.
RAG AI-based Search Engine
· Spearheaded development of a RAG-based data pipeline enabling multimodal retrieval across different file types.
· Converted data from each file type into vector embeddings and stored them for low-latency search.
· Led Python FastAPI development, providing efficient data access and retrieval.
EDUCATION
Arizona State University, Tempe, USA Expected May 2024
Master of Science in Computer Science (GPA: 4/4)

Pune Institute of Computer Technology, Pune, India May 2019
Bachelor of Engineering in Electronics and Telecommunications (GPA: 8.78/10)

TECHNICAL SKILLS

Programming Languages : Python, Unix / Linux Scripting, Java


Cloud Platforms & Databases : GCP, AWS, BigQuery, Cassandra, NoSQL, SQL, PostgreSQL
Data Engineering : PySpark, Snowflake, Airflow, Spark, Kafka, Pandas
DevOps / SRE : CI/CD, Git, Jenkins, Docker, Kubernetes
Certifications : Google Cloud Platform Associate Cloud Engineer
