Tien (Minh) Nguyen

Data Scientist | AI Engineer

About Me

I have worked as a Data Scientist, with a focus on AI engineering, for three years. I build data pipelines and construct machine learning models. Across several projects I have gathered, preprocessed, and evaluated data, and I have deployed ML/AI models on AWS. I have worked with several clients on Upwork and competed in data science competitions. Using data analytics to arrive at insights excites me, and I am especially keen to bring my skills and AWS expertise to new opportunities.

My Contact

[email protected]
+84 39364 1039
Thu Duc, Ho Chi Minh City
Nguyễn Minh Tiến
TienTH
Tien Minh Nguyen

Education Background

University of Information Technology
Bachelor in Data Science
Completed in 2024
GPA: 8.57/10

Languages

Vietnamese
English

Technical

R
SQL
Python
C++

Hard Skills

Programming Languages
Machine Learning Frameworks
Data Analysis and Visualization
Big Data Tools
Natural Language Processing
Data Manipulation and Cleaning

Soft Skills

Observation
Decision making
Communication
Multi-tasking

Professional Experience

Puritype | Lead Data Scientist - Remote
05/2024 – 09/2024
Key Tasks:
Designed and built MVPs for various AI-powered projects.
Conducted hypothesis testing and A/B testing to validate model improvements and optimize system performance.
Developed and deployed recommendation systems, enhancing accuracy through AI techniques and LLM integration.
Collaborated closely with clients to implement feedback, iterating on model performance to meet specific needs.
Applied large language models (LLMs) and advanced AI algorithms to improve recommendation system accuracy and user experience.

FPT Software | Middle Data Scientist - Onsite
03/2023 – 03/2024
Key Tasks:
Implemented ETL processes for data collection and preprocessing, optimizing data pipelines for large-scale projects and ensuring clean, structured data flow into AI/ML models.
Conducted feature engineering to enhance AI/ML model inputs, improving prediction accuracy and identifying key features that contributed to model success.
Evaluated AI/ML model performance using R-squared and Mean Absolute Error (MAE), ensuring reliable and actionable model outputs and continuously improving model accuracy (see the short metric sketch after this section).
Led model validation processes to ensure the robustness of AI/ML models, including thorough testing and sensitivity analysis before deployment.
Integrated feedback loops into the AI/ML models, ensuring the models could adapt to changing business needs and input data variations over time.
Presented findings and model performance reports to both technical and non-technical stakeholders, ensuring clear communication and actionable insights.

Finpros | Junior Data Scientist - Hybrid
03/2021 – 01/2023
Key Tasks:
Assisted in collecting, cleaning, and analyzing datasets to derive insights for various projects.
Supported the development of basic predictive models.
Created visualizations for the team using Matplotlib and Seaborn to clearly convey results.
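
For reference, the R-squared and MAE evaluation mentioned under the FPT Software role can be computed with scikit-learn as in the minimal sketch below; the y_true and y_pred arrays are illustrative placeholders, not project data.

from sklearn.metrics import mean_absolute_error, r2_score

y_true = [3.0, 2.5, 4.0, 5.1]  # observed target values (example only)
y_pred = [2.8, 2.7, 3.9, 5.0]  # model predictions (example only)

mae = mean_absolute_error(y_true, y_pred)  # average absolute deviation
r2 = r2_score(y_true, y_pred)              # proportion of variance explained
print(f"MAE: {mae:.3f}, R^2: {r2:.3f}")
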
Projects

1. Kidney Disease Classification: AWS CI/CD Deployment with GitHub Actions (August 2024):

Objective: To demonstrate an end-to-end deep learning workflow, including deployment on AWS with continuous integration and
continuous deployment (CI/CD) using GitHub Actions.
Tasks:
Developed a comprehensive pipeline for data preparation, model training, and deployment.
Configured AWS services including Amazon EC2 and Amazon ECR for hosting the model.
Implemented GitHub Actions for automated testing and deployment processes.
Skills Utilized: Deep Learning, Data Modeling, Image Segmentation, Computer Vision, AWS, GitHub Actions, Docker.
Outcome: Streamlined the deployment process, reducing time-to-market for machine learning models and enhancing collaboration
through automated workflows.

GitHub: Link to project
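
The model-training stage of this pipeline might look roughly like the sketch below; the dataset paths, image size, backbone choice, and output path are assumptions for illustration, not details taken from the project.

# Hypothetical sketch of the training stage; paths and backbone are assumed.
import tensorflow as tf

IMG_SIZE = (224, 224)

train_ds = tf.keras.utils.image_dataset_from_directory(
    "data/kidney_ct/train", image_size=IMG_SIZE, batch_size=32)  # assumed layout
val_ds = tf.keras.utils.image_dataset_from_directory(
    "data/kidney_ct/val", image_size=IMG_SIZE, batch_size=32)

base = tf.keras.applications.VGG16(include_top=False, weights="imagenet",
                                   input_shape=IMG_SIZE + (3,))
base.trainable = False  # transfer learning: freeze the pretrained backbone

model = tf.keras.Sequential([
    tf.keras.layers.Rescaling(1.0 / 255),
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # binary: disease vs. normal
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(train_ds, validation_data=val_ds, epochs=5)
model.save("artifacts/model.keras")  # artifact later shipped by the CI/CD workflow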

2. Predictive Modeling for Student Support (July 2024 - August 2024):

Objective: To cover the complete machine learning lifecycle from data collection to deployment.
Tasks:
Collected and preprocessed data, performing exploratory data analysis (EDA) to derive insights.
Engineered features and built predictive models, evaluating performance through cross-validation.
Deployed the model to provide real-time predictions.
Skills Utilized: Machine Learning, EDA, Data Visualization, AWS.
Outcome: Successfully created a robust model that improved prediction accuracy, facilitating better decision-making.
GitHub: Link to project
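
A minimal sketch of the cross-validated modeling step described above, assuming a scikit-learn workflow; the feature matrix, target, and model choice are placeholders rather than the project's actual data.

# Placeholder data and model to illustrate cross-validated evaluation.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(42)
X = rng.normal(size=(200, 5))             # placeholder features
y = (X[:, 0] + X[:, 1] > 0).astype(int)   # placeholder binary target

model = RandomForestClassifier(n_estimators=100, random_state=42)
scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
print(f"5-fold accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")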

3. Wav2Lip Real-time Lip-sync Demo: FastAPI Integration & Python Environment Setup (September 2024)

Objective: To build a real-time lip-sync demo using the Wav2Lip model, leveraging FastAPI for the backend, and providing a detailed Python
environment setup and installation guide.
Tasks:
Created an installation guide, detailing the setup process for Python 3.8+, virtual environments, and installing dependencies.
Implemented a FastAPI-based backend to handle real-time video and audio processing for lip-syncing.
Set up a Python virtual environment to manage project dependencies efficiently, ensuring isolation from other system installations.
Installed and configured essential libraries such as FastAPI, uvicorn, numpy, opencv-python, librosa, and asyncio to enable real-time
audio-video synchronization.
Developed a FastAPI server that captures video and audio streams from the user's device, processes them using the Wav2Lip model, and
delivers the synchronized video back to the browser.
Configured the system to use WebSocket for transmitting real-time data between the client and server, ensuring low-latency lip-sync
output.
Wrote a comprehensive Dockerfile to enable easy containerization of the project, ensuring portability and quick deployment across
different environments.
Skills Utilized: FastAPI, WebSockets, Python (3.8+), Docker, numpy, opencv-python, librosa, asyncio, Real-time Video Processing, Audio
Processing.
Outcome: Successfully developed a real-time lip-sync demo that captures, processes, and streams synchronized video back to the user.
The project provided a seamless installation and setup process for developers, improving the ease of use and portability through Docker.

GitHub: Link to project
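
A stripped-down sketch of the FastAPI WebSocket loop described above; run_wav2lip is a hypothetical placeholder for the actual Wav2Lip inference call, and the frame/audio framing protocol is an assumption.

from fastapi import FastAPI, WebSocket

app = FastAPI()

def run_wav2lip(frame_bytes: bytes, audio_bytes: bytes) -> bytes:
    """Placeholder for Wav2Lip inference on one frame/audio chunk."""
    return frame_bytes  # the real implementation returns the lip-synced frame

@app.websocket("/ws/lipsync")
async def lipsync(ws: WebSocket):
    await ws.accept()
    while True:
        frame = await ws.receive_bytes()   # video frame from the browser
        audio = await ws.receive_bytes()   # matching audio chunk
        synced = run_wav2lip(frame, audio)
        await ws.send_bytes(synced)        # stream the synced frame back

In the real project the server would be started with uvicorn (for example, uvicorn main:app), with opencv-python and librosa decoding the incoming frames and audio before inference.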

4. Human-Like Chatbot Using RAG Technology for Emotional Conversations (September 2024)

Objective: To create a chatbot that facilitates realistic, emotionally-driven conversations between a father and son using Retrieval-
Augmented Generation (RAG) technology.
Tasks:
Implemented RAG to generate emotionally resonant and context-aware responses.
Enabled multilingual support (English and Korean) with automatic language detection and translation.
Designed a project structure with organized data, source code, tests, and documentation for maintainability.
Added role-switching capabilities to enhance conversation dynamics.
Developed unit tests to ensure functionality and chatbot performance.
Skills Utilized: RAG, Python, NLP, Multilingual Processing, Emotion-Aware AI, Unit Testing.
Outcome: Delivered a chatbot with human-like, emotionally engaging conversations and seamless language switching.

GitHub: Link to project
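
A minimal sketch of the retrieval step behind a RAG chatbot of this kind, using TF-IDF similarity instead of the project's actual embedding model; the memory snippets and the final LLM call are placeholders.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

memories = [
    "Dad taught me to ride a bike in the park.",
    "We argued about my career choice last spring.",
    "Fishing trips every summer were our tradition.",
]  # invented conversation memories

vectorizer = TfidfVectorizer()
memory_vecs = vectorizer.fit_transform(memories)

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k memories most similar to the user's message."""
    sims = cosine_similarity(vectorizer.transform([query]), memory_vecs)[0]
    top = sims.argsort()[::-1][:k]
    return [memories[i] for i in top]

def generate(query: str) -> str:
    context = "\n".join(retrieve(query))
    prompt = f"Context:\n{context}\n\nUser: {query}\nReply as the father:"
    return prompt  # in the real system this prompt is sent to an LLM

print(generate("Do you remember teaching me to ride a bike?"))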


5. Text Summarizer using Docker on AWS EC2 (July 2024 - August 2024):

Objective: To develop a model that summarizes lengthy documents into concise summaries.
Tasks:
Designed a seamless text summarization workflow utilizing NLP techniques.
Deployed the application using Docker on AWS EC2 for scalability.
Created a conda environment for easy setup and installation of dependencies.
Skills Utilized: Natural Language Processing, MLOps, AWS, Docker.
Outcome: Enabled users to process large texts efficiently, enhancing productivity in information retrieval.

GitHub: Link to project
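
The summarization core can be sketched with the Hugging Face transformers pipeline as below; the specific checkpoint is an assumption, since the project's own model and preprocessing are not shown here.

from transformers import pipeline

# Model name is an assumption; the project may have used a different checkpoint.
summarizer = pipeline("summarization", model="sshleifer/distilbart-cnn-12-6")

document = (
    "Transformer-based summarizers condense long documents into short abstracts. "
    "They are trained on pairs of articles and human-written summaries, and at "
    "inference time they generate a compressed version of the input text."
)  # placeholder input text
summary = summarizer(document, max_length=60, min_length=10, do_sample=False)
print(summary[0]["summary_text"])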

6. Travel Recommendation System using Content-based, Collaborative, and Hybrid Filtering (September 2023 - October 2023):

Objective: To create a recommendation system that suggests travel destinations based on user preferences and behavior.
Tasks:
Collected and prepared data from various sources to build a comprehensive dataset.
Developed algorithms using content-based and collaborative filtering techniques for personalized recommendations.
Evaluated the system's performance through user feedback and adjusted parameters accordingly.
Skills Utilized: Recommender Systems, Python, Data Modeling.
Outcome: Provided tailored travel suggestions, improving user satisfaction and engagement.

GitHub: Link to project
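
A content-based filtering sketch in the spirit of this project; the destinations and descriptions are invented examples, and the collaborative and hybrid components are not shown.

import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

destinations = {
    "Da Lat": "cool climate, pine forests, flowers, coffee",
    "Nha Trang": "beaches, diving, seafood, nightlife",
    "Sa Pa": "mountains, trekking, terraced rice fields",
}  # invented catalogue

names = list(destinations)
vecs = TfidfVectorizer().fit_transform(list(destinations.values()))
sim = cosine_similarity(vecs)  # item-item similarity matrix

def recommend(liked: str, k: int = 2) -> list[str]:
    """Recommend the k destinations most similar to one the user liked."""
    i = names.index(liked)
    order = np.argsort(sim[i])[::-1]
    return [names[j] for j in order if j != i][:k]

print(recommend("Da Lat"))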

7. Sentiment Analysis Project on Vietnamese Social Media Platforms (May 2021 - November 2021):

Objective: To analyze sentiment in user comments on popular Vietnamese social media platforms, focusing on Facebook.
Tasks:
Collected and preprocessed a dataset of over 1 million user comments in Vietnamese from Facebook.
Implemented Natural Language Processing (NLP) techniques to classify sentiments as positive, negative, or neutral.
Utilized a combination of machine learning models, including Support Vector Machine (SVM) and Long Short-Term Memory (LSTM)
networks.
Accuracy: Achieved an overall sentiment classification accuracy of 87%.
Outcome: The analysis provided valuable insights into public opinion on various topics, which helped businesses tailor their marketing
strategies and improve customer engagement.
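
One of the model families mentioned above, a linear SVM over TF-IDF features, can be sketched as below; the Vietnamese comments are invented examples, not rows from the project dataset.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

train_texts = [
    "Sản phẩm rất tốt, tôi rất hài lòng",            # positive example
    "Dịch vụ quá tệ, không bao giờ quay lại",        # negative example
    "Giao hàng bình thường, không có gì đặc biệt",   # neutral example
]
train_labels = ["positive", "negative", "neutral"]

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
clf.fit(train_texts, train_labels)

print(clf.predict(["Nhân viên thân thiện, sẽ ủng hộ tiếp"]))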

Certificates
Google Cloud Certifications: A Tour of Google Cloud Hands-on Labs
Google Cloud Certifications: Big Data and Machine Learning Fundamentals
Google Cloud Certifications: Foundations: Data, Data, Everywhere
IBM Certifications: Data Visualization and Dashboards with Excel and Cognos
IBM Certifications: Excel Basics for Data Analysis
IBM Certifications: Introduction to Data Analytics
Udemy Certifications: Python for Data Analysis & Visualization
Udemy Certifications: Python for Deep Learning: Build Neural Networks in Python

Achievements
Achieved Top 5 in UIT Data Challenge 2023 (March 2023):
Team Leadership: Led a team of data scientists and analysts to compete in the prestigious UIT Data Challenge 2023.
Strategic Planning: Devised a comprehensive plan outlining project milestones, resource allocation, and timelines to ensure a
competitive edge.
Model Selection: Handpicked a robust pretrained model suitable for our dataset, significantly reducing the time-to-deployment.
Data Preprocessing: Actively participated in data cleaning, transformation, and augmentation to prepare a high-quality dataset for
model training.
Error Analysis: Conducted thorough error analysis post-competition to identify and rectify model weaknesses, providing valuable
insights for future projects.
