Project Data Scientist Program Group Project
GENERAL INSTRUCTION
You are required to perform a classification or regression task using supervised learning techniques.
Choose at least THREE suitable machine learning (ML) algorithms (k-NN, decision trees, naïve Bayes,
etc.) and write code to implement them using Python and the Scikit-Learn library.
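As a rough illustration of what implementing three algorithms with Scikit-Learn can look like, here is a minimal sketch. It assumes a classification task and uses Scikit-Learn's built-in breast cancer dataset purely as a stand-in; the dataset, the 75/25 split, and the hyperparameters are illustrative assumptions, not requirements of this project.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Stand-in dataset (assumption): replace with the dataset chosen for the project.
X, y = load_breast_cancer(return_X_y=True)

# Hold out 25% of the rows for testing; stratify to keep class proportions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

# Three candidate algorithms; the hyperparameters are illustrative only.
models = {
    "k-NN": KNeighborsClassifier(n_neighbors=5),
    "Decision tree": DecisionTreeClassifier(random_state=42),
    "Naive Bayes": GaussianNB(),
}

# Fit each model on the training split and score it on the held-out split.
test_accuracy = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    test_accuracy[name] = accuracy_score(y_test, model.predict(X_test))

for name, acc in test_accuracy.items():
    print(f"{name}: test accuracy = {acc:.3f}")
```

For a regression task, the same structure would apply with regressors (for example `KNeighborsRegressor` and `DecisionTreeRegressor`) and a regression metric such as `mean_squared_error`.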
Your work must include the following:
1. Dataset
Select any dataset for the classification or regression task from the UCI database or any other
resource. Use the same dataset for all of the ML algorithms. A dataset that meets the following
criteria will earn more marks:
a) Recent dataset (2020-2024)
b) Sufficient number of data points in the dataset to create a good ML model.
c) Complex dataset where some pre-processing operations are required.
2. Report
2-4. Result
● Figures showing the output results. Choose suitable performance metrics to evaluate the
classification/regression task.
● Based on the model validation and performance evaluation in your work, compare
all of your models in terms of model performance (based on the performance metrics
used).
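One possible way to produce such a comparison is sketched below. It assumes a classification task, uses the built-in breast cancer dataset as a placeholder, and picks 5-fold stratified cross-validation with accuracy and macro-averaged F1 as the metrics. These choices, along with the hypothetical helper `with_preprocessing`, are assumptions for illustration; the metrics and validation scheme should be chosen to suit the actual dataset.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.impute import SimpleImputer
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Placeholder dataset (assumption): replace with the project dataset.
X, y = load_breast_cancer(return_X_y=True)


def with_preprocessing(estimator):
    """Hypothetical helper: impute missing values and scale, then fit."""
    return make_pipeline(
        SimpleImputer(strategy="median"),  # fills missing values, if any
        StandardScaler(),                  # puts features on a common scale
        estimator,
    )


candidates = {
    "k-NN": with_preprocessing(KNeighborsClassifier(n_neighbors=5)),
    "Decision tree": with_preprocessing(DecisionTreeClassifier(random_state=42)),
    "Naive Bayes": with_preprocessing(GaussianNB()),
}

# Same folds and same metrics for every model, so the comparison is fair.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
scoring = ["accuracy", "f1_macro"]

comparison = {}
for name, pipeline in candidates.items():
    scores = cross_validate(pipeline, X, y, cv=cv, scoring=scoring)
    comparison[name] = {
        metric: float(np.mean(scores[f"test_{metric}"])) for metric in scoring
    }

# Print a small comparison table and pick the best model by macro F1.
for name, result in comparison.items():
    print(f"{name:<14} accuracy={result['accuracy']:.3f}  f1_macro={result['f1_macro']:.3f}")

best_model = max(comparison, key=lambda n: comparison[n]["f1_macro"])
print("Best by macro F1:", best_model)
```

Wrapping the preprocessing steps in a pipeline means they are re-fitted on each training fold only, which avoids leaking information from the validation fold into the model.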
SUBMISSION INSTRUCTIONS
Due Date:
(3) Deliverables:
● Dataset used for the assignment
● Notebook (.ipynb file)
● Full Report
LATE POLICY
Work submitted after the due time, without an extension granted by your lecturer, is considered
'late'. Late work incurs a penalty of 10% of the total possible marks per day (including weekends),
deducted from the mark that the work would otherwise receive.