ITS66604 MidTerm Individual - Preview


TAYLOR’S UWE DUAL AWARDS PROGRAMMES

April 2024 Semester


Machine Learning and Parallel Computing
(ITS66604)

Mid Term Test – Individual (10%)


Due Date: 9 June 2024 via myTIMeS (11:59 pm)

STUDENT DECLARATION
1. I confirm that I am aware of the University’s Regulation Governing Cheating in a University
Test and Assignment and of the guidance issued by the School of Computing and IT
concerning plagiarism and proper academic practice, and that the assessed work now
submitted is in accordance with this regulation and guidance.
2. I understand that, unless already agreed with the School of Computing and IT, assessed
work may not be submitted that has previously been submitted, either in whole or in part,
at this or any other institution.
3. I recognise that should evidence emerge that my work fails to comply with either of the
above declarations, then I may be liable to proceedings under Regulation.

Student Name Student ID Date Signature Score


About this Mid Term Test:
Module Learning Outcome: On completion of this alternative assessment, students should be
able to:
o MLO2: To design and develop machine learning algorithms to solve a problem.

Instructions:
1. Follow the instructions provided in MyTimes questions.

2. Download the dataset using the link provided in MyTimes.

3. Create a document (Word or Google Doc) in an online location and paste its URL into the
designated box in MyTimes. Paste all screenshots into this document. A template
is provided in MyTimes.

4. Write descriptive answers to the questions under each task in MyTimes.


Your answers should be clear and concise, and they should demonstrate your
understanding of the concepts being covered.

5. Attach the URL of your Google Colab notebook, or your original program files (.py or .ipynb),
to the report upon submission. Write a proper program in Python and execute the
code. When writing your code, follow best practices and document your code
thoroughly. Test your code thoroughly to ensure that it works correctly.
The Case Study:

This case study investigates the use of machine learning to predict total conversion from a
social media ad campaign. The dataset contains 1143 observations in 11 variables.

Metadata:
Attributes        Description
age               Age of the person to whom the ad is shown
gender            Gender of the person to whom the ad is shown
interest          A code specifying the category to which the person's interest
                  belongs (interests are as mentioned in the person's Facebook
                  public profile)
Impressions       The number of times the ad was shown
Clicks            Number of clicks on that ad
Spent             Amount paid by company XYZ to Facebook to show that ad
Total conversion  Total number of people who enquired about the product after
                  seeing the ad

Objective

The objective of this case study is to develop an optimized regression model to predict total
conversion. The model should be able to accurately predict total conversion for new ad
campaigns, based on the features that are available.

Approach

The following approach will be used to develop the optimized regression model:

• Exploratory data analysis (EDA) will be performed to understand the distribution of the
variables and the relationship between the variables.
• A baseline linear regression model will be trained to predict total conversion.
• The performance of the baseline model will be evaluated using various metrics, such as
mean squared error (MSE), root mean squared error (RMSE), and R-squared.
• More complex regression models, such as polynomial regression, will be trained and
evaluated.
• The best performing regression model will be selected, tuned, optimized and used to
predict total conversion for new ad campaigns.
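
As a rough illustration of the baseline-versus-polynomial comparison described in the approach above, the sketch below fits both models with scikit-learn. The data is synthetic (a made-up quadratic relationship, not the ad campaign dataset), and the degree of 2 is an arbitrary choice for the example rather than a recommended setting.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic data with a curved relationship (assumption, not the real dataset).
rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, size=(200, 1))
y = 0.5 * x[:, 0] ** 2 + x[:, 0] + rng.normal(0, 0.2, size=200)

# Baseline: plain linear regression on the raw feature.
baseline = LinearRegression().fit(x, y)

# Polynomial regression: expand the feature to degree 2, then fit linearly.
poly = make_pipeline(PolynomialFeatures(degree=2), LinearRegression()).fit(x, y)

# Scores are computed on the training data here only to keep the sketch short.
r2_baseline = r2_score(y, baseline.predict(x))
r2_poly = r2_score(y, poly.predict(x))
print(f"baseline R^2: {r2_baseline:.3f}, polynomial R^2: {r2_poly:.3f}")
```

In a real comparison the scores would normally be computed on a held-out test set, since a higher-degree model always fits the training data at least as well as the baseline.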

Benefits

The development of an optimized regression model to predict total conversion will provide the
following benefits:

• The model can be used to identify the ad campaigns that are most likely to be successful.
• The model can be used to optimize the ad campaigns to improve total conversion.
• The model can be used to predict the total conversion for new ad campaigns, before they
are launched.

This case study will demonstrate the use of machine learning to solve a real-world problem. The
optimized regression model that is developed in this case study can be used by businesses to
improve their social media ad campaigns and increase their sales.

Question 1: EDA (30 marks)

Perform exploratory data analysis (EDA) on the dataset to understand the distribution of the
variables and the relationship between the variables.

a. Use relevant visualizations to inspect the data quality, including checking for missing
values.

Provide a description of your main findings in MyTimes text box, paste the screenshots
into the online document.

(5 marks)
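
One possible starting point for the data-quality check in part (a) is a per-column count of missing values with pandas. The small dataframe below is a made-up stand-in and its column names are assumptions; the real dataset would be loaded from the file provided in MyTimes.

```python
import numpy as np
import pandas as pd

# Tiny synthetic stand-in for the real dataset (values and column names are assumptions).
df = pd.DataFrame({
    "age": ["30-34", "35-39", None, "45-49"],
    "gender": ["M", "F", "F", "M"],
    "Impressions": [7350, 17861, 693, np.nan],
    "Clicks": [1, 2, 0, 1],
})

# Count and percentage of missing values per column.
missing = df.isna().sum()
missing_pct = (df.isna().mean() * 100).round(1)
summary = pd.DataFrame({"missing": missing, "percent": missing_pct})
print(summary)

# Exact duplicate rows are another common data-quality check.
n_duplicates = int(df.duplicated().sum())
print("duplicate rows:", n_duplicates)
```

The `summary` table can then be turned into a bar chart or heatmap with whichever plotting library is preferred.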

b. Examine the range of values among all the features. Analyze the spread of the data and
identify any potential outliers.

Provide a description of your main findings in MyTimes text box, paste the screenshots
into the online document.

(5 marks)
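
For part (b), one common (though not the only) way to flag potential outliers is the 1.5 × IQR rule, which is also what a box plot draws. The sketch below applies it to a made-up `Spent` series; the values are illustrative only.

```python
import pandas as pd

# Made-up values for illustration; the last one is deliberately extreme.
spent = pd.Series([1.4, 1.8, 1.0, 1.3, 2.1, 1.9, 250.0], name="Spent")

# Range and spread: min, max, quartiles, mean and standard deviation.
print(spent.describe())

# 1.5 * IQR rule: values beyond the fences are flagged as potential outliers.
q1, q3 = spent.quantile(0.25), spent.quantile(0.75)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = spent[(spent < lower) | (spent > upper)]
print(outliers)
```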

c. Identify and visualize the features that have the strongest correlations with total
conversion.

Provide a description of your main findings in MyTimes text box, paste the screenshots
into the online document.

(10 marks)
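
A minimal sketch of ranking features by their correlation with the target for part (c) is shown below. The data is randomly generated so that `Impressions` drives the target, and the column name `Total_Conversion` is an assumption about how the target is labelled in the file.

```python
import numpy as np
import pandas as pd

# Synthetic data (assumption): the target depends on Impressions, not on interest.
rng = np.random.default_rng(0)
n = 200
impressions = rng.uniform(1_000, 100_000, size=n)
df = pd.DataFrame({
    "Impressions": impressions,
    "Clicks": 0.0002 * impressions + rng.normal(0, 2, size=n),
    "interest": rng.integers(2, 115, size=n),
    "Total_Conversion": 0.00005 * impressions + rng.normal(0, 0.5, size=n),
})

# Absolute Pearson correlation of every feature with the target, strongest first.
corr_with_target = (
    df.corr()["Total_Conversion"]
    .drop("Total_Conversion")
    .abs()
    .sort_values(ascending=False)
)
print(corr_with_target)
```

The full matrix returned by `df.corr()` is the usual input for a correlation heatmap.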

d. Perform feature encoding on ‘Age’ using label encoding and ‘Gender’ using one-hot
encoding techniques.

Provide the name of the dataframe that stores this output in MyTimes text box, paste the
screenshots of the dataframe into the online document.

(10 marks)
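
The two encodings in part (d) can be sketched as follows, assuming the columns are named `age` and `gender` and that age is stored as bracket strings such as "30-34" (both are assumptions about the file). scikit-learn's `LabelEncoder` assigns integers in sorted order of the labels, which happens to match the age order for these brackets.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Made-up sample; column names and age brackets are assumptions.
df = pd.DataFrame({
    "age": ["30-34", "45-49", "35-39", "30-34", "40-44"],
    "gender": ["M", "F", "F", "M", "F"],
})

df_encoded = df.copy()

# Label encoding: each distinct age bracket becomes one integer.
encoder = LabelEncoder()
df_encoded["age"] = encoder.fit_transform(df_encoded["age"])

# One-hot encoding: one 0/1 column per gender value.
df_encoded = pd.get_dummies(df_encoded, columns=["gender"], dtype=int)
print(df_encoded)
```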
Question 2: Baseline Regression Model (30 marks)

Train a baseline linear regression model to predict the total conversion. Evaluate the
performance of the baseline model.

a. Based on your EDA findings in question 1, select at least two relevant features as
independent variables (X) to predict total conversion (y).

Name the features you selected in MyTimes text box, provide functional code with its
output in your google Colab notebook.

(5 marks)

b. Perform train-test split at an 80:20 train-test ratio.

Provide the line of code in MyTimes text box, provide functional code with its output in
your google Colab notebook.

(5 marks)
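
An 80:20 split as in part (b) is typically done with scikit-learn's `train_test_split` by setting `test_size=0.2`. The arrays below are placeholders for the selected features and target, and the `random_state` value is arbitrary.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data: 10 rows, 2 features (stand-ins for the selected X and y).
X = np.arange(20).reshape(10, 2)
y = np.arange(10)

# test_size=0.2 keeps 80% of the rows for training and 20% for testing;
# random_state fixes the shuffle so the split is reproducible.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
print(X_train.shape, X_test.shape)
```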

c. Train the linear regression model and report the equation of the model with the trained
coefficients. (10 marks)

Provide the equation of the model in MyTimes text box, provide the screenshot of features
and coefficients in online document, provide functional code with its output in your
Google Colab notebook.
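
One way to turn a fitted scikit-learn model into a written equation for part (c) is to read `intercept_` and `coef_` and pair each coefficient with its feature name. The sketch below uses noise-free synthetic data with known coefficients; the feature names `Clicks` and `Spent` and the target name are placeholders, not a suggested feature selection.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Noise-free synthetic data generated from y = 1.5 + 2.0*x1 + 0.5*x2 (assumption).
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(100, 2))
y = 1.5 + 2.0 * X[:, 0] + 0.5 * X[:, 1]

model = LinearRegression().fit(X, y)

# Pair each trained coefficient with a (placeholder) feature name.
feature_names = ["Clicks", "Spent"]
terms = " + ".join(
    f"{coef:.3f} * {name}" for coef, name in zip(model.coef_, feature_names)
)
equation = f"Total_Conversion = {model.intercept_:.3f} + {terms}"
print(equation)
```

With real data the coefficients can be negative, in which case the string formatting would need to handle the sign.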

d. Evaluate the model performance using appropriate metrics and provide a detailed
explanation of the results. (10 marks)

Provide the metrics and detailed explanation in MyTimes text box, provide relevant
screenshot of features and coefficients in online document, provide functional code with
its output in your Google Colab notebook.
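
The metrics named in the approach (MSE, RMSE and R-squared) can be computed with scikit-learn as sketched below. The four true and predicted values are made up so the arithmetic is easy to follow by hand.

```python
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score

# Made-up values: only the last prediction is wrong, by 2.
y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = np.array([1.0, 2.0, 3.0, 6.0])

# MSE: mean of squared errors = (0 + 0 + 0 + 4) / 4.
mse = mean_squared_error(y_true, y_pred)

# RMSE: square root of MSE, in the same units as the target.
rmse = np.sqrt(mse)

# R-squared: 1 - SS_res / SS_tot = 1 - 4 / 5.
r2 = r2_score(y_true, y_pred)

print(f"MSE={mse:.2f}, RMSE={rmse:.2f}, R^2={r2:.2f}")
```

In the assignment setting these would be computed on the test split, comparing `y_test` with the model's predictions for `X_test`.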

_end of preview_
Question 3: Polynomial regression (20 marks)

Refer to MyTimes.

Question 4: Regularization (20 marks)

Refer to MyTimes.
MACHINE LEARNING AND PARALLEL COMPUTING ITS66604
Mid Term Marking Scheme (APRIL 2024)
Score (percentage of the allocated marks for each task):
  Excellent: >= 90% (18 marks)
  Good:      < 90%, >= 70% (14 - 17 marks)
  Average:   < 70%, >= 40% (8 - 13 marks)
  Poor:      < 40% (0 - 7 marks)

Criteria

Question 1: EDA
  Excellent: Demonstrates a deep understanding of the data and its relationships. Able to
             identify key patterns and insights, and communicate findings clearly and
             concisely.
  Good:      Demonstrates a good understanding of the data and its relationships. Able to
             identify key patterns and insights, and communicate findings clearly.
  Average:   Demonstrates a basic understanding of the data and its relationships. Able to
             identify key patterns and insights, but communication may be less clear and
             concise.
  Poor:      Demonstrates a limited understanding of the data and its relationships. Unable
             to identify key patterns and insights, and communication is unclear and not
             concise.

Question 2: Baseline Regression Model
  Excellent: Demonstrates a deep understanding of linear regression and its application to
             the problem at hand. Selects appropriate evaluation metrics and interprets them
             correctly.
  Good:      Demonstrates a good understanding of linear regression and its application to
             the problem at hand. Selects appropriate evaluation metrics and interprets them
             mostly correctly.
  Average:   Demonstrates a basic understanding of linear regression and its application to
             the problem at hand. May or may not select appropriate evaluation metrics;
             interpretation may be incomplete or incorrect.
  Poor:      Demonstrates a limited understanding of linear regression and its application
             to the problem at hand. May not be able to select appropriate evaluation
             metrics, and interpretation may be incomplete or incorrect.

Question 3: Polynomial Regression
  Excellent: Able to explain how to choose the degree of the polynomial regression model;
             trains and evaluates the model correctly. Analyzes and compares the performance
             metrics of the polynomial regression model to the performance metrics of the
             baseline model in a comprehensive and informative way.
  Good:      Able to choose the degree of the polynomial regression model; trains and
             evaluates the model correctly. Analyzes and compares the performance metrics of
             the polynomial regression model to the performance metrics of the baseline
             model in an informative way.
  Average:   May or may not be able to choose the degree of the polynomial regression model;
             may or may not train and evaluate the model correctly. Analysis and comparison
             of the performance metrics of the polynomial regression model to the
             performance metrics of the baseline model may be limited in scope and depth.
  Poor:      Not able to choose the degree of the polynomial regression model; does not
             train and evaluate the model correctly. Analysis and comparison of the
             performance metrics of the polynomial regression model to the performance
             metrics of the baseline model may be incomplete or incorrect.
Question 4: Optimizing the Regression Model
  Excellent: Able to explain how to tune hyperparameters and select the best model.
             Discusses the hyperparameters that they would tune and how they would choose
             the best model in a comprehensive and informative way.
  Good:      Able to explain how to tune hyperparameters and select the best model.
             Discusses the hyperparameters that they would tune and how they would choose
             the best model in a generally correct way.
  Average:   May or may not be able to explain how to tune hyperparameters and select the
             best model correctly. Discussion of the hyperparameters that they would tune
             and how they would choose the best model may also be limited in scope and
             depth.
  Poor:      May not be able to explain how to tune hyperparameters and select the best
             model correctly. Discussion of the hyperparameters that they would tune and how
             they would choose the best model may also be incomplete or incorrect.

- END -
