Assignment3 - CSE4002

CSE4002

Artificial Intelligence Fundamentals

Assignment Specification

Semester 1, 2024
General Information

1. Due Date

The due date of the assignment is

11:59 PM (AEST/AEDT), Wednesday 12 June 2024.

2. Submission (please read this carefully before you submit the assignment)

The assignment submission is electronic only. Any handwritten content, including diagrams, will not be
assessed; in other words, it will receive 0 marks. Submit the assignment through the Assignment 3
Submission Chute, which is available in the Assessment folder of the subject LMS site.

You must submit the following two files separately:

• A Word or PDF document that answers the questions in this Assignment 3.

Use your student ID to name the file. For example, if your student ID is 1234567, then the file name is
ID_1234567.docx, ID_1234567.doc or ID_1234567.pdf. Otherwise, it will not be assessed, which means you
will receive 0 marks. Your assessment submission must generate a similarity score (you are responsible
for checking this), and submitting in Word format is the best way to ensure this. If your submission does
not generate a similarity score, it cannot be checked for plagiarism and therefore will not be marked.

In this document, please answer Task 1, Task 2 and Task 3 in order. Task 1 and Task 2 are Prolog
programming problems; please write the solutions, queries, and answers. Task 3 is a Python programming
problem; you need to write a report on how you solved it.

• A .py or .ipynb file. This Python program solves Task 3 of this assignment. If your
student ID is 1234567, then the file name is ID_1234567.py or ID_1234567.ipynb. Otherwise, it will not be
assessed, which means you will receive 0 marks.

When submitting, please do not zip these files.

3. Weighting

You need to answer all three tasks if you are a master's student. The assignment contributes 40% of the
final assessment for this subject.

4. Academic Integrity

This assessment must be done individually. This means that answers to questions and code that you
write must be your own. You must not collude with other students in any way, and you must not outsource
your work to any third party. La Trobe University treats plagiarism seriously. When detected, penalties are
strictly imposed. Further information can be found at

http://www.latrobe.edu.au/plagiarism/plagiarism.html
Assignment Specification

PROLOG programming [20 marks]

Task 1 (10 marks)

Problem Description:

Write a PROLOG program to represent the following facts and enter the program into the online
executor. There were six people in a family reunion - Anna, Lily, Rebecca, Elizabeth, Mia, and
Olivia. They provide some information about the family relations.

Anna said: “Lily and Rebecca are my daughters”.

Elizabeth said: “Lily is my mom”.

Mia said: “Lily is my mom”.

Rebecca said: “Olivia is my daughter”.

The program defines a predicate mother(X, Y) (X is the mother of Y). Assume that the above
statements are represented by the following facts:

mother(anna, lily).

mother(anna, rebecca).

mother(lily, elizabeth).

mother(lily, mia).

mother(rebecca, olivia).

Requirements:

(1) Define a predicate sister(X, Y) (X and Y are sisters). (1 mark)
(2) Define a predicate cousin(X, Y) (X and Y are cousins). (1 mark)
(3) Define a predicate granddaughter(X, Y) (X is a granddaughter of Y). (1 mark)
(4) Define a predicate descendent(X, Y) (X is a descendent of Y). (2 marks)

Once the program is entered, make the following queries in the PROLOG executor:

(5) ?- sister(X,Y). (1 mark for all answers)
(6) ?- cousin(X,Y). (1 mark for all answers)
(7) ?- granddaughter(X,Y). (1 mark for all answers)
(8) ?- descendent(X,Y). (2 marks for all answers)

Task 2 (10 marks)

Problem Description:

A fruit shop exported a basket of two types of fruit, strawberries and oranges, and the owner wants to
measure the weights. The weight of a strawberry is between 15g and 25g. The weight of an orange is
between 125g and 200g. A robot picks the fruits out of the basket one by one, weighs each fruit and enters
the weight into the PROLOG program that you write. The robot enters 0 (zero) when all fruits have been
picked out of the basket. Your program will then display the average weight of the strawberries and the
average weight of the oranges from the basket.

Hint: We will use a few variables, S_sum, S_num, O_sum, and O_num, to represent the total weight of
input strawberries, total number of input strawberries, total weight of input oranges, and total number of
input oranges, respectively. Then, by dividing the total weight by the total number, we can obtain the
average weight.

Requirements:

(1) Implement a predicate measure(S_sum, S_num, O_sum, O_num). (4 marks)
a. It requires you (the user) to input a weight. (1 mark)
b. It must use the selection (if-then-else) predicate ->/2 so that, if the input is zero (0), the
program displays the average weight of the strawberries and the average weight of
the oranges from the basket. Otherwise, the program requests a new input weight. (3
marks)
(2) Implement a predicate accumulate(X, S_sum, S_num, O_sum, O_num) which accumulates
the total weights and the total numbers of input strawberries and oranges
respectively, and then requests a new input weight. (3 marks)
(3) Implement a predicate add(X, S, N, NS, NN) which updates the running total weight S and
count N into new variables NS and NN, with updated values according to
the user input X. (1 mark)
(4) A sample program execution is shown below (2 marks):
% To run, please type weight.

?- weight.

155.2

188

17.3

126.7

19.9

179.9

24.5

The average strawberry weights: 20.5666666

The average orange weights: 162.45

(PS: please note that there is a line-break.)
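
The two averages in the sample run can be cross-checked by hand or with a few lines of code. The sketch below is in Python rather than Prolog and is only meant to illustrate the accumulation logic from the hint; the function name average_weights is hypothetical, and silently skipping weights that fall outside both ranges is an assumption, since the specification does not say how such inputs should be handled. Your submitted solution must still be written in Prolog.

```python
def average_weights(weights):
    """Return (strawberry_average, orange_average) for a sequence of weights.

    Input stops at the first 0, mirroring the robot's end-of-basket signal.
    """
    s_sum = s_num = o_sum = o_num = 0
    for w in weights:
        if w == 0:
            break  # 0 marks the end of the basket
        if 15 <= w <= 25:  # strawberry range from the problem description
            s_sum += w
            s_num += 1
        elif 125 <= w <= 200:  # orange range from the problem description
            o_sum += w
            o_num += 1
        # Assumption: any other weight is ignored.
    s_avg = s_sum / s_num if s_num else None
    o_avg = o_sum / o_num if o_num else None
    return s_avg, o_avg


# Weights from the sample execution, followed by the terminating 0.
sample = [155.2, 188, 17.3, 126.7, 19.9, 179.9, 24.5, 0]
strawberry_avg, orange_avg = average_weights(sample)
print("The average strawberry weights:", strawberry_avg)
print("The average orange weights:", orange_avg)
```

The three strawberry weights sum to 61.7 and the four orange weights sum to 649.8, which gives the averages shown in the sample, up to floating-point rounding.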


Python Programming [20 marks]

Task 3

Background

Decision Trees (DTs) are a non-parametric supervised learning method used for
classification and regression. The goal is to create a model that predicts the value of a
target variable by learning simple decision rules inferred from the data features. A tree
can be obtained after training. In practice, a predictive decision tree model will
incrementally select the best decisions to split on (evaluated based on the entropy
principle) to provide an output classification based on our input data. For this
assessment, you will be describing a new problem and utilising some machine learning
Python modules to create an ID3 predictive decision tree model, along with a
visualisation to better understand the classification process.

Task 3 will measure your ability to 1) study a problem that can be tackled by an
artificial intelligence method, e.g., a decision tree; 2) implement the decision tree using
the given instructions; 3) visualise and analyse the decision tree. The objective of this
assessment is to utilise the pandas and scikit-learn libraries to implement the ID3
decision tree machine learning algorithm and create a classifier that can tackle a specific
problem. By the end of this assessment, you should have a better understanding of how
decision trees work. By studying the trained tree visualisation, you should reinforce your
understanding of the core decision tree principles and of how the tree's split evaluations
operate.
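
The entropy principle mentioned in the Background can be made concrete with a small calculation. The sketch below is a minimal illustration in plain Python, not part of the required solution; the helper names entropy and information_gain and the toy labels are assumptions made for this example. It computes the Shannon entropy of a set of class labels and the information gain of splitting that set into two groups.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels).values()
    return -sum((c / total) * math.log2(c / total) for c in counts)


def information_gain(parent, left, right):
    """Entropy reduction achieved by splitting parent into left and right."""
    n = len(parent)
    weighted_child_entropy = (
        len(left) / n * entropy(left) + len(right) / n * entropy(right)
    )
    return entropy(parent) - weighted_child_entropy


# Toy example: four wines, two of each class.
parent = ["class_0", "class_0", "class_1", "class_1"]

# A split that separates the classes perfectly removes all uncertainty.
perfect_gain = information_gain(parent, parent[:2], parent[2:])

# A split that leaves both groups mixed gains nothing.
useless_gain = information_gain(
    parent, ["class_0", "class_1"], ["class_0", "class_1"]
)

print(entropy(parent), perfect_gain, useless_gain)
```

A decision tree trained with the entropy criterion prefers, at each node, the split with the largest information gain.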

Problem Description

You will be creating a decision tree that will predict wine classes based on provided attributes.
Imagine that you are a wine producer compiling data for a study. The data is the results of a
chemical analysis of wines grown in the same region in Italy by three different cultivators. There
are thirteen different measurements taken for different constituents found in the three types of
wine, class_0, class_1, and class_2.

Those thirteen different measurements include: Alcohol, Malic acid, Ash, Alcalinity of ash,
Magnesium, Total phenols, Flavanoids, Nonflavanoid phenols, Proanthocyanins, Color
intensity, Hue, OD280/OD315 of diluted wines and Proline.

Therefore, the model's inputs are those thirteen measurements and the
model output should be the wine class.
Implementation instructions
Assignment dependency installation

You will have to install the following required dependencies:

• scikit-learn
• pandas
• matplotlib

Imports

You will be utilising a number of well-known machine learning Python modules in this
task. These modules make the decision tree easy to implement and include
numerous learning tools to help boost your understanding.

import pandas as pd

import sklearn

import matplotlib.pyplot as plt

Load the dataset and Format the training/testing data

You will be using the following code to load the dataset:

# Load wine dataset

from sklearn.datasets import load_wine

data = load_wine()

The pandas python module is a very powerful, frequently used data analysis tool in all
forms of machine learning; it allows you to store and manipulate large datasets very
easily and is highly compatible/integrated with other machine learning tools/modules.

You will be storing your dataset into a pandas DataFrame. A DataFrame is very similar to
a dictionary in standard Python but has many additional useful features. Add our data
into this DataFrame by specifying the data keys and corresponding values.
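
One possible way to build the DataFrame is sketched below. It assumes the feature_names, data and target attributes of the object returned by load_wine; the column name "target" is a choice made for this example, not something the specification prescribes.

```python
import pandas as pd
from sklearn.datasets import load_wine

data = load_wine()

# data.data holds the 13 measurements for each wine;
# data.feature_names holds the matching column names.
df = pd.DataFrame(data.data, columns=data.feature_names)

# Append the class label (0, 1 or 2) as an extra column.
# The column name "target" is an assumption made for this sketch.
df["target"] = data.target

print(df.shape)
print(df.head())
```

Inspecting df.head() and df.describe() before training is a quick way to confirm that the data loaded as expected.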

Next, we need to create a training set for training the classifier and a test
set for evaluating the quality of the trained classifier. Creating the training and test sets
is straightforward for this task: we can simply split the collected data into two groups.
For example, if we have collected 100 records, we can use 80 records as the training
set, leaving the remaining 20 records as the test set.
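
The 80/20 split described above could be done with scikit-learn's train_test_split, as in the sketch below. The test_size of 0.2 mirrors the example ratio; the random_state value and the use of stratify (which keeps the class proportions similar in both sets) are assumptions made here, not requirements of the assignment.

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split

data = load_wine()
X, y = data.data, data.target

# Hold out 20% of the records for testing.
# random_state makes the shuffle reproducible; stratify=y is optional.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

print(len(X_train), len(X_test))
```

Shuffling before splitting matters for this dataset because the records are stored grouped by class.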

Train the decision tree

Once you have correctly formatted your data, you can move on to creating the decision
tree. Scikit-learn is a powerful machine learning framework; you will be utilising its
DecisionTreeClassifier class to create and train your decision tree. Create a new
DecisionTreeClassifier and pass 'entropy' as the criterion so that splits are chosen by
information gain. Please follow the instructions at
https://scikit-learn.org/stable/modules/tree.html

Call the fit method on this classifier object to train the decision tree.
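
A minimal sketch of the training step follows. The split parameters and random_state values are assumptions made for this example. Note that scikit-learn documents its tree implementation as an optimised version of CART, so criterion="entropy" gives ID3-style entropy-based splits rather than a literal ID3 implementation.

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

data = load_wine()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.2, random_state=42
)

# criterion="entropy" selects splits by information gain.
clf = DecisionTreeClassifier(criterion="entropy", random_state=42)

# fit learns the tree from the training records only.
clf.fit(X_train, y_train)

print("Tree depth:", clf.get_depth())
print("Number of leaves:", clf.get_n_leaves())
```

The depth and leaf count give a first impression of how complex the learned tree is before you visualise it.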

Create a graph visualisation

Next, utilise the plt.figure method together with sklearn.tree.plot_tree to draw the trained
decision tree graph. The decision tree graph visualisation should be saved in your working
directory as a PDF named output_graph.

Discuss the generated graph visualisation in detail in your report. You will have to put
the generated graph visualisation (output_graph) in the Word file as well. Some useful
information you may need for the visualisation of a decision tree:

https://scikit-learn.org/stable/modules/tree.html and
https://scikit-learn.org/stable/modules/generated/sklearn.tree.plot_tree.html
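
The visualisation step might look like the sketch below, which uses plt.figure and sklearn.tree.plot_tree and saves the result as output_graph.pdf. The figure size, the filled=True option and the non-interactive Agg backend are assumptions made for this example. The tree is fitted on all records here only to keep the sketch short; in your program, plot the classifier you trained on the training set.

```python
from pathlib import Path

import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
from sklearn.datasets import load_wine
from sklearn.tree import DecisionTreeClassifier, plot_tree

data = load_wine()
clf = DecisionTreeClassifier(criterion="entropy", random_state=42)
clf.fit(data.data, data.target)  # shortcut for the sketch only

# Draw the tree with readable feature and class names at each node.
fig = plt.figure(figsize=(16, 10))
plot_tree(
    clf,
    feature_names=list(data.feature_names),
    class_names=list(data.target_names),
    filled=True,
)

# Save the figure into the working directory as a PDF.
out_path = Path("output_graph.pdf")
fig.savefig(out_path)
plt.close(fig)
```

Each node in the resulting graph shows the split condition, the entropy, the number of samples and the class distribution, which are the details to discuss in the report.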

Test the decision tree

To test your developed model, pass the test set into your trained
decision tree classifier and compare its predictions with the true classes. You must
also calculate the classification accuracy.
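
A minimal sketch of the testing step, assuming the same split and classifier settings as in the earlier steps (the random_state values are illustrative), uses predict and accuracy_score:

```python
from sklearn.datasets import load_wine
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

data = load_wine()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.2, random_state=42
)

clf = DecisionTreeClassifier(criterion="entropy", random_state=42)
clf.fit(X_train, y_train)

# Predict the class of every wine in the held-out test set.
y_pred = clf.predict(X_test)

# Accuracy is the fraction of test wines classified correctly.
accuracy = accuracy_score(y_test, y_pred)
print(f"Classification accuracy: {accuracy:.3f}")
```

The accuracy depends on how the data was split, so report the split settings alongside the number.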

Detailed program specifications

According to the above descriptions, this task requires you to fulfill the following
objectives:

(1) Load the wine dataset correctly, and split it into train/test sets appropriately;
(2) Appropriate implementation of a decision tree for solving the classification task
using the loaded dataset;
(3) Train the decision tree correctly, obtaining good test results (the actual test
performance should be stated in the report);
(4) Visualize the trained decision tree correctly.

Discussions for the Task 3

Task 3 requires a detailed report explaining what you have done and what you have
learned by completing it. You are required to describe the problem definition, the logic of
your decision tree implementation, the results (the accuracy) of your code, and
so on.

Lastly, you are asked to include a section describing your understanding of
decision trees and other AI-related technologies: for example, what you have learned
from doing this assignment and how such technology can enhance our capabilities.
Interesting and insightful ideas about what AI will be in the future are very welcome.

Task 3 will be marked based on the following criteria:

• Quality report [12 marks]
  o A good quality report is clean and easy to follow, with nicely written language
    and comprehensive explanations, together with figures (2 marks)
  o Problem definition and design (2 marks)
  o Methodology (2 marks)
  o Detailed explanation of the trained decision tree (4 marks)
  o Insightful discussion (2 marks)
• Executable Python program and correctness of the output [8 marks]
  o The executable code is clean and easy to read, without any bugs (1
    mark). Comments explain the code (1 mark)
  o Input data is loaded successfully (1 mark) and the split into training and test
    sets is correct (1 mark)
  o The decision tree training and testing are implemented flawlessly using the
    given instructions (2 marks)
  o A classification accuracy is printed. (1 mark)
  o A decision tree visualisation is provided. (1 mark)
