CSC462 Artificial Intelligence Lab Manual Version3


Artificial Intelligence

CSC462
Lab Manual

Name

Registration Number

Class

Instructor’s Name
Introduction
The "Artificial Intelligence Lab Manual" is a comprehensive resource designed to
accompany a laboratory course on artificial intelligence. This lab manual aims to provide
students with practical, hands-on experience in the exciting field of AI. Through a series
of guided exercises and small projects, students will explore core AI concepts,
algorithms, and tools, enabling them to develop a strong foundation in this rapidly
evolving discipline. To achieve this goal, the lab is divided into three main modules.

1) Introduction to Python, aimed at giving students the basic concepts of the Python
language in the context of AI.

2) Implementation of search algorithms in Python 3. This module covers fundamental
to advanced topics, including different AI agents, search algorithms, and knowledge
representation.

3) Implementation of machine learning algorithms using Python. Supervised learning
methods are covered, from the basic concept of linear regression through to classifiers.
Basic neural network methods are also included to help students with their projects
using supervised and unsupervised learning.

The labs constitute 25% of the total marks for this course. During the labs you will
work individually. You will be graded on your work and the 'In-Lab' tasks during the
in-lab viva. You will complete the 'Post-Lab' section of each lab before coming to the
next week's lab.

You are not allowed to wander around the lab or consult other groups while performing
experiments. Similarly, the lab reports must contain original work. CUI has a
zero-tolerance anti-plagiarism policy.

Apart from these weekly labs, you will complete a semester project in groups of
three students. The Lab Mid will be conducted during a lab session. The Final Project /
Final Exam will be graded as the Lab Terminal. The grading policy is discussed in the
Course Description File.
Acknowledgement
The labs for CSC462 Artificial Intelligence were designed by Ms. Sadaf Iqbal. That first
version was still a work in progress. Afterwards, Dr. Atif Shakeel and Dr. Azfar Yaqub
completed the manual. The third version was prepared by Dr. Ali Raza Shahid, who
restructured the labs to help students apply advanced AI learning methods in the
semester project.

Resources:
• Lab Manual
• https://www.tutorialspoint.com/
• https://github.com/vinta/awesome-python

History of Revision

Version 1.0 (February 3, 2020): Ms. Sadaf Iqbal, Mr. Noman Ahmed
This is the first editable draft of the CSC462 AI lab manual.

Version 2.0 (September 16, 2022): Dr. Atif Shakeel, Dr. Azfar Yaqub
This is the updated lab manual for the upcoming Fall 2022 semester with all the
improvements/changes suggested in CDF version 3. It might need further improvements.

Version 3.0 (September 20, 2023): Dr. Ali Raza Shahid
Updated draft of CSC462. For any details contact: [email protected]
List of Experiments

Lab # Description
Lab 1 Python Introduction – Data Structures and Loops
Lab 2 Python Introduction – Functions, Tuples, Lists and Dictionaries
Lab 3 Object Oriented Design & Breadth First Search Algorithm in Python
Lab 4 Depth First Search for Graph and Tree Traversal
Lab 5 Greedy Best First Search / A* Search for Graphs
Lab 6 Hill Climb Search Algorithm Implementation
Lab 7 Genetic Algorithm
Lab 8 Constraint Satisfaction Problem
Lab 9 Linear Regression and Gradient descent
Lab 10 Logistic Regression / Classification in Machine learning
Lab 11 Decision Tree Classification / KNN Classification
Lab 12 Convolution Neural Networks

Lab #1: Python Introduction – Data Structures and Loops

Objectives:

This lab is an introductory session on Python. It is a powerful open-source programming
language, comparable to MATLAB, Perl, Ruby, Scheme and Java. Some of Python's notable
features are:

• Easy to use, which makes it simple to get your first programs working.
• Easy to learn, which makes it an excellent choice for beginners.
• Runs everywhere, including macOS, Windows, Linux and Unix.

In this lab students will be able to:

• Install a Python IDE and use a cloud IDE
• Perform basic operations on strings and numbers
• Make basic use of conditionals
• Make basic use of loops

Pre-Lab Reading 1:

Installing Python IDE on Local Computer:

https://www.geeksforgeeks.org/how-to-install-jupyter-notebook-in-windows/

For Cloud Python IDE along with GPU resource:

https://colab.research.google.com

Pre-Lab Reading 2:

As a pre-lab reading activity, read Chapters 3-5 from the book "Introduction to Programming
Using Python by Y. Liang (Pearson, 2013)" to gain insight into Python programming and
its fundamentals.

Pre-Lab Reading 3: Background

Programming is simply the act of entering instructions for the computer to perform. These
instructions might crunch some numbers, modify text, look up information in files, or
communicate with other computers over the Internet. All programs use basic instructions as
building blocks. Here are a few of the most common ones, in English:

• “Do this; then do that.”


• “If this condition is true, perform this action; otherwise, do that action.”
• “Do this action that number of times.”
• “Keep doing that until this condition is true.”

You can combine these building blocks to implement more intricate decisions, too. Programming
is a creative task, somewhat like constructing a castle out of LEGO bricks. You start with a basic
idea of what you want your castle to look like and inventory your available blocks. Then you
start building. Once you’ve finished building your program, you can pretty up your code just like
you would your castle.
Python refers to the Python programming language (with syntax rules for writing what is
considered valid Python code) and the Python interpreter software that reads source code (written
in the Python language) and performs its instructions. The name Python comes from the surreal
British comedy group Monty Python, not from the snake. Python programmers are affectionately
called Pythonistas.

Pre-Lab Reading 4: Loops in Python:

▪ The 'while' loop

a = 0
while a < 10:
    a = a + 1
    print(a)

▪ The 'for' loop

for i in range(1, 5):
    print(i)

for i in range(1, 5):
    print(i)
else:
    print('The for loop is over')

In-Lab Task 1:

Display numbers on screen using Python IDE.

1. Run Python IDLE


2. Type in any number, say 24 and press Enter.
3. 24 should be printed on the screen
4. Now type 4.2, press Enter.
5. 4.2 should be displayed on screen as an output.
6. Now type print(234). Press Enter. 234 will be the output.
7. Type print(45.90) and press Enter. The output will show 45.90 on screen.

In-Lab Task 2:

Display strings on screen.

1. Python recognizes strings through quotes (single or double). Anything inside quotes is a
string for the Python interpreter.
2. Type hello and press Enter. An error message will be displayed, as the Python interpreter
does not understand this as a string.
3. Type 'hello' and press Enter. hello will be displayed.
4. Type 'Quote me on this!' and press Enter. The same string will be displayed.
5. Type "What's your name?" and press Enter. What's your name? will be printed on the
screen.
6. You can specify multi-line strings using triple quotes (""" or '''). Type the following text
and press Enter.

The results on the console should be like this:
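The console screenshot is not reproduced in this copy; a minimal sketch of the triple-quote step (the string content itself is illustrative):

```python
# Triple quotes let a string span several lines.
message = """This is the first line.
This is the second line.
All of it is one string object."""
print(message)
```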

In-Lab Task 3:

Use Python as a calculator.

1. The interpreter acts as a simple calculator: you can type an expression at it and it will
write the value. Expression syntax is straightforward: the operators + (addition), -
(subtraction), * (multiplication) and / (division) work just like in most other languages.
Parentheses (()) can be used for grouping.

2. Type in 2 + 2 and press Enter. Python will output its result i.e. 4.
3. Try following expressions in the shell.
a. 50 – 4
b. 23.5 – 2.0
c. 23 – 18.5
d. 5 * 6
e. 2.5 * 10
f. 2.5 * 2.5
g. 28 / 4 (note: division returns a floating point number)
h. 26 / 4
i. 23.4 / 3.1

In-Lab Task 4:

Get an integer answer from division operation. Also get remainder of a division operation
in the output.

1. Division (/) always gives a float answer. There are different ways to get a whole
number as an answer.
2. Use the // (floor division) operator:
a. Type 28 // 4 and press Enter. The answer is a whole number.
b. Type 26 // 4 and press Enter.
3. Use the int/float cast functions, which convert between the interchangeable types:
a. Type int(26/4) and press Enter. The answer is a whole number.
b. Type int(28/4), then float(28/4); press Enter.
4. The modulus (or mod) % operator is used to get the remainder as an output (the
division / operator returns the quotient).
a. Type 28 % 4. Press Enter. 0 will be the result.
b. Type 26 % 4. Press Enter. 2 will be the result.
In-Lab Task 5:

Calculate 4^3, 4^10, 4^29, 4^150, 4^1000

1. The multiplication (*) operator can be used for calculating powers of a number.
However, if the power is big, the task will be tedious. For calculating powers of a
number, Python uses ** operator.
2. Type following and obtain the results of above expressions.
a. 4 ** 3
b. 4 ** 10
c. 4 ** 29
d. 4 ** 150
e. 4 ** 1000

In-Lab Task 6:

Write following math expressions. Solve them by hand using operators’ precedence. Calculate their
answers using Python. Match the results.

The table below shows the operator precedence in Python (from Highest to Lowest).

Calculate following expressions:


1. 2+3*6
2. (2+3)*6
3. 48565878 * 578453
4. 2 + 2 (note the spaces after +)
5. (5 - 1) * ((7 + 1) / (3 - 1))
6. 5 +
7. 42 + 5 + * 2

In-Lab Task 7:

Combine numbers and text.

Type the following code. Run it.


In-Lab Task 8:

Take input from the keyboard and use it in your program.

In Python and many other programming languages you can get user input. In Python the input()
function asks for keyboard input from the user. The input function prompts with text if a
parameter is given. The function reads input from the keyboard, returns it as a string, and
removes the trailing newline (Enter). Type and experiment with the script below.
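The script referred to above is not reproduced in this copy; a minimal sketch, where the greet() helper, its prompts, and variable names are illustrative assumptions:

```python
def greet():
    # input() reads one line from the keyboard and returns it as a string,
    # with the trailing newline (Enter) already removed.
    name = input("What is your name? ")
    age = int(input("How old are you? "))  # convert the string to an integer
    print("Hello,", name)
    print("Next year you will be", age + 1)
```

Call greet() and type answers at the two prompts to experiment with it.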

In-Lab Task 9:

Let us take an integer from user as input and check whether the given value is even or not.

1. Create a new Python file from Python Shell and type the following code.
2. Run the code by pressing F5.

You will get the following output.

In-Lab Task 10:

Let us modify the code to take an integer from the user as input and check whether the given
value is even or odd. If the given value is not even, it must be odd, so here we need to use
an if-else statement, as demonstrated below.

1. Create a new Python file from Python Shell and type the following code.
2. Run the code.

You will get the following output.
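The code listing for this task is not reproduced in this copy; one possible sketch (the helper name is an assumption), shown with fixed values so it runs without the keyboard:

```python
def even_or_odd(value):
    # % gives the remainder after division; an even number leaves remainder 0
    if value % 2 == 0:
        return "even"
    else:
        return "odd"

# Fixed examples; use number = int(input("Enter an integer: ")) in the lab.
print(10, "is", even_or_odd(10))
print(7, "is", even_or_odd(7))
```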

In-Lab Task 11:

Calculate the sum of all the values between 0-10 using while loop.

1. Create a new Python file from Python Shell and type the following code.
2. Run the code

You will get the following output.
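The listing for this task is missing from this copy; a minimal sketch of the while-loop sum:

```python
# Sum all values from 0 to 10 with a while loop.
total = 0
n = 0
while n <= 10:
    total = total + n
    n = n + 1
print("The sum of 0-10 is", total)  # prints 55
```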

In-Lab Task 12:

Accept 5 integer values from user and display their sum. Draw flowchart before coding in
python.
1. Create a new Python file from Python Shell and type the following code.
2. Run the code.

You will get the following output.


In-Lab Task 13:

Write a Python code to keep accepting integer values from user until 0 is entered. Display sum
of the given values.
1. Create a new Python file from Python Shell and type the following code.
2. Run the code.

You will get the following output.
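The listing for this task is missing from this copy; a possible sketch. The `read` parameter (an assumption, not part of the task statement) defaults to input() so the same function can also be driven by canned values when testing:

```python
def sum_until_zero(read=input):
    # Keep reading integers until 0 is entered; return their sum.
    total = 0
    while True:
        value = int(read("Enter an integer (0 to stop): "))
        if value == 0:
            break
        total += value
    return total
```

Run it interactively with print("Sum:", sum_until_zero()).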

In-Lab Task 14:

Write a Python code to accept an integer value from user and check that whether the given
value is prime number or not.
Solution:
1. Create a new Python file from Python Shell and type the following code.
2. Run the code.

You will get the following output(s).
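The listing for the prime-check task is missing from this copy; one common sketch using trial division (the helper name is an assumption):

```python
def is_prime(n):
    # A prime is an integer greater than 1 whose only divisors are 1 and itself.
    if n < 2:
        return False
    divisor = 2
    while divisor * divisor <= n:  # testing up to sqrt(n) is enough
        if n % divisor == 0:
            return False
        divisor += 1
    return True

# Fixed examples; use n = int(input("Enter an integer: ")) in the lab.
print(13, "prime?", is_prime(13))
print(15, "prime?", is_prime(15))
```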


Post-Lab Task 1:

Write a Python code to accept marks of a student from 1-100 and display the grade according
to the following formula.

• Grade F if marks are less than 50


• Grade E if marks are between 50 to 60
• Grade D if marks are between 61 to 70
• Grade C if marks are between 71 to 80
• Grade B if marks are between 81 to 90
• Grade A if marks are between 91 to 100

Post-Lab Task 2:

Write a program that takes a number from user and calculate the factorial of that number.

Post-Lab Task 3:

In the Fibonacci series, each number is formed by adding the previous two numbers.
You have to start from 0 and 1.
E.g., 0+1=1 → 1+1=2 → 1+2=3 → 2+3=5 → 3+5=8 → 5+8=13

So, the series becomes:

0 1 1 2 3 5 8 13 21 34 55 …
Steps: Take an input number that specifies how many terms to display. Then use loops
to display the Fibonacci series up to that term. E.g., if the input number equals 6, the
output should be:
0 1 1 2 3 5

NOTE: ATTACH THE SNIPS OF YOUR TASK WITHIN LAB MANUAL


CRITICAL ANALYSIS/CONCLUSION

Lab Assessment

Pre-Lab: /1
In-Lab: /5
Post-Lab: /4
Data Analysis: /4
Data Presentation: /4
Writing Style: /4
Total: /10

Instructor Signature and Comments


LAB# 2: Python Introduction – Functions, Tuples, Lists and Dictionaries

Objectives:

• How to use lists, tuples, sets and dictionaries


• How to use loops with lists
• How to write customized functions

Pre-Lab Reading 1:

As a pre-lab activity, read Chapters 6, 10 and 14 from the book "Introduction to Programming
Using Python - Y. Liang (Pearson, 2013)" to gain insight into Python programming and its
fundamentals.

Pre-Lab Reading 2: Background

Python provides different types of data structures as sequences. A sequence contains
more than one value, and each value has its own index. The first value has index 0 in
Python, the second value has index 1, and so on. These indices are used to access a
particular value in the sequence.

Python offers different types of sequences. Lists are the most important type of sequence
used in Python. A list is a collection of objects of the same or different types. These
objects are separated by commas and enclosed in square brackets. The following activities
show how lists are used in Python. A tuple is a sequence of immutable Python objects. Tuples
are sequences, just like lists. The differences between tuples and lists are that tuples
cannot be changed, unlike lists, and that tuples use parentheses, whereas lists use square
brackets. The data type "set", which is a collection type, has been part of Python since
version 2.4. A set contains an unordered collection of unique and immutable objects. The set
data type is, as the name implies, a Python implementation of sets as they are known from
mathematics. This explains why sets, unlike lists or tuples, cannot have multiple
occurrences of the same element. A dictionary is a collection which is unordered,
changeable and indexed. In Python, dictionaries are written with curly brackets, and they
have keys and values.

Pre-Lab Reading 3: Functions

▪ How to call a function:

function_name(parameters)

Code Example - Using a function

a = multiplybytwo(70)

The computer would effectively see this:

a = 140

▪ How to define a function:

def function_name(parameter_1, parameter_2):
    {this is the code in the function}
    return {value (e.g. text or number) to return to the main program}
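Putting the two pieces together, the multiplybytwo example above could be defined and called like this:

```python
def multiplybytwo(x):
    # Return double the value that was passed in.
    return x * 2

a = multiplybytwo(70)
print(a)  # prints 140
```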

▪ The range() function:

If you need to iterate over a sequence of numbers, the built-in function range() comes in
handy. It generates arithmetic progressions (in Python 3 it returns a range object; wrap it
in list() to see the values):

>>> list(range(10))
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

It is possible to let the range start at another number, or to specify a different increment
(even negative; sometimes this is called the 'step'):

>>> list(range(5, 10))
[5, 6, 7, 8, 9]

>>> list(range(0, 10, 3))
[0, 3, 6, 9]

>>> list(range(-10, -100, -30))
[-10, -40, -70]

Pre-Lab Reading 4: Lists

Lists are what they seem: a list of values. Each one is numbered, starting from zero.
You can remove values from the list and add new values to the end. Example: your many cats'
names. Lists are compound data types, used to group together other values. The most
versatile is the list, which can be written as a list of comma-separated values (items)
between square brackets. List items need not all have the same type.

cats = ['Tom', 'Snappy', 'Kitty', 'Jessie', 'Chester']

print(cats[2])
cats.append('Catherine')

#Remove your 2nd cat, Snappy. Woe is you.
del cats[1]

Compound datatype:
>>> a = ['spam', 'eggs', 100, 1234]

>>> a[1:-1]
['eggs', 100]
>>> a[:2] + ['bacon', 2*2]
['spam', 'eggs', 'bacon', 4]
>>> 3*a[:3] + ['Boo!']
['spam', 'eggs', 100, 'spam', 'eggs', 100, 'spam', 'eggs', 100, 'Boo!']
>>> a= ['spam', 'eggs', 100, 1234]
>>> a[2] = a[2] + 23
>>> a
['spam', 'eggs', 123, 1234]
Replace some items:
... a[0:2] = [1, 12]
>>> a
[1, 12, 123, 1234]

Remove some:
... a[0:2] = []
>>> a
[123, 1234]

Clear the list: replace all items with an empty list:


>>> a[:] = []
>>> a
[]

Length of list:
>>> a = ['a', 'b', 'c', 'd']
>>> len(a)
4

Nest lists:
>>> q = [2, 3]
>>> p = [1, q, 4]
>>> len(p)
3
>>> p[1]
[2, 3]

Functions of lists (you may also explore the online python documentation for more
details):

list.append(x): Add an item to the end of the list; equivalent to a[len(a):] = [x].

list.extend(L): Extend the list by appending all the items in the given list; equivalent to
a[len(a):] = L.

list.insert(i, x): Insert an item at a given position. The first argument is the index of the element
before which to insert, so a.insert(0, x) inserts at the front of the list, and a.insert(len(a), x) is
equivalent to a.append(x).

list.remove(x): Remove the first item from the list whose value is x. It is an error if there is no
such item.

list.pop([i]): Remove the item at the given position in the list, and return it. If no index is
specified, a.pop() removes and returns the last item in the list.

list.count(x): Return the number of times x appears in the list.

list.sort(): Sort the items of the list, in place.

list.reverse(): Reverse the elements of the list, in place.

Pre-Lab Reading 5: Tuples:


Tuples are just like lists, but you can't change their values. Again, each value is numbered starting
from zero, for easy reference. Example: the names of the months of the year.

months = ('January', 'February', 'March', 'April', 'May', 'June', 'July', 'August',
'September', 'October', 'November', 'December')

index Value

0 Jan

1 Feb

2 Mar

3 April

4 May

5 Jun

6 Jul

7 Aug

8 Sep

9 Oct

10 Nov

11 Dec

Sets (note that this example builds a set, not a tuple):

>>> basket = ['apple', 'orange', 'apple', 'pear', 'orange', 'banana']

>>> fruit = set(basket) # create a set without duplicates

>>> fruit

{'orange', 'pear', 'apple', 'banana'}

>>> 'orange' in fruit # fast membership testing

True

>>> 'crabgrass' in fruit

False

(The element order shown may differ on your machine, since sets are unordered.)

Pre-Lab Reading 6: Dictionaries


Dictionaries are similar to what their name suggests - a dictionary. In a dictionary, you have an
'index' of words, and for each of them a definition.

In python, the word is called a 'key', and the definition a 'value'. The values in a dictionary
aren't numbered - they aren't in any specific order, either - the key does the same thing.

You can add, remove, and modify the values in dictionaries. Example: telephone book.

phonebook = {'ali': 8806336, 'omer': 6784346, 'shoaib': 7658344, 'saad': 1122345}

#Add the person 'waqas' to the phonebook:

phonebook['waqas'] = 1234567

#Remove the person 'shoaib' from the phonebook:

del phonebook['shoaib']

phonebook = {'Andrew Parson': 8806336,
             'Emily Everett': 6784346, 'Peter Power': 7658344,
             'Lewis Lame': 1122345}

#Add the person 'Gingerbread Man' to the phonebook:

phonebook['Gingerbread Man'] = 1234567

#Delete the person 'Andrew Parson' from the phonebook:

del phonebook['Andrew Parson']

In-Lab Task 1:

Use loops to accept 5 values from user and store them in a list. Display all the values (objects) of
the list.

You will get the following output.
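The listing for this task is missing from this copy; a possible sketch. The collect_values helper and its `read` parameter are assumptions: `read` defaults to input() so the loop can also be driven by canned values when testing:

```python
def collect_values(count, read=input):
    # Read `count` values from the keyboard and return them in a list.
    values = []
    for i in range(count):
        values.append(read("Enter value " + str(i + 1) + ": "))
    return values
```

Run it interactively with: for value in collect_values(5): print(value)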

In-Lab Task 2:

Repeat the above code by accepting 5 integer values from user. Store these values in a list and
display the sum of given values.
You will get the following output.

In-Lab Task 3:

Accept 5 integer values from user. Store these values in a list and display the list in ascending
order.

You will get the following output.

In-Lab Task 4:

Accept two lists from user and display their join.


You will get the following output.

In-Lab Task 5:

Write a Python code to accept a list from user and find a required element in it.

You will get the following output.


If we run the program again and enter 55 to find in the list then the output will be as below.
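The listing for the search task is missing from this copy; one possible linear-search sketch (the helper name and sample values are assumptions):

```python
def find_in_list(items, target):
    # Linear search: return the index of the first match, or -1 if absent.
    for index in range(len(items)):
        if items[index] == target:
            return index
    return -1

numbers = [12, 7, 55, 3]
print(find_in_list(numbers, 55))  # found at index 2
print(find_in_list(numbers, 99))  # not found: -1
```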

In-Lab Task 6:

Write a function called say_hello that takes in a person’s name as a parameter. The function should
print a greeting message with the person’s name. Then call the function three times with three
different names.

In-Lab Task 7:

A palindrome is a string which is same read forward or backwards.

For example: "dad" is the same in forward or reverse direction. Another example is
"aibohphobia" which literally means, an irritable fear of palindromes.

Write a function in Python that receives a string and returns True if that string is a
palindrome and False otherwise. Remember that the difference between upper- and lower-case
characters is ignored during this determination.

In-Lab Task 8:
Imagine two matrices given in the form of 2D lists as under;
a = [[1, 0, 0], [0, 1, 0], [0, 0, 1] ]
b = [[1, 2, 3], [4, 5, 6], [7, 8, 9] ]
Write a Python code that finds another matrix/2D list that is the product of a and b, i.e.,
C=a*b

In-Lab Task 9:

A closed polygon with N sides can be represented as a list of tuples of N connected coordinates, i.e.,

[ (x1,y1), (x2,y2), (x3,y3), . . . , (xN,yN) ]. A sample polygon with 6 sides (N=6) is shown below.

Write a python function that takes a list of N tuples as input and returns the perimeter of the polygon.
Remember that your code should work for any value of N.

Hint: A perimeter is the sum of all sides of a polygon.

In-Lab Task 10:

Imagine two sets A and B containing numbers. Without using built-in set functionality, write
your own function that receives two such sets and returns another set C which is the
symmetric difference of the two input sets. (A symmetric difference between A and B returns
a set C which contains only those items that appear in exactly one of A or B. Any items that
appear in both sets are not included in C.) Now compare the output of your function with the
following built-in functions/operators:
• A.symmetric_difference(B)
• B.symmetric_difference(A)
• A ^ B
• B ^ A

In-Lab Task 11:

Create a Python program that contains a dictionary of names and phone numbers. Use a tuple of
separate first and last name values for the key field. Initialize the dictionary with at least three
names and numbers. Ask the user to search for a phone number by entering a first and last name.
Display the matching number if found, or a message if not found.

Post-Lab Task 1:

Create two lists based on the user values. Merge both the lists and display in sorted order.

Post-Lab Task 2:

Repeat the above activity to find the smallest and largest element of the list. (Suppose all the
elements are integer values)
Post-Lab Task 3:

The derivative of a function f(x) is a measurement of how quickly the function f changes
with respect to change in its domain x. This measurement can be approximated by the
following relation,

    f'(x) ≈ (f(x + h) - f(x)) / h

where h represents a small increment in x. You have to verify the following relation,

    d/dx sin(x) = cos(x)

Imagine x being a list that goes from -pi to pi with an increment of 0.001. You can
approximate the derivative by using the following approximation,

    (sin(x + h) - sin(x)) / h ≈ cos(x)

In your case, assume h = 0.001. That is, at each point in x, compute the left-hand side of
the above equation and compare whether the output value is equivalent to cos(x). Also print
the corresponding values of d/dx sin(x) and cos(x) for every point. Type ''from math
import *'' at the start of your program to use the predefined value of pi and the sin and
cos functions. What happens if you increase the interval h from 0.001 to 0.01 and then
to 0.1?

Moreover, also plot your results using the matplotlib library (Hint: See online documentation and
examples)

NOTE: ATTACH THE SNIPS OF YOUR TASK WITHIN LAB MANUAL


CRITICAL ANALYSIS/CONCLUSION

Lab Assessment

Pre-Lab: /1
In-Lab: /5
Post-Lab: /4
Data Analysis: /4
Data Presentation: /4
Writing Style: /4
Total: /10

Instructor Signature and Comments

LAB #3: Object Oriented Design & Breadth First Search Algorithm in
Python

Objectives
This lab will introduce students to objected oriented programming and search problems. We
will first start by representing problems in terms of state space graph using the classes. Given a
state space graph, starting state and a goal state, students will then perform a basic Breadth First
Search solution within that graph. The output will be a set of actions or a path that will begin
from initial state/node and end in the goal node.

• How to make a class
• How to represent problems in terms of a state space graph
• How to find a solution to those problems using Breadth First Search

Pre-Lab Reading 1:

The word 'class' can be used when describing the code where the class is defined.

A variable inside a class is known as an attribute.

A function inside a class is known as a method.

• A class is like a
– Prototype
– Blue-print
– An object creator
• A class defines potential objects
– What their structure will be
– What they will be able to do
• Objects are instances of a class
– An object is a container of data: attributes
– An object has associated functions: methods
Syntax:

# Defining a class
class class_name:
    [statement 1]
    [statement 2]
    [statement 3]
    [etc]

Example 1:

class MyClass:
    i = 12345
    def f(self):
        return 'hello world'

x = MyClass()
print(x.i)
print(x.f())

Example 2:

class Complex:
    def __init__(self, realpart, imagpart):
        self.r = realpart
        self.i = imagpart

x = Complex(3.0, -4.5)
print(x.r, " ", x.i)

Example 3:

class Shape:
    description = "This shape has not been described yet"
    author = "Nobody has claimed to make this shape yet"
    def __init__(self, x, y):
        self.x = x
        self.y = y
    def area(self):
        return self.x * self.y
    def perimeter(self):
        return 2 * self.x + 2 * self.y
    def describe(self, text):
        self.description = text
    def authorName(self, text):
        self.author = text
    def scaleSize(self, scale):
        self.x = self.x * scale
        self.y = self.y * scale

a = Shape(3, 4)
print(a.area())

Inheritance Example:

class Square(Shape):
    def __init__(self, x):
        self.x = x
        self.y = x

class DoubleSquare(Square):
    def __init__(self, y):
        self.x = 2 * y
        self.y = y
    def perimeter(self):
        return 2 * self.x + 3 * self.y

Pre-Lab Reading Task 2:

Create a class named basic_calc with the following attributes and methods: two integers
(values are passed at instance creation) and methods such as addition, subtraction,
division, and multiplication.

Create another class inherited from basic_calc, named s_calc, which should have the
following additional methods: factorial, x_power_y, log, ln, etc.

class child_class(parent_class):
    def __init__(self, x):
        # this will override the __init__ method from the parent class
        # additional methods can be defined

Pre-Lab Reading Task 3:

As pre-lab activity, read Chapter 3 from the book (Artificial Intelligence, A Modern Approach by
Peter Norvig, 3rd edition) to know the basics of search algorithms.

Pre-Lab Reading Task 4:

Graph Theory

Graph theory, and in particular the graph ADT (abstract data type), is widely explored and
implemented in the fields of Computer Science and Mathematics. Consisting of vertices
(nodes) and the edges (optionally directed/weighted) that connect them, this data structure
can effectively represent and solve many problem domains. One of the most popular areas of
algorithm design within this space is the problem of checking for the existence of a
(shortest) path between two or more vertices in the graph. Properties such as edge weighting
and direction are two factors that the algorithm designer can take into consideration. Here
we will explore two of the simpler available algorithms, Depth-First and Breadth-First
search, to achieve the goals highlighted below:

▪ Find all vertices in a subject vertex's connected component.
▪ Return all available paths between two vertices.

And in the case of BFS, return the shortest path (length measured by number of path edges).

In our illustration, which is a pictorial representation of a graph, the node "a" is
connected with the node "c", but "a" is not connected with "b". The connecting line between
two nodes is called an edge. If the edges between the nodes are undirected, the graph is
called an undirected graph. If an edge is directed from one vertex (node) to another, the
graph is called a directed graph. A directed edge is called an arc.
Python has no built-in data type or class for graphs, but it is easy to implement them in Python.
One data type is ideal for representing graphs in Python, i.e. dictionaries.
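The dictionary the next paragraph refers to is not reproduced in this copy. An illustrative one, consistent with the description ("a" connected to "c", "f" an isolated vertex; the remaining edges are assumptions), might look like this:

```python
# Each key is a node; each value lists the nodes it shares an edge with.
graph = {
    "a": ["c"],
    "b": ["c"],
    "c": ["a", "b", "c", "d"],  # note the loop: "c" is connected to itself
    "d": ["c", "e"],
    "e": ["d"],
    "f": [],                    # an isolated vertex
}
```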

The keys of the dictionary above are the nodes of our graph. The corresponding values are
lists of the nodes that are connected by an edge. There is no simpler and more elegant way
to represent a graph.

An edge can be seen as a 2-tuple with nodes as elements, i.e. ("a","b")

Paths in Graph

We want to find now the shortest path from one node to another node.

▪ Adjacent vertices:
Two vertices are adjacent when they are both incident to a common edge.
▪ Path in an undirected Graph:
A path in an undirected graph is a sequence of vertices P = ( v1, v2, ..., vn ) ∈ V x V x
... x V such that vi is adjacent to v{i+1} for 1 ≤ i < n. Such a path P is called a path of
length n from v1 to vn.
▪ Simple Path:
A path with no repeated vertices is called a simple path.

Degree of a Graph

The degree of a vertex v in a graph is the number of edges connecting it, with loops counted
twice. The degree of a vertex v is denoted deg(v). The maximum degree of a graph G, denoted
by Δ(G), and the minimum degree of a graph, denoted by δ(G), are the maximum and minimum
degree of its vertices.

In the graph on the right side, the maximum degree is 5 at vertex c and the minimum degree
is 0, i.e., the isolated vertex f.
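Assuming a graph stored as a dictionary mapping each vertex to its list of neighbours (with a self-loop listed once in its own neighbour list), the degree can be computed as a short sketch:

```python
def degree(graph, vertex):
    # Each listed neighbour contributes one incident edge;
    # a self-loop appears once in the list but counts twice.
    return len(graph[vertex]) + graph[vertex].count(vertex)

sample = {"a": ["c"], "c": ["a", "c"], "f": []}  # "c" has a loop
print(degree(sample, "c"))  # neighbour "a" plus the loop counted twice = 3
print(degree(sample, "f"))  # isolated vertex: 0
```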
Pre-Lab Reading Task 5:

Sometimes, very different-sounding problems turn out to be similar when you think about
how to solve them. What do Pac-Man, the royal family of Britain, and driving to Orlando have
in common? They all involve route-finding or path-search problems:

How is the current Prince William related to King William III, who endowed the College of
William and Mary in 1693?

What path should a ghost follow to get to Pac-Man as quickly as possible?

What's the best way to drive from Dallas, Texas to Orlando, Florida?

We have to be given some information to answer any of these questions. This information about
each question can be represented in terms of graph. This lab will enable students to represent
problems in terms of graph and then finding a solution in terms of a path using breadth first search
(BFS).

In-Lab Task 1:

Consider a toy problem that can be represented as a following graph. How would you represent this
graph in python?
Solution:

Remember that a node of a tree can be represented by four attributes. We consider a node as
a class having four attributes, namely:
1- State of the node
2- Parent of the node
3- Actions applicable to that node
4- The total path cost from the initial state until that particular node is reached

We can now store instances of this class in a dictionary. This dictionary will represent our
state space graph. As we traverse the graph, we will keep updating the parent and cost of
each node.
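The figure and code listing are not reproduced in this copy. A minimal sketch of such a node class follows, with an illustrative graph whose edges are assumptions (chosen to be consistent with the BFS answers quoted in the next two tasks):

```python
class Node:
    def __init__(self, state, parent, actions, totalCost):
        self.state = state          # 1- state of the node
        self.parent = parent        # 2- parent of the node
        self.actions = actions      # 3- actions applicable (neighbouring states)
        self.totalCost = totalCost  # 4- path cost from the initial state

# State space graph as a dictionary of Node objects (edges are assumed).
graph = {
    "A": Node("A", None, ["B", "C"], 0),
    "B": Node("B", None, ["A", "D", "E"], 0),
    "C": Node("C", None, ["A", "F"], 0),
    "D": Node("D", None, ["B"], 0),
    "E": Node("E", None, ["B"], 0),
    "F": Node("F", None, ["C"], 0),
}
```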

In-Lab Task 2:

For the graph in previous activity, imagine node A as starting node and your goal is to reach F.
Keeping breadth first search in mind, describe a sequence of actions that you must take to reach that
goal state.

Solution:

Remember that in theory class, we discussed the following implementation of breadth first search.
What follows is python implementation of above mentioned algorithm,

Now the function definition is complete which can be called as follows,

There is one additional line, graph[child].parent = currentNode, in the Python code that is missing
from the pseudocode. It updates the parent of each node as we traverse the graph. Afterwards,
the function actionSequence() is called, which returns the series of actions once a goal state is
reached. Given a start state, an end state and a graph, this function iterates through each
parent until the start state is reached.
Calling BFS() will return the following solution,
['A', 'C', 'F']

In-Lab Task 3:

Change the initial state to D and set the goal state to C. What will be the resulting path of the BFS search?

Solution: ['D', 'B', 'A', 'C']

In-Lab Task 4:

Imagine going from Arad to Bucharest in the following map. Implement a BFS to find the
corresponding path.

Post Lab Task 1:

Implement the given Graph in Python.


Write a method to output the degree of each node in the graph. Write a method to find any path
from node 6 to node 1 in the given graph. Then modify it to show all possible paths between node 6
and node 1.

Post-Lab Task 2:

Consider the maze shown below. Each empty tile represents a separate node in the graph. There
is a maximum of four possible actions on any given tile/node: move up, down, left or right.
Using BFS, find out how to get out of the maze from the start position depicted below.

CRITICAL ANALYSIS/CONCLUSION
Lab Assessment

Pre-Lab: /1
In-Lab: /5
Data Analysis: /4
Subtotal: /10
Post-Lab: /4
Data Presentation: /4
Writing Style: /4

Instructor Signature and Comments

Lab 4: Depth First Search for Graph and Tree Traversal


Objectives:

This lab will introduce students to search problems. We will start by representing problems in
terms of a state space graph. Given a state space graph, a starting state and a goal state, students will
then implement a basic Depth First Search solution within that graph. The output will be a set of
actions, or a path, that begins at the initial state/node and ends at the goal node. Students will also
see the working of Uniform Cost Search.

• How to represent problems in terms of state space graph


• How to find a solution of those problems using Depth First Search.
• How to find a solution of those problems using Uniform Cost Search.

Pre-Lab Reading Task 1:

As pre-lab activity, read Chapter 3 from the book (Artificial Intelligence: A Modern Approach by
Russell and Norvig, 3rd edition) to learn the basics of search algorithms.

Pre-Lab Reading Task 2:

In the previous lab we studied breadth first search, the most naïve way of searching
trees. The path returned by breadth first search is not optimal when edge costs differ, whereas
uniform cost search always returns the optimal path. In this lab students will implement a uniform
cost search approach. Students will also be able to apply depth first search to their problems.

In-Lab Task 1:

Consider a toy problem that can be represented as the following graph. How would you represent this
graph in Python?

Remember a node of a tree can be represented by four attributes. We consider node as a class
having four attributes namely,
1. State of the node
2. Parent of the node
3. Actions applicable to that node (In the book/theory class, this attribute referred to the
parent action that generated current node, however in this lab we will use this
attribute to refer to all possible children states that are achievable from current state).
4. The total path cost of that node starting from the initial state until that particular node
is reached
We can now store instances of this class in a dictionary. This dictionary will represent our state space
graph. As we traverse the graph, we will keep updating the parent and cost of each node.

In-Lab Task 2:

For the graph in previous activity, imagine node A as starting node and your goal is to reach F.
Keeping depth first search in mind, describe a sequence of actions that you must take to reach
that goal state.

Solution: Remember that we can implement depth first search simply by using a LIFO approach
instead of the FIFO used in breadth first search. Additionally, we do not keep leaf nodes
(nodes without children) in the explored set.
The function definition is now complete and can be called as follows,
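Since the listing is a figure in the printed manual, here is a minimal LIFO-based sketch. It reuses the same hypothetical Node/graph representation assumed in the BFS lab (the figure's graph is not reproduced), and omits the leaf-deletion optimization described above to keep the sketch short:

```python
class Node:
    def __init__(self, state, parent, actions, totalCost):
        self.state = state
        self.parent = parent
        self.actions = actions
        self.totalCost = totalCost

# Hypothetical stand-in for the graph in the figure.
graph = {
    'A': Node('A', None, ['B', 'C', 'E'], 0),
    'B': Node('B', None, ['A', 'D', 'E'], 0),
    'C': Node('C', None, ['A', 'F'], 0),
    'D': Node('D', None, ['B', 'E'], 0),
    'E': Node('E', None, ['A', 'B', 'D'], 0),
    'F': Node('F', None, ['C'], 0),
}

def actionSequence(graph, goalState):
    """Walk parent pointers back from the goal to recover the path."""
    solution = [goalState]
    parent = graph[goalState].parent
    while parent is not None:
        solution.append(parent)
        parent = graph[parent].parent
    return list(reversed(solution))

def DFS(graph, initialState, goalState):
    frontier = [initialState]          # a plain list used as a LIFO stack
    explored = []
    while frontier:
        currentNode = frontier.pop()   # pop the LAST entry: the key change vs BFS
        explored.append(currentNode)
        for child in graph[currentNode].actions:
            if child not in frontier and child not in explored:
                graph[child].parent = currentNode
                if child == goalState:
                    return actionSequence(graph, goalState)
                frontier.append(child)
    return None
```

Because the stack pops the most recently added node, DFS dives deep along one branch before backtracking; on the assumed graph it will generally explore a different node order than BFS did.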

Notice the two differences in the code between breadth first search and depth first search: first, we
pop out the last entry from the queue, and second, we delete leaf nodes from the graph.

Calling DFS() will return the following solution,

['A', 'E', 'D']

In-Lab Task 3:

Change the initial state to D and set the goal state to C. What will be the resulting path of the DFS search?
What will be the sequence of nodes explored?

Solution: Final path is ['D', 'B', 'A', 'C']

Explored node sequence is D, E, A.

In-Lab Task 4:

Imagine the same tree but this time we also mention the cost of each edge.
Implement a uniform cost solution to find the path from C to B.

Solution:

First we modify our graph structure so that Node.actions is an array of tuples, where each tuple
contains a vertex and its associated weight.

We also modify the frontier, which will now be a dictionary: each node's state acts as a key, and its
parent and accumulated cost from the initial state are the two attributes of that key. We then define a
function that returns the node/key with the minimum cost. This serves as the pop method of a
priority queue.

The rest of the functions are the same as in BFS i.e.,


Finally we define UCS()
solution = UCS()
print(solution)
The above two lines will print

['C', 'F', 'D', 'B']
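The UCS listings also appear as figures; the following consolidated sketch shows the pieces just described in one place. The edge weights are an assumption (the figure's weights are not reproduced), chosen so that the sketch reproduces the stated path:

```python
class Node:
    def __init__(self, state, parent, actions, totalCost):
        self.state = state
        self.parent = parent
        self.actions = actions      # list of (neighbour, edge-weight) tuples
        self.totalCost = totalCost

# Hypothetical weighted graph (assumed weights, not the figure's).
graph = {
    'A': Node('A', None, [('B', 3), ('C', 5), ('E', 1)], 0),
    'B': Node('B', None, [('A', 3), ('D', 1)], 0),
    'C': Node('C', None, [('A', 5), ('F', 1)], 0),
    'D': Node('D', None, [('B', 1), ('E', 2), ('F', 1)], 0),
    'E': Node('E', None, [('A', 1), ('D', 2)], 0),
    'F': Node('F', None, [('C', 1), ('D', 1)], 0),
}

def findMin(frontier):
    """Pop-equivalent of a priority queue: the key with minimum cost."""
    return min(frontier, key=lambda state: frontier[state][1])

def actionSequence(graph, goalState):
    solution = [goalState]
    parent = graph[goalState].parent
    while parent is not None:
        solution.append(parent)
        parent = graph[parent].parent
    return list(reversed(solution))

def UCS(graph, initialState, goalState):
    frontier = {initialState: (None, 0)}   # state -> (parent, accumulated cost)
    explored = {}
    while frontier:
        currentNode = findMin(frontier)
        currentCost = frontier[currentNode][1]
        graph[currentNode].parent = frontier[currentNode][0]
        del frontier[currentNode]
        if currentNode == goalState:
            return actionSequence(graph, goalState)
        explored[currentNode] = currentCost
        for child, weight in graph[currentNode].actions:
            newCost = currentCost + weight
            if child in explored and explored[child] <= newCost:
                continue                    # a cheaper route was already expanded
            if child not in frontier or frontier[child][1] > newCost:
                frontier[child] = (currentNode, newCost)
    return None

print(UCS(graph, 'C', 'B'))   # ['C', 'F', 'D', 'B']
```

Note that a node is only finalized when it is popped with the minimum accumulated cost, which is what makes the returned path optimal.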

In-Lab Task 5:

Imagine going from Arad to Bucharest in the following map. Your goal is to minimize the distance
mentioned in the map during your travel. Implement a uniform cost search to find the
corresponding path.

Post-Lab Task 1:

Create a graph and then set initial and goal states such that the number of nodes visited for BFS is
smaller than that in DFS. Now modify the initial and goal state such that the number of nodes
visited for BFS is larger than that in DFS.
CRITICAL ANALYSIS/CONCLUSION

Lab Assessment

Pre-Lab: /1
In-Lab: /5
Data Analysis: /4
Subtotal: /10
Post-Lab: /4
Data Presentation: /4
Writing Style: /4

Instructor Signature and Comments

Lab 5: Greedy Best First Search / A* Search for Graphs


Objective:

This lab will introduce students to heuristic and local search methods. In particular, we will implement
a variant of heuristic search known as A* search.
• How to represent problems in terms of state space graph
• How to find a solution of those problems using A* search.

Pre-Lab Reading 1:

As pre-lab activity, read Chapter 3 from the book (Artificial Intelligence: A Modern Approach by
Russell and Norvig, 3rd edition) to learn the basics of search algorithms.

Pre-Lab Reading 2: Useful concepts

So far we have studied uninformed search strategies. On some occasions, however, we are given
extra information about a particular goal in the form of heuristics, which we can use in our search
strategy. A particular form of this strategy, known as A* search, uses the total cost of a
solution via a particular node as that node's evaluation criterion. We will see its implementation in
detail. The A* search algorithm is one of the best-known and most popular techniques used in
path-finding and graph traversal.

Why A* Search Algorithm?

Informally speaking, A* search, unlike other traversal techniques, has "brains": it is a smart
algorithm that sets it apart from conventional alternatives. This is explained in detail in the
sections below. It is also worth mentioning that many games and web-based maps use this algorithm
to find the shortest path very efficiently (by approximation).

Explanation

Consider a square grid containing many obstacles, with a given starting cell and a target cell. We
want to reach the target cell (if possible) from the starting cell as quickly as possible. Here the A*
search algorithm comes to the rescue. At each step, A* picks the node/cell with the lowest value of a
parameter 'f', which is the sum of two other parameters, 'g' and 'h', and processes that node/cell. We
define 'g' and 'h' as simply as possible:

g = the movement cost to move from the starting point to a given square on the grid, following the
path generated to get there.
h = the estimated movement cost to move from that given square to the final destination. This is
often referred to as the heuristic, which is nothing but a kind of smart guess. We really don't know
the actual distance until we find the path, because all sorts of things can be in the way (walls,
water, etc.). There can be many ways to calculate 'h'; these are discussed in the later sections.

Algorithm
We create two lists, an Open List and a Closed List (just like in Dijkstra's algorithm):

// A* Search Algorithm
1. Initialize the open list.
2. Initialize the closed list. Put the starting node on the open list
   (you can leave its f at zero).
3. While the open list is not empty:
   a) Find the node with the least f on the open list; call it "q".
   b) Pop q off the open list.
   c) Generate q's 8 successors and set their parents to q.
   d) For each successor:
      i)   If the successor is the goal, stop the search.
      ii)  Else, compute both g and h for the successor:
              successor.g = q.g + distance between successor and q
              successor.h = distance from the goal to the successor
              (this can be done in many ways; we will discuss three
              heuristics: Manhattan, Diagonal and Euclidean)
              successor.f = successor.g + successor.h
      iii) If a node with the same position as the successor is in the
           OPEN list with a lower f than the successor, skip this successor.
      iv)  If a node with the same position as the successor is in the
           CLOSED list with a lower f than the successor, skip this
           successor; otherwise, add the node to the open list.
      end (for loop)
   e) Push q on the closed list.
end (while loop)

Suppose, as in the figure below, we want to reach the target cell from the source cell; the A*
search algorithm would then follow the path shown. Note that the figure was made using Euclidean
distance as the heuristic.

Pre-Lab Reading 3: Heuristics

There are generally three approximation heuristics to calculate h –


1) Manhattan Distance –
• It is nothing but the sum of the absolute differences between the goal's and the
current cell's x and y coordinates, i.e.,
h = abs (current_cell.x – goal.x) + abs (current_cell.y – goal.y)
• When to use this heuristic? – When we are allowed to move in four directions only
(right, left, top, bottom).
The Manhattan Distance heuristic is shown in the figure below (assume the red spot is the source
cell and the green spot the target cell).

2) Diagonal Distance –
• It generalizes the maximum of the absolute differences between the goal's and the
current cell's x and y coordinates, i.e.,
dx = abs(current_cell.x – goal.x)
dy = abs(current_cell.y – goal.y)
h = D * (dx + dy) + (D2 - 2 * D) * min(dx, dy)
where D is the length of each node (usually = 1) and D2 is the diagonal distance between nodes
(usually = sqrt(2)); with D = D2 = 1 this reduces to max(dx, dy).
• When to use this heuristic? – When we are allowed to move in eight directions only
(similar to the move of a King in chess).
The Diagonal Distance heuristic is shown in the figure below (assume the red spot is the source
cell and the green spot the target cell).

3) Euclidean Distance –
• As is clear from its name, it is nothing but the distance between the current cell and
the goal cell, computed using the distance formula:
h = sqrt((current_cell.x – goal.x)^2 + (current_cell.y – goal.y)^2)
• When to use this heuristic? – When we are allowed to move in any direction.
The Euclidean Distance heuristic is shown in the figure below (assume the red spot is the source
cell and the green spot the target cell).
Relation (Similarity and Differences) with other algorithms- Dijkstra is a special case of A*
Search Algorithm, where h = 0 for all nodes.
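The three formulas above can be written directly in Python. This is an illustrative sketch: cells are assumed to be (x, y) tuples, and the function names are ours, not from the manual's figures:

```python
import math

def manhattan(cell, goal):
    """Sum of absolute coordinate differences (4-direction movement)."""
    return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

def diagonal(cell, goal, D=1, D2=math.sqrt(2)):
    """Octile/diagonal distance (8-direction movement).
    With D = D2 = 1 this reduces to max(dx, dy)."""
    dx = abs(cell[0] - goal[0])
    dy = abs(cell[1] - goal[1])
    return D * (dx + dy) + (D2 - 2 * D) * min(dx, dy)

def euclidean(cell, goal):
    """Straight-line distance (movement in any direction)."""
    return math.hypot(cell[0] - goal[0], cell[1] - goal[1])

print(manhattan((0, 0), (3, 4)))   # 7
print(euclidean((0, 0), (3, 4)))   # 5.0
```

For a given movement model, the matching heuristic stays admissible (it never overestimates), which is what A* needs to guarantee an optimal path.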

In-Lab Task 1:
Consider a maze as shown below. Each empty tile represents a separate node in the graph, while the
walls are represented by blue tiles. Your starting node is A and the goal is to reach Y. Implement an
A* search to find the resulting path.

Solution:
Since A* search needs a heuristic function that specifies the distance of each node from the goal, you
can use the Euclidean distance between a particular node and the goal node as that node's
heuristic value.
But first we need to modify the structure of Node class that will also include the heuristic distance of
that node.

Next we modify the structure of the dictionary that stores our graph: at each node, we also specify
its coordinate location. The implementation then follows the uniform cost implementation, except
that the total cost now includes both the cost of reaching that node and that node's estimated
distance to the goal.
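The maze figure is not reproduced here; as a sketch of the same idea, the following implements A* on a small assumed grid (0 = free tile, 1 = wall), with Euclidean distance as h and unit step costs. The grid, function name and layout are illustrative assumptions:

```python
import heapq
import itertools
import math

def astar(grid, start, goal):
    """A* on a 4-connected grid; grid cells with 1 are walls.
    f = g (steps taken so far) + h (Euclidean distance to goal)."""
    rows, cols = len(grid), len(grid[0])
    h = lambda cell: math.hypot(cell[0] - goal[0], cell[1] - goal[1])
    tie = itertools.count()                 # breaks ties in the heap deterministically
    openList = [(h(start), next(tie), start, 0, None)]
    parents, bestG = {}, {start: 0}
    while openList:
        f, _, cell, g, parent = heapq.heappop(openList)
        if cell in parents:                 # already expanded via a cheaper entry
            continue
        parents[cell] = parent
        if cell == goal:                    # reconstruct the path from parent links
            path = []
            while cell is not None:
                path.append(cell)
                cell = parents[cell]
            return path[::-1]
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (cell[0] + dr, cell[1] + dc)
            if 0 <= nxt[0] < rows and 0 <= nxt[1] < cols and grid[nxt[0]][nxt[1]] == 0:
                ng = g + 1
                if nxt not in bestG or ng < bestG[nxt]:
                    bestG[nxt] = ng
                    heapq.heappush(openList, (ng + h(nxt), next(tie), nxt, ng, cell))
    return None

# A tiny assumed maze: the only way through row 1 is the gap at column 2.
maze = [[0, 0, 0, 0],
        [1, 1, 0, 1],
        [0, 0, 0, 0]]
path = astar(maze, (0, 0), (2, 0))
```

Because Euclidean distance never overestimates the true remaining cost on a unit-cost grid, the returned path is optimal (here, 7 cells: around the wall through the gap).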
In-Lab Task 2:

Your goal is to navigate a robot out of a maze. The robot starts in


the corner of the maze marked with red color. You can turn the
robot to face north, east, south, or west. You can direct the robot
to move forward a certain distance, although it will stop before
hitting a wall. The goal is to reach the final state marked with
green color.
Write a program that implements A* algorithms to solve this maze.
Write the path followed (in the form of coordinates) and the cost of
the path.
CRITICAL ANALYSIS/CONCLUSION

Lab Assessment

Pre-Lab: /1
In-Lab: /5
Data Analysis: /4
Subtotal: /10
Post-Lab: /4
Data Presentation: /4
Writing Style: /4

Instructor Signature and Comments

Lab 6: Hill Climb Search Algorithm Implementation

Objective:
This lab will introduce students to search problems. We will start by representing problems in
terms of a state space graph. Given a state space graph, a starting state and a goal state, students will
then perform a basic hill climbing search within that graph. The output will be a set of actions or a
path that begins at the initial state/node and ends at the goal node. In computer science, hill climbing
is a mathematical optimization technique belonging to the family of local search. It is relatively
simple to implement, making it a popular first choice. Although more advanced algorithms may give
better results, in some situations hill climbing works just as well.

• How to represent problems in terms of state space graph


• How to find a solution of those problems using hill climbing Search.

Pre-Lab Reading 1:
As pre-lab activity, read Chapter 3 from the book (Artificial Intelligence: A Modern Approach by
Russell and Norvig, 3rd edition) to learn the basics of search algorithms.

Pre-Lab Reading 2: Useful concepts

Hill climbing can be used to solve problems that have many solutions, some of which are better than
others. It starts with a random (potentially poor) solution, and iteratively makes small changes to the
solution, each time improving it a little. When the algorithm cannot see any improvement anymore, it
terminates. Ideally, at that point the current solution is close to optimal, but it is not guaranteed that hill
climbing will ever come close to the optimal solution.

For example, hill climbing can be applied to the traveling salesman problem. It is easy to find a
solution that visits all the cities, but it will likely be very poor compared to the optimal solution. The
algorithm starts with such a solution and makes small improvements to it, such as switching the
order in which two cities are visited. Eventually, a much better route is obtained. Hill climbing is
widely used in artificial intelligence for reaching a goal state from a starting node. The choice of next
node and of starting node can be varied to give a list of related algorithms.

Hill climbing is a heuristic search used for mathematical optimization problems in the field of
artificial intelligence. Given a large set of inputs and a good heuristic function, it tries to find a
sufficiently good solution to the problem. This solution may not be the global maximum.

• In the above definition, mathematical optimization problems imply that hill-climbing


solves the problems where we need to maximize or minimize a given real function by
choosing values from the given inputs. Example-Travelling salesman problem where we
need to minimize the distance traveled by the salesman.

• ‘Heuristic search’ means that this search algorithm may not find the optimal solution to
the problem. However, it will give a good solution in a reasonable time.

• A heuristic function is a function that will rank all the possible alternatives at any
branching step in the search algorithm based on the available information. It helps the
algorithm to select the best route out of possible routes.

Features of Hill Climbing

A. Variant of the generate-and-test algorithm: Hill climbing is a variant of the generate-and-test
algorithm, which is as follows:

1. Generate possible solutions.


2. Test to see if this is the expected solution.
3. If the solution has been found quit else go to step 1.

Hence, we call hill climbing a variant of the generate-and-test algorithm, as it takes feedback from
the test procedure, and this feedback is used by the generator in deciding the next move in the
search space.

B. Uses the greedy approach: At any point in state space, the search moves only in the direction
that optimizes the cost function, with the hope of finding the optimal solution at the end.

Types of Hill Climbing

1. Simple Hill climbing: It examines the neighboring nodes one by one and selects the first
neighboring node which optimizes the current cost as the next node.

Algorithm:

Step 1 : Evaluate the initial state. If it is a goal state then stop and return success. Otherwise, make
initial state as current state.

Step 2 : Loop until the solution state is found or there are no new operators present which can be
applied to the current state.

a) Select an operator that has not yet been applied to the current state and apply it to produce a
new state.
b) Evaluate the new state:
i. If it is a goal state, then stop and return success.
ii. If it is better than the current state, then make it the current state and proceed further.
iii. If it is not better than the current state, then continue in the loop until a solution is
found.

Step 3 : Exit.

2. Steepest-Ascent Hill climbing: It first examines all the neighboring nodes and then selects
the node closest to the solution state as the next node.

Algorithm:

Step 1 : Evaluate the initial state. If it is a goal state then stop and return success. Otherwise, make
initial state as current state.
Step 2 : Repeat these steps until a solution is found or the current state does not change:
a) Select an operator that has not yet been applied to the current state.
b) Initialize a new 'best state' equal to the current state and apply the operator to produce a
new state.
c) Evaluate the new state:
i. If it is a goal state, then stop and return success.
ii. If it is better than the best state, make it the best state; else continue the loop with
another new state.
d) Make the best state the current state and go to Step 2 b).

Step 3 : Exit

3. Stochastic hill climbing: It does not examine all the neighboring nodes before deciding
which node to select. It just selects a neighboring node at random and decides (based on the amount
of improvement in that neighbor) whether to move to that neighbor or to examine another.

Algorithm:

Step 1: Evaluate the initial state. If it is a goal state then stop and return success. Otherwise,
make the initial state the current state.
Step 2: Repeat these steps until a solution is found or the current state does not change.
a) Select a state that has not been yet applied to the current state.
b) Apply successor function to the current state and generate all the neighbor states.
c) Among the generated neighbor states which are better than the current state choose a state
randomly (or based on some probability function).
d) If the chosen state is the goal state, then return success, else make it the current state and
repeat step 2: b part.

Step 3: Exit.
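The three variants above share one skeleton: evaluate neighbours, move while something improves, stop otherwise. A minimal sketch of steepest-ascent hill climbing on a toy objective (the objective function and neighbourhood here are illustrative assumptions, not from the lab's graph):

```python
def steepest_ascent(f, start, neighbours):
    """Steepest-ascent hill climbing: repeatedly move to the best
    neighbour until no neighbour improves on the current state."""
    current = start
    while True:
        best = max(neighbours(current), key=f)
        if f(best) <= f(current):      # no improvement: a (local) maximum
            return current
        current = best

# Toy example: maximise f(x) = -(x - 3)^2 over the integers.
f = lambda x: -(x - 3) ** 2
neighbours = lambda x: [x - 1, x + 1]
print(steepest_ascent(f, 10, neighbours))   # 3
```

With a single-peaked objective like this one the climb always reaches the true maximum; on a multi-peaked objective the same loop stops at whichever local maximum is uphill from the start, which is exactly the limitation the in-lab tasks below demonstrate.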

In-Lab Task 1:

For the given graph, imagine node A as the starting node; your goal is to reach Y. Apply hill
climbing and see how close you can get to your destination.
Solution:
Instead of maintaining a fringe or a frontier to save the nodes that are to be explored, hill climbing just
explores the best child of a given node, then explores the best grandchild of a particular child and so on
and so forth.
This will give ['F', 'H', 'I', 'J'] as the solution, which means the algorithm gets
stuck at J and does not go further towards Y, since the distance of G is greater than that of J.
Remember that the path is not important in local search: the solution is your last node (here J,
which happens to be a local maximum).

In-Lab Task 2:

Write a program that implements Hill Climbing algorithms to


solve this maze. Write the path followed (in the form of
coordinates) and the cost of the path.

Post-Lab Task 1:

In the ‘In-Lab Task 1’, it was observed that the basic hill climb algorithm did not provide a
complete solution. Therefore, use the Steepest-Ascent hill climbing algorithm from the Pre-Lab
Reading to implement a method for finding a better solution.

Post-Lab Task 2:

In the ‘In-Lab Task 2’, it was observed that the basic hill climb algorithm did not yield a
complete solution. Therefore, utilize the Stochastic Hill Climbing algorithm from the Pre-Lab
Reading to implement a method for achieving a better solution. Additionally, alter the
initial and goal states to explore alternatives and observe the effect of the introduced randomness.

Note: The instructor can design graded lab activities according to the level of difficulty and
complexity of the solved lab activities. The lab tasks assigned by the instructor should be
evaluated in the same lab.
CRITICAL ANALYSIS/CONCLUSION

Lab Assessment

Pre-Lab: /1
In-Lab: /5
Data Analysis: /4
Subtotal: /10
Post-Lab: /4
Data Presentation: /4
Writing Style: /4

Instructor Signature and Comments


Lab 7: Genetic Algorithms: Chromosome Encoding and Global Maxima
Optimization

Objective:

This lab will introduce students to genetic algorithms. Students will get the opportunity to explore
the details of genetic computation, including crossover, mutation and survivor selection. This
lab will also introduce students to different schemes of chromosome encoding.

• How to encode chromosomes into alphabets


• How to select survivors based upon fitness function
• How to find global maxima using genetic algorithms
Pre-Lab Reading 1:

As pre-lab activity, read Chapter 4 from the book (Artificial Intelligence: A Modern Approach by
Russell and Norvig, 4th edition) to learn the basics of genetic algorithms.

Pre-Lab Reading 2: Useful concepts

In the previous lab we implemented hill climbing and saw that it can get stuck at a local maximum. A
possible way to avoid local maxima is to use genetic algorithms.

Genetic Algorithms (GAs) are adaptive heuristic search algorithms belonging to the larger class of
evolutionary algorithms. Genetic algorithms are based on the ideas of natural selection and genetics.
They are an intelligent exploitation of random search, provided with historical data, to direct the
search into regions of better performance in the solution space. They are commonly used to generate
high-quality solutions for optimization and search problems.

Genetic algorithms simulate the process of natural selection, in which species that can adapt to
changes in their environment survive, reproduce and pass on to the next generation. In simple
words, they simulate "survival of the fittest" among individuals of consecutive generations while
solving a problem. Each generation consists of a population of individuals, and each individual
represents a point in the search space and a possible solution. Each individual is represented as a
string of characters/integers/floats/bits. This string is analogous to a chromosome.

Foundation of Genetic Algorithms


Genetic algorithms are based on an analogy with the genetic structure and behaviour of chromosomes
in a population. The foundation of GAs rests on this analogy:
1. Individuals in a population compete for resources and mates.
2. Those individuals that are successful (fittest) mate to create more offspring than
others.
3. Genes from the "fittest" parents propagate through the generations; sometimes
parents create offspring that are better than either parent.
4. Thus each successive generation becomes better suited to its environment.

Search space
The population of individuals is maintained within the search space. Each individual represents a
solution in the search space for the given problem. Each individual is coded as a finite-length vector
(analogous to a chromosome) of components. These variable components are analogous to genes.
Thus a chromosome (individual) is composed of several genes (variable components).
Fitness Score
A fitness score is given to each individual, indicating its ability to "compete".
Individuals with optimal (or near-optimal) fitness scores are sought.

The GA maintains a population of n individuals (chromosomes/solutions) along with their fitness
scores. Individuals with better fitness scores are given more chances to reproduce than others:
they are selected to mate and produce better offspring by combining the chromosomes of the
parents. Since the population size is static, room has to be created for new arrivals: some
individuals die and are replaced by newcomers, eventually creating a new generation once all
mating opportunities of the old population are exhausted. The hope is that, over successive
generations, better solutions will arrive while the least fit die out.

Each new generation has, on average, more "good genes" than the previous generation; thus each
new generation has better "partial solutions". Once the offspring produced show no significant
difference from the offspring of previous populations, the population has converged, and the
algorithm is said to have converged to a set of solutions for the problem.

Operators of Genetic Algorithms


Once the initial generation is created, the algorithm evolves the generation using the following operators:

1) Selection Operator: The idea is to give preference to individuals with good fitness scores and
allow them to pass their genes to successive generations.
2) Crossover Operator: This represents mating between individuals. Two individuals are selected
using the selection operator, and crossover sites are chosen randomly. The genes at these crossover
sites are then exchanged, creating a completely new individual (offspring). For example:

3) Mutation Operator: The key idea is to insert random genes in the offspring to maintain
diversity in the population and avoid premature convergence. For example:

Algorithm
The whole method can be summarized as:
1) Randomly initialize populations p
2) Determine fitness of population
3) Until convergence repeat:
a) Select parents from population
b) Crossover and generate new population
c) Perform mutation on new population
d) Calculate fitness for new population

In- Lab Task 1:

Consider the maxOne problem, where the goal is to arrange a string of L bits into all ones. At first the
solution may seem trivial, i.e., for L=8 the solution is [1, 1, 1, 1, 1, 1, 1, 1]. Despite this, we shall see
how many iterations an instance of a genetic algorithm takes to find the solution.

Solution:
We start by looking at the GA class which is instantiated by providing individualSize (length of
chromosome or L in our case) and populationSize (how many survivors will we retain in each
generation). The instance variable population is a dictionary of items where each item contains a bit
array of an individual and its heuristic distance from the goal. The variable totalFitness maintains the
total heuristic value of each generation/population.

Whenever we update current population, we also update the heuristic value of each
individual/state/chromosome. This is done using method updatePopulationFitness(). The individual
fitness/heuristic value is calculated as simply the number of ones in an array of an individual. The goal
is to maximize this value generation after generation.

We now focus on how to select parents for reproduction. This is done using roulette wheel
implementation. We first determine the size of the roulette wheel i.e., how many values will it store.
This is simply set as 5 times the size of population, e.g., if we consider that we will retain 10 survivors
at each generation then the size of the wheel is 50. We now have to fill these 50 values based upon the
fitness value of each of those 10 individuals. This is done by calculating the probability of occurrence
(h/sum(h)) of each individual. The variable individualLength determines how many values out of those
50 belong to that specific individual. Finally we generate random locations from roulette wheel
populationSize times and select individuals based upon that. Those individuals now replace the original
population.

We now come to the method generateChildren(), which generates children based upon crossover. A
crossover probability is provided as input. For a population size of 8 and crossover probability of 0.8,
for instance, we need to do crossover with only 80% of the 4 pairs, i.e., we will do crossover with
3 pairs and pass one pair as-is to the next step (mutation). After doing crossover, we also
update the fitness value of each child (although this shouldn't be required, since we already call the
updatePopulationFitness method at the end).
The next step is to mutate the population (not individual child) based upon a certain mutation
probability provided as an input. For example, for individualSize of 5 bits and populationSize of 8 and
a probability of 0.05, we need to swap 5% of those 8*5=40 bits, i.e., round(0.05*40) bits will be
swapped.

We now focus on our main function that will initialize the class. Whenever we find an individual in a
generation which has ideal heuristic value, we terminate the algorithm.
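The GA class described above is shown in the manual as figures. As an independent, compact sketch of the same maxOne algorithm, the following uses roulette-wheel selection (via random.choices), single-point crossover and bit-flip mutation; the function names, the elitism step and all parameter defaults are our assumptions, not the manual's:

```python
import random

def fitness(ind):
    return sum(ind)                          # number of ones in the bit array

def rouletteSelect(population):
    """Roulette-wheel selection; the +1 avoids a zero total weight."""
    weights = [fitness(ind) + 1 for ind in population]
    return random.choices(population, weights=weights, k=len(population))

def crossover(p1, p2):
    point = random.randrange(1, len(p1))     # single-point crossover site
    return p1[:point] + p2[point:], p2[:point] + p1[point:]

def mutate(ind, pMut):
    return [bit ^ 1 if random.random() < pMut else bit for bit in ind]

def maxOneGA(individualSize=8, populationSize=20, pCross=0.8, pMut=0.05,
             maxGenerations=1000):
    population = [[random.randint(0, 1) for _ in range(individualSize)]
                  for _ in range(populationSize)]
    for generation in range(maxGenerations):
        best = max(population, key=fitness)
        if fitness(best) == individualSize:  # ideal heuristic value reached
            return best, generation
        parents = rouletteSelect(population)
        children = [best]                    # elitism: always keep the best individual
        for i in range(0, populationSize - 1, 2):
            if random.random() < pCross:
                c1, c2 = crossover(parents[i], parents[i + 1])
            else:
                c1, c2 = parents[i], parents[i + 1]
            children += [mutate(c1, pMut), mutate(c2, pMut)]
        population = children[:populationSize]
    return max(population, key=fitness), maxGenerations
```

Running maxOneGA() typically converges to the all-ones string within a few dozen generations; the elitism step is an extra safeguard (not described in the manual) that prevents the best individual from being lost to mutation.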
In-Lab Task 2:

Imagine an 8 queen problem, where the goal is to place 8 queens on an 8 X 8 board such that no two
queens are on the same row or column or diagonal. (Before proceeding, kindly refer to lectures). A
sample state is shown below.

Solution:

We start by encoding the state in the form of 8 alphabets. We then define a heuristic
function that evaluates how close a given state is to the goal. For this function, we count
the number of queens that are in the same row or diagonal as a given queen, and then
add up this measure for all 8 queens. What follows is the function definition,
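The function definition appears as a figure in the printed manual. As a sketch of the same heuristic, the following counts attacking pairs, assuming a 0-7 integer encoding (state[i] = row of the queen in column i) rather than the alphabet encoding used in the figures:

```python
def attackingPairs(state):
    """Heuristic for n-queens: number of attacking queen pairs.
    state[i] is the row of the queen in column i, so with one queen
    per column only same-row and same-diagonal attacks are possible."""
    n = len(state)
    pairs = 0
    for i in range(n):
        for j in range(i + 1, n):
            sameRow = state[i] == state[j]
            sameDiagonal = abs(state[i] - state[j]) == j - i
            if sameRow or sameDiagonal:
                pairs += 1
    return pairs

print(attackingPairs([0, 1, 2, 3, 4, 5, 6, 7]))   # 28: all queens on one diagonal
print(attackingPairs([0, 4, 7, 5, 2, 6, 1, 3]))   # 0: a valid 8-queens solution
```

A state with heuristic value 0 is a goal state, so the GA's survivor selection inverts the probabilities to favour low values, exactly as described above.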
We now calculate fitness of whole population and after that initialize the
population.
We now define the survivor selection method. Here we invert the probability of
selection, i.e., the state with lower heuristic value should be given higher
probability of selection.

The children generation is the same as we saw in last activity,


For mutation, we generate a random number between 0-7 and replace a particular
location based upon a mutation probability.

Finally we define our main function,

Post-Lab Task 1:

The traveling salesman problem is an optimization problem where there is a finite number
of cities, and the cost of travel between each city is known. The goal is to find an ordered
set of all the cities for the salesman to visit such that the cost or total distance traveled by
the salesman is minimized.

The following shows some cities in the US and the optimal route for the salesman to
follow to minimize the total distance.
CSC462 Artificial Intelligence

Generate 15 random points or cities in Python. Each point can be represented by a
location tuple (x, y). Assume x ranges from 0-5 and y ranges from 0-5. Select an ordered
set of these 15 points, e.g., {(x4,y4), (x9,y9), ..., 15 points}. This will represent a single
individual/chromosome, which basically tells the salesperson that he/she should visit the
4th city first, then the 9th city, and so on. The heuristic value of this individual will be the
total distance covered by the salesman, i.e., the sum of the distances between consecutive
pairs of those 15 points in that individual. Likewise, generate a population of 45
individuals (possibly by randomly permuting this set). Keep in mind that the salesman
is not allowed to visit any city twice, i.e., the list must not contain any point multiple
times. Use the set() function in Python for this purpose. Observe how the total fitness
value of the population decreases with each generation. Note that in this case we don't
know the minimum heuristic value of the ideal solution. In fact, we don't have an ideal
solution beforehand, i.e., we don't know what the ideal sequence of cities looks like
(unlike other problems where we already know the solution beforehand, e.g., how 4
queens should be placed in order to obtain a heuristic value of 0, or that all bits
should be 1 to obtain the ideal heuristic value).
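A starter sketch for this task (the seed and the name route_length are illustrative; the genetic-algorithm loop itself is left to you):

```python
import math
import random

random.seed(1)

# 15 random cities, each a location tuple (x, y) with x, y in [0, 5]
cities = [(random.uniform(0, 5), random.uniform(0, 5)) for _ in range(15)]

def route_length(route):
    """Heuristic value: total distance over consecutive pairs of cities."""
    return sum(math.dist(route[i], route[i + 1]) for i in range(len(route) - 1))

# a population of 45 individuals, each a random permutation of the 15
# cities, so no individual ever visits a city twice
population = [random.sample(cities, len(cities)) for _ in range(45)]

best = min(population, key=route_length)
print(f"best initial route length: {route_length(best):.2f}")
```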

COMSATS University Islamabad Page 67
CRITICAL ANALYSIS/CONCLUSION

Lab Assessment

Pre-Lab /1
In-Lab /5
Data Analysis /4
(subtotal /10)

Post-Lab:
Data Presentation /4
Writing Style /4

Instructor Signature and Comments


Lab 8: Constraint Satisfaction Problem


Objective:

This lab will introduce students to solving constraint satisfaction problems.

• How to solve constraint satisfaction problems.
Pre-Lab Reading 1:

As pre-lab activity, read Chapter 5 from the book (Artificial Intelligence: A Modern Approach by
Stuart Russell and Peter Norvig, 4th edition) to know the basics of constraint satisfaction problems.

Pre-Lab Reading 2: Useful concepts

Constraint satisfaction is a technique where a problem is solved when its values satisfy certain
constraints or rules of the problem. This technique leads to a deeper understanding of the
problem structure as well as its complexity.

Constraint satisfaction depends on three components, namely:


• X: It is a set of variables.
• D: It is a set of domains where the variables reside. There is a specific domain for each
variable.
• C: It is a set of constraints which are followed by the set of variables.

In constraint satisfaction, domains are the spaces where the variables reside, following the problem
specific constraints. These are the three main elements of a constraint satisfaction technique. The
constraint value consists of a pair of {scope, rel}. The scope is a tuple of variables which participate in
the constraint and rel is a relation which includes a list of values which the variables can take to satisfy
the constraints of the problem.

Solving Constraint Satisfaction Problems


The requirements to solve a constraint satisfaction problem (CSP) are:
• A state-space
• The notion of a solution.
A state in state-space is defined by assigning values to some or all variables such as
{X1=v1, X2=v2, and so on…}.

An assignment of values to a variable can be done in three ways:


• Consistent or Legal Assignment: An assignment which does not violate any constraint or rule
is called Consistent or legal assignment.
• Complete Assignment: An assignment where every variable is assigned with a value, and the
solution to the CSP remains consistent. Such assignment is known as Complete assignment.
• Partial Assignment: An assignment which assigns values to some of the variables only. Such
assignments are called Partial assignments.

Types of Domains in CSP


There are following two types of domains which are used by the variables:

• Discrete Domain: A domain whose values form a countable set. It may be infinite, e.g., a
variable representing a start time could take any integer value.


• Finite Domain: A domain with a finite number of possible values, e.g., the set of colors in
map coloring; most CSP algorithms assume finite domains. (A domain whose values form a
continuum, such as the real numbers, is instead called a continuous domain.)

Constraint Types in CSP


With respect to the variables, there are basically the following types of constraints:
• Unary Constraints: The simplest type of constraint, which restricts the value of a single
variable.
• Binary Constraints: A constraint type which relates two variables, e.g., the value of X1
must differ from the value of X2.
• Global Constraints: A constraint type which involves an arbitrary number of variables.

Some special types of solution algorithms are used to solve the following types of constraints:

• Linear Constraints: These types of constraints are commonly used in linear programming,
where each variable containing an integer value appears in linear form only.
• Non-linear Constraints: These types of constraints are used in non-linear programming,
where each variable (an integer value) appears in a non-linear form.

Note: A special kind of constraint that encodes real-world preferences (rather than hard requirements) is known as a Preference constraint.

Constraint Propagation
In regular state-space search there is only one choice: to search for a solution. But in a CSP, we have
two choices:

• We can search for a solution, or

• We can perform a special type of inference called constraint propagation.

Constraint propagation is a special type of inference which helps in reducing the legal number of
values for the variables. The idea behind constraint propagation is local consistency.

In local consistency, variables are treated as nodes, and each binary constraint is treated as an arc in
the given problem. There are following local consistencies which are discussed below:

• Node Consistency: A single variable is said to be node consistent if all the values in the
variable’s domain satisfy the unary constraints on the variables.
• Arc Consistency: A variable is arc consistent if every value in its domain satisfies the binary
constraints of the variables.
• Path Consistency: A pair of variables is path consistent with a third variable if every consistent
assignment to the pair can be extended to that third variable while satisfying all the binary
constraints along the path. It is a stronger notion than arc consistency.
• k-consistency: This type of consistency is used to define the notion of stronger forms of
propagation. Here, we examine the k-consistency of the variables.

In-Lab Task 1:

Imagine you have a map of Australia that you want to color by state/territory (which we’ll
collectively call “regions”). No two adjacent regions should share a color. Can you color the regions
with only three different colors?

The answer is yes. Try it out on your own (the easiest way is to print out a map of Australia with a
white background). As human beings, we can quickly figure out the solution by inspection and a
little trial and error. It's a trivial problem and a great first problem for our backtracking constraint
satisfaction solver. The problem is illustrated in the figure:


To model the problem as a CSP, we need to define the variables, domains, and constraints. The
variables are the seven regions of Australia (at least the seven that we’ll restrict ourselves to):
Western Australia; Northern Territory; South Australia; Queensland; New South Wales; Victoria;
and Tasmania. In our CSP, they can be modeled with strings. The domain of each variable is the
three different colors that can possibly be assigned (we’ll use red, green, and blue). The constraints
are the tricky part. No two adjacent regions can be colored with the same color, and our constraints
are dependent on which regions border one another. We can use binary constraints (constraints
between two variables). Every two regions that share a border also share a binary constraint
indicating they can’t be assigned the same color.

To implement these binary constraints in code, we need to subclass the Constraint class.
The MapColoringConstraint subclass takes two variables in its constructor (therefore being a binary
constraint): the two regions that share a border. Its overridden satisfied() method checks whether the
two regions both have a domain value (color) assigned to them; if either doesn't, the constraint is
trivially satisfied until they do (there can't be a conflict when one doesn't yet have a color). Then it
checks whether the two regions are assigned the same color (obviously there's a conflict, meaning
the constraint isn't satisfied, when they're the same).
The class is presented here in its entirety. MapColoringConstraint isn’t generic in terms of type hinting,
but it subclasses a parameterized version of the generic class Constraint that indicates both variables
and domains are of type str.

Now that we have a way of implementing the constraints between regions, fleshing out the Australian
map-coloring problem with our CSP solver is a matter of filling in domains and variables, and then
adding constraints.

Finally, backtracking_search() is called to find a solution.
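The whole pipeline described above can be sketched as follows. This is a minimal version, not the manual's exact listing: the Constraint/CSP/MapColoringConstraint names and the backtracking_search() method follow the text, while the border list is the standard adjacency of the seven regions.

```python
from abc import ABC, abstractmethod
from typing import Dict, Generic, List, Optional, TypeVar

V = TypeVar('V')  # variable type
D = TypeVar('D')  # domain type

class Constraint(Generic[V, D], ABC):
    def __init__(self, variables: List[V]) -> None:
        self.variables = variables

    @abstractmethod
    def satisfied(self, assignment: Dict[V, D]) -> bool:
        ...

class CSP(Generic[V, D]):
    def __init__(self, variables: List[V], domains: Dict[V, List[D]]) -> None:
        self.variables = variables
        self.domains = domains
        self.constraints: Dict[V, List[Constraint[V, D]]] = {v: [] for v in variables}

    def add_constraint(self, constraint: Constraint[V, D]) -> None:
        for variable in constraint.variables:
            self.constraints[variable].append(constraint)

    def consistent(self, variable: V, assignment: Dict[V, D]) -> bool:
        return all(c.satisfied(assignment) for c in self.constraints[variable])

    def backtracking_search(self, assignment: Optional[Dict[V, D]] = None) -> Optional[Dict[V, D]]:
        if assignment is None:
            assignment = {}
        if len(assignment) == len(self.variables):   # every variable assigned
            return assignment
        first = [v for v in self.variables if v not in assignment][0]
        for value in self.domains[first]:
            local = dict(assignment)
            local[first] = value
            if self.consistent(first, local):        # only recurse on consistent picks
                result = self.backtracking_search(local)
                if result is not None:
                    return result
        return None                                  # no solution down this branch

class MapColoringConstraint(Constraint[str, str]):
    def __init__(self, place1: str, place2: str) -> None:
        super().__init__([place1, place2])
        self.place1, self.place2 = place1, place2

    def satisfied(self, assignment: Dict[str, str]) -> bool:
        # trivially satisfied until both regions have a color
        if self.place1 not in assignment or self.place2 not in assignment:
            return True
        return assignment[self.place1] != assignment[self.place2]

variables = ["Western Australia", "Northern Territory", "South Australia",
             "Queensland", "New South Wales", "Victoria", "Tasmania"]
domains = {v: ["red", "green", "blue"] for v in variables}
csp = CSP(variables, domains)
borders = [("Western Australia", "Northern Territory"),
           ("Western Australia", "South Australia"),
           ("South Australia", "Northern Territory"),
           ("Queensland", "Northern Territory"),
           ("Queensland", "South Australia"),
           ("Queensland", "New South Wales"),
           ("New South Wales", "South Australia"),
           ("Victoria", "South Australia"),
           ("Victoria", "New South Wales"),
           ("Victoria", "Tasmania")]
for a, b in borders:
    csp.add_constraint(MapColoringConstraint(a, b))
print(csp.backtracking_search())
```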


In-Lab Task 2:

In a four queens problem, a board is a four-by-four grid of squares. A queen is a chess piece that can
move on the chessboard any number of squares along any row, column, or diagonal. A queen is
attacking another piece if, in a single move, it can move to the square the piece is on without jumping
over any other piece (if the other piece is in the line of sight of the queen, then it is attacked by it). The
four queens problem poses the question of how four queens can be placed on a chessboard without any
queen attacking another queen. The problem is illustrated in the figure below. Solve the given problem
by modelling it as a CSP.
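As a compact cross-check of your CSP solution, a permutation-based sketch (not the Constraint-class approach of Task 1) can enumerate candidate placements: each permutation assigns a distinct row to each column, so the row and column constraints hold by construction and only the diagonal constraints need checking.

```python
from itertools import permutations

def solve_n_queens(n=4):
    # rows[i] is the row of the queen in column i; a permutation guarantees
    # distinct rows and columns, so only the diagonals remain to verify
    for rows in permutations(range(n)):
        if all(abs(rows[i] - rows[j]) != j - i
               for j in range(n) for i in range(j)):
            return rows
    return None

print(solve_n_queens(4))  # first valid placement found: (1, 3, 0, 2)
```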

Post-Lab Task 1:

Use instances of In-Lab Task 2 to try other chess board sizes larger than 12x12.


Lab 9: Linear Regression and Gradient descent


Objective

In this lab students will learn the basics of linear regression and the concept of gradient
descent, with its implementation in the Python programming language.

Pre-Lab Reading 1:
Given below are the basic assumptions that a linear regression model makes regarding a dataset
on which it is applied:

Linear relationship: The relationship between the response and feature variables should be linear. The
linearity assumption can be tested using scatter plots. As shown below, the 1st figure represents
linearly related variables, whereas the variables in the 2nd and 3rd figures are most likely non-linear.
So, the 1st figure will give better predictions using linear regression.

Little or no multi-collinearity: It is assumed that there is little or no multicollinearity in the


data. Multicollinearity occurs when the features (or independent variables) are not independent
of each other.

Little or no auto-correlation: Another assumption is that there is little or no autocorrelation in


the data. Autocorrelation occurs when the residual errors are not independent of each other.

Homoscedasticity: Homoscedasticity describes a situation in which the error term (that is, the
“noise” or random disturbance in the relationship between the independent variables and the
dependent variable) is the same across all values of the independent variables. As shown below,
figure 1 has homoscedasticity while figure 2 has heteroscedasticity.


Pre-Lab Reading 2: Applications:

Trend lines: A trend line represents the variation in quantitative data with the passage of time
(like GDP, oil prices, etc.). These trends usually follow a linear relationship. Hence, linear
regression can be applied to predict future values. However, this method suffers from a lack of
scientific validity in cases where other potential changes can affect the data.

Economics: Linear regression is the predominant empirical tool in economics. For example, it is
used to predict consumer spending, fixed investment spending, inventory investment, purchases
of a country’s exports, spending on imports, the demand to hold liquid assets, labor demand, and
labor supply.

Finance: The capital asset pricing model (CAPM) uses linear regression to analyze and quantify the
systematic risk of an investment.

Biology: Linear regression is used to model causal relationships between parameters in


biological systems.

Linear regression is a statistical method for modeling the relationship between a dependent
variable and a given set of independent variables. In this lab, we refer to dependent
variables as responses and independent variables as features for simplicity. In order to
provide a basic understanding of linear regression, we start with the most basic version
of linear regression.
Simple Linear Regression
Simple linear regression is an approach for predicting a response using a single feature.
It is assumed that the two variables are linearly related. Hence, we try to find a linear
function that predicts the response value (y) as accurately as possible as a function of the
feature or independent variable (x). Let us consider a dataset where we have a value of the
response y for every feature x:

For generality, we define:

x as the feature vector, i.e., x = [x1, x2, ..., xn], and y as the response vector, i.e.,
y = [y1, y2, ..., yn], for n observations (in the above example, n = 10).
A scatter plot of the above dataset looks like:

Now, the task is to find a line that fits best in the above scatter plot so that we can predict the
response for any new feature value (i.e., a value of x not present in the dataset).

This line is called a regression line.

The equation of the regression line is represented as:

h(x_i) = b_0 + b_1 * x_i


Here,

h(x_i) represents the predicted response value for ith observation.

b_0 and b_1 are regression coefficients and represent y-intercept and slope of regression line
respectively.

To create our model, we must “learn” or estimate the values of regression coefficients b_0 and
b_1. And once we’ve estimated these coefficients, we can use the model to predict responses!

In this lab, we are going to use the principle of Least Squares.

Now consider:

e_i = y_i - h(x_i)

Here, e_i is the residual error in the ith observation.

So, our aim is to minimize the total residual error.

We define the squared error or cost function, J, as:

J(b_0, b_1) = (1 / 2n) * Σ e_i²   (sum over i = 1, ..., n)

and our task is to find the values of b_0 and b_1 for which J(b_0, b_1) is minimum!

Without going into the mathematical details, we present the result here:

b_1 = SS_xy / SS_xx
b_0 = ȳ - b_1 * x̄


where SS_xy is the sum of cross-deviations of y and x:

SS_xy = Σ (x_i - x̄)(y_i - ȳ) = Σ x_i y_i - n x̄ ȳ

and SS_xx is the sum of squared deviations of x:

SS_xx = Σ (x_i - x̄)² = Σ x_i² - n x̄²

Pre-Lab Reading 2:

Multiple linear regression attempts to model the relationship between two or more features and a
response by fitting a linear equation to the observed data. Clearly, it is nothing but an extension
of simple linear regression. Consider a dataset with p features(or independent variables) and one
response(or dependent variable). Also, the dataset contains n rows/observations. We define:

X (feature matrix) = a matrix of size n x p where x_{ij} denotes the value of the jth feature for the ith
observation.

So,

X = [[x_11, x_12, ..., x_1p],
     [x_21, x_22, ..., x_2p],
     ...,
     [x_n1, x_n2, ..., x_np]]

and

y (response vector) = a vector of size n where y_{i} denotes the value of the response for the ith
observation.


The regression line for p features is represented as:

h(x_i) = b_0 + b_1 * x_i1 + b_2 * x_i2 + ... + b_p * x_ip

where h(x_i) is the predicted response value for the ith observation and b_0, b_1, ..., b_p are the
regression coefficients.

Also, we can write:

y_i = h(x_i) + e_i

where e_i represents the residual error in the ith observation.

We can generalize our linear model a little bit more by representing the feature matrix X with a
leading column of ones (so the intercept b_0 is absorbed into the coefficient vector):

X = [[1, x_11, ..., x_1p],
     [1, x_21, ..., x_2p],
     ...,
     [1, x_n1, ..., x_np]]

So now, the linear model can be expressed in terms of matrices as:

y = Xb + e

where

b = [b_0, b_1, ..., b_p] is the vector of regression coefficients,

and

e = [e_1, e_2, ..., e_n] is the vector of residual errors.

Now, we determine an estimate of b, i.e. b’ using the Least Squares method.

As already explained, the Least Squares method tends to determine b’ for which total residual
error is minimized.

We present the result directly here:

b' = (X'X)^(-1) X'y

where ' represents the transpose of the matrix and ^(-1) represents the matrix inverse.

Knowing the least squares estimates b', the multiple linear regression model can now be
estimated as:

y' = Xb'

where y' is the estimated response vector.

Pre-Lab Reading Task 3:

Gradient Descent

Gradient descent is the name for a generic class of computer algorithms which minimize a
function. These algorithms achieve this end by starting with initial parameter values and
iteratively moving towards a set of parameter values that minimize some cost function or metric;
that's the "descent" part. The movement toward the best fit is achieved by taking the derivative of
the variable or variables involved and stepping in the direction with the lowest (calculus-defined)
gradient; that's the "gradient" part.

Gradient descent is an important concept in computer science, and an illustrative example of why
CS has in many ways overtaken statistics in importance when it comes to machine learning: it is a
general-purpose tool that can be used to "brute force" an optimal solution in a wide range of
scenarios. It lacks the elegance and closed-form solution of a statistical approach, but it also avoids
the sheer mathematical unpalatability that such solutions can involve.


Ordinary linear regression is a good and simple way of demonstrating how gradient descent works.
We start with some error function. We could use any metric we want, but in OLS the obvious one
is the residual sum of squares.

Given a sequence of observed points, y_i, and a sequence of points predicted by our model, ŷ_i, RSS is:

RSS = Σ_i (y_i - ŷ_i)²

Our objective is to minimize this value. Inserting our linear regression model in for
the ŷ_i predictions, and assuming (for the sake of simplicity) that we're doing regression on only one
variable, we get:

RSS = Σ_i (y_i - (m * x_i + b))²

Where b is the intercept and m is the slope of the line of best fit.
Now we need to take the gradient. Since this is an equation of two variables (b and m), the gradient
will consist of two partial derivatives. Hence the gradient is:

∂RSS/∂b = -2 Σ_i (y_i - (m * x_i + b))
∂RSS/∂m = -2 Σ_i x_i (y_i - (m * x_i + b))

To solve, take a step in the negative gradient direction every iteration. Eventually we will have
something that converges.

In-Lab Task 1:

Implement a simple Python version of the above technique from Pre-Lab Reading Task 1 using a
small dataset.

Solution:
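The manual's original listing is not reproduced here; a minimal sketch of the least-squares solution (the ten-point dataset is illustrative):

```python
import numpy as np

def estimate_coefficients(x, y):
    """Least-squares estimates: b_1 = SS_xy / SS_xx, b_0 = mean(y) - b_1 * mean(x)."""
    n = x.size
    mean_x, mean_y = np.mean(x), np.mean(y)
    ss_xy = np.sum(x * y) - n * mean_x * mean_y  # sum of cross-deviations of y and x
    ss_xx = np.sum(x * x) - n * mean_x * mean_x  # sum of squared deviations of x
    b_1 = ss_xy / ss_xx
    b_0 = mean_y - b_1 * mean_x
    return b_0, b_1

x = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
y = np.array([1, 3, 2, 5, 7, 8, 8, 9, 10, 12])
b_0, b_1 = estimate_coefficients(x, y)
print(f"b_0 = {b_0:.4f}, b_1 = {b_1:.4f}")  # b_0 ≈ 1.2364, b_1 ≈ 1.1697
```

The regression line can then be plotted with matplotlib via plt.scatter(x, y) followed by plt.plot(x, b_0 + b_1 * x).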


The output and graph obtained look like this:


In-Lab Task 2:

Implement multiple linear regression on the Boston house pricing dataset using the
Scikit-learn library (Pre-Lab Reading 2).

Solution:
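The manual's original listing is not reproduced here. One caveat: load_boston was removed from scikit-learn in version 1.2 for ethical reasons, so the sketch below uses the bundled diabetes dataset as a stand-in; the multiple-regression workflow is identical, and you can substitute the Boston data if your environment still provides it.

```python
from sklearn import datasets, linear_model, metrics
from sklearn.model_selection import train_test_split

# load a bundled regression dataset (stand-in for Boston house pricing)
X, y = datasets.load_diabetes(return_X_y=True)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.4, random_state=1)

reg = linear_model.LinearRegression()
reg.fit(X_train, y_train)

print("Coefficients:", reg.coef_)
score = metrics.explained_variance_score(y_test, reg.predict(X_test))
print("Explained variance score:", score)
```

A residual error plot can then be produced by scattering the predicted values against the residuals (predictions minus true targets) for both the training and test splits.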


Output:

and Residual Error plot looks like this:


In the above example, we determine the accuracy score using Explained Variance Score.

We define:

explained_variance_score = 1 – Var{y – y’}/Var{y}

where y’ is the estimated target output, y the corresponding (correct) target output, and Var is
Variance, the square of the standard deviation.

The best possible score is 1.0, lower values are worse.
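The definition above can be checked directly against Scikit-learn's implementation (the four-point arrays are illustrative):

```python
import numpy as np
from sklearn.metrics import explained_variance_score

y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5, 0.0, 2.0, 8.0])

# the formula above, computed directly: 1 - Var{y - y'} / Var{y}
manual = 1 - np.var(y_true - y_pred) / np.var(y_true)

print(manual)
print(explained_variance_score(y_true, y_pred))  # matches the manual value
```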

In-Lab Task 2 (continued): Feature Impact Analysis

After building your linear regression model, you decide to investigate the impact of individual
features on the model's predictions. Choose one significant feature from Boston house pricing
dataset and answer the following questions:

• Choose a significant feature (e.g., "RM" for average number of rooms per dwelling).
• Investigate the impact of its coefficient on predictions.
• Evaluate model performance after removing this feature.
• Provide practical insights into the real-world implications of this feature's relationship
with house prices.

This question encourages you to delve into the specifics of a feature's role in the linear regression
model, assess its significance, and consider the practical implications of the relationship between
the chosen feature and the target variable. You may also use different data plotting methods for
visualizing the results i.e., scatter plot, regression line visualization, residual plot, feature
removal impact.

In-Lab Task 3:

Let's implement gradient descent and test it (note that for the implementation we'll actually use
MSE, the mean squared error. MSE is just RSS divided by the number of points, n. We do that because
it leads to "nicer" numbers, as RSS can be a really big number).
Solution:
import numpy as np

class GradientDescentLinearRegression:
    def __init__(self, learning_rate=0.01, iterations=1000):
        self.learning_rate, self.iterations = learning_rate, iterations

    def fit(self, X, y):
        b = 0  # intercept, initialized at 0
        m = 5  # slope, deliberately initialized far from the optimum
        n = X.shape[0]
        for _ in range(self.iterations):
            # partial derivatives of MSE with respect to b and m
            b_gradient = -2 * np.sum(y - (m*X + b)) / n
            m_gradient = -2 * np.sum(X*(y - (m*X + b))) / n
            # step in the negative gradient direction
            b = b - self.learning_rate * b_gradient
            m = m - self.learning_rate * m_gradient
        self.m, self.b = m, b

    def predict(self, X):
        return self.m*X + self.b

That's all we need! OK, let's see how this performs on some example data. We'll generate a cloud
of points that's normally distributed around the line y=x, and see what our algorithm does.

np.random.seed(42)
X = np.array(sorted(list(range(5))*20)) + np.random.normal(size=100,
scale=0.5)
y = np.array(sorted(list(range(5))*20)) + np.random.normal(size=100,
scale=0.25)

clf = GradientDescentLinearRegression()
clf.fit(X, y)

import matplotlib.pyplot as plt


plt.style.use('fivethirtyeight')

plt.scatter(X, y, color='black')
plt.plot(X, clf.predict(X))
plt.gca().set_title("Gradient Descent Linear Regressor")

The output should be:


Our model's solution is very close to the ideal solution of m=1 and b=0. The values obtained by
our implementation can be inspected through clf.m and clf.b.

The biggest advantage gradient descent has is that it requires no knowledge whatsoever of the
fundamentals of the model. We can apply the regressor we've built without knowing anything
about linear regression. In particular, we don't need to know that linear regression has a closed-
form solution, or what that solution looks like, or how to derive it. Instead we just pick a metric,
compute its derivative, and then use a computer to brute-force a solution.


Lab 10: Logistic Regression / Classification in Machine learning

Objective:

In this lab students will learn the basics of logistic regression for binary classification and its
implementation in Python.

Pre-Lab Reading Task 1:

Reference: Learning Predictive Analytics with Python book

Pre-Lab Reading Task 2:

Building A Logistic Regression in Python

Logistic Regression is a Machine Learning classification algorithm that is used to predict the
probability of a categorical dependent variable. In logistic regression, the dependent variable is a
binary variable that contains data coded as 1 (yes, success, etc.) or 0 (no, failure, etc.). In other
words, the logistic regression model predicts P(Y=1) as a function of X.

Logistic Regression Assumptions


• Binary logistic regression requires the dependent variable to be binary.
• For a binary regression, the factor level 1 of the dependent variable should represent
the desired outcome.
• Only the meaningful variables should be included.
• The independent variables should be independent of each other. That is, the model
should have little or no multicollinearity.
• The independent variables are linearly related to the log odds.
• Logistic regression requires quite large sample sizes.

Keeping the above assumptions in mind, let’s look at our dataset.

In-lab Task 1:


Dataset

The dataset comes from the UCI Machine Learning repository, and it is related to direct marketing
campaigns (phone calls) of a Portuguese banking institution. The classification goal is to predict
whether the client will subscribe (1/0) to a term deposit (variable y). The dataset can be downloaded
from the UCI Machine Learning repository (Bank Marketing dataset).

Code:

import pandas as pd
import numpy as np
from sklearn import preprocessing
import matplotlib.pyplot as plt
plt.rc("font", size=14)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
import seaborn as sns
sns.set(style="white")
sns.set(style="whitegrid", color_codes=True)

The dataset provides the bank customers’ information. It includes 41,188 records and 21 fields.


Input variables

1. age (numeric)

2. job : type of job (categorical: “admin”, “blue-collar”, “entrepreneur”, “housemaid”,


“management”, “retired”, “self-employed”, “services”, “student”, “technician”,
“unemployed”, “unknown”)

3. marital : marital status (categorical: “divorced”, “married”, “single”, “unknown”)

4. education (categorical: “basic.4y”, “basic.6y”, “basic.9y”, “high.school”, “illiterate”,


“professional.course”, “university.degree”, “unknown”)

5. default: has credit in default? (categorical: “no”, “yes”, “unknown”)

6. housing: has housing loan? (categorical: “no”, “yes”, “unknown”)

7. loan: has personal loan? (categorical: “no”, “yes”, “unknown”)

8. contact: contact communication type (categorical: “cellular”, “telephone”)

9. month: last contact month of year (categorical: “jan”, “feb”, “mar”, …, “nov”,
“dec”)

10. day_of_week: last contact day of the week (categorical: “mon”, “tue”, “wed”, “thu”,
“fri”)

11. duration: last contact duration, in seconds (numeric). Important note: this attribute
highly affects the output target (e.g., if duration=0 then y='no'). The duration is not
known before a call is performed; also, after the end of the call, y is obviously
known. Thus, this input should only be included for benchmark purposes and should
be discarded if the intention is to have a realistic predictive model.


12. campaign: number of contacts performed during this campaign and for this client
(numeric, includes last contact)

13. pdays: number of days that passed by after the client was last contacted from a
previous campaign (numeric; 999 means client was not previously contacted)

14. previous: number of contacts performed before this campaign and for this client
(numeric)

15. poutcome: outcome of the previous marketing campaign (categorical: “failure”,


“nonexistent”, “success”)

16. emp.var.rate: employment variation rate — (numeric)

17. cons.price.idx: consumer price index — (numeric)

18. cons.conf.idx: consumer confidence index — (numeric)

19. euribor3m: euribor 3 month rate — (numeric)

20. nr.employed: number of employees — (numeric)

Predict variable (desired target):

y — has the client subscribed to a term deposit? (binary: "1" means "Yes", "0" means "No")

The education column of the dataset has many categories, and we need to reduce the categories
for better modelling. The education column has the following categories:


Let us group “basic.4y”, “basic.9y” and “basic.6y” together and call them “basic”.

data['education']=np.where(data['education'] =='basic.9y', 'Basic', data['education'])


data['education']=np.where(data['education'] =='basic.6y', 'Basic', data['education'])
data['education']=np.where(data['education'] =='basic.4y', 'Basic', data['education'])

After grouping, these are the columns:

In-lab task 2: Data Exploration


count_no_sub = len(data[data['y']==0])
count_sub = len(data[data['y']==1])
pct_of_no_sub = count_no_sub/(count_no_sub+count_sub)
print("percentage of no subscription is", pct_of_no_sub*100)
pct_of_sub = count_sub/(count_no_sub+count_sub)
print("percentage of subscription", pct_of_sub*100)

percentage of no subscription is 88.73458288821988

percentage of subscription 11.265417111780131

Our classes are imbalanced, and the ratio of no-subscription to subscription instances is 89:11.

Before we go ahead to balance the classes, let’s do some more exploration.


Observations:

• The average age of customers who bought the term deposit is higher than that of the
customers who didn’t.

• The pdays (days since the customer was last contacted) is understandably lower for
the customers who bought it. The lower the pdays, the better the memory of the last
call and hence the better chances of a sale.

• Surprisingly, campaigns (number of contacts or calls made during the current


campaign) are lower for customers who bought the term deposit.

We can calculate categorical means for other categorical variables such as education and marital

status to get a more detailed sense of our data.


In-Lab Task 3: Visualizations of the dataset

%matplotlib inline
pd.crosstab(data.job,data.y).plot(kind='bar')
plt.title('Purchase Frequency for Job Title')
plt.xlabel('Job')
plt.ylabel('Frequency of Purchase')
plt.savefig('purchase_fre_job')


The frequency of purchase of the deposit depends a great deal on the job title. Thus, the job title
can be a good predictor of the outcome variable.

table=pd.crosstab(data.marital,data.y)
table.div(table.sum(1).astype(float), axis=0).plot(kind='bar', stacked=True)
plt.title('Stacked Bar Chart of Marital Status vs Purchase')
plt.xlabel('Marital Status')
plt.ylabel('Proportion of Customers')
plt.savefig('mariral_vs_pur_stack')


The marital status does not seem a strong predictor for the outcome variable.

table=pd.crosstab(data.education,data.y)
table.div(table.sum(1).astype(float), axis=0).plot(kind='bar', stacked=True)
plt.title('Stacked Bar Chart of Education vs Purchase')
plt.xlabel('Education')
plt.ylabel('Proportion of Customers')
plt.savefig('edu_vs_pur_stack')


Education seems a good predictor of the outcome variable.

pd.crosstab(data.day_of_week,data.y).plot(kind='bar')
plt.title('Purchase Frequency for Day of Week')
plt.xlabel('Day of Week')
plt.ylabel('Frequency of Purchase')
plt.savefig('pur_dayofweek_bar')


Day of week may not be a good predictor of the outcome.

pd.crosstab(data.month,data.y).plot(kind='bar')
plt.title('Purchase Frequency for Month')
plt.xlabel('Month')
plt.ylabel('Frequency of Purchase')
plt.savefig('pur_fre_month_bar')

Month might be a good predictor of the outcome variable.


data.age.hist()
plt.title('Histogram of Age')
plt.xlabel('Age')
plt.ylabel('Frequency')
plt.savefig('hist_age')

Most of the customers of the bank in this dataset are in the age range of 30–40.

pd.crosstab(data.poutcome,data.y).plot(kind='bar')
plt.title('Purchase Frequency for Poutcome')
plt.xlabel('Poutcome')
plt.ylabel('Frequency of Purchase')
plt.savefig('pur_fre_pout_bar')


Poutcome seems to be a good predictor of the outcome variable.

In-lab Task 4: Create dummy variables

That is, indicator variables that take only two values, zero and one.


cat_vars=['job','marital','education','default','housing','loan','contact','month','day_of_week','poutcome']
for var in cat_vars:
    # create one dummy column per category, e.g. marital_married, marital_single
    cat_list = pd.get_dummies(data[var], prefix=var)
    data1 = data.join(cat_list)
    data = data1

data_vars = data.columns.values.tolist()
to_keep = [i for i in data_vars if i not in cat_vars]

Our final data columns will be:


data_final=data[to_keep]
data_final.columns.values
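As a quick illustration of what pd.get_dummies produces, here is a toy frame standing in for one categorical column (the column and values below are illustrative, not from the bank dataset):

```python
import pandas as pd

# toy frame standing in for one categorical column of the dataset
df = pd.DataFrame({'marital': ['single', 'married', 'single']})
dummies = pd.get_dummies(df['marital'], prefix='marital')
print(dummies.columns.tolist())  # ['marital_married', 'marital_single']
```

Each distinct category becomes its own zero/one column, which is exactly what the loop above does for every categorical variable.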


In-lab Task 5: Over-sampling using SMOTE

With our training data created, we'll up-sample the minority class using the SMOTE algorithm (Synthetic Minority Oversampling Technique). At a high level, SMOTE:

1. Works by creating synthetic samples from the minority class instead of creating copies.
2. Randomly chooses one of the k nearest neighbours of a minority sample and uses it to create a similar, but randomly tweaked, new observation.

We are going to implement SMOTE in Python.

from imblearn.over_sampling import SMOTE

X = data_final.loc[:, data_final.columns != 'y']
y = data_final.loc[:, data_final.columns == 'y']

os = SMOTE(random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
columns = X_train.columns

# note: fit_sample was renamed to fit_resample in recent versions of imblearn
os_data_X, os_data_y = os.fit_resample(X_train, y_train)
os_data_X = pd.DataFrame(data=os_data_X, columns=columns)
os_data_y = pd.DataFrame(data=os_data_y, columns=['y'])

# we can check the numbers of our data
print("length of oversampled data is ", len(os_data_X))


print("Number of no subscription in oversampled data", len(os_data_y[os_data_y['y']==0]))
print("Number of subscription", len(os_data_y[os_data_y['y']==1]))
print("Proportion of no subscription data in oversampled data is ",
      len(os_data_y[os_data_y['y']==0])/len(os_data_X))
print("Proportion of subscription data in oversampled data is ",
      len(os_data_y[os_data_y['y']==1])/len(os_data_X))

Now we have perfectly balanced data! Notice that we over-sampled only the training data: because none of the information in the test data is used to create the synthetic observations, no information bleeds from the test data into the model training.
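The interpolation step at the heart of SMOTE can be sketched in a few lines of plain NumPy. This is a toy illustration of the idea, not the imblearn implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def smote_like_sample(minority, k=2):
    """Create one synthetic point by interpolating between a random
    minority sample and one of its k nearest minority neighbours."""
    i = rng.integers(len(minority))
    x = minority[i]
    d = np.linalg.norm(minority - x, axis=1)   # distances to all minority points
    neighbours = np.argsort(d)[1:k + 1]        # skip the point itself
    nb = minority[rng.choice(neighbours)]
    gap = rng.random()                         # random position on the segment
    return x + gap * (nb - x)

minority = np.array([[1.0, 1.0], [1.2, 0.9], [0.9, 1.3]])
synthetic = smote_like_sample(minority)
print(synthetic)  # a new point lying between two existing minority points
```

Because the new point is a convex combination of two real minority samples, it always falls on the line segment between them rather than being a duplicate.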

In-lab Task 6: Recursive Feature Elimination

Recursive Feature Elimination (RFE) is based on the idea of repeatedly constructing a model, choosing the best or worst performing feature, setting that feature aside, and then repeating the process with the remaining features. This process is applied until all features in the dataset are exhausted. The goal of RFE is to select features by recursively considering smaller and smaller sets of features.

from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression

data_final_vars = data_final.columns.values.tolist()
y = ['y']
X = [i for i in data_final_vars if i not in y]

logreg = LogisticRegression()
# note: newer scikit-learn versions require the keyword n_features_to_select
rfe = RFE(logreg, n_features_to_select=20)
rfe = rfe.fit(os_data_X, os_data_y.values.ravel())
print(rfe.support_)
print(rfe.ranking_)
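To see RFE end-to-end on data that is easy to reproduce, here is a small self-contained sketch on a synthetic classification problem (the dataset and the choice of 3 features are illustrative):

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression

# synthetic data: 10 features, of which only 3 are informative
X, y = make_classification(n_samples=300, n_features=10, n_informative=3,
                           n_redundant=0, random_state=0)

rfe = RFE(LogisticRegression(max_iter=1000), n_features_to_select=3)
rfe.fit(X, y)

# rfe.support_ is a boolean mask over the columns; True = feature kept
selected = [i for i, keep in enumerate(rfe.support_) if keep]
print(selected)  # indices of the 3 features RFE kept
```

The same `support_` mask is what we use on the bank dataset to read off the selected column names.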


The RFE has helped us select the following features:

“euribor3m”, “job_blue-collar”, “job_housemaid”, “marital_unknown”, “education_illiterate”,

“default_no”, “default_unknown”, “contact_cellular”, “contact_telephone”, “month_apr”,

“month_aug”, “month_dec”, “month_jul”, “month_jun”, “month_mar”, “month_may”,

“month_nov”, “month_oct”, “poutcome_failure”, “poutcome_success”.

cols = ['euribor3m', 'job_blue-collar', 'job_housemaid', 'marital_unknown', 'education_illiterate',
        'default_no', 'default_unknown', 'contact_cellular', 'contact_telephone', 'month_apr',
        'month_aug', 'month_dec', 'month_jul', 'month_jun', 'month_mar', 'month_may',
        'month_nov', 'month_oct', 'poutcome_failure', 'poutcome_success']
X=os_data_X[cols]
y=os_data_y['y']

In-lab Task 7: Implementing the Model

import statsmodels.api as sm
logit_model=sm.Logit(y,X)
result=logit_model.fit()
print(result.summary2())


The p-values for most of the variables are smaller than 0.05, except for four variables; therefore, we will remove those four.


cols = ['euribor3m', 'job_blue-collar', 'job_housemaid', 'marital_unknown', 'education_illiterate',
        'month_apr', 'month_aug', 'month_dec', 'month_jul', 'month_jun', 'month_mar',
        'month_may', 'month_nov', 'month_oct', 'poutcome_failure', 'poutcome_success']
X = os_data_X[cols]
y = os_data_y['y']

logit_model = sm.Logit(y, X)
result = logit_model.fit()
print(result.summary2())


In-lab Task 8: Logistic Regression Model Fitting


from sklearn.linear_model import LogisticRegression
from sklearn import metrics

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
logreg = LogisticRegression()
logreg.fit(X_train, y_train)


Predicting the test set results and calculating the accuracy


y_pred = logreg.predict(X_test)
print('Accuracy of logistic regression classifier on test set: {:.2f}'.format(
    logreg.score(X_test, y_test)))

Accuracy of logistic regression classifier on test set: 0.74

Confusion Matrix

from sklearn.metrics import confusion_matrix
# use a new name so we don't shadow the imported confusion_matrix function
cm = confusion_matrix(y_test, y_pred)
print(cm)

[[6124 1542]

[2505 5170]]

The result tells us that we have 6124 + 5170 correct predictions and 2505 + 1542 incorrect predictions.
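The reported accuracy of 0.74 can be recovered directly from these four counts:

```python
# counts read off the confusion matrix above
tn, fp, fn, tp = 6124, 1542, 2505, 5170

accuracy = (tn + tp) / (tn + fp + fn + tp)
print(round(accuracy, 2))  # 0.74
```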

Compute precision, recall, F-measure and support

To quote from Scikit Learn:

The precision is the ratio tp / (tp + fp) where tp is the number of true positives and fp the number of false positives. The precision is intuitively the ability of the classifier not to label a sample as positive if it is negative.

The recall is the ratio tp / (tp + fn) where tp is the number of true positives and fn the number of false negatives. The recall is intuitively the ability of the classifier to find all the positive samples.

The F-beta score can be interpreted as a weighted harmonic mean of the precision and recall, where an F-beta score reaches its best value at 1 and worst score at 0.

The F-beta score weights the recall more than the precision by a factor of beta. beta = 1.0 means recall and precision are equally important.

The support is the number of occurrences of each class in y_test.
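Applying these definitions to the positive class of the confusion matrix shown earlier gives a quick sanity check (tp = 5170, fp = 1542, fn = 2505):

```python
tp, fp, fn = 5170, 1542, 2505   # from the confusion matrix above

precision = tp / (tp + fp)      # ability not to label a negative as positive
recall = tp / (tp + fn)         # ability to find all the positive samples
f1 = 2 * precision * recall / (precision + recall)

print(round(precision, 2), round(recall, 2), round(f1, 2))  # 0.77 0.67 0.72
```

These per-class numbers should match the corresponding row of the classification_report below.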


from sklearn.metrics import classification_report
print(classification_report(y_test, y_pred))

Interpretation: of the term deposits promoted to the entire test set, about 74% were term deposits that the customers actually subscribed to.


CRITICAL ANALYSIS/CONCLUSION

Lab Assessment

Pre-Lab: /1
In-Lab: /5
Post-Lab: Data Analysis /4, Data Presentation /4, Writing Style /4
Total: /10

Instructor Signature and Comments


Lab 11: Decision Tree Classification

Objective:
• Understand where decision trees fit into the larger picture of this class and other models
• Understand what decision trees are and why we would care to use them
• Understand how decision trees work
• Feel comfortable running sklearn's implementation of a decision tree
• Understand the concepts of bagging and random forests

Pre Lab Reading Task 1:

Machine learning is a type of artificial intelligence (AI) that provides computers with the ability to learn without being explicitly programmed. Machine learning focuses on the development of computer programs that can change when exposed to new data. In this lab, we'll see the basics of machine learning and the implementation of a simple machine learning algorithm using Python.

The Python community has developed many modules to help programmers implement machine learning. In this lab, we will be using the numpy, scipy and scikit-learn modules. We can install them from the command line:

pip install numpy scipy scikit-learn

Pre Lab Reading Task 2: Introduction to Decision Trees

Decision trees are a type of supervised machine learning algorithm used for classification and regression tasks. They make decisions by asking a series of questions and branching based on the answers. Example: think of a decision tree as a flowchart. For instance, in a weather prediction scenario, the first node might ask, "Is the humidity high?", with branches leading to "Yes" or "No". For further details see the scikit-learn documentation: https://scikit-learn.org/stable/modules/tree.html
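The flowchart idea can be made concrete with a minimal, runnable sketch using scikit-learn's DecisionTreeClassifier on the built-in iris dataset (the depth of 3 is an arbitrary illustrative choice):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=0)

# a shallow tree: each internal node asks one yes/no question about a feature
clf = DecisionTreeClassifier(max_depth=3, random_state=0)
clf.fit(X_train, y_train)
print(round(clf.score(X_test, y_test), 2))  # iris is easy; accuracy is high
```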

Decision Trees background


Let's do a high-level recap of what we've learned in this course so far:


Say we have input data X = (X1, X2, ..., Xn) and corresponding class labels Y = (Y1, Y2, ..., Yn), where n represents the number of observations/instances (i.e., unique samples). Much of statistical learning concerns trying to model the relationship between our data's X and Y. In particular, we assert that the Y's were produced/generated by some underlying function f(X), and that there is inevitably some noise and systematic, implicit bias and error ε that cannot be captured by any f(X). Thus, we have:

Y = f(X) + ε

Statistical learning concerns either prediction or inference:

Prediction: concerns trying to learn a function f̂(X) that is as close as possible to the true function f(X). This allows us to estimate Y values for any new input data X.

Inference: concerns trying to understand/model the relationship between X and Y, effectively learning how the data was generated.

Independent of this, if you have access to gold-truth labels Y, and you make use of them for your modelling, then you are working on a supervised learning task. If you do not have or do not make use of Y values, and you are only concerned with the input data X, you are working on an unsupervised learning task.
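The Y = f(X) + ε setup can be made concrete with a tiny simulation: we pick a true f, add noise ε, and learn an estimate f̂ by least squares (the linear f and the noise level here are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

X = rng.uniform(0, 10, size=200)
f = lambda x: 2.0 * x + 1.0             # the (normally unknown) true function
Y = f(X) + rng.normal(0, 1, size=200)   # epsilon: irreducible noise

# prediction: learn f_hat from (X, Y), here by least-squares line fitting
slope, intercept = np.polyfit(X, Y, 1)
print(slope, intercept)  # close to the true slope 2.0 and intercept 1.0
```

Even with the noise, the fitted coefficients land close to the true ones; the residual scatter around the fitted line is the ε that no choice of f̂ can remove.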

In Lab Task 1:
Load the required libraries
# imports
%matplotlib inline
import numpy as np
import scipy as sp
import pandas as pd
import matplotlib as mpl
import matplotlib.cm as cm
import matplotlib.pyplot as plt
# note: seaborn.apionly was removed in recent seaborn versions; import seaborn directly
import seaborn as sns

from sklearn.model_selection import train_test_split, cross_val_score
from sklearn import tree
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.utils import resample

pd.set_option('display.width', 500)
pd.set_option('display.max_columns', 100)
pd.set_option('display.notebook_repr_html', True)

To build a decision tree:


• Start with an empty tree and some data X
• Decide what your splitting criterion will be (e.g., Gini, entropy, etc.)
• Decide what your stopping criterion will be, or whether you'll develop a large tree and prune it (pruning is covered in Lecture 15, slides 41-54)
• Build the tree in a greedy manner, and if you have multiple hyperparameters, use cross-validation to determine the best values
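The last step, picking hyperparameters by cross-validation, can be sketched like this: we try a few candidate max_depth values on the built-in iris dataset and keep the one with the best mean CV score (the candidate list is an illustrative choice):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()

best_depth, best_score = None, -1.0
for depth in [1, 2, 3, 4, 5]:
    clf = DecisionTreeClassifier(max_depth=depth, random_state=0)
    score = cross_val_score(clf, iris.data, iris.target, cv=5).mean()
    if score > best_score:
        best_depth, best_score = depth, score

print(best_depth, round(best_score, 2))  # the depth with the best mean CV accuracy
```

The same loop structure extends to any other hyperparameter (splitting criterion, minimum leaf size, and so on), or you can let GridSearchCV do the bookkeeping.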


Lab 12: Convolutional Neural Networks
