CS3491 AI & ML Lab Manual
LABORATORY MANUAL
FOR
REGULATION 2021
VISION
To create globally competent software professionals with social values to cater to the ever-
changing industry requirements.
MISSION
M1 To provide appropriate infrastructure to impart need-based technical education
through effective teaching and research
M2 To involve the students in collaborative projects on emerging technologies to fulfill
the industrial requirements
M3 To render value-based education to students to make better engineering decisions
with social consciousness and to meet global standards
M4 To inculcate leadership skills in students and encourage them to become globally
competent professionals
Programme Educational Objectives (PEOs)
The graduates of Computer Science and Engineering will be able to
PEO1 Pursue Higher Education and Research or have a successful career in industries
associated with Computer Science and Engineering, or as Entrepreneurs
PEO2 Ensure that graduates will have the ability and attitude to adapt to emerging
technological changes
PEO3 Acquire leadership skills to perform professional activities with social
consciousness
Programme Specific Outcome (PSOs)
The graduates will be able to
PSO1 The students will be able to analyze large volumes of data and make business
decisions to improve efficiency with different algorithms and tools
PSO2 The students will have the capacity to develop web and mobile applications for real-
time scenarios
PSO3 The students will be able to provide automation and smart solutions in various
forms to the society with Internet of Things
Course Code & Name: CS3491 & ARTIFICIAL INTELLIGENCE & MACHINE
LEARNING LABORATORY
REGULATION : R2021
YEAR/SEM : II/IV
COURSE OUTCOMES
CO1
CO2
CO3
CO4
CO5
CORRELATION LEVELS
Substantial/ High 3
Moderate/ Medium 2
Slight/ Low 1
LIST OF EXPERIMENTS
1. Implementation of Uninformed search algorithms (BFS, DFS)
2. Implementation of Informed search algorithms (A*, memory-bounded A*)
3. Implement Naïve Bayes models
4. Implement Bayesian Networks
5. Build Regression models
6. Build Decision Trees and Random Forests
7. Build SVM models
8. Implement ensembling techniques
9. Implement clustering algorithms
10. Implement EM for Bayesian Networks
11. Build simple Neural Network models
12. Build deep learning Neural Network models
EXPT NO: 1.i. Implementation of Uninformed search algorithms (i) Breadth-First Search (BFS)
DATE:
Aim
To implement Breadth-First Search (BFS) traversal algorithm on an undirected graph
using Python.
Algorithm
Step 1: Start by putting any one of the graph’s vertices at the back of the queue.
Step 2: Now take the front item of the queue and add it to the visited list.
Step 3: Create a list of that vertex's adjacent nodes. Add those which are not within the
visited list to the rear of the queue.
Step 4: Repeat Steps 2 and 3 until the queue is empty.
PROGRAM
graph = {
    '5' : ['3','7'],
    '3' : ['2', '4'],
    '7' : ['8'],
    '2' : [],
    '4' : ['8'],
    '8' : []
}
visited = []   # List for visited nodes
queue = []     # Initialize a queue

def bfs(visited, graph, node):   # function for BFS
    visited.append(node)
    queue.append(node)
    while queue:                 # loop to visit each node
        m = queue.pop(0)
        print(m, end=" ")
        for neighbour in graph[m]:
            if neighbour not in visited:
                visited.append(neighbour)
                queue.append(neighbour)

# Driver Code
print("Following is the Breadth-First Search")
bfs(visited, graph, '5')   # function calling
EXPECTED OUTPUT
Following is the Breadth-First Search
5 3 7 2 4 8
Result :
Thus the program to implement Breadth-First Search traversal was executed successfully.
EXPT NO: 1.ii. Implementation of Uninformed search algorithms (ii) Depth-First Search (DFS)
DATE:
Aim
To implement Depth-First Search (DFS) traversal algorithm on an undirected graph
using Python.
Algorithm:
Step 1: Create a set to keep track of visited nodes.
Step 2: Create a function dfs(visited, graph, node) to implement DFS.
Step 3: If the current node is not visited, print the node and mark it as visited.
Step 4: For each neighbour of the current node, if the neighbour is not visited,
recursively call the dfs() function on the neighbour.
Step 5: In the main program, call the dfs() function on a starting node to start DFS
traversal of the graph.
Program
# Using a Python dictionary to act as an adjacency list
graph = {
    '5' : ['3','7'],
    '3' : ['2', '4'],
    '7' : ['8'],
    '2' : [],
    '4' : ['8'],
    '8' : []
}
visited = set()   # Set to keep track of visited nodes of the graph

def dfs(visited, graph, node):   # function for dfs
    if node not in visited:
        print(node)
        visited.add(node)
        for neighbour in graph[node]:
            dfs(visited, graph, neighbour)

# Driver Code
print("Following is the Depth-First Search")
dfs(visited, graph, '5')
EXPECTED OUTPUT
Following is the Depth-First Search
5
3
2
4
8
7
Result :
Thus the program to implement Depth-First Search traversal was executed successfully.
EXPT NO: 2. Implementation of Informed search algorithms (A*, memory-bounded A*)
DATE:
Aim:
To write a program to implement informed search algorithms (A*, memory-bounded A*)
using Python.
Algorithm:
Step 1: Initialize the open set with the start node and the closed set as empty.
Step 2: Initialize g(start_node) to 0 and parents(start_node) to start_node.
Step 3: While open set is not empty, do the following:
a. Select the node with the lowest f() value from the open set.
b. If this node is the stop node or it has no neighbors, exit the loop.
c. For each neighbor of the current node:
i. If it is not in open set or closed set, add it to the open set, set its parent to the
current node, and calculate its g value.
ii. If it is in the open set and its g value is greater than the current node's g value
plus the weight of the edge between them, update its g value and parent node.
iii. If it is in the closed set and its g value is greater than the current node's g value
plus the weight of the edge between them, remove it from the closed set and add it
to the open set with the updated g value and parent node.
d. If no node is selected, there is no path between start and stop nodes.
e. If the stop node is selected, construct the path from start to stop using the
parent nodes and return the path.
f. Remove the selected node from the open set and add it to the closed set.
Step 4: If the open set becomes empty, there is no path between start and stop nodes.
Program
def aStarAlgo(start_node, stop_node):
    open_set = {start_node}
    closed_set = set()
    g = {}         # store distance from starting node
    parents = {}   # parents contains an adjacency map of all nodes
    # distance of starting node from itself is zero
    g[start_node] = 0
    # start_node is the root node, i.e. it has no parent nodes,
    # so start_node is set to its own parent node
    parents[start_node] = start_node
    while len(open_set) > 0:
        n = None
        # node with lowest f() is found
        for v in open_set:
            if n == None or g[v] + heuristic(v) < g[n] + heuristic(n):
                n = v
        if n == stop_node or Graph_nodes[n] == None:
            pass
        else:
            for (m, weight) in get_neighbors(n):
                # nodes 'm' not in the open or closed set are added to the open set,
                # and n is set as their parent
                if m not in open_set and m not in closed_set:
                    open_set.add(m)
                    parents[m] = n
                    g[m] = g[n] + weight
                # for each node m, compare its distance from start, i.e. g(m),
                # to the distance from start through node n
                else:
                    if g[m] > g[n] + weight:
                        # update g(m)
                        g[m] = g[n] + weight
                        # change parent of m to n
                        parents[m] = n
                        # if m is in the closed set, remove it and add it to open
                        if m in closed_set:
                            closed_set.remove(m)
                            open_set.add(m)
        if n == None:
            print('Path does not exist!')
            return None
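The listing above stops before the goal test and bookkeeping steps of the algorithm (Steps 3e and 3f). A minimal sketch of the remainder, together with an illustrative heuristic and sample graph (the node names, weights, and H_dist values are assumptions, not part of the original listing):
        # Step 3e: if the stop node is selected, reconstruct the path via parents
        if n == stop_node:
            path = []
            while parents[n] != n:
                path.append(n)
                n = parents[n]
            path.append(start_node)
            path.reverse()
            print('Path found: {}'.format(path))
            return path
        # Step 3f: move n from the open set to the closed set
        open_set.remove(n)
        closed_set.add(n)
    print('Path does not exist!')
    return None

def get_neighbors(v):
    # returns the (neighbour, weight) list of v, or None for a dead end
    return Graph_nodes.get(v)

def heuristic(n):
    # assumed straight-line estimates for the sample graph below
    H_dist = {'A': 11, 'B': 6, 'C': 99, 'D': 1, 'E': 7, 'G': 0}
    return H_dist[n]

Graph_nodes = {
    'A': [('B', 2), ('E', 3)],
    'B': [('C', 1), ('G', 9)],
    'C': None,
    'E': [('D', 6)],
    'D': [('G', 1)],
}
aStarAlgo('A', 'G')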
EXPECTED OUTPUT
Path found: ['A', 'E', 'D', 'G'] (for the sample graph sketched above)
Result :
Thus the program to implement the A* informed search algorithm was executed successfully.
EXPT NO: 3. Implement Naïve Bayes models
DATE:
Aim:
To implement a Naïve Bayes classifier for spam message classification using Python and scikit-learn.
Algorithm:
Step 1: Read the SMS spam dataset into a pandas dataframe and rename the columns v1 and v2 to class_label and message.
Step 2: Convert the class label from string to numeric (ham = 0, spam = 1).
Step 3: Split the dataset into training and testing sets using the train_test_split function from sklearn.
Step 4: Transform the messages into token-count vectors using CountVectorizer.
Step 5: Train a MultinomialNB classifier on the transformed training data.
Step 6: Predict the labels of the test set and evaluate using the accuracy score, confusion matrix, and classification report.
Program
import pandas as pd
df = pd.read_csv('spam.csv',encoding='cp1252')
df          # displays 5572 rows: v1 (label), v2 (message), plus three unnamed NaN columns
df.rename(columns={"v1":"class_label","v2":"message"},inplace=True)
df          # the columns are now class_label and message
df.class_label.value_counts()
Output
ham 4825
spam 747
Name: class_label, dtype: int64
# convert class label from string to numeric (ham = 0, spam = 1)
df['class_label'] = df['class_label'].map({'ham': 0, 'spam': 1})   # reconstructed: the original conversion line is lost
df          # class_label now holds 0 (ham) and 1 (spam)
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(
    df['message'],
    df['class_label'],
    test_size = 0.3,
    random_state = 0)
y_train.value_counts()
Output
0 3391
1 509
y_test.value_counts()
Output
0 1434
1 238
Name: class_label, dtype: int64
from sklearn.feature_extraction.text import CountVectorizer
vectorizer = CountVectorizer()   # reconstructed: the original call is truncated, so any arguments are lost
x_train_transformed = vectorizer.fit_transform(x_train)   # reconstructed: missing transform lines
x_test_transformed = vectorizer.transform(x_test)
x_train_transformed
Output
(a sparse matrix of token counts for the 3900 training messages is displayed)
from sklearn.naive_bayes import MultinomialNB
classifier = MultinomialNB()
classifier.fit(x_train_transformed, y_train)
Output
MultinomialNB()
ytest_predicted_labels = classifier.predict(x_test_transformed)
ytest_predicted_labels
Output
(an array of 1672 predicted labels is displayed; 0 = ham, 1 = spam)
y_test # actual labels in test dataset
ytest_predicted_labels # predicted labels for test dataset
Output
4456 0
690 0
944 0
3768 0
1189 0
..
4833 0
3006 0
509 0
1761 0
1525 0
Name: class_label, Length: 1672, dtype: int64
from sklearn.metrics import accuracy_score
from sklearn.metrics import confusion_matrix
from sklearn.metrics import classification_report
print ('Accuracy Score :',accuracy_score(y_test, ytest_predicted_labels))
print(confusion_matrix(y_test, ytest_predicted_labels))   # reconstructed: the call that printed the matrix below was lost
Output
Accuracy Score : 0.9850478468899522
[[1423 11]
[ 14 224]]
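classification_report is imported above but never used in the printed listing; a one-line sketch of its use:
print(classification_report(y_test, ytest_predicted_labels))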
ACTUAL OUTPUT
Result :
Thus the program to implement the Naïve Bayes spam classifier was executed successfully.
EXPT NO: 4. Implement Bayesian Networks
DATE:
Aim:
To construct a Bayesian network from the heart disease dataset and to infer the probability of heart disease given evidence, using Python and pgmpy.
Algorithm:
Step 1: Read the heart disease dataset into a pandas dataframe and replace the missing values ('?') with NaN.
Step 2: Define the structure of the Bayesian network over the dataset attributes.
Step 3: Fit the model to the data using the Maximum Likelihood Estimator to learn the conditional probability distributions.
Step 4: Create a VariableElimination inference object over the fitted model.
Step 5: Query the probability of heartdisease given the evidence restecg = 1 and print the result.
Program
import numpy as np
import pandas as pd
from pgmpy.models import BayesianModel                     # reconstructed: missing import
from pgmpy.estimators import MaximumLikelihoodEstimator    # reconstructed: missing import
from pgmpy.inference import VariableElimination            # reconstructed: missing import
heartDisease = pd.read_csv('heart.csv')
heartDisease = heartDisease.replace('?',np.nan)
print(heartDisease.head())
print(heartDisease.dtypes)
model = BayesianModel([('age','heartdisease'),('sex','heartdisease'),('exang','heartdisease'),
                       ('cp','heartdisease'),('heartdisease','restecg'),('heartdisease','chol')])
model.fit(heartDisease,estimator=MaximumLikelihoodEstimator)
HeartDiseasetest_infer = VariableElimination(model)
q1=HeartDiseasetest_infer.query(variables=['heartdisease'],evidence={'restecg':1})
print(q1)
EXPECTED OUTPUT
ACTUAL OUTPUT
Result :
Thus the program to construct a Bayesian network and perform inference was executed successfully.
EXPT NO: 5.i. Build Regression models (i) Simple Linear Regression
DATE:
Aim:
To write a program for implementing a Simple Linear Regression model using Python.
Algorithm:
Step 1: Import the required libraries:
1. pandas for data handling
2. numpy for mathematical processing
3. matplotlib for data visualization
4. sklearn for splitting data and applying classification algorithms.
Step 2: Read the dataset in a pandas dataframe.
Step 3: Rename the columns to appropriate names.
Step 4: Remove any unnecessary columns from the dataframe.
Step 5: Split the dataset into training and testing sets using the train_test_split function
from sklearn.
Step 6: Create LinearRegression object for regression model.
Step 7: Use fit method of LinearRegression object to train the model on the training
data.
Step 8: Use the predict method of the LinearRegression object to predict the class labels
of the testing set.
Step 9: Visualize the result using scatter plot of matplotlib.
Program
# importing the dataset
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
dataset = pd.read_csv('D:\\KIOT\\PRAVEEN\\AIML_LAB\\Salary_Data.csv')
dataset.head()
# data preprocessing
X = dataset.iloc[:, :-1].values #independent variable array
y = dataset.iloc[:,1].values #dependent variable vector
# splitting the dataset
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X,y,test_size=1/3,random_state=0)
# fitting the regression model
from sklearn.linear_model import LinearRegression
regressor = LinearRegression()
regressor.fit(X_train,y_train) #actually produces the linear eqn for the data
# predicting the test set results
y_pred = regressor.predict(X_test)
y_pred
y_test
# visualizing the results
#plot for the TRAIN
plt.scatter(X_train, y_train, color='red')   # plotting the observation points
plt.plot(X_train, regressor.predict(X_train), color='blue')   # reconstructed: plotting the regression line
plt.title('Salary vs Experience (Training set)')
plt.xlabel('Years of Experience')
plt.ylabel('Salary')
plt.show()
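A matching plot for the test set is not in the printed listing; a sketch mirroring the training plot:
#plot for the TEST
plt.scatter(X_test, y_test, color='red')
plt.plot(X_train, regressor.predict(X_train), color='blue')   # the same fitted line
plt.title('Salary vs Experience (Testing set)')
plt.xlabel('Years of Experience')
plt.ylabel('Salary')
plt.show()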
EXPECTED OUTPUT
Sample dataset
Output
ACTUAL OUTPUT
Result :
Thus the program to implement a Simple Linear Regression model was executed successfully.
EXPT NO: 5.ii. Build Regression models (ii) Multiple Linear Regression
DATE:
Aim:
To write a program for implementing a Multiple Linear Regression model using
Python.
Algorithm:
Step 1: Import the required libraries:
pandas for data handling
numpy for mathematical processing
matplotlib for data visualization
sklearn for splitting data and applying classification algorithms.
LabelEncoder and OneHotEncoder from sklearn.preprocessing.
Step 2: Read the dataset in a pandas dataframe.
Step 3: Rename the columns to appropriate names.
Step 4: Remove any unnecessary columns from the dataframe.
Step 5: Split the datasets into features and target using iloc method.
Step 6: Encode the categorical feature in the feature set using LabelEncoder and
OneHotEncoder methods.
Step 7: Split the dataset into training and testing sets using the train_test_split function
from sklearn.
Step 8: Create a LinearRegression object, fit it to the training data, and use its predict
method to predict the target values of the testing set.
Step 9: Print the actual target values and predicted target values using the y_test and
y_pred variables.
Program
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
dataset = pd.read_csv('D:\\KIOT\\AIML_LAB\\50_Startups.csv')
dataset.head()
# data preprocessing
X = dataset.iloc[:,:-1].values
y = dataset.iloc[:,4].values
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
labelEncoder_X = LabelEncoder()
X[:,3] = labelEncoder_X.fit_transform(X[ : , 3])
from sklearn.compose import ColumnTransformer
ct = ColumnTransformer([('encoder', OneHotEncoder(), [3])], remainder='passthrough')
X = np.array(ct.fit_transform(X), dtype=float)   # np.float is deprecated; use the built-in float
X = X[:, 1:]
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 0)
# Fitting the model
from sklearn.linear_model import LinearRegression
regressor = LinearRegression()
regressor.fit(X_train, y_train)
# predicting the test set results
y_pred = regressor.predict(X_test)
y_test
y_pred
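The listing ends after inspecting y_test and y_pred; a short sketch (not in the original) to quantify the fit:
from sklearn.metrics import r2_score
print('R-squared :', r2_score(y_test, y_pred))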
EXPECTED OUTPUT
Output
array([..., 113969.43533013, 167921.06569553])   (only the last two predicted values survive in the printed output)
ACTUAL OUTPUT
Result :
Thus the program to implement a Multiple Linear Regression model was executed successfully.
EXPT NO: 6.i. Build Decision Trees and Random Forests (i) Decision Tree
DATE:
Aim:
To write a program for implementing a Decision Tree model using Python.
Algorithm:
Step 1: Import the required libraries:
1.pandas for data handling
2.sklearn for splitting data and applying classification algorithms.
3.DecisionTree classifier from sklearn.tree
4.plot_tree from sklearn.tree for plotting the decision tree.
Step 2: Read the dataset in a pandas dataframe.
Step 3: Remove any unnecessary columns from the dataframe.
Step 4: Split the dataset into training and testing sets using the train_test_split function
from sklearn.
Step 5: Create DecisionTreeClassifier object to create decision tree classifier with Gini
impurity.
Step 6: Use fit method to train the model on the training data.
Step 7: Visualize the result using plot_tree method.
Program
import pandas as pd
from sklearn.tree import DecisionTreeClassifier
from sklearn.tree import plot_tree
# Create the input data
data = {
'income': [20, 30, 40, 50, 60, 70, 80, 90, 100, 110],
'credit_score': [50, 60, 70, 80, 90, 100, 110, 120, 130, 140],
'eligible': [0, 0, 0, 1, 1, 1, 1, 1, 1, 1]
}
df = pd.DataFrame(data)
df.head()
Sample Dataset
income credit_score eligible
0 20 50 0
1 30 60 0
2 40 70 0
3 50 80 1
4 60 90 1
# Split features and target, then fit the classifier
# (reconstructed: the lines creating X, y and clf are missing from the printed listing)
X = df[['income', 'credit_score']]
y = df['eligible']
clf = DecisionTreeClassifier(criterion='gini')   # Gini impurity, per Step 5
clf.fit(X, y)
Output
DecisionTreeClassifier()
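Step 7 calls for visualizing the tree with plot_tree; a sketch of that call (the visualization lines are missing from the printed listing):
import matplotlib.pyplot as plt
plt.figure(figsize=(10, 6))
plot_tree(clf, feature_names=['income', 'credit_score'], class_names=['not eligible', 'eligible'], filled=True)
plt.show()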
Output
(A plot of the fitted decision tree is displayed.)
EXPECTED OUTPUT
ACTUAL OUTPUT
Result :
Thus the program to implement a Decision Tree model was executed successfully.
EXPT NO: 6.ii. Build Decision Trees and Random Forests (ii) Random Forest
DATE:
Aim:
To write a program for implementing a Random Forest model using Python.
Algorithm:
Step 1: Import the required libraries:
1. pandas for data handling
2. numpy for mathematical operation
3. sklearn for splitting data and applying classification algorithms.
4. RandomForestRegressor from sklearn.ensemble
5. matplotlib for visualizing the result.
6. mean_squared_error from metrics.
Step 2: Import the Boston housing dataset using load_boston from sklearn.datasets.
Step 3: Split the dataset into training and testing sets using the train_test_split function
from sklearn.
Step 4: Create RandomForestRegressor object.
Step 5: Use fit method to train the model on the training data and predict the results.
Step 6: Calculate the mean squared error of predicted values using mean_squared_error
method.
Step 7: Visualize the result using barh plot.
Program
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.datasets import load_boston
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
# Load the Boston dataset
# (note: load_boston was removed in scikit-learn 1.2; this listing assumes an older version)
boston = load_boston()
X = boston.data
y = boston.target
# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Create a Random Forest Regressor with 100 trees
rfr = RandomForestRegressor(n_estimators=100, random_state=42)
# Fit the model to the training data
rfr.fit(X_train, y_train)
Output
RandomForestRegressor(random_state=42)
# Predict the target values of the testing data
y_pred = rfr.predict(X_test)
# Calculate the mean squared error
mse = mean_squared_error(y_test, y_pred)
print("Mean Squared Error:", mse)
Output
Mean Squared Error: 7.901513892156864
# Plot the feature importances
features = boston.feature_names
importances = rfr.feature_importances_
indices = np.argsort(importances)
plt.title('Feature Importances')
plt.barh(range(len(indices)), importances[indices], color='b', align='center')
plt.yticks(range(len(indices)), [features[i] for i in indices])
plt.xlabel('Relative Importance')
plt.show()
EXPECTED OUTPUT
ACTUAL OUTPUT
Result :
Thus the program to implement a Random Forest model was executed successfully.
EXPT NO: 7. Build SVM models
DATE:
Aim:
To write a program to implement the SVM model using Python
Algorithm:
Step 1: Import necessary libraries – pandas, numpy, matplotlib.pyplot, and sklearn.
Step 2: Load the dataset using pandas read_csv function and store it in the variable
'data'.
Step 3: Split the dataset into training and test samples using train_test_split function
from sklearn. Store them in the variables 'training_set' and 'test_set'.
Step 4: Classify the predictors and target. Extract the first two columns as predictors
and the last column as the target variable. Store them in the variables 'X_train' and
'Y_train' for the training set and 'X_test' and 'Y_test' for the test set.
Step 5: Encode the target variable using LabelEncoder from sklearn. Store it in the
variable 'le'.
Step 6: Initialize the Support Vector Machine (SVM) classifier using SVC from sklearn
with kernel type 'rbf' and random_state as 1.
Step 7: Fit the training data into the classifier using the fit() function.
Step 8: Predict the classes for the test set using the predict() function and store them in
the variable 'Y_pred'.
Step 9: Attach the predictions to the test set for comparing using the code
"test_set['Predictions'] = Y_pred".
Step 10: Calculate the accuracy of the predictions using confusion_matrix from sklearn.
Step 11: Visualize the classifier using matplotlib. Use the ListedColormap to color the
graph and show the legend using the scatter plot.
Program
#Importing the dataset
import pandas as pd
data = pd.read_csv("D:\\KIOT\\PRAVEEN\\AIML_LAB\\apples_and_oranges.csv")
#Splitting the dataset into training and test samples
from sklearn.model_selection import train_test_split
training_set, test_set = train_test_split(data, test_size = 0.2, random_state = 1)
#Classifying the predictors and target
X_train = training_set.iloc[:,0:2].values
Y_train = training_set.iloc[:,2].values
X_test = test_set.iloc[:,0:2].values
Y_test = test_set.iloc[:,2].values
#Initializing Support Vector Machine and fitting the training data
from sklearn.svm import SVC
classifier = SVC(kernel='rbf', random_state = 1)
classifier.fit(X_train,Y_train)
#Predicting the classes for test set
Y_pred = classifier.predict(X_test)
#Attaching the predictions to test set for comparing
test_set["Predictions"] = Y_pred
EXPECTED OUTPUT
Output
(test_set is displayed with a new Predictions column)
#Visualizing the classifier (reconstruction: several lines of the original listing were lost)
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap
from sklearn.preprocessing import LabelEncoder
le = LabelEncoder()
X_set, y_set = X_train, le.fit_transform(Y_train)
X1, X2 = np.meshgrid(np.arange(X_set[:, 0].min() - 1, X_set[:, 0].max() + 1, 0.01), np.arange(X_set[:, 1].min() - 1, X_set[:, 1].max() + 1, 0.01))
plt.figure(figsize = (7,7))
plt.contourf(X1, X2, le.transform(classifier.predict(np.array([X1.ravel(), X2.ravel()]).T)).reshape(X1.shape), alpha = 0.75, cmap = ListedColormap(('black', 'white')))
plt.xlim(X1.min(), X1.max())
plt.ylim(X2.min(), X2.max())
for i, j in enumerate(np.unique(y_set)):
    plt.scatter(X_set[y_set == j, 0], X_set[y_set == j, 1], c = ListedColormap(('red', 'orange'))(i), label = j)
plt.xlabel('Weight In Grams')
plt.ylabel('Size in cm')
plt.legend()
plt.show()
Output
ACTUAL OUTPUT
Result :
Thus the program to implement an SVM model was executed successfully.
EXPT NO: 8.i. Implement ensembling techniques (i) Averaging (Bagging)
DATE:
Aim:
To write a program for implementing the averaging (bagging) ensembling technique
using Python.
Algorithm:
Step 1: Load the iris dataset using the sklearn.datasets module.
Step 2: Split the dataset into training and testing sets using the train_test_split function
from sklearn.model_selection module.
Step 3: Select only the sepal data for both training and testing datasets.
Step 4: Create an instance of the BaggingClassifier class from the sklearn.ensemble
module.
Step 5: Fit the BaggingClassifier instance to the training data.
Step 6: Calculate the score of the BaggingClassifier on the testing data.
Step 7: Create an instance of the KNeighborsClassifier class and fit it to the training
data.
Step 8: Calculate the score of the KNeighborsClassifier on the testing data.
Step 9: Define the make_meshgrid function to create a meshgrid for plotting the
decision boundaries.
Step 10: Define the plot_contours function to plot the decision boundaries.
Step 11: Get the sepal length and sepal width data from the iris dataset.
Step 12: Create the meshgrid for plotting the decision boundaries using the
make_meshgrid function.
Step 13: Plot the decision boundaries using the plot_contours function and the
BaggingClassifier instance.
Step 14: Plot the actual data points for the versicolor and virginica classes using the
scatter function.
Step 15: Show the plot.
Program
# Load the iris dataset
from sklearn import datasets
iris = datasets.load_iris()
# split into train and test datasets
from sklearn.model_selection import train_test_split
# just use the sepal data
X_train, X_test, y_train, y_test = train_test_split(iris.data[:,0:2],iris.target)
# Model the data set using Bagging Classifier
from sklearn.ensemble import BaggingClassifier
from sklearn.neighbors import KNeighborsClassifier
classifier = BaggingClassifier(base_estimator = KNeighborsClassifier(),
max_samples = 10,
n_estimators = 100)
classifier.fit(X_train,y_train)
Output
BaggingClassifier(base_estimator=KNeighborsClassifier(), max_samples=10,
n_estimators=100)
# Calculate the score
classifier.score(X_test,y_test)
Output
0.7894736842105263
classifier_knn = KNeighborsClassifier()
classifier_knn.fit(X_train,y_train)
classifier_knn.score(X_test,y_test)
Output
0.7368421052631579
import numpy as np
def make_meshgrid(x, y, h=.02):
    x_min, x_max = x.min() - 1, x.max() + 1
    y_min, y_max = y.min() - 1, y.max() + 1
    xx, yy = np.meshgrid(np.arange(x_min, x_max, h),
                         np.arange(y_min, y_max, h))
    return xx, yy

def plot_contours(ax, clf, xx, yy, **params):
    Z = clf.predict(np.c_[xx.ravel(), yy.ravel()])
    Z = Z.reshape(xx.shape)
    out = ax.contourf(xx, yy, Z, **params)
    return out
X0, X1 = iris.data[:,0], iris.data[:, 1]
# Pass the data. make_meshgrid will automatically identify the min and max points to draw the grid
xx, yy = make_meshgrid(X0, X1)
import matplotlib as mpl
import matplotlib.pyplot as plt
%matplotlib inline
mpl.rcParams['figure.dpi'] = 200
# plot the meshgrid
plot_contours(plt, classifier, xx, yy,
cmap=plt.cm.coolwarm,
alpha=0.8)
# plot the actual data points for versicolor and virginica
plt.scatter(X0, X1, c=iris.target,
cmap=plt.cm.coolwarm,
s=20, edgecolors='k',
alpha=0.2)
Output
Here is a visual comparison of an ensemble of KNN classifiers against a single KNN
classifier. You can see that the overfitting seen with a single KNN classifier is greatly
reduced in the ensemble model.
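The single-classifier panel of that comparison can be drawn the same way; a sketch reusing the helpers above (not in the printed listing):
plot_contours(plt, classifier_knn, xx, yy, cmap=plt.cm.coolwarm, alpha=0.8)
plt.scatter(X0, X1, c=iris.target, cmap=plt.cm.coolwarm, s=20, edgecolors='k', alpha=0.2)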
classifier.score(X_test,y_test)
Output
0.7894736842105263
classifier_knn.score(X_test,y_test)
Output
0.7368421052631579
On this run the ensemble scores only slightly higher than the standalone KNN model
because the dataset is small. Once the models hit a large real-world dataset, the
ensemble performs much better, specifically with respect to variance.
ACTUAL OUTPUT
Result :
Thus the program to implement the averaging (bagging) ensembling technique was executed successfully.
EXPT NO: 8.ii. Implement ensembling techniques (ii) AdaBoost
DATE:
Aim:
To write a program for implementing the AdaBoost ensembling technique using
Python.
Algorithm:
Step 1: Import the necessary libraries.
Step 2: Load the iris dataset.
Step 3: Split the dataset into training and testing sets.
Step 4: Create an instance of the AdaBoost classifier.
Step 5: Train the classifier on the training data.
Step 6: Create a meshgrid to plot the decision boundaries.
Step 7: Define a function to plot the decision boundaries and data points.
Step 8: Plot the decision boundaries and data points.
Step 9: Calculate the accuracy of the classifier on the test data.
Step 10: Output the accuracy score.
Program
from sklearn.ensemble import AdaBoostClassifier
# (this experiment reuses X_train, y_train, make_meshgrid and plot_contours from the previous experiment)
classifier_adb = AdaBoostClassifier(n_estimators=100, random_state=100, learning_rate=0.1)
classifier_adb.fit(X_train, y_train)
Output
AdaBoostClassifier(learning_rate=0.1, n_estimators=100, random_state=100)
X0, X1 = iris.data[:,0], iris.data[:, 1]
# Pass the data. make_meshgrid will automatically identify the min and max points to draw the grid
xx, yy = make_meshgrid(X0, X1)
import matplotlib as mpl
import matplotlib.pyplot as plt
%matplotlib inline
# plot the meshgrid
plot_contours(plt, classifier_adb, xx, yy,
cmap=plt.cm.coolwarm,
alpha=0.8)
Output
classifier_adb.score(X_test,y_test)
Output
0.6578947368421053
The score of this classifier is similar to that of a Random Forest, and a bit higher than
the score of a single decision tree.
ACTUAL OUTPUT
Result :
Thus the program to implement the AdaBoost ensembling technique was executed successfully.
EXPT NO: 8.iii. Implement ensembling techniques (iii) Gradient Boosting
DATE:
Aim:
To write a program for implementing the Gradient Boosting ensembling technique
using Python.
Algorithm:
Step 1: Import the necessary libraries.
Step 2: Load the Boston dataset using load_boston() from the datasets module.
Step 3: Split the dataset into training and testing sets using train_test_split method from
the model_selection module.
Step 4: Instantiate a Gradient Boosting Regressor model from the ensemble module.
Step 5: Train the Gradient Boosting Regressor model on the training data using fit
method
Step 6: Get the feature importance scores of the trained model using
feature_importances_
Step 7: Evaluate the performance of the model on the test data using score() and
calculate the R-squared value.
Step 8: Instantiate a Linear Regression model from the linear_model module and fit it to
the training data.
Step 9: Evaluate the performance of the Linear Regression model on the test data using
score method and calculate the R-squared value.
Gradient Boosting is essentially a combination of decision trees with some kind of
residual-minimizing algorithm (it could be Ordinary Least Squares or Gradient Descent).
As long as you understand the concept of residuals and how to minimize them, we are
good to go to understand Gradient Boosting.
So, by definition, Gradient Boosting is a good fit for regression problems. However, it can
be adapted to classification problems using the logit/expit function (as used in Logistic
Regression). So, let's get started with a regression problem, say the Boston Housing
dataset.
Program
from sklearn import datasets
boston = datasets.load_boston()
boston.feature_names
Output
array(['CRIM', 'ZN', 'INDUS', 'CHAS', 'NOX', 'RM', 'AGE', 'DIS', 'RAD',
       'TAX', 'PTRATIO', 'B', 'LSTAT'], dtype='<U7')
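The listing stops after inspecting the feature names; below is a minimal sketch of the remaining steps described in the algorithm (variable names such as gbr are illustrative):
from sklearn.model_selection import train_test_split
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import LinearRegression
X_train, X_test, y_train, y_test = train_test_split(boston.data, boston.target, random_state=42)
gbr = GradientBoostingRegressor()                        # Step 4: instantiate the regressor
gbr.fit(X_train, y_train)                                # Step 5: train the model
print(gbr.feature_importances_)                          # Step 6: feature importance scores
print('Gradient Boosting R-squared :', gbr.score(X_test, y_test))   # Step 7
lr = LinearRegression().fit(X_train, y_train)            # Step 8: baseline model
print('Linear Regression R-squared :', lr.score(X_test, y_test))    # Step 9
Result :
Thus the program to implement the Gradient Boosting ensembling technique was executed successfully.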
EXPT NO: 9.i. Implement clustering algorithms (i) K-Means
DATE:
Aim:
To implement the K-Means clustering algorithm using Python.
Algorithm:
Step 1:Import the necessary libraries: Import KMeans from sklearn.cluster and pandas
for data manipulation.
Step 2:Read the data: Read the CSV file containing customer purchase data into a
DataFrame called data.
Step 3:Initialize KMeans: Create a KMeans object with the desired number of clusters
(2).
Step 4:Fit KMeans: Call the fit() method on the KMeans object to perform clustering.
Step 5:Predict cluster labels: Call the predict() method on the fitted KMeans object to
assign data points to clusters.
Step 6:Get cluster labels: Store the predicted cluster labels in a variable called labels.
Step 7:Print cluster labels: Print the labels variable to display the cluster labels for each
data point.
Step 8:Interpretation: Use the resulting cluster labels for further analysis or business
decisions.
K-means is one of the most popular clustering algorithms, used to partition data into K
clusters.
Sample Input:
Assume that we have a dataset of customer purchase records. The dataset has two
features, namely 'Amount' and 'Frequency', which represent the amount spent and the
frequency of visits to the store, respectively.
Program
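The top of the listing is missing; a minimal reconstruction of Steps 1-3 (the file name customer_purchases.csv is an assumption):
from sklearn.cluster import KMeans
import pandas as pd
# Read the customer purchase data (assumed file name; columns Amount and Frequency)
data = pd.read_csv('customer_purchases.csv')
# Initialize KMeans with two clusters
kmeans = KMeans(n_clusters=2)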
kmeans.fit(data)
# Predict the cluster labels
labels = kmeans.predict(data)
# Print the cluster labels
print(labels)
Sample data set
Amount Frequency
20 2
25 2
30 1
40 1
50 3
60 4
70 3
80 2
90 1
Output
[0 0 0 0 0 1 1 1 1]
Result :
Thus the program to implement the K-Means clustering algorithm was executed successfully.
EXPT NO: 9.ii. Implement clustering algorithms (ii) Hierarchical Clustering
DATE:
Aim:
To implement the Hierarchical clustering algorithm using Python.
Algorithm:
Step 1:Import AgglomerativeClustering from sklearn.cluster and pandas for data
manipulation.
Step 2:Read the CSV file containing animal data into a DataFrame called data.
Step 3:Initialize AgglomerativeClustering with the desired number of clusters (2).
Step 4:Fit the model by calling the fit() method on the AgglomerativeClustering object.
Step 5:Predict cluster labels using the labels_ attribute of the fitted
AgglomerativeClustering object.
Step 6:Print the labels variable to display the cluster labels for each data point.
Step 7:Interpretation: Use the resulting cluster labels for further analysis or business
decisions.
Sample Input:
Assume that we have a dataset of animals, and we want to cluster them based on their
attributes. The dataset has four features, namely 'Hair', 'Feathers', 'Eggs', and 'Milk',
which represent whether the animal has hair, feathers, lays eggs, and produces milk,
respectively.
Program
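The program body is missing from the printed manual; a minimal sketch following the algorithm (the file name animals.csv is an assumption):
from sklearn.cluster import AgglomerativeClustering
import pandas as pd
# Read the animal data (assumed file name; columns Hair, Feathers, Eggs, Milk)
data = pd.read_csv('animals.csv')
# Initialize and fit hierarchical clustering with two clusters
agg = AgglomerativeClustering(n_clusters=2)
agg.fit(data)
# Cluster labels for each data point
labels = agg.labels_
print(labels)
Output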
[1 1 0 0 1 0]
The output shows that the algorithm has divided the animals into two clusters. The first
cluster includes the animals that have hair and produce milk, while the second cluster
includes the animals that have feathers and lay eggs.
Result :
Thus the program to implement the Hierarchical clustering algorithm was executed successfully.
EXPT NO: 9.iii. Implement clustering algorithms (iii) DBSCAN
DATE:
Aim:
To implement the DBSCAN clustering algorithm using Python.
Algorithm:
Step 1:Import DBSCAN from sklearn.cluster and pandas for data manipulation.
Step 2:Read the CSV file containing customer purchase data into a DataFrame called
data.
Step 3:Initialize DBSCAN with the desired hyperparameters: epsilon (eps) and
minimum number of samples (min_samples).
Step 4:Fit DBSCAN by calling the fit() method on the DBSCAN object.
Step 5:Predict cluster labels using the labels_ attribute of the fitted DBSCAN object.
Step 6:Print the labels variable to display the cluster labels for each data point.
Step 7:Interpretation: Use the resulting cluster labels for further analysis or business
decisions.
Sample Input:
Assume that we have a dataset of customers who visit a store, and we want to cluster
them based on their shopping behavior. The dataset has two features, namely 'Amount'
and 'Frequency', which represent the amount spent and the frequency of visits to the
store, respectively.
Program
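The top of the listing is missing; a minimal reconstruction following the algorithm (the file name customer_purchases.csv is an assumption):
from sklearn.cluster import DBSCAN
import pandas as pd
# Read the customer purchase data (assumed file name; columns Amount and Frequency)
data = pd.read_csv('customer_purchases.csv')
# Initialize DBSCAN with the parameters discussed below
dbscan = DBSCAN(eps=2, min_samples=2)
dbscan.fit(data)
labels = dbscan.labels_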
print(labels)
Sample data set:
Amount Frequency
20 2
25 2
30 1
40 1
50 3
60 4
70 3
80 2
90 1
Output
[ 0 0 -1 -1 1 1 1 0 -1]
In this example, we set the eps parameter to 2 and the min_samples parameter to 2.
These parameters control the density of the clusters and the minimum number of points
required to form a cluster.
The output of the DBSCAN algorithm is an array of cluster labels, where -1 represents
outliers.
Result :
Thus the program to implement the DBSCAN clustering algorithm was executed successfully.
EXPT NO: 10. Implement EM for Bayesian Networks
DATE:
Aim:
To implement EM for Bayesian networks using Python.
Algorithm:
Step 1:Import numpy and pandas for data manipulation, and BayesianModelSampling
from pgmpy.sampling for Bayesian network sampling.
Step 2:Define (or load) a fitted Bayesian network model.
Step 3:Create a BayesianModelSampling object from the model.
Step 4:Generate 1000 samples from the Bayesian network using forward_sample().
Step 5:Convert the samples to a pandas DataFrame called data with column names as
the nodes of the Bayesian network.
Step 6:Print the first 5 rows of the data DataFrame using data.head().
Step 7:Interpretation: The resulting data DataFrame contains 1000 samples generated
from the Bayesian network model, which can be used for further analysis or inference.
Note: The Bayesian network model needs to be defined and fitted prior to this step
using appropriate methods and data; an illustrative definition is sketched inside the
program below.
Program
import numpy as np
import pandas as pd
from pgmpy.sampling import BayesianModelSampling
# Set the random seed for reproducibility
np.random.seed(42)
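# As the note above says, the model must exist before sampling. A minimal illustrative
# definition follows (the structure and CPD values here are assumptions, not from the manual):
from pgmpy.models import BayesianModel
from pgmpy.factors.discrete import TabularCPD
model = BayesianModel([('A', 'B'), ('A', 'C'), ('B', 'D'), ('C', 'E')])
cpds = [
    TabularCPD('A', 2, [[0.6], [0.4]]),
    TabularCPD('B', 2, [[0.7, 0.2], [0.3, 0.8]], evidence=['A'], evidence_card=[2]),
    TabularCPD('C', 2, [[0.5, 0.1], [0.5, 0.9]], evidence=['A'], evidence_card=[2]),
    TabularCPD('D', 2, [[0.9, 0.4], [0.1, 0.6]], evidence=['B'], evidence_card=[2]),
    TabularCPD('E', 2, [[0.8, 0.3], [0.2, 0.7]], evidence=['C'], evidence_card=[2]),
]
model.add_cpds(*cpds)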
# Create a BayesianModelSampling object from the model
sampler = BayesianModelSampling(model)
# Generate 1000 samples from the Bayesian Network
samples = sampler.forward_sample(size=1000)
# Convert the samples to a pandas dataframe
data = pd.DataFrame(samples, columns=model.nodes())
print(data.head())
Output
A C B D E
0 0 0 0 1 0
1 1 1 0 1 1
2 1 1 1 0 1
3 0 0 1 0 0
4 0 0 1 0 0
# This generates 1000 samples of the 5 nodes in the Bayesian Network.
# Use the BayesianEstimator class to estimate the CPDs from the sample data
# (reconstructed: the estimation lines are missing from the printed listing)
from pgmpy.estimators import BayesianEstimator
new_model = BayesianModel(model.edges())
estimator = BayesianEstimator(new_model, data)
cpds = estimator.get_parameters(prior_type='BDeu')
new_model.add_cpds(*cpds)
new_model.check_model()
Output
True
Result :
Thus the program to implement EM for Bayesian networks was executed successfully.
EXPT NO: 11. Build simple Neural Network models
DATE:
Aim:
To build simple Neural Network models using Python.
Algorithm
Step 1:Import numpy for numerical operations and Sequential and Dense from
keras.models and keras.layers respectively for defining and compiling the neural
network model.
Step 2:Define the dataset X as the input features and y as the target labels.
Step 3:Define the neural network model architecture using Sequential and add layers to
it using model.add(). In this case, add a dense layer with 8 units, input dimension of 2,
and ReLU activation function, and another dense layer with 1 unit and sigmoid
activation function.
Step 4:Compile the model using model.compile() with binary cross-entropy loss, Adam
optimizer, and accuracy as the evaluation metric.
Step 5:Train the model using model.fit() with the dataset X and y, specifying the
number of epochs (10) and batch size (4) for training.
Step 6:Evaluate the trained model using model.evaluate() with the same dataset X and
y, and store the scores in a variable called scores.
Step 7:Print the accuracy score of the model using model.metrics_names[1] to get the
name of the accuracy metric and scores[1]*100 to get the accuracy in percentage.
Step 8:Interpretation: The resulting accuracy score is printed, which represents the
accuracy of the trained neural network model on the given dataset.
Program
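The program body is missing from the printed manual; a sketch reconstructed from the algorithm above (the XOR-style dataset is an assumption consistent with the four samples and batch size 4 implied by the output below):
import numpy as np
from keras.models import Sequential
from keras.layers import Dense
# Assumed dataset: the XOR truth table (4 samples, 2 features)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 1, 1, 0])
# Define the model: one hidden Dense layer (8 units, ReLU) and a sigmoid output
model = Sequential()
model.add(Dense(8, input_dim=2, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
# Compile with binary cross-entropy loss, Adam optimizer, and accuracy metric
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
# Train for 10 epochs with batch size 4
model.fit(X, y, epochs=10, batch_size=4)
# Evaluate on the same data and print the accuracy
scores = model.evaluate(X, y)
print("%s: %.2f%%" % (model.metrics_names[1], scores[1] * 100))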
Output
Epoch 1/10
1/1 [==============================] - 1s 1s/step - loss: 0.6963 - accuracy: 0.7500
Epoch 2/10
1/1 [==============================] - 0s 25ms/step - loss: 0.6961 - accuracy: 0.5000
Epoch 3/10
1/1 [==============================] - 0s 30ms/step - loss: 0.6959 - accuracy: 0.5000
Epoch 4/10
1/1 [==============================] - 0s 13ms/step - loss: 0.6957 - accuracy: 0.5000
Epoch 5/10
1/1 [==============================] - 0s 15ms/step - loss: 0.6956 - accuracy: 0.5000
Epoch 6/10
1/1 [==============================] - 0s 22ms/step - loss: 0.6954 - accuracy: 0.5000
Epoch 7/10
1/1 [==============================] - 0s 13ms/step - loss: 0.6952 - accuracy: 0.5000
Epoch 8/10
1/1 [==============================] - 0s 19ms/step - loss: 0.6950 - accuracy: 0.5000
Epoch 9/10
1/1 [==============================] - 0s 10ms/step - loss: 0.6949 - accuracy: 0.5000
Epoch 10/10
1/1 [==============================] - 0s 11ms/step - loss: 0.6947 - accuracy: 0.5000
1/1 [==============================] - 0s 265ms/step - loss: 0.6945 - accuracy: 0.5000
accuracy: 50.00%
Algorithm
1. Import tensorflow as tf for defining and compiling the neural network model.
2. Define the input layer using tf.keras.Input() with the shape of the input data (4 in
this case).
3. Define the output layer using tf.keras.layers.Dense() with 1 unit and sigmoid
activation function, and passing the input layer as the input to this layer.
4. Define the model using tf.keras.Model() and passing the input and output layers
as arguments.
5. Compile the model using model.compile() with the Adam optimizer, binary
cross-entropy loss, and accuracy as the evaluation metric.
Program
import tensorflow as tf
# Define the input and output layers
inputs = tf.keras.Input(shape=(4,))
outputs = tf.keras.layers.Dense(1, activation='sigmoid')(inputs)
# Define the model
model = tf.keras.Model(inputs=inputs, outputs=outputs)
# Compile the model
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
# Print the summary of the model
model.summary()
Output
Model: "model_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_2 (InputLayer)         [(None, 4)]               0
dense_6 (Dense)              (None, 1)                 5
=================================================================
Total params: 5
Trainable params: 5
Non-trainable params: 0
Algorithm
1. Import tensorflow as tf for defining and compiling the neural network model.
2. Define the input layer using tf.keras.Input() with the shape of the input data (4 in
this case).
3. Define the hidden layer using tf.keras.layers.Dense() with 8 units and ReLU
activation function, and passing the input layer as the input to this layer.
4. Define the output layer using tf.keras.layers.Dense() with 1 unit and sigmoid
activation function, and passing the hidden layer as the input to this layer.
5. Define the model using tf.keras.Model() and passing the input and output layers
as arguments.
6. Compile the model using model.compile() with the Adam optimizer, binary
cross-entropy loss, and accuracy as the evaluation metric.
Program
import tensorflow as tf
# Define the input and hidden layers
inputs = tf.keras.Input(shape=(4,))
hidden = tf.keras.layers.Dense(8, activation='relu')(inputs)
# Define the output layer
outputs = tf.keras.layers.Dense(1, activation='sigmoid')(hidden)
# Define the model
model = tf.keras.Model(inputs=inputs, outputs=outputs)
# Compile the model
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
# Print the summary of the model
model.summary()
Output
Model: "model"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_1 (InputLayer)         [(None, 4)]               0
dense_4 (Dense)              (None, 8)                 40
dense_5 (Dense)              (None, 1)                 9
=================================================================
Total params: 49
Trainable params: 49
Non-trainable params: 0
Result :
Thus the program to build simple neural network models was executed
successfully.
EXPT NO: 12. Build deep learning Neural Network models
DATE:
Aim:
To build a deep learning neural network model using Python.
Algorithm:
Step 1:Import numpy as np for generating random data.
Step 2:Import Sequential and Dense from keras.models and keras.layers respectively for
defining the neural network model.
Step 3:Generate random input data X with shape (100, 10) and output data Y with shape
(100, 1).
Step 4:Define the model using Sequential() and add layers to the model using
model.add(). Specify the activation functions and input shape for each layer.
Step 5:Compile the model using model.compile() with the Adam optimizer, binary
cross-entropy loss, and accuracy as the evaluation metric.
Step 6:Fit the model to the data using model.fit() with X and Y as input and output data
respectively, specifying the number of epochs and batch size for training.
Step 7:Make predictions using the trained model on the first 10 data points of X using
model.predict(), and store the predictions in predictions.
Step 8:Print the predictions using print(predictions) to display the predicted values.
Program
import numpy as np
from keras.models import Sequential
from keras.layers import Dense
# Generate some random data for the input
X = np.random.rand(100, 10)
# Generate some random data for the output
Y = np.random.rand(100, 1)
# Define the model
model = Sequential()
model.add(Dense(32, activation='relu', input_dim=10))
model.add(Dense(16, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
# Compile the model
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
# Fit the model to the data
model.fit(X, Y, epochs=10, batch_size=16)
# Make some predictions
predictions = model.predict(X[:10])
print(predictions)
Output
Epoch 1/10
7/7 [==============================] - 7s 9ms/step - loss: 0.6959 - accuracy: 0.0000e+00
Epoch 2/10
7/7 [==============================] - 0s 6ms/step - loss: 0.6944 - accuracy: 0.0000e+00
Epoch 3/10
7/7 [==============================] - 0s 6ms/step - loss: 0.6925 - accuracy: 0.0000e+00
Epoch 4/10
7/7 [==============================] - 0s 5ms/step - loss: 0.6915 - accuracy: 0.0000e+00
Epoch 5/10
7/7 [==============================] - 0s 7ms/step - loss: 0.6909 - accuracy: 0.0000e+00
Epoch 6/10
7/7 [==============================] - 0s 5ms/step - loss: 0.6899 - accuracy: 0.0000e+00
Epoch 7/10
7/7 [==============================] - 0s 6ms/step - loss: 0.6893 - accuracy: 0.0000e+00
Epoch 8/10
7/7 [==============================] - 0s 8ms/step - loss: 0.6883 - accuracy: 0.0000e+00
Epoch 9/10
7/7 [==============================] - 0s 5ms/step - loss: 0.6877 - accuracy: 0.0000e+00
Epoch 10/10
7/7 [==============================] - 0s 6ms/step - loss: 0.6871 - accuracy: 0.0000e+00
1/1 [==============================] - 0s 281ms/step
[[0.46943492]
[0.46953392]
[0.4827811 ]
[0.49216908]
[0.48589352]
[0.49423394]
[0.4870021 ]
[0.45913622]
[0.4569753 ]
[0.44420815]]
In this example, we're generating some random input and output data, defining a
neural network with three layers, compiling the model with an optimizer and loss
function, and fitting the model to the data. Finally, we're making some predictions on
the first ten rows of the input data.
Note that there are many other libraries and frameworks for building neural networks
in Python, such as TensorFlow, PyTorch, and scikit-learn, and the specific
implementation details can vary depending on the library.
Result :
Thus the program to build a deep learning neural network model was executed successfully.