AI Lab Assignment-10 Ishaan Bhadrike


AI Lab Assignment 10

Name - Ishaan Bhadrike

PRN - 21070122060

Class - AI2 / CSA3

Aim: Apply the support vector machine classification algorithm to
a sample case study and dataset, and evaluate the results.

Objectives:
● Data Preprocessing and Feature Selection
● Support Vector Machine Algorithm Implementation
● Evaluation and Analysis

Algorithm:
1. Load and preprocess the dataset.
2. Split the data into training and testing sets.
3. Initialize the SVM classifier.
4. Train the SVM model on the training data.
5. Make predictions on the testing data.
6. Evaluate the model's performance using accuracy and
other relevant metrics.
7. Interpret the results and adjust the model if necessary.

Input:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.preprocessing import LabelEncoder

from google.colab import files

# Upload the CSV file from the local machine (Google Colab only)
uploaded = files.upload()

data = pd.read_csv('Social_Network_Ads - Social_Network_Ads.csv')

print(data.head())

# Find the columns that do not hold numeric data
non_numeric_columns = data.select_dtypes(exclude=[np.number]).columns
print("Non-numeric columns:", non_numeric_columns)

if not non_numeric_columns.empty:
    for column in non_numeric_columns:
        # Apply label encoding to non-numeric columns
        data[column] = LabelEncoder().fit_transform(data[column])

X = data.drop('Purchased', axis=1)  # Features
y = data['Purchased']  # Target

# Hold out 20% of the rows for testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# You can try different kernels like 'rbf', 'poly', etc.
svm_model = SVC(kernel='linear')

# Train the SVM model
svm_model.fit(X_train, y_train)

# Make predictions on the test set
y_pred = svm_model.predict(X_test)

accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)

print("\nClassification Report:")
print(classification_report(y_test, y_pred))

print("\nConfusion Matrix:")
print(confusion_matrix(y_test, y_pred))
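
To make the printed metrics easier to read, the sketch below shows
how accuracy, precision, and recall follow from the four cells of a
binary confusion matrix. The arrays y_true_demo and y_pred_demo are
made-up labels for illustration only; they are not results from
this lab.

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix

# Hypothetical labels, chosen by hand for illustration (not lab results)
y_true_demo = np.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1])
y_pred_demo = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 0])

# For binary labels 0/1, scikit-learn lays the matrix out as
# [[TN, FP],
#  [FN, TP]]
cm_demo = confusion_matrix(y_true_demo, y_pred_demo)
tn, fp, fn, tp = cm_demo.ravel()

accuracy_demo = (tp + tn) / (tp + tn + fp + fn)  # share of correct predictions
precision_demo = tp / (tp + fp)                  # predicted positives that are correct
recall_demo = tp / (tp + fn)                     # actual positives that were found

print(cm_demo)
print("Accuracy:", accuracy_demo)
print("Precision:", precision_demo)
print("Recall:", recall_demo)
```

The same formulas apply to the confusion matrix printed for the SVM
model above; classification_report reports precision and recall per
class using this arithmetic.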

Output:

Conclusion:
A Support Vector Machine (SVM) classifier with a linear kernel was
applied to the Social Network Ads dataset. Non-numeric columns were
label encoded, the data was split 80/20 into training and test
sets, and the trained model was evaluated using accuracy, a
classification report, and a confusion matrix. The model's
performance on the test set was satisfactory. Further work could
include tuning the kernel and other hyperparameters, and testing
the model on real-world data to validate its effectiveness.

Post Lab Questions:


1. How did the choice of the kernel function influence the
performance of the Support Vector Machine (SVM) classifier,
and what insights does it provide into the dataset's
underlying structure?

Answer: The kernel function determines the shape of the decision
boundary the SVM classifier can learn. A linear kernel assumes the
classes can be separated by a straight line (a hyperplane in higher
dimensions), while non-linear kernels such as polynomial or RBF can
model curved boundaries. Comparing kernels therefore gives clues
about the dataset's structure. If a linear kernel performs about as
well as the non-linear ones, the classes are probably close to
linearly separable; if a non-linear kernel performs clearly better,
the relationship between the features and the target is more
intricate.
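
One way to run such a comparison is sketched below. It is only an
illustration under stated assumptions: it uses the synthetic
make_moons dataset from scikit-learn rather than the lab's CSV file,
adds feature scaling, and scores each kernel with 5-fold
cross-validation. The names X_demo, y_demo, and kernel_scores are
introduced here for the example.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic two-class data with a curved class boundary (assumption:
# stands in for a dataset that is not linearly separable)
X_demo, y_demo = make_moons(n_samples=300, noise=0.2, random_state=42)

kernel_scores = {}
for kernel in ("linear", "poly", "rbf"):
    # Scale the features first; SVMs are sensitive to feature ranges
    model = make_pipeline(StandardScaler(), SVC(kernel=kernel))
    scores = cross_val_score(model, X_demo, y_demo, cv=5)
    kernel_scores[kernel] = scores.mean()
    print(f"{kernel:>6}: mean CV accuracy = {scores.mean():.3f}")
```

On data like this, a noticeably higher score for the RBF kernel than
for the linear kernel would point to a non-linear class boundary.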

2. Can you discuss any challenges or limitations encountered
during the application of the SVM algorithm on the sample
dataset, and how these challenges were addressed or could be
mitigated?

Answer: Challenges encountered when applying SVM to the dataset
include:
● selecting suitable hyperparameters,
● handling imbalanced data,
● addressing computational complexity with large datasets,
● interpreting complex decision boundaries.

These can be mitigated with grid search for hyperparameter tuning,
class weighting for imbalanced data, dimensionality reduction for
large datasets, and visualization aids for interpreting the results.
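
A minimal sketch of two of these mitigations, grid search and class
weighting, is given below. It assumes a synthetic imbalanced dataset
from make_classification in place of the lab's CSV file, and the
parameter grid is an arbitrary example rather than a recommended
setting. The names X_imb, y_imb, pipe, and search are introduced
here for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic data with roughly 80% of samples in class 0 (assumption)
X_imb, y_imb = make_classification(
    n_samples=400, n_features=4, n_informative=3, n_redundant=1,
    weights=[0.8, 0.2], random_state=42,
)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_imb, y_imb, test_size=0.2, stratify=y_imb, random_state=42
)

# class_weight="balanced" penalizes errors on the rarer class more heavily
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("svm", SVC(class_weight="balanced")),
])

# Example grid only; gamma is ignored by the linear kernel
param_grid = {
    "svm__kernel": ["linear", "rbf"],
    "svm__C": [0.1, 1, 10],
    "svm__gamma": ["scale", 0.1],
}

# F1 is used for scoring because accuracy can mislead on imbalanced data
search = GridSearchCV(pipe, param_grid, cv=5, scoring="f1")
search.fit(X_tr, y_tr)

print("Best parameters:", search.best_params_)
print("Test F1 score:", search.score(X_te, y_te))
```

Keeping the scaler inside the pipeline means it is refitted on each
training fold, so the cross-validation scores are not influenced by
the held-out fold.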

3. In what scenarios might SVM be a preferred choice over
other classification algorithms, based on the findings of this
experiment?

Answer: Based on the findings of this experiment, Support Vector
Machine (SVM) may be preferred over other classification
algorithms in the following scenarios:

● When the dataset has a high number of features relative to
the number of samples, as SVMs are effective in
high-dimensional spaces and, with suitable regularization,
are relatively robust to overfitting.
● When there is a clear margin of separation between classes,
as SVMs aim to maximize this margin, which tends to lead to
better generalization.
● When dealing with complex and non-linear datasets, as
SVMs can handle non-linear decision boundaries using the
kernel trick.
● When interpretability of the model's decision boundary and
support vectors is important, as SVMs have a clear geometric
interpretation, particularly with a linear kernel.

Overall, SVMs are a preferred choice in scenarios where there is
a need for robustness, generalization, and interpretability in
classification tasks.
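
The geometric interpretation can be illustrated with a small sketch.
It assumes a hand-made, linearly separable toy dataset (X_toy, y_toy),
not the lab data, and a linear kernel with a large C so that the
result approximates a hard-margin SVM. For a linear SVM with weight
vector w, the margin width is 2 / ||w||.

```python
import numpy as np
from sklearn.svm import SVC

# Hand-made toy points: class 0 near the origin, class 1 further out
X_toy = np.array([[1.0, 1.0], [2.0, 1.0], [1.0, 2.0],
                  [4.0, 4.0], [5.0, 4.0], [4.0, 5.0]])
y_toy = np.array([0, 0, 0, 1, 1, 1])

# A large C leaves almost no room for margin violations
clf = SVC(kernel="linear", C=1e3)
clf.fit(X_toy, y_toy)

w = clf.coef_[0]          # normal vector of the separating hyperplane
b = clf.intercept_[0]     # offset of the hyperplane
margin_width = 2.0 / np.linalg.norm(w)

print("Support vectors:\n", clf.support_vectors_)
print("w =", w, "b =", b)
print("Margin width:", margin_width)
```

Only the points closest to the boundary appear as support vectors
here: (2, 1) and (1, 2) from class 0 and (4, 4) from class 1. The
margin width should come out close to 2.5·√2 ≈ 3.54, the distance
between (4, 4) and the edge joining the other two.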
