LAB#06: Machine Learning Lab
Raheel Aslam (74-FET/BSEE/F16)



Code for logistic regression using weather1.csv:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

#Loading the dataset – weather1.csv


dataset = pd.read_csv(r"D:\8th semister\machine learning lab\lab5 6\weather1.csv")
#To predict the class label, the model only needs the two numeric feature
#columns (columns 2 and 3, plotted later as Humidity and Windy); the other
#columns are not used as predictors.

# Input features
x = dataset.iloc[:, [2, 3]].values
# Output / target labels
y = dataset.iloc[:, 4].values

#Splitting the dataset into training and test sets.


#75% of the data is used for training the model.
#25% of it is used to test the performance of the model.

from sklearn.model_selection import train_test_split

xtrain, xtest, ytrain, ytest = train_test_split(x, y, test_size = 0.25)
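
#(Optional check, not part of the original lab script: confirm that roughly
#75% of the rows went to the training set and 25% to the test set.)
print (xtrain.shape, xtest.shape)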

from sklearn.preprocessing import StandardScaler


sc_x = StandardScaler()
xtrain = sc_x.fit_transform(xtrain)
xtest = sc_x.transform(xtest)

print (xtrain[0:10, :])
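
#(Optional sanity check, not part of the original lab script: StandardScaler
#standardizes each feature column as z = (x - mean) / std, using statistics
#learned from the training data only, so each scaled training column should
#have mean ~0 and standard deviation ~1.)
print (xtrain.mean(axis = 0), xtrain.std(axis = 0))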

#Finally, we are training our Logistic Regression model.

from sklearn.linear_model import LogisticRegression


classifier = LogisticRegression(random_state = 0)
classifier.fit(xtrain, ytrain)

#After training the model, it is time to use it to make predictions on the test data.


y_pred = classifier.predict(xtest)
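
#(Optional illustration, not part of the original lab script: for a binary
#problem, scikit-learn's logistic regression computes z = w.x + b and passes
#it through the sigmoid 1 / (1 + exp(-z)) to get the probability of the
#second class in classifier.classes_. The manual computation below should
#match predict_proba.)
z = xtest @ classifier.coef_.T + classifier.intercept_
prob_manual = 1 / (1 + np.exp(-z))
print (prob_manual[:5].ravel())                    # manual P(classes_[1]) for first 5 test rows
print (classifier.predict_proba(xtest)[:5, 1])     # should agree with the line above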

#Let’s test the performance of our model – Confusion Matrix

from sklearn.metrics import confusion_matrix


cm = confusion_matrix(ytest, y_pred)
print ("Confusion Matrix : \n", cm)

#Performance measure – Accuracy


from sklearn.metrics import accuracy_score
print ("Accuracy : ", accuracy_score(ytest, y_pred))

#Visualizing the performance of our model.

from matplotlib.colors import ListedColormap


X_set, y_set = xtest, ytest
X1, X2 = np.meshgrid(np.arange(start = X_set[:, 0].min() - 1, stop = X_set[:, 0].max() + 1, step = 0.01),
                     np.arange(start = X_set[:, 1].min() - 1, stop = X_set[:, 1].max() + 1, step = 0.01))

plt.contourf(X1, X2,
             classifier.predict(np.array([X1.ravel(), X2.ravel()]).T).reshape(X1.shape),
             alpha = 0.75, cmap = ListedColormap(('red', 'green')))

plt.xlim(X1.min(), X1.max())
plt.ylim(X2.min(), X2.max())

# Plot the actual test points, coloured by their true class
for i, j in enumerate(np.unique(y_set)):
    plt.scatter(X_set[y_set == j, 0], X_set[y_set == j, 1],
                c = ListedColormap(('red', 'green'))(i), label = j)

plt.title('Classifier (Test set)')


plt.xlabel('Humidity')
plt.ylabel('Windy')
plt.legend()
plt.show()

Output:
[[-0.81649658 -0.81649658]
[-0.81649658 -0.81649658]
[-0.81649658 -0.81649658]
[-0.81649658 -0.81649658]
[ 1.22474487 1.22474487]
[-0.81649658 1.22474487]
[-0.81649658 1.22474487]
[ 1.22474487 -0.81649658]
[ 1.22474487 1.22474487]
[ 1.22474487 -0.81649658]]
Confusion Matrix :
[[1 0]
[0 3]]
Accuracy : 1.0

Analysis:
The results show that, using logistic regression from scikit-learn, the classifier reaches 100 percent accuracy on this (very small) test split, and the decision-region plot shows the two classes, 0 and 1, cleanly separated. For this classification task, logistic regression therefore gives a clearly better result than linear regression.
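
Accuracy alone says little on a test split this small (only four samples), so per-class precision and recall could also be reported. A minimal sketch, assuming the same ytest and y_pred variables from the script above:

from sklearn.metrics import classification_report
print (classification_report(ytest, y_pred))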
