ML Lab6.ipynb - Colaboratory


5/10/22, 12:39 AM 191389_ML_Lab6.ipynb - Colaboratory

DECISION TREE

import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from google.colab import files
uploaded = files.upload()

Saving Bill.csv to Bill.csv

dt_df = pd.read_csv("Bill.csv")

dt_df.describe()

          Variance     Skewness     Curtosis      Entropy        Class
count  1372.000000  1372.000000  1372.000000  1372.000000  1372.000000
mean      0.433735     1.922353     1.397627    -1.191657     0.444606
std       2.842763     5.869047     4.310030     2.101013     0.497103
min      -7.042100   -13.773100    -5.286100    -8.548200     0.000000
25%      -1.773000    -1.708200    -1.574975    -2.413450     0.000000
50%       0.496180     2.319650     0.616630    -0.586650     0.000000
75%       2.821475     6.814625     3.179250     0.394810     1.000000
max       6.824800    12.951600    17.927400     2.449500     1.000000

dt_df.isnull().sum()*100/dt_df.shape[0]

Variance    0.0
Skewness    0.0
Curtosis    0.0
Entropy     0.0
Class       0.0
dtype: float64
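The expression above reports the percentage of missing values per column; here every column is 0% missing. A minimal sketch of the same idiom on a small synthetic frame (the column names `a` and `b` are made up for illustration):

```python
import numpy as np
import pandas as pd

# Tiny synthetic frame with one NaN in column "a"
df = pd.DataFrame({"a": [1.0, np.nan, 3.0, 4.0],
                   "b": [1.0, 2.0, 3.0, 4.0]})

# Same idiom as in the notebook: per-column percentage of missing values
pct_missing = df.isnull().sum() * 100 / df.shape[0]
print(pct_missing)  # a: 25.0, b: 0.0
```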

X_dt = dt_df.drop('Class', axis=1)
y_dt = dt_df['Class']

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X_dt, y_dt, test_size=0.2)
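The split above passes no `random_state`, so every rerun produces a different train/test partition, and the reported metrics can drift between sessions. A sketch of a reproducible, stratified split, shown on synthetic data so it runs standalone; in the notebook the arguments would be `X_dt` and `y_dt`:

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the Bill.csv features
rng = np.random.default_rng(0)
X = pd.DataFrame(rng.normal(size=(100, 4)),
                 columns=["Variance", "Skewness", "Curtosis", "Entropy"])
y = pd.Series(rng.integers(0, 2, size=100), name="Class")

# random_state pins the shuffle; stratify=y keeps the class ratio in both parts
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y)

print(len(X_tr), len(X_te))  # 80 20
```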

https://colab.research.google.com/drive/1fb1GxUhTNcGPM5E0YNyfB3YyegB03GFC#scrollTo=bjc56ErlCh4M&printMode=true 1/5

X_train.head()

     Variance  Skewness  Curtosis  Entropy
286    1.3419   -4.4221    8.0900 -1.73490
412    3.7767    9.7794   -3.9075 -3.53230
493    2.8084   11.3045   -3.3394 -4.41940
369    2.1948    1.3781    1.1582  0.85774
732   -2.7143   11.4535    2.1092 -3.96290

y_train.head()

286    0
412    0
493    0
369    0
732    0
Name: Class, dtype: int64

import statsmodels.api as sm
X_train_sm = sm.add_constant(X_train)
dt_lm = sm.OLS(y_train, X_train_sm).fit()

/usr/local/lib/python3.7/dist-packages/statsmodels/tsa/tsatools.py:117: FutureWarning
x = pd.concat(x[::order], 1)

print(dt_lm.summary())

                            OLS Regression Results
==============================================================================
Dep. Variable:                  Class   R-squared:                       0.867
Model:                            OLS   Adj. R-squared:                  0.866
Method:                 Least Squares   F-statistic:                     1775.
Date:                Thu, 24 Mar 2022   Prob (F-statistic):               0.00
Time:                        18:14:55   Log-Likelihood:                 315.74
No. Observations:                1097   AIC:                            -621.5
Df Residuals:                    1092   BIC:                            -596.5
Df Model:                           4
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const          0.8033      0.008     94.520      0.000       0.787       0.820
Variance      -0.1436      0.002    -59.842      0.000      -0.148      -0.139
Skewness      -0.0787      0.002    -45.004      0.000      -0.082      -0.075
Curtosis      -0.1035      0.002    -47.924      0.000      -0.108      -0.099
Entropy        0.0008      0.004      0.239      0.811      -0.006       0.008
==============================================================================
Omnibus:                      150.601   Durbin-Watson:                   1.938
Prob(Omnibus):                  0.000   Jarque-Bera (JB):              283.478
Skew:                          -0.844   Prob(JB):                     2.78e-62
Kurtosis:                       4.832   Cond. No.                         11.3
==============================================================================

Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.

from sklearn.tree import DecisionTreeClassifier
clf_CART = DecisionTreeClassifier()  # criterion is 'gini' by default
clf_ID3 = DecisionTreeClassifier(criterion='entropy')  # for ID3 the criterion is entropy
clf_CART.fit(X_train, y_train)
clf_ID3.fit(X_train, y_train)

DecisionTreeClassifier(criterion='entropy')

y_pred = clf_CART.predict(X_test)

from sklearn.metrics import confusion_matrix, classification_report
print(confusion_matrix(y_test, y_pred))
print(classification_report(y_test, y_pred))

[[149   4]
 [  1 121]]

              precision    recall  f1-score   support

           0       0.99      0.97      0.98       153
           1       0.97      0.99      0.98       122

    accuracy                           0.98       275
   macro avg       0.98      0.98      0.98       275
weighted avg       0.98      0.98      0.98       275
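Only `clf_CART` is scored above; the entropy-criterion tree can be evaluated the same way. A self-contained sketch comparing the two split criteria on synthetic data (the notebook would instead reuse its own `X_train`/`X_test` split from Bill.csv):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Synthetic stand-in with four features, like the Bill.csv data
X, y = make_classification(n_samples=1000, n_features=4, n_informative=3,
                           n_redundant=1, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

accs = {}
for criterion in ("gini", "entropy"):   # CART-style vs ID3-style split rule
    clf = DecisionTreeClassifier(criterion=criterion, random_state=0)
    clf.fit(X_tr, y_tr)
    accs[criterion] = accuracy_score(y_te, clf.predict(X_te))

print(accs)  # the two criteria usually land within a few points of each other
```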

# CART Decision Tree
from sklearn.tree import plot_tree
plt.figure(figsize=(25, 10))
plot_tree(clf_CART, filled=True)
plt.show()
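`plot_tree` renders the fitted tree as a matplotlib figure (not reproduced in this export). For a dependency-free view of the same splits, `sklearn.tree.export_text` prints the rules as indented text. A minimal sketch on the built-in iris data, since the notebook's fitted `clf_CART` is not available here:

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_iris(return_X_y=True)
clf = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

# Text rendering of the split thresholds, one indented line per node
rules = export_text(clf, feature_names=["sepal_len", "sepal_wid",
                                        "petal_len", "petal_wid"])
print(rules)
```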


# ID3 Decision Tree
from sklearn.tree import plot_tree
plt.figure(figsize=(25, 10))
plot_tree(clf_ID3, filled=True)
plt.show()

