Assignment5 VidulGarg

Market Basket Magic: Extracting Insights for Retail Success
Customer segmentation is a crucial aspect of retail and marketing strategy. Mall Customer
Segmentation is a common data analysis project that involves categorizing mall customers into
distinct groups or segments based on various characteristics and behaviors. This segmentation
is valuable for tailoring marketing efforts, optimizing store layouts, and enhancing customer
experiences.
UNSUPERVISED LEARNING
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
df=pd.read_csv(r"D:\MachineLearning\DataScienceCourse\Mall_Customers.csv")
df
     CustomerID  Gender  Age  Annual Income (k$)  Spending Score (1-100)
0             1    Male   19                  15                      39
1             2    Male   21                  15                      81
2             3  Female   20                  16                       6
3             4  Female   23                  16                      77
4             5  Female   31                  17                      40
..          ...     ...  ...                 ...                     ...
195         196  Female   35                 120                      79
196         197  Female   45                 126                      28
197         198    Male   32                 126                      74
198         199    Male   32                 137                      18
199         200    Male   30                 137                      83

[200 rows x 5 columns]
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 200 entries, 0 to 199
Data columns (total 5 columns):
 #   Column                  Non-Null Count  Dtype
---  ------                  --------------  -----
 0   CustomerID              200 non-null    int64
 1   Gender                  200 non-null    object
 2   Age                     200 non-null    int64
 3   Annual Income (k$)      200 non-null    int64
 4   Spending Score (1-100)  200 non-null    int64
dtypes: int64(4), object(1)
memory usage: 7.9+ KB
plt.hist(df["Spending Score (1-100)"], bins=10)
(array([16., 20., 10., 17., 35., 37., 11., 24., 14., 16.]),
array([ 1. , 10.8, 20.6, 30.4, 40.2, 50. , 59.8, 69.6, 79.4, 89.2,
99. ]),
<BarContainer object of 10 artists>)
df['Gender'].value_counts().plot(kind='bar')
<Axes: xlabel='Gender'>
Female customers outnumber male customers.
# encoding
df['Gender']=df['Gender'].replace({'Male':1,'Female':0})
df.head(5)
   CustomerID  Gender  Age  Annual Income (k$)  Spending Score (1-100)
0           1       1   19                  15                      39
1           2       1   21                  15                      81
2           3       0   20                  16                       6
3           4       0   23                  16                      77
4           5       0   31                  17                      40
# Select the columns at positions 3 and 4 (Annual Income, Spending Score) as a NumPy array
x = df.iloc[:, [3, 4]].values
x
array([[ 15, 39],

[ 15, 81],
[ 16, 6],
[ 16, 77],
[ 17, 40],
[ 17, 76],
[ 18, 6],
[ 18, 94],
[ 19, 3],
[ 19, 72],
[ 19, 14],
[ 19, 99],
[ 20, 15],
[ 20, 77],
[ 20, 13],
[ 20, 79],
[ 21, 35],
[ 21, 66],
[ 23, 29],
[ 23, 98],
[ 24, 35],
[ 24, 73],
[ 25, 5],
[ 25, 73],
[ 28, 14],
[ 28, 82],
[ 28, 32],
[ 28, 61],
[ 29, 31],
[ 29, 87],
[ 30, 4],
[ 30, 73],
[ 33, 4],
[ 33, 92],
[ 33, 14],
[ 33, 81],
[ 34, 17],
[ 34, 73],
[ 37, 26],
[ 37, 75],
[ 38, 35],
[ 38, 92],
[ 39, 36],
[ 39, 61],
[ 39, 28],
[ 39, 65],
[ 40, 55],
[ 40, 47],
[ 40, 42],
[ 40, 42],
[ 42, 52],
[ 42, 60],
[ 43, 54],
[ 43, 60],
[ 43, 45],
[ 43, 41],
[ 44, 50],
[ 44, 46],
[ 46, 51],
[ 46, 46],
[ 46, 56],
[ 46, 55],
[ 47, 52],
[ 47, 59],
[ 48, 51],
[ 48, 59],
[ 48, 50],
[ 48, 48],
[ 48, 59],
[ 48, 47],
[ 49, 55],
[ 49, 42],
[ 50, 49],
[ 50, 56],
[ 54, 47],
[ 54, 54],
[ 54, 53],
[ 54, 48],
[ 54, 52],
[ 54, 42],
[ 54, 51],
[ 54, 55],
[ 54, 41],
[ 54, 44],
[ 54, 57],
[ 54, 46],
[ 57, 58],
[ 57, 55],
[ 58, 60],
[ 58, 46],
[ 59, 55],
[ 59, 41],
[ 60, 49],
[ 60, 40],
[ 60, 42],
[ 60, 52],
[ 60, 47],
[ 60, 50],
[ 61, 42],
[ 61, 49],
[ 62, 41],
[ 62, 48],
[ 62, 59],
[ 62, 55],
[ 62, 56],
[ 62, 42],
[ 63, 50],
[ 63, 46],
[ 63, 43],
[ 63, 48],
[ 63, 52],
[ 63, 54],
[ 64, 42],
[ 64, 46],
[ 65, 48],
[ 65, 50],
[ 65, 43],
[ 65, 59],
[ 67, 43],
[ 67, 57],
[ 67, 56],
[ 67, 40],
[ 69, 58],
[ 69, 91],
[ 70, 29],
[ 70, 77],
[ 71, 35],
[ 71, 95],
[ 71, 11],
[ 71, 75],
[ 71, 9],
[ 71, 75],
[ 72, 34],
[ 72, 71],
[ 73, 5],
[ 73, 88],
[ 73, 7],
[ 73, 73],
[ 74, 10],
[ 74, 72],
[ 75, 5],
[ 75, 93],
[ 76, 40],
[ 76, 87],
[ 77, 12],
[ 77, 97],
[ 77, 36],
[ 77, 74],
[ 78, 22],
[ 78, 90],
[ 78, 17],
[ 78, 88],
[ 78, 20],
[ 78, 76],
[ 78, 16],
[ 78, 89],
[ 78, 1],
[ 78, 78],
[ 78, 1],
[ 78, 73],
[ 79, 35],
[ 79, 83],
[ 81, 5],
[ 81, 93],
[ 85, 26],
[ 85, 75],
[ 86, 20],
[ 86, 95],
[ 87, 27],
[ 87, 63],
[ 87, 13],
[ 87, 75],
[ 87, 10],
[ 87, 92],
[ 88, 13],
[ 88, 86],
[ 88, 15],
[ 88, 69],
[ 93, 14],
[ 93, 90],
[ 97, 32],
[ 97, 86],
[ 98, 15],
[ 98, 88],
[ 99, 39],
[ 99, 97],
[101, 24],
[101, 68],
[103, 17],
[103, 85],
[103, 23],
[103, 69],
[113, 8],
[113, 91],
[120, 16],
[120, 79],
[126, 28],
[126, 74],
[137, 18],
[137, 83]], dtype=int64)
Elbow Method
from sklearn.cluster import KMeans
k_values=range(1,11)
wcss=[]
for i in k_values:
    # Fit K-Means with i clusters and record the within-cluster sum of squares
    model=KMeans(n_clusters=i)
    model.fit(x)
    wcss.append(model.inertia_)
C:\Users\Vidul\AppData\Local\Programs\Python\Python311\Lib\site-
packages\sklearn\cluster\_kmeans.py:1412: FutureWarning: The default
value of `n_init` will change from 10 to 'auto' in 1.4. Set the value
of `n_init` explicitly to suppress the warning
super()._check_params_vs_input(X, default_n_init=10)
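The `inertia_` value collected in the loop above is the within-cluster sum of squares (WCSS): the sum of squared distances from each point to the centroid of its assigned cluster. As a minimal sketch of that quantity, the snippet below recomputes it by hand for the first four rows of `x`. The two-cluster labels and centroids here are assumptions chosen for illustration, not output from the fitted model.

```python
def within_cluster_ss(points, labels, centroids):
    """Sum of squared Euclidean distances from each point to its centroid."""
    total = 0.0
    for (px, py), label in zip(points, labels):
        cx, cy = centroids[label]
        total += (px - cx) ** 2 + (py - cy) ** 2
    return total

# First four (income, score) rows of x from above.
points = [(15, 39), (15, 81), (16, 6), (16, 77)]
# Assumed grouping for illustration: low spenders vs. high spenders.
labels = [0, 1, 0, 1]
# Centroids are the per-group means of the points above.
centroids = [(15.5, 22.5), (15.5, 79.0)]

toy_wcss = within_cluster_ss(points, labels, centroids)
print(toy_wcss)
```

Adding clusters can only keep this total the same or lower it, which is why the elbow plot looks for the point where the decrease flattens rather than for the minimum.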
plt.plot(k_values,wcss,marker='o',linestyle='-',color='b')
# Set the x-ticks to display values from 1 to 10
plt.xticks(range(1, 11))
plt.title('Elbow Method')
plt.grid(True)
The curve bends sharply (the "elbow") at k = 5, so 5 is taken as the optimal number of clusters.
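Reading the elbow off a plot is a judgment call. One rough way to cross-check it is to look for the k where the WCSS curve has the largest second difference, that is, where the drop slows down the most. The sketch below uses made-up WCSS values (`toy_wcss` is not the notebook's real output) and a hypothetical helper name. The heuristic will not always agree with visual inspection.

```python
def elbow_by_second_difference(k_values, wcss):
    """Return the k whose WCSS has the largest discrete second difference."""
    best_k, best_curvature = None, float("-inf")
    # Endpoints have no neighbour on one side, so they are skipped.
    for i in range(1, len(wcss) - 1):
        curvature = wcss[i - 1] - 2 * wcss[i] + wcss[i + 1]
        if curvature > best_curvature:
            best_k, best_curvature = k_values[i], curvature
    return best_k

k_values = list(range(1, 11))
# Illustrative values only: a steady drop up to k=5, then a slow decline.
toy_wcss = [1000, 800, 600, 400, 200, 180, 165, 150, 140, 130]

elbow_k = elbow_by_second_difference(k_values, toy_wcss)
print(elbow_k)
```

With real data the curve is usually smoother than this toy series, so the result is best treated as a second opinion alongside the plot.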

KMeans()
model=KMeans(n_clusters=5, init='k-means++', random_state=42)
y_pred=model.fit_predict(x)
y_pred
array([4, 2, 4, 2, 4, 2, 4, 2, 4, 2, 4, 2, 4, 2, 4, 2, 4, 2, 4, 2, 4,
2,
4, 2, 4, 2, 4, 2, 4, 2, 4, 2, 4, 2, 4, 2, 4, 2, 4, 2, 4, 2, 4,
0,
4, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 3, 1, 0, 1, 3, 1, 3,
1,
0, 1, 3, 1, 3, 1, 3, 1, 3, 1, 0, 1, 3, 1, 3, 1, 3, 1, 3, 1, 3,
1,
3, 1, 3, 1, 3, 1, 3, 1, 3, 1, 3, 1, 3, 1, 3, 1, 3, 1, 3, 1, 3,
1,
3, 1, 3, 1, 3, 1, 3, 1, 3, 1, 3, 1, 3, 1, 3, 1, 3, 1, 3, 1, 3,
1,
3, 1])
To Visualize the clusters:

k=5
colors=['blue','green','red','cyan','magenta']
plt.figure(figsize=(5,4))
# Create one scatter series per cluster
for i in range(k):
    cluster_data=x[y_pred==i]
    plt.scatter(cluster_data[:,0],
                cluster_data[:,1],
                s=100,
                c=colors[i],
                label=f'cluster{i+1}')
plt.xlabel('Annual Income (k$)')
plt.ylabel('Spending Score (1-100)')
# Plot the cluster centroids
plt.scatter(model.cluster_centers_[:,0],
            model.cluster_centers_[:,1],
            s=300,
            c='yellow',
            label='centroid')
plt.legend(loc='upper left', bbox_to_anchor=(1, 1))
<matplotlib.legend.Legend at 0x22064dff1d0>
Cluster 1 (Blue): customers with average income and average spending
Cluster 2 (Green): customers with high income and high spending -----> TARGET CUSTOMERS
Cluster 3 (Red): customers with low income but high spending
Cluster 4 (Cyan): customers with high income but low spending
Cluster 5 (Magenta): customers with low income and low spending
Cluster 2 customers are the target customers.
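The descriptions above can be checked numerically by averaging income and spending score per cluster label. The sketch below does this for ten rows taken from `x` and `y_pred` above. It is a small sample for illustration rather than the full dataset, and the helper name `cluster_means` is hypothetical. Label `i` in `y_pred` corresponds to "Cluster i+1" in the plot legend.

```python
from collections import defaultdict

# (annual income k$, spending score, y_pred label) for ten rows from above.
sample = [
    (15, 39, 4), (15, 81, 2), (16, 6, 4), (16, 77, 2),
    (40, 55, 0), (40, 47, 0),
    (69, 91, 1), (70, 77, 1), (70, 29, 3), (71, 11, 3),
]

def cluster_means(rows):
    """Map each label to (mean income, mean spending score)."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for income, score, label in rows:
        acc = sums[label]
        acc[0] += income
        acc[1] += score
        acc[2] += 1
    return {label: (s[0] / s[2], s[1] / s[2]) for label, s in sorted(sums.items())}

profile = cluster_means(sample)
overall_income = sum(r[0] for r in sample) / len(sample)
overall_score = sum(r[1] for r in sample) / len(sample)

# Labels whose mean income and mean score both exceed the sample averages.
targets = [label for label, (inc, sc) in profile.items()
           if inc > overall_income and sc > overall_score]
print(profile)
print(targets)
```

On this sample only label 1 (Cluster 2 in the legend) is above average on both axes, which matches the high-income, high-spending group singled out above.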
