Experiment 2 KMeans Clustering

The document outlines an experiment to implement K-Means clustering for customer segmentation using Python and scikit-learn. It includes software requirements, a dataset description, and a step-by-step procedure for loading data, determining the optimal number of clusters using the Elbow method, applying K-Means, and visualizing the results. The expected output includes an Elbow plot and a scatter plot of customer segments based on income and spending score.

Uploaded by

Anant More

Experiment 2: Customer Segmentation using K-Means Clustering

Aim:
To implement the K-Means clustering algorithm for customer segmentation using Python and
scikit-learn.

Software Requirements:
Python 3.x, Jupyter Notebook, pandas, matplotlib, seaborn, scikit-learn

Dataset:
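Sample customer dataset (Mall_Customers.csv) with features such as Age, Annual Income (k$), and Spending Score (1-100). Only Annual Income and Spending Score are used for clustering.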

Procedure:
1. Import necessary libraries.
2. Load the dataset.
3. Explore and visualize the dataset using scatter plots.
4. Use the Elbow method to determine the optimal number of clusters (k).
5. Apply the K-Means clustering algorithm using the determined value of k.
6. Visualize the clusters formed.
7. Interpret the results for business insights.

Program:

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.cluster import KMeans

# Load dataset
data = pd.read_csv('Mall_Customers.csv')
X = data[['Annual Income (k$)', 'Spending Score (1-100)']]

# Elbow method to find optimal k
wcss = []
for i in range(1, 11):
    kmeans = KMeans(n_clusters=i, init='k-means++', random_state=42)
    kmeans.fit(X)
    wcss.append(kmeans.inertia_)  # within-cluster sum of squares for k = i

plt.plot(range(1, 11), wcss)
plt.title('Elbow Method')
plt.xlabel('Number of clusters')
plt.ylabel('WCSS')
plt.show()

# Apply K-Means
kmeans = KMeans(n_clusters=5, init='k-means++', random_state=42)
y_kmeans = kmeans.fit_predict(X)

# Visualizing clusters: one scatter call per cluster label, then the centroids
colors = ['red', 'blue', 'green', 'cyan', 'magenta']
for cluster, color in enumerate(colors):
    mask = y_kmeans == cluster
    plt.scatter(X.values[mask, 0], X.values[mask, 1], s=100, c=color,
                label=f'Cluster {cluster + 1}')
plt.scatter(kmeans.cluster_centers_[:, 0], kmeans.cluster_centers_[:, 1],
            s=300, c='yellow', label='Centroids')
plt.title('Customer Segments')
plt.xlabel('Annual Income (k$)')
plt.ylabel('Spending Score (1-100)')
plt.legend()
plt.show()
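
Note on WCSS:
The inertia_ values collected in the Elbow loop are within-cluster sums of squares (WCSS): the sum of squared distances from each point to the centroid of its cluster. The following is a minimal pure-Python sketch of the two steps K-Means alternates between (assigning points to the nearest centroid, then moving each centroid to the mean of its points) and of the WCSS computation. It is an illustration only, not scikit-learn's implementation: the function names and the toy points are invented for this example, and the starting centroids are passed in by hand rather than chosen by k-means++.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign(points, centroids):
    """Assignment step: index of the nearest centroid for each point."""
    return [min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))
            for p in points]


def update(points, labels, centroids):
    """Update step: move each centroid to the mean of its assigned points."""
    new_centroids = []
    for j, old in enumerate(centroids):
        members = [p for p, lab in zip(points, labels) if lab == j]
        if not members:
            new_centroids.append(old)  # an empty cluster keeps its old centroid
            continue
        new_centroids.append(tuple(sum(m[d] for m in members) / len(members)
                                   for d in range(len(old))))
    return new_centroids


def wcss(points, labels, centroids):
    """Sum of squared distances from each point to its assigned centroid."""
    return sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))


def kmeans(points, centroids, max_iter=100):
    """Alternate assignment and update until the labels stop changing."""
    centroids = [tuple(c) for c in centroids]
    labels = assign(points, centroids)
    for _ in range(max_iter):
        centroids = update(points, labels, centroids)
        new_labels = assign(points, centroids)
        if new_labels == labels:
            break
        labels = new_labels
    return labels, centroids


# Toy 2-D points, invented for illustration only (not from Mall_Customers.csv).
toy_points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
toy_labels, toy_centroids = kmeans(toy_points, centroids=[(0.0, 0.0), (10.0, 10.0)])
toy_wcss = wcss(toy_points, toy_labels, toy_centroids)
print(toy_labels, toy_centroids, toy_wcss)
```

Adding more clusters can only keep WCSS the same or lower it, which is why the Elbow plot looks for the value of k after which the decrease flattens out rather than for a minimum.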

Sample Output:
The output consists of:

- Elbow plot showing the optimal number of clusters (typically 5 for this dataset).
- Scatter plot of customers segmented into clusters.
- Different customer segments visualized based on income and spending score.
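
Interpreting the segments:
Step 7 of the procedure (interpreting the results) is not covered by the program. One possible starting point is to compare the size, average income, and average spending score of each cluster. The sketch below does this in plain Python on a few invented (income, score) pairs with hand-assigned labels; summarize_clusters is a hypothetical helper written for this example, not part of scikit-learn or pandas.

```python
from collections import defaultdict


def summarize_clusters(points, labels):
    """Return {label: (count, mean_income, mean_score)} for 2-D points."""
    groups = defaultdict(list)
    for point, label in zip(points, labels):
        groups[label].append(point)
    summary = {}
    for label, members in sorted(groups.items()):
        n = len(members)
        mean_income = sum(m[0] for m in members) / n
        mean_score = sum(m[1] for m in members) / n
        summary[label] = (n, mean_income, mean_score)
    return summary


# Invented (annual income in k$, spending score) pairs and labels,
# for illustration only; they are not taken from the dataset.
demo_points = [(20, 80), (30, 70), (90, 10), (80, 20), (85, 90)]
demo_labels = [0, 0, 1, 1, 2]

demo_summary = summarize_clusters(demo_points, demo_labels)
for label, (n, income, score) in demo_summary.items():
    print(f"Cluster {label + 1}: n={n}, mean income={income:.1f} k$, mean score={score:.1f}")
```

On the real data, an equivalent summary should be obtainable with pandas, for example X.assign(Cluster=y_kmeans).groupby('Cluster').mean(). A cluster with high income but a low spending score, for instance, might be read as a segment worth targeting with promotions, though that reading is a business judgment and not an output of the algorithm.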

Viva Questions:
- What is the purpose of customer segmentation?
- How does the K-Means algorithm work?
- What is the Elbow method?
- What are the limitations of K-Means clustering?