Class 6: Unsupervised Learning (Clustering)

Theory and Project on Customer Segmentation

By: Arpit Kumar Sharma


Introduction to Unsupervised Learning

What is Unsupervised Learning?

- Learning from unlabeled data.
- No predefined output labels.

Types of Unsupervised Learning:

- Clustering: Grouping similar data points (e.g., K-Means, Hierarchical Clustering).
- Dimensionality Reduction: Reducing the number of features (e.g., PCA, t-SNE).
- Association: Finding relationships between variables (e.g., Market Basket Analysis).

Clustering Explained

What is Clustering?

- A technique to group similar data points together.
- Used for pattern recognition, data exploration, and segmentation.

Key Concepts:

- Cluster: A group of similar data points.
- Centroid: The center of a cluster (e.g., in K-Means).
- Distance Metric: Measures how far apart data points are; a smaller distance means greater similarity (e.g., Euclidean distance).
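
The centroid and distance-metric concepts above can be made concrete with a few lines of Python. This is a minimal sketch using only the standard library; the function names `euclidean_distance` and `centroid` and the sample points are illustrative assumptions, not taken from the slides.

```python
import math


def euclidean_distance(p, q):
    """Straight-line distance between two points of equal dimension."""
    return math.dist(p, q)


def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


# Three made-up 2-D points forming one cluster.
cluster = [(1.0, 2.0), (3.0, 4.0), (5.0, 0.0)]
center = centroid(cluster)

print("centroid:", center)
print("distance from (0, 0) to (3, 4):", euclidean_distance((0.0, 0.0), (3.0, 4.0)))
```

Each point's distance to `center` is the quantity K-Means compares when deciding which cluster a point belongs to.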
Types of Clustering Algorithms

• Partitioning Methods:
  - K-Means, K-Medoids.

• Hierarchical Methods:
  - Agglomerative, Divisive.

• Density-Based Methods:
  - DBSCAN.

• Model-Based Methods:
  - Gaussian Mixture Models (GMM).
K-Means Clustering

• How K-Means Works:
  1. Choose the number of clusters (K).
  2. Initialize K centroids randomly.
  3. Assign each data point to the nearest centroid.
  4. Update each centroid to the mean of the points assigned to it.
  5. Repeat steps 3 and 4 until convergence (the centroids stop moving).

• Advantages:
  - Simple and fast.
  - Works well with large datasets.

• Limitations:
  - Requires K to be chosen in advance.
  - Sensitive to initial centroid positions.
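
The five steps listed under "How K-Means Works" can be sketched in NumPy as follows. This is an illustrative implementation, not a reference one: the function name `k_means`, the `max_iter` and `seed` parameters, the choice of K data points as initial centroids, and the sample data are all assumptions made for the example.

```python
import numpy as np


def k_means(X, k, max_iter=100, seed=0):
    """Plain K-Means on an (n_samples, n_features) array.

    Returns (centroids, labels). `max_iter` and `seed` are illustrative
    parameters, not part of the slides.
    """
    rng = np.random.default_rng(seed)
    # Step 2: pick k distinct data points as the initial centroids.
    centroids = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(max_iter):
        # Step 3: assign each point to its nearest centroid (Euclidean).
        distances = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        # Step 4: move each centroid to the mean of its assigned points;
        # a centroid with no assigned points keeps its previous position.
        new_centroids = centroids.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:
                new_centroids[j] = members.mean(axis=0)
        # Step 5: stop once the centroids no longer move.
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return centroids, labels


# Made-up data: two small groups of 2-D points.
X = np.array([[1.0, 1.0], [1.2, 0.8], [0.8, 1.1],
              [8.0, 8.0], [8.2, 7.9], [7.9, 8.3]])
centroids, labels = k_means(X, k=2)
print(labels)
print(centroids)
```

Changing `seed` changes the starting centroids, which is one way to observe the sensitivity to initialization noted under Limitations; K itself still has to be supplied by the caller.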
Project: Customer Segmentation

• Objective:
  - Group customers based on purchasing behavior, demographics, or preferences.

• Dataset:
  - Customer data (e.g., age, income, spending score).

• Steps:
  1. Preprocess data (e.g., scaling, handling missing values).
  2. Apply K-Means clustering.
  3. Analyze clusters to identify customer segments.
  4. Visualize results using scatter plots or bar charts.
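
The project steps above could look roughly like this with scikit-learn. The slides do not name a library or a dataset, so everything concrete here is an assumption: the synthetic customer table, the column order (age, income, spending score), median imputation for missing values, and K = 4.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a customer table (values are made up).
# Columns: age, income, spending score.
rng = np.random.default_rng(42)
customers = np.column_stack([
    rng.integers(18, 70, size=200),   # age in years
    rng.normal(60, 20, size=200),     # income, arbitrary units
    rng.uniform(1, 100, size=200),    # spending score
]).astype(float)
customers[::25, 1] = np.nan           # simulate a few missing incomes

# Step 1: fill missing values with the column median, then scale so that
# every feature contributes comparably to Euclidean distances.
imputed = SimpleImputer(strategy="median").fit_transform(customers)
scaled = StandardScaler().fit_transform(imputed)

# Step 2: apply K-Means (K = 4 is an assumption for this sketch).
model = KMeans(n_clusters=4, n_init=10, random_state=0)
labels = model.fit_predict(scaled)

# Step 3: summarize each cluster in the original units.
for c in range(4):
    members = imputed[labels == c]
    print(f"cluster {c}: size={len(members)}, "
          f"mean age/income/score={members.mean(axis=0).round(1)}")
```

For step 4, one common choice is a scatter plot of income against spending score with points colored by `labels`, for example via matplotlib's `plt.scatter(imputed[:, 1], imputed[:, 2], c=labels)`.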
Example of Customer Segmentation

• Cluster 1: High income, low spending (e.g., cautious spenders).
• Cluster 2: High income, high spending (e.g., premium customers).
• Cluster 3: Low income, low spending (e.g., budget-conscious).
• Cluster 4: Low income, high spending (e.g., impulsive buyers).
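
Once clusters have been found, their average income and spending can be mapped to the four segment names above. The sketch below assumes each cluster is summarized by its mean income and mean spending score and that "high" means at or above a chosen cutoff; the function name `name_segment` and the cutoff values are hypothetical.

```python
def name_segment(income, spending, income_cutoff, spending_cutoff):
    """Map a cluster's mean income and spending to a segment name."""
    high_income = income >= income_cutoff
    high_spending = spending >= spending_cutoff
    if high_income and not high_spending:
        return "Cautious spenders"
    if high_income and high_spending:
        return "Premium customers"
    if not high_income and not high_spending:
        return "Budget-conscious"
    return "Impulsive buyers"  # low income, high spending


# Made-up cluster averages as (mean income, mean spending score).
cluster_means = [(85.0, 20.0), (90.0, 80.0), (25.0, 15.0), (30.0, 75.0)]
for income, spending in cluster_means:
    print(name_segment(income, spending, income_cutoff=60.0, spending_cutoff=50.0))
```

In practice the cutoffs might be the overall median income and median spending score, so that "high" and "low" are relative to the customer base being segmented.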
Q&A
Thank You!
