Class 6: Unsupervised Learning (Clustering)
Theory and Project on Customer Segmentation
By: Arpit Kumar Sharma
Introduction to Unsupervised Learning
What is Unsupervised Learning?
- Learning from unlabeled data.
- No predefined output labels.
Types of Unsupervised Learning:
- Clustering: Grouping similar data points (e.g., K-Means, Hierarchical Clustering).
- Dimensionality Reduction: Reducing the number of features (e.g., PCA, t-SNE).
- Association: Finding relationships between variables (e.g., Market Basket Analysis).
Clustering Explained
What is Clustering?
- A technique to group similar data points together.
- Used for pattern recognition, data exploration, and segmentation.
Key Concepts:
- Cluster: A group of similar data points.
- Centroid: The center of a cluster (e.g., in K-Means).
- Distance Metric: Measures similarity between data points (e.g., Euclidean distance).
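The centroid and distance-metric concepts can be made concrete with a few lines of Python. This is a minimal sketch using only the standard library; the helper names `euclidean_distance` and `centroid` and the sample points are illustrative assumptions, not part of the course material.

```python
import math

def euclidean_distance(p, q):
    """Straight-line distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def centroid(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

# Hypothetical cluster of three 2-D points.
cluster = [(1.0, 2.0), (3.0, 4.0), (5.0, 0.0)]
print(centroid(cluster))                            # (3.0, 2.0)
print(euclidean_distance((0.0, 0.0), (3.0, 4.0)))   # 5.0
```

A smaller distance means two points are more alike, which is why K-Means assigns each point to the centroid it is closest to.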
Types of Clustering Algorithms
• Partitioning Methods:
  - K-Means, K-Medoids.
• Hierarchical Methods:
  - Agglomerative, Divisive.
• Density-Based Methods:
  - DBSCAN.
• Model-Based Methods:
  - Gaussian Mixture Models (GMM).
K-Means Clustering
• How K-Means Works:
  1. Choose the number of clusters (K).
  2. Initialize centroids randomly.
  3. Assign each data point to the nearest centroid.
  4. Update centroids by calculating the mean of assigned points.
  5. Repeat steps 3 and 4 until convergence (assignments stop changing).
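The five steps above can be sketched in plain Python. This is an illustrative version, not a reference implementation: the function name `kmeans`, the `max_iter` and `seed` parameters, the choice to initialize from randomly sampled data points, and the sample data are all assumptions made for the example. In practice a library implementation would usually be preferred.

```python
import math
import random

def kmeans(points, k, max_iter=100, seed=0):
    """Return (centroids, labels) for a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    # Step 2: pick k data points at random as the initial centroids.
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Step 3: assign each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Step 5: stop once the assignments no longer change.
        if new_labels == labels:
            break
        labels = new_labels
        # Step 4: move each centroid to the mean of its assigned points.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, labels

# Hypothetical data: two visibly separated groups, so K = 2 (step 1).
data = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, labels = kmeans(data, k=2)
print(centroids)
print(labels)
```

The first three points should end up sharing one label and the last three the other; which group is numbered 0 depends on the random initialization.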
• Advantages:
  - Simple and fast.
  - Works well with large datasets.
• Limitations:
  - Requires predefined K.
  - Sensitive to initial centroid positions.
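The sensitivity to initial centroids can be seen on a tiny, hand-made example. The sketch below is an assumption-laden illustration: the helper `lloyd`, the four-point dataset, and both starting positions are hypothetical. It runs the K-Means update loop from two different starts and compares the inertia (sum of squared distances from each point to its assigned centroid).

```python
import math

def lloyd(points, centroids, max_iter=100):
    """Run K-Means updates from the given starting centroids.

    Returns (final_centroids, inertia).
    """
    centroids = list(centroids)
    k = len(centroids)
    for _ in range(max_iter):
        labels = [min(range(k), key=lambda j: math.dist(p, centroids[j]))
                  for p in points]
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            # Keep the old centroid if no point was assigned to it.
            updated.append(tuple(sum(c) / len(members) for c in zip(*members))
                           if members else centroids[j])
        if updated == centroids:  # converged: centroids stopped moving
            break
        centroids = updated
    inertia = sum(min(math.dist(p, c) for c in centroids) ** 2 for p in points)
    return centroids, inertia

# Four corners of a wide rectangle, K = 2.
corners = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0)]
_, left_right = lloyd(corners, [(0.0, 0.5), (4.0, 0.5)])  # left pair vs right pair
_, top_bottom = lloyd(corners, [(2.0, 0.0), (2.0, 1.0)])  # bottom pair vs top pair
print(left_right, top_bottom)  # 1.0 16.0
```

Both runs converge, but the second settles on a much worse grouping. A common mitigation is to run K-Means from several random starts and keep the result with the lowest inertia.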
Project: Customer Segmentation
• Objective:
  - Group customers based on purchasing behavior, demographics, or preferences.
• Dataset:
  - Customer data (e.g., age, income, spending score).
• Steps:
  1. Preprocess data (e.g., scaling, handling missing values).
  2. Apply K-Means clustering.
  3. Analyze clusters to identify customer segments.
  4. Visualize results using scatter plots or bar charts.
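Step 1 matters because K-Means relies on distances, so a feature measured on a large scale (such as income) would otherwise dominate one measured on a small scale (such as age). The sketch below shows one possible preprocessing pass: it fills missing values with the column mean and then standardizes each column to zero mean and unit variance. The function name `impute_and_scale`, the use of `None` for missing values, and the sample rows are assumptions for illustration only.

```python
from statistics import mean, pstdev

def impute_and_scale(rows):
    """Fill None with the column mean, then z-score each column."""
    scaled_columns = []
    for col in zip(*rows):
        present = [v for v in col if v is not None]
        col_mean = mean(present)
        # Mean imputation leaves the column mean unchanged.
        filled = [col_mean if v is None else v for v in col]
        spread = pstdev(filled) or 1.0  # avoid dividing by zero for constant columns
        scaled_columns.append([(v - col_mean) / spread for v in filled])
    return [tuple(row) for row in zip(*scaled_columns)]

# Hypothetical customers: (age, income, spending score); one income is missing.
rows = [(25, 40.0, 80), (40, None, 20), (35, 90.0, 15), (22, 30.0, 75)]
scaled = impute_and_scale(rows)
for row in scaled:
    print(row)
```

The scaled rows can then be passed to K-Means in step 2. Mean imputation is only one option; dropping incomplete rows is another.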
Example of Customer Segmentation
Cluster 1: High income, low spending (e.g., cautious spenders).
Cluster 2: High income, high spending (e.g., premium customers).
Cluster 3: Low income, low spending (e.g., budget-conscious).
Cluster 4: Low income, high spending (e.g., impulsive buyers).
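Once clusters are found, the four segment names can be attached by comparing each cluster's average income and spending score against a cutoff. This is a rough sketch: the function `name_segment`, the cutoff values, and the centroid numbers are hypothetical, and real cutoffs would depend on the dataset.

```python
def name_segment(income, spending, income_cut, spending_cut):
    """Map a cluster's average income and spending to one of four segment names."""
    high_income = income >= income_cut
    high_spending = spending >= spending_cut
    if high_income and not high_spending:
        return "cautious spenders"
    if high_income and high_spending:
        return "premium customers"
    if not high_income and not high_spending:
        return "budget-conscious"
    return "impulsive buyers"  # low income, high spending

# Hypothetical cluster centroids as (average income, average spending score).
centroids = [(85.0, 18.0), (88.0, 82.0), (26.0, 20.0), (25.0, 79.0)]
names = [name_segment(inc, spend, income_cut=50.0, spending_cut=50.0)
         for inc, spend in centroids]
print(names)
```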
Q&A
Thank You!