Clustering Part 2

The document compares various clustering algorithms, highlighting K-Means, Agglomerative, and DBSCAN. It emphasizes DBSCAN's advantages, such as not requiring a predefined number of clusters and its ability to identify noise and non-spherical clusters. Additionally, it discusses the application of DBSCAN in retail for detecting customer groupings and outliers, along with a group assignment related to clustering analysis.

Uploaded by

Aditya Solanki
Copyright
© All Rights Reserved
Clustering beyond K-Means

Comparison of Clustering Algorithms

K-Means
• Works only on numeric data
• Computationally expensive
• Does not show the process of cluster formation
• Influenced by outliers; forces noise points into a cluster
• Does not provide a hierarchical presentation
• Requires identifying the right value of k

Agglomerative
• Intuitive and fast
• Less computationally expensive
• Can aid in selecting the correct k value
• Provides a hierarchical order among clusters
• Less impacted by outliers

DBSCAN
• No need to specify the value of k
• Can find non-spherical clusters
• Can identify noise in the data
• Robust to outliers, which it labels as noise rather than assigning to a cluster
Hierarchical Clustering: Dendrogram
Linkage Functions
DBSCAN
Choosing DBSCAN Parameters

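• MinPts: minimum number of points required in a neighborhood for it to count as a dense region. Common rules of thumb:

MinPts >= D + 1 (D is the number of dimensions)

MinPts = 2 * D

MinPts = ln(N) (N is the number of observations in the data)

• Radius (often called eps): chosen by trial and error. A k-distance plot can help: sort each point's distance to its k-th nearest neighbor and pick the radius at the elbow point.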
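
The rules of thumb above can be sketched in a few lines of Python. This is only an illustration using the standard library: the function names `minpts_candidates` and `k_distances` and the toy points are assumptions made for this example, not part of the original slides. The sorted k-distances are the values one would plot to look for the elbow.

```python
import math

def minpts_candidates(n_obs, n_dims):
    """Return the three rule-of-thumb MinPts values for N observations, D dimensions."""
    return {
        "D + 1 (lower bound)": n_dims + 1,
        "2 * D": 2 * n_dims,
        "ln(N)": max(1, round(math.log(n_obs))),
    }

def k_distances(points, k):
    """Sorted distance from each point to its k-th nearest neighbor (self excluded)."""
    result = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        result.append(dists[k - 1])
    return sorted(result)

# Hypothetical toy data: a tight group of five points plus one far-away point.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (0.5, 0.5), (10.0, 10.0)]

print(minpts_candidates(n_obs=1000, n_dims=2))
kd = k_distances(points, k=2)
print([round(d, 3) for d in kd])
```

In the sorted k-distances, the values stay near 1 for the five grouped points and then jump sharply for the isolated point. A radius chosen just below that jump (the elbow) keeps the dense group together and leaves the isolated point outside every neighborhood.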
DBSCAN for Retail

• Detect natural groupings of customers based on shopping behavior without prior assumptions.

• Identify outliers such as fraudulent transactions or unusual purchasing spikes.

• Handle large, complex datasets efficiently, making it a scalable tool for growing businesses.
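
To make the grouping and outlier points above concrete, here is a minimal pure-Python sketch of the DBSCAN idea. It is a teaching sketch, not a production implementation: the function names, the toy "customer" data (two pre-scaled, hypothetical features such as visit frequency and basket value), and the chosen `eps` and `min_pts` values are all assumptions for illustration. As in common library implementations, a point counts toward its own neighborhood.

```python
import math

NOISE = -1

def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or NOISE (-1)."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: reachable but not core
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point: keep expanding
    return labels

# Hypothetical, already-scaled customer features: (visit frequency, basket value).
customers = [
    (1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (1.1, 1.2),  # frequent, small baskets
    (5.0, 5.0), (5.1, 4.8), (4.9, 5.2), (5.2, 5.1),  # occasional, large baskets
    (9.0, 0.5),                                      # unusual customer
]

labels = dbscan(customers, eps=0.5, min_pts=3)
print(labels)  # [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

No number of clusters is passed in: the two groups emerge from density alone, and the unusual customer receives the label -1 (noise) instead of being forced into a group.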
Group Assignment on Clustering

• https://www.kaggle.com/datasets/salahuddinahmedshuvo/ecommerce-consumer-behavior-analysis-data

• Develop a problem statement for clustering

• Apply the clustering and generate a profile for each cluster

• Suggest how to target each profile
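
One simple way to generate profiles, once every customer has a cluster label, is to summarize each cluster by its size and per-feature averages. The sketch below assumes this approach; the helper `cluster_profiles` and the column names `spend` and `visits` are hypothetical and are not taken from the Kaggle dataset.

```python
from collections import defaultdict
from statistics import mean

def cluster_profiles(rows, labels, features):
    """Group rows by cluster label and average each feature within a cluster."""
    grouped = defaultdict(list)
    for row, label in zip(rows, labels):
        grouped[label].append(row)
    profiles = {}
    for label, members in grouped.items():
        profile = {f: mean(r[f] for r in members) for f in features}
        profile["size"] = len(members)
        profiles[label] = profile
    return profiles

# Hypothetical customers and cluster labels (as produced by any clustering step).
rows = [
    {"spend": 20, "visits": 8},
    {"spend": 30, "visits": 10},
    {"spend": 200, "visits": 1},
    {"spend": 220, "visits": 3},
]
row_labels = [0, 0, 1, 1]

profiles = cluster_profiles(rows, row_labels, features=["spend", "visits"])
for label, profile in sorted(profiles.items()):
    print(label, profile)
```

Here cluster 0 reads as frequent, low-spend shoppers and cluster 1 as infrequent, high-spend shoppers, which is the kind of description a targeting suggestion can build on.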
Thank You!
