Lecture 5

The document discusses the DBSCAN algorithm, a density-based clustering method that identifies clusters based on the density of data points without needing to predefine the number of clusters. It defines core, border, and outlier points based on the parameters minPts and eps, and outlines steps to solve clustering problems using this method. Two examples demonstrate how to identify core points, border points, and outliers with different parameter settings.

AIML

Dr. Nitin Arvind Shelke


Density based clustering : DBSCAN
• Unsupervised Learning Method under Clustering
• Density-Based Approach: DBSCAN groups points based on density,
identifying dense regions as clusters and sparse regions as noise
(outliers).
• No Need to Predefine Clusters: Unlike K-Means, DBSCAN does not
require specifying the number of clusters beforehand. It automatically
detects clusters based on density.
• Handles Arbitrary Shapes & Noise: DBSCAN can identify clusters of
various shapes and sizes and effectively detects outliers, making it
more robust to noise and non-convex clusters than centroid-based
methods such as K-Means.

• There are two key parameters in DBSCAN needed to define ‘Density’.
✓ minPts: The minimum number of points (a threshold) clustered
together for a region to be considered dense.
✓ eps (ε): A distance measure that will be used to locate the points
in the neighborhood of any point.
Core, Border, and Outlier Points:
1. Core Points have at least MinPts points within ε (Eps) distance,
counting the point itself.
2. Border Points have fewer than MinPts points within ε but lie within
ε of (are directly reachable from) a core point.
3. Outliers (Noise Points) are neither core nor border points.
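
One way to state these three definitions formally is sketched below. The symbols D (the dataset), d (the distance function) and N_ε(p) (the neighborhood of p) are notation introduced here for illustration and do not appear in the slides. Counting p in its own neighborhood is an assumption, chosen because it is the convention under which the worked answers later in these slides come out as stated.

```latex
\begin{align*}
N_\varepsilon(p) &= \{\, q \in D : d(p, q) \le \varepsilon \,\} \\
p \text{ is a core point} &\iff |N_\varepsilon(p)| \ge \mathit{MinPts} \\
p \text{ is a border point} &\iff |N_\varepsilon(p)| < \mathit{MinPts}
  \ \text{and}\ p \in N_\varepsilon(q) \text{ for some core point } q \\
p \text{ is an outlier} &\iff p \text{ is neither a core point nor a border point}
\end{align*}
```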

• The DBSCAN algorithm takes two input parameters.
➢ The radius around each point (eps) and the minimum number of data
points that should be around that point within that radius (MinPts).
• For example, consider the point (1.5, 2.5). If we take eps = 0.3, the
circle of radius 0.3 around this point contains only one other point,
(1.2, 2.5).

• DBSCAN distinguishes 3 types of data points.
Core Point: A point is a core point if it has at least MinPts points
within eps.
Border Point: A point which has fewer than MinPts points within eps
but is in the neighborhood of a core point.
Noise or outlier: A point which is neither a core point nor a border
point.
• Q. Given the points A(3, 7), B(4, 6), C(5, 5), D(6, 4), E(7, 3), F(6, 2),
G(7, 2) and H(8, 4), find the core points, border points and outliers
using DBSCAN.
• 1) Take Eps = 2.5 and MinPts = 4
• 2) Take Eps = 2.5 and MinPts = 3
Steps to solve the DBSCAN Problem
• Step 1: Create the distance matrix by calculating the distance using
the Euclidean distance formula.
• Step 2: Find all the data points that lie in the Eps-neighborhood of
each data point. That is, put into the neighborhood set of each data
point all the points whose distance from it is <= Eps.
• Step 3: Identify the Core Points, Border Points, and Outlier Points.
• Step 1: Create the distance matrix by calculating the distance using
the Euclidean distance formula.
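
As a minimal sketch of Step 1, the distance matrix for the eight points in the question could be computed as follows. The names points, distance_matrix and dist are illustrative choices, not part of the slides, and the code relies only on Python's standard math.dist (Python 3.8 or later).

```python
import math

# The eight points from the question in these slides.
points = {
    "A": (3, 7), "B": (4, 6), "C": (5, 5), "D": (6, 4),
    "E": (7, 3), "F": (6, 2), "G": (7, 2), "H": (8, 4),
}


def distance_matrix(pts):
    """Return {name: {other: Euclidean distance}} for every pair of points."""
    return {
        p: {q: math.dist(pts[p], pts[q]) for q in pts}
        for p in pts
    }


dist = distance_matrix(points)

# Print the matrix rounded to two decimals, one row per point.
names = list(points)
print("   " + " ".join(f"{n:>5}" for n in names))
for p in names:
    print(f"{p:>2} " + " ".join(f"{dist[p][q]:5.2f}" for q in names))
```

The matrix is symmetric with zeros on the diagonal, so in a hand calculation only the entries above the diagonal need to be worked out.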
[Figures: distance calculations from data points A, B, C, and D to the other points]
• Step 2: Now, find all the data points that lie in the Eps-neighborhood
of each data point. That is, put into the neighborhood set of each data
point all the points whose distance from it is <= 2.5.
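
Step 2 can be sketched in the same way. The helper eps_neighborhoods below is a hypothetical name, and it lists only the other points within Eps of each point; adding the point itself gives the neighborhood size that is compared against MinPts in Step 3 (an assumption that matches the answers given in these slides).

```python
import math

# The eight points from the question in these slides.
points = {
    "A": (3, 7), "B": (4, 6), "C": (5, 5), "D": (6, 4),
    "E": (7, 3), "F": (6, 2), "G": (7, 2), "H": (8, 4),
}
EPS = 2.5


def eps_neighborhoods(pts, eps):
    """Map each point name to the sorted names of the other points within eps."""
    return {
        p: sorted(q for q in pts if q != p and math.dist(pts[p], pts[q]) <= eps)
        for p in pts
    }


neighbors = eps_neighborhoods(points, EPS)
for name, near in neighbors.items():
    print(f"{name}: {', '.join(near)}")
```

For Eps = 2.5 this prints A: B, then B: A, C, then C: B, D, then D: C, E, F, G, H, then E: D, F, G, H, then F: D, E, G, then G: D, E, F, H, and finally H: D, E, G.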
Take Eps = 2.5 and MinPts = 4
• Step 3: Identify the Core Points, Border Points, and Outlier Points
• Eps = 2.5 and MinPts = 4
1) Core Points: D, E, F, G, H (each has at least 4 points within
ε = 2.5, counting the point itself)
2) Border Point: C (within ε of the core point D but has fewer than 4
points in its own neighborhood)
3) Outliers: A, B (these points are neither core points nor within ε
of a core point)
• Eps = 2.5 and MinPts = 3
1) Core Points: B, C, D, E, F, G, H (each has at least 3 points within
ε = 2.5, counting the point itself)
2) Border Point: A (A has only 2 points in its neighborhood, itself and
B, but is within ε of the core point B)
3) Outliers: None (All points are either core or border)
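
Both answer sets can be checked with a short script. This is a sketch of Step 3 only (labeling points, not forming clusters); the function name classify is hypothetical, and it assumes that a point counts toward its own neighborhood size, which is the convention under which the answers above hold.

```python
import math

# The eight points from the question in these slides.
points = {
    "A": (3, 7), "B": (4, 6), "C": (5, 5), "D": (6, 4),
    "E": (7, 3), "F": (6, 2), "G": (7, 2), "H": (8, 4),
}


def classify(pts, eps, min_pts):
    """Label each point 'core', 'border', or 'outlier'.

    Assumption: a point counts toward its own neighborhood size.
    """
    # Eps-neighborhood of each point, including the point itself.
    hood = {
        p: {q for q in pts if math.dist(pts[p], pts[q]) <= eps}
        for p in pts
    }
    core = {p for p in pts if len(hood[p]) >= min_pts}
    labels = {}
    for p in pts:
        if p in core:
            labels[p] = "core"
        elif hood[p] & core:  # not core, but within eps of a core point
            labels[p] = "border"
        else:
            labels[p] = "outlier"
    return labels


for min_pts in (4, 3):
    labels = classify(points, eps=2.5, min_pts=min_pts)
    print(f"MinPts = {min_pts}")
    for kind in ("core", "border", "outlier"):
        members = [p for p in points if labels[p] == kind]
        print(f"  {kind}: {', '.join(members) or 'none'}")
```

Lowering MinPts from 4 to 3 turns B and C into core points, which in turn lets A qualify as a border point, so no outliers remain.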
