Hierarchical Clusters

Hierarchical clustering is a technique that groups similar data points into a hierarchy, starting with each point as its own cluster and progressively merging them based on similarity. It can be visualized using a dendrogram, which illustrates how clusters are formed step by step. There are two main types of hierarchical clustering: agglomerative (bottom-up) and divisive (top-down), each with distinct workflows for merging or splitting clusters.

Why hierarchical clustering?

Hierarchical clustering is a technique used to group similar data points together based on their similarity, creating a hierarchy or tree-like structure. The key idea is to begin with each data point as its own separate cluster and then progressively merge or split clusters based on their similarity.
Let's understand this with the help of an example.
Imagine you have four fruits with different weights: an apple (100g), a banana (120g), a cherry (50g), and a grape (30g). Hierarchical clustering starts by treating each fruit as its own group. It then merges the closest groups based on their weights:
- First, the cherry and grape are grouped together because they are the lightest.
- Next, the apple and banana are grouped together.
Finally, all the fruits are merged into one large group, showing how hierarchical clustering progressively combines the most similar data points.
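As an illustration, here is a minimal sketch of this fruit example using SciPy. The weights and labels come from the example above; the choice of single linkage (minimum weight difference) is an assumption made purely for illustration.

import numpy as np
from scipy.cluster.hierarchy import linkage, dendrogram
import matplotlib.pyplot as plt

# Fruit weights in grams, taken from the example above
weights = np.array([[100], [120], [50], [30]])
labels = ['apple', 'banana', 'cherry', 'grape']

# Merge the closest groups step by step (single linkage = smallest weight difference)
Z = linkage(weights, method='single')

dendrogram(Z, labels=labels)
plt.ylabel('Weight difference (g)')
plt.show()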
Getting Started with Dendrograms
A dendrogram is like a family tree for clusters. It shows how individual
data points or groups of data merge together. The bottom shows each
data point as its own group, and as you move up, similar groups are
combined. The lower the merge point, the more similar the groups are. It
helps you see how things are grouped step by step.
The working of a dendrogram can be explained using the diagram below:

Dendrogram

In this image, on the left side, there are five points labeled P, Q, R, S, and T. These represent individual data points that are being clustered. On the right side, there's a dendrogram, which shows how these points are grouped together step by step.
- At the bottom of the dendrogram, the points P, Q, R, S, and T are all separate.
- As you move up, the closest points are merged into a single group.
- The lines connecting the points show how they are progressively merged based on similarity.
- The height at which they are connected shows how similar the points are to each other; the shorter the line, the more similar they are.
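Because the merge height encodes similarity, you can cut the dendrogram at a chosen distance to obtain flat clusters. Here is a minimal sketch; the five one-dimensional values standing in for P, Q, R, S, and T are made up for illustration, as is the distance threshold.

import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Illustrative stand-ins for the points P, Q, R, S, T (values are assumed)
points = np.array([[1.0], [1.2], [3.0], [3.1], [7.0]])

Z = linkage(points, method='average')

# Cut the tree at distance 1.0: merges below this height stay together
cluster_labels = fcluster(Z, t=1.0, criterion='distance')
print(cluster_labels)  # points that merged at low heights share a label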
Types of Hierarchical Clustering
Now that we understand the basics of hierarchical clustering, let’s explore
the two main types of hierarchical clustering.
1. Agglomerative Clustering
2. Divisive clustering
Hierarchical Agglomerative Clustering
It is also known as the bottom-up approach or hierarchical agglomerative clustering (HAC). Unlike flat clustering, hierarchical clustering provides a structured way to group data. This clustering algorithm does not require us to prespecify the number of clusters. Bottom-up algorithms treat each data point as a singleton cluster at the outset and then successively agglomerate pairs of clusters until all clusters have been merged into a single cluster that contains all the data.

Hierarchical Agglomerative Clustering


Workflow for Hierarchical Agglomerative Clustering
1. Start with individual points: Each data point is its own cluster. For example, if you have 5 data points, you start with 5 clusters, each containing just one data point.
2. Calculate distances between clusters: Calculate the distance between every pair of clusters. Initially, since each cluster has one point, this is simply the distance between the two data points.
3. Merge the closest clusters: Identify the two clusters with the smallest distance and merge them into a single cluster.
4. Update the distance matrix: After merging, you have one less cluster. Recalculate the distances between the new cluster and the remaining clusters.
5. Repeat steps 3 and 4: Keep merging the closest clusters and updating the distance matrix until you have only one cluster left.
6. Create a dendrogram: As the process continues, you can visualize the merging of clusters using a tree-like diagram called a dendrogram. It shows the hierarchy of how clusters are merged.
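The steps above can be sketched directly in code. The following is a minimal single-linkage implementation using only NumPy; the five 2-D points are made up for illustration, and in practice SciPy's linkage function performs the same job far more efficiently.

import numpy as np

# Five illustrative 2-D points (assumed data), each starting as its own cluster
X = np.array([[1, 2], [2, 2], [8, 8], [9, 8], [5, 1]], dtype=float)
clusters = [[i] for i in range(len(X))]  # step 1: one cluster per point

def cluster_distance(a, b):
    # Step 2: min (single-linkage) distance between any two points of the clusters
    return min(np.linalg.norm(X[i] - X[j]) for i in a for j in b)

# Steps 3-5: repeatedly merge the two closest clusters until one remains
while len(clusters) > 1:
    pairs = [(i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))]
    i, j = min(pairs, key=lambda p: cluster_distance(clusters[p[0]], clusters[p[1]]))
    print('merge', clusters[i], '+', clusters[j])
    clusters[i] = clusters[i] + clusters[j]
    del clusters[j]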

Hierarchical Divisive Clustering


It is also known as the top-down approach. This algorithm also does not require us to prespecify the number of clusters. Top-down clustering requires a method for splitting a cluster that contains the whole dataset and proceeds by splitting clusters recursively until individual data points have been split into singleton clusters.
Workflow for Hierarchical Divisive Clustering:
1. Start with all data points in one cluster: Treat the entire dataset as a
single large cluster.
2. Split the cluster: Divide the cluster into two smaller clusters. The
division is typically done by finding the two most dissimilar points in the
cluster and using them to separate the data into two parts.
3. Repeat the process: For each of the new clusters, repeat the splitting
process:
1. Choose the cluster with the most dissimilar points.
2. Split it again into two smaller clusters.
4. Stop when each data point is in its own cluster: Continue this
process until every data point is its own cluster, or the stopping
condition (such as a predefined number of clusters) is met.
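Divisive algorithms differ mainly in how they choose each split; a common illustrative choice is to bisect every cluster with 2-means. Below is a minimal sketch under that assumption: scikit-learn's KMeans is used only as the splitting rule, and the data points are made up for illustration.

import numpy as np
from sklearn.cluster import KMeans

def divisive_split(points, min_size=1):
    # Step 4: stop once a cluster can no longer be divided
    if len(points) <= min_size:
        return [points]
    # Step 2: divide the cluster into two smaller clusters
    labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(points)
    left, right = points[labels == 0], points[labels == 1]
    if len(left) == 0 or len(right) == 0:  # guard against a degenerate split
        return [points]
    # Step 3: repeat the process on each new cluster
    return divisive_split(left, min_size) + divisive_split(right, min_size)

X = np.array([[1, 2], [2, 2], [8, 8], [9, 8], [5, 1]], dtype=float)
for c in divisive_split(X):
    print(c)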
Hierarchical Divisive clustering

Computing Distance Matrix


While merging two clusters, we check the distance between every pair of clusters and merge the pair with the least distance/most similarity. But how is that distance determined? There are different ways of defining inter-cluster distance/similarity. Some of them are:
1. Min Distance: Find the minimum distance between any two points of the two clusters.
2. Max Distance: Find the maximum distance between any two points of the two clusters.
3. Group Average: Find the average distance between every pair of points across the two clusters.
4. Ward's Method: The similarity of two clusters is based on the increase in squared error when the two clusters are merged.
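These four criteria correspond to the 'single', 'complete', 'average', and 'ward' linkage methods in SciPy. A minimal sketch comparing them on the same data (the array of points is an assumption, reused from the implementation below):

import numpy as np
from scipy.cluster.hierarchy import linkage

X = np.array([[1, 2], [1, 4], [1, 0], [4, 2], [4, 4], [4, 0]])

# Min, Max, Group Average, and Ward's method, respectively
for method in ['single', 'complete', 'average', 'ward']:
    Z = linkage(X, method=method)
    # The last row of the linkage matrix is the final merge; column 2 is its distance
    print(method, 'final merge distance:', Z[-1, 2])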
Distance Matrix Comparison in Hierarchical Clustering

Implementation Code for Distance Matrix Comparison
import numpy as np
from scipy.cluster.hierarchy import dendrogram, linkage
import matplotlib.pyplot as plt

X = np.array([[1, 2], [1, 4], [1, 0],
              [4, 2], [4, 4], [4, 0]])

Z = linkage(X, 'ward')  # Ward's method for inter-cluster distance

dendrogram(Z)  # Plot the dendrogram

plt.title('Hierarchical Clustering Dendrogram')
plt.xlabel('Data point')
plt.ylabel('Distance')
plt.show()
Output:

Hierarchical Clustering Dendrogram
