
Detailed Explanation of Module 2 Lab 4: t-Distributed Stochastic Neighbor Embedding (t-SNE)

Section 1: What is t-SNE and Why Use It?


t-SNE stands for t-Distributed Stochastic Neighbor Embedding.
It is an unsupervised, non-linear dimensionality reduction technique, mainly used for
visualizing high-dimensional data in 2D or 3D.
Developed by Laurens van der Maaten and Geoffrey Hinton in 2008, t-SNE helps reveal
patterns, clusters, and relationships in data that are not visible in tables or with linear
methods like PCA [1] .

Section 2: How Does t-SNE Work? (Step-by-Step with Example)


t-SNE works in three main steps, each with a clear purpose and effect:

Step 1: Measure Similarities in High-Dimensional Space


For every pair of data points, t-SNE centers a Gaussian distribution over each point and
measures how dense the other points are under this distribution.
This process produces a set of probabilities (Pij) that reflect how likely points are to be
neighbors in the original high-dimensional space.
The perplexity parameter controls the size of the neighborhood considered for each point
(think of it as a guess of how many close neighbors each point has). Typical values are
between 5 and 50 [1] .
Example:
Suppose you have 1,797 images of handwritten digits (each 8×8 pixels, so 64 features). For
each image, t-SNE calculates its similarity to every other image based on their pixel values,
resulting in a matrix of probabilities that encode local structure.
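The Gaussian-affinity computation in Step 1 can be sketched on a tiny toy dataset. This is a deliberate simplification: real t-SNE tunes a separate sigma per point (by binary search) so that the entropy of each point's neighbor distribution matches the chosen perplexity, whereas this sketch uses one fixed sigma for all points.

```python
import numpy as np

# Toy dataset: two close points and one far-away point
X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0]])
sigma = 1.0   # fixed for clarity; real t-SNE picks sigma per point

# Squared Euclidean distances between all pairs of points
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)

# Conditional probabilities p_{j|i}: Gaussian centered on point i
P = np.exp(-sq_dists / (2 * sigma ** 2))
np.fill_diagonal(P, 0.0)                 # a point is not its own neighbor
P = P / P.sum(axis=1, keepdims=True)     # normalize each row

# Symmetrize: p_ij = (p_{j|i} + p_{i|j}) / (2N), as in the t-SNE paper
P_sym = (P + P.T) / (2 * len(X))

print(P_sym.round(3))
```

The two nearby points end up with a much larger mutual probability than either has with the distant point, which is exactly the "local structure" the Pij matrix encodes.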

Step 2: Measure Similarities in Low-Dimensional Space


t-SNE maps all points to a 2D or 3D space, initially at random.
It then computes a new set of probabilities (Qij) using a Student t-distribution (with heavier
tails than a Gaussian).
The heavy tails allow distant points to be modeled more flexibly, helping clusters spread out
and avoid crowding [1] .
Example:
The same digit images are now points on a 2D plane. t-SNE computes how close they are using
the t-distribution, building a new matrix of similarities for the low-dimensional space.
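Step 2's heavy-tailed similarities can be sketched the same way; the three random 2D positions below stand in for t-SNE's random initialization of the map.

```python
import numpy as np

# A sketch of Step 2: similarities Q_ij in the low-dimensional map,
# using a Student t-distribution with one degree of freedom (heavier
# tails than a Gaussian).
rng = np.random.default_rng(0)
Y = rng.normal(scale=1e-2, size=(3, 2))       # random initial map points

sq_dists = ((Y[:, None, :] - Y[None, :, :]) ** 2).sum(-1)

# Heavy-tailed kernel: (1 + ||y_i - y_j||^2)^(-1)
num = 1.0 / (1.0 + sq_dists)
np.fill_diagonal(num, 0.0)                    # no self-similarity
Q = num / num.sum()                           # normalize over all pairs

print(Q.round(4))
```

Unlike the Pij matrix, Qij is normalized over all pairs at once, and the slow (1 + d²)⁻¹ decay is what lets moderately distant points keep some similarity, easing the crowding problem.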

Step 3: Match the Two Probability Distributions


t-SNE tries to make the low-dimensional similarities (Qij) match the high-dimensional
similarities (Pij) as closely as possible.
It does this by minimizing the Kullback-Leibler (KL) divergence between the two
distributions, using gradient descent.
The points are moved around iteratively until the best match is found, preserving local
neighborhoods and revealing clusters.
Example:
As optimization proceeds, images of the digit "3" are pulled together, "8"s are grouped, and so
on. After enough iterations, the 2D plot shows well-separated clusters for each digit [1] .
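The three steps can be combined into a toy end-to-end sketch of the optimization. This is illustrative rather than the lab's code: it uses a fixed sigma instead of a perplexity search and omits tricks like early exaggeration and momentum, keeping only the core gradient-descent update on KL(P || Q).

```python
import numpy as np

# t-SNE gradient for map point y_i:
#   dC/dy_i = 4 * sum_j (p_ij - q_ij) * (1 + ||y_i - y_j||^2)^(-1) * (y_i - y_j)
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.1, (5, 10)),   # tight cluster A in 10-D
               rng.normal(3, 0.1, (5, 10))])  # tight cluster B in 10-D

# Step 1: high-dimensional affinities P (fixed sigma for simplicity)
d_hi = ((X[:, None] - X[None, :]) ** 2).sum(-1)
P = np.exp(-d_hi / 2.0)
np.fill_diagonal(P, 0)
P /= P.sum(axis=1, keepdims=True)
P = np.maximum((P + P.T) / (2 * len(X)), 1e-12)

Y = rng.normal(scale=1e-2, size=(len(X), 2))  # random 2-D initialization

for step in range(500):
    # Step 2: low-dimensional affinities Q from the Student-t kernel
    d_lo = ((Y[:, None] - Y[None, :]) ** 2).sum(-1)
    num = 1.0 / (1.0 + d_lo)
    np.fill_diagonal(num, 0)
    Q = np.maximum(num / num.sum(), 1e-12)
    # Step 3: gradient of KL(P || Q) with respect to each map point
    grad = 4 * ((P - Q) * num)[:, :, None] * (Y[:, None] - Y[None, :])
    Y -= 10.0 * grad.sum(axis=1)

kl = float((P * np.log(P / Q)).sum())
print(f"final KL divergence: {kl:.3f}")
```

After the loop, the two clusters that were close in 10-D land as two separated groups in the 2-D map, mirroring how the digit clusters form in the lab.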

Section 3: Practical Application – Visualizing Digits


Dataset: 1,797 handwritten digit images (0–9), each 8×8 pixels (64 features).
Goal: Visualize how the digits cluster in 2D using t-SNE.
Code Example:

from sklearn.manifold import TSNE
from sklearn.datasets import load_digits
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np

# Load the 1,797 digit images and order them by class label
digits = load_digits()
X = np.vstack([digits.data[digits.target == i] for i in range(10)])
y = np.hstack([digits.target[digits.target == i] for i in range(10)])

# PCA initialization and a fixed seed make the run reproducible
tsne = TSNE(init="pca", random_state=20150101, n_components=2,
            perplexity=30, n_iter=1000)
digits_proj = tsne.fit_transform(X)

# One color per digit class
palette = np.array(sns.color_palette("hls", 10))

plt.figure(figsize=(8, 8))
plt.scatter(digits_proj[:, 0], digits_proj[:, 1], c=palette[y.astype(int)], s=40)
plt.title("t-SNE visualization")
plt.show()

Interpretation: Each color is a digit. t-SNE clusters similar digits together, making the
structure visible [1] .
Section 4: Understanding and Tuning t-SNE Hyperparameters
n_components: Output dimensions (usually 2 or 3 for visualization). Typical values: 2 or 3.

perplexity: Controls neighborhood size (local vs. global structure). Typical values: 5–50 (try several values).

n_iter: Number of optimization steps (iterations). Typical values: ≥250 (usually 1000 or more).

method: Optimization algorithm ('barnes_hut' is fast, 'exact' is slower). Typical value: 'barnes_hut' for large datasets.

Effect of Perplexity:
Low perplexity (e.g., 5): Focuses on very local structure; clusters may be small and tight.
High perplexity (e.g., 100): Considers more global structure; clusters may merge or lose
detail.
Best practice: Try several values and compare results. Perplexity should be less than the
number of points [1] .
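This best practice can be sketched as a quick sweep; the subset size and the exact perplexity values tried are illustrative choices, not prescribed by the lab.

```python
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE

X, y = load_digits(return_X_y=True)
X = X[:300]                       # small subset keeps the sweep quick

embeddings = {}
for perplexity in (5, 30, 100):   # perplexity must stay below n_samples
    tsne = TSNE(n_components=2, perplexity=perplexity,
                init="pca", random_state=0)
    embeddings[perplexity] = tsne.fit_transform(X)
    print(perplexity, embeddings[perplexity].shape)
```

Plotting each embedding side by side (as the lab does) then makes the local-vs-global trade-off visible at a glance.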

Effect of n_iter (Iterations):


Too few iterations: The plot may not stabilize; clusters may look “pinched” or not well
separated.
More iterations: Allows the optimization to converge and produce a clearer map.
Best practice: Iterate until the configuration is stable (often ≥1000) [1] .

Effect of Method:
‘barnes_hut’: Fast, approximate, O(NlogN) time; good for large datasets.
‘exact’: Slower, O(N²) time; more accurate but computationally expensive [1] .

Section 5: Visualizing the Effects of Hyperparameters


Changing Perplexity:
Perplexity 5: Local clusters dominate, but global structure may be lost.
Perplexity 30: Balanced, clear clusters for each digit.
Perplexity 100: Clusters may merge, and points from different digits may mix.
Changing Iterations:
10, 20, 60, 120 steps: Clusters are not yet formed; plots look unstable or “pinched.”
1000 steps: Well-separated, stable clusters.
5000 steps: Similar to 1000, but clusters may be denser [1] .
Section 6: Best Practices and Limitations
Randomness:
t-SNE results can vary between runs due to random initialization. Use random_state for
reproducibility.
No True Clustering:
t-SNE is not a clustering algorithm; it only helps visualize clusters.
Interpretation:
The axes in a t-SNE plot have no intrinsic meaning; only the relative positions and
groupings matter.
Parameter Sensitivity:
Results can change with different perplexity or iteration settings. Always experiment
with several values [1] .
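The reproducibility point can be checked directly: with a fixed random_state, repeated runs produce the same embedding. The subset size and seed below are arbitrary choices for a quick check.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE

X, _ = load_digits(return_X_y=True)
X = X[:150]                         # small subset for speed

def embed(seed):
    # Same data, same settings; only the seed is a parameter
    return TSNE(n_components=2, perplexity=20,
                init="pca", random_state=seed).fit_transform(X)

a, b = embed(42), embed(42)         # same seed: identical output
print(np.allclose(a, b))
```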

Section 7: Summary Table


Step 1: Compute similarities (Pij) in high-dimensional space using a Gaussian. Example (digits): how likely two digit images are to be neighbors.

Step 2: Compute similarities (Qij) in low-dimensional space using a Student-t distribution. Example (digits): similarities among the initial random 2D positions of the images.

Step 3: Minimize the KL divergence between Pij and Qij via gradient descent. Example (digits): points move until clusters of digits form.

Section 8: Exercises and Exploration


Try different perplexity and iteration values to see how the visualization changes.
Use t-SNE for exploration, not for clustering or modeling directly.
Combine with other techniques: Use t-SNE after PCA or as a preprocessing step for
visualization [1] .
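A minimal sketch of the "PCA then t-SNE" combination suggested above; the 30-component PCA step and the 500-image subset are illustrative choices for speed.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

X, y = load_digits(return_X_y=True)
X, y = X[:500], y[:500]             # subset keeps the run quick

# PCA compresses the 64 pixel features to 30 components, which speeds
# up and often denoises the subsequent t-SNE run
X_pca = PCA(n_components=30, random_state=0).fit_transform(X)

emb = TSNE(n_components=2, perplexity=30,
           init="pca", random_state=0).fit_transform(X_pca)
print(emb.shape)
```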

Section 9: Key Takeaways


t-SNE is a powerful tool for exploring and visualizing high-dimensional data.
It excels at revealing clusters and local structure.
Hyperparameters like perplexity and n_iter greatly influence the results—experiment with
them!
t-SNE is best for visualization and data exploration, not as a preprocessing step for
modeling or clustering [1] .

[1] AIML_Module_2_Lab_4_t_SNE.ipynb-Colab.pdf

You might also like