Singular Value Decomposition
(SVD) and Its Applications
Jiaxing Shen
5 April 2022
Outline
• Why SVD?
• Preliminaries
• Understand SVD
• Popular Applications
Four Ability Levels of Using a Technique
From lowest to highest ability of using a technique:
• Understand (focus of this talk): What it is; where it could be applied.
• Implement: How to implement it; what its pros and cons are.
• Optimize: How to improve it: efficiency, reliability, distributed environments.
• Extend (what is required in original and novel research): How it is related to other techniques; what other potential scenarios exist.
Why SVD?
§ SVD is the foundation of recommender systems, which are at the heart of
huge companies like Google, YouTube, Amazon, Facebook, and Netflix…
§ “… it (SVD) is not nearly as famous as it should be.” -- Gilbert Strang
Figure from Google Ngram Viewer
Why SVD?
§ SVD is an enduringly popular technique
Data from DBLP, searched for “Singular value decomposition” and “SVD”
Why SVD?
• Even now, SVD papers still appear in top venues like NeurIPS, ICML, and TKDE…
Preliminaries: How to see a matrix?
§ As a simple yet useful data structure,
e.g., an image as a matrix
§ As a linear equation
§ As a mapping of vectors

$$A_{[m\times n]} \iff f(\cdot): \mathbb{R}^n \to \mathbb{R}^m, \qquad \text{e.g., } \begin{bmatrix} 1 & 2 \\ 1 & 3 \\ 1 & 4 \end{bmatrix}: \mathbb{R}^2 \to \mathbb{R}^3$$
§ As a linear transformation
Preliminaries: Matrix as linear transformations
[Figure: a shape (A) stretched, (B) compressed, (C) rotated, (D) reflected or flipped, and (E) sheared. M’ is a non-linear transformation!]
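As a concrete illustration (not from the slides), the sketch below applies five hand-picked 2x2 matrices to the corners of the unit square with NumPy; the specific matrices are assumed examples of each transformation type.

```python
import numpy as np

# Corners of the unit square as column vectors.
square = np.array([[0, 1, 1, 0],
                   [0, 0, 1, 1]])

transforms = {
    "stretch":  np.array([[2.0, 0.0], [0.0, 1.0]]),   # (A) stretch along x
    "compress": np.array([[0.5, 0.0], [0.0, 0.5]]),   # (B) uniform compression
    "rotate":   np.array([[0.0, -1.0], [1.0, 0.0]]),  # (C) rotate 90 degrees
    "reflect":  np.array([[-1.0, 0.0], [0.0, 1.0]]),  # (D) reflect across y-axis
    "shear":    np.array([[1.0, 1.0], [0.0, 1.0]]),   # (E) shear along x
}

for name, M in transforms.items():
    # A linear transformation is just matrix-vector multiplication.
    print(name, "->\n", M @ square)
```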
SVD Theorem
$$A_{[m\times n]} = U_{[m\times m]}\, \Sigma_{[m\times n]}\, V^T_{[n\times n]}$$

• $U$ and $V$ are orthogonal matrices: $Q^{-1} = Q^T$ (rotations/reflections).
• $\Sigma$ is a diagonal matrix ($\sigma_{ij} = 0$ if $i \neq j$), i.e., a scale transformation; its diagonal entries are ordered by importance (e.g., by the importance of the eigenfaces below).

Eigenfaces example: write $M = U \Sigma V^T$, where $M$ is a matrix of a million faces and each column vector is a face image. Each column of $U$ (the left singular vectors) is an “eigenface”, and each column of $V^T$ (built from the right singular vectors) gives the mixture of eigenfaces that makes up one real face.
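A minimal NumPy sketch of the theorem on a small random matrix, checking the factorization and the orthogonality of $U$ and $V$; the matrix size is an arbitrary choice.

```python
import numpy as np

# A small random matrix; any real matrix works.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))

# Full SVD: U is 5x5, s holds the singular values, Vt is 3x3.
U, s, Vt = np.linalg.svd(A, full_matrices=True)

# Rebuild the m x n diagonal matrix Sigma from the singular values.
Sigma = np.zeros_like(A)
np.fill_diagonal(Sigma, s)

# Check A = U Sigma V^T and the orthogonality of U and V.
assert np.allclose(A, U @ Sigma @ Vt)
assert np.allclose(U.T @ U, np.eye(5))
assert np.allclose(Vt @ Vt.T, np.eye(3))
print("singular values (descending):", s)
```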
Geometric Explanation of SVD
Any linear transformation can be represented as a rotation, a scaling, and another rotation:

$$A_{[m\times n]} = U_{[m\times m]}\, \Sigma_{[m\times n]}\, V^T_{[n\times n]}$$

With $r$ the rank of $A$, the reduced form is

$$A_{[m\times n]} = U_{[m\times r]}\, \Sigma_{[r\times r]}\, V^T_{[r\times n]}$$
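The shapes of the full versus reduced factorizations can be checked directly, as in the sketch below. Note that NumPy’s full_matrices=False keeps min(m, n) components rather than exactly r, so it is a close stand-in for (not identical to) the rank-r form above.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((6, 4))

# Full SVD: U is m x m, Vt is n x n.
U, s, Vt = np.linalg.svd(A, full_matrices=True)
print(U.shape, s.shape, Vt.shape)        # (6, 6) (4,) (4, 4)

# Reduced SVD: only the first min(m, n) columns/rows are kept.
Ur, sr, Vtr = np.linalg.svd(A, full_matrices=False)
print(Ur.shape, sr.shape, Vtr.shape)     # (6, 4) (4,) (4, 4)

# Both reconstruct A exactly.
assert np.allclose(A, Ur @ np.diag(sr) @ Vtr)
```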
SVD – Properties
• It is always possible to decompose any real matrix as $A = U \Sigma V^T$
• The singular values in $\Sigma$ are all non-negative and sorted in descending order
• The number of nonzero singular values of $A$ equals the rank of $A$
• The matrix can be approximated using the first $r$ singular values and the
corresponding singular vectors
• The sum of the first $r$ singular values over the total sum of singular values
indicates the approximation ratio
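A short sketch verifying the rank and approximation properties on a matrix constructed to have rank 2; the sizes and the zero tolerance are illustrative assumptions.

```python
import numpy as np

# Build a matrix of known rank 2: the sum of two outer products.
rng = np.random.default_rng(2)
A = np.outer(rng.standard_normal(8), rng.standard_normal(6)) \
  + np.outer(rng.standard_normal(8), rng.standard_normal(6))

U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Property: the number of nonzero singular values equals the rank.
print("rank via SVD:", np.sum(s > 1e-10))             # -> 2

# Property: truncating to the first r singular values approximates A.
r = 2
A_r = U[:, :r] @ np.diag(s[:r]) @ Vt[:r, :]
print("approximation ratio:", s[:r].sum() / s.sum())  # fraction of "information"
print("reconstruction error:", np.linalg.norm(A - A_r))
```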
Popular applications of SVD
• Noise removal
• Data compression
• Dimension reduction
• Latent semantic analysis
Noise removal
Motivation
• Sensors are vulnerable to noise, so sensor readings are inaccurate
How
• Set those singular values that are smaller than a
threshold to zero
Example: use a scanner to scan a picture $M_{[25\times 15]}$; the scanned picture contains too much noise. Keeping only the three largest singular values gives

$$M_{[25\times 15]} \approx \sigma_1 u_1 v_1^T + \sigma_2 u_2 v_2^T + \sigma_3 u_3 v_3^T$$

[Figure: the original, scanned (noisy), and denoised versions of $M_{[25\times 15]}$; the discarded small singular values carry the noise.]
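A hypothetical denoising sketch in the spirit of the example above: a rank-3 “clean” matrix plus additive noise, with small singular values zeroed out. The threshold value is an assumption for this synthetic data, not a prescribed rule.

```python
import numpy as np

# Low-rank "clean" image (25 x 15, rank 3) plus additive sensor noise.
rng = np.random.default_rng(3)
clean = sum(np.outer(rng.standard_normal(25), rng.standard_normal(15))
            for _ in range(3))
noisy = clean + 0.05 * rng.standard_normal((25, 15))

U, s, Vt = np.linalg.svd(noisy, full_matrices=False)

# Zero out singular values below a threshold: they mostly encode noise.
threshold = 1.0   # illustrative value; in practice chosen from the data
s_denoised = np.where(s >= threshold, s, 0.0)
denoised = U @ np.diag(s_denoised) @ Vt

print("error before:", np.linalg.norm(noisy - clean))
print("error after: ", np.linalg.norm(denoised - clean))
```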
Data compression
Motivation
• Resources are limited in IoT devices
• Storage
• Bandwidth
How
• Choose the $n$ largest singular values $\Sigma'$ to form the rank-$n$ approximation $A \approx U \Sigma' V^T$
• The percentage of “information” contained in the approximation matrix is $\dfrac{\operatorname{sum}(\Sigma')}{\operatorname{sum}(\Sigma)}$
Data compression
[Figure: percentage of retained information as a function of the number of singular values kept.]
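A rough sketch of the storage accounting: keeping $n$ singular triplets costs $n(m + n + 1)$ numbers instead of $mn$. The random test matrix below is a poor compression target (real images are far more compressible), so the printed ratios are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((200, 100))   # stand-in for an image or sensor matrix

U, s, Vt = np.linalg.svd(A, full_matrices=False)

n = 20  # number of singular values to keep (illustrative)
A_n = U[:, :n] @ np.diag(s[:n]) @ Vt[:n, :]

# Storage: full matrix vs. the factors of the rank-n approximation.
full_cost = A.size                               # 200 * 100 = 20000
compressed = n * (A.shape[0] + A.shape[1] + 1)   # 20 * 301 = 6020
print("storage ratio:", compressed / full_cost)
print("information kept:", s[:n].sum() / s.sum())
```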
Dimension reduction
Motivation
• Visualize high dimensional data
• Redundant features
• Combine features
How
• Principal component analysis (PCA)

PCA algorithm
1. Formulate an input matrix A
2. Compute the covariance matrix C of A
3. [U S V] = svd(C)
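A minimal implementation of the three-step recipe above, on toy data with one deliberately redundant feature; the variable names and data are assumptions for illustration.

```python
import numpy as np

# Toy data: 100 samples, 3 features, where feature 2 ~ feature 0 (redundant).
rng = np.random.default_rng(5)
x = rng.standard_normal(100)
A = np.column_stack([x,
                     rng.standard_normal(100),
                     x + 0.1 * rng.standard_normal(100)])

# 1. Formulate the input matrix A (center each feature).
A = A - A.mean(axis=0)

# 2. Compute the covariance matrix C of A.
C = np.cov(A, rowvar=False)

# 3. SVD of C: columns of U are the principal components.
U, S, Vt = np.linalg.svd(C)

# Project the data onto the first two principal components.
reduced = A @ U[:, :2]
print("explained variance ratio:", S[:2].sum() / S.sum())
print("reduced shape:", reduced.shape)   # (100, 2)
```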
Latent semantic analysis
LSA, or LSI (latent semantic indexing), means analyzing documents to find the underlying meaning or concepts of those documents.

Motivation
• Polysemy (one word with multiple concepts)
• Synonymy (multiple words with one concept)
How
• Word matrix (term frequency)
• Fill in missing values
• SVD
[Figure: SVD of the term-document matrix with three concepts C1, C2, C3. The singular values give the strength of each concept; a row represents the relevance or importance of articles in a concept, and a column represents the relevance or importance of books in a concept. A concept refers to a category of articles, like business, history, or literature.]
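A toy LSA sketch on a hypothetical 6-term x 4-document frequency matrix; the vocabulary and counts are invented for illustration. Projecting onto the top two concepts places synonyms (here “novel” and “fiction”) near each other in concept space.

```python
import numpy as np

# Tiny term-document matrix (term frequencies): rows = terms, cols = documents.
# Docs 0-1 are about finance, docs 2-3 about literature (invented data).
terms = ["stock", "money", "market", "novel", "fiction", "plot"]
A = np.array([[3, 2, 0, 0],
              [2, 3, 0, 1],
              [3, 1, 0, 0],
              [0, 0, 3, 2],
              [0, 1, 2, 3],
              [0, 0, 2, 2]], dtype=float)

U, s, Vt = np.linalg.svd(A, full_matrices=False)

k = 2  # keep the two strongest concepts
print("concept strengths:", s[:k].round(2))
for term, weights in zip(terms, U[:, :k].round(2)):
    print(f"{term:8s} -> concepts {weights}")   # term-to-concept relevance
print("doc-to-concept (V^T rows):\n", Vt[:k, :].round(2))
```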
SVD as a feature extractor
Other matrix factorizations
References
• Gilbert Strang, Linear Algebra and Its Applications. Brooks Cole.
• http://andrew.gibiansky.com/blog/mathematics/cool-linear-algebra-singular-value-decomposition/
• https://ocw.mit.edu/courses/mathematics/18-06-linear-algebra-spring-2010/
• http://www.ams.org/samplings/feature-column/fcarc-svd
• https://ccjou.wordpress.com/2009/09/01/%E5%A5%87%E7%95%B0%E5%80%BC%E5%88%86%E8%A7%A3-svd/
• https://www.quora.com/What-is-a-good-explanation-of-Latent-Semantic-Indexing-LSI