26-08-2025
Matrix Algebra and Calculus:
Review
Reference
• Matrix calculus, Wiki: http://en.wikipedia.org/wiki/Matrix_calculus
• The Matrix Cookbook: http://www.imm.dtu.dk/pubdb/views/edoc_download.php/3274/pdf/imm3274.pdf
• Video lectures on Linear Algebra by Prof. Gilbert Strang, MIT: https://ocw.mit.edu/courses/mathematics/18-06-linear-algebra-spring-2010/video-lectures/
Basic concepts
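◼ A vector in $\mathbb{R}^n$ is an ordered set of $n$ real numbers.
◼ e.g. $\mathbf{v} = (1, 6, 3, 4)$ is in $\mathbb{R}^4$
◼ A column vector: $\begin{pmatrix} 1 \\ 6 \\ 3 \\ 4 \end{pmatrix}$
◼ A row vector: $\begin{pmatrix} 1 & 6 & 3 & 4 \end{pmatrix}$
◼ An m-by-n matrix is an object in $\mathbb{R}^{m \times n}$ with $m$ rows and $n$ columns, each entry filled with a (typically) real number: $\begin{pmatrix} 1 & 2 & 8 \\ 4 & 78 & 6 \\ 9 & 3 & 2 \end{pmatrix}$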
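Notation
◼ Matrix: 𝐀, 𝐗, 𝐘
◼ bold capital letter
◼ Vector: 𝐚, 𝐱, 𝐲 (column)
◼ boldface lowercase letter
◼ Scalar: 𝑎, 𝑥, 𝑦
◼ lowercase italic typeface
◼ An $m \times n$ matrix has $m$ rows and $n$ columns; an $m \times 1$ matrix is a vector and a $1 \times 1$ matrix is a scalar:
$$\mathbf{A} = \begin{pmatrix} A_{11} & A_{12} & \cdots & A_{1n} \\ A_{21} & A_{22} & \cdots & A_{2n} \\ \vdots & & \ddots & \vdots \\ A_{m1} & A_{m2} & \cdots & A_{mn} \end{pmatrix}$$
◼ Transpose: $\mathbf{A}^T$, $\mathbf{a}^T$
◼ Trace: $\operatorname{tr}\mathbf{A}$ (for a square $n \times n$ matrix)
◼ $\operatorname{tr}\mathbf{A} = A_{11} + A_{22} + \cdots + A_{nn} = \sum_{i=1}^{n} A_{ii}$
◼ Determinant: $\det\mathbf{A}$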
Slide Courtesy: Po-Chen Wu
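Properties of Transpose
◼ $(\mathbf{A}^T)^T = \mathbf{A}$
◼ $(\mathbf{A} + \mathbf{B})^T = \mathbf{A}^T + \mathbf{B}^T$
◼ $(\mathbf{A} + \mathbf{B} + \mathbf{C})^T = \mathbf{A}^T + \mathbf{B}^T + \mathbf{C}^T$
◼ $(r\mathbf{A})^T = r\mathbf{A}^T$
◼ $(\mathbf{A}\mathbf{B})^T = \mathbf{B}^T\mathbf{A}^T$
◼ $(\mathbf{A}\mathbf{B}\mathbf{C})^T = \mathbf{C}^T\mathbf{B}^T\mathbf{A}^T$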
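The transpose rules can be spot-checked numerically. The sketch below uses NumPy with randomly generated matrices; the shapes, the seed, and the scalar `r` are arbitrary choices made for illustration, not values from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)      # fixed seed so the check is reproducible
A = rng.standard_normal((3, 4))
B = rng.standard_normal((4, 2))
C = rng.standard_normal((2, 5))
D = rng.standard_normal((3, 4))     # same shape as A, used for the sum rule
r = 2.5

double_transpose_ok = np.allclose(A.T.T, A)
sum_ok = np.allclose((A + D).T, A.T + D.T)
scale_ok = np.allclose((r * A).T, r * A.T)
product_ok = np.allclose((A @ B).T, B.T @ A.T)             # order reverses
triple_ok = np.allclose((A @ B @ C).T, C.T @ B.T @ A.T)    # order reverses
print(double_transpose_ok, sum_ok, scale_ok, product_ok, triple_ok)
```

A numerical check of this kind does not prove the identities, but it is a quick way to catch a reversed product order.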
Basic concepts
Remember the good, old Euclidean distance?
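The slide's formula is not reproduced in this text. As a reminder, and assuming the standard definition $d(\mathbf{u}, \mathbf{v}) = \sqrt{\sum_i (u_i - v_i)^2}$, a minimal NumPy sketch is shown below. The function name `euclidean_distance` and the second vector are hypothetical, chosen only for illustration.

```python
import numpy as np

def euclidean_distance(u, v):
    """Return sqrt(sum_i (u_i - v_i)^2) for two vectors of equal length."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    return float(np.sqrt(np.sum((u - v) ** 2)))

# The first vector is the example v = (1, 6, 3, 4); the second is made up.
d = euclidean_distance([1, 6, 3, 4], [1, 2, 3, 1])
print(d)
```

The same value is returned by `np.linalg.norm(u - v)`, which computes the 2-norm of the difference.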
Basic concepts
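We will use lowercase letters for vectors. The elements are referred to as $x_i$.
◼ Vector dot (inner) product: $\mathbf{u} \cdot \mathbf{v} = \mathbf{u}^T\mathbf{v} = \sum_{i=1}^{n} u_i v_i$
If $\mathbf{u} \cdot \mathbf{v} = 0$, $\|\mathbf{u}\|_2 \neq 0$, $\|\mathbf{v}\|_2 \neq 0$ → $\mathbf{u}$ and $\mathbf{v}$ are orthogonal
If $\mathbf{u} \cdot \mathbf{v} = 0$, $\|\mathbf{u}\|_2 = 1$, $\|\mathbf{v}\|_2 = 1$ → $\mathbf{u}$ and $\mathbf{v}$ are orthonormal
◼ Vector outer product: $\mathbf{u}\mathbf{v}^T$, the matrix whose $(i, j)$ entry is $u_i v_j$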
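The inner product, the outer product, and the two orthogonality conditions can be sketched in NumPy as follows. The helper names `is_orthogonal` and `is_orthonormal`, the tolerance, and the example vectors are assumptions made for this illustration.

```python
import numpy as np

u = np.array([1.0, 0.0, 0.0])
v = np.array([0.0, 1.0, 0.0])
w = np.array([3.0, 4.0, 0.0])

inner_uv = u @ v             # u^T v = sum_i u_i v_i, a scalar
outer_uw = np.outer(u, w)    # u w^T, a 3-by-3 matrix with entries u_i w_j

def is_orthogonal(a, b, tol=1e-12):
    """True when both vectors are nonzero and their dot product vanishes."""
    return bool(np.linalg.norm(a) > tol and np.linalg.norm(b) > tol
                and abs(a @ b) < tol)

def is_orthonormal(a, b, tol=1e-12):
    """True when the vectors are orthogonal and each has unit 2-norm."""
    return bool(is_orthogonal(a, b, tol)
                and abs(np.linalg.norm(a) - 1.0) < tol
                and abs(np.linalg.norm(b) - 1.0) < tol)

print(inner_uv, is_orthogonal(u, v), is_orthonormal(2 * u, v))
```

Scaling `u` by 2 keeps it orthogonal to `v` but breaks the unit-norm requirement, which is the only difference between the two conditions.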
Now can we visualize the column space of a matrix?
A much trickier representation:
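Viewing the matrix-matrix product in terms of matrix-vector products: if $\mathbf{b}_1, \ldots, \mathbf{b}_p$ are the columns of $\mathbf{B}$, then column $j$ of $\mathbf{A}\mathbf{B}$ is $\mathbf{A}\mathbf{b}_j$.
And, if $\mathbf{a}_1^T, \ldots, \mathbf{a}_m^T$ are the rows of $\mathbf{A}$, then row $i$ of $\mathbf{A}\mathbf{B}$ is $\mathbf{a}_i^T\mathbf{B}$.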
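The slide's figures are not reproduced in this text, so the following is only a sketch of the standard ways to assemble a matrix-matrix product from smaller products: column by column, row by row, and as a sum of outer products. The matrices are arbitrary examples.

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])    # 3-by-2
B = np.array([[1.0, 0.0, 2.0], [0.0, 1.0, 3.0]])      # 2-by-3

# Column view: column j of AB is A times column j of B.
by_columns = np.column_stack([A @ B[:, j] for j in range(B.shape[1])])

# Row view: row i of AB is (row i of A) times B.
by_rows = np.vstack([A[i, :] @ B for i in range(A.shape[0])])

# Outer-product view: AB is the sum over k of (column k of A)(row k of B).
by_outer = sum(np.outer(A[:, k], B[k, :]) for k in range(A.shape[1]))

print(np.allclose(by_columns, A @ B),
      np.allclose(by_rows, A @ B),
      np.allclose(by_outer, A @ B))
```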
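Special matrices
◼ Diagonal: $\begin{pmatrix} a & 0 & 0 \\ 0 & b & 0 \\ 0 & 0 & c \end{pmatrix}$
◼ Upper-triangular: $\begin{pmatrix} a & b & c \\ 0 & d & e \\ 0 & 0 & f \end{pmatrix}$
◼ Tri-diagonal: $\begin{pmatrix} a & b & 0 & 0 \\ c & d & e & 0 \\ 0 & f & g & h \\ 0 & 0 & i & j \end{pmatrix}$
◼ Lower-triangular: $\begin{pmatrix} a & 0 & 0 \\ b & c & 0 \\ d & e & f \end{pmatrix}$
◼ $\mathbf{I}$ (identity matrix): $\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$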
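Each of these special forms can be extracted from a general square matrix with standard NumPy helpers. The sketch below uses an arbitrary 4-by-4 matrix `M` chosen for illustration.

```python
import numpy as np

M = np.arange(1.0, 17.0).reshape(4, 4)    # arbitrary 4-by-4 example matrix

diagonal = np.diag(np.diag(M))            # keep only the main diagonal
upper = np.triu(M)                        # zero out entries below the diagonal
lower = np.tril(M)                        # zero out entries above the diagonal
# Tri-diagonal: keep the main diagonal and its two neighbouring diagonals.
tridiagonal = np.triu(np.tril(M, k=1), k=-1)
identity = np.eye(4)

print(tridiagonal)
```

Multiplying by `identity` leaves `M` unchanged, and `upper + lower - diagonal` recovers `M` because the diagonal is counted twice in the sum.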
Matrix factorization: a key method for designing recommender systems
Eigenvalues and Eigenvectors
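◼ Eigenvalue problem (one of the most important problems in linear algebra):
If $\mathbf{A}$ is an $n \times n$ matrix, do there exist nonzero vectors $\mathbf{x}$ in $\mathbb{R}^n$ such that $\mathbf{A}\mathbf{x}$ is a scalar multiple of $\mathbf{x}$?
(The term eigenvalue is from the German word Eigenwert, meaning “proper value”)
◼ Eigenvalue and eigenvector:
$\mathbf{A}$: an $n \times n$ matrix
$\lambda$: a scalar (could be zero)
$\mathbf{x}$: a nonzero vector in $\mathbb{R}^n$
$$\mathbf{A}\mathbf{x} = \lambda\mathbf{x}$$
where $\lambda$ is the eigenvalue and $\mathbf{x}$ is the eigenvector.
※ Geometric interpretation: $\mathbf{A}\mathbf{x} = \lambda\mathbf{x}$ lies on the same line through the origin as $\mathbf{x}$.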
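The defining relation can be checked numerically. The sketch below uses `np.linalg.eig` on a small symmetric matrix chosen for illustration (its eigenvalues are 1 and 3) and measures how far each returned pair is from satisfying $\mathbf{A}\mathbf{x} = \lambda\mathbf{x}$.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])    # example matrix with eigenvalues 1 and 3

# eig returns the eigenvalues and a matrix whose COLUMNS are eigenvectors.
eigenvalues, eigenvectors = np.linalg.eig(A)

# Residual norm of A x - lambda x for every returned pair.
residuals = [
    np.linalg.norm(A @ eigenvectors[:, i] - eigenvalues[i] * eigenvectors[:, i])
    for i in range(len(eigenvalues))
]
print(np.sort(eigenvalues), max(residuals))
```

The order in which `eig` returns the eigenvalues is not guaranteed, so they are sorted before being compared or printed.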
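Trace & Determinant
◼ If $\mathbf{A}$ is a square $n$-by-$n$ matrix and if $\lambda_1, \cdots, \lambda_n$ are the eigenvalues of $\mathbf{A}$, then
◼ $\operatorname{tr}\mathbf{A} = A_{11} + A_{22} + \cdots + A_{nn} = \sum_{i=1}^{n} A_{ii} = \sum_{i=1}^{n} \lambda_i$
◼ $\det\mathbf{A} = \sum_{i=1}^{n} (-1)^{i+j} a_{i,j} M_{i,j} = \prod_{i=1}^{n} \lambda_i$ (expansion along any fixed column $j$, with $a_{i,j}$ the $(i, j)$ entry of $\mathbf{A}$)
◼ Minor $M_{i,j}$:
the determinant of the $(n-1) \times (n-1)$ matrix that results from $\mathbf{A}$ by removing the $i$th row and the $j$th column.
◼ Cofactor $C_{i,j} = (-1)^{i+j} M_{i,j}$
◼ Cofactor matrix:
$$\mathbf{C} = \begin{pmatrix} C_{11} & C_{12} & \cdots & C_{1n} \\ C_{21} & C_{22} & \cdots & C_{2n} \\ \vdots & & \ddots & \vdots \\ C_{n1} & C_{n2} & \cdots & C_{nn} \end{pmatrix}, \qquad \mathbf{A}^{-1} = \frac{\mathbf{C}^T}{\det\mathbf{A}}$$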
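These identities can be verified on a small example. The sketch below builds the cofactor matrix directly from the definition of the minor; the helper names `minor` and `cofactor_matrix` and the 3-by-3 matrix are assumptions made for illustration, and the construction is meant for checking the formulas, not for efficient inversion.

```python
import numpy as np

def minor(A, i, j):
    """Determinant of A with row i and column j removed (0-based indices)."""
    sub = np.delete(np.delete(A, i, axis=0), j, axis=1)
    return np.linalg.det(sub)

def cofactor_matrix(A):
    """Matrix C with entries C[i, j] = (-1)**(i + j) * minor(A, i, j)."""
    n = A.shape[0]
    return np.array([[(-1) ** (i + j) * minor(A, i, j) for j in range(n)]
                     for i in range(n)])

A = np.array([[2.0, 0.0, 1.0],
              [1.0, 3.0, 2.0],
              [1.0, 1.0, 4.0]])
lam = np.linalg.eigvals(A)
C = cofactor_matrix(A)

trace_matches = bool(np.isclose(np.trace(A), lam.sum().real))
det_matches = bool(np.isclose(np.linalg.det(A), np.prod(lam).real))
# Cofactor expansion along the first column: sum_i a_{i,0} * C_{i,0}
det_by_expansion = sum(A[i, 0] * C[i, 0] for i in range(3))
inverse_matches = np.allclose(C.T / np.linalg.det(A), np.linalg.inv(A))
print(trace_matches, det_matches, det_by_expansion, inverse_matches)
```

The `.real` calls only discard rounding-level imaginary parts that can appear when eigenvalues come in complex-conjugate pairs.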
The Eigen-decomposition
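The slide's formula is not reproduced in this text. Assuming the standard form $\mathbf{A} = \mathbf{V}\boldsymbol{\Lambda}\mathbf{V}^{-1}$ for a diagonalizable matrix, and $\mathbf{S} = \mathbf{Q}\boldsymbol{\Lambda}\mathbf{Q}^T$ for a symmetric one, a minimal NumPy sketch is shown below. Both example matrices are made up for illustration.

```python
import numpy as np

# General diagonalizable matrix: A = V diag(lam) V^{-1}
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])    # eigenvalues 2 and 5
lam, V = np.linalg.eig(A)
A_rebuilt = V @ np.diag(lam) @ np.linalg.inv(V)

# Symmetric matrix: eigenvectors can be chosen orthonormal, so S = Q diag(w) Q^T
S = np.array([[2.0, 1.0],
              [1.0, 2.0]])
w, Q = np.linalg.eigh(S)      # eigh is intended for symmetric/Hermitian input
S_rebuilt = Q @ np.diag(w) @ Q.T

print(np.allclose(A_rebuilt, A), np.allclose(S_rebuilt, S))
```

For symmetric input, `eigh` returns eigenvalues in ascending order and an orthogonal `Q`, which avoids the explicit matrix inverse.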
Matrix calculus is a specialized notation for doing
multivariable calculus, especially over spaces of matrices.
Slides from: Fin500J Mathematical Foundations in Finance, Philip H. Dybvig
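1.1 Derivative of Vector with Respect to Vector
For $\mathbf{y} = (y_1, \ldots, y_m)^T$ a function of $\mathbf{x} = (x_1, \ldots, x_n)^T$, the derivative is the $n \times m$ matrix whose $(i, j)$ entry is $\partial y_j / \partial x_i$:
$$\frac{\partial \mathbf{y}}{\partial \mathbf{x}} = \begin{pmatrix} \dfrac{\partial y_1}{\partial x_1} & \dfrac{\partial y_2}{\partial x_1} & \cdots & \dfrac{\partial y_m}{\partial x_1} \\ \vdots & \vdots & \ddots & \vdots \\ \dfrac{\partial y_1}{\partial x_n} & \dfrac{\partial y_2}{\partial x_n} & \cdots & \dfrac{\partial y_m}{\partial x_n} \end{pmatrix}$$
This is the tangent matrix, often referred to as the Jacobian matrix.
1.2 Derivative of a Scalar with Respect to Vector
If $y$ is a scalar,
$$\frac{\partial y}{\partial \mathbf{x}} = \begin{pmatrix} \dfrac{\partial y}{\partial x_1} \\ \vdots \\ \dfrac{\partial y}{\partial x_n} \end{pmatrix}$$
It is also called the gradient of $y$ with respect to a vector variable $\mathbf{x}$, denoted by $\nabla y$.
1.3 Derivative of Vector with Respect to Scalar
$$\frac{\partial \mathbf{y}}{\partial x} = \begin{pmatrix} \dfrac{\partial y_1}{\partial x} & \dfrac{\partial y_2}{\partial x} & \cdots & \dfrac{\partial y_m}{\partial x} \end{pmatrix}$$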
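The arrangement used in these sections, with rows following $\mathbf{x}$ and columns following $\mathbf{y}$ as in the matrices of Example 2, can be sketched with central finite differences. The helper `derivative_matrix`, the step size `h`, and the test function `y` are hypothetical and chosen only to illustrate the shapes.

```python
import numpy as np

def derivative_matrix(f, x, h=1e-6):
    """Finite-difference estimate of dy/dx with entry (i, j) = d y_j / d x_i.

    Rows follow x and columns follow y, so the result is n-by-m. This is the
    transpose of the m-by-n Jacobian arrangement used in many textbooks.
    """
    x = np.asarray(x, dtype=float)
    rows = []
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = h
        # central difference of every component of y with respect to x_i
        rows.append((np.atleast_1d(f(x + step)) - np.atleast_1d(f(x - step)))
                    / (2 * h))
    return np.array(rows)

# Hypothetical vector function with n = 3 inputs and m = 2 outputs.
def y(x):
    return np.array([x[0] * x[1], np.sin(x[2])])

x0 = np.array([1.0, 2.0, 0.5])
dy_dx = derivative_matrix(y, x0)                 # shape (3, 2)

# Scalar function: the result is an n-by-1 column, i.e. the gradient.
grad = derivative_matrix(lambda x: x @ x, x0)    # approximately 2 * x0
print(dy_dx.shape, grad.shape)
```

Finite differences are only approximate; they are used here because they make the index convention explicit without relying on an automatic differentiation library.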
Can we “see” the gradients?
A nice YouTube video is here:
https://www.youtube.com/watch?v=W6aDzrrLAzQ
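Example 1
Given $\mathbf{y} = (y_1, y_2)^T$ and $\mathbf{x} = (x_1, x_2, x_3)^T$, the derivative matrix used again in Example 2 is
$$\frac{\partial \mathbf{y}}{\partial \mathbf{x}} = \begin{pmatrix} 2x_1 & 0 \\ -1 & 3 \\ 0 & 2x_3 \end{pmatrix}$$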
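Some useful vector derivative formulas
$$\mathbf{C}\mathbf{x} = \begin{pmatrix} C_{11} & C_{12} & \cdots & C_{1n} \\ C_{21} & C_{22} & \cdots & C_{2n} \\ \vdots & & \ddots & \vdots \\ C_{n1} & C_{n2} & \cdots & C_{nn} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} = \begin{pmatrix} \sum_{t=1}^{n} x_t C_{1t} \\ \sum_{t=1}^{n} x_t C_{2t} \\ \vdots \\ \sum_{t=1}^{n} x_t C_{nt} \end{pmatrix}$$
$$\frac{\partial (\mathbf{C}\mathbf{x})}{\partial \mathbf{x}} = \mathbf{C}^T, \qquad \frac{\partial (\mathbf{x}^T\mathbf{C})}{\partial \mathbf{x}} = \mathbf{C}, \qquad \frac{\partial (\mathbf{x}^T\mathbf{x})}{\partial \mathbf{x}} = 2\mathbf{x}$$
Homework: show that
$$\frac{\partial (\mathbf{C}\mathbf{x})}{\partial \mathbf{x}} = \begin{pmatrix} c_{11} & c_{21} & \cdots & c_{n1} \\ c_{12} & c_{22} & \cdots & c_{n2} \\ \vdots & & \ddots & \vdots \\ c_{1n} & c_{2n} & \cdots & c_{nn} \end{pmatrix} = \mathbf{C}^T$$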
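The three formulas can be checked numerically with central differences. In the sketch below, entry $(i, j)$ of each result approximates the derivative of output component $j$ with respect to $x_i$, which is the arrangement under which $\partial(\mathbf{C}\mathbf{x})/\partial\mathbf{x} = \mathbf{C}^T$. The helper `derivative_matrix`, the random matrix, and the seed are assumptions made for illustration.

```python
import numpy as np

def derivative_matrix(f, x, h=1e-6):
    """Central differences; entry (i, j) approximates d f_j / d x_i."""
    rows = []
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = h
        rows.append((np.atleast_1d(f(x + step)) - np.atleast_1d(f(x - step)))
                    / (2 * h))
    return np.array(rows)

rng = np.random.default_rng(1)
C = rng.standard_normal((3, 3))
x = rng.standard_normal(3)

d_Cx = derivative_matrix(lambda v: C @ v, x)     # expected: C^T
d_xTC = derivative_matrix(lambda v: v @ C, x)    # expected: C
d_xTx = derivative_matrix(lambda v: v @ v, x)    # expected: 2x as a column

print(np.allclose(d_Cx, C.T, atol=1e-6),
      np.allclose(d_xTC, C, atol=1e-6),
      np.allclose(d_xTx[:, 0], 2 * x, atol=1e-6))
```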
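Important Property of Quadratic Form $\mathbf{x}^T\mathbf{C}\mathbf{x}$
$$\frac{\partial (\mathbf{x}^T\mathbf{C}\mathbf{x})}{\partial \mathbf{x}} = (\mathbf{C} + \mathbf{C}^T)\mathbf{x}$$
Proof: Since
$$\mathbf{C}\mathbf{x} = \begin{pmatrix} C_{11} & C_{12} & \cdots & C_{1n} \\ C_{21} & C_{22} & \cdots & C_{2n} \\ \vdots & & \ddots & \vdots \\ C_{n1} & C_{n2} & \cdots & C_{nn} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} = \begin{pmatrix} \sum_{t=1}^{n} x_t C_{1t} \\ \sum_{t=1}^{n} x_t C_{2t} \\ \vdots \\ \sum_{t=1}^{n} x_t C_{nt} \end{pmatrix}$$
we have
$$\mathbf{x}^T\mathbf{C}\mathbf{x} = \sum_{i=1}^{n} x_i \Big( \sum_{j=1}^{n} x_j C_{ij} \Big)$$
The variable $x_k$ appears once as the left factor ($i = k$) and once as the right factor ($j = k$). Differentiating each occurrence in turn, with the other factor held fixed, gives
$$\frac{\partial (\mathbf{x}^T\mathbf{C}\mathbf{x})}{\partial x_k} = \frac{\partial \big[ \sum_{i=1}^{n} x_i ( \sum_{j=1}^{n} x_j C_{ij} ) \big]}{\partial x_k} = \frac{\partial \big[ x_k ( \sum_{j=1}^{n} x_j C_{kj} ) \big]}{\partial x_k} + \frac{\partial \big[ \sum_{i=1}^{n} x_i x_k C_{ik} \big]}{\partial x_k} = \sum_{j=1}^{n} x_j C_{kj} + \sum_{i=1}^{n} x_i C_{ik}$$
where in the two middle terms only the explicitly displayed factor $x_k$ is differentiated. Stacking the components $k = 1, \ldots, n$:
$$\frac{\partial (\mathbf{x}^T\mathbf{C}\mathbf{x})}{\partial \mathbf{x}} = \mathbf{C}\mathbf{x} + \mathbf{C}^T\mathbf{x} = (\mathbf{C} + \mathbf{C}^T)\mathbf{x}$$
If $\mathbf{C}$ is symmetric,
$$\frac{\partial (\mathbf{x}^T\mathbf{C}\mathbf{x})}{\partial \mathbf{x}} = 2\mathbf{C}\mathbf{x}$$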
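The quadratic-form result can be confirmed numerically by comparing a finite-difference gradient against $(\mathbf{C} + \mathbf{C}^T)\mathbf{x}$. The helper `numerical_gradient`, the random matrix, and the seed below are assumptions made for illustration.

```python
import numpy as np

def numerical_gradient(f, x, h=1e-6):
    """Central-difference gradient of a scalar function f at x."""
    grad = np.zeros_like(x)
    for k in range(x.size):
        step = np.zeros_like(x)
        step[k] = h
        grad[k] = (f(x + step) - f(x - step)) / (2 * h)
    return grad

rng = np.random.default_rng(2)
C = rng.standard_normal((4, 4))       # deliberately not symmetric
x = rng.standard_normal(4)

numeric = numerical_gradient(lambda v: v @ C @ v, x)
analytic = (C + C.T) @ x

S = C + C.T                           # a symmetric matrix built from C
numeric_sym = numerical_gradient(lambda v: v @ S @ v, x)
analytic_sym = 2 * S @ x              # symmetric special case: 2 S x

print(np.allclose(numeric, analytic, atol=1e-5),
      np.allclose(numeric_sym, analytic_sym, atol=1e-5))
```

Using a non-symmetric `C` matters here: with a symmetric matrix the incorrect shortcut $2\mathbf{C}\mathbf{x}$ would also pass the first check.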
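2 The Chain Rule for Vector Functions
Let
$$\mathbf{x} = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}, \qquad \mathbf{y} = \begin{pmatrix} y_1 \\ \vdots \\ y_r \end{pmatrix}, \qquad \mathbf{z} = \begin{pmatrix} z_1 \\ \vdots \\ z_m \end{pmatrix}$$
where $\mathbf{z}$ is a function of $\mathbf{y}$, which is in turn a function of $\mathbf{x}$, we can write
$$\left( \frac{\partial \mathbf{z}}{\partial \mathbf{x}} \right)^T = \begin{pmatrix} \dfrac{\partial z_1}{\partial x_1} & \cdots & \dfrac{\partial z_1}{\partial x_n} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial z_m}{\partial x_1} & \cdots & \dfrac{\partial z_m}{\partial x_n} \end{pmatrix}$$
Each entry of this matrix may be expanded as
$$\frac{\partial z_i}{\partial x_j} = \sum_{q=1}^{r} \frac{\partial z_i}{\partial y_q} \frac{\partial y_q}{\partial x_j}$$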
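The Chain Rule for Vector Functions (Cont.)
Then
$$\left( \frac{\partial \mathbf{z}}{\partial \mathbf{x}} \right)^T = \left( \frac{\partial \mathbf{z}}{\partial \mathbf{y}} \right)^T \left( \frac{\partial \mathbf{y}}{\partial \mathbf{x}} \right)^T = \left( \frac{\partial \mathbf{y}}{\partial \mathbf{x}} \frac{\partial \mathbf{z}}{\partial \mathbf{y}} \right)^T$$
On transposing both sides, we finally obtain
$$\frac{\partial \mathbf{z}}{\partial \mathbf{x}} = \frac{\partial \mathbf{y}}{\partial \mathbf{x}} \frac{\partial \mathbf{z}}{\partial \mathbf{y}}$$
This is the chain rule for vectors. Unlike the conventional chain rule of calculus, the chain of matrices builds toward the left.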
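Example 2
x, y are as in Example 1 and z is a function of y defined as
$$\mathbf{z} = \begin{pmatrix} z_1 \\ z_2 \\ z_3 \\ z_4 \end{pmatrix}, \quad \text{and} \quad \begin{cases} z_1 = y_1^2 - 2y_2 \\ z_2 = y_2^2 - y_1 \\ z_3 = y_1^2 + y_2^2 \\ z_4 = 2y_1 + y_2 \end{cases}$$
we have
$$\frac{\partial \mathbf{z}}{\partial \mathbf{y}} = \begin{pmatrix} \dfrac{\partial z_1}{\partial y_1} & \dfrac{\partial z_2}{\partial y_1} & \dfrac{\partial z_3}{\partial y_1} & \dfrac{\partial z_4}{\partial y_1} \\ \dfrac{\partial z_1}{\partial y_2} & \dfrac{\partial z_2}{\partial y_2} & \dfrac{\partial z_3}{\partial y_2} & \dfrac{\partial z_4}{\partial y_2} \end{pmatrix} = \begin{pmatrix} 2y_1 & -1 & 2y_1 & 2 \\ -2 & 2y_2 & 2y_2 & 1 \end{pmatrix}.$$
Therefore,
$$\frac{\partial \mathbf{z}}{\partial \mathbf{x}} = \frac{\partial \mathbf{y}}{\partial \mathbf{x}} \frac{\partial \mathbf{z}}{\partial \mathbf{y}} = \begin{pmatrix} 2x_1 & 0 \\ -1 & 3 \\ 0 & 2x_3 \end{pmatrix} \begin{pmatrix} 2y_1 & -1 & 2y_1 & 2 \\ -2 & 2y_2 & 2y_2 & 1 \end{pmatrix} = \begin{pmatrix} 4x_1 y_1 & -2x_1 & 4x_1 y_1 & 4x_1 \\ -2y_1 - 6 & 1 + 6y_2 & -2y_1 + 6y_2 & 1 \\ -4x_3 & 4x_3 y_2 & 4x_3 y_2 & 2x_3 \end{pmatrix}$$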
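The product in Example 2 can be checked numerically. The definition of $\mathbf{y}$ from Example 1 is not reproduced in this text, so the sketch below assumes $y_1 = x_1^2 - x_2$ and $y_2 = x_3^2 + 3x_2$, one choice that yields the $\partial\mathbf{y}/\partial\mathbf{x}$ matrix used in Example 2. The evaluation point `x0` is arbitrary.

```python
import numpy as np

def y_of_x(x):
    # ASSUMED definition of y, chosen so that dy/dx matches Example 2.
    return np.array([x[0] ** 2 - x[1], x[2] ** 2 + 3 * x[1]])

def z_of_y(y):
    return np.array([y[0] ** 2 - 2 * y[1],
                     y[1] ** 2 - y[0],
                     y[0] ** 2 + y[1] ** 2,
                     2 * y[0] + y[1]])

def dy_dx(x):
    # 3-by-2: row i holds the derivatives with respect to x_i
    return np.array([[2 * x[0], 0.0],
                     [-1.0, 3.0],
                     [0.0, 2 * x[2]]])

def dz_dy(y):
    # 2-by-4: row i holds the derivatives with respect to y_i
    return np.array([[2 * y[0], -1.0, 2 * y[0], 2.0],
                     [-2.0, 2 * y[1], 2 * y[1], 1.0]])

x0 = np.array([1.0, 2.0, 3.0])
y0 = y_of_x(x0)
dz_dx = dy_dx(x0) @ dz_dy(y0)    # chain rule: dy/dx multiplies dz/dy from the left

# Finite-difference check; entry (i, j) approximates d z_j / d x_i.
h = 1e-6
numeric = np.array([
    (z_of_y(y_of_x(x0 + h * e)) - z_of_y(y_of_x(x0 - h * e))) / (2 * h)
    for e in np.eye(3)
])
print(np.allclose(dz_dx, numeric, atol=1e-4))
```

The second row of `dz_dx` corresponds to $(-2y_1 - 6,\ 1 + 6y_2,\ -2y_1 + 6y_2,\ 1)$ evaluated at `y0`.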
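List of Differentiation
• Result of differentiating various kinds of aggregates with respect to other kinds of aggregates.
• This table and the derivative formulas that follow it use the numerator layout, the transpose of the arrangement in the earlier sections (where $\partial(\mathbf{C}\mathbf{x})/\partial\mathbf{x} = \mathbf{C}^T$).

                        Scalar 𝑦                          Vector 𝐲 (size 𝑚)                     Matrix 𝐘 (size 𝑚 × 𝑛)
                        Notation   Type                   Notation   Type                      Notation   Type
Scalar 𝑥                ∂𝑦/∂𝑥      scalar                 ∂𝐲/∂𝑥      size-𝑚 column vector      ∂𝐘/∂𝑥      𝑚 × 𝑛 matrix
Vector 𝐱 (size 𝑛)       ∂𝑦/∂𝐱      size-𝑛 row vector      ∂𝐲/∂𝐱      𝑚 × 𝑛 matrix              ∂𝐘/∂𝐱      −
Matrix 𝐗 (size 𝑝 × 𝑞)   ∂𝑦/∂𝐗      𝑞 × 𝑝 matrix           ∂𝐲/∂𝐗      −                         ∂𝐘/∂𝐗      −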
Derivative Formulas
𝐲            ∂𝐲/∂𝐱
𝐀𝐱           𝐀
𝐱ᵀ𝐀          𝐀ᵀ
𝐱ᵀ𝐱          2𝐱ᵀ
𝐱ᵀ𝐀𝐱         𝐱ᵀ𝐀 + 𝐱ᵀ𝐀ᵀ
• Hint: Derive 𝐱
• If you have to differentiate 𝐱 𝑇 , transpose the rest.
• If you have two 𝐱-terms, differentiate them separately, in turn, and then sum up the two derivatives.
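As a worked illustration of the two hints, here is how they produce the last row of the formula table. This is a sketch of the reasoning under the table's own convention (the derivative of $\mathbf{A}\mathbf{x}$ is $\mathbf{A}$), not a formal proof.

```latex
\begin{aligned}
\frac{\partial\,(\mathbf{x}^T \mathbf{A}\,\mathbf{x})}{\partial \mathbf{x}}
&= \underbrace{\mathbf{x}^T\mathbf{A}}_{\text{right } \mathbf{x}\text{: treat } \mathbf{x}^T\mathbf{A} \text{ as a constant row}}
 \;+\; \underbrace{(\mathbf{A}\mathbf{x})^T}_{\text{left } \mathbf{x}^T\text{: transpose the rest}} \\
&= \mathbf{x}^T\mathbf{A} + \mathbf{x}^T\mathbf{A}^T
 \;=\; \mathbf{x}^T(\mathbf{A} + \mathbf{A}^T).
\end{aligned}
```

Transposing this row vector gives $(\mathbf{A} + \mathbf{A}^T)\mathbf{x}$, which agrees with the quadratic-form property stated earlier in column form.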