Matrix Differentiation Rules and Application

1) Linear fitting finds a linear function that minimizes the error between predicted and actual values of data points. 2) The linear fitting problem can be expressed compactly as minimizing the squared error between the matrix product of the data matrix and coefficients vector, and the actual values vector. 3) Taking the derivative of the error function with respect to the coefficients and setting it equal to zero yields the solution, but doing so using the explicit expansion is tedious. Matrix derivatives provide a more efficient approach.

Uploaded by

Doru Irimescu


Matrix Differentiation

CS5240 Theoretical Foundations in Multimedia

Leow Wee Kheng

Department of Computer Science

School of Computing
National University of Singapore

Leow Wee Kheng (NUS) Matrix Differentiation 1 / 36

Linear Fitting Revisited

Linear fitting solves this problem:
Given $n$ data points $\mathbf{p}_i = [x_{i1} \cdots x_{im}]^\top$, $1 \le i \le n$, and their corresponding values $v_i$, find a linear function $f$ that minimizes the error

$$E = \sum_{i=1}^{n} \bigl(f(\mathbf{p}_i) - v_i\bigr)^2. \tag{1}$$

The linear function $f(\mathbf{p})$ has the form

$$f(\mathbf{p}) = f(x_1, \ldots, x_m) = a_1 x_1 + \cdots + a_m x_m + a_{m+1}. \tag{2}$$

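The data points are organized into a matrix equation

$$\mathbf{D}\,\mathbf{a} = \mathbf{v}, \tag{3}$$

where

$$\mathbf{D} = \begin{bmatrix} x_{11} & \cdots & x_{1m} & 1 \\ \vdots & \ddots & \vdots & \vdots \\ x_{n1} & \cdots & x_{nm} & 1 \end{bmatrix}, \quad
\mathbf{a} = \begin{bmatrix} a_1 \\ \vdots \\ a_m \\ a_{m+1} \end{bmatrix}, \quad \text{and} \quad
\mathbf{v} = \begin{bmatrix} v_1 \\ \vdots \\ v_n \end{bmatrix}. \tag{4}$$

The least-squares solution of Eq. 3 is

$$\mathbf{a} = (\mathbf{D}^\top \mathbf{D})^{-1} \mathbf{D}^\top \mathbf{v}. \tag{5}$$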
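As a rough illustration of Eq. 5, the sketch below fits a line $f(x) = a_1 x + a_2$ to four made-up data points. The data values and the helper names (`transpose`, `matmul`, `solve_2x2`) are assumptions for this example only. It solves the normal equations $\mathbf{D}^\top\mathbf{D}\,\mathbf{a} = \mathbf{D}^\top\mathbf{v}$ directly instead of forming the inverse, which gives the same $\mathbf{a}$ when $\mathbf{D}^\top\mathbf{D}$ is invertible.

```python
# Minimal sketch: least-squares line fit via Eq. 5 in pure Python.
# Data and helper names are illustrative, not from the slides.

def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def solve_2x2(M, b):
    """Solve M z = b for a 2x2 matrix M using Cramer's rule."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    if det == 0:
        raise ValueError("D^T D is singular; Eq. 5 does not apply")
    z0 = (b[0] * M[1][1] - M[0][1] * b[1]) / det
    z1 = (M[0][0] * b[1] - M[1][0] * b[0]) / det
    return [z0, z1]

xs = [0.0, 1.0, 2.0, 3.0]            # made-up inputs (m = 1)
vs = [1.0, 3.0, 4.0, 8.0]            # made-up target values

D = [[x, 1.0] for x in xs]           # each row is [x_i1, 1], as in Eq. 4
v = [[val] for val in vs]            # column vector

Dt = transpose(D)
DtD = matmul(Dt, D)                  # 2 x 2 matrix D^T D
Dtv = [row[0] for row in matmul(Dt, v)]

a = solve_2x2(DtD, Dtv)              # a = [a_1, a_2]
print(a)
```

For this data set the fitted coefficients come out to approximately $a_1 = 2.2$ and $a_2 = 0.7$.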

Denote each row of $\mathbf{D}$ as $\mathbf{d}_i^\top$. Then,

$$E = \sum_{i=1}^{n} (\mathbf{d}_i^\top \mathbf{a} - v_i)^2 = \|\mathbf{D}\mathbf{a} - \mathbf{v}\|^2. \tag{6}$$

So, the linear least squares problem can be described very compactly as

$$\min_{\mathbf{a}} \|\mathbf{D}\mathbf{a} - \mathbf{v}\|^2. \tag{7}$$

To show that the solution in Eq. 5 minimizes the error $E$, we need to differentiate $E$ with respect to $\mathbf{a}$ and set the derivative to zero:

$$\frac{dE}{d\mathbf{a}} = \mathbf{0}. \tag{8}$$

How to do this differentiation?

The obvious (but hard) way:

$$E = \sum_{i=1}^{n} \Bigl( \sum_{j=1}^{m} a_j x_{ij} + a_{m+1} - v_i \Bigr)^2. \tag{9}$$

Expanding the equation explicitly gives

$$\frac{\partial E}{\partial a_k} =
\begin{cases}
\displaystyle 2 \sum_{i=1}^{n} \Bigl( \sum_{j=1}^{m} a_j x_{ij} + a_{m+1} - v_i \Bigr) x_{ik}, & \text{for } k \neq m+1, \\[2ex]
\displaystyle 2 \sum_{i=1}^{n} \Bigl( \sum_{j=1}^{m} a_j x_{ij} + a_{m+1} - v_i \Bigr), & \text{for } k = m+1.
\end{cases}$$

Then, set $\partial E/\partial a_k = 0$ and solve for $a_k$.

This is slow, tedious and error prone!
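To make the explicit formula concrete, the following sketch evaluates the two cases of $\partial E/\partial a_k$ by direct summation and compares them with central finite differences of Eq. 9. The data set, the coefficient vector and all function names are assumptions chosen for illustration.

```python
# Sketch: explicit partial derivatives of E versus finite differences.
# X, v, a and the function names are illustrative assumptions.

def error(a, X, v):
    """E = sum_i (sum_j a_j x_ij + a_{m+1} - v_i)^2, as in Eq. 9."""
    m = len(a) - 1
    return sum((sum(a[j] * x[j] for j in range(m)) + a[m] - vi) ** 2
               for x, vi in zip(X, v))

def explicit_partials(a, X, v):
    """dE/da_k from the case-by-case formula."""
    m = len(a) - 1
    residuals = [sum(a[j] * x[j] for j in range(m)) + a[m] - vi
                 for x, vi in zip(X, v)]
    # Case k != m + 1: each residual is weighted by x_ik.
    grads = [2 * sum(r * x[k] for r, x in zip(residuals, X))
             for k in range(m)]
    # Case k = m + 1: plain sum of residuals.
    grads.append(2 * sum(residuals))
    return grads

def numeric_partials(a, X, v, h=1e-6):
    """Central finite differences of E, one coefficient at a time."""
    grads = []
    for k in range(len(a)):
        up = a[:k] + [a[k] + h] + a[k + 1:]
        down = a[:k] + [a[k] - h] + a[k + 1:]
        grads.append((error(up, X, v) - error(down, X, v)) / (2 * h))
    return grads

X = [[1.0, 2.0], [0.5, -1.0], [3.0, 0.0], [2.0, 2.0]]   # n = 4, m = 2
v = [1.0, 0.0, 2.0, 3.0]
a = [0.3, -0.2, 0.1]                                     # a_1, a_2, a_3

g_explicit = explicit_partials(a, X, v)
g_numeric = numeric_partials(a, X, v)
print(g_explicit)
print(g_numeric)
```

The two lists should agree to several decimal places; a mismatch usually points to an indexing slip in the hand expansion.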

Which one would you like to be? [images not recovered]

At least like these? [images not recovered]

Matrix Derivatives

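There are 6 common types of matrix derivatives. Columns give the type of the quantity being differentiated; rows give the type of the variable.

$$\begin{array}{l|ccc}
\text{Type} & \text{Scalar} & \text{Vector} & \text{Matrix} \\ \hline
\text{Scalar} & \dfrac{\partial y}{\partial x} & \dfrac{\partial \mathbf{y}}{\partial x} & \dfrac{\partial \mathbf{Y}}{\partial x} \\[2ex]
\text{Vector} & \dfrac{\partial y}{\partial \mathbf{x}} & \dfrac{\partial \mathbf{y}}{\partial \mathbf{x}} & \\[2ex]
\text{Matrix} & \dfrac{\partial y}{\partial \mathbf{X}} & &
\end{array}$$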

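Derivatives by Scalar

Numerator layout notation:

$$\frac{\partial y}{\partial x}, \qquad
\frac{\partial \mathbf{y}}{\partial x} = \begin{bmatrix} \dfrac{\partial y_1}{\partial x} \\ \vdots \\ \dfrac{\partial y_m}{\partial x} \end{bmatrix}, \qquad
\frac{\partial \mathbf{Y}}{\partial x} = \begin{bmatrix} \dfrac{\partial y_{11}}{\partial x} & \cdots & \dfrac{\partial y_{1n}}{\partial x} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial y_{m1}}{\partial x} & \cdots & \dfrac{\partial y_{mn}}{\partial x} \end{bmatrix}.$$

Denominator layout notation:

$$\frac{\partial y}{\partial x}, \qquad
\frac{\partial \mathbf{y}}{\partial x} = \begin{bmatrix} \dfrac{\partial y_1}{\partial x} & \cdots & \dfrac{\partial y_m}{\partial x} \end{bmatrix} \equiv \frac{\partial \mathbf{y}^\top}{\partial x}.$$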

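Derivatives by Vector

Numerator layout notation:

$$\frac{\partial y}{\partial \mathbf{x}} = \begin{bmatrix} \dfrac{\partial y}{\partial x_1} & \cdots & \dfrac{\partial y}{\partial x_n} \end{bmatrix}, \qquad
\frac{\partial \mathbf{y}}{\partial \mathbf{x}} = \begin{bmatrix} \dfrac{\partial y_1}{\partial x_1} & \cdots & \dfrac{\partial y_1}{\partial x_n} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial y_m}{\partial x_1} & \cdots & \dfrac{\partial y_m}{\partial x_n} \end{bmatrix} \equiv \frac{\partial \mathbf{y}}{\partial \mathbf{x}^\top}.$$

Denominator layout notation:

$$\frac{\partial y}{\partial \mathbf{x}} = \begin{bmatrix} \dfrac{\partial y}{\partial x_1} \\ \vdots \\ \dfrac{\partial y}{\partial x_n} \end{bmatrix}, \qquad
\frac{\partial \mathbf{y}}{\partial \mathbf{x}} = \begin{bmatrix} \dfrac{\partial y_1}{\partial x_1} & \cdots & \dfrac{\partial y_m}{\partial x_1} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial y_1}{\partial x_n} & \cdots & \dfrac{\partial y_m}{\partial x_n} \end{bmatrix} \equiv \frac{\partial \mathbf{y}^\top}{\partial \mathbf{x}}.$$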
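The difference between the two layouts is easiest to see from the shapes. The sketch below approximates $\partial\mathbf{y}/\partial\mathbf{x}$ for a made-up function $\mathbf{y}: \mathbb{R}^2 \to \mathbb{R}^3$ with central finite differences. The function `f` and the helper `jacobian_numerator` are assumptions for this example.

```python
# Sketch: numerator layout (m x n) versus denominator layout (n x m).
# The function f and all names are illustrative assumptions.

def f(x):
    x1, x2 = x
    return [x1 * x2, x1 + 2.0 * x2, x1 ** 2]   # m = 3 outputs, n = 2 inputs

def jacobian_numerator(func, x, h=1e-6):
    """Return the m x n matrix whose entry (i, j) approximates dy_i/dx_j."""
    m = len(func(x))
    J = [[0.0] * len(x) for _ in range(m)]
    for j in range(len(x)):
        up = x[:j] + [x[j] + h] + x[j + 1:]
        down = x[:j] + [x[j] - h] + x[j + 1:]
        f_up, f_down = func(up), func(down)
        for i in range(m):
            J[i][j] = (f_up[i] - f_down[i]) / (2 * h)
    return J

x = [1.0, 2.0]
J_num = jacobian_numerator(f, x)               # 3 x 2, numerator layout
J_den = [list(col) for col in zip(*J_num)]     # 2 x 3, denominator layout

print(len(J_num), "x", len(J_num[0]))
print(len(J_den), "x", len(J_den[0]))
```

The denominator-layout result is simply the transpose of the numerator-layout result, so the entries are the same and only their arrangement changes.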

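Derivative by Matrix

Numerator layout notation:

$$\frac{\partial y}{\partial \mathbf{X}} = \begin{bmatrix} \dfrac{\partial y}{\partial x_{11}} & \cdots & \dfrac{\partial y}{\partial x_{m1}} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial y}{\partial x_{1n}} & \cdots & \dfrac{\partial y}{\partial x_{mn}} \end{bmatrix} \equiv \frac{\partial y}{\partial \mathbf{X}^\top}.$$

Denominator layout notation:

$$\frac{\partial y}{\partial \mathbf{X}} = \begin{bmatrix} \dfrac{\partial y}{\partial x_{11}} & \cdots & \dfrac{\partial y}{\partial x_{1n}} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial y}{\partial x_{m1}} & \cdots & \dfrac{\partial y}{\partial x_{mn}} \end{bmatrix} \equiv \frac{\partial y}{\partial \mathbf{X}}.$$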

Pictorial Representation

[Figure: pictorial representation of numerator layout and denominator layout; image not recovered]

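Caution
◮ Most books and papers don't state which convention they use.
◮ Reference [2] uses both conventions but clearly differentiates them:

$$\frac{\partial y}{\partial \mathbf{x}^\top} = \begin{bmatrix} \dfrac{\partial y}{\partial x_1} & \cdots & \dfrac{\partial y}{\partial x_n} \end{bmatrix}, \qquad
\frac{\partial y}{\partial \mathbf{x}} = \begin{bmatrix} \dfrac{\partial y}{\partial x_1} \\ \vdots \\ \dfrac{\partial y}{\partial x_n} \end{bmatrix},$$

$$\frac{\partial \mathbf{y}}{\partial \mathbf{x}^\top} = \begin{bmatrix} \dfrac{\partial y_1}{\partial x_1} & \cdots & \dfrac{\partial y_1}{\partial x_n} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial y_m}{\partial x_1} & \cdots & \dfrac{\partial y_m}{\partial x_n} \end{bmatrix}, \qquad
\frac{\partial \mathbf{y}^\top}{\partial \mathbf{x}} = \begin{bmatrix} \dfrac{\partial y_1}{\partial x_1} & \cdots & \dfrac{\partial y_m}{\partial x_1} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial y_1}{\partial x_n} & \cdots & \dfrac{\partial y_m}{\partial x_n} \end{bmatrix}.$$

◮ It is best not to mix the two conventions in your equations.
◮ We adopt numerator layout notation.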
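Commonly Used Derivatives

Here, scalar $a$, vector $\mathbf{a}$ and matrix $\mathbf{A}$ are not functions of the scalar $x$ or the vector $\mathbf{x}$.

(C1) $\dfrac{d\mathbf{a}}{dx} = \mathbf{0}$ (column matrix)

(C2) $\dfrac{da}{d\mathbf{x}} = \mathbf{0}^\top$ (row matrix)

(C3) $\dfrac{da}{d\mathbf{X}} = \mathbf{0}^\top$ (matrix)

(C4) $\dfrac{d\mathbf{a}}{d\mathbf{x}} = \mathbf{0}$ (matrix)

(C5) $\dfrac{d\mathbf{x}}{d\mathbf{x}} = \mathbf{I}$

(C6) $\dfrac{d\,\mathbf{a}^\top\mathbf{x}}{d\mathbf{x}} = \dfrac{d\,\mathbf{x}^\top\mathbf{a}}{d\mathbf{x}} = \mathbf{a}^\top$

(C7) $\dfrac{d\,\mathbf{x}^\top\mathbf{x}}{d\mathbf{x}} = 2\,\mathbf{x}^\top$

(C8) $\dfrac{d\,(\mathbf{x}^\top\mathbf{a})^2}{d\mathbf{x}} = 2\,\mathbf{x}^\top\mathbf{a}\,\mathbf{a}^\top$

(C9) $\dfrac{d\,\mathbf{A}\mathbf{x}}{d\mathbf{x}} = \mathbf{A}$

(C10) $\dfrac{d\,\mathbf{x}^\top\mathbf{A}}{d\mathbf{x}} = \mathbf{A}^\top$

(C11) $\dfrac{d\,\mathbf{x}^\top\mathbf{A}\mathbf{x}}{d\mathbf{x}} = \mathbf{x}^\top(\mathbf{A} + \mathbf{A}^\top)$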
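A quick numerical sanity check can catch a misremembered rule. The sketch below compares (C11) with central finite differences of $\mathbf{x}^\top\mathbf{A}\mathbf{x}$ for a deliberately non-symmetric $\mathbf{A}$. The matrix, the vector and the function names are assumptions for illustration.

```python
# Sketch: check (C11), d(x^T A x)/dx = x^T (A + A^T), numerically.
# A, x and the function names are illustrative assumptions.

def quad_form(x, A):
    """Scalar x^T A x."""
    n = len(x)
    return sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))

def c11_row(x, A):
    """Row vector x^T (A + A^T) in numerator layout."""
    n = len(x)
    return [sum(x[i] * (A[i][j] + A[j][i]) for i in range(n))
            for j in range(n)]

def numeric_gradient(x, A, h=1e-6):
    """Central finite differences of x^T A x, one component at a time."""
    grads = []
    for k in range(len(x)):
        up = x[:k] + [x[k] + h] + x[k + 1:]
        down = x[:k] + [x[k] - h] + x[k + 1:]
        grads.append((quad_form(up, A) - quad_form(down, A)) / (2 * h))
    return grads

A = [[1.0, 2.0], [3.0, 4.0]]     # not symmetric, so A + A^T differs from 2A
x = [1.0, 2.0]

analytic = c11_row(x, A)
numeric = numeric_gradient(x, A)
print(analytic)
print(numeric)
```

Using a non-symmetric $\mathbf{A}$ matters here: with a symmetric matrix the check could not distinguish $\mathbf{x}^\top(\mathbf{A} + \mathbf{A}^\top)$ from $2\,\mathbf{x}^\top\mathbf{A}$.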
Math Notation

We represent a vector $\mathbf{x}$ as a column matrix

$$\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_m \end{bmatrix}.$$

Its transpose $\mathbf{x}^\top$ is a row matrix

$$\mathbf{x}^\top = \begin{bmatrix} x_1 & x_2 & \cdots & x_m \end{bmatrix}.$$

Consider two vectors $\mathbf{x}$ and $\mathbf{y}$ with the same number of components.

Their inner product $\mathbf{x}^\top\mathbf{y}$ is actually a $1 \times 1$ matrix:

$$\mathbf{x}^\top\mathbf{y} = [\,s\,], \qquad \text{where} \quad s = \sum_{i=1}^{m} x_i y_i.$$

For notational convenience, we usually drop the matrix and regard the inner product as a scalar, i.e.,

$$\mathbf{x}^\top\mathbf{y} = \sum_{i=1}^{m} x_i y_i.$$

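Derivatives of Scalar by Scalar

(SS1) $\dfrac{\partial (u + v)}{\partial x} = \dfrac{\partial u}{\partial x} + \dfrac{\partial v}{\partial x}$

(SS2) $\dfrac{\partial\, uv}{\partial x} = u\,\dfrac{\partial v}{\partial x} + v\,\dfrac{\partial u}{\partial x}$ (product rule)

(SS3) $\dfrac{\partial g(u)}{\partial x} = \dfrac{\partial g(u)}{\partial u}\,\dfrac{\partial u}{\partial x}$ (chain rule)

(SS4) $\dfrac{\partial f(g(u))}{\partial x} = \dfrac{\partial f(g)}{\partial g}\,\dfrac{\partial g(u)}{\partial u}\,\dfrac{\partial u}{\partial x}$ (chain rule)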

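Derivatives of Vector by Scalar

(VS1) $\dfrac{\partial\, a\mathbf{u}}{\partial x} = a\,\dfrac{\partial \mathbf{u}}{\partial x}$, where $a$ is not a function of $x$.

(VS2) $\dfrac{\partial\, \mathbf{A}\mathbf{u}}{\partial x} = \mathbf{A}\,\dfrac{\partial \mathbf{u}}{\partial x}$, where $\mathbf{A}$ is not a function of $x$.

(VS3) $\dfrac{\partial \mathbf{u}^\top}{\partial x} = \left(\dfrac{\partial \mathbf{u}}{\partial x}\right)^{\!\top}$

(VS4) $\dfrac{\partial (\mathbf{u} + \mathbf{v})}{\partial x} = \dfrac{\partial \mathbf{u}}{\partial x} + \dfrac{\partial \mathbf{v}}{\partial x}$

(VS5) $\dfrac{\partial \mathbf{g}(\mathbf{u})}{\partial x} = \dfrac{\partial \mathbf{g}(\mathbf{u})}{\partial \mathbf{u}}\,\dfrac{\partial \mathbf{u}}{\partial x}$ (chain rule), with consistent matrix layout.

(VS6) $\dfrac{\partial \mathbf{f}(\mathbf{g}(\mathbf{u}))}{\partial x} = \dfrac{\partial \mathbf{f}(\mathbf{g})}{\partial \mathbf{g}}\,\dfrac{\partial \mathbf{g}(\mathbf{u})}{\partial \mathbf{u}}\,\dfrac{\partial \mathbf{u}}{\partial x}$ (chain rule), with consistent matrix layout.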

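Derivatives of Matrix by Scalar

(MS1) $\dfrac{\partial\, a\mathbf{U}}{\partial x} = a\,\dfrac{\partial \mathbf{U}}{\partial x}$, where $a$ is not a function of $x$.

(MS2) $\dfrac{\partial\, \mathbf{A}\mathbf{U}\mathbf{B}}{\partial x} = \mathbf{A}\,\dfrac{\partial \mathbf{U}}{\partial x}\,\mathbf{B}$, where $\mathbf{A}$ and $\mathbf{B}$ are not functions of $x$.

(MS3) $\dfrac{\partial (\mathbf{U} + \mathbf{V})}{\partial x} = \dfrac{\partial \mathbf{U}}{\partial x} + \dfrac{\partial \mathbf{V}}{\partial x}$

(MS4) $\dfrac{\partial\, \mathbf{U}\mathbf{V}}{\partial x} = \mathbf{U}\,\dfrac{\partial \mathbf{V}}{\partial x} + \dfrac{\partial \mathbf{U}}{\partial x}\,\mathbf{V}$ (product rule)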
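In (MS4) the order of the factors matters because matrix multiplication does not commute. The sketch below checks the rule on two made-up $2 \times 2$ matrix functions of a scalar $x$. The functions `U`, `V`, their hand-written derivatives `dU`, `dV`, and the helpers are all assumptions for illustration.

```python
# Sketch: numerical check of the matrix product rule (MS4).
# U, V, dU, dV and the helper names are illustrative assumptions.

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def madd(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def U(x):
    return [[x, 1.0], [0.0, x * x]]

def dU(x):                      # entry-wise derivative of U with respect to x
    return [[1.0, 0.0], [0.0, 2.0 * x]]

def V(x):
    return [[1.0, x], [x, 2.0]]

def dV(x):                      # entry-wise derivative of V with respect to x
    return [[0.0, 1.0], [1.0, 0.0]]

x = 1.5
h = 1e-6

# Right-hand side of (MS4): U dV/dx + dU/dx V, keeping the factor order.
rule = madd(matmul(U(x), dV(x)), matmul(dU(x), V(x)))

# Left-hand side by central finite differences of the product U V.
plus = matmul(U(x + h), V(x + h))
minus = matmul(U(x - h), V(x - h))
numeric = [[(p - m) / (2 * h) for p, m in zip(rp, rm)]
           for rp, rm in zip(plus, minus)]

print(rule)
print(numeric)
```

Swapping the factors in the second term, for example writing $\mathbf{V}\,\partial\mathbf{U}/\partial x$, would generally give a different matrix.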
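Derivatives of Scalar by Vector

(SV1) $\dfrac{\partial\, au}{\partial \mathbf{x}} = a\,\dfrac{\partial u}{\partial \mathbf{x}}$, where $a$ is not a function of $\mathbf{x}$.

(SV2) $\dfrac{\partial (u + v)}{\partial \mathbf{x}} = \dfrac{\partial u}{\partial \mathbf{x}} + \dfrac{\partial v}{\partial \mathbf{x}}$

(SV3) $\dfrac{\partial\, uv}{\partial \mathbf{x}} = u\,\dfrac{\partial v}{\partial \mathbf{x}} + v\,\dfrac{\partial u}{\partial \mathbf{x}}$ (product rule)

(SV4) $\dfrac{\partial g(u)}{\partial \mathbf{x}} = \dfrac{\partial g(u)}{\partial u}\,\dfrac{\partial u}{\partial \mathbf{x}}$ (chain rule)

(SV5) $\dfrac{\partial f(g(u))}{\partial \mathbf{x}} = \dfrac{\partial f(g)}{\partial g}\,\dfrac{\partial g(u)}{\partial u}\,\dfrac{\partial u}{\partial \mathbf{x}}$ (chain rule)

(SV6) $\dfrac{\partial\, \mathbf{u}^\top\mathbf{v}}{\partial \mathbf{x}} = \mathbf{u}^\top\dfrac{\partial \mathbf{v}}{\partial \mathbf{x}} + \mathbf{v}^\top\dfrac{\partial \mathbf{u}}{\partial \mathbf{x}}$ (product rule), where $\dfrac{\partial \mathbf{u}}{\partial \mathbf{x}}$ and $\dfrac{\partial \mathbf{v}}{\partial \mathbf{x}}$ are in numerator layout.

(SV7) $\dfrac{\partial\, \mathbf{u}^\top\mathbf{A}\mathbf{v}}{\partial \mathbf{x}} = \mathbf{u}^\top\mathbf{A}\,\dfrac{\partial \mathbf{v}}{\partial \mathbf{x}} + \mathbf{v}^\top\mathbf{A}^\top\dfrac{\partial \mathbf{u}}{\partial \mathbf{x}}$ (product rule), where $\dfrac{\partial \mathbf{u}}{\partial \mathbf{x}}$ and $\dfrac{\partial \mathbf{v}}{\partial \mathbf{x}}$ are in numerator layout, and $\mathbf{A}$ is not a function of $\mathbf{x}$.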
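As an illustration of (SV6), and not a proof of it, the sketch below takes the linear choices $\mathbf{u} = \mathbf{B}\mathbf{x}$ and $\mathbf{v} = \mathbf{C}\mathbf{x}$, for which (C9) gives $\partial\mathbf{u}/\partial\mathbf{x} = \mathbf{B}$ and $\partial\mathbf{v}/\partial\mathbf{x} = \mathbf{C}$ in numerator layout. The matrices `B`, `C`, the point `x` and the helper names are assumptions for this example.

```python
# Sketch: numerical illustration of (SV6) with u = B x and v = C x.
# B, C, x and the helper names are illustrative assumptions.

def matvec(M, x):
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def vecmat(r, M):
    """Row vector r times matrix M."""
    return [sum(r[i] * M[i][j] for i in range(len(r)))
            for j in range(len(M[0]))]

def inner(x, B, C):
    """Scalar u^T v with u = B x and v = C x."""
    u, v = matvec(B, x), matvec(C, x)
    return sum(ui * vi for ui, vi in zip(u, v))

B = [[1.0, 2.0], [0.0, 1.0]]
C = [[2.0, 0.0], [1.0, 1.0]]
x = [1.0, -1.0]

u, v = matvec(B, x), matvec(C, x)

# (SV6): u^T dv/dx + v^T du/dx, with dv/dx = C and du/dx = B.
rule = [p + q for p, q in zip(vecmat(u, C), vecmat(v, B))]

# Central finite differences of the scalar u^T v.
h = 1e-6
numeric = []
for k in range(len(x)):
    up = x[:k] + [x[k] + h] + x[k + 1:]
    down = x[:k] + [x[k] - h] + x[k + 1:]
    numeric.append((inner(up, B, C) - inner(down, B, C)) / (2 * h))

print(rule)
print(numeric)
```

Both terms of the rule are row vectors, which is what numerator layout requires for the derivative of a scalar by a vector.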

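Derivatives of Scalar by Matrix

(SM1) $\dfrac{\partial\, au}{\partial \mathbf{X}} = a\,\dfrac{\partial u}{\partial \mathbf{X}}$, where $a$ is not a function of $\mathbf{X}$.

(SM2) $\dfrac{\partial (u + v)}{\partial \mathbf{X}} = \dfrac{\partial u}{\partial \mathbf{X}} + \dfrac{\partial v}{\partial \mathbf{X}}$

(SM3) $\dfrac{\partial\, uv}{\partial \mathbf{X}} = u\,\dfrac{\partial v}{\partial \mathbf{X}} + v\,\dfrac{\partial u}{\partial \mathbf{X}}$ (product rule)

(SM4) $\dfrac{\partial g(u)}{\partial \mathbf{X}} = \dfrac{\partial g(u)}{\partial u}\,\dfrac{\partial u}{\partial \mathbf{X}}$ (chain rule)

(SM5) $\dfrac{\partial f(g(u))}{\partial \mathbf{X}} = \dfrac{\partial f(g)}{\partial g}\,\dfrac{\partial g(u)}{\partial u}\,\dfrac{\partial u}{\partial \mathbf{X}}$ (chain rule)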
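Derivatives of Vector by Vector

(VV1) $\dfrac{\partial\, a\mathbf{u}}{\partial \mathbf{x}} = a\,\dfrac{\partial \mathbf{u}}{\partial \mathbf{x}} + \mathbf{u}\,\dfrac{\partial a}{\partial \mathbf{x}}$ (product rule)

(VV2) $\dfrac{\partial\, \mathbf{A}\mathbf{u}}{\partial \mathbf{x}} = \mathbf{A}\,\dfrac{\partial \mathbf{u}}{\partial \mathbf{x}}$, where $\mathbf{A}$ is not a function of $\mathbf{x}$.

(VV3) $\dfrac{\partial (\mathbf{u} + \mathbf{v})}{\partial \mathbf{x}} = \dfrac{\partial \mathbf{u}}{\partial \mathbf{x}} + \dfrac{\partial \mathbf{v}}{\partial \mathbf{x}}$

(VV4) $\dfrac{\partial \mathbf{g}(\mathbf{u})}{\partial \mathbf{x}} = \dfrac{\partial \mathbf{g}(\mathbf{u})}{\partial \mathbf{u}}\,\dfrac{\partial \mathbf{u}}{\partial \mathbf{x}}$ (chain rule)

(VV5) $\dfrac{\partial \mathbf{f}(\mathbf{g}(\mathbf{u}))}{\partial \mathbf{x}} = \dfrac{\partial \mathbf{f}(\mathbf{g})}{\partial \mathbf{g}}\,\dfrac{\partial \mathbf{g}(\mathbf{u})}{\partial \mathbf{u}}\,\dfrac{\partial \mathbf{u}}{\partial \mathbf{x}}$ (chain rule)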
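Notes on Denominator Layout

In some cases, the results of denominator layout are the transpose of those of numerator layout. Moreover, the chain rule for denominator layout goes from right to left instead of left to right.

$$\begin{array}{l|l|l}
 & \text{Numerator layout notation} & \text{Denominator layout notation} \\ \hline
\text{(C6)} & \dfrac{d\,\mathbf{a}^\top\mathbf{x}}{d\mathbf{x}} = \mathbf{a}^\top & \dfrac{d\,\mathbf{a}^\top\mathbf{x}}{d\mathbf{x}} = \mathbf{a} \\[2ex]
\text{(C11)} & \dfrac{d\,\mathbf{x}^\top\mathbf{A}\mathbf{x}}{d\mathbf{x}} = \mathbf{x}^\top(\mathbf{A} + \mathbf{A}^\top) & \dfrac{d\,\mathbf{x}^\top\mathbf{A}\mathbf{x}}{d\mathbf{x}} = (\mathbf{A} + \mathbf{A}^\top)\,\mathbf{x} \\[2ex]
\text{(VV5)} & \dfrac{\partial \mathbf{f}(\mathbf{g}(\mathbf{u}))}{\partial \mathbf{x}} = \dfrac{\partial \mathbf{f}(\mathbf{g})}{\partial \mathbf{g}}\,\dfrac{\partial \mathbf{g}(\mathbf{u})}{\partial \mathbf{u}}\,\dfrac{\partial \mathbf{u}}{\partial \mathbf{x}} & \dfrac{\partial \mathbf{f}(\mathbf{g}(\mathbf{u}))}{\partial \mathbf{x}} = \dfrac{\partial \mathbf{u}}{\partial \mathbf{x}}\,\dfrac{\partial \mathbf{g}(\mathbf{u})}{\partial \mathbf{u}}\,\dfrac{\partial \mathbf{f}(\mathbf{g})}{\partial \mathbf{g}}
\end{array}$$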
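The reversed order of the chain rule can be seen with linear maps, whose derivatives are just their matrices by (C9). The sketch below assumes $\mathbf{u} = \mathbf{P}\mathbf{x}$, $\mathbf{g} = \mathbf{Q}\mathbf{u}$ and $\mathbf{f} = \mathbf{R}\mathbf{g}$ with made-up integer matrices. The names `P`, `Q`, `R` and the helpers are assumptions for illustration.

```python
# Sketch: chain rule order in numerator vs denominator layout.
# P, Q, R and the helper names are illustrative assumptions.

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def transpose(M):
    return [list(col) for col in zip(*M)]

P = [[1, 0], [0, 1], [1, 1]]    # u = P x,  R^2 -> R^3
Q = [[1, 2, 0], [0, 1, 1]]      # g = Q u,  R^3 -> R^2
R = [[3, -1]]                   # f = R g,  R^2 -> R^1

# Numerator layout: df/dg, then dg/du, then du/dx (left to right).
numerator = matmul(matmul(R, Q), P)                    # 1 x 2

# Denominator layout: each factor is transposed and the order is reversed.
denominator = matmul(matmul(transpose(P), transpose(Q)),
                     transpose(R))                     # 2 x 1

print(numerator)
print(denominator)
```

The two results are transposes of each other, and the shapes only match up when the factors are multiplied in the order that belongs to the chosen layout.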

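Derivations of Derivatives

(C6) $\dfrac{d\,\mathbf{a}^\top\mathbf{x}}{d\mathbf{x}} = \dfrac{d\,\mathbf{x}^\top\mathbf{a}}{d\mathbf{x}} = \mathbf{a}^\top$

(The not-so-hard way)
Let $s = \mathbf{a}^\top\mathbf{x} = a_1 x_1 + \cdots + a_n x_n$. Then, $\dfrac{\partial s}{\partial x_i} = a_i$. So, $\dfrac{ds}{d\mathbf{x}} = \mathbf{a}^\top$.

(The easier way)
Let $s = \mathbf{a}^\top\mathbf{x} = \sum_i a_i x_i$. Then, $\dfrac{\partial s}{\partial x_i} = a_i$. So, $\dfrac{ds}{d\mathbf{x}} = \mathbf{a}^\top$.

(C7) $\dfrac{d\,\mathbf{x}^\top\mathbf{x}}{d\mathbf{x}} = 2\,\mathbf{x}^\top$

Let $s = \mathbf{x}^\top\mathbf{x} = \sum_i x_i^2$. Then, $\dfrac{\partial s}{\partial x_i} = 2 x_i$. So, $\dfrac{ds}{d\mathbf{x}} = 2\,\mathbf{x}^\top$.

(C8) $\dfrac{d\,(\mathbf{x}^\top\mathbf{a})^2}{d\mathbf{x}} = 2\,\mathbf{x}^\top\mathbf{a}\,\mathbf{a}^\top$

Let $s = \mathbf{x}^\top\mathbf{a}$. Then, $\dfrac{\partial s^2}{\partial x_i} = 2 s\,\dfrac{\partial s}{\partial x_i} = 2\,s\,a_i$. So, $\dfrac{ds^2}{d\mathbf{x}} = 2\,\mathbf{x}^\top\mathbf{a}\,\mathbf{a}^\top$.

(C9) $\dfrac{d\,\mathbf{A}\mathbf{x}}{d\mathbf{x}} = \mathbf{A}$

(The hard way)

$$\mathbf{A}\mathbf{x} = \begin{bmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & \ddots & \vdots \\ a_{n1} & \cdots & a_{nn} \end{bmatrix} \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} = \begin{bmatrix} a_{11} x_1 + \cdots + a_{1n} x_n \\ \vdots \\ a_{n1} x_1 + \cdots + a_{nn} x_n \end{bmatrix}$$

(The easy way)
Let $\mathbf{s} = \mathbf{A}\mathbf{x}$. Then, $s_i = \sum_j a_{ij} x_j$, and $\dfrac{\partial s_i}{\partial x_j} = a_{ij}$. So, $\dfrac{d\mathbf{s}}{d\mathbf{x}} = \mathbf{A}$.

(C10) $\dfrac{d\,\mathbf{x}^\top\mathbf{A}}{d\mathbf{x}} = \mathbf{A}^\top$

Let $\mathbf{y}^\top = \mathbf{x}^\top\mathbf{A}$, and let $\mathbf{a}_j$ denote the $j$-th column of $\mathbf{A}$. Then, $y_j = \mathbf{x}^\top\mathbf{a}_j$.
Applying (C6) yields $\dfrac{dy_j}{d\mathbf{x}} = \mathbf{a}_j^\top$. So, $\dfrac{d\mathbf{y}^\top}{d\mathbf{x}} = \mathbf{A}^\top$.

(C11) $\dfrac{d\,\mathbf{x}^\top\mathbf{A}\mathbf{x}}{d\mathbf{x}} = \mathbf{x}^\top(\mathbf{A} + \mathbf{A}^\top)$

Apply (SV6) to $\dfrac{d\,\mathbf{x}^\top\mathbf{A}\mathbf{x}}{d\mathbf{x}}$ and obtain $\mathbf{x}^\top\dfrac{d\,\mathbf{A}\mathbf{x}}{d\mathbf{x}} + (\mathbf{A}\mathbf{x})^\top\dfrac{d\mathbf{x}}{d\mathbf{x}}$.
Next, apply (C9) to the first part of the sum and (C5) to the second part, and obtain $\mathbf{x}^\top\mathbf{A} + (\mathbf{A}\mathbf{x})^\top$, which is $\mathbf{x}^\top(\mathbf{A} + \mathbf{A}^\top)$.
(Need to prove SV6: homework.)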

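Linear Fitting Revisited

Now, let us show that the solution

$$\mathbf{a} = (\mathbf{D}^\top\mathbf{D})^{-1}\mathbf{D}^\top\mathbf{v}$$

minimizes the error

$$E = \sum_{i=1}^{n} (\mathbf{d}_i^\top\mathbf{a} - v_i)^2 = \|\mathbf{D}\mathbf{a} - \mathbf{v}\|^2.$$

Proof:

$$\begin{aligned}
E &= \|\mathbf{D}\mathbf{a} - \mathbf{v}\|^2 = (\mathbf{D}\mathbf{a} - \mathbf{v})^\top(\mathbf{D}\mathbf{a} - \mathbf{v}) \\
&= (\mathbf{a}^\top\mathbf{D}^\top - \mathbf{v}^\top)(\mathbf{D}\mathbf{a} - \mathbf{v}) \\
&= \mathbf{a}^\top\mathbf{D}^\top\mathbf{D}\mathbf{a} - \mathbf{a}^\top\mathbf{D}^\top\mathbf{v} - \mathbf{v}^\top\mathbf{D}\mathbf{a} + \mathbf{v}^\top\mathbf{v}.
\end{aligned}$$

Apply (C11), (C6), (C9) and (C2) to the four terms.

$$\begin{aligned}
\frac{dE}{d\mathbf{a}} &= \mathbf{a}^\top(\mathbf{D}^\top\mathbf{D} + \mathbf{D}^\top\mathbf{D}) - (\mathbf{D}^\top\mathbf{v})^\top - \mathbf{v}^\top\mathbf{D} + \mathbf{0}^\top \\
&= 2\,\mathbf{a}^\top\mathbf{D}^\top\mathbf{D} - 2\,\mathbf{v}^\top\mathbf{D}.
\end{aligned}$$

Set $dE/d\mathbf{a} = \mathbf{0}^\top$ and obtain

$$\begin{aligned}
2\,\mathbf{a}^\top\mathbf{D}^\top\mathbf{D} - 2\,\mathbf{v}^\top\mathbf{D} &= \mathbf{0}^\top \\
\mathbf{a}^\top\mathbf{D}^\top\mathbf{D} &= \mathbf{v}^\top\mathbf{D}.
\end{aligned}$$

Transpose both sides of the equation and get

$$\begin{aligned}
\mathbf{D}^\top\mathbf{D}\,\mathbf{a} &= \mathbf{D}^\top\mathbf{v} \\
\mathbf{a} &= (\mathbf{D}^\top\mathbf{D})^{-1}\mathbf{D}^\top\mathbf{v},
\end{aligned}$$

provided $\mathbf{D}^\top\mathbf{D}$ is invertible. Since $E$ is a convex quadratic function of $\mathbf{a}$, this stationary point is a minimum.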
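The result of the proof can be checked numerically on a small example: at the least-squares solution the derivative $2\,\mathbf{a}^\top\mathbf{D}^\top\mathbf{D} - 2\,\mathbf{v}^\top\mathbf{D}$ should vanish, and any other coefficient vector should give a larger error. The data set and the names below are assumptions for illustration; the $2 \times 2$ normal equations are solved with Cramer's rule.

```python
# Sketch: the derivative of E vanishes at the least-squares solution.
# Data and names are illustrative assumptions (line fit, m = 1).

xs = [0.0, 1.0, 2.0, 3.0]
vs = [1.0, 3.0, 4.0, 8.0]
D = [[x, 1.0] for x in xs]                 # rows d_i^T = [x_i, 1]

def residuals(a):
    return [sum(d * c for d, c in zip(row, a)) - vi
            for row, vi in zip(D, vs)]

def error(a):
    """E = ||D a - v||^2."""
    return sum(r * r for r in residuals(a))

def gradient(a):
    """Row vector 2 a^T D^T D - 2 v^T D, computed as 2 (D a - v)^T D."""
    res = residuals(a)
    return [2 * sum(r * row[k] for r, row in zip(res, D))
            for k in range(len(a))]

# Normal equations D^T D a = D^T v for the 2 x 2 case.
s_xx = sum(x * x for x in xs)
s_x = sum(xs)
n = float(len(xs))
s_xv = sum(x * val for x, val in zip(xs, vs))
s_v = sum(vs)
det = s_xx * n - s_x * s_x
a_star = [(s_xv * n - s_x * s_v) / det, (s_xx * s_v - s_x * s_xv) / det]

a_other = [2.0, 1.0]                       # some other candidate line

print(gradient(a_star), error(a_star))
print(gradient(a_other), error(a_other))
```

The derivative at `a_star` is zero up to rounding, while at `a_other` it is not, and the error at `a_star` is the smaller of the two.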
Summary
◮ Matrix calculus studies calculus of matrices.
◮ There are 6 common derivatives of matrices.
◮ There are 2 competing notational conventions: numerator layout notation vs. denominator layout notation.
◮ We adopt numerator layout notation.
◮ Do not mix the two conventions in your equations.
◮ Use matrix differentiation to prove that the pseudo-inverse minimizes the sum-of-squares error.

Use the right tool and become lightning fast! [image not recovered]
Probing Questions
◮ Is there a simple way to double check that the derivative result makes sense?
◮ Why do we use sum-of-squares error for linear fitting? Can we use other forms of errors?
◮ Six common types of matrix derivatives are discussed. Three other types are left out. Can we work out the other derivatives, e.g., derivatives of vector by matrix or matrix by matrix?

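Homework
1. What are the key concepts that you have learned?
2. Prove the product rule SV3 using the scalar product rule SS2.

(SV3) $\dfrac{\partial\, uv}{\partial \mathbf{x}} = u\,\dfrac{\partial v}{\partial \mathbf{x}} + v\,\dfrac{\partial u}{\partial \mathbf{x}}$

3. Prove the product rule SV6 using SV3.

(SV6) $\dfrac{\partial\, \mathbf{u}^\top\mathbf{v}}{\partial \mathbf{x}} = \mathbf{u}^\top\dfrac{\partial \mathbf{v}}{\partial \mathbf{x}} + \mathbf{v}^\top\dfrac{\partial \mathbf{u}}{\partial \mathbf{x}}$, where $\dfrac{\partial \mathbf{u}}{\partial \mathbf{x}}$ and $\dfrac{\partial \mathbf{v}}{\partial \mathbf{x}}$ are in numerator layout.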

References
1. J. E. Gentle, Matrix Algebra: Theory, Computations, and Applications in Statistics, Springer, 2007.
2. H. Lütkepohl, Handbook of Matrices, John Wiley & Sons, 1996.
3. K. B. Petersen and M. S. Pedersen, The Matrix Cookbook, 2012. www.math.uwaterloo.ca/~hwolkowi/matrixcookbook.pdf
4. Wikipedia, Matrix Calculus. en.wikipedia.org/wiki/Matrix_calculus
