Properties of The Singular Value Decomposition: Preliminary Definitions
Preliminary definitions:
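Euclidean inner product: Let the vectors $x, y \in \mathbb{C}^n$ be denoted
$$x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} \quad \text{and} \quad y = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}.$$
Then the Euclidean inner product is defined as
$$\langle x, y \rangle := x^H y = \bar{x}_1 y_1 + \bar{x}_2 y_2 + \cdots + \bar{x}_n y_n.$$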
Euclidean vector norm: Let $\langle \cdot, \cdot \rangle$ denote the Euclidean inner
product. Then the vector norm associated with this inner product is
given by
$$\|x\|_2 := \sqrt{\langle x, x \rangle} = \sqrt{\sum_{i=1}^{n} |x_i|^2}.$$
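As a quick numerical check of the two definitions above, the following MATLAB sketch evaluates the inner product and the induced norm. The vectors x and y are arbitrary values chosen for illustration; they are not taken from these notes.

```matlab
% Illustrative complex vectors (assumed values, chosen only for this sketch)
x = [1+2i; 3; -1i];
y = [2; 1-1i; 4];

ip     = x' * y;                 % x' is the conjugate transpose, so this is x^H y
ip_sum = sum(conj(x) .* y);      % the same inner product written out as a sum

nrm = sqrt(real(x' * x));        % norm induced by the inner product

disp(abs(ip - ip_sum));          % difference should be at roundoff level
disp(abs(nrm - norm(x, 2)));     % agrees with the built-in Euclidean norm
```

Note that the transpose operator `'` conjugates its argument, which is why `x' * y` matches the definition with $x^H$ rather than $x^T$.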
Euclidean matrix norm: Given $A \in \mathbb{C}^{m \times n}$, the matrix norm induced
by the Euclidean vector norm is given by:
$$\|A\|_2 := \max_{v \neq 0} \frac{\|Av\|_2}{\|v\|_2} = \sqrt{\lambda_{\max}(A^H A)}.$$
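The two expressions for the induced norm can be compared numerically. The sketch below is only an illustration: the matrix is an assumed example, and the maximum over v is approximated by sampling real unit vectors on a grid, which is adequate for this real 2x2 case but is not a general algorithm.

```matlab
A = [1 100; 0 1];                  % illustrative matrix (assumed for this sketch)

n_builtin = norm(A, 2);            % induced Euclidean norm
n_eig     = sqrt(max(eig(A' * A)));% sqrt(lambda_max(A^H A))

% Sample the gain ||Av||/||v|| over unit vectors v = [cos t; sin t]
t = linspace(0, pi, 2001);
gain = zeros(size(t));
for k = 1:numel(t)
    v = [cos(t(k)); sin(t(k))];
    gain(k) = norm(A * v) / norm(v);
end

disp([n_builtin, n_eig, max(gain)]);  % the three values should nearly coincide
```

The sampled maximum can only approach the norm from below, since no vector has a gain larger than $\|A\|_2$.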
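Orthonormal vectors: The vectors $x_1, \ldots, x_k \in \mathbb{C}^n$ are orthonormal if
$$\langle x_i, x_j \rangle = \begin{cases} 0, & i \neq j \\ 1, & i = j. \end{cases}$$
(Hence $\|x_i\|_2 = 1, \ \forall i$.)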
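Orthogonal complement: Given a subspace $X \subseteq \mathbb{C}^n$,
$$X^{\perp} := \{ x \in \mathbb{C}^n : \langle x, y \rangle = 0 \ \ \forall\, y \in X \}.$$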
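Unitary matrix: A matrix $U = [u_1 \ u_2 \ \cdots \ u_n] \in \mathbb{C}^{n \times n}$ is unitary if its
columns are orthonormal, so that
$$U^H U = \begin{bmatrix} u_1^H \\ u_2^H \\ \vdots \\ u_n^H \end{bmatrix} [u_1 \ u_2 \ \cdots \ u_n]
= \begin{bmatrix} u_1^H u_1 & u_1^H u_2 & \cdots & u_1^H u_n \\ u_2^H u_1 & u_2^H u_2 & \cdots & u_2^H u_n \\ \vdots & \vdots & \ddots & \vdots \\ u_n^H u_1 & u_n^H u_2 & \cdots & u_n^H u_n \end{bmatrix}
= \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix}.$$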
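The identity $U^H U = I$ is easy to confirm numerically. In the sketch below, a unitary matrix is obtained from a QR factorization of an arbitrary complex matrix; both the matrix and the test vector are assumptions made for illustration.

```matlab
M = [1 2i; 3 4];                 % arbitrary complex matrix (illustrative)
[Q, R] = qr(M);                  % Q is unitary, R is upper triangular

err_unitary = norm(Q' * Q - eye(2));   % Q^H Q should equal the identity
err_factor  = norm(Q * R - M);         % sanity check on the factorization

x = [1; 1i];                     % arbitrary test vector
err_length = abs(norm(Q * x) - norm(x)); % unitary matrices preserve length

disp([err_unitary, err_factor, err_length]);  % all at roundoff level
```

Length preservation follows directly from the definition: $\|Qx\|_2^2 = x^H Q^H Q x = x^H x$.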
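Singular value decomposition: Given $A \in \mathbb{C}^{m \times n}$, there exist unitary matrices
$$U = [u_1 \ u_2 \ \cdots \ u_m] \in \mathbb{C}^{m \times m}, \qquad V = [v_1 \ v_2 \ \cdots \ v_n] \in \mathbb{C}^{n \times n}$$
such that
$$A = \begin{cases} U \begin{bmatrix} \Sigma \\ 0 \end{bmatrix} V^H, & m \geq n \\[2ex] U \begin{bmatrix} \Sigma & 0 \end{bmatrix} V^H, & m \leq n \end{cases}$$
where
$$\Sigma = \begin{bmatrix} \sigma_1 & 0 & \cdots & 0 \\ 0 & \sigma_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma_p \end{bmatrix}, \qquad p = \min(m, n)$$
and
$$\sigma_1 \geq \sigma_2 \geq \cdots \geq \sigma_p \geq 0.$$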
It follows that
$$A v_i = \sigma_i u_i, \qquad i = 1, \ldots, p.$$
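The factorization and the relation $A v_i = \sigma_i u_i$ can be checked with MATLAB's svd. This is a minimal sketch on an assumed 3x2 matrix (so m ≥ n); the matrix itself is illustrative only.

```matlab
A = [1 2; 3 4; 5 6];             % illustrative 3x2 matrix (m >= n)
[U, S, V] = svd(A);              % U is 3x3, S is 3x2, V is 2x2
p = min(size(A));

recon_err = norm(U * S * V' - A);      % A = U*[Sigma; 0]*V^H

% Check A*v_i = sigma_i*u_i for each singular triple
pair_err = zeros(p, 1);
for i = 1:p
    pair_err(i) = norm(A * V(:, i) - S(i, i) * U(:, i));
end

s = diag(S);                     % singular values, in nonincreasing order
disp(recon_err);
disp(pair_err');
disp(s');
```

Here `S` has the same shape as `A`, with the block of zeros below $\Sigma$ stored explicitly.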
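The maximum and minimum singular values satisfy
$$\sigma_{\max} := \max_{v \neq 0} \frac{\|Av\|_2}{\|v\|_2} = \|A\|_2$$
$$\sigma_{\min} := \min_{v \neq 0} \frac{\|Av\|_2}{\|v\|_2}$$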
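Suppose that $\mathrm{rank}(A) = r$. Then
$$A = U_r \Sigma_r V_r^H$$
where $U_r := [u_1 \ \cdots \ u_r]$, $\Sigma_r := \mathrm{diag}(\sigma_1, \ldots, \sigma_r)$, and $V_r := [v_1 \ \cdots \ v_r]$.
The pseudoinverse of $A$ is
$$A^{\#} = V_r \Sigma_r^{-1} U_r^H.$$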
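The pseudoinverse formula can be assembled directly from the SVD factors and compared with MATLAB's built-in pinv. The matrix below is an assumed full-column-rank example used only for illustration.

```matlab
A = [1 2; 3 4; 5 6];             % illustrative 3x2 matrix of rank 2
[U, S, V] = svd(A);
r = rank(A);

Ur = U(:, 1:r);                  % first r left singular vectors
Sr = S(1:r, 1:r);                % r-by-r diagonal block of singular values
Vr = V(:, 1:r);                  % first r right singular vectors

Apinv = Vr * (Sr \ Ur');         % V_r * inv(Sigma_r) * U_r^H

err_pinv = norm(Apinv - pinv(A));      % agrees with the built-in pseudoinverse
err_left = norm(Apinv * A - eye(2));   % left inverse, since rank(A) = n

disp([err_pinv, err_left]);
```

Using `Sr \ Ur'` rather than forming `inv(Sr)` explicitly is a common idiom; for a diagonal matrix the two give the same result.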
Many results for linear systems have the form "if such and such a
matrix has full rank, then such and such a property holds"
(paraphrased from Golub and Van Loan). Such results are naive, in
that they neglect the fact that a matrix generated from physical
data is almost always full rank. The more important question is not
"does the matrix have full rank?", but rather "how close is the matrix
to one which does not have full rank?". The SVD is a very useful tool
in making this concept precise.
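Proposition: Suppose that $A \in \mathbb{C}^{m \times n}$ has $\mathrm{rank}(A) = p = \min(m, n)$.
(a) If $\|\Delta A\|_2 < \sigma_{\min}(A)$, then
$$\mathrm{rank}(A + \Delta A) = p.$$
(b) There exists $\Delta A$ with $\|\Delta A\|_2 = \sigma_{\min}(A)$ such that $\mathrm{rank}(A + \Delta A) < p$.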
Proof:
(b) Let $A = \sum_{i=1}^{p} \sigma_i u_i v_i^H$, and consider $\Delta A = -\sigma_p u_p v_p^H$ (where $\sigma_p = \sigma_{\min}$). It is
easy to see that $A + \Delta A = \sum_{i=1}^{p-1} \sigma_i u_i v_i^H$, and thus that $\mathrm{rank}(A + \Delta A) = p - 1$.
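The construction used in part (b) of the proof can be reproduced numerically. In this sketch the matrix is an assumed example, and the rank of the perturbed matrix is tested with an explicit tolerance (1e-10, an assumed value) so that the result does not depend on roundoff in the subtraction.

```matlab
A = [1 100; 0 1];                % illustrative full-rank matrix
[U, S, V] = svd(A);
p = min(size(A));

% Rank-reducing perturbation from the proof: dA = -sigma_p * u_p * v_p^H
dA = -S(p, p) * U(:, p) * V(:, p)';

size_dA = norm(dA, 2);           % equals sigma_min(A), about 0.01 here
s_pert  = svd(A + dA);           % smallest singular value is at roundoff level
r_pert  = rank(A + dA, 1e-10);   % rank with an explicit (assumed) tolerance

disp(size_dA);
disp(s_pert');
disp([rank(A), r_pert]);
```

The perturbation is small compared with the entries of A, yet it removes one unit of rank.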
Suppose that rank( A) = p , but A has very small singular values. Then
A is "close" to a singular matrix in the sense that there exists a small
perturbation ∆A to the elements of A that causes A + ∆A to lose rank.
Indeed, such a matrix should possibly be treated in applications as
though it were singular.
In practice, people do not always look for small singular values
to determine distance to singularity. It is often more useful to
compute the ratio between the maximum and minimum singular
values, i.e., the condition number $\kappa(A) := \sigma_{\max}(A) / \sigma_{\min}(A)$.
Consider the linear system
$$Ax = b, \qquad A \in \mathbb{C}^{m \times n}, \quad b \in \mathbb{C}^{m}, \quad m \geq n, \quad \mathrm{rank}(A) = n.$$
Suppose that we are given the data for A and b and need to solve for
x. Let R(A) denote the range of A. If $b \in R(A)$, then we may find x
from
$$x = A^{\#} b.$$
Let
Â:= A + ∆A
b̂:= b + ∆b
x̂:= x + ∆x
so that Âx̂ = b̂ . Our next result relates the errors ∆A and ∆b to errors
in the computed value of x .
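Suppose that
$$\frac{\|\Delta A\|_2}{\sigma_{\min}(A)} \leq \alpha < 1.$$
Suppose also that ∆A and ∆b satisfy the bounds $\dfrac{\|\Delta A\|_2}{\|A\|_2} \leq \delta$ and $\dfrac{\|\Delta b\|_2}{\|b\|_2} \leq \delta$,
where δ is a constant. Then
$$\frac{\|\Delta x\|_2}{\|x\|_2} \leq \frac{2\delta}{1 - \alpha}\, \kappa(A).$$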
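The bound can be derived in a few lines. The following is a sketch of one standard argument, not necessarily the proof intended in the original notes; it assumes that all norms are 2-norms and uses the fact that a perturbation can lower the smallest singular value by at most its own norm.

```latex
\begin{align*}
\hat{A}\hat{x} = \hat{b}, \quad Ax = b
  \quad &\Longrightarrow \quad \hat{A}\,\Delta x = \Delta b - \Delta A\, x, \\
\sigma_{\min}(\hat{A}) \;\geq\; \sigma_{\min}(A) - \|\Delta A\|_2
  \;&\geq\; (1 - \alpha)\,\sigma_{\min}(A) \;>\; 0, \\
(1 - \alpha)\,\sigma_{\min}(A)\,\|\Delta x\|_2
  \;\leq\; \|\hat{A}\,\Delta x\|_2
  \;&\leq\; \|\Delta b\|_2 + \|\Delta A\|_2 \|x\|_2
  \;\leq\; \delta \|b\|_2 + \delta \|A\|_2 \|x\|_2
  \;\leq\; 2\delta \|A\|_2 \|x\|_2, \\
\frac{\|\Delta x\|_2}{\|x\|_2}
  \;&\leq\; \frac{2\delta}{1 - \alpha} \cdot \frac{\|A\|_2}{\sigma_{\min}(A)}
  \;=\; \frac{2\delta}{1 - \alpha}\,\kappa(A).
\end{align*}
```

The last step on the third line uses $\|b\|_2 = \|Ax\|_2 \leq \|A\|_2 \|x\|_2$, and the final line uses $\|A\|_2 = \sigma_{\max}(A)$.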
It follows from this result that if κ(A) >> 1, then small relative errors
in A and b may result in large relative errors in the computed value
of x. One should be cautioned, however, that this error estimate is
only an upper bound. Hence, the computed answer is not guaranteed
to be incorrect. However, there are nonpathological examples for
which the upper bound on the error is achieved! Hence, if no
additional information is available, the answer to any calculation
involving the inverse of a matrix that is ill-conditioned (i.e., κ(A) >> 1)
should be viewed dubiously.
Example: Consider $A = \begin{bmatrix} 1 & 100 \\ 0 & 1 \end{bmatrix}$. MATLAB says that $\mathrm{rank}(A) = 2$:
»A=[1 100;0 1];
»rank(A)
ans =
2
»cond(A)
ans =
1.0002e+04
The condition number looks pretty large; however, for the purposes
of solving linear systems of two equations in two unknowns, this
matrix is not particularly ill-conditioned with respect to numerical
roundoff error in the computer computations. Suppose, on the other
hand, that A and b are constructed from physical data. How reliable
are calculations based upon A−1?
For example, are we sure that the zero in the (2,1) element of A is
really zero?
Let
$$\hat{A} = \begin{bmatrix} 1 & 100 \\ 0.009 & 1 \end{bmatrix},$$
and consider the product $A^{-1}\hat{A}$; if $\hat{A} = A$, this product will equal the
identity matrix. Let us see what the error is for our example:
»inv(A)*Ahat
ans =
0.1 0.0
0.009 1.0
Now consider instead
$$\hat{A} = \begin{bmatrix} 1.009 & 100 \\ 0 & 1 \end{bmatrix}:$$
»inv(A)*Ahat
ans =
1.009 0
0 1.0
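The two experiments above can be summarized by the size of the deviation of $A^{-1}\hat{A}$ from the identity. The sketch below repeats them and measures that deviation in the 2-norm; the variable names are introduced here for illustration and do not appear in the original session.

```matlab
A     = [1 100; 0 1];
Ahat1 = [1 100; 0.009 1];        % error of 0.009 in the (2,1) element
Ahat2 = [1.009 100; 0 1];        % error of 0.009 in the (1,1) element

% Both perturbations have the same size ...
size1 = norm(Ahat1 - A);
size2 = norm(Ahat2 - A);

% ... but very different effects on inv(A)*Ahat
err1 = norm(A \ Ahat1 - eye(2)); % roughly 0.9
err2 = norm(A \ Ahat2 - eye(2)); % roughly 0.009

disp([size1, size2]);
disp([err1, err2]);
```

An error of the same magnitude is amplified by a factor of about 100 in one location and not at all in the other, which is why the condition number alone only bounds the sensitivity.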
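Let $A \in \mathbb{C}^{m \times n}$, $m \geq n$, have the singular value decomposition
$$A = U \begin{bmatrix} \Sigma \\ 0 \end{bmatrix} V^H,$$
where
$U = [u_1 \ \cdots \ u_m] \in \mathbb{C}^{m \times m}$, $V = [v_1 \ \cdots \ v_n] \in \mathbb{C}^{n \times n}$, and $\Sigma = \mathrm{diag}(\sigma_1, \sigma_2, \ldots, \sigma_n)$.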
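Suppose that
$$\sigma_1 \geq \cdots \geq \sigma_r > \varepsilon \geq \sigma_{r+1} \geq \cdots \geq \sigma_n \geq 0,$$
where ε is so small (or so much smaller than $\sigma_r$) that the last (n-r)
singular values are all effectively zero. Then we say that A has
effective rank equal to r.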
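Furthermore, let $R_{\mathrm{eff}}(A)$ and $N_{\mathrm{eff}}(A)$ denote the effective range and
effective nullspace of A, respectively. Then we can calculate bases
for these subspaces by choosing appropriate singular vectors:
$$R_{\mathrm{eff}}(A) := \mathrm{span}\{u_1, \ldots, u_r\} \quad \text{and} \quad N_{\mathrm{eff}}(A) := \mathrm{span}\{v_{r+1}, \ldots, v_n\}.$$
Similarly,
$$R_{\mathrm{eff}}^{\perp}(A) := \mathrm{span}\{u_{r+1}, \ldots, u_m\} \quad \text{and} \quad N_{\mathrm{eff}}^{\perp}(A) := \mathrm{span}\{v_1, \ldots, v_r\}.$$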
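Define
$$A_{\mathrm{eff}} = \sum_{i=1}^{r} \sigma_i u_i v_i^H = U_r \Sigma_r V_r^H$$
where $U_r = [u_1 \ \cdots \ u_r]$, $\Sigma_r = \mathrm{diag}(\sigma_1, \ldots, \sigma_r)$, and $V_r = [v_1 \ \cdots \ v_r]$.
Then the condition number
$$\kappa(A_{\mathrm{eff}}) = \frac{\sigma_1}{\sigma_r}$$
will not be "too large". Note that
$$b \in R(A_{\mathrm{eff}}) \ \Leftrightarrow \ b \in R_{\mathrm{eff}}(A).$$
Hence, if $b \in R(A_{\mathrm{eff}})$, then the equation Ax = b has a solution that is
robust against small errors in A. This solution is given by $x = A_{\mathrm{eff}}^{\#} b$.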
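The definitions above translate into a short procedure for general r. The sketch below is one possible implementation under stated assumptions: the threshold `tol` plays the role of ε and its value (0.1) is hypothetical, and the right-hand side b is deliberately constructed to lie in the effective range.

```matlab
A   = [1 100; 0 1];              % illustrative matrix
tol = 0.1;                       % hypothetical threshold epsilon (problem dependent)

[U, S, V] = svd(A);
s = diag(S);
r = sum(s > tol);                % effective rank: singular values above tol

Ur = U(:, 1:r);                  % basis for the effective range
Sr = S(1:r, 1:r);
Vr = V(:, 1:r);                  % basis for the complement of the effective nullspace

Aeff      = Ur * Sr * Vr';       % effective-rank approximation of A
kappa_eff = s(1) / s(r);         % condition number of Aeff

b = Ur * ones(r, 1);             % a right-hand side chosen to lie in Reff(A)
x = Vr * (Sr \ (Ur' * b));       % x = Aeff^# * b

resid = norm(A * x - b);         % small, because b is in the effective range
disp([r, kappa_eff, resid]);
```

For this matrix the effective rank is 1, so `kappa_eff` is 1 and the solution has norm $1/\sigma_1$, in contrast to the factor of $1/\sigma_{\min}$ that a full inverse can introduce.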
Example (continued):
Once again, consider $A = \begin{bmatrix} 1 & 100 \\ 0 & 1 \end{bmatrix}$. Let's look at the SVD of A, and use
the information from the singular values and vectors to construct a
rank 1 matrix, $A_{\mathrm{eff}}$, that is "close" to A.
»[U,S,V] = svd(A)
U=
1.0 -0.01
0.01 1.0
S=
100.01 0
0 0.01
V=
0.01 -1.0
1.0 0.01
»Reff = U(:,1)
Reff =
1.0000
0.0100
The value of Reff makes sense, because both columns of A have much
larger entries in the first row than in the second.
»Neff = V(:,2)
Neff =
-1.0000
0.0100
Check to see that Neff "almost" gets multiplied by zero, and thus is in
the "effective" nullspace of A :
»A*Neff
ans =
-0.0001
0.01
The rank 1 approximation is
$$A_{\mathrm{eff}} = \sigma_1 u_1 v_1^H.$$
Note that $R(A_{\mathrm{eff}}) = R_{\mathrm{eff}}(A)$ and $N(A_{\mathrm{eff}}) = N_{\mathrm{eff}}(A)$.
»Aeff = Reff*S(1,1)*V(:,1)'
Aeff =
0.9999 100.0000
0.0100 0.9999
»rank(Aeff)
ans =
1
We see that, as expected, $\mathrm{rank}(A_{\mathrm{eff}}) = 1$.
$$A_1 = \begin{bmatrix} 0 & 100 \\ 0 & 1 \end{bmatrix}$$
$$y = Au, \qquad y \in \mathbb{C}^{q}, \quad u \in \mathbb{C}^{p},$$
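Consider
$$A = \begin{bmatrix} 1 & 100 \\ 0 & 1 \end{bmatrix},$$
for which $\sigma_{\min}(A) = 0.01$ and $\kappa(A) = 10{,}000$. If we choose $D_1 = \begin{bmatrix} d_1 & 0 \\ 0 & 1 \end{bmatrix}$ and
$D_2 = \begin{bmatrix} 1 & 0 \\ 0 & d_2 \end{bmatrix}$, then
$$D_1^{-1} A D_2 = \begin{bmatrix} 1/d_1 & 100\, d_2 / d_1 \\ 0 & d_2 \end{bmatrix}.$$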
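The effect of such a scaling on the condition number is easy to explore numerically. In the sketch below the scale factors d1 = 100 and d2 = 1 are hypothetical values picked for illustration; other choices give different results.

```matlab
A  = [1 100; 0 1];
d1 = 100;  d2 = 1;               % hypothetical scale factors

D1 = diag([d1 1]);               % output scaling
D2 = diag([1 d2]);               % input scaling

As = (D1 \ A) * D2;              % scaled matrix inv(D1)*A*D2 = [0.01 1; 0 1]

kappa   = cond(A);               % about 1e4
kappa_s = cond(As);              % roughly 200 for this choice of d1, d2

disp([kappa, kappa_s]);
```

Because the condition number depends on the units chosen for the inputs and outputs, a large value should be interpreted with the scaling of the signals in mind.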
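Systems Interpretations
Consider the system
$$\dot{x} = Ax + Bu, \qquad x \in \mathbb{R}^{n}, \quad u \in \mathbb{R}^{p}$$
$$y = Cx, \qquad y \in \mathbb{R}^{q}$$
with transfer function matrix
$$P(s) = C(sI - A)^{-1} B.$$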
(For example, consider the matrix $A = \begin{bmatrix} 1 & 100 \\ 0 & 1 \end{bmatrix}$, which has $\sigma_{\max} = 100.01$,
$\sigma_{\min} = 0.01$, and $\kappa \cong 10000$.)
It follows from (i) that there does exist some control authority at DC
(i.e., the DC gain matrix is nonzero). However, (ii) implies that some
control inputs are relatively ineffective. Ineffective control inputs
can arise in two ways.
Finally, it follows from (iii) that any calculations involving P(0)−1 are
sensitive to errors in the data for P(0) .
[Figure: block diagrams of the feedback configurations considered, built from the blocks N, B, (sI - A)^{-1}, C, K, and K_I I/s, with signals r, v, u, x, y, w, and e.]
In each case, closed loop stability implies that the steady state
response of the system output to a step command $r(t) = r_0 1(t)$ satisfies
$y_{ss} = r_0$. Note also that, in each case, $y_{ss} = P(0) u_{ss}$. It follows that the
steady state control signal must satisfy:
$$u_{ss} = P(0)^{-1} r_0$$
Since $\sigma_{\max}\left(P(0)^{-1}\right) = \dfrac{1}{\sigma_{\min}(P(0))}$, it follows from (ii) that relatively
large control signals will be required to track certain step commands.
Large control signals are undesirable because they may saturate
control actuators or cause other undesirable nonlinear behavior.
Furthermore, (iii) tells us that small changes in the data of P(0) may
cause large changes in P(0)−1. It follows that small errors in P(0) may
cause the size of the control signal generated to force command
tracking to be much larger than that indicated from the nominal
value of P(0) . Hence, even if the control signals obtained by
examining the nominal model of P(0) are reasonable, the control
signals obtained from the true plant may not be.
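The dependence of the steady state control effort on the command direction can be illustrated with the singular vectors of the DC gain. The sketch below reuses the 2x2 example matrix as a stand-in for P(0); this is an assumption made for illustration, not a model of any particular plant.

```matlab
P0 = [1 100; 0 1];               % hypothetical DC gain matrix P(0)
[U, S, V] = svd(P0);

r_easy = U(:, 1);                % unit command along the high-gain output direction
r_hard = U(:, 2);                % unit command along the low-gain output direction

u_easy = P0 \ r_easy;            % steady state control, norm 1/sigma_max
u_hard = P0 \ r_hard;            % steady state control, norm 1/sigma_min

disp([norm(u_easy), norm(u_hard)]);   % roughly 0.01 and 100
```

Two step commands of the same size thus require control signals whose magnitudes differ by a factor equal to the condition number of P(0).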
[Figure: cross-sections of the wafer during the etch process, showing the mask, SiO2, and Si layers.]
2. Selectivity: What are the relative etch rates of SiO2 , Si , and the
photoresist mask?
3. Etch rate: Is this constant over the course of a single etch, and
from etch to etch for many etches?
[Figure: schematic of the etch chamber, showing the gas inlet, the plasma, and the throttle valve.]
We have the following three actuators that we can use for control:
CF4 flow, RF power, Throttle angle.
We also have sensors to measure three signals that we can feed back
and attempt to regulate:
Vbias , [ F ] , Pressure.
These actuators and sensors are related to the plasma, not the wafer.
Currently, we cannot sense features on the wafer surface in real
time. Hence, our control strategy is to use the control inputs to
regulate the plasma properties, which we can measure. The plasma
properties are only indirectly related to the wafer etch; however,
they do determine the environment in which the etch takes place.
Inputs:
The flow input determines the rate at which CF4 gas enters the etch
chamber.
The RF power input has two effects: (i) it dissociates CF4 → CF3+ + F
(a charged ion and a reactive fluorine radical, whose density we denote
by [F]), and (ii) it sets up a bias voltage, Vbias, across the plasma. This
bias voltage accelerates the ions so that they bombard the wafer
surface. The physical energy thus imparted to the surface, combined
with the chemical reactions between the fluorine radicals and Si, is
responsible for etching the exposed portion of the wafer surface.
The throttle input determines the rate at which gases are exhausted
from the chamber.
Outputs:
The Pressure output determines (among other things) the mean free
path between collisions in the chamber. The longer the mean free
path, the more energy the ions have when they impact the wafer
surface, and the greater the physical component of the etching
process.
Modelling:
Using small signal step response data together with black box system
identification techniques, we obtained a linear model of the system
at the nominal operating condition:
Note that CF4 flow and Throttle angle each primarily affect
pressure. Specifically, if we open the exhaust throttle, then the
steady state pressure will decrease. On the other hand, if we increase
the rate of CF4 flow, then the steady state pressure will increase.
These effects are plausible, because flow affects the rate at which
gases enter the chamber, and throttle affects the rate at which gases
leave the chamber. Power has no steady state effect on pressure, but
it does affect both Vbias and [F] relatively more strongly than do flow
and throttle.
»[U,S,V] = svd(P_N)
S= % singular values
1.9663 0 0
0 1.7846 0
0 0 0.0046
»kappa = cond(P_N)
kappa =
425.0254
The large condition number is consistent with the fact that the
throttle and flow inputs are almost redundant, and thus do not yield
two independent degrees of control authority. As we have noted,
both these inputs primarily affect pressure.
We now have a control problem with two inputs and three outputs.
Because we cannot independently control all three outputs, we must
choose two of them. (More generally, we can choose any two
independent linear combinations of the three outputs.) The physics
of the etch process suggests that Vbias and [ F ] are relatively more
important than Pressure (although this is a debatable point). Deleting
Pressure from the DC gain matrix yields
»svd(DCnew2)
ans =
1.9423
0.2053
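The benefit of the reduction can be quantified by comparing condition numbers. The sketch below recomputes them from the rounded singular values printed in the sessions above, so the results are only approximate; in particular, the ratio for the full matrix differs slightly from the value reported by cond because 0.0046 is a rounded display value.

```matlab
% Rounded singular values copied from the MATLAB output in these notes
s_full = [1.9663 1.7846 0.0046]; % three-input, three-output DC gain
s_new  = [1.9423 0.2053];        % after reducing to two inputs and two outputs

kappa_full = s_full(1) / s_full(end);  % roughly 427 from the rounded values
kappa_new  = s_new(1) / s_new(end);    % roughly 9.46

fprintf('kappa (full):    %.1f\n', kappa_full);
fprintf('kappa (reduced): %.2f\n', kappa_new);
```

The condition number drops by more than an order of magnitude, which suggests that the reduced problem is far less sensitive to errors in the DC gain data.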