1.2 Lecture 2: Investigation of the Least Squares approach
Recall that we have seen examples of modelling (or discretising) a problem into the linear system of equations
$$ Km = d, \qquad m = m^{true}. \qquad (1.2) $$
Even though the matrix K is an N × M matrix, we will try to "invert" it. We will use the following notation, adopted from [1]: $m^{est}$ is an estimate of the model parameters; it could be given as bounds, or with some degree of certainty. For example, $m^{est}_1 = 1.2 \pm 0.1$ (95%) would mean that there is a 0.95 probability that the true value of the model parameter $m_1 = m^{true}_1$ lies between 1.1 and 1.3. The estimated model parameters can then generate predicted data:
$$ K m^{est} = d^{pre}. $$
Remark: This linear discrete system (1.2) can also be used as a local approximation for weakly non-linear problems:
$$ d = k(m) \approx k(\hat m_n) + \nabla k\,[m - \hat m_n] = k(\hat m_n) + K_n\,\Delta m_{n+1}, \qquad n \in \{0, 1, 2, \dots\}, $$
using a Taylor series expansion around $\hat m_n$, where
$$ K_n = \nabla k\big|_{m = \hat m_n}, \qquad [K_n]_{ij} = \frac{\partial k_i}{\partial \hat m^{(n)}_j}, \qquad \Delta m_{n+1} = m - \hat m_n. $$
Thus we need to invert $d' = d - k(\hat m_n) = K_n\,\Delta m_{n+1}$ for $\Delta m_{n+1}$, starting with $\hat m_0$.
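As an illustration (a minimal NumPy sketch added here, not part of the original notes), the iteration above can be written out for a made-up nonlinear forward map $k(m) = m_1 e^{m_2 z}$; the forward function, its Jacobian and all numerical values below are assumptions chosen purely for the example.

```python
import numpy as np

# Hypothetical nonlinear forward map (not from the notes): d_i = m1 * exp(m2 * z_i)
z = np.linspace(0.0, 1.0, 20)

def k(m):
    return m[0] * np.exp(m[1] * z)

def jacobian(m):
    # K_n = grad k evaluated at the current model
    return np.column_stack([np.exp(m[1] * z), m[0] * z * np.exp(m[1] * z)])

m_true = np.array([2.0, -1.5])
d = k(m_true)                        # noise-free data, for illustration only

m_hat = np.array([1.0, 0.0])         # starting model m_0
for n in range(20):
    Kn = jacobian(m_hat)             # K_n
    d_prime = d - k(m_hat)           # d' = d - k(m_hat_n)
    # solve K_n * delta_m_{n+1} = d' in the least squares sense
    delta_m, *_ = np.linalg.lstsq(Kn, d_prime, rcond=None)
    m_hat = m_hat + delta_m
print(m_hat)                         # should approach m_true = (2.0, -1.5)
```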
We will consider methods based on the size (length) of $d^{pre}$ relative to the observed data, i.e. on the length of the prediction error.
Example: Fitting a straight line.
$d^{obs} = [d^{obs}_1, \dots, d^{obs}_N]^T$ is the observed data (known). We are interested in the model parameters
$$ m = [m_1, m_2]^T, \quad \text{the y-intercept and the gradient}, \quad M = 2. $$
We want to choose $d^{pre}$ as close as possible to $d^{obs}$. To make "as close as possible" precise, we need to define the distance/length that we will minimise.
Consider
$$ e_i = d^{obs}_i - d^{pre}_i, $$
the prediction error of the i-th measurement, or residual. Then the total overall error should tend to its minimum, i.e.
$$ E = \sum_{i=1}^N e_i^2 = e^T e \to \min, $$
where
$$ e = d^{obs} - d^{pre}. $$
1.2.1 Measures of length
Given a vector space V over a field F ($\mathbb{C}$ or $\mathbb{R}$), a norm on V is a non-negative-valued function $p : V \to \mathbb{R}$ such that for any $\alpha \in F$ and for any $u, v \in V$,
• $p(u + v) \le p(u) + p(v)$ (triangle inequality)
• $p(\alpha v) = |\alpha|\, p(v)$ (absolute scalability)
• $p(v) = 0 \Rightarrow v = 0$ (positive-definiteness)
If p satisfies only the first two, it is called a semi-norm.
Vector norms
$$ L_1\ (l_1)\ \text{norm:} \quad \|e\|_1 = \sum_i |e_i| $$
$$ L_2\ (l_2)\ \text{(Euclidean) norm:} \quad \|e\|_2 = \Big( \sum_i |e_i|^2 \Big)^{1/2} = (e, e)^{1/2} $$
$$ L_n\ (l_n)\ \text{norm:} \quad \|e\|_n = \Big( \sum_i |e_i|^n \Big)^{1/n} $$
Example: Consider a vector (of errors) $e = [0.01,\ 0.1,\ 3]^T$; then its norms are:
$$ \|e\|_1 = 0.01 + 0.1 + 3 = 3.11 $$
$$ \|e\|_2 = \sqrt{0.01^2 + 0.1^2 + 3^2} \approx 3.00168286 $$
$$ \|e\|_{25} = \big(0.01^{25} + 0.1^{25} + 3^{25}\big)^{1/25} \approx 3.000000000000000000000000000000000000014163\ldots $$
The largest component matters more and more as $n \to \infty$; hence, taking the limit, we obtain the infinity norm
$$ L_\infty\ (l_\infty)\ \text{norm:} \quad \|e\|_\infty = \max_i |e_i|, $$
which selects the vector element with the largest absolute value as the measure of length.
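These values can be checked directly; the following is a small NumPy sketch (an addition, not part of the original notes):

```python
import numpy as np

e = np.array([0.01, 0.1, 3.0])
print(np.linalg.norm(e, 1))        # L1 norm: 3.11
print(np.linalg.norm(e, 2))        # L2 norm: ~3.0016829
print(np.linalg.norm(e, 25))       # L25 norm: ~3.0000...
print(np.linalg.norm(e, np.inf))   # L-infinity norm: 3.0
```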
The choice of norm
Figure 1.6: This illustration [1] shows the difference in taking various norms
Computations are simpler with the L2 norm than with the L1 norm. The least squares method uses the L2 norm to quantify length; the norms above differ in the weight they give to outliers. (An outlier is a data point far from the average trend.) If the data are very accurate, we might use a higher-order norm, since an outlier would then contain a lot of information and should be taken into account properly. The L2 norm implies that the data obey Gaussian statistics (as we will see under the probabilistic approach); it weights outliers heavily, since the Gaussian distribution is short-tailed, i.e. it leaves little room for outliers (a few scattered points), see Fig. 1.7. Long-tailed distributions have many scattered (improbable) points and are better treated with the L1 norm. The choice of norm thus depends on the statistics the data obey. Methods that can tolerate a few bad data points are known as robust.
Figure 1.7: This illustration [1] compares two distributions with different tails: the left distribution is long-tailed,
the right one is short-tailed
Matrix norms
A vector-induced matrix norm is
$$ \|K\| = \max_{m \neq 0} \frac{\|Km\|}{\|m\|} \qquad \text{or} \qquad \|K\| = \max_{\|m\| = 1} \|Km\|; $$
the latter implies that $\|Km\| \le \|K\|\,\|m\|$ and $\|I\| = 1$.
The $L_1$ norm is:
$$ \|K\|_1 = \max_{m \neq 0} \frac{\|Km\|_1}{\|m\|_1} = \max_j \sum_i |k_{ij}| = \max_j \|c_j\|_1, $$
where $c_j$ is the $j$th column vector of the matrix K. Thus the matrix $L_1$ norm is the largest $L_1$ norm of the columns of the matrix.
The $L_\infty$ vector norm generates a very similar $L_\infty$ matrix norm, i.e.
$$ \|K\|_\infty = \max_{m \neq 0} \frac{\|Km\|_\infty}{\|m\|_\infty} = \max_i \sum_j |k_{ij}| = \max_i \|r_i\|_1, $$
where $r_i$ is the $i$th row vector of the matrix K. Thus the matrix $L_\infty$ norm is the largest $L_1$ norm of the rows of the matrix.
If we use the vector L2 norm, we get the matrix L2 norm:
$$ \|K\|_2 = \max_{m \neq 0} \frac{\|Km\|_2}{\|m\|_2} = \sqrt{\rho(K^* K)} = \sqrt{\rho(K K^*)} = \|K^*\|_2, $$
where $\rho(K)$ is the spectral radius of the matrix K and $K^*$ is the adjoint of K. For square matrices, the spectral radius is
$$ \rho(K) = \max_{1 \le i \le N} |\lambda_i(K)|, $$
where, in turn, $\lambda_i(K)$ are the eigenvalues of K, such that $\det(K - \lambda I_N) = 0$. The matrix L2 norm of K is the largest square root of the eigenvalues of the matrix $K K^*$ or of $K^* K$ (the largest singular value of K). The singular values of a matrix K are the positive square roots of the eigenvalues of $K K^T$ or $K^T K$, i.e.
$$ \mu_i(K) = \sqrt{\lambda_i(K^T K)} = \sqrt{\lambda_i(K K^T)}. $$
If the matrix K is Hermitian (self-adjoint), $K = K^*$, or symmetric, $K = K^T$, the L2 norm equals the spectral radius of K:
$$ \|K\|_2 = \rho(K) \ \text{ if } K = K^* \text{ or } K = K^T; \qquad \text{in general } \|K\| \ge \rho(K). $$
The Frobenius norm is not a vector-induced matrix norm, but it can be computed easily:
$$ \|K\|_F = \Big( \sum_{i=1}^N \sum_{j=1}^M |k_{ij}|^2 \Big)^{1/2}, \qquad \text{or} \qquad \|K\|_F = \{\mathrm{tr}(K^* K)\}^{1/2}. $$
The latter allows one to estimate the L2 norm efficiently for an N × N matrix by using
$$ \|K\|_2 \le \|K\|_F \le \sqrt{N}\,\|K\|_2. $$
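As a numerical illustration (a NumPy sketch added to the notes; the particular matrix is arbitrary), the matrix norms above and the Frobenius bound can be checked directly:

```python
import numpy as np

K = np.array([[4.0, 0.0],
              [-3.0, 2.0],
              [1.0, 5.0]])

print(np.linalg.norm(K, 1))        # largest absolute column sum
print(np.linalg.norm(K, np.inf))   # largest absolute row sum
n2 = np.linalg.norm(K, 2)          # largest singular value
nF = np.linalg.norm(K, 'fro')      # Frobenius norm
# L2 norm equals the square root of the largest eigenvalue of K^T K
print(np.isclose(n2, np.sqrt(np.max(np.linalg.eigvalsh(K.T @ K)))))
# Frobenius bounds on the L2 norm
print(n2 <= nF <= np.sqrt(min(K.shape)) * n2)
```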
1.2.2 Classification of linear inverse problems based on the quality of information
We are still dealing with equation (1.2), where M is the number of model parameters and N is the number of data. We judge the quality of information, which depends on the N data points and the number M of model parameters, but also on the structure of the matrix K, which can be sparse or full.
(Purely) underdetermined problems The data do not provide enough information to determine m uniquely. One easy explanation is N < M: we have more unknowns than data. We usually have several solutions that deliver zero prediction error E = 0. Assume N < M and that there are no inconsistencies; then we can find more than one solution such that the total overall error is E = 0.
Solution: minimisation of the norm of the estimated solution.
Overdetermined problems We have too much information, usually N > M, and we minimise the total overall error.
Solution: minimisation of the prediction error.
Mixed-determined problems Frequently, the data determine some of the model parameters uniquely, but not others. The situation is typical for tomography: if the rays miss some of the blocks but go through others, some blocks may have plenty of data about their parameters, while a missed block has no chance of being reconstructed. Usually $K^T K$ is singular and, even though M < N, the data kernel has poor structure.
Even-determined problems These have exactly enough information to determine the model parameters.
1.2.3 Least squares problem for a straight line
Assume the data $d = [d_1, \dots, d_N]^T$ can be described by the straight line $d = m_1 + m_2 z$, hence
$$ d_i = m_1 + m_2 z_i, \qquad m = [m_1, m_2]^T, \qquad M = 2. $$
Usually N > M, and then there is no solution (for which e = 0), except for the special case when all the data points actually lie on a straight line; hence the inverse problem is overdetermined. So we look for an approximate solution, such that
$$ E = e^T e = \sum_{i=1}^N (d_i - m_1 - m_2 z_i)^2 \to \min. $$
This is a calculus problem: we consider the total overall error as a function of the model parameters, $E = E(m_1, m_2)$, and take the derivatives $\partial E / \partial m_q$, $q \in \{1, 2\}$. For each $i \in \{1, 2, \dots, N\}$ we get
$$ \frac{\partial}{\partial m_1}(d_i - m_1 - m_2 z_i)^2 = 2m_1 - 2d_i + 2m_2 z_i, $$
$$ \frac{\partial}{\partial m_2}(d_i - m_1 - m_2 z_i)^2 = 2m_1 z_i - 2d_i z_i + 2m_2 z_i^2, $$
hence we have
$$ \frac{\partial E}{\partial m_1} = 2N m_1 - 2\sum_{i=1}^N d_i + 2m_2 \sum_{i=1}^N z_i, $$
$$ \frac{\partial E}{\partial m_2} = 2m_1 \sum_{i=1}^N z_i - 2\sum_{i=1}^N d_i z_i + 2m_2 \sum_{i=1}^N z_i^2. $$
Setting them to zero for a minimum,
$$ 2N m_1 - 2\sum_{i=1}^N d_i + 2m_2 \sum_{i=1}^N z_i = 0, $$
$$ 2m_1 \sum_{i=1}^N z_i - 2\sum_{i=1}^N d_i z_i + 2m_2 \sum_{i=1}^N z_i^2 = 0. $$
Dividing everything by two, the latter can be written as a matrix equation:
$$ \begin{bmatrix} N & \sum_{i=1}^N z_i \\ \sum_{i=1}^N z_i & \sum_{i=1}^N z_i^2 \end{bmatrix} \begin{bmatrix} m_1 \\ m_2 \end{bmatrix} = \begin{bmatrix} \sum_{i=1}^N d_i \\ \sum_{i=1}^N z_i d_i \end{bmatrix}. $$
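A minimal NumPy sketch of this 2×2 system (an addition to the notes, using synthetic data; the "true" intercept 1.5 and gradient 0.7 below are assumptions of the example):

```python
import numpy as np

rng = np.random.default_rng(0)
z = np.linspace(0.0, 10.0, 50)
d = 1.5 + 0.7 * z + 0.1 * rng.standard_normal(z.size)   # synthetic noisy data

N = z.size
A = np.array([[N,       z.sum()],
              [z.sum(), (z**2).sum()]])
b = np.array([d.sum(), (z * d).sum()])
m1, m2 = np.linalg.solve(A, b)
print(m1, m2)    # estimated intercept and gradient, close to (1.5, 0.7)
```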
1.2.4 Least squares solution to the linear inverse problem
The matrix equation we consider is the same:
$$ d = Km, \qquad m = [m_1, m_2, \dots, m_M]^T, $$
and recall the total overall error
$$ E = e^T e = \big(d^{obs} - d^{pre}\big)^T \big(d^{obs} - d^{pre}\big) = (d - Km)^T (d - Km) = \sum_{i=1}^N \Big[ d_i - \sum_{j=1}^M K_{ij} m_j \Big] \Big[ d_i - \sum_{k=1}^M K_{ik} m_k \Big]. $$
Then, taking the derivatives, we get the so-called normal equation
$$ K^T K m^{est} - K^T d = 0, \qquad (1.3) $$
where $K^T K$ is a square M × M matrix, m is a vector of length M, and $K^T d$ is also a vector of length M.
If $[K^T K]^{-1}$ exists, then the solution exists and is given by
$$ m^{LS} = [K^T K]^{-1} K^T d, $$
where $m^{LS}$ is the least squares solution to Km = d. Notice that $[K^T K]^{-1}$ can be extremely hard to compute, as it can be huge. Usually, even if K is sparse, $K^T K$ is not sparse enough to simplify the computations. For large matrices, the normal equation can instead be solved iteratively, e.g. by the biconjugate gradient algorithm.
If $[K^T K]^{-1}$ does not exist, the solution might not be unique.
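For illustration (a sketch on random data, not part of the original notes), the normal-equation formula can be compared with a library least squares routine:

```python
import numpy as np

rng = np.random.default_rng(1)
K = rng.standard_normal((50, 3))              # N = 50 data, M = 3 parameters
m_true = np.array([1.0, -2.0, 0.5])
d = K @ m_true + 0.05 * rng.standard_normal(50)

m_ls = np.linalg.solve(K.T @ K, K.T @ d)      # normal equations (K^T K invertible here)
m_np, *_ = np.linalg.lstsq(K, d, rcond=None)  # library least squares (QR/SVD based)
print(np.allclose(m_ls, m_np))                # True: the two solutions agree
```

Note that forming $K^T K$ explicitly squares the condition number of the problem, which is why library routines are usually based on QR or SVD factorisations rather than on the normal equations.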
1.2.5 An alternative approach to least squares approximation
Consider again the normal equation (1.3),
$$ K^T K m^{est} = K^T d, \qquad (1.4) $$
where K is a given N × M matrix and d is a given vector in $\mathbb{R}^N$. We define a real-valued function $F : \mathbb{R}^M \to \mathbb{R}$ by
$$ F(m) := \|Km - d\|^2 = (Km - d) \cdot (Km - d), \qquad \text{for all } m \in \mathbb{R}^M. $$
Expanding, $F(m) = m^T K^T K m - 2 m^T K^T d + d^T d$, so the gradient vector of F satisfies
$$ \nabla F(m) = 2\big(K^T K m - K^T d\big), \qquad \text{for all } m \in \mathbb{R}^M. \qquad (1.5) $$
Thus $\nabla F(m) = 0$ at the least squares solution $m^{est}$. According to equation (1.5), the least squares solution must therefore satisfy the normal equation (1.4).
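As a quick sanity check of (1.5) (a sketch with random K and d, added here, not part of the original notes), the analytic gradient can be compared with central finite differences:

```python
import numpy as np

rng = np.random.default_rng(2)
K = rng.standard_normal((6, 4))
d = rng.standard_normal(6)
m = rng.standard_normal(4)

def F(m):
    r = K @ m - d
    return r @ r                       # F(m) = ||Km - d||^2

grad_analytic = 2.0 * (K.T @ K @ m - K.T @ d)

h = 1e-6                               # central finite differences
grad_fd = np.array([(F(m + h * ei) - F(m - h * ei)) / (2 * h) for ei in np.eye(4)])
print(np.allclose(grad_analytic, grad_fd, atol=1e-5))   # True
```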
1.2.6 Minimal Length solution: purely underdetermined problems
We consider equation (1.2) in the case where it is purely underdetermined, i.e. more than one solution minimises the error.
What should we do? We use another guiding principle: add some a priori information. Mainly, this quantifies expectations about the character of the solution that are not based on the actual data; examples include: the density is positive; the solution is "small" or "simple" (in some measure of length).
We then might want to consider
$$ L = m^T m = \sum_i m_i^2 \to \min. \qquad (1.6) $$
We still want to minimise the error (but there are several solutions that do so), so we add that as a constraint and obtain a Lagrange multipliers problem:
$$ \min_m L \quad \text{subject to the constraint} \quad e = d - Km = 0. $$
We are thus looking to minimise
$$ \Phi(m) = L + \sum_{i=1}^N \lambda_i e_i = \sum_{i=1}^M m_i^2 + \sum_{i=1}^N \lambda_i \Big[ d_i - \sum_{j=1}^M K_{ij} m_j \Big]. \qquad (1.7) $$
Taking the derivatives and setting $\partial \Phi / \partial m_q = 0$, we get
$$ \frac{\partial \Phi}{\partial m_q} = 2 m_q - \sum_{i=1}^N \lambda_i K_{iq} = 0, $$
and hence, in matrix form,
$$ 2m = K^T \lambda, \qquad Km = d. $$
The latter implies that
$$ d = K K^T \frac{\lambda}{2}, $$
where $K K^T$ is an N × N matrix, and if $[K K^T]^{-1}$ exists, then
$$ \lambda = 2 [K K^T]^{-1} d, $$
and hence the minimal length solution is
$$ m^{ML} = K^T [K K^T]^{-1} d. \qquad (1.8) $$
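A minimal NumPy sketch of (1.8) for a random purely underdetermined system (the sizes are arbitrary, chosen for illustration); it also checks agreement with the minimum-norm solution returned by the pseudoinverse:

```python
import numpy as np

rng = np.random.default_rng(3)
K = rng.standard_normal((3, 6))                  # N = 3 < M = 6: purely underdetermined
d = rng.standard_normal(3)

m_ml = K.T @ np.linalg.solve(K @ K.T, d)         # m_ML = K^T [K K^T]^{-1} d
print(np.allclose(K @ m_ml, d))                  # True: fits the data exactly
print(np.allclose(m_ml, np.linalg.pinv(K) @ d))  # True: agrees with the pseudoinverse solution
```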
1.2.7 Weak underdetermination: damped least squares
For weakly underdetermined problems we can adopt an approximation to partitioning the matrix. Recall that if some parameters can be determined and others cannot, we would like to separate them (see the mixed-determined case in a later section).
We determine a solution that minimises some combination Φ(m) of the prediction error and the solution length for m:
$$ \Phi(m) = E + \varepsilon^2 L = e^T e + \varepsilon^2 m^T m, \qquad (1.9) $$
where $\varepsilon^2$ is a weighting factor which determines the relative importance given to the prediction error and to the solution length.
If ε is large, we minimise the underdetermined part of the solution: BUT this tends to shrink the overdetermined part as well, hence E is not properly minimised.
If ε = 0, then E is minimal, but no a priori information is provided to single out the underdetermined part.
We need a compromise; there are some methods for choosing ε, but it is mostly trial and error.
Minimisation of Φ (lectures and exercise) gives
$$ \Phi(m) = E + \varepsilon^2 L = \sum_{i=1}^N \Big[ d_i - \sum_{j=1}^M K_{ij} m_j \Big] \Big[ d_i - \sum_{k=1}^M K_{ik} m_k \Big] + \varepsilon^2 \sum_{i=1}^M m_i^2, $$
$$ \frac{\partial \Phi}{\partial m_q} = 2\varepsilon^2 m_q - 2\sum_{i=1}^N K_{iq} d_i + 2\sum_{k=1}^M m_k \sum_{i=1}^N K_{iq} K_{ik} = 0, $$
and hence
$$ K^T K m - K^T d + \varepsilon^2 m = 0. $$
The corresponding solution is called the damped least squares solution:
$$ [K^T K + \varepsilon^2 I]\, m^{DLS} = K^T d. \qquad (1.10) $$
The underdeterminacy is said to be damped.
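A minimal NumPy sketch of the damped least squares solution (1.10), with an arbitrary random system and an arbitrary choice of ε, purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(4)
K = rng.standard_normal((20, 10))
d = rng.standard_normal(20)
eps = 0.5                                   # damping parameter, chosen by trial and error

M = K.shape[1]
m_dls = np.linalg.solve(K.T @ K + eps**2 * np.eye(M), K.T @ d)
print(np.linalg.norm(K @ m_dls - d))        # prediction error
print(np.linalg.norm(m_dls))                # solution length (damped by eps)
```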
1.2.8 Other a priori information
Example If we want to reconstruct density fluctuations in the ocean, we do not want the solution to be close to zero, but rather to some typical a priori value $m^{priori}$ (e.g. a typical sea-water value), so $L = m^T m$ might not be a good measure, and we need
$$ L = (m - m^{priori})^T (m - m^{priori}). $$
Example We want our solution to be flat, and that is easy to quantify. Flatness is the opposite of steepness, and the first derivative controls it. In the discrete case we consider
$$ \frac{\partial m}{\partial x} \to \frac{m_{i+1} - m_i}{\Delta x}, $$
thus the steepness of m is
$$ l = \frac{1}{\Delta x} \begin{bmatrix} -1 & 1 & 0 & 0 & \cdots & 0 \\ 0 & -1 & 1 & 0 & \cdots & 0 \\ 0 & 0 & -1 & 1 & \cdots & 0 \\ \cdots & \cdots & \cdots & \cdots & \cdots & \cdots \end{bmatrix} \begin{bmatrix} m_1 \\ m_2 \\ m_3 \\ \vdots \\ m_M \end{bmatrix} = Dm, $$
where D is the steepness matrix, an approximation of $\frac{dm}{dx}$.
Example Assume the solution m is smooth, i.e. the parameters vary slowly with position. Roughness is the opposite of smoothness, and the second derivative controls it. The rows of the matrix will now contain terms like
$$ (\Delta x)^{-2}\,[\ \dots\ \ 1\ \ -2\ \ 1\ \ \dots\ ]. $$
Hence the matrix approximating $\frac{d^2 m}{dx^2}$ is
$$ D = \frac{1}{(\Delta x)^2} \begin{bmatrix} 1 & -2 & 1 & 0 & \cdots & 0 \\ 0 & 1 & -2 & 1 & \cdots & 0 \\ 0 & 0 & 1 & -2 & \cdots & 0 \\ \cdots & \cdots & \cdots & \cdots & \cdots & \cdots \end{bmatrix}, $$
and the measure of length is
$$ L = l^T l = [Dm]^T Dm = m^T D^T D\, m = m^T W_m m. $$
The matrix $W_m = D^T D$ is the weighting factor that enters into the calculation of the length of the vector m.
$\|m\|^2_{weighted} = m^T W_m m$ is not a proper norm, as it violates positive-definiteness: $\|m\|^2_{weighted} = 0$ for some non-zero vectors, such as a constant vector. This can cause non-uniqueness.
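A small NumPy sketch (an addition to the notes) that builds first- and second-difference matrices and verifies that a constant model has zero weighted length, which is exactly the non-uniqueness issue mentioned above:

```python
import numpy as np

M, dx = 6, 1.0

# first-difference (steepness) matrix, size (M-1) x M
D1 = (np.eye(M, k=1) - np.eye(M))[:-1] / dx
# second-difference (roughness) matrix, size (M-2) x M
D2 = (np.eye(M, k=2) - 2 * np.eye(M, k=1) + np.eye(M))[:-2] / dx**2

Wm = D2.T @ D2
m_const = np.ones(M)
print(m_const @ Wm @ m_const)   # 0.0: a constant model has zero weighted "length"
```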
1.2.9 A priori and weighting matrix
The measure of solution simplicity can therefore be generalised to
$$ L = (m - m^{priori})^T W_m (m - m^{priori}); \qquad (1.11) $$
see Section 1.2.8 for the matrix $W_m$. By suitably choosing the a priori model $m^{priori}$ and the weights $W_m$, one can generate a variety of measures of simplicity. We can also consider the generalised prediction error
$$ E = e^T W_e e, $$
where $W_e$, in its turn, defines the relative contribution of each individual error (normally this is a diagonal matrix).
Example: Weighted least squares To solve the completely overdetermined Km = d with $E = e^T W_e e$, we have
$$ m^{WLS} = [K^T W_e K]^{-1} K^T W_e d. $$
Example: Weighted minimum length To solve the completely underdetermined Km = d with $L = [m - m^{priori}]^T W_m [m - m^{priori}]$, we have
$$ m^{WML} = m^{priori} + W_m^{-1} K^T [K W_m^{-1} K^T]^{-1} [d - K m^{priori}]. $$
Example: Weighted damped least squares To solve the slightly underdetermined Km = d with $\Phi(m) = E + \varepsilon^2 L$ (ε is chosen by trial and error), we have
$$ m^{WDLS} = [K^T W_e K + \varepsilon^2 W_m]^{-1} [K^T W_e d + \varepsilon^2 W_m m^{priori}]. $$
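A minimal NumPy sketch of the weighted damped least squares formula (an addition; the random data, the diagonal $W_e$, the identity $W_m$ and the value of ε are all arbitrary choices made for the illustration):

```python
import numpy as np

rng = np.random.default_rng(5)
N, M = 20, 10
K = rng.standard_normal((N, M))
d = rng.standard_normal(N)
m_prior = np.zeros(M)

We = np.diag(rng.uniform(0.5, 2.0, N))   # data weights, e.g. inverse variances
Wm = np.eye(M)                           # simple model weighting
eps = 0.5

lhs = K.T @ We @ K + eps**2 * Wm
rhs = K.T @ We @ d + eps**2 * Wm @ m_prior
m_wdls = np.linalg.solve(lhs, rhs)
print(m_wdls)
```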
1.2.10 Moore-Penrose Pseudoinverses and least squares
Consider
$$ \min_m \|Km - d\|. $$
Assume $m_0$ and $m_1$ are any two solutions to the normal equation (1.3), (1.4),
$$ K^T K m = K^T d; $$
then
$$ K^T K (m_0 - m_1) = 0, \quad \text{so} \quad m_0 - m_1 \in \mathrm{Null}(K^T K). $$
Recall that $\mathrm{Null}(K) = \{ m : Km = 0 \}$.
Theorem 1.1. Let K be an N × M matrix, let m be a vector in $\mathbb{R}^M$ and let d be a vector in $\mathbb{R}^N$; then
$$ Km \cdot d = m \cdot K^T d, $$
where by · we denote the dot (scalar) product of vectors (it will generally be omitted).
Proof. The proof follows the same computation as the example below.
Example This example illustrates well what happens in Theorem 1.1. Consider
$$ K = \begin{bmatrix} 4 & 0 \\ -3 & 2 \\ 1 & 5 \end{bmatrix}, \qquad m = \begin{bmatrix} m_1 \\ m_2 \end{bmatrix}, \qquad d = \begin{bmatrix} d_1 \\ d_2 \\ d_3 \end{bmatrix}, \qquad M = 2,\ N = 3. $$
Then
$$ Km \cdot d = \begin{bmatrix} 4m_1 \\ -3m_1 + 2m_2 \\ m_1 + 5m_2 \end{bmatrix} \cdot \begin{bmatrix} d_1 \\ d_2 \\ d_3 \end{bmatrix} = 4m_1 d_1 + (-3m_1 + 2m_2) d_2 + (m_1 + 5m_2) d_3 = [4d_1 - 3d_2 + d_3] m_1 + [2d_2 + 5d_3] m_2 $$
$$ = \begin{bmatrix} m_1 \\ m_2 \end{bmatrix} \cdot \begin{bmatrix} 4d_1 - 3d_2 + d_3 \\ 2d_2 + 5d_3 \end{bmatrix} = \begin{bmatrix} m_1 \\ m_2 \end{bmatrix} \cdot \begin{bmatrix} 4 & -3 & 1 \\ 0 & 2 & 5 \end{bmatrix} \begin{bmatrix} d_1 \\ d_2 \\ d_3 \end{bmatrix} = m \cdot K^T d. $$
Theorem 1.2 (Corollary). If d is orthogonal to the range of K, then d is in $\mathrm{Null}(K^T)$.
Proof. If d is orthogonal to the range of K, then
$$ Km \cdot d = 0 \ \text{ for all } m \in \mathbb{R}^M \ \Rightarrow\ m \cdot K^T d = 0, \ \text{ by Theorem 1.1,} $$
i.e. the vector $K^T d$ is orthogonal to every vector in $\mathbb{R}^M$, in particular to itself, so
$$ K^T d \cdot K^T d = 0, \ \text{ i.e. } \ \|K^T d\|^2 = 0 \ \Rightarrow\ K^T d = 0. $$
The latter proves that $d \in \mathrm{Null}(K^T)$.
Theorem 1.3. For an arbitrary matrix K, $\mathrm{Null}(K)$ is the same as $\mathrm{Null}(K^T K)$.
Proof. We show the direct inclusion first: suppose m is in $\mathrm{Null}(K)$, so Km = 0; then
$$ m \in \mathrm{Null}(K) \ \Rightarrow\ Km = 0 \ \Rightarrow\ K^T K m = K^T 0 = 0 \ \Rightarrow\ m \in \mathrm{Null}(K^T K). $$
Now the reverse inclusion. Assume $m \in \mathrm{Null}(K^T K)$, so $K^T K m = 0$ (recall that the range of K is a subspace of $\mathbb{R}^N$ and the null space of K is a subspace of $\mathbb{R}^M$). Then, using Theorem 1.1,
$$ \|Km\|^2 = Km \cdot Km = m \cdot K^T K m = m \cdot 0 = 0 \ \Rightarrow\ Km = 0. $$
Thus the null spaces are the same.
Hence $m_0 - m_1 \in \mathrm{Null}(K)$, and thus there exists a unique solution to the normal equation that is also orthogonal to the null space of K: if there were two such solutions, their difference would be in the null space of K and at the same time orthogonal to the null space, and hence zero.
Let us denote this unique solution by $m^+$; then $\|m^+\|$ is as small as possible among the solutions of the normal equation. Some explanation: any other solution must differ from $m^+$ by a component in the null space of K, orthogonal to $m^+$, and adding an orthogonal component only increases the norm, by the Pythagorean theorem.
The uniquely determined vector $m^+$ is called the Moore-Penrose solution to the normal equation, and $m^+ \in \mathrm{Range}(K^T)$.
If $K^T K$ is invertible, the solution to the normal equation is unique and the Moore-Penrose solution coincides with the least squares solution, $m^{LS} = m^{MP} = [K^T K]^{-1} K^T d$.
Example: Consider
$$ K = \begin{bmatrix} 1 & 1 & 0 \\ 1 & 1 & 0 \\ 1 & 0 & 1 \\ 1 & 0 & 1 \end{bmatrix} \quad \text{and data} \quad d = \begin{bmatrix} 1 \\ 3 \\ 8 \\ 2 \end{bmatrix}. $$
$K^T K$ is not invertible (exercise), $\mathrm{Null}(K) = \{ t(-1, 1, 1)^T : t \in \mathbb{R} \}$, and hence the set of solutions to the corresponding normal equation is given by
$$ \{ (1, 1, 4)^T + t(-1, 1, 1)^T : t \in \mathbb{R} \}, $$
since all the solutions to the normal equation have the form $\hat m = (x,\ 2 - x,\ 5 - x)^T$. The norm of a typical solution is
$$ \sqrt{(1 - t)^2 + (1 + t)^2 + (4 + t)^2} = \sqrt{3t^2 + 8t + 18} \to \min \ \text{ when } \ 6t + 8 = 0 \ \Rightarrow\ t = -\frac{8}{6}. $$
Thus the Moore-Penrose solution is
$$ m^{MP} = m^+ = (1, 1, 4)^T - \frac{8}{6}(-1, 1, 1)^T = \Big( \frac{7}{3}, -\frac{1}{3}, \frac{8}{3} \Big)^T, $$
and it is indeed orthogonal to the vector $(-1, 1, 1)^T$.
Summary: here we used the following approach: find all the solutions to the normal equation and then take the one with minimal norm. The singular value decomposition provides another route to the Moore-Penrose solution.
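The example above can be verified numerically; below is a short NumPy sketch (an addition to the notes) using the built-in pseudoinverse, which returns the minimum-norm least squares solution:

```python
import numpy as np

K = np.array([[1.0, 1.0, 0.0],
              [1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [1.0, 0.0, 1.0]])
d = np.array([1.0, 3.0, 8.0, 2.0])

m_mp = np.linalg.pinv(K) @ d
print(m_mp)                                      # ~[ 2.3333, -0.3333, 2.6667] = (7/3, -1/3, 8/3)
print(np.dot(m_mp, np.array([-1.0, 1.0, 1.0])))  # ~0: orthogonal to the null space of K
```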
Exercises
1–6. Take
$$ K = \begin{bmatrix} -1 & 2 \\ 2 & -3 \\ -1 & 3 \end{bmatrix} \quad \text{and data} \quad d = \begin{bmatrix} 4 \\ 1 \\ 2 \end{bmatrix}. $$
Show that $K^T K$ is invertible and then find the unique least squares solution $m^{est} = \begin{bmatrix} 3 \\ 2 \end{bmatrix}$. Compute $\|K m^{est} - d\|$.
1–7. Take
$$ K = \begin{bmatrix} 1 & 1 & 0 \\ 1 & 1 & 0 \\ 1 & 0 & 1 \\ 1 & 0 & 1 \end{bmatrix} \quad \text{and data} \quad d = \begin{bmatrix} 1 \\ 3 \\ 8 \\ 2 \end{bmatrix}. $$
Show that $K^T K$ is non-invertible. Find all the solutions to the normal equation and show that the unique element closest to d in the range of K is $\hat y = \begin{bmatrix} 2 \\ 2 \\ 5 \\ 5 \end{bmatrix}$. Compute $\|\hat y - d\|$ in this case.
1–8. Compute the singular values of
$$ K = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}. $$
References
[1] William Menke, Geophysical Data Analysis: Discrete Inverse Theory, 3rd edition, Elsevier, 2012.
[2] Per Christian Hansen, Discrete Inverse Problems: Insight and Algorithms, SIAM, 2010.