
Lecture 7: Optimal Smoothing

Simo Särkkä
Department of Biomedical Engineering and Computational Science
Aalto University

March 17, 2011


Contents

What is Optimal Smoothing?

Bayesian Optimal Smoothing Equations

Rauch-Tung-Striebel Smoother

Gaussian Approximation Based Smoothing

Particle Smoothing

Summary and Demonstration


Filtering, Prediction and Smoothing


Types of Smoothing Problems

- Fixed-interval smoothing: estimate the states on an interval [0, T] given the measurements on the same interval.
- Fixed-point smoothing: estimate the state at a fixed point of time in the past.
- Fixed-lag smoothing: estimate the state at a fixed delay in the past.

Here we shall only consider fixed-interval smoothing; the others can quite easily be derived from it.


Examples of Smoothing Problems

- Given all the radar measurements of a rocket (or missile) trajectory, what was the exact place of launch?
- Estimate the whole trajectory of a car based on GPS measurements, in order to calibrate the inertial navigation system accurately.
- What was the history of a chemical/combustion/other process, given a batch of measurements from it?
- Remove noise from an audio signal by using a smoother to estimate the true audio signal under the noise.
- The smoothing solution also arises in the EM algorithm for estimating the parameters of a state space model.


Optimal Smoothing Algorithms

Linear Gaussian models:
- Rauch-Tung-Striebel smoother (RTSS).
- Two-filter smoother.

Non-linear Gaussian models:
- Extended Rauch-Tung-Striebel smoother (ERTSS).
- Unscented Rauch-Tung-Striebel smoother (URTSS).
- Statistically linearized Rauch-Tung-Striebel smoother (SLRTSS).
- Gaussian Rauch-Tung-Striebel smoothers (GRTSS): cubature, Gauss-Hermite, Bayes-Hermite, Monte Carlo.
- Two-filter versions of the above.

Non-linear non-Gaussian models:
- Sequential importance resampling based smoother.
- Rao-Blackwellized particle smoothers.
- Grid based smoother.

Problem Formulation

Probabilistic state space model:

  measurement model: y_k \sim p(y_k | x_k)
  dynamic model: x_k \sim p(x_k | x_{k-1})

- Assume that the filtering distributions p(x_k | y_{1:k}) have already been computed for all k = 0, \ldots, T.
- We want recursive equations for computing the smoothing distribution for all k < T:

    p(x_k | y_{1:T}).

- The recursion will go backwards in time, because on the last step the filtering and smoothing distributions coincide: p(x_T | y_{1:T}).

Derivation of Formal Smoothing Equations [1/2]

The key: due to the Markov properties of the state we have

  p(x_k | x_{k+1}, y_{1:T}) = p(x_k | x_{k+1}, y_{1:k}).

Thus we get

  p(x_k | x_{k+1}, y_{1:T}) = p(x_k | x_{k+1}, y_{1:k})
                            = \frac{p(x_k, x_{k+1} | y_{1:k})}{p(x_{k+1} | y_{1:k})}
                            = \frac{p(x_{k+1} | x_k, y_{1:k}) \, p(x_k | y_{1:k})}{p(x_{k+1} | y_{1:k})}
                            = \frac{p(x_{k+1} | x_k) \, p(x_k | y_{1:k})}{p(x_{k+1} | y_{1:k})}.

Derivation of Formal Smoothing Equations [2/2]

Assuming that the smoothing distribution of the next step p(x_{k+1} | y_{1:T}) is available, we get

  p(x_k, x_{k+1} | y_{1:T}) = p(x_k | x_{k+1}, y_{1:T}) \, p(x_{k+1} | y_{1:T})
                            = p(x_k | x_{k+1}, y_{1:k}) \, p(x_{k+1} | y_{1:T})
                            = \frac{p(x_{k+1} | x_k) \, p(x_k | y_{1:k}) \, p(x_{k+1} | y_{1:T})}{p(x_{k+1} | y_{1:k})}.

Integrating over x_{k+1} gives

  p(x_k | y_{1:T}) = p(x_k | y_{1:k}) \int \frac{p(x_{k+1} | x_k) \, p(x_{k+1} | y_{1:T})}{p(x_{k+1} | y_{1:k})} \, dx_{k+1}.

Bayesian Optimal Smoothing Equations

The Bayesian optimal smoothing equations consist of a prediction step and a backward update step:

  p(x_{k+1} | y_{1:k}) = \int p(x_{k+1} | x_k) \, p(x_k | y_{1:k}) \, dx_k

  p(x_k | y_{1:T}) = p(x_k | y_{1:k}) \int \frac{p(x_{k+1} | x_k) \, p(x_{k+1} | y_{1:T})}{p(x_{k+1} | y_{1:k})} \, dx_{k+1}.

The recursion is started from the filtering (and smoothing) distribution of the last time step, p(x_T | y_{1:T}).
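On a finite state space the integrals in these equations reduce to sums, which makes the recursion easy to check numerically. Below is a minimal sketch of a forward filter followed by the backward smoothing update; the transition matrix and likelihood values are illustrative, not from the lecture.

```python
# A minimal numerical sketch of the Bayesian smoothing recursion on a
# finite state space, where the integrals become sums; the transition
# matrix and likelihood values are illustrative.
import numpy as np

rng = np.random.default_rng(0)
n, T = 3, 5
A = np.array([[0.8, 0.1, 0.1],
              [0.1, 0.8, 0.1],
              [0.1, 0.1, 0.8]])          # A[i, j] = p(x_{k+1}=i | x_k=j)
like = rng.uniform(0.1, 1.0, (T, n))     # p(y_k | x_k) for each step

# Forward pass: filtering distributions p(x_k | y_{1:k}).
filt = np.zeros((T, n))
f = np.ones(n) / n                       # prior over the initial state
for k in range(T):
    f = like[k] * (A @ f if k > 0 else f)
    f /= f.sum()
    filt[k] = f

# Backward pass: the smoothing equations with sums in place of integrals.
smooth = np.zeros((T, n))
smooth[-1] = filt[-1]                    # recursion starts from the filter
for k in range(T - 2, -1, -1):
    pred = A @ filt[k]                   # prediction p(x_{k+1} | y_{1:k})
    smooth[k] = filt[k] * (A.T @ (smooth[k + 1] / pred))
```

Each row of `smooth` is a valid probability distribution, and the last row equals the filtering distribution, as the recursion requires.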


Linear-Gaussian Smoothing Problem

Gaussian driven linear model, i.e., Gauss-Markov model:

  x_k = A_{k-1} x_{k-1} + q_{k-1}
  y_k = H_k x_k + r_k.

In probabilistic terms the model is

  p(x_k | x_{k-1}) = N(x_k | A_{k-1} x_{k-1}, Q_{k-1})
  p(y_k | x_k) = N(y_k | H_k x_k, R_k).

The Kalman filter can be used for computing all the Gaussian filtering distributions:

  p(x_k | y_{1:k}) = N(x_k | m_k, P_k).

RTS: Derivation Preliminaries

The Gaussian probability density is

  N(x | m, P) = \frac{1}{(2\pi)^{n/2} |P|^{1/2}} \exp\left( -\frac{1}{2} (x - m)^T P^{-1} (x - m) \right).

Let x and y have the Gaussian densities

  p(x) = N(x | m, P),    p(y | x) = N(y | H x, R).

Then the joint and marginal distributions are

  \begin{pmatrix} x \\ y \end{pmatrix} \sim N\left( \begin{pmatrix} m \\ H m \end{pmatrix}, \begin{pmatrix} P & P H^T \\ H P & H P H^T + R \end{pmatrix} \right)

  y \sim N(H m, H P H^T + R).

RTS: Derivation Preliminaries (cont.)

If the random variables x and y have the joint Gaussian probability density

  \begin{pmatrix} x \\ y \end{pmatrix} \sim N\left( \begin{pmatrix} a \\ b \end{pmatrix}, \begin{pmatrix} A & C \\ C^T & B \end{pmatrix} \right),

then the marginal and conditional densities of x and y are given as follows:

  x \sim N(a, A)
  y \sim N(b, B)
  x | y \sim N(a + C B^{-1} (y - b), \; A - C B^{-1} C^T)
  y | x \sim N(b + C^T A^{-1} (x - a), \; B - C^T A^{-1} C).

Derivation of Rauch-Tung-Striebel Smoother [1/4]

By the Gaussian distribution computation rules we get

  p(x_k, x_{k+1} | y_{1:k}) = p(x_{k+1} | x_k) \, p(x_k | y_{1:k})
                            = N(x_{k+1} | A_k x_k, Q_k) \, N(x_k | m_k, P_k)
                            = N\left( \begin{pmatrix} x_k \\ x_{k+1} \end{pmatrix} \Big| \; m_1, P_1 \right),

where

  m_1 = \begin{pmatrix} m_k \\ A_k m_k \end{pmatrix},    P_1 = \begin{pmatrix} P_k & P_k A_k^T \\ A_k P_k & A_k P_k A_k^T + Q_k \end{pmatrix}.

Derivation of Rauch-Tung-Striebel Smoother [2/4]

By the conditioning rule of the Gaussian distribution we get

  p(x_k | x_{k+1}, y_{1:T}) = p(x_k | x_{k+1}, y_{1:k}) = N(x_k | m_2, P_2),

where

  G_k = P_k A_k^T (A_k P_k A_k^T + Q_k)^{-1}
  m_2 = m_k + G_k (x_{k+1} - A_k m_k)
  P_2 = P_k - G_k (A_k P_k A_k^T + Q_k) G_k^T.

Derivation of Rauch-Tung-Striebel Smoother [3/4]

The joint distribution of x_k and x_{k+1} given all the data is

  p(x_{k+1}, x_k | y_{1:T}) = p(x_k | x_{k+1}, y_{1:T}) \, p(x_{k+1} | y_{1:T})
                            = N(x_k | m_2, P_2) \, N(x_{k+1} | m^s_{k+1}, P^s_{k+1})
                            = N\left( \begin{pmatrix} x_{k+1} \\ x_k \end{pmatrix} \Big| \; m_3, P_3 \right),

where

  m_3 = \begin{pmatrix} m^s_{k+1} \\ m_k + G_k (m^s_{k+1} - A_k m_k) \end{pmatrix}

  P_3 = \begin{pmatrix} P^s_{k+1} & P^s_{k+1} G_k^T \\ G_k P^s_{k+1} & G_k P^s_{k+1} G_k^T + P_2 \end{pmatrix}.

Derivation of Rauch-Tung-Striebel Smoother [4/4]

The marginal mean and covariance are thus given as

  m^s_k = m_k + G_k (m^s_{k+1} - A_k m_k)
  P^s_k = P_k + G_k (P^s_{k+1} - A_k P_k A_k^T - Q_k) G_k^T.

The smoothing distribution is then Gaussian with the above mean and covariance:

  p(x_k | y_{1:T}) = N(x_k | m^s_k, P^s_k).

Rauch-Tung-Striebel Smoother

Backward recursion equations for the smoothed means m^s_k and covariances P^s_k:

  m^-_{k+1} = A_k m_k
  P^-_{k+1} = A_k P_k A_k^T + Q_k
  G_k = P_k A_k^T [P^-_{k+1}]^{-1}
  m^s_k = m_k + G_k [m^s_{k+1} - m^-_{k+1}]
  P^s_k = P_k + G_k [P^s_{k+1} - P^-_{k+1}] G_k^T,

where m_k and P_k are the mean and covariance computed by the Kalman filter.

The recursion is started from the last time step T, with m^s_T = m_T and P^s_T = P_T.
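The backward recursion translates almost line by line into NumPy. A minimal sketch, where the model matrices and "filter outputs" fed in are illustrative placeholders rather than results of an actual Kalman filter run:

```python
# A minimal NumPy sketch of the RTS backward recursion; the filter means
# and covariances below are illustrative placeholders.
import numpy as np

def rts_smoother(mk, Pk, A, Q):
    """mk: (T, n) filter means, Pk: (T, n, n) filter covariances."""
    T = len(mk)
    ms, Ps = mk.copy(), Pk.copy()                  # m^s_T = m_T, P^s_T = P_T
    for k in range(T - 2, -1, -1):
        m_pred = A @ mk[k]                         # m^-_{k+1}
        P_pred = A @ Pk[k] @ A.T + Q               # P^-_{k+1}
        G = Pk[k] @ A.T @ np.linalg.inv(P_pred)    # smoother gain G_k
        ms[k] = mk[k] + G @ (ms[k + 1] - m_pred)
        Ps[k] = Pk[k] + G @ (Ps[k + 1] - P_pred) @ G.T
    return ms, Ps

A = np.array([[1.0, 1.0], [0.0, 1.0]])
Q = 0.1 * np.eye(2)
mk = np.linspace(0.0, 1.0, 12).reshape(6, 2)       # placeholder filter means
Pk = np.stack([(1 + 0.1 * k) * np.eye(2) for k in range(6)])
ms, Ps = rts_smoother(mk, Pk, A, Q)
```

Note that the last smoothed mean and covariance are left equal to the filter outputs, exactly as the recursion prescribes.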

RTS Smoother: Car Tracking Example

The dynamic model of the car tracking model from the first and third lectures was:

  \begin{pmatrix} x_k \\ y_k \\ \dot{x}_k \\ \dot{y}_k \end{pmatrix}
  = \underbrace{\begin{pmatrix} 1 & 0 & \Delta t & 0 \\ 0 & 1 & 0 & \Delta t \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}}_{A}
  \begin{pmatrix} x_{k-1} \\ y_{k-1} \\ \dot{x}_{k-1} \\ \dot{y}_{k-1} \end{pmatrix} + q_{k-1},

where q_{k-1} is zero mean with covariance matrix

  Q = \begin{pmatrix}
    q_1^c \Delta t^3 / 3 & 0 & q_1^c \Delta t^2 / 2 & 0 \\
    0 & q_2^c \Delta t^3 / 3 & 0 & q_2^c \Delta t^2 / 2 \\
    q_1^c \Delta t^2 / 2 & 0 & q_1^c \Delta t & 0 \\
    0 & q_2^c \Delta t^2 / 2 & 0 & q_2^c \Delta t
  \end{pmatrix}.
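The model matrices above can be written out directly in code; the step size and the spectral densities q1c, q2c below are illustrative values, not the ones used in the lecture demo:

```python
# Building the constant-velocity model matrices above; dt and the
# spectral densities q1c, q2c are illustrative values.
import numpy as np

dt, q1c, q2c = 0.1, 1.0, 1.0
A = np.array([[1.0, 0.0, dt,  0.0],
              [0.0, 1.0, 0.0, dt ],
              [0.0, 0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0, 1.0]])
Q = np.array([
    [q1c * dt**3 / 3, 0.0,             q1c * dt**2 / 2, 0.0            ],
    [0.0,             q2c * dt**3 / 3, 0.0,             q2c * dt**2 / 2],
    [q1c * dt**2 / 2, 0.0,             q1c * dt,        0.0            ],
    [0.0,             q2c * dt**2 / 2, 0.0,             q2c * dt       ],
])
```

As a quick sanity check, Q must come out symmetric and positive semidefinite for any dt > 0.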

Non-Linear Smoothing Problem

Non-linear Gaussian state space model:

  x_k = f(x_{k-1}) + q_{k-1}
  y_k = h(x_k) + r_k.

We want to compute Gaussian approximations to the smoothing distributions:

  p(x_k | y_{1:T}) \approx N(x_k | m^s_k, P^s_k).

Extended Rauch-Tung-Striebel Smoother Derivation

The approximate joint distribution of x_k and x_{k+1} is

  p(x_k, x_{k+1} | y_{1:k}) = N\left( \begin{pmatrix} x_k \\ x_{k+1} \end{pmatrix} \Big| \; m_1, P_1 \right),

where

  m_1 = \begin{pmatrix} m_k \\ f(m_k) \end{pmatrix},    P_1 = \begin{pmatrix} P_k & P_k F_x^T(m_k) \\ F_x(m_k) P_k & F_x(m_k) P_k F_x^T(m_k) + Q_k \end{pmatrix}.

The rest of the derivation is analogous to the linear RTS smoother.

Extended Rauch-Tung-Striebel Smoother

The equations for the extended RTS smoother are

  m^-_{k+1} = f(m_k)
  P^-_{k+1} = F_x(m_k) P_k F_x^T(m_k) + Q_k
  G_k = P_k F_x^T(m_k) [P^-_{k+1}]^{-1}
  m^s_k = m_k + G_k [m^s_{k+1} - m^-_{k+1}]
  P^s_k = P_k + G_k [P^s_{k+1} - P^-_{k+1}] G_k^T,

where the matrix F_x(m_k) is the Jacobian matrix of f(x) evaluated at m_k.
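A single backward step of the extended RTS smoother can be sketched as below; `f` and its Jacobian `F` are supplied by the caller, and the function signature is a hypothetical illustration, not code from the lecture. For a linear f the step reduces exactly to the linear RTS step, which makes a convenient sanity check:

```python
# A sketch of one backward step of the extended RTS smoother for a
# user-supplied f and Jacobian F; all inputs are illustrative.
import numpy as np

def erts_step(mk, Pk, ms_next, Ps_next, f, F, Q):
    m_pred = f(mk)                          # m^-_{k+1} = f(m_k)
    Fm = F(mk)                              # Jacobian F_x(m_k)
    P_pred = Fm @ Pk @ Fm.T + Q             # P^-_{k+1}
    G = Pk @ Fm.T @ np.linalg.inv(P_pred)   # gain G_k
    ms = mk + G @ (ms_next - m_pred)
    Ps = Pk + G @ (Ps_next - P_pred) @ G.T
    return ms, Ps
```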


Statistically Linearized Rauch-Tung-Striebel Smoother Derivation

With statistical linearization we get the approximation

  p(x_k, x_{k+1} | y_{1:k}) = N\left( \begin{pmatrix} x_k \\ x_{k+1} \end{pmatrix} \Big| \; m_1, P_1 \right),

where

  m_1 = \begin{pmatrix} m_k \\ E[f(x_k)] \end{pmatrix}

  P_1 = \begin{pmatrix} P_k & E[f(x_k) \, \delta x_k^T]^T \\ E[f(x_k) \, \delta x_k^T] & E[f(x_k) \, \delta x_k^T] \, P_k^{-1} \, E[f(x_k) \, \delta x_k^T]^T + Q_k \end{pmatrix},

with \delta x_k = x_k - m_k.

- The expectations are taken with respect to the filtering distribution of x_k.
- The derivation proceeds as with the linear RTS smoother.

Statistically Linearized Rauch-Tung-Striebel Smoother

The equations for the statistically linearized RTS smoother are

  m^-_{k+1} = E[f(x_k)]
  P^-_{k+1} = E[f(x_k) \, \delta x_k^T] \, P_k^{-1} \, E[f(x_k) \, \delta x_k^T]^T + Q_k
  G_k = E[f(x_k) \, \delta x_k^T]^T [P^-_{k+1}]^{-1}
  m^s_k = m_k + G_k [m^s_{k+1} - m^-_{k+1}]
  P^s_k = P_k + G_k [P^s_{k+1} - P^-_{k+1}] G_k^T,

where \delta x_k = x_k - m_k and the expectations are taken with respect to the filtering distribution x_k \sim N(m_k, P_k).
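For some nonlinearities the required expectations are available in closed form. A scalar sketch for f(x) = sin(x), using the standard Gaussian identities E[sin(x)] = sin(m) e^{-P/2} and E[sin(x) δx] = cos(m) e^{-P/2} P; the numbers m, P, Q are illustrative:

```python
# Statistically linearized prediction step for the scalar model
# f(x) = sin(x), whose Gaussian expectations have closed forms;
# m, P, Q are illustrative numbers.
import numpy as np

m, P, Q = 0.5, 0.2, 0.01
Ef = np.sin(m) * np.exp(-P / 2)            # E[f(x)]
Efdx = np.cos(m) * np.exp(-P / 2) * P      # E[f(x) dx], dx = x - m
m_pred = Ef                                # m^-_{k+1}
P_pred = Efdx * (1.0 / P) * Efdx + Q       # E[f dx] P^{-1} E[f dx]^T + Q
G = Efdx / P_pred                          # smoother gain (scalar case)
```

Both closed forms are easy to verify by Monte Carlo sampling from N(m, P).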


Gaussian Rauch-Tung-Striebel Smoother Derivation

With Gaussian moment matching we get the approximation

  p(x_k, x_{k+1} | y_{1:k}) = N\left( \begin{pmatrix} x_k \\ x_{k+1} \end{pmatrix} \Big| \begin{pmatrix} m_k \\ m^-_{k+1} \end{pmatrix}, \begin{pmatrix} P_k & D_{k+1} \\ D_{k+1}^T & P^-_{k+1} \end{pmatrix} \right),

where

  m^-_{k+1} = \int f(x_k) \, N(x_k | m_k, P_k) \, dx_k
  P^-_{k+1} = \int [f(x_k) - m^-_{k+1}] [f(x_k) - m^-_{k+1}]^T \, N(x_k | m_k, P_k) \, dx_k + Q_k
  D_{k+1} = \int [x_k - m_k] [f(x_k) - m^-_{k+1}]^T \, N(x_k | m_k, P_k) \, dx_k.

Gaussian Rauch-Tung-Striebel Smoother

The equations for the Gaussian RTS smoother are

  m^-_{k+1} = \int f(x_k) \, N(x_k | m_k, P_k) \, dx_k
  P^-_{k+1} = \int [f(x_k) - m^-_{k+1}] [f(x_k) - m^-_{k+1}]^T \, N(x_k | m_k, P_k) \, dx_k + Q_k
  D_{k+1} = \int [x_k - m_k] [f(x_k) - m^-_{k+1}]^T \, N(x_k | m_k, P_k) \, dx_k
  G_k = D_{k+1} [P^-_{k+1}]^{-1}
  m^s_k = m_k + G_k (m^s_{k+1} - m^-_{k+1})
  P^s_k = P_k + G_k (P^s_{k+1} - P^-_{k+1}) G_k^T.


Cubature Smoother Derivation [1/2]

Recall the 3rd order spherical Gaussian integral rule:

  \int g(x) \, N(x | m, P) \, dx \approx \frac{1}{2n} \sum_{i=1}^{2n} g(m + \sqrt{P} \, \xi^{(i)}),

where

  \xi^{(i)} = \begin{cases} \sqrt{n} \, e_i, & i = 1, \ldots, n \\ -\sqrt{n} \, e_{i-n}, & i = n+1, \ldots, 2n, \end{cases}

and e_i denotes a unit vector in the direction of coordinate axis i.

Cubature Smoother Derivation [2/2]

We get the approximation

  p(x_k, x_{k+1} | y_{1:k}) = N\left( \begin{pmatrix} x_k \\ x_{k+1} \end{pmatrix} \Big| \begin{pmatrix} m_k \\ m^-_{k+1} \end{pmatrix}, \begin{pmatrix} P_k & D_{k+1} \\ D_{k+1}^T & P^-_{k+1} \end{pmatrix} \right),

where

  X_k^{(i)} = m_k + \sqrt{P_k} \, \xi^{(i)}

  m^-_{k+1} = \frac{1}{2n} \sum_{i=1}^{2n} f(X_k^{(i)})
  P^-_{k+1} = \frac{1}{2n} \sum_{i=1}^{2n} [f(X_k^{(i)}) - m^-_{k+1}] [f(X_k^{(i)}) - m^-_{k+1}]^T + Q_k
  D_{k+1} = \frac{1}{2n} \sum_{i=1}^{2n} [X_k^{(i)} - m_k] [f(X_k^{(i)}) - m^-_{k+1}]^T.

Cubature Rauch-Tung-Striebel Smoother [1/3]

1. Form the sigma points:

     X_k^{(i)} = m_k + \sqrt{P_k} \, \xi^{(i)},    i = 1, \ldots, 2n,

   where the unit sigma points are defined as

     \xi^{(i)} = \begin{cases} \sqrt{n} \, e_i, & i = 1, \ldots, n \\ -\sqrt{n} \, e_{i-n}, & i = n+1, \ldots, 2n. \end{cases}

2. Propagate the sigma points through the dynamic model:

     X_{k+1}^{(i)} = f(X_k^{(i)}),    i = 1, \ldots, 2n.

Cubature Rauch-Tung-Striebel Smoother [2/3]

3. Compute the predicted mean m^-_{k+1}, the predicted covariance P^-_{k+1} and the cross-covariance D_{k+1}:

     m^-_{k+1} = \frac{1}{2n} \sum_{i=1}^{2n} X_{k+1}^{(i)}
     P^-_{k+1} = \frac{1}{2n} \sum_{i=1}^{2n} (X_{k+1}^{(i)} - m^-_{k+1}) (X_{k+1}^{(i)} - m^-_{k+1})^T + Q_k
     D_{k+1} = \frac{1}{2n} \sum_{i=1}^{2n} (X_k^{(i)} - m_k) (X_{k+1}^{(i)} - m^-_{k+1})^T.

Cubature Rauch-Tung-Striebel Smoother [3/3]

4. Compute the gain G_k, mean m^s_k and covariance P^s_k as follows:

     G_k = D_{k+1} [P^-_{k+1}]^{-1}
     m^s_k = m_k + G_k (m^s_{k+1} - m^-_{k+1})
     P^s_k = P_k + G_k (P^s_{k+1} - P^-_{k+1}) G_k^T.


Unscented Rauch-Tung-Striebel Smoother [1/3]

1. Form the sigma points:

     X_k^{(0)} = m_k
     X_k^{(i)} = m_k + \sqrt{n + \lambda} \left[ \sqrt{P_k} \right]_i
     X_k^{(i+n)} = m_k - \sqrt{n + \lambda} \left[ \sqrt{P_k} \right]_i,    i = 1, \ldots, n.

2. Propagate the sigma points through the dynamic model:

     X_{k+1}^{(i)} = f(X_k^{(i)}),    i = 0, \ldots, 2n.

Unscented Rauch-Tung-Striebel Smoother [2/3]

3. Compute the predicted mean, covariance and cross-covariance:

     m^-_{k+1} = \sum_{i=0}^{2n} W_i^{(m)} X_{k+1}^{(i)}
     P^-_{k+1} = \sum_{i=0}^{2n} W_i^{(c)} (X_{k+1}^{(i)} - m^-_{k+1}) (X_{k+1}^{(i)} - m^-_{k+1})^T + Q_k
     D_{k+1} = \sum_{i=0}^{2n} W_i^{(c)} (X_k^{(i)} - m_k) (X_{k+1}^{(i)} - m^-_{k+1})^T.

Unscented Rauch-Tung-Striebel Smoother [3/3]

4. Compute the gain, smoothed mean and smoothed covariance as follows:

     G_k = D_{k+1} [P^-_{k+1}]^{-1}
     m^s_k = m_k + G_k (m^s_{k+1} - m^-_{k+1})
     P^s_k = P_k + G_k (P^s_{k+1} - P^-_{k+1}) G_k^T.
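The weights W_i^(m), W_i^(c) are the unscented transform weights from the earlier UKF lecture; for reference, a sketch with the standard parameterization λ = α²(n + κ) - n, where the α, β, κ values below are illustrative tuning defaults, not ones prescribed by this lecture:

```python
# A sketch of the standard unscented transform weights used above;
# the alpha, beta, kappa defaults are illustrative tuning values.
import numpy as np

def ut_weights(n, alpha=1.0, beta=2.0, kappa=0.0):
    lam = alpha**2 * (n + kappa) - n
    Wm = np.full(2 * n + 1, 1.0 / (2.0 * (n + lam)))   # W_i^(m), i >= 1
    Wc = Wm.copy()                                     # W_i^(c), i >= 1
    Wm[0] = lam / (n + lam)
    Wc[0] = lam / (n + lam) + (1.0 - alpha**2 + beta)
    return Wm, Wc

Wm, Wc = ut_weights(2)
```

The mean weights always sum to one, which is a quick consistency check on any implementation.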


Other Gaussian RTS Smoothers

- The Gauss-Hermite RTS smoother is based on multidimensional Gauss-Hermite integration.
- The Bayes-Hermite or Gaussian process RTS smoother uses Gaussian process based quadrature (Bayes-Hermite).
- Monte Carlo integration based RTS smoothers.
- Central differences etc.


Particle Smoothing [1/2]

- The smoothing solution can be obtained from SIR by storing the whole state histories into the particles.
- Special care is needed on the resampling step.
- The smoothed distribution approximation is then of the form

    p(x_k | y_{1:T}) \approx \sum_{i=1}^{N} w_T^{(i)} \, \delta(x_k - x_k^{(i)}),

  where x_k^{(i)} is the kth component of x_{1:T}^{(i)}.

- Unfortunately, the approximation is often quite degenerate.
- Specialized algorithms for particle smoothing exist.
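A minimal bootstrap-SIR sketch that stores whole histories, for an illustrative scalar random-walk model (all numbers made up, and the measurement sequence is a stand-in rather than simulated from the model). Resampling at every step leaves the weights equal, so the smoothing approximation is simply the empirical distribution of the stored histories:

```python
# Bootstrap SIR that stores whole state histories in the particles;
# the scalar random-walk model and all numbers are illustrative.
import numpy as np

rng = np.random.default_rng(0)
T, N, q, r = 20, 300, 0.1, 0.5
y = rng.normal(0.0, 1.0, T)                   # stand-in measurement data

hist = rng.normal(0.0, 1.0, (N, 1))           # histories x^(i)_{0:k}
for k in range(T):
    prop = hist[:, -1] + rng.normal(0.0, np.sqrt(q), N)  # dynamics
    w = np.exp(-0.5 * (y[k] - prop) ** 2 / r)            # likelihood
    w /= w.sum()
    idx = rng.choice(N, size=N, p=w)          # resample whole histories
    hist = np.column_stack([hist[idx], prop[idx]])

# After resampling the weights are equal, so the smoothed marginal at
# each k is the empirical distribution of the corresponding column.
smoothed_mean = hist.mean(axis=0)
```

Printing the number of distinct ancestors in the first column of `hist` makes the degeneracy mentioned above visible: early time steps collapse onto a few surviving histories.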


Particle Smoothing [2/2]

Recall the Rao-Blackwellized particle filtering model:

  s_k \sim p(s_k | s_{k-1})
  x_k = A(s_{k-1}) x_{k-1} + q_k,    q_k \sim N(0, Q)
  y_k = H(s_k) x_k + r_k,    r_k \sim N(0, R)

The principle of Rao-Blackwellized particle smoothing is the following:

1. During filtering, store the whole sampled state and Kalman filter histories in the particles.
2. At the smoothing step, apply Rauch-Tung-Striebel smoothers to each of the Kalman filter histories in the particles.

The smoothing distribution approximation will then be of the form

  p(x_k, s_k | y_{1:T}) \approx \sum_{i=1}^{N} w_T^{(i)} \, \delta(s_k - s_k^{(i)}) \, N(x_k | m_k^{s,(i)}, P_k^{s,(i)}).

Summary

- Optimal smoothing is used for computing estimates of state trajectories given the measurements on the whole trajectory.
- The Rauch-Tung-Striebel (RTS) smoother is the closed form smoother for linear Gaussian models.
- Extended, statistically linearized and unscented RTS smoothers are the approximate nonlinear smoothers corresponding to the EKF, SLF and UKF.
- Gaussian RTS smoothers: the cubature RTS smoother, Gauss-Hermite RTS smoothers and various others.
- Particle smoothing can be done by storing the whole state histories in the SIR algorithm.
- The Rao-Blackwellized particle smoother is a combination of particle smoothing and RTS smoothing.

Matlab Demo: Pendulum [1/2]

Pendulum model:

  \begin{pmatrix} x_k^1 \\ x_k^2 \end{pmatrix}
  = \underbrace{\begin{pmatrix} x_{k-1}^1 + x_{k-1}^2 \, \Delta t \\ x_{k-1}^2 - g \sin(x_{k-1}^1) \, \Delta t \end{pmatrix}}_{f(x_{k-1})} + \begin{pmatrix} 0 \\ q_{k-1} \end{pmatrix}

  y_k = \underbrace{\sin(x_k^1)}_{h(x_k)} + r_k.

The required Jacobian matrix for the ERTSS:

  F_x(x) = \begin{pmatrix} 1 & \Delta t \\ -g \cos(x^1) \, \Delta t & 1 \end{pmatrix}.

Matlab Demo: Pendulum [2/2]

The required expected value for the SLRTSS is

  E[f(x)] = \begin{pmatrix} m_1 + m_2 \, \Delta t \\ m_2 - g \sin(m_1) \exp(-P_{11}/2) \, \Delta t \end{pmatrix}.

And the cross term:

  E[f(x) (x - m)^T] = \begin{pmatrix} c_{11} & c_{12} \\ c_{21} & c_{22} \end{pmatrix},

where

  c_{11} = P_{11} + \Delta t \, P_{12}
  c_{12} = P_{12} + \Delta t \, P_{22}
  c_{21} = P_{12} - g \, \Delta t \cos(m_1) \, P_{11} \exp(-P_{11}/2)
  c_{22} = P_{22} - g \, \Delta t \cos(m_1) \, P_{12} \exp(-P_{11}/2).
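The pendulum dynamics f and the ERTSS Jacobian F_x can be written directly in code, with a finite-difference check of the Jacobian; the values of dt and g below are illustrative (the lecture demo used Matlab):

```python
# The pendulum dynamics f and Jacobian F_x above; dt and g are
# illustrative values.
import numpy as np

g, dt = 9.81, 0.01

def f(x):
    return np.array([x[0] + x[1] * dt,
                     x[1] - g * np.sin(x[0]) * dt])

def Fx(x):
    return np.array([[1.0, dt],
                     [-g * np.cos(x[0]) * dt, 1.0]])
```

A central finite difference of f should match Fx to within the truncation error, which catches sign mistakes in the cos term.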


