Lecture 2: Kalman Filters
İTÜ-EEF Dept. of Control Engineering
Kalman Filters
• Why Kalman Filters?
• Overview
• Theoretical Basis
• The Kalman Filter
• Algorithm
• Mathematical Derivation
[Block diagram: external controls drive the system, whose true state is desired but not directly known. Measuring devices, subject to measurement error sources, produce observed measurements; an estimator fuses these into an optimal estimate of the system state.]
Overview
• Recall the Bayes Filter: two different ways to fuse
measurements z and actions u
– Prediction:
$$\overline{bel}(x_t) = \int p(x_t \mid u_t, x_{t-1})\, bel(x_{t-1})\, dx_{t-1}$$
– Correction:
$$bel(x_t) = \eta\, p(z_t \mid x_t)\, \overline{bel}(x_t)$$
• Reminders:
• We need the motion model and perception model
• We were working with discrete representations of states
Overview
• Gaussian Filters : a family of recursive state
estimators
• Gaussian filters are the earliest implementations of
Bayes Filters for continuous spaces
• Basic idea : beliefs are represented by
multivariate normal distributions
(Gaussians)
Overview
• Review of Gaussians :
– Univariate:
$p(x) \sim N(\mu, \sigma^2)$ :
$$p(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{1}{2}\frac{(x-\mu)^2}{\sigma^2}}$$
Overview
• Review of Gaussians :
– Multivariate:
$p(\mathbf{x}) \sim N(\boldsymbol{\mu}, \Sigma)$ :
$$p(\mathbf{x}) = \frac{1}{(2\pi)^{d/2}\, |\Sigma|^{1/2}}\, e^{-\frac{1}{2}(\mathbf{x}-\boldsymbol{\mu})^T \Sigma^{-1} (\mathbf{x}-\boldsymbol{\mu})}$$
Overview
Properties of Gaussians :
• Propagation of Gaussians under a linear transformation:
– Univariate:
$$X \sim N(\mu, \sigma^2),\quad Y = aX + b \ \Rightarrow\ Y \sim N(a\mu + b,\ a^2\sigma^2)$$
– Multivariate:
$$X \sim N(\mu, \Sigma),\quad Y = AX + B \ \Rightarrow\ Y \sim N(A\mu + B,\ A\Sigma A^T)$$
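Below is a minimal numerical sketch of this propagation rule (illustrative only, not from the slides; all names and values are arbitrary), checking the closed-form moments against Monte Carlo samples:

```python
import numpy as np

# Propagation of a Gaussian through Y = AX + B:
# Y ~ N(A*mu + B, A*Sigma*A^T)
rng = np.random.default_rng(0)
mu = np.array([1.0, 2.0])
Sigma = np.array([[1.0, 0.3],
                  [0.3, 2.0]])
A = np.array([[2.0, 0.0],
              [1.0, 1.0]])
B = np.array([0.5, -1.0])

mu_Y = A @ mu + B          # closed-form mean of Y
Sigma_Y = A @ Sigma @ A.T  # closed-form covariance of Y

# Empirical check: transform samples of X and compare the moments.
X = rng.multivariate_normal(mu, Sigma, size=200_000)
Y = X @ A.T + B
print(np.allclose(Y.mean(axis=0), mu_Y, atol=0.02))  # True (approximately)
print(np.allclose(np.cov(Y.T), Sigma_Y, atol=0.05))  # True (approximately)
```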
Overview
• Product of Gaussians:
– Univariate:
$$X_1 \sim N(\mu_1, \sigma_1^2),\ X_2 \sim N(\mu_2, \sigma_2^2)$$
$$\Rightarrow\ p(X_1) \cdot p(X_2) \sim N\!\left(\frac{\sigma_2^2}{\sigma_1^2 + \sigma_2^2}\,\mu_1 + \frac{\sigma_1^2}{\sigma_1^2 + \sigma_2^2}\,\mu_2,\ \ \frac{1}{\sigma_1^{-2} + \sigma_2^{-2}}\right)$$
– Multivariate:
$$X_1 \sim N(\mu_1, \Sigma_1),\ X_2 \sim N(\mu_2, \Sigma_2)$$
$$\Rightarrow\ p(X_1) \cdot p(X_2) \sim N\!\left((\Sigma_1^{-1} + \Sigma_2^{-1})^{-1}(\Sigma_1^{-1}\mu_1 + \Sigma_2^{-1}\mu_2),\ \ (\Sigma_1^{-1} + \Sigma_2^{-1})^{-1}\right)$$
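A small sketch (illustrative, not from the slides) of the univariate product rule, written as inverse-variance weighting:

```python
def fuse_1d(mu1, var1, mu2, var2):
    """Mean and variance of the (normalized) product of two 1D Gaussians."""
    var = 1.0 / (1.0 / var1 + 1.0 / var2)  # 1/var = 1/var1 + 1/var2
    mu = var * (mu1 / var1 + mu2 / var2)   # inverse-variance weighted mean
    return mu, var

# The fused variance is smaller than either input variance:
print(fuse_1d(0.0, 2.0, 1.0, 2.0))  # (0.5, 1.0)
```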
Overview
What is a Kalman Filter?
• Recursive data processing algorithm
• Generates optimal estimate of desired quantities
given the set of measurements
• Optimal?
– For linear systems with white Gaussian noise, the Kalman filter
is the “best” estimate based on all previous measurements
– For non-linear systems, optimality is ‘qualified’
• Recursive?
– Doesn’t need to store all previous measurements and
reprocess all data each time step
Overview
Conceptual Overview
[Figure sequence, animated over several slides: Gaussian densities over the state. First the current estimate ŷ(t2) with its uncertainty; then a measurement z(t2) with its own uncertainty; then a simple prediction ŷ-(t3) propagated forward from ŷ(t2), whose density is wider than that of the corrected estimate.]
Overview
Lessons learnt from the conceptual overview:
• Initial conditions (ŷk-1 and σk-1)
• Prediction (ŷ-k , σ-k)
– Use the initial conditions and a model (e.g. constant
velocity) to make a prediction
• Measurement (zk)
– Take a measurement
• Correction (ŷk , σk)
– Use the measurement to correct the prediction by
‘blending’ prediction and residual – always a case
of merging only two Gaussians (see the worked example below)
– Optimal estimate with smaller variance
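As a concrete numeric illustration (values invented for this example): blending a prediction $N(10, 4)$ with a measurement $N(12, 1)$ via the product-of-Gaussians rule from the Overview gives

$$\sigma_k^2 = \left(\frac{1}{4} + \frac{1}{1}\right)^{-1} = 0.8, \qquad \hat{y}_k = 0.8\left(\frac{10}{4} + \frac{12}{1}\right) = 11.6,$$

i.e. an estimate between the prediction and the measurement (closer to the less uncertain one), with a smaller variance than either.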
KF : Theoretical Basis
Dynamic model:
• linear with additive Gaussian noise:
$$p(x_t \mid u_t, x_{t-1}) = N(x_t;\ A x_{t-1} + B u_t,\ R_t)$$
KF : Theoretical Basis
Observation model:
• linear with additive Gaussian noise:
$$p(z_t \mid x_t) = N(z_t;\ C x_t,\ Q_t)$$
Initial condition:
• Gaussian distribution: $bel(x_0) = p(x_0) = N(x_0;\ \mu_0, \Sigma_0)$
KF : Theoretical Basis
Components:
• A : matrix (n × n) that describes how the state evolves from t−1
to t without controls or noise.
KF : Theoretical Basis
Linear dynamics and observation model case:
• if the initial distribution p(x0) is Gaussian, then all
consecutive posteriors p(xt | u1:t , z1:t) are also
Gaussian
• we obtain a closed-form solution for the recursive
equation
• at each t, the Bayes filter re-estimates the mean and
variance-covariance matrix of the posterior Gaussian
with closed-form formulas;
• this is the well-known Kalman Filter
KF : Theoretical Basis
• The Kalman filter estimates p(xt | u1:t , z1:t), as a
normal distribution, through its mean and variance-
covariance matrix.
• The mean of the posterior is exactly the optimal
recursive estimator (Minimum Mean Square Error,
MMSE) in the case of linear dynamics, a linear
observation model with additive Gaussian noise,
and a known normal initial distribution.
KF : Theoretical Basis
The solution can be divided into two steps:
• the integration step, which only considers the dynamic
model and which we refer to as the prediction: given
the distribution of xt−1 and an uncertain model of
how xt−1 leads to xt, the prediction of xt is
$$\int_{x_{t-1}} p(x_t \mid u_t, x_{t-1})\, p(x_{t-1} \mid z_{1:t-1}, u_{1:t-1})\, dx_{t-1}$$
KF : Theoretical Basis
Notation :
• $(x_t^-, \Sigma_t^-)$ : the Gaussian parameters corresponding
to the prediction,
• $(x_t^+, \Sigma_t^+)$ : the Gaussian parameters corresponding
to the correction.
PS:
• $x_t^-$ : state prediction, $x_t^+$ : state estimate
• $\Sigma_t^-$ : covariance prediction, $\Sigma_t^+$ : covariance estimate
KF : Theoretical Basis
Prediction equations:
$$x_t^- = A\, x_{t-1}^+ + B\, u_t$$
$$\Sigma_t^- = A\, \Sigma_{t-1}^+ A^T + R_t$$
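As an illustrative instance (a standard textbook model, not spelled out on these slides): for a 1D constant-velocity model with state $x = [p\ \ v]^T$, sampling period $\Delta t$, and a scalar acceleration input $u_t$,

$$A = \begin{pmatrix} 1 & \Delta t \\ 0 & 1 \end{pmatrix}, \qquad B = \begin{pmatrix} \Delta t^2 / 2 \\ \Delta t \end{pmatrix},$$

so the prediction shifts the position by $v\,\Delta t$ while $R_t$ inflates the covariance.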
KF : Theoretical Basis
Correction equations:
$$K_t = \Sigma_t^- C^T (C\, \Sigma_t^- C^T + Q_t)^{-1}$$
$$x_t^+ = x_t^- + K_t (z_t - C\, x_t^-)$$
$$\Sigma_t^+ = (I - K_t C)\, \Sigma_t^-$$
KF : Theoretical Basis
Dimensions:
• xt : n × 1,
• A, Σt−, Σt+, Rt : n × n,
• zt : m × 1,
• C : m × n,
• Qt : m × m,
• ut : c × 1,
• B : n × c,
• Kt : n × m.
Remember!
[Block diagram, as in the Overview: external controls $u_t$ drive the system; measuring devices produce the observed measurements $z_t$; the estimator outputs the optimal estimate of the system state $x_t^+$.]
• System dynamics: $x_t = A x_{t-1} + B u_t + \varepsilon_t$, where $\varepsilon_t$ is the system error source, with covariance $R_t$
• Measurements: $z_t = C x_t + \delta_t$, where $\delta_t$ is the measurement error, with covariance $Q_t$
• Recall the propagation rule (from the Properties of Gaussians slide): $X \sim N(\mu, \Sigma),\ Y = AX + B \ \Rightarrow\ Y \sim N(A\mu + B,\ A\Sigma A^T)$
Kalman Filtering : 1D
[Estimator block: inputs are the external control $u_t$ and the observed measurement $z_t$; the output is the optimal estimate of the system state $x_t^+$.]
State prediction:
$$\overline{bel}(x_t):\quad \bar{\mu}_t = a_t \mu_{t-1} + b_t u_t, \qquad \bar{\sigma}_t^2 = a_t^2 \sigma_{t-1}^2 + \sigma_{act,t}^2$$
Measurement at step $t$: mean $z_t$ and variance $\sigma_{obs,t}^2$
Kalman Filtering : 1D
State update (correction) at step $t$:
$$bel(x_t):\quad \mu_t = \bar{\mu}_t + K_t (z_t - \bar{\mu}_t), \qquad \sigma_t^2 = (1 - K_t)\, \bar{\sigma}_t^2, \qquad \text{with } K_t = \frac{\bar{\sigma}_t^2}{\bar{\sigma}_t^2 + \sigma_{obs,t}^2}$$
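A compact sketch of the two 1D steps above (illustrative, not from the slides; function names are arbitrary):

```python
def kf1d_predict(mu, var, u, a=1.0, b=1.0, var_act=0.0):
    """State prediction: mu_bar = a*mu + b*u, var_bar = a^2*var + var_act."""
    return a * mu + b * u, a * a * var + var_act

def kf1d_update(mu_bar, var_bar, z, var_obs):
    """Correction: blend prediction and measurement with gain K_t."""
    K = var_bar / (var_bar + var_obs)  # K_t = var_bar / (var_bar + var_obs)
    mu = mu_bar + K * (z - mu_bar)     # mu_t = mu_bar + K_t (z_t - mu_bar)
    var = (1.0 - K) * var_bar          # var_t = (1 - K_t) var_bar
    return mu, var

# One predict/update cycle:
mu, var = kf1d_predict(mu=0.0, var=1.0, u=1.0, var_act=0.25)
mu, var = kf1d_update(mu, var, z=1.3, var_obs=0.5)
```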
Kalman Filtering : 1D
[Same estimator block diagram.] In matrix form, the 1D update generalizes to:
$$x_t^+ = x_t^- + K_t (z_t - C\, x_t^-)$$
$$\Sigma_t^+ = (I - K_t C)\, \Sigma_t^-$$
[Figure slides: 1D example densities showing prediction, measurement, and update over two successive time steps.]
The Prediction-Correction-Cycle
Prediction:
$$x_t^- = A\, x_{t-1}^+ + B\, u_t$$
$$\Sigma_t^- = A\, \Sigma_{t-1}^+ A^T + R_t$$
Correction:
$$K_t = \Sigma_t^- C^T (C\, \Sigma_t^- C^T + Q_t)^{-1}$$
$$x_t^+ = x_t^- + K_t (z_t - C\, x_t^-)$$
$$\Sigma_t^+ = (I - K_t C)\, \Sigma_t^-$$
[Cycle diagram: the two steps alternate, each feeding the other.]
The Kalman Filter Algorithm:
1. Kalman_filter($x_{t-1}^+$, $\Sigma_{t-1}^+$, $u_t$, $z_t$):
Prediction:
2. $x_t^- = A\, x_{t-1}^+ + B\, u_t$
3. $\Sigma_t^- = A\, \Sigma_{t-1}^+ A^T + R_t$
Correction:
4. $K_t = \Sigma_t^- C^T (C\, \Sigma_t^- C^T + Q_t)^{-1}$
5. $x_t^+ = x_t^- + K_t (z_t - C\, x_t^-)$
6. $\Sigma_t^+ = (I - K_t C)\, \Sigma_t^-$
7. Return $x_t^+$, $\Sigma_t^+$
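A direct NumPy transcription of lines 2-7 (a sketch, not an official reference implementation; matrix names follow the slides, the function name is arbitrary):

```python
import numpy as np

def kalman_filter(x_prev, P_prev, u, z, A, B, C, R, Q):
    # Prediction (lines 2-3)
    x_pred = A @ x_prev + B @ u                     # x_t^- = A x_{t-1}^+ + B u_t
    P_pred = A @ P_prev @ A.T + R                   # Sigma_t^- = A Sigma_{t-1}^+ A^T + R_t
    # Correction (lines 4-6)
    S = C @ P_pred @ C.T + Q                        # innovation covariance
    K = P_pred @ C.T @ np.linalg.inv(S)             # K_t
    x_new = x_pred + K @ (z - C @ x_pred)           # x_t^+
    P_new = (np.eye(len(x_pred)) - K @ C) @ P_pred  # Sigma_t^+
    return x_new, P_new                             # line 7
```

Here P plays the role of Σ; in practice one would solve the linear system rather than form the inverse explicitly.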
Mathematical Derivation of KF
• Aim : mathematical derivation of the KF Algorithm
• Initial belief is normally distributed:
$$bel(x_0) = N(x_0;\ \mu_0, \Sigma_0)$$
Part 1: Prediction (lines 2 and 3 will be derived)
• Dynamics are a linear function of state and control plus
additive noise:
$$x_t = A x_{t-1} + B u_t + \varepsilon_t$$
• State transition probability:
$$p(x_t \mid u_t, x_{t-1}) = N(x_t;\ A x_{t-1} + B u_t,\ R_t)$$
Mathematical Derivation of KF
• State prediction at time step t:
$$\overline{bel}(x_t) = \int_{x_{t-1}} p(x_t \mid u_t, x_{t-1})\, bel(x_{t-1})\, dx_{t-1}$$
Mathematical Derivation of KF
• Let us start with the prediction at time step t:
$$\overline{bel}(x_t) = \eta \int \exp\left\{-\tfrac{1}{2}(x_t - A x_{t-1} - B u_t)^T R_t^{-1} (x_t - A x_{t-1} - B u_t)\right\} \cdot \exp\left\{-\tfrac{1}{2}(x_{t-1} - x_{t-1}^+)^T (\Sigma_{t-1}^+)^{-1} (x_{t-1} - x_{t-1}^+)\right\} dx_{t-1}$$
• In short:
$$\overline{bel}(x_t) = \eta \int_{x_{t-1}} e^{-L_t}\, dx_{t-1}$$
Mathematical Derivation of KF
• Decompose $L_t = L^1_t(x_{t-1}, x_t) + L^2_t(x_t)$. Since $L^2_t$ does not depend
on $x_{t-1}$, it can be moved out of the integral:
$$\overline{bel}(x_t) = \eta \int_{x_{t-1}} e^{-\left\{L^1_t(x_{t-1}, x_t) + L^2_t(x_t)\right\}}\, dx_{t-1} = \eta\, e^{-L^2_t(x_t)} \int_{x_{t-1}} e^{-L^1_t(x_{t-1}, x_t)}\, dx_{t-1}$$
Mathematical Derivation of KF
• So, how is $L^1_t(x_{t-1}, x_t)$ determined?
• $L^1_t(x_{t-1}, x_t)$ is chosen to be quadratic in $x_{t-1}$.
• To determine the coefficients of this quadratic, calculate the 1st
and 2nd derivatives of $L_t$:
$$\frac{\partial L_t}{\partial x_{t-1}} = -A^T R_t^{-1}(x_t - A x_{t-1} - B u_t) + (\Sigma_{t-1}^+)^{-1}(x_{t-1} - x_{t-1}^+)$$
$$\frac{\partial^2 L_t}{\partial x_{t-1}^2} = A^T R_t^{-1} A + (\Sigma_{t-1}^+)^{-1} \equiv \Psi_t^{-1}$$
Mathematical Derivation of KF
• $\Psi_t$ defines the curvature of $L^1_t$; setting the 1st derivative to 0
gives the “mean”:
$$A^T R_t^{-1}(x_t - A x_{t-1} - B u_t) = (\Sigma_{t-1}^+)^{-1}(x_{t-1} - x_{t-1}^+)$$
Manipulating the equations:
$$A^T R_t^{-1}(x_t - B u_t) - A^T R_t^{-1} A\, x_{t-1} = (\Sigma_{t-1}^+)^{-1} x_{t-1} - (\Sigma_{t-1}^+)^{-1} x_{t-1}^+$$
$$A^T R_t^{-1} A\, x_{t-1} + (\Sigma_{t-1}^+)^{-1} x_{t-1} = A^T R_t^{-1}(x_t - B u_t) + (\Sigma_{t-1}^+)^{-1} x_{t-1}^+$$
$$(A^T R_t^{-1} A + (\Sigma_{t-1}^+)^{-1})\, x_{t-1} = A^T R_t^{-1}(x_t - B u_t) + (\Sigma_{t-1}^+)^{-1} x_{t-1}^+$$
$$\Psi_t^{-1} x_{t-1} = A^T R_t^{-1}(x_t - B u_t) + (\Sigma_{t-1}^+)^{-1} x_{t-1}^+$$
Mathematical Derivation of KF
• Let's define the quadratic form for $L^1_t$:
$$L^1_t(x_{t-1}, x_t) = \frac{1}{2}\left(x_{t-1} - \Psi_t[A^T R_t^{-1}(x_t - B u_t) + (\Sigma_{t-1}^+)^{-1} x_{t-1}^+]\right)^T \Psi_t^{-1} \left(x_{t-1} - \Psi_t[A^T R_t^{-1}(x_t - B u_t) + (\Sigma_{t-1}^+)^{-1} x_{t-1}^+]\right)$$
$$L^2_t(x_t) = L_t - L^1_t(x_{t-1}, x_t)$$
Mathematical Derivation of KF
Then:
$$L^2_t(x_t) = \frac{1}{2}(x_t - A x_{t-1} - B u_t)^T R_t^{-1}(x_t - A x_{t-1} - B u_t) + \frac{1}{2}(x_{t-1} - x_{t-1}^+)^T (\Sigma_{t-1}^+)^{-1}(x_{t-1} - x_{t-1}^+)$$
$$- \frac{1}{2}\left(x_{t-1} - \Psi_t[A^T R_t^{-1}(x_t - B u_t) + (\Sigma_{t-1}^+)^{-1} x_{t-1}^+]\right)^T \Psi_t^{-1}\left(x_{t-1} - \Psi_t[A^T R_t^{-1}(x_t - B u_t) + (\Sigma_{t-1}^+)^{-1} x_{t-1}^+]\right)$$
Mathematical Derivation of KF
$$L^2_t(x_t) = \frac{1}{2} x_{t-1}^T A^T R_t^{-1} A\, x_{t-1} - x_{t-1}^T A^T R_t^{-1}(x_t - B u_t) + \frac{1}{2}(x_t - B u_t)^T R_t^{-1}(x_t - B u_t)$$
$$+ \frac{1}{2} x_{t-1}^T (\Sigma_{t-1}^+)^{-1} x_{t-1} - x_{t-1}^T (\Sigma_{t-1}^+)^{-1} x_{t-1}^+ + \frac{1}{2}(x_{t-1}^+)^T (\Sigma_{t-1}^+)^{-1} x_{t-1}^+$$
$$- \frac{1}{2} x_{t-1}^T (A^T R_t^{-1} A + (\Sigma_{t-1}^+)^{-1})\, x_{t-1} + x_{t-1}^T [A^T R_t^{-1}(x_t - B u_t) + (\Sigma_{t-1}^+)^{-1} x_{t-1}^+]$$
$$- \frac{1}{2}[A^T R_t^{-1}(x_t - B u_t) + (\Sigma_{t-1}^+)^{-1} x_{t-1}^+]^T (A^T R_t^{-1} A + (\Sigma_{t-1}^+)^{-1})^{-1} [A^T R_t^{-1}(x_t - B u_t) + (\Sigma_{t-1}^+)^{-1} x_{t-1}^+]$$
Mathematical Derivation of KF
• The terms underlined on the slide are those multiplied by $x_{t-1}$
• All terms containing $x_{t-1}$ cancel out (due to the construction
of $L^1_t$)
Then:
$$L^2_t(x_t) = \frac{1}{2}(x_t - B u_t)^T R_t^{-1}(x_t - B u_t) + \frac{1}{2}(x_{t-1}^+)^T (\Sigma_{t-1}^+)^{-1} x_{t-1}^+$$
$$- \frac{1}{2}[A^T R_t^{-1}(x_t - B u_t) + (\Sigma_{t-1}^+)^{-1} x_{t-1}^+]^T (A^T R_t^{-1} A + (\Sigma_{t-1}^+)^{-1})^{-1} [A^T R_t^{-1}(x_t - B u_t) + (\Sigma_{t-1}^+)^{-1} x_{t-1}^+]$$
⇒ $L^2_t$ is quadratic in $x_t$!
Mathematical Derivation of KF
• Hence $\overline{bel}(x_t)$ is normally distributed
• The mean and covariance of this distribution are the
minimum and curvature of $L^2_t$
• These are obtained from the 1st and 2nd derivatives of $L^2_t$ w.r.t.
$x_t$:
$$\frac{\partial L^2_t(x_t)}{\partial x_t} = R_t^{-1}(x_t - B u_t) - R_t^{-1} A (A^T R_t^{-1} A + (\Sigma_{t-1}^+)^{-1})^{-1} [A^T R_t^{-1}(x_t - B u_t) + (\Sigma_{t-1}^+)^{-1} x_{t-1}^+]$$
$$= [R_t^{-1} - R_t^{-1} A (A^T R_t^{-1} A + (\Sigma_{t-1}^+)^{-1})^{-1} A^T R_t^{-1}](x_t - B u_t) - R_t^{-1} A (A^T R_t^{-1} A + (\Sigma_{t-1}^+)^{-1})^{-1} (\Sigma_{t-1}^+)^{-1} x_{t-1}^+$$
Mathematical Derivation of KF
• Using the matrix inversion lemma (Sherman-Morrison-Woodbury) on the first bracket:
$$\frac{\partial L^2_t(x_t)}{\partial x_t} = (R_t + A \Sigma_{t-1}^+ A^T)^{-1}(x_t - B u_t) - R_t^{-1} A (A^T R_t^{-1} A + (\Sigma_{t-1}^+)^{-1})^{-1} (\Sigma_{t-1}^+)^{-1} x_{t-1}^+$$
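For reference, the identity being invoked here is

$$(R + A \Sigma A^T)^{-1} = R^{-1} - R^{-1} A\, (\Sigma^{-1} + A^T R^{-1} A)^{-1} A^T R^{-1},$$

which is exactly the bracketed factor in front of $(x_t - B u_t)$.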
Mathematical Derivation of KF
• Setting the derivative to zero and solving for $x_t$ (the mean of the prediction):
$$x_t = B u_t + (R_t + A \Sigma_{t-1}^+ A^T)\, R_t^{-1} A\, (A^T R_t^{-1} A + (\Sigma_{t-1}^+)^{-1})^{-1} (\Sigma_{t-1}^+)^{-1} x_{t-1}^+$$
$$= B u_t + (A + A \Sigma_{t-1}^+ A^T R_t^{-1} A)\, (A^T R_t^{-1} A + (\Sigma_{t-1}^+)^{-1})^{-1} (\Sigma_{t-1}^+)^{-1} x_{t-1}^+$$
$$= B u_t + A \Sigma_{t-1}^+ \underbrace{((\Sigma_{t-1}^+)^{-1} + A^T R_t^{-1} A)\, (A^T R_t^{-1} A + (\Sigma_{t-1}^+)^{-1})^{-1}}_{= I}\, (\Sigma_{t-1}^+)^{-1} x_{t-1}^+$$
$$= B u_t + A\, x_{t-1}^+$$
Mathematical Derivation of KF
• The curvature is given by the 2nd derivative of $L^2_t$:
$$\frac{\partial^2 L^2_t(x_t)}{\partial x_t^2} = (A \Sigma_{t-1}^+ A^T + R_t)^{-1}$$
$$\Rightarrow\ \Sigma_t^- = A \Sigma_{t-1}^+ A^T + R_t$$
Mathematical Derivation of KF
Part 2: Measurement Update :
• Here, lines 4, 5 and 6 of the KF Algorithm are derived
• Observations are a linear function of the state plus additive
Gaussian noise:
$$z_t = C_t x_t + \delta_t$$
$$p(z_t \mid x_t) = N(z_t;\ C_t x_t,\ Q_t)$$
Mathematical Derivation of KF
Part 2: Measurement Update :
• General Bayes Filter mechanism for incorporating
measurements:
$$bel(x_t) = \eta\, p(z_t \mid x_t)\, \overline{bel}(x_t), \qquad p(z_t \mid x_t) \sim N(z_t;\ C_t x_t,\ Q_t), \qquad \overline{bel}(x_t) \sim N(x_t;\ \bar{\mu}_t,\ \bar{\Sigma}_t)$$
• $p(z_t \mid x_t)$ is a normal distribution, and $\overline{bel}(x_t)$
was obtained in the prediction step above
• The equation above can be written as:
Mathematical Derivation of KF
$$bel(x_t) = \eta \exp\{-J_t\}$$
$$J_t = \frac{1}{2}(z_t - C_t x_t)^T Q_t^{-1}(z_t - C_t x_t) + \frac{1}{2}(x_t - \bar{\mu}_t)^T \bar{\Sigma}_t^{-1}(x_t - \bar{\mu}_t)$$
• Similarly, it can be shown that:
$$bel(x_t):\quad \mu_t = \bar{\mu}_t + K_t(z_t - C_t \bar{\mu}_t), \qquad \Sigma_t = (I - K_t C_t)\, \bar{\Sigma}_t, \qquad \text{with } K_t = \bar{\Sigma}_t C_t^T (C_t \bar{\Sigma}_t C_t^T + Q_t)^{-1}$$
KF - Terminology
• Gain: specifies to what extent the observation
is taken into account in the estimation
$$K_t = \Sigma_t^- C^T (C\, \Sigma_t^- C^T + Q_t)^{-1}$$
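A quick sanity check of this interpretation in 1D (with C = 1, using the notation of the 1D slides): $K_t = \bar{\sigma}_t^2 / (\bar{\sigma}_t^2 + \sigma_{obs,t}^2)$, so $Q_t \to 0$ gives $K_t \to 1$ (the observation is fully trusted), while $Q_t \to \infty$ gives $K_t \to 0$ (the observation is ignored).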
KF - Results
• Note that in the 1D case, just after the
correction, the variance
– is “smaller” than that of the prediction,
– and is even “smaller” than that of the observation.
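This follows directly from the 1D update equations: $\sigma_t^2 = (1 - K_t)\,\bar{\sigma}_t^2 = \dfrac{\bar{\sigma}_t^2\, \sigma_{obs,t}^2}{\bar{\sigma}_t^2 + \sigma_{obs,t}^2}$, equivalently $\dfrac{1}{\sigma_t^2} = \dfrac{1}{\bar{\sigma}_t^2} + \dfrac{1}{\sigma_{obs,t}^2}$, so $\sigma_t^2 < \min(\bar{\sigma}_t^2,\ \sigma_{obs,t}^2)$.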
KF - Results
• Every prediction step increases the uncertainty, since
$\Sigma_t^- = A \Sigma_{t-1}^+ A^T + R_t$ adds the process noise $R_t$.
KF - Complexity
Prediction :
• mean: O(n²),
• covariance matrix: O(n³).
But in general A is sparse with diagonal blocks,
and so is Rt; globally, the prediction step stays
O(n²).
KF - Complexity
Correction :
• Gain
– 1st multiplication: O(n²m)
– 2nd multiplication: O(n²m + m²n)
– inversion: O(m³)
• mean: O(mn),
• covariance matrix: O(mn² + n³).
Total: O(mn² + nm² + n³ + m³) (more realistically
O(m³ + n²)) (dim(x) = n, dim(z) = m)
Summary
• The main idea behind Kalman filtering was
presented.
• The algorithm was defined as the result of prediction
and correction steps.
• Brief mathematical derivations were given.
• KFs are optimal for linear Gaussian systems!
Next class…
Homework 1