Filtering and Identification
Lecture 9:
The system identification cycle
Michel Verhaegen and Jan Willem van Wingerden
Delft Center for Systems and Control
Delft University of Technology
System identification
Is there more than fitting a model to the data?
General Rule: Identify the system under the
conditions it will be used for!
The system identification cycle
[Flowchart: Start → Experiment design → Data pre-processing → data (u(k), y(k)) → Fit model to data → Model validation → Model ok? If yes: End. If no: loop back. A separate block, Model structure selection, is part of the loop.]
Overview
• Experiment design
• Data pre-processing
• Model structure selection
• Model validation
• Demo acoustical duct
Experiment design
Choice of · · ·
• Sampling frequency
• Experiment duration
• Type of input sequence
• “Persistency of excitation of the input”
Choice of sampling frequency
Aliasing [Shannon]: all frequency components with a frequency higher than $\omega_S/2$ are mirrored across the line $\omega = \omega_S/2$ into the band $[-\omega_S/2, \omega_S/2]$.
⇒ If the frequency band of interest equals $[-\omega_B, \omega_B]$, then pre-filter the signals by a band-pass filter with pass band $[-\omega_B, \omega_B]$.
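The mirroring can be illustrated with a small sketch. The function name `alias_frequency` and the 1000 Hz sampling rate are illustrative assumptions, not part of the lecture.

```python
def alias_frequency(f, fs):
    """Apparent frequency (same unit as f) after sampling at rate fs.

    Components above fs/2 fold back into the band [0, fs/2].
    """
    f = abs(f) % fs            # sampling cannot distinguish f from f + m*fs
    return fs - f if f > fs / 2 else f


fs = 1000.0                    # hypothetical sampling rate in Hz
for f in (100.0, 400.0, 600.0, 900.0):
    print(f, "Hz appears at", alias_frequency(f, fs), "Hz")
```

Components at 600 Hz and 900 Hz lie above fs/2 and show up inside the band, which is why they should be filtered out before they can fold onto the band of interest.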
Choice of sampling frequency
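Bandwidth of an LTI system: $\omega_B$ is the frequency at which $|G(\omega_B)| \approx |G(0)|/\sqrt{2}$.
⇒ Qualitative information can be obtained from simple transient experiments.
Rule of thumb: based on a rough estimate $\hat{\omega}_B$ of the bandwidth, one 'over'-selects $\omega_S \approx 10\,\hat{\omega}_B$.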
Example: qualitative selection
Consider $G(s) = \dfrac{1}{1 + \tau s}$, then $\omega_B = \dfrac{1}{\tau}$ ($\tau = 10$).
When $G(s)$ is unknown: get information on $\omega_B$ via the step response.
[Figure: step response rising from 0 to 1 over the time axis 0 to 60, with the sampling instants marked by circles]
Rule: If we select about 8 or 9 samples during the period [0, ≈ steady state], we sample at about $10\,\omega_B$. The time interval $T$ between the circles (◦) then equals $T \approx \dfrac{2\pi}{10\,\omega_B} = 2\pi$ seconds.
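The numbers in this example can be reproduced with a few lines. This is a sketch only; taking "steady state" to mean about five time constants is an assumption made here, not a statement from the slides.

```python
import math

tau = 10.0                      # time constant from the example
omega_B = 1.0 / tau             # bandwidth in rad/s


def gain(omega):
    """|G(j*omega)| for G(s) = 1 / (1 + tau*s)."""
    return 1.0 / math.sqrt(1.0 + (tau * omega) ** 2)


omega_S = 10.0 * omega_B        # rule of thumb: sample at about 10 * omega_B
T = 2.0 * math.pi / omega_S     # sampling interval in seconds

settling_time = 5.0 * tau       # assumption: steady state after about 5 * tau
samples_to_steady_state = settling_time / T

print("gain ratio at omega_B:", gain(omega_B) / gain(0.0))
print("sampling interval T:", T)
print("samples until steady state:", samples_to_steady_state)
```

With these assumptions the sampling interval is 2π seconds and roughly eight samples fall within the transient, in line with the rule.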
Transient analysis
The use of the response to simple inputs (steps,
impulses, . . . ) to retrieve info on:
• which input affects which output?
• linearity of the system
• dominant time constants
Procedures can be devised to determine dominant time constants (and bandwidth) from the step response when the system is underdamped.
[Figure: underdamped step response, time axis 0 to 15]
Experiment duration
In PEM, the covariance matrix of the estimated parameter vector equals
$$\frac{\sigma_e^2}{N} \left[ \frac{1}{N} \sum_{k=0}^{N-1} \phi(k)\phi(k)^T \right]^{-1} \quad\Rightarrow\quad \text{select large } N.$$
However, slow system dynamics constrain N .
For example: with a distillation column, the
collection of ≈ 1000 data points may require
testing for over a week.
This can cost a lot of money!
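The 1/N behaviour of the covariance expression can be checked numerically. The sketch below assumes a regressor built from three lags of a white-noise input and a noise variance of 0.1; both are illustrative choices, not values from the lecture.

```python
import numpy as np


def regressor_matrix(u, order):
    """Rows are phi(k)^T = [u(k), u(k-1), ..., u(k-order+1)]."""
    N = len(u) - order + 1
    cols = [u[order - 1 - i: order - 1 - i + N] for i in range(order)]
    return np.column_stack(cols)


def pem_covariance(phi, sigma_e2):
    """(sigma_e^2 / N) * [ (1/N) * sum_k phi(k) phi(k)^T ]^{-1}."""
    N = phi.shape[0]
    R = phi.T @ phi / N
    return sigma_e2 / N * np.linalg.inv(R)


rng = np.random.default_rng(0)
sigma_e2 = 0.1                      # hypothetical noise variance
traces = {}
for N in (100, 10_000):
    u = rng.standard_normal(N + 2)  # white-noise input (assumption)
    traces[N] = np.trace(pem_covariance(regressor_matrix(u, 3), sigma_e2))

print(traces)
```

Going from 100 to 10 000 samples reduces the trace of the covariance by roughly a factor of 100, which is the motivation for selecting N large when the experiment budget allows it.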
Experiment duration
A rule of thumb: experiment duration ≈ 10 times
the largest time constant.
Example: Consider $G(s) = \dfrac{1}{(1+\tau_1 s)(1+\tau_2 s)}$ with $\tau_1 = 1$ s and $\tau_2 = 0.01$ s.
Sampling frequency: $\omega_s = 10 \cdot \dfrac{1}{0.01}$ rad/s $\approx 160$ Hz
Experiment duration: $\approx 10 \cdot 1$ s.
What efficiency do we gain if we could handle
multi-rate (double-rate) sampled data
sequences?
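As a rough worked version of the example numbers for single-rate sampling (the symbols $f_s$, $T_{\mathrm{exp}}$ and $N$ are introduced here for illustration only):

$$\omega_s = 10 \cdot \frac{1}{\tau_2} = 1000\ \text{rad/s}, \qquad f_s = \frac{\omega_s}{2\pi} \approx 159\ \text{Hz}$$

$$T_{\mathrm{exp}} \approx 10\,\tau_1 = 10\ \text{s}, \qquad N \approx f_s\, T_{\mathrm{exp}} \approx 1.6 \times 10^{3}\ \text{samples}$$

The fast time constant fixes the sampling rate and the slow one fixes the duration, so the sample count grows with the ratio $\tau_1/\tau_2$.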
Type of input sequences
[Figure: eight typical input sequences, each plotted over the samples 1 to N: step, short pulse, doublet, staircase (n steps), harmonic, Gaussian white noise, frequency sweep, PRBS; the plots carry the annotations a, b, c and T]
PRBS: Pseudo Random Binary Sequence
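One common way to generate a PRBS is a linear feedback shift register. The sketch below uses the 7-bit register with feedback polynomial x^7 + x^6 + 1 (period 127); the function name `prbs7` and the choice of this particular register are assumptions for illustration, the lecture does not prescribe a generator.

```python
def prbs7(length, amplitude=1.0, seed=0b1111111):
    """Pseudo random binary sequence with levels +amplitude / -amplitude.

    Uses a 7-bit Fibonacci LFSR with polynomial x^7 + x^6 + 1, so the
    sequence repeats after 2**7 - 1 = 127 samples.
    """
    state = seed & 0x7F
    if state == 0:
        raise ValueError("seed must be non-zero, otherwise the register stays at zero")
    out = []
    for _ in range(length):
        new_bit = ((state >> 6) ^ (state >> 5)) & 1   # XOR of bits 7 and 6
        state = ((state << 1) | new_bit) & 0x7F       # shift and insert new bit
        out.append(amplitude if new_bit else -amplitude)
    return out


u = prbs7(254)
print(u[:16])
```

The sequence is deterministic and periodic, but within one period it switches level often enough to excite a wide frequency range.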
Persistency of excitation
Example: Consider the input $u(k)$:
$$u(k) = \begin{cases} \sin \omega k, & k = 1, 2, \dots, N \\ 0, & k \le 0 \end{cases}$$
Can we estimate the parameters of the following
model uniquely?
y(k) = θ1 u(k) + θ2 u(k − 1) + θ3 u(k − 2)
Persistency of excitation
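The sequence $\{u(k)\}_{k=1}^{N}$ is persistently exciting of order $n$ if and only if there exists an integer $N$ such that the matrix
$$U_{1,n,N} = \begin{bmatrix} u(0) & u(1) & \cdots & u(N-1) \\ u(1) & u(2) & \cdots & u(N) \\ \vdots & \vdots & & \vdots \\ u(n-1) & u(n) & \cdots & u(N+n-2) \end{bmatrix}$$
has full rank $n$ (the matrix has $n$ rows, so this is full row rank).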
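The rank condition can be tested numerically for the sinusoidal input of the earlier example. This is a sketch: the helper names, the frequency 0.3 rad/sample and the sequence length are illustrative assumptions.

```python
import numpy as np


def hankel_input_matrix(u, n):
    """n x N matrix whose i-th row is u(i), u(i+1), ..., u(i+N-1)."""
    N = len(u) - n + 1
    return np.array([u[i:i + N] for i in range(n)])


def is_persistently_exciting(u, n):
    """True if the Hankel matrix built from u has rank n."""
    return bool(np.linalg.matrix_rank(hankel_input_matrix(u, n)) == n)


k = np.arange(200)
u_sin = np.sin(0.3 * k)                       # single sinusoid, u(0) = 0
u_noise = np.random.default_rng(1).standard_normal(200)

print(is_persistently_exciting(u_sin, 2))     # order 2
print(is_persistently_exciting(u_sin, 3))     # order 3
print(is_persistently_exciting(u_noise, 3))
```

A single sinusoid satisfies u(k) + u(k+2) = 2 cos(ω) u(k+1), so the third row of the matrix depends linearly on the first two. The three parameters of the FIR model in the example therefore cannot be determined uniquely from this input.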
Persistency of excitation with an ARX model
Lemma 2: Let the input-output data be generated
by the system:
$$y(k) = \frac{B_n(q)}{A_n(q)}\, u(k) + v(k)$$
with $v(k)$ p.e. of any order, then we can uniquely estimate the coefficients of an $n$-th order ARX model provided the input $u(k)$ is persistently exciting of order $n$.
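A minimal sketch of a least-squares ARX fit on simulated data follows. For simplicity it assumes a first-order system with equation-error white noise and a white-noise input; the coefficient values and all names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
N = 2000
a_true, b_true = 0.8, 1.0            # hypothetical true coefficients
u = rng.standard_normal(N)           # white noise: persistently exciting of any order
e = 0.1 * rng.standard_normal(N)     # white equation error (assumption)

# Simulate y(k) = a*y(k-1) + b*u(k-1) + e(k)
y = np.zeros(N)
for k in range(1, N):
    y[k] = a_true * y[k - 1] + b_true * u[k - 1] + e[k]

# Stack the regressors [y(k-1), u(k-1)] and solve the least-squares problem
Phi = np.column_stack([y[:-1], u[:-1]])
theta, *_ = np.linalg.lstsq(Phi, y[1:], rcond=None)
a_hat, b_hat = theta
print(a_hat, b_hat)
```

Because the input excites the system sufficiently, the regressor matrix has full column rank and the least-squares solution is unique.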
Data pre-processing
• Decimation
• Detrending
• Pre-filtering
• Concatenating data sequences
Decimation
Definition: Taking every $(jn+1)$-th sample (for $j \in \mathbb{N}$ and $n = 0, 1, \dots$) of the original input-output data sequences. ⇒ $\omega_S \to \omega_S/j$
Caution: Make use of a digital anti-aliasing filter with cut-off $\equiv \omega_S/(2j)$ (rad/s).
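A decimation step with a digital anti-aliasing filter could look as follows. This is a sketch under assumptions: the Hamming-windowed sinc filter, its length of 61 taps and the name `decimate_by` are choices made here, not the lecture's prescription.

```python
import numpy as np


def decimate_by(x, j, numtaps=61):
    """Low-pass filter x at the new Nyquist frequency, then keep every j-th sample."""
    n = np.arange(numtaps) - (numtaps - 1) / 2
    # Windowed-sinc FIR low-pass with cut-off pi/j rad/sample, i.e. omega_S/(2j)
    h = np.sinc(n / j) / j * np.hamming(numtaps)
    h /= h.sum()                              # unit gain at zero frequency
    x_filtered = np.convolve(x, h, mode="same")
    return x_filtered[::j]                    # samples 1, j+1, 2j+1, ...


k = np.arange(400)
x_low = np.cos(0.05 * np.pi * k)     # inside the new band: should pass
x_high = np.cos(0.9 * np.pi * k)     # above the new Nyquist frequency: should be removed
y_low = decimate_by(x_low, 4)
y_high = decimate_by(x_high, 4)
print(np.max(np.abs(y_low[20:80])), np.max(np.abs(y_high[20:80])))
```

Without the filter, the high-frequency component would fold into the band of interest after down-sampling.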
Decimation
When is decimation necessary?
Having selected too high an (original) sampling
frequency may show up as:
• The poles cluster around the point $z = 1$: $\lim_{T \to 0} e^{-\lambda_j T} = 1$
• High-frequency disturbance in the data
(above the frequency band of interest)
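The clustering of the poles is easy to see numerically. The sketch assumes two hypothetical decay rates λ = 1 and λ = 10, so that the discrete-time poles are exp(−λT).

```python
import math

decay_rates = (1.0, 10.0)    # hypothetical values of lambda_j


def discrete_poles(T):
    """Discrete-time poles exp(-lambda_j * T) for sampling interval T."""
    return [math.exp(-lam * T) for lam in decay_rates]


for T in (1.0, 0.1, 0.001):
    print("T =", T, "poles:", discrete_poles(T))
```

At T = 0.001 both poles lie within about 0.01 of z = 1 although the decay rates differ by a factor of ten, which makes them hard to distinguish in an identified model.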
Why Detrending?
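Example: Consider the nonlinear system
$$x(k+1) = f\big(x(k)\big), \qquad y(k) = x(k)$$
[Figure, left panel: linearization of $f(x) = x^2$ in the point $(1, 1)$; the curve $x(k+1) = x^2(k)$ and its tangent $2x(k) - 1$ plotted against $x(k)$. Right panel: the same curve with the local coordinates $\tilde{x}(k)$, $\tilde{x}(k+1)$ drawn in]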
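The tangent line in this example follows from a first-order Taylor expansion around $x = 1$. Taking the local coordinate to be the deviation from the operating point, $\tilde{x}(k) = x(k) - 1$, is an assumption made here to spell out the step:

$$f(x) \approx f(1) + f'(1)\,(x - 1) = 1 + 2\,(x - 1) = 2x - 1$$

$$\tilde{x}(k+1) = x(k+1) - 1 \approx 2x(k) - 2 = 2\,\tilde{x}(k)$$

In the local coordinates the model is linear without a constant term, which is what subtracting the offsets from the data achieves.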
Detrending
A linear model approximation in a limited region
of the operation envelope:
$$\tilde{x}(k+1) = A\tilde{x}(k) + B\big(u(k) - \bar{u}\big), \qquad \tilde{x}(k_0) = x_0$$
$$y(k) = C\tilde{x}(k) + \bar{y}$$
Two ways to deal with unknown offsets $(\bar{u}, \bar{y})$:
• Subtracting sample means from input and
output sequences {u(k), y(k)}.
• Estimating the offsets together with other
system parameters.
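The first option, subtracting the sample means, takes only a few lines. The data in this sketch is synthetic and the offsets 2.0 and 5.0 are hypothetical.

```python
import numpy as np


def remove_offsets(u, y):
    """Subtract the sample means from the input and output sequences."""
    u = np.asarray(u, dtype=float)
    y = np.asarray(y, dtype=float)
    return u - u.mean(), y - y.mean()


rng = np.random.default_rng(5)
u_raw = 2.0 + rng.standard_normal(500)            # input around a hypothetical offset of 2.0
y_raw = 5.0 + 0.5 * (u_raw - 2.0)                 # output around a hypothetical offset of 5.0
u_det, y_det = remove_offsets(u_raw, y_raw)
print(u_det.mean(), y_det.mean())
```

The sample means are only estimates of the true offsets, so for short records the second option, estimating the offsets jointly with the model, may be preferable.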
RPs in the frequency domain
The cost function optimized by prediction-error methods:
$$\lim_{N\to\infty} J_N(\theta) = E[\epsilon(k)^2]$$
The stochastic variant of Parseval is:
$$E[\epsilon(k)^2] = R_\epsilon(0) = \frac{T}{2\pi} \int_{-\pi/T}^{\pi/T} \Phi_\epsilon(\omega)\, d\omega$$
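A finite-sample analogue of this relation can be verified with the FFT. In this sketch the periodogram |E(ω)|²/N stands in for the spectrum Φ and the integral is replaced by an average over the N frequency bins; this substitution is an assumption made for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
N = 1024
eps = rng.standard_normal(N)            # stand-in for the prediction error

E = np.fft.fft(eps)
periodogram = np.abs(E) ** 2 / N        # spectrum estimate on the FFT grid

power_time = np.mean(eps ** 2)          # sample version of E[eps(k)^2]
power_freq = np.mean(periodogram)       # average of the spectrum over the band
print(power_time, power_freq)
```

Both numbers agree up to rounding, so minimizing the mean squared prediction error is the same as minimizing the area under its spectrum.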
Data pre-filtering
Given $\{u_k, y_k\}$, a linear filter $L(q)$, and $\widehat{\mathrm{SGM}}: y = \hat{G}u + \hat{H}e$, what system is identified using $\{L(q)u_k, L(q)y_k\}$?
$$y_k = G(q)u_k + v_k \;\Rightarrow\; \big(L(q)y_k\big) = G(q)\big(L(q)u_k\big) + L(q)v_k$$
However, $J_N(\theta)$ is modified:
$$\lim_{N\to\infty} J_N(\theta) = \frac{1}{2\pi} \int_{-\pi}^{\pi} \left( |\hat{G} - G|^2\, \frac{|L|^2 \Phi_u}{|\hat{H}|^2} + \frac{|L|^2 \Phi_v}{|\hat{H}|^2} \right) d\omega$$
The filter $L(q)$ can be used a posteriori to modify the weighting term of $|\hat{G} - G|^2$.
For example: to cancel the effect of $\hat{H}^{-1}$.
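That filtering input and output with the same L(q) leaves the input-output relation G(q) unchanged can be checked on a noise-free example. The FIR coefficients chosen for G and L below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(4)
u = rng.standard_normal(200)
g = np.array([0.0, 0.5, 0.3, 0.1])     # hypothetical FIR system G(q)
l = np.array([0.25, 0.5, 0.25])        # hypothetical FIR pre-filter L(q)

y = np.convolve(g, u)                  # noise-free output y = G(q) u
y_f = np.convolve(l, y)                # L(q) y
u_f = np.convolve(l, u)                # L(q) u
y_from_u_f = np.convolve(g, u_f)       # G(q) (L(q) u)

print(np.max(np.abs(y_f - y_from_u_f)))
```

The filtered data is still described by the same G(q). What changes is how model errors are weighted over frequency in the cost function.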
Model structure selection
• Delay or system dead-time estimation →
shifting the data sequences and then
identifying delay-free models.
• Model structure selection in subspace
identification
• Illustration with acoustical duct model
Model selection in subspace ident.
Recipe: select the number of (block) rows of
the Hankel matrices, that is the index s, such that
s > n
Therefore only a rough estimate of the system
order needs to be known.
Subspace methods provide useful information
on the only model structure parameter, namely
the order of the state-space model.
Model selection in subspace ident.
[Figure: the computed singular values on a logarithmic axis (10^-2 to 10^2), plotted against their index (0 to 20)]
LET THE DATA SPEAK FOR ITSELF
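How singular values reveal the order can be shown with a simplified, noise-free illustration. The sketch builds a Hankel matrix from the impulse response of a hypothetical second-order system, which is not the data-equation construction used in subspace identification but shows the same rank property.

```python
import numpy as np

# Hypothetical second-order state-space model (n = 2)
A = np.array([[0.9, 0.2], [-0.2, 0.7]])
B = np.array([[1.0], [0.5]])
C = np.array([[1.0, 0.0]])

s = 5                                   # number of block rows, chosen with s > n
markov = []                             # impulse response samples C A^k B
Ak = np.eye(2)
for _ in range(2 * s - 1):
    markov.append((C @ Ak @ B).item())
    Ak = Ak @ A

H = np.array([[markov[i + j] for j in range(s)] for i in range(s)])
sv = np.linalg.svd(H, compute_uv=False)
order = int(np.sum(sv > 1e-8 * sv[0]))  # count singular values above a relative threshold
print(sv, order)
```

Only two singular values are significantly different from zero. With noisy data the drop is less sharp and one looks for a gap, as in the figure.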
Model selection in subspace ident.
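[Figure: VAF for different model orders; horizontal axis: model order 2 to 12, vertical axis: VAF from 60 to 100]
$$\mathrm{VAF}\big(y(k), \hat{y}(k,\theta)\big) = \left( 1 - \frac{\frac{1}{N}\sum_{k=1}^{N} \|y(k) - \hat{y}(k,\theta)\|_2^2}{\frac{1}{N}\sum_{k=1}^{N} \|y(k)\|_2^2} \right) \cdot 100\%$$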
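The VAF formula translates directly into code. The function name `vaf` and the small test vectors are illustrative.

```python
import numpy as np


def vaf(y, y_hat):
    """Variance accounted for, in percent, following the formula on the slide."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    return (1.0 - np.mean((y - y_hat) ** 2) / np.mean(y ** 2)) * 100.0


y_meas = np.array([1.0, 2.0, 3.0])
print(vaf(y_meas, y_meas))                       # perfect prediction
print(vaf(y_meas, np.zeros(3)))                  # predicting zero everywhere
print(vaf(y_meas, np.array([1.0, 2.0, 2.0])))    # one sample off by 1
```

A VAF of 100 % means the model output reproduces the measured output exactly. Values can become negative when the prediction error has more energy than the signal itself.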
An illustrative example
Consider a schematic view of our acoustical duct:
[Schematic of the duct with the labels: noise speaker, second speaker, microphone, input u(k), output y(k)]
We determine the parameters of a sixth-order state-space model [A, B, C, D] given $\{u(k), y(k)\}_{k=1}^{6000}$ with subspace identification using s = 40.
The results are compared with PEM.
An illustrative example
Distribution of the VAF values
for 100 identification experiments
on the acoustical duct
[Figure: three histograms of the VAF values (horizontal axis 0 to 100, vertical axis 0 to 100), labelled PEM (53.1 %), SI (89.5 %), and SI and PEM (96.9 %)]
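Model Validation
[Block diagram: the system $G(q)$, driven by $u(k)$ and disturbed by $v(k)$, produces $y(k)$; the model $\hat{G}(q,\theta)$, $\hat{H}(q,\theta)$ produces $\hat{y}(k)$; $e(k)$ is the difference between $y(k)$ and $\hat{y}(k)$]
Objective: determine $\hat{G}(q,\theta)$ ($\hat{H}(q,\theta)$) such that $e(k)$ is as small as possible.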
What is small?
When does small not imply fitting the noise?
How to check this?
Split the available data into two sets:
• one data set to estimate the models
• another data set to validate the models
Validation tests:
• VAF or value of cost function
What is small/What is a good model?
When it satisfies the conditions for optimality?
• e(k) is a white noise signal
• e(k), u(k) are statistically independent
How to check this?
• Auto-correlation residuals
• Cross-correlation residuals and inputs
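Both checks can be sketched with sample correlations. The band ±1.96/√N used below is a common approximate 95 % bound for a white sequence; it is an assumption here and not taken from the slides, and the signals are synthetic.

```python
import numpy as np


def normalized_xcorr(a, b, max_lag):
    """Sample correlation between a(k) and b(k - lag) for lag = 0, ..., max_lag."""
    a = np.asarray(a, dtype=float) - np.mean(a)
    b = np.asarray(b, dtype=float) - np.mean(b)
    N = len(a)
    scale = N * a.std() * b.std()
    return np.array([np.sum(a[lag:] * b[:N - lag]) / scale for lag in range(max_lag + 1)])


rng = np.random.default_rng(6)
N = 2000
u = rng.standard_normal(N)                       # input
e_white = rng.standard_normal(N)                 # residual of a good model
w = rng.standard_normal(N + 1)
e_colored = w[1:] + 0.8 * w[:-1]                 # residual with leftover dynamics

bound = 1.96 / np.sqrt(N)
r_white = normalized_xcorr(e_white, e_white, 20)       # auto-correlation test
r_colored = normalized_xcorr(e_colored, e_colored, 20)
r_eu = normalized_xcorr(e_white, u, 20)                # cross-correlation test

print("white residual, lags outside band:", int(np.sum(np.abs(r_white[1:]) > bound)))
print("colored residual, lag-1 correlation:", r_colored[1])
print("residual vs input, lags outside band:", int(np.sum(np.abs(r_eu) > bound)))
```

For the white residual almost all non-zero lags stay inside the band, whereas the colored residual shows a clear lag-1 correlation, signalling that the model has not captured all dynamics.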
Check Conditions for optimality?
[Figure: residual tests for three model structures (rows: ARX, OE, ARMAX). Left column: white noise test by auto-correlation of the residual (vertical axis −1 to 1). Right column: cross-correlation test between residual and input (vertical axis −0.1 to 0.1). Horizontal axis: time lag, −20 to 20]
Demo acoustical duct
• Sampling rate Fs = 2000 Hz
• Excitation: zero-mean white noise
• Number of samples N = 4000
• Measure data (u, y)
Summary of Lecture 12
The system identification cycle
[Flowchart: Start → Experiment design → Data pre-processing → data (u(k), y(k)) → Fit model to data → Model validation → Model ok? If yes: End. If no: loop back. A separate block, Model structure selection, is part of the loop.]
And there is much more . . .
• Controller design by system identification
• Adaptive control via parameter estimation
methods
• Fast algorithms for system identification
• Interesting applications: adaptive optics,
wind energy, automotive, ...
Next ...
Exam: Tuesday Jan 24, 2017