IAS0031 Modeling and Identification
Lecture 7: Time Domain Identification. Validation.
Discrete-time Models. Preprocessing Data.
Aleksei Tepljakov, Ph.D.
Black Box Identification
• The goal of system identification in the case of black box modeling
is to infer a dynamic system model based upon experimentally
collected data.
• More specifically, the goal is to obtain a relationship between
system inputs and outputs under external stimuli (input signals,
disturbances) in order to determine and predict the system
behavior.
• Strictly speaking, no assumptions are made on the internal
physical structure of the studied system.
• The quality of a black box model may be measured through the
use of a specific quality criterion, validation, and statistical
methods.
Recall: Important Aspect of Modeling
The map is not the territory.
[Figure: a map ≠ the territory it represents]
What this means is that, in practice, an ideal model of a system can
never be achieved. It is sufficient to evaluate the obtained model
of a system in terms of its usefulness for a particular purpose.
Diagram: Identification Procedure

[Figure: flowchart of the iterative identification procedure]
Identification Procedure: Steps
1. Design the experiment. For dynamic systems it is usual to collect
transient response data in the time domain by applying a set of
predetermined input signals.
2. Record the dataset based on an experiment. The collected data
must be as informative as possible subject to potential constraints.
3. Choose a set of models and/or the model structure and the criterion
to fit.
4. Calculate the model using a suitable algorithm.
5. Validate the obtained model. It is desirable to use two different
datasets for identification and validation.
6. If the model is satisfactory, use it for the desired purpose. Otherwise,
revise modeling/identification strategy and repeat the above steps.
Black Box Model of a System
[Figure: block diagram of the system with input u, output y, measured disturbance d, and unmeasured disturbance w]
A system with measured outputs y, measured inputs u, measured
disturbances d and unmeasured disturbances w.
Collecting Experimental Data: Sampling
• Real systems are sampled with a finite sampling interval ts.
• According to the Nyquist–Shannon sampling theorem, the
information about frequencies fΨ, such that

fΨ > fs/2 = 1/(2ts),    (1)

where fs is the sampling frequency [Hz], is lost. This should be
taken into account when studying the behavior of the identified
model.
• High-frequency components in the experimental signal may
result in unwanted effects, such as aliasing. To prevent this, the
collected signal data must be filtered, usually by means of a
stationary low-pass filter (see the sketch below).
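A minimal Python sketch of such low-pass filtering (assuming NumPy and SciPy are available); the sampling interval, cutoff, and test signal are arbitrary example values:

```python
import numpy as np
from scipy import signal

ts = 0.01                # sampling interval [s]; example value
fs = 1.0 / ts            # sampling frequency [Hz]
f_nyq = fs / 2.0         # Nyquist frequency: content above it cannot be recovered

# 4th-order Butterworth low-pass filter with a cutoff well below Nyquist
f_cut = 10.0             # cutoff frequency [Hz]; example value
b, a = signal.butter(4, f_cut / f_nyq)

# Zero-phase filtering (filtfilt) avoids adding an artificial delay to the data
t = np.arange(0.0, 5.0, ts)
y_raw = np.sin(2 * np.pi * 2.0 * t) + 0.3 * np.random.randn(t.size)
y_filtered = signal.filtfilt(b, a, y_raw)
```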
Identification: System Under Study
Suppose that experimental data is collected from a general single input, single
output nonlinear system Ψ : I → O, where (I , O) ⊂ R2 denote the measured
input and output signals, respectively, such that
z(t) = Ψ(v(t)) + N, (2)
where z(t) denotes the system output, v(t) the system input, and N
measurement noise. The system is represented by a data set holding the
samples of the system input uk = v(k ts) and output yk = z(k ts) under a
uniform sampling interval ts = tk+1 − tk ≡ const:
ZN = {u0 , y0 , u1 , y1 , . . . , uN , yN , ts }, (3)
where k = 0, 1, . . . , N. Zero initial conditions are assumed; therefore, if
z(0) = y0 ≠ 0, the offset is removed from each of the collected output
samples by means of

yk = yk − y0,  k = 0, 1, . . . , N.    (4)
Nonlinear Least-Squares Estimation
Methods
The problem is to obtain a model of the system Ψ by means of
minimization of the sum of squares (residual norm)
min_θ F,  F = Σ_{i=1}^{N} εi² = ‖ε‖²,    (5)

where θ is a set of model parameters, εi = yi − ŷi is the residual
(simulation error), yi is the true system output, and ŷi is the predicted
output under the input signal ui for samples i = 1, 2, . . . , N.
Some solution methods (see the sketch following this list):
• Trust Region methods;
• Levenberg-Marquardt method;
• Nelder-Mead Simplex method.
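For orientation only: the three listed methods map onto standard SciPy optimizers. A sketch with a made-up curve-fitting problem (all data and names here are illustrative):

```python
import numpy as np
from scipy.optimize import least_squares, minimize

# Illustrative data: fit y = theta[0] * exp(-theta[1] * t)
t = np.linspace(0.0, 5.0, 100)
y = 2.0 * np.exp(-0.5 * t) + 0.01 * np.random.randn(t.size)

def residuals(theta):
    # eps_i = y_i - yhat_i(theta), the vector whose squared norm is F in (5)
    return y - theta[0] * np.exp(-theta[1] * t)

theta0 = np.array([1.0, 1.0])
sol_tr = least_squares(residuals, theta0, method='trf')  # Trust Region Reflective
sol_lm = least_squares(residuals, theta0, method='lm')   # Levenberg-Marquardt

# Nelder-Mead minimizes a scalar cost, so pass F = ||eps||^2 directly
sol_nm = minimize(lambda th: np.sum(residuals(th) ** 2), theta0,
                  method='Nelder-Mead')
```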
Linear Single Input Single Output
Approximations: Transfer Functions
For a SISO system given by a transfer function model with dead
time
G(s) = (bm s^m + bm−1 s^(m−1) + · · · + b1 s + b0)/(an s^n + an−1 s^(n−1) + · · · + a1 s + a0) · e^(−Ls)    (6)
the task is to obtain a parameter set

θ = {bm, bm−1, . . . , b0, an, an−1, . . . , a0, L}

assuming zero initial conditions. If b0 ≠ 0 and a0 ≠ 0, then
K = b0/a0 is the static gain of the system.
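As a concrete (hypothetical) instance of fitting (6): a sketch that identifies θ = {K, T, L} for the first-order special case G(s) = K e^(−Ls)/(T s + 1) from simulated step-response data, using its analytic step response. All numeric values are examples, not data from this lecture:

```python
import numpy as np
from scipy.optimize import least_squares

ts = 0.1
t = np.arange(0.0, 30.0, ts)

# Simulated "measured" step response of a system with K = 2, T = 4, L = 1.5
y_meas = 2.0 * (1.0 - np.exp(-np.maximum(t - 1.5, 0.0) / 4.0))
y_meas += 0.02 * np.random.randn(t.size)

def step_response(theta):
    K, T, L = theta
    # Analytic unit-step response of K * exp(-L*s) / (T*s + 1);
    # zero initial conditions assumed, response is zero until t = L
    return K * (1.0 - np.exp(-np.maximum(t - L, 0.0) / T))

def residuals(theta):
    return y_meas - step_response(theta)

result = least_squares(residuals, x0=[1.0, 1.0, 0.5],
                       bounds=([0.0, 1e-3, 0.0], [10.0, 50.0, 10.0]))
K_hat, T_hat, L_hat = result.x
```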
Linear Approximations: State Space
Representation
For a state space model
ẋ = Ax + Bu,
y = Cx + Du,    (7)
the set of parameters to identify may be chosen as θ = {θA , θB , θC , θD },
where
θA = {a11 , a12 , . . . , a1n , . . . , an1 , an2 , . . . , ann },
θB = {b11 , b12 , . . . , b1p , . . . , bn1 , bn2 , . . . , bnp },
θC = {c11 , c12 , . . . , c1n , . . . , cq1 , cq2 , . . . , cqn },
θD = {d11 , d12 , . . . , d1p , . . . , dq1 , dq2 , . . . , dqp },
where aij , bij , cij , and dij are corresponding entries of matrices A, B, C,
and D of sizes n × n, n × p, q × n, and q × p, respectively.
Linear Approximations: State Space
Representation (continued)
• To limit the number of identified entries in θA , we should consider
the physical structure of the system, i.e., use white- and/or grey box
modeling approaches. Some known entries in the state matrix A may
then be fixed (considered constant).
• When using black box modeling, that is, when we make no
assumptions about the physical structure of the system, we can
sometimes fix θB and θC arbitrarily. For example:

θC = In×n,

where In×n denotes the identity matrix of size n × n. Note that this
choice also makes all of the system states observable.
• In many practical applications all entries in the matrix D may be
fixed to zero (a parametrization sketch with fixed C and D follows
this list).
• Other identification methods exist, e.g., the Subspace methods.
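A sketch of this parametrization (illustrative dimensions and naming; NumPy assumed), with θC fixed to the identity matrix and θD to zero so that only the entries of A and B are identified:

```python
import numpy as np

n, p = 3, 1  # assumed state and input dimensions

def theta_to_ss(theta):
    """Map a flat parameter vector to (A, B, C, D) with C = I and D = 0 fixed."""
    A = theta[:n * n].reshape(n, n)   # all entries of the state matrix are free
    B = theta[n * n:].reshape(n, p)   # all entries of the input matrix are free
    C = np.eye(n)                     # fixed: theta_C = I_{n x n}
    D = np.zeros((n, p))              # fixed: no direct feedthrough
    return A, B, C, D

# e.g., the current iterate of an optimizer minimizing the residual norm (5)
theta = np.random.randn(n * n + n * p)
A, B, C, D = theta_to_ss(theta)
```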
Linear Approximations: State Space
Representation (continued)
Yet another option: use the transfer function representation. For
example, a system with two inputs and two outputs may be
represented as
G(s) = [ B11(s)/A(s)   B12(s)/A(s)
         B21(s)/A(s)   B22(s)/A(s) ].

Notice that for each transfer function, the pole polynomial A(s),
the characteristic polynomial of the model, remains the same.
Residual Analysis
Denote by yr the experimental plant output, and by ym the
identified model output. We consider the SISO case, so both yr
and ym should be vectors of size N × 1. In the following, we
address the problem of statistical analysis of modeling residuals.
Residuals are given by a vector containing the model output error
ε = yr − ym . (8)
The percentage fit may be expressed as

Fit = (1 − ‖ε‖/‖yr − ȳr‖) · 100%,    (9)

where ‖·‖ is the Euclidean norm, and ȳr is the mean value of yr.
Residual Analysis: Basic Statistical Data
• Maximum absolute error

εmax = max_k |ε(k)|,    (10)

shows the maximum deviation from the expected behavior of the
model over the examined time interval; however, it may be misleading
in case of disturbances or strong noise;
• The mean squared error

εMSE = (1/N) Σ_{k=1}^{N} εk² = ‖ε‖₂²/N    (11)

may serve as a general measure of model quality. The lower it is, the
more likely the model represents an adequate description of the
studied process.
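A small sketch (NumPy assumed) computing the statistics (9)-(11) from a measured output y_r and a model output y_m:

```python
import numpy as np

def residual_stats(y_r, y_m):
    """Maximum absolute error (10), MSE (11), and percentage fit (9)."""
    eps = y_r - y_m                                  # residuals, eq. (8)
    eps_max = np.max(np.abs(eps))                    # eq. (10)
    mse = np.sum(eps ** 2) / eps.size                # eq. (11)
    fit = (1.0 - np.linalg.norm(eps)
           / np.linalg.norm(y_r - np.mean(y_r))) * 100.0  # eq. (9)
    return eps_max, mse, fit
```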
Residual Analysis: Autocorrelation of
Residuals
Additional useful information is given by an estimate of the autocorrelation of residuals
for lag τ = 1, 2, . . . , τmax < N, which may be computed by means of

Rε(τ) = 1/(N − τ) · Σ_{k=1}^{N−τ} ε(k)ε(k + τ).    (12)

The vector rε = [Rε(1) Rε(2) · · · Rε(τmax)] is constructed and is
normalized such that rε,norm = rε/Rε(1). Assuming a normal distribution of residuals,
the confidence band η̂ is then approximated for a confidence percentage
pconf ∈ (0, 1] around zero mean as the interval

η̂ = [0 − Φ⁻¹(cp)/√N, 0 + Φ⁻¹(cp)/√N],    (13)

where cp = 1 − 0.5(1 − pconf) and Φ⁻¹(x) = √2 erf⁻¹(2x − 1) is the quantile
function. If the residual samples represent uncorrelated white noise, then ideally

rε,norm(i) ∈ η̂  ∀i = 1, 2, . . . , τmax.    (14)
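A sketch of (12)-(14) in Python (NumPy/SciPy assumed); the normalization by Rε(1) follows the convention above:

```python
import numpy as np
from scipy.special import erfinv

def residual_autocorr(eps, tau_max, p_conf=0.95):
    """Autocorrelation estimate (12), confidence band (13), and check (14)."""
    N = eps.size
    R = np.array([np.sum(eps[:N - tau] * eps[tau:]) / (N - tau)
                  for tau in range(1, tau_max + 1)])
    r_norm = R / R[0]                        # normalized by R_eps(1), as above
    c_p = 1.0 - 0.5 * (1.0 - p_conf)
    quantile = np.sqrt(2.0) * erfinv(2.0 * c_p - 1.0)   # Phi^{-1}(c_p)
    band = quantile / np.sqrt(N)             # interval (13) is [-band, +band]
    inside = np.abs(r_norm) <= band          # ideally all True for white noise
    return r_norm, band, inside
```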
SISO System Identification Example:
System Response to PRBS Excitation
[Figure: system output y(t) (top) and PRBS system input u(t) (bottom) versus time, 0–100 s]
SISO System Identification Example:
Identification Failed
Mean squared error: 0.051619; Max abs error: 0.87922
[Figure: output error versus time, 0–100 s (top); autocorrelation of residuals with P = 0.95 confidence band over lags 1–50 (bottom)]
SISO System Identification Example:
Identification Successful
Mean squared error: 0.0099879; Max abs error: 0.40139
[Figure: output error versus time, 0–100 s (top); autocorrelation of residuals with P = 0.95 confidence band over lags 1–50 (bottom)]
Discrete-time Transfer Functions
A linear, time-invariant model may be represented as
y(t) = G(q)u(t) + H(q)e(t), (15)
where y(t) is the output, u(t) is the input, e(t) is the disturbance with
fe(·) representing its probability density function, and

G(q) = Σ_{k=1}^{N} g(k) q^(−k),  H(q) = 1 + Σ_{k=1}^{M} h(k) q^(−k).    (16)

Here q^(−d) is the backward shift (delay) operator acting on a
signal p(t), such that

q^(−d) p(t) = p(t − d).    (17)

Here t enumerates sampling instances. The system under study is
sampled with a constant sampling interval Ts.
Discrete-time Transfer Functions:
Parametrization
If we assume that e(t) is Gaussian, and denote by θ the set of parameters
to be identified, then
y(t) = G(q, θ)u(t) + H(q, θ)e(t), (18)
fe(·, θ) is the PDF of e(t); {e(t)} is white noise.    (19)
The parameter vector θ ranges over a subset of Rd , where d is the
dimension of θ:
θ ∈ DM ⊂ Rd .
The one-step-ahead prediction for this model, denoted by ŷ(t|θ), is

ŷ(t|θ) = H^(−1)(q, θ) G(q, θ) u(t) + [1 − H^(−1)(q, θ)] y(t).    (20)

Note that this predictor form does not depend on fe(·, θ).
Autoregressive eXogenous (ARX) Model:
Equation Error Model Structure
The model is given by a difference equation
y(t) + a1 y(t − 1) + · · · + an y(t − n) =
b1 u(t − 1) + · · · + bm u(t − m) + e(t). (21)
The parameter vector is given by

θ = [a1 a2 . . . an b1 . . . bm]^T.    (22)

We introduce polynomials A(q, θ) = 1 + a1 q^(−1) + · · · + an q^(−n) and
B(q, θ) = b1 q^(−1) + · · · + bm q^(−m) and obtain

y(t) = B(q, θ)/A(q, θ) · u(t) + 1/A(q, θ) · e(t).    (23)
ARX Model: Linear Regression
Inserting (23) into (20) gives
ŷ(t|θ) = B(q)u(t) + [1 − A(q)] y(t). (24)
Now we introduce the regression vector

ϕ(t) = [−y(t − 1) . . . −y(t − n) u(t − 1) . . . u(t − m)]^T.

Then (24) can be rewritten as

ŷ(t|θ) = θ^T ϕ(t) = ϕ^T(t) θ.    (25)

We have thus arrived at a linear regression: the predictor is linear in
the parameters, so θ can be estimated by ordinary least squares (see
the sketch below).
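A self-contained sketch of the least-squares ARX estimate built from the regression (25); the function name and interface are illustrative (NumPy assumed):

```python
import numpy as np

def arx_estimate(u, y, n, m):
    """Estimate ARX parameters by least squares using the regression (25)."""
    start = max(n, m)
    Phi = np.zeros((y.size - start, n + m))
    for row, t in enumerate(range(start, y.size)):
        past_y = [-y[t - k] for k in range(1, n + 1)]   # -y(t-1) ... -y(t-n)
        past_u = [u[t - k] for k in range(1, m + 1)]    #  u(t-1) ... u(t-m)
        Phi[row, :] = past_y + past_u                   # phi(t)^T
    theta, *_ = np.linalg.lstsq(Phi, y[start:], rcond=None)
    return theta[:n], theta[n:]   # (a_1, ..., a_n), (b_1, ..., b_m)
```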
ARMAX Model Structure
To gain more freedom in describing the properties of the disturbance term, we
construct a model that describes the equation error as a moving average
(MA) of white noise:

y(t) + a1 y(t − 1) + · · · + an y(t − n) = b1 u(t − 1) + · · ·
+ bm u(t − m) + e(t) + c1 e(t − 1) + · · · + cp e(t − p).    (26)

Taking C(q) = 1 + c1 q^(−1) + · · · + cp q^(−p), we rewrite it as

y(t) = B(q, θ)/A(q, θ) · u(t) + C(q, θ)/A(q, θ) · e(t).    (27)

The parameter vector is given by

θ = [a1 . . . an b1 . . . bm c1 . . . cp]^T.    (28)

Linear regression is not applicable in the case of ARMAX models.
Output Error (OE) and Box-Jenkins (BJ)
Model Structures
The equation error model structures correspond to descriptions
where the transfer functions G and H have a common polynomial
A in the denominators. From a physical point of view it may seem
more natural to parametrize these functions independently.
Output error (OE):

y(t) = B(q, θ)/A(q, θ) · u(t) + e(t).    (29)

Box-Jenkins (BJ):

y(t) = B(q, θ)/A(q, θ) · u(t) + C(q, θ)/D(q, θ) · e(t).    (30)
“Matrix” of Model Structures

[Figure: overview table of the model structures and their polynomial combinations]
Model Structures: Identifiability
Definition 1. A model structure M is globally identifiable
at θ⋆ if
M(θ) = M(θ⋆ ), θ ∈ DM ⇒ θ = θ⋆ . (31)
Definition 2. A model structure M is strictly globally
identifiable if it is globally identifiable at all θ⋆ ∈ DM .
Definition 3. A model structure M is globally identifiable if it is
globally identifiable at almost all θ⋆ ∈ DM .
The identifiability concept concerns the unique representation of a
given system description in a model structure.
Modeling Input and Output Nonlinearities
• Hammerstein model: (static) input nonlinearity:

u(t) → [f] → f(u(t)) → [Linear Model] → y(t)

• Wiener model: (static) output nonlinearity:

u(t) → [Linear Model] → y(t) → [h] → h(y(t))
The functions f (·) and h(·) may be parameterized either in terms
of physical parameters (e.g., saturation levels), or as (nonlinear)
black-box models. Example: Model f (u) as a polynomial
f(u) = αn u^n + αn−1 u^(n−1) + · · · + α1 u + α0.    (32)
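A sketch of a Hammerstein model with the polynomial input nonlinearity (32) followed by an assumed discrete-time linear block; all coefficients here are example values (NumPy/SciPy assumed):

```python
import numpy as np
from scipy import signal

def hammerstein_sim(u, alpha, b, a):
    """Static polynomial nonlinearity f (32), then a linear dynamic block."""
    f_u = np.polyval(alpha, u)        # alpha: highest-degree coefficient first
    return signal.lfilter(b, a, f_u)  # linear model in shift-operator form

# Example: f(u) = -0.1 u^3 + u, followed by a first-order discrete filter
u = np.random.randn(500)
y = hammerstein_sim(u, alpha=[-0.1, 0.0, 1.0, 0.0], b=[0.1], a=[1.0, -0.9])
```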
Example: Solar-heated House
[Figure: solar panel, pumps, house, and heat storage]
Example: Solar-heated House (continued)
Denote by x(t) the temperature of the solar panel collector at time
instant t, by y(t) the storage temperature, by us(t) the solar
intensity, and by up(t) the pump velocity. We need a model of how
the storage temperature y(t) is affected by the two inputs
mentioned above. With some simplifications, the physics of the
system can be described as follows:
• The heating of the air in the collector [= x(t + 1) − x(t)] is
equal to heat supplied by the sun [= d2 us (t)] minus the loss of
heat to the environment [= d3 x(t)] minus the heat transported
to storage [= d0 x(t)up (t)], i.e.,
x(t + 1) − x(t) = d2 us(t) − d3 x(t) − d0 x(t)up(t).
Example: Solar-heated House (continued)
• The increase of storage temperature [= y(t + 1) − y(t)] is equal to supplied
heat [= d0 x(t)up (t)] minus losses to the environment [= d1 y(t)], i.e.,
y(t + 1) − y(t) = d0 x(t)up (t) − d1 y(t).
Since x(t) is not directly measured, it is eliminated and the following model is
obtained:
y(t) = (1 − d1)y(t − 1) + (1 − d3) · y(t − 1)up(t − 1)/up(t − 2)
     + (d3 − 1)(1 − d1) · y(t − 2)up(t − 1)/up(t − 2) + d0 d2 up(t − 1)us(t − 2)
     − d0 up(t − 1)y(t − 1) + d0(1 − d1)up(t − 1)y(t − 2).
By a proper choice of parametrization by means of θ and ϕ, a linear regression
may be achieved in the form
ŷ(t|θ) = ϕ^T(t) θ.
Closed Loop Identification: Approaches
Denote by ZN a set of N collected samples under a sampling interval ts.
Samples are collected at a pair of points from the set {r, u, y},
which determines the particular identification approach.
Closed Loop Identification: Indirect
Approach
• The indirect approach. The controller is assumed to be
known. The identification data set is given by
Z^i_N = {r0, y0, r1, y1, . . . , rN, yN, ts},    (33)
where rk and yk denote the reference signal (set point) and
plant output, respectively, collected at points r and y. The
model structure of the plant G can be easily reconstructed,
once the parameters of the closed-loop system are obtained.
• Disadvantage: Any error in the controller (e.g., due to input
saturation or anti-windup measures) will be transported directly
to the estimate Ĝ.
Closed Loop Identification: Direct Approach
• The direct approach. The feedback is ignored and open-loop
identification is employed. The experimental data set used for
identification is given by
Z^d_N = {u0, y0, u1, y1, . . . , uN, yN, ts},    (34)
where uk and yk denote the plant input and output signal samples
collected at points u and y.
• This approach works regardless of the complexity of the controller.
• Consistency and optimal accuracy are obtained if the model structure
contains the “true” system.
• Unstable systems can be handled without problems as long as the
closed loop system is stable.
• Drawback: Need good noise models.
Experiment Design: Inputs for Open Loop
Experiments
Recall that any signal u(t) may be represented by a Fourier series,
i.e., any signal has (infinitely) many frequency components. For the
data collected from the output of a system under study to be
informative for identification, the input should be persistently
exciting, that is, it must contain sufficiently many distinct frequencies. The
following facts establish the choices of suitable input signals:
1. The properties of the estimate depend only on the input
spectrum—not the actual waveform of the input.
2. The input must have limited amplitude: ul ≤ u(t) ≤ uh.
3. Periodic inputs may have certain advantages.
Example choices for input signals: pseudo-random binary sequence
(PRBS), filtered Gaussian white noise, sine wave. A PRBS-style
generation sketch is given below.
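A simple generation sketch for a PRBS-like input: random switching between two amplitude levels, with a hold interval that shapes the excited frequency band. (Classical PRBS uses a feedback shift register; this random-switching variant is an illustrative approximation.)

```python
import numpy as np

def prbs(n_samples, hold=5, levels=(-1.0, 1.0), seed=0):
    """Binary excitation: random level switching, each level held `hold` samples."""
    rng = np.random.default_rng(seed)
    n_switch = -(-n_samples // hold)              # ceiling division
    bits = rng.integers(0, 2, n_switch)           # random 0/1 sequence
    u = np.repeat(np.where(bits == 0, levels[0], levels[1]), hold)
    return u[:n_samples]                          # amplitude-limited by design

u = prbs(1000, hold=10)   # persistently exciting, bounded input signal
```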
Some Common Excitation Signals

[Figure: examples of common excitation signals]
Experiment Design: Antialiasing
Consider three approaches for using antialiasing filters for data
acquisition.
1. Sample fast enough that the process is well damped above the
Nyquist frequency. Then the high-frequency components in the
output data that originate from the input are insignificant.
2. Consider the antialiasing output filter as a part of the process
and model the system from input to filtered output. This
might increase the necessary model orders.
3. Since the antialiasing filter is known, include it as a known part
of the model, and let the predicted output pass through the
filter before being used in the identification criterion.
Experiment Design: Antialiasing Solution I: Illustration

[Figure: illustration of sampling fast enough that the process is well damped above the Nyquist frequency]
Preprocessing Experimental Data: Drifts
and Detrending
Low-frequency disturbances, offsets, trends, drift, and periodic variations may
be present in the collected experimental data (u^m(t), y^m(t)). To deal with
signal offsets, we consider the following two approaches:
1. Let y(t) and u(t) be deviations from a physical equilibrium. We determine
the level ȳ that corresponds to a constant u^m(t) ≡ ū close to the desired
operating point. Then we define

y(t) = y^m(t) − ȳ,  u(t) = u^m(t) − ū    (35)

as the deviations from this equilibrium.
2. Subtract sample means: define

ȳ = (1/N) Σ_{t=1}^{N} y^m(t),  ū = (1/N) Σ_{t=1}^{N} u^m(t)    (36)

and use (35). This is a sound approach, and (ū, ȳ) is likely to be close to
an equilibrium point of the system.
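A minimal sketch of approach 2, sample-mean removal (NumPy/SciPy assumed):

```python
import numpy as np
from scipy import signal

def remove_means(u_meas, y_meas):
    """Subtract sample means (36) to obtain the deviation variables (35)."""
    return u_meas - np.mean(u_meas), y_meas - np.mean(y_meas)

# For slow linear drifts rather than constant offsets, linear detrending
# (fitting and removing a straight line) is a common alternative:
# y_detrended = signal.detrend(y_meas, type='linear')
```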
Preprocessing Experimental Data: Other
Methods
• Treating outliers (corrupt measurement data points):
◦ Remove or manually adjust such data points;
◦ Cut the segments containing outliers from identification data.
• Prefiltering input and output data through the same filter does not
change the input-output relation of a linear system.
• Remove high-frequency disturbances:
◦ Use an anti-aliasing filter;
◦ Resample the data—take every nth sample from the original
record;
• Remove low-frequency disturbances: apply high-pass filtering (see the
sketch below).
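Sketches of the last two items (SciPy assumed): decimation with a built-in anti-aliasing filter, and high-pass filtering against slow drift. The record, cutoffs, and factors are example values:

```python
import numpy as np
from scipy import signal

y = np.random.randn(4000)        # placeholder for a measured record
ts = 0.01                        # original sampling interval [s]; example value

# Resampling: decimate() low-pass filters first (anti-aliasing),
# then keeps every nth sample
n = 4
y_ds = signal.decimate(y, n)
ts_ds = n * ts                   # the sampling interval grows accordingly

# Low-frequency disturbance removal: 2nd-order high-pass Butterworth filter,
# cutoff given as a fraction of the Nyquist frequency
b, a = signal.butter(2, 0.01, btype='highpass')
y_hp = signal.filtfilt(b, a, y)
```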
Questions?
Thank you for your attention!