4/CY/O8 Advanced System Identification Lecture 4

Data Pre-processing

When data have been collected from the identification experiment, it is often necessary to process the data set before using it for identification.

There are several possible deficiencies in the data that should be attended to:

1. High-frequency disturbances in the data record, above the frequencies of interest to the system dynamics.

2. Occasional outliers and missing data.

3. Drift and offset, low-frequency disturbances.

Dr. V.M. Becerra 1


Drifts and detrending

There are two different approaches to dealing with these problems:

1. Removing the disturbances by explicit pre-treatment of the data.

2. Letting the noise model take care of the disturbances.

The first approach involves removing trends and offsets by direct subtraction.


Signal offsets

There are several approaches to dealing with signal offsets or non-zero mean values. Here $y^m(t)$ and $u^m(t)$ denote the measured output and input.

1. Let y(t) and u(t) be deviations from a physical equilibrium:

$$y(t) = y^m(t) - \bar{y}, \qquad u(t) = u^m(t) - \bar{u}$$

2. Subtract the mean values from the data:

$$\bar{y} = \frac{1}{N} \sum_{t=1}^{N} y^m(t), \qquad \bar{u} = \frac{1}{N} \sum_{t=1}^{N} u^m(t)$$

3. Estimate the offset explicitly. For example,

$$A(q^{-1})\, y^m(t) = B(q^{-1})\, u^m(t) + \alpha + v(t)$$
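The second approach (subtracting the sample means) can be sketched in a few lines of Python. This is only an illustration: the function name `remove_mean` and the data values are made up and do not come from the lecture.

```python
def remove_mean(signal):
    """Return the zero-mean signal and the sample mean that was subtracted."""
    mean = sum(signal) / len(signal)
    return [value - mean for value in signal], mean


# Hypothetical measured records (made-up numbers, for illustration only).
y_m = [20.5, 21.0, 22.5, 21.5, 19.5]
u_m = [10.0, 40.0, 40.0, 10.0, 10.0]

y, y_bar = remove_mean(y_m)  # y(t) = y_m(t) - y_bar
u, u_bar = remove_mean(u_m)  # u(t) = u_m(t) - u_bar
```

The subtracted means are returned so that they can be added back later, for instance when comparing model predictions with the raw measurements.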


Drift, trends and seasonal variations

Methods to cope with other slow disturbances in the data are analogous to the previous approaches for dealing with offsets.

Drifts and trends can be seen as time-varying equilibrium points, or time-varying mean values.

With some knowledge about the frequencies of the slow variations, an alternative is to high-pass filter the data:

$$y(t) = F(q)\, y^m(t), \qquad u(t) = F(q)\, u^m(t)$$
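As a rough sketch of the filtering approach, the Python snippet below uses a first-order high-pass filter for F(q). The lecture does not specify F(q), so the filter structure and the coefficient `a` are assumptions chosen purely for illustration; `a` would in practice be picked from what is known about the frequencies of the slow variations.

```python
def high_pass(signal, a=0.95):
    """First-order high-pass filter: out[t] = a * (out[t-1] + x[t] - x[t-1]).

    Assumed form F(q) = a (1 - q^-1) / (1 - a q^-1), which has zero gain
    at zero frequency, so constant offsets are removed entirely.
    """
    if not signal:
        return []
    out = [0.0]
    for t in range(1, len(signal)):
        out.append(a * (out[-1] + signal[t] - signal[t - 1]))
    return out


# A constant offset is removed, and a linear drift is reduced to a bounded value.
offset_only = high_pass([5.0] * 50)
drift = high_pass([float(t) for t in range(200)])
```

The same filter is applied to both the input and the output record, as in the equation above, so that the linear relation between the two signals is not altered by the pre-treatment.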


Outliers and missing data

In practice, data acquisition equipment is not perfect. It may be that single values of the input-output data are missing or corrupt due to malfunctioning of sensors or communication links.

A practical method to deal with this problem is to replace outliers or missing measurements by smoothed estimates prior to parameter estimation.
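One simple way to do this is sketched below in Python: samples are flagged when they are missing or far from a local median, and flagged samples are then replaced by linear interpolation between the nearest good neighbours. The lecture does not prescribe a particular smoother, so the detection rule, the window size and the threshold are all assumptions made for this example.

```python
from statistics import median


def flag_outliers(x, window=2, threshold=3.0):
    """Flag samples that are missing (None) or far from the local median."""
    flags = []
    for t, value in enumerate(x):
        if value is None:
            flags.append(True)
            continue
        neighbours = [v for v in x[max(0, t - window): t + window + 1]
                      if v is not None]
        flags.append(abs(value - median(neighbours)) > threshold)
    return flags


def interpolate_flagged(x, flags):
    """Replace flagged samples by linear interpolation between good ones."""
    good = [t for t, bad in enumerate(flags) if not bad]
    if not good:
        raise ValueError("no valid samples to interpolate from")
    out = list(x)
    for t, bad in enumerate(flags):
        if not bad:
            continue
        left = max((g for g in good if g < t), default=None)
        right = min((g for g in good if g > t), default=None)
        if left is None:        # bad sample at the start: hold next good value
            out[t] = x[right]
        elif right is None:     # bad sample at the end: hold last good value
            out[t] = x[left]
        else:
            w = (t - left) / (right - left)
            out[t] = (1 - w) * x[left] + w * x[right]
    return out


# Made-up record with one spike and one missing value.
raw = [1.0, 1.1, 9.0, 1.3, None, 1.5]
flags = flag_outliers(raw)
clean = interpolate_flagged(raw, flags)
```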


Model Validation – Residual Analysis

Once a model has been identified, it is important to validate the model using a data set that should be independent of the data used to calculate the model parameters.

The 'leftovers' of the modeling process, the part of the data that the model could not reproduce, are the residuals:

$$\varepsilon(t) = \varepsilon(t, \hat{\theta}_N) = y(t) - \hat{y}(t, \hat{\theta}_N)$$

These residuals carry information about the quality of the model.

When doing model validation, one typically computes the Mean Square Error, the autocorrelation of the residuals, and the cross-correlation between the residuals and the input. The values of the autocorrelation and cross-correlation should be small and lie within certain confidence limits.


The Mean Square Error

This is the average of the squared error:

$$\mathrm{MSE} = \frac{1}{N} \sum_{t=1}^{N} \varepsilon^2(t)$$

This single non-negative number measures how well the model output fits the measured data.
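A minimal Python sketch of this calculation follows. The function name and the numbers are invented for illustration and are not taken from the lecture's experiment.

```python
def mean_square_error(measured, predicted):
    """Average of the squared residuals e(t) = y(t) - y_hat(t)."""
    residuals = [y - y_hat for y, y_hat in zip(measured, predicted)]
    return sum(e * e for e in residuals) / len(residuals)


# Made-up measured and model outputs.
y_measured = [1.0, 2.0, 3.0, 4.0]
y_model = [1.5, 2.0, 2.5, 4.0]
mse = mean_square_error(y_measured, y_model)  # (0.25 + 0 + 0.25 + 0) / 4
```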

Auto-correlation of the residuals

As the residuals are assumed to be a white noise sequence, it is good to check their auto-correlation:

$$R_\varepsilon^N(\tau) = \frac{1}{N} \sum_{t=1}^{N} \varepsilon(t)\, \varepsilon(t - \tau)$$

for different values of τ = 1, 2, 3, 4, …

If these numbers are not small for τ ≠ 0, then part of ε(t) could have been predicted from past data, and so this is a sign of deficiency in the model.
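The check can be sketched in Python as follows. Terms that would need residuals from before the start of the record are simply dropped from the sum. The confidence limit used here, ±2.58/√N on the autocorrelation normalised by its lag-0 value, is a common large-sample approximation for a 99% band under the white-noise assumption; it is an assumption of this example rather than something stated in the lecture.

```python
import math


def autocorrelation(residuals, max_lag):
    """R(tau) = (1/N) * sum of e(t) e(t - tau), for tau = 0..max_lag."""
    n = len(residuals)
    return [sum(residuals[t] * residuals[t - tau] for t in range(tau, n)) / n
            for tau in range(max_lag + 1)]


def whiteness_check(residuals, max_lag, z=2.58):
    """For each lag 1..max_lag, report whether R(tau)/R(0) is within +/- z/sqrt(N)."""
    r = autocorrelation(residuals, max_lag)
    if r[0] == 0.0:
        raise ValueError("residuals are identically zero")
    limit = z / math.sqrt(len(residuals))
    return [abs(value / r[0]) <= limit for value in r[1:]]


# A strongly alternating sequence is clearly not white: lag 1 fails the test.
alternating = [(-1.0) ** t for t in range(100)]
lag1_ok = whiteness_check(alternating, 1)
```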


Cross-correlation between the residuals and the input

Similarly, the residuals should not be correlated with the input, so it is also good to check the cross-correlation of the residuals and the input:

$$R_{\varepsilon u}^N(\tau) = \frac{1}{N} \sum_{t=1}^{N} \varepsilon(t)\, u(t - \tau)$$

If there are traces of past inputs in the residuals, then there is a part of y(t) that originates from the past input and that has not been properly picked up by the model. Hence, the model could be improved.
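A small Python sketch of this cross-correlation is given below, for non-negative lags only and with out-of-record terms dropped. The data are constructed so that the residual is exactly the previous input, which is the situation described above: a trace of past input left in the residuals. The names and numbers are illustrative assumptions.

```python
def cross_correlation(residuals, inputs, max_lag):
    """R_eu(tau) = (1/N) * sum of e(t) u(t - tau), for tau = 0..max_lag."""
    n = len(residuals)
    return [sum(residuals[t] * inputs[t - tau] for t in range(tau, n)) / n
            for tau in range(max_lag + 1)]


# Made-up example in which e(t) = u(t - 1).
u = [1.0, 2.0, 3.0, 0.0]
e = [0.0, 1.0, 2.0, 3.0]
r_eu = cross_correlation(e, u, 1)
```

Here the lag-1 value is larger than the lag-0 value, which points to input dynamics at a delay of one sample that the model has missed.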


Subspace Identification – An introduction

A linear system can always be represented in state space form:

$$x(k+1) = A\,x(k) + B\,u(k) + w(k)$$

$$y(k) = C\,x(k) + D\,u(k) + v(k)$$

where:

x is an n-dimensional state vector
u is an nu-dimensional input vector
y is an ny-dimensional output vector
v is an ny-dimensional noise vector
w is an n-dimensional process noise vector

A, B, C and D are parameter matrices of the appropriate dimensions.
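To make the state space form concrete, the Python sketch below simulates the two equations with the noise terms w and v set to zero. The helper names and the first-order example matrices are made up for illustration.

```python
def mat_vec(M, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def vec_add(a, b):
    return [x + y for x, y in zip(a, b)]


def simulate(A, B, C, D, u_seq, x0):
    """Noise-free simulation: y(k) = C x(k) + D u(k), x(k+1) = A x(k) + B u(k)."""
    x = list(x0)
    outputs = []
    for u in u_seq:
        outputs.append(vec_add(mat_vec(C, x), mat_vec(D, u)))
        x = vec_add(mat_vec(A, x), mat_vec(B, u))
    return outputs


# Hypothetical first-order system driven by a unit step.
A, B, C, D = [[0.5]], [[1.0]], [[2.0]], [[0.0]]
y = simulate(A, B, C, D, u_seq=[[1.0]] * 3, x0=[0.0])
```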


The main idea behind subspace identification techniques is that, given the input-output data sequences u(t), y(t), t = 1…N, the state sequence x(t), t = 1…N, is estimated first, and then the state space matrices A, B, C, D are found using a least squares procedure.

Assume for a moment that not only y and u are measured, but also the state vector x. Now, with known u, y and x we can form a linear regression from the state space model above:

$$Y(t) = \begin{bmatrix} x(t+1) \\ y(t) \end{bmatrix}, \qquad \Theta = \begin{bmatrix} A & B \\ C & D \end{bmatrix}$$

$$\Phi(t) = \begin{bmatrix} x(t) \\ u(t) \end{bmatrix}, \qquad E(t) = \begin{bmatrix} w(t) \\ v(t) \end{bmatrix}$$

Then the state space model above may be written as:

$$Y(t) = \Theta\, \Phi(t) + E(t)$$

From this equation, the matrix elements in Θ can be estimated by the simple least squares method.
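The least squares step can be illustrated with a small Python sketch for the simplest case of a scalar state, input and output, so that Θ is a 2×2 matrix and the normal equations can be solved by hand. The system parameters and the input sequence are made up, and the data are noise-free, so the estimate should reproduce them up to rounding. This only illustrates the regression step with a known state; it is not an implementation of N4SID.

```python
def estimate_theta(x, u, y):
    """Least squares estimate of Theta = [[A, B], [C, D]] for scalar x, u, y.

    Solves Theta = (sum Y Phi^T) (sum Phi Phi^T)^-1 with Y(t) = [x(t+1), y(t)]
    and Phi(t) = [x(t), u(t)]. The list x has one more sample than u and y.
    """
    s_yf = [[0.0, 0.0], [0.0, 0.0]]
    s_ff = [[0.0, 0.0], [0.0, 0.0]]
    for t in range(len(u)):
        Y = [x[t + 1], y[t]]
        Phi = [x[t], u[t]]
        for i in range(2):
            for j in range(2):
                s_yf[i][j] += Y[i] * Phi[j]
                s_ff[i][j] += Phi[i] * Phi[j]
    det = s_ff[0][0] * s_ff[1][1] - s_ff[0][1] * s_ff[1][0]
    if abs(det) < 1e-12:
        raise ValueError("regressors are not sufficiently exciting")
    inv = [[s_ff[1][1] / det, -s_ff[0][1] / det],
           [-s_ff[1][0] / det, s_ff[0][0] / det]]
    return [[sum(s_yf[i][k] * inv[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


# Generate noise-free data from a hypothetical system (a, b, c, d).
a, b, c, d = 0.8, 0.5, 2.0, 0.1
u = [1.0, -1.0, 1.0, 1.0, -1.0, -1.0, 1.0, -1.0]
x, y = [0.0], []
for uk in u:
    y.append(c * x[-1] + d * uk)
    x.append(a * x[-1] + b * uk)

theta = estimate_theta(x, u, y)  # expected close to [[0.8, 0.5], [2.0, 0.1]]
```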


How do we obtain the state sequence x(t), t = 1…N, from the input-output data?

All state vectors that can be reconstructed from input-output data are linear combinations of the n k-step-ahead output predictors (see Ljung 1999):

$$\hat{y}(t+k \mid t), \qquad k = 1, \ldots, n$$

where n is the model order (the dimension of x).

We can form these predictors and select an algebraic basis among their components:

$$x(t) = L \begin{bmatrix} \hat{y}(t+1 \mid t) \\ \vdots \\ \hat{y}(t+n \mid t) \end{bmatrix}$$

The choice of L will determine the basis of the state space realization. The predictor $\hat{y}(t+k \mid t)$ is a linear function of u(s), y(s), s = 1, …, t.

The method is called subspace identification because it is based on subspace projections, a concept from linear algebra.

A well-known algorithm for subspace identification is the so-called N4SID, originally developed by Peter Van Overschee at the University of Leuven, Belgium.


EXAMPLE

An identification experiment has been carried out on a two tank mixing process. This process was located at the Control Engineering Centre Laboratory at City University, London, and a schematic diagram is given in Figure 1.

[Figure 1: The two tank mixing process. The schematic shows the inputs ucold and uhot, Tank 1 and Tank 2, the levels L1 and L2, and the temperatures T1 and T2.]

The process has two manipulated inputs, which are the openings (in percentage) of the cold (ucold) and hot (uhot) water valves.


The four measured variables are the levels in tanks 1 and 2, L1 and L2 (in cm), and the temperatures in tanks 1 and 2, T1 and T2 (in degrees C).

The experiment was carried out in open loop by manually specifying the values of the valve openings.

The input sequences to the process were chosen to be binary pseudo-random signals shifting between 10% and 40% for the cold water valve, and between 20% and 40% for the hot water valve.

The sampling time used was 20 s. The data set consists of 190 samples.
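For readers who want to generate a comparable input, the Python sketch below produces a two-level pseudo-random binary signal from a 7-bit shift register (a PRBS7 sequence, which repeats every 127 bits). The lecture does not say how its input sequences were generated, so the generator, the seeds and the `hold` parameter (the number of samples for which each bit is held) are assumptions for illustration only.

```python
def prbs7(length, low, high, hold=1, seed=0b1111111):
    """Two-level pseudo-random binary signal from a 7-bit shift register."""
    state = seed & 0x7F
    if state == 0:
        raise ValueError("seed must be non-zero")
    signal = []
    while len(signal) < length:
        # Feedback from the two highest bits (polynomial x^7 + x^6 + 1).
        new_bit = ((state >> 6) ^ (state >> 5)) & 1
        state = ((state << 1) | new_bit) & 0x7F
        signal.extend([high if new_bit else low] * hold)
    return signal[:length]


# Valve-opening levels (in %) and record length as quoted in the text above.
u_cold = prbs7(190, low=10.0, high=40.0)
u_hot = prbs7(190, low=20.0, high=40.0, seed=0b1010101)
```

Using different seeds for the two valves gives two shifted versions of the same sequence, which avoids driving both inputs with identical signals.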


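MATLAB code to estimate a state space model for the level in Tank 1 (uses the System Identification Toolbox):

load ex090398;                       % load the experiment data file
mix = iddata([L1],[ucold uhot]);     % output: level L1; inputs: both valve openings
mixd = detrend(mix,'constant');      % remove the mean values (offsets)
mixe = mixd([1:95],:);               % estimation data: samples 1 to 95
mixv = mixd([96:190],:);             % validation data: samples 96 to 190
ssmodel = n4sid(mixe,2);             % second-order state space model via N4SID
compare(mixv, ssmodel);              % model output vs. measured validation data
resid(mixv, ssmodel);                % residual correlation tests
present(ssmodel)                     % display the estimated model parameters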

[Figure: measured level L1 (cm) and valve openings ucold (%) and uhot (%), each plotted against sample number (0 to 200).]


[Figure: simulated model output y1 compared with the measured validation data (mixv); ssmodel fit: 93.25%.]

[Figure: correlation function of residuals for output y1 (lags 0 to 25), and cross-correlation function between input u1 and the residuals from output y1 (lags -25 to 25).]


These are the model parameters returned by N4SID:

A=
x1 x2
x1 0.90194 0.053728
x2 -0.48226 -0.15668

B=
u1 u2
x1 0.0027808 0.0010754
x2 0.0085512 -0.0001901

C=
x1 x2
y1 41.407 -2.2392

D=
u1 u2
y1 0 0
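
As a quick sanity check on these numbers, the Python sketch below computes the eigenvalues of the identified A matrix and tests whether they lie inside the unit circle, which is the stability condition for a discrete-time model. This check is an addition for illustration and is not part of the original example; the function name is made up.

```python
import cmath


def eigenvalues_2x2(A):
    """Eigenvalues of a 2x2 matrix from its trace and determinant."""
    trace = A[0][0] + A[1][1]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    root = cmath.sqrt(trace * trace - 4 * det)
    return (trace + root) / 2, (trace - root) / 2


# A matrix as returned by N4SID in the listing above.
A = [[0.90194, 0.053728],
     [-0.48226, -0.15668]]

poles = eigenvalues_2x2(A)
is_stable = all(abs(p) < 1.0 for p in poles)
```

For this A matrix the two poles are real, at approximately 0.877 and -0.132, so the identified model is stable.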
