Gauss-Newton Method for Algebraic Models
As seen in Chapter 2, a suitable measure of the discrepancy between a model and a set of data is the objective function, S(k); hence, the parameter values are obtained by minimizing this function. The estimation of the parameters can therefore be viewed as an optimization problem to which any of the available general-purpose optimization methods can be applied. In particular, the Gauss-Newton method has been found to be the most efficient method for estimating parameters in nonlinear models (Bard, 1970). As we strongly believe that this is indeed the best method to use for nonlinear regression problems, the Gauss-Newton method is presented in detail in this chapter. It is assumed that the parameters are free to take any values.
4.1 FORMULATION OF THE PROBLEM
In this chapter we focus on a particular technique, the Gauss-Newton method, for estimating the unknown parameters that appear in a model described by a set of algebraic equations. Namely, it is assumed that both the structure of the mathematical model and the objective function to be minimized are known. In mathematical terms, we are given the model
y = f(x, k)    (4.1)
where k = [k1, k2, ..., kp]^T is a p-dimensional vector of parameters whose numerical values are unknown, x = [x1, x2, ..., xn]^T is an n-dimensional vector of independent variables (which are often set by the experimentalist and whose numerical values are either known precisely or have been measured), f is an m-dimensional vector function of known form (the algebraic equations) and y = [y1, y2, ..., ym]^T is the m-dimensional vector of dependent variables which are measured experimentally (output vector).

Furthermore, we are given a set of experimental data, [ŷ_i, x_i], i=1,...,N, that we need to match to the values calculated by the model in some optimal fashion. Based on the statistical properties of the experimental error involved in the measurement of the output vector y (and possibly in the measurement of some of the independent variables x) we generate the objective function to be minimized, as discussed in detail in Chapter 2. In most cases the objective function can be written as
S(k) = Σ_{i=1}^N e_i^T Q_i e_i    (4.2a)

where e_i = [ŷ_i - f(x_i, k)] are the residuals and the weighting matrices Q_i, i=1,...,N, are chosen as described in Chapter 2. Equation 4.2a can also be written as

S(k) = Σ_{i=1}^N [ŷ_i - f(x_i, k)]^T Q_i [ŷ_i - f(x_i, k)]    (4.2b)

Finally, the above equation can also be written as follows

S(k) = Σ_{i=1}^N Σ_{j=1}^m Σ_{l=1}^m Q_jl^(i) [ŷ_ji - f_j(x_i, k)] [ŷ_li - f_l(x_i, k)]    (4.2c)

Minimization of S(k) can be accomplished by using almost any technique available from optimization theory. Next we shall present the Gauss-Newton method, as we have found it to be overall the best one (Bard, 1970).
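To make the computation concrete, the following is a minimal Python/NumPy sketch of how the weighted objective function of Equation 4.2a could be evaluated; the function and variable names are our own illustrative choices, not from the original text.

```python
import numpy as np

def objective(residuals, weights):
    """Weighted LS objective S(k) = sum_i e_i^T Q_i e_i (Equation 4.2a).

    residuals : list of N arrays e_i = y_hat_i - f(x_i, k), each of length m
    weights   : list of N (m x m) weighting matrices Q_i
    """
    return sum(e @ Q @ e for e, Q in zip(residuals, weights))
```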
4.2 THE GAUSS-NEWTON METHOD
Let us assume that an estimate k^(j) of the parameters is available at the jth iteration. We shall try to obtain a better estimate, k^(j+1). Linearization of the model equations around k^(j) yields
f(x_i, k^(j+1)) = f(x_i, k^(j)) + (∂f^T/∂k)^T Δk^(j+1) + H.O.T. ;  i=1,...,N    (4.3)
Neglecting all higher order terms (H.O.T.), the model output at k^(j+1) can be approximated by

y(x_i, k^(j+1)) = y(x_i, k^(j)) + G_i Δk^(j+1) ;  i=1,...,N    (4.4)
where G_i is the (m×p) sensitivity matrix (∂f^T/∂k)^T evaluated at x_i and k^(j). It is noted that G is also the Jacobian matrix of the vector function f(x,k).
Substitution of y(x_i, k^(j+1)), as approximated by Equation 4.4, into the LS objective function and use of the critical point criterion

∂S(k^(j+1))/∂k^(j+1) = 0    (4.5)

yields a linear equation of the form

A Δk^(j+1) = b    (4.6)

where

A = Σ_{i=1}^N G_i^T Q_i G_i    (4.7)

and

b = Σ_{i=1}^N G_i^T Q_i [ŷ_i - f(x_i, k^(j))]    (4.8)
Solution of the above equation using any standard linear equation solver yields Δk^(j+1). The next estimate of the parameter vector, k^(j+1), is obtained as

k^(j+1) = k^(j) + μ Δk^(j+1)    (4.9)

where a stepping parameter, μ (0 < μ ≤ 1), has been introduced to avoid the problem of overstepping. There are several techniques to arrive at an optimal value for μ; however, the simplest and most widely used is the bisection rule described below.
4.2.1 Bisection Rule
The bisection rule constitutes the simplest and most robust way available to determine an acceptable value for the stepping parameter μ. Normally, one starts with μ = 1 and keeps halving μ until the objective function becomes less than that obtained at the previous iteration (Hartley, 1961). Namely, we "accept" the first value of μ that satisfies the inequality

S(k^(j) + μ Δk^(j+1)) < S(k^(j))    (4.10)
More elaborate techniques for obtaining optimal or near-optimal stepping parameter values have been published in the literature. Essentially, one performs a univariate search to determine the minimum value of the objective function along the direction Δk^(j+1) chosen by the Gauss-Newton method.
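A minimal sketch of the bisection rule follows, assuming S is a callable returning the objective function value, k and dk are NumPy arrays, and dk is the Gauss-Newton direction Δk^(j+1); the cap on the number of halvings is an illustrative safeguard, not part of the original rule.

```python
def bisection_rule(S, k, dk, max_halvings=30):
    """Accept the first mu (starting at 1 and halving) for which
    S(k + mu*dk) < S(k), as in Equation 4.10."""
    S_current = S(k)
    mu = 1.0
    for _ in range(max_halvings):
        if S(k + mu * dk) < S_current:
            return mu
        mu *= 0.5
    raise RuntimeError("bisection rule found no acceptable step")
```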
4.2.2 Convergence Criteria
A typical test for convergence is ||Δk^(j+1)|| ≤ TOL, where TOL is a user-specified tolerance. This test is suitable only when the unknown parameters are of the same order of magnitude. A more general convergence criterion is

|Δk_i^(j+1) / k_i^(j+1)| ≤ 10^(-NSIG) ;  i=1,...,p    (4.11)

where p is the number of parameters and NSIG is the number of significant digits desired in the parameter estimates. Although this is not guaranteed, the above convergence criterion yields consistent results, assuming of course that no parameter converges to zero!
Algorithm - Implementation Steps:

1. Input the initial guess for the parameters, k^(0), and NSIG.
2. For j = 0, 1, 2, ..., repeat the following.
3. Compute y(x_i, k^(j)) and G_i for each i=1,...,N, and set up matrix A and vector b.
4. Solve the linear equation A Δk^(j+1) = b to obtain Δk^(j+1).
5. Determine μ using the bisection rule and obtain k^(j+1) = k^(j) + μ Δk^(j+1).
6. Continue until the maximum number of iterations is reached or convergence is achieved (i.e., |Δk_i^(j+1) / k_i^(j+1)| ≤ 10^(-NSIG) for i=1,...,p).
7. Compute the statistical properties of the parameter estimates (see Chapter 11).

In summary, at each iteration of the estimation method we compute the
model output, y(x_i, k^(j)), and the sensitivity coefficients, G_i, for each data point i=1,...,N, which are used to set up matrix A and vector b. Subsequent solution of the linear equation yields Δk^(j+1) and hence k^(j+1) is obtained. The converged parameter values represent the Least Squares (LS), Weighted LS or Generalized LS estimates, depending on the choice of the weighting matrices Q_i. Furthermore, if certain assumptions regarding the statistical distribution of the residuals hold, these parameter values could also be the Maximum Likelihood (ML) estimates.
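The implementation steps above can be collected into a short routine. The following is a minimal NumPy sketch under simplifying assumptions (dense linear algebra, identity weighting by default, a fixed cap on step halvings); it is meant to illustrate the structure of the algorithm, not to serve as a careful implementation.

```python
import numpy as np

def gauss_newton(f, G, x_data, y_data, k0, Q=None, nsig=5, max_iter=50):
    """Gauss-Newton for y = f(x, k), following steps 1-7 above.

    f(x, k) -> length-m model output;  G(x, k) -> (m x p) sensitivity matrix.
    x_data, y_data : the N experiments [y_hat_i, x_i];  k0 : initial guess k^(0).
    Q : optional list of (m x m) weighting matrices (default identity -> plain LS).
    """
    k = np.asarray(k0, dtype=float)
    p = k.size
    m = np.atleast_1d(f(x_data[0], k)).size
    if Q is None:
        Q = [np.eye(m)] * len(x_data)

    def S(k):  # objective function, Equation 4.2a
        total = 0.0
        for x, y, Qi in zip(x_data, y_data, Q):
            e = np.atleast_1d(y - f(x, k))
            total += e @ Qi @ e
        return total

    for _ in range(max_iter):
        A, b = np.zeros((p, p)), np.zeros(p)
        for x, y, Qi in zip(x_data, y_data, Q):   # step 3: set up A and b
            Gi = np.atleast_2d(G(x, k))
            e = np.atleast_1d(y - f(x, k))
            A += Gi.T @ Qi @ Gi                   # Equation 4.7
            b += Gi.T @ Qi @ e                    # Equation 4.8
        dk = np.linalg.solve(A, b)                # step 4: solve A dk = b (Eq. 4.6)
        mu, S_old = 1.0, S(k)
        while S(k + mu * dk) >= S_old and mu > 1e-12:
            mu *= 0.5                             # step 5: bisection rule (Eq. 4.10)
        k = k + mu * dk
        if np.all(np.abs(mu * dk) <= 10.0 ** (-nsig) * np.abs(k)):
            break                                 # step 6: criterion of Equation 4.11
    return k
```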
4.2.3 Formulation of the Solution Steps for the Gauss-Newton Method: Two Consecutive Chemical Reactions
Let us consider a batch reactor where the following consecutive reactions take place (Smith, 1981):

A --k1--> B --k2--> D    (4.12)
Taking into account the concentration invariant C_A + C_B + C_D = C_A0, i.e., that there is no change in the total number of moles, the integrated forms of the isothermal rate equations are

C_A(t) = C_A0 exp(-k1 t)    (4.13a)

C_B(t) = C_A0 k1 [exp(-k1 t) - exp(-k2 t)] / (k2 - k1)    (4.13b)

C_D(t) = C_A0 - C_A(t) - C_B(t)    (4.13c)
where C_A, C_B and C_D are the concentrations of A, B and D respectively, t is the reaction time, and k1, k2 are the unknown rate constants. During a typical experiment, the concentrations of A and B are measured only as a function of time. Namely, a typical dataset is of the form [t_i, C_Ai, C_Bi], i=1,...,N. The variables, the parameters and the governing equations for this problem can be rewritten in our standard notation as follows:
Parameter vector:  k = [k1, k2]^T
Vector of independent variables:  x = [x1]^T, where x1 = t
Output vector (dependent variables):  y = [y1, y2]^T, where y1 = C_A, y2 = C_B
Model equations:  f = [f1, f2]^T, where

f1(x1, k1, k2) = C_A0 exp(-k1 x1)    (4.14a)

f2(x1, k1, k2) = C_A0 k1 [exp(-k1 x1) - exp(-k2 x1)] / (k2 - k1)    (4.14b)
The elements of the (2×2) sensitivity coefficient matrix G are obtained as follows:

G11 = ∂f1/∂k1 = -x1 C_A0 exp(-k1 x1)    (4.15a)

G12 = ∂f1/∂k2 = 0    (4.15b)

G21 = ∂f2/∂k1 = C_A0 [k2 (exp(-k1 x1) - exp(-k2 x1)) / (k2 - k1)^2 - k1 x1 exp(-k1 x1) / (k2 - k1)]    (4.15c)

G22 = ∂f2/∂k2 = C_A0 k1 [x1 exp(-k2 x1) / (k2 - k1) - (exp(-k1 x1) - exp(-k2 x1)) / (k2 - k1)^2]    (4.15d)
Equations 4.14 and 4.15 are used to evaluate the model response and the sensitivity coefficients that are required for setting up matrix A and vector b at each iteration of the Gauss-Newton method.
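For illustration, Equations 4.14 and 4.15 can be coded directly. The sketch below assumes the initial concentration C_A0 is known (the value 1.0 is purely a placeholder), uses the calling convention of the gauss_newton sketch given earlier, and assumes k1 ≠ k2, since Equation 4.14b is singular at k1 = k2.

```python
import numpy as np

CA0 = 1.0  # placeholder; C_A0 is assumed known from the experiment

def f(x, k):
    """Model output [C_A, C_B] at time t = x[0] (Equations 4.14a-b)."""
    t, (k1, k2) = x[0], k
    eA, eB = np.exp(-k1 * t), np.exp(-k2 * t)
    return np.array([CA0 * eA, CA0 * k1 * (eA - eB) / (k2 - k1)])

def G(x, k):
    """(2x2) sensitivity matrix with elements (4.15a)-(4.15d)."""
    t, (k1, k2) = x[0], k
    eA, eB, d = np.exp(-k1 * t), np.exp(-k2 * t), k2 - k1
    return np.array([
        [-t * CA0 * eA, 0.0],                                # (4.15a), (4.15b)
        [CA0 * (k2 * (eA - eB) / d**2 - k1 * t * eA / d),    # (4.15c)
         CA0 * k1 * (t * eB / d - (eA - eB) / d**2)],        # (4.15d)
    ])
```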
4.2.4 Notes on the Gauss-Newton Method
This is the well-known Gauss-Newton method, which exhibits quadratic convergence to the optimum parameter values when the initial guess is sufficiently close. The Gauss-Newton method can also be looked at as a procedure that converts the nonlinear regression problem into a series of linear regressions by linearizing the nonlinear algebraic equations. It is worth noting that when the model equations are linear with respect to the parameters, there are no higher order terms (H.O.T.) and the linearization is exact. As expected, the optimum solution is then obtained in a single iteration, since the sensitivity coefficients do not depend on k.

In order to enlarge the region of convergence of the Gauss-Newton method and at the same time make it much more robust, a stepping parameter is used to avoid the problem of overstepping, particularly when the parameter estimates are far from the optimum. This modification makes the convergence to the optimum monotonic (i.e., the objective function is always reduced from one iteration to the next) and the overall estimation method becomes essentially globally convergent. When the parameters are close to the optimum, the bisection rule can be omitted without any problem.

Finally, an important advantage of the Gauss-Newton method is that at the end of the estimation, besides the best parameter estimates, their covariance matrix is also readily available without any additional computations. Details will be given in Chapter 11.
4.3 EXAMPLES
4.3.1 Chemical Kinetics: Catalytic Oxidation of 3-Hexanol
Gallot et al. (1998) studied the catalytic oxidation of 3-hexanol with hydrogen peroxide. The data on the effect of the solvent (CH3OH) on the partial conversion, y, of hydrogen peroxide were read from Figure 1a of the paper by Gallot et al. (1998) and are also given here in Table 4.1. They proposed the model given by Equation 4.16,

y = k1 [1 - exp(-k2 t)]    (4.16)

In this case, the unknown parameter vector k is the 2-dimensional vector [k1, k2]^T. There is only one independent variable (x1 = t) and only one output variable. Therefore, the model in our standard notation is

y1 = f1(x1, k1, k2) = k1 [1 - exp(-k2 x1)]    (4.17)
Table 4.1  Catalytic Oxidation of 3-Hexanol

Modified Reaction      Partial Conversion (y)
Time (h·kg/kmol)       0.75 g of CH3OH    1.30 g of CH3OH
 3                     0.055              0.040
 6                     0.090              0.070
13                     0.120              0.100
18                     0.150              0.130
26                     0.165              0.150
28                     0.175              0.160

Source: Gallot et al. (1998).
The (1×2) sensitivity coefficient matrix G = [G11, G12] is given by

G11 = ∂f1/∂k1 = 1 - exp(-k2 x1)    (4.18a)

G12 = ∂f1/∂k2 = k1 x1 exp(-k2 x1)    (4.18b)
Equations 4.17 and 4.18 are used to evaluate the model response and the sensitivity coefficients that are required for setting up matrix A and vector b at each iteration of the Gauss-Newton method.
4.3.2 Biological Oxygen Demand (BOD)
Data on biological oxygen demand versus time are usually modeled by the following equation

y = k1 [1 - exp(-k2 x)]    (4.19)

where k1 is the ultimate carbonaceous oxygen demand (mg/L) and k2 is the BOD reaction rate constant (d^-1). A set of BOD data were obtained by 3rd-year Environmental Engineering students at the Technical University of Crete and are given in Table 4.2.
Table 4.2  A Set of BOD Data

Time (days)    BOD (mg/L)
1              110
2              180
3              230
4              260
5              280
6              290
7              310
8              330
As seen, the model for the BOD is identical in mathematical form to the model given by Equation 4.17.
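Since the BOD model has the same form as Equation 4.17, it makes a convenient worked example. The sketch below pairs the data of Table 4.2 with the gauss_newton routine sketched earlier in this chapter; the initial guess is an illustrative choice of ours, not a value from the text.

```python
import numpy as np

t   = np.array([1.0, 2, 3, 4, 5, 6, 7, 8])                  # time (days), Table 4.2
bod = np.array([110.0, 180, 230, 260, 280, 290, 310, 330])  # BOD (mg/L), Table 4.2

def f(x, k):
    return np.array([k[0] * (1.0 - np.exp(-k[1] * x[0]))])  # Equation 4.19

def G(x, k):
    e = np.exp(-k[1] * x[0])
    return np.array([[1.0 - e, k[0] * x[0] * e]])            # cf. Equations 4.18a-b

x_data = [np.array([ti]) for ti in t]
y_data = [np.array([yi]) for yi in bod]
k_hat = gauss_newton(f, G, x_data, y_data, k0=[350.0, 0.3])  # illustrative guess
```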
4.3.3 Numerical Example 1
Let us consider the following nonlinear model (Bard, 1970),

y = k1 + x1 / (k2 x2 + k3 x3)    (4.20)

which is assumed to be able to fit the data given in Table 4.3. Using our standard notation [y = f(x,k)] we have:

Parameter vector:  k = [k1, k2, k3]^T
Vector of independent variables:  x = [x1, x2, x3]^T
Output vector:  y = [y1]
Model equation:  f = [f1], where

f1(x1, x2, x3, k1, k2, k3) = k1 + x1 / (k2 x2 + k3 x3)    (4.21)
The elements of the (1×3) sensitivity coefficient matrix G are obtained by evaluating the partial derivatives:

G11 = ∂f1/∂k1 = 1    (4.22a)
Table 4.3  Data for Numerical Example 1

Run    x1    x2    x3    y      y_calc
1      1     15    1     0.14   0.1341
2      2     14    2     0.18   0.1797
3      3     13    3     0.22   0.2203
4      4     12    4     0.25   0.2565
5      5     11    5     0.29   0.2892
6      6     10    6     0.32   0.3187
7      7     9     7     0.35   0.3455
8      8     8     8     0.39   0.3700
9      9     7     7     0.37   0.4522
10     10    6     6     0.58   0.5618
11     11    5     5     0.73   0.7152
12     12    4     4     0.96   0.9453
13     13    3     3     1.34   1.3288
14     14    2     2     2.10   2.0958
15     15    1     1     4.39   4.3968

Source: Bard (1970).
G12 = ∂f1/∂k2 = -x1 x2 / (k2 x2 + k3 x3)^2    (4.22b)

G13 = ∂f1/∂k3 = -x1 x3 / (k2 x2 + k3 x3)^2    (4.22c)
Equations 4.21 and 4.22 are used to evaluate the model response and the sensitivity coefficients that are required for setting up matrix A and vector b at
each iteration of the Gauss-Newton method.
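For Numerical Example 1, the model (4.21) and the sensitivities (4.22) translate directly into code. Below is a sketch using the data of Table 4.3 and the initial guess [1, 1, 1] employed in Section 4.4.1, again assuming the gauss_newton sketch given earlier.

```python
import numpy as np

x1 = np.arange(1.0, 16.0)                                   # Table 4.3
x2 = np.arange(15.0, 0.0, -1.0)
x3 = np.array([1, 2, 3, 4, 5, 6, 7, 8, 7, 6, 5, 4, 3, 2, 1.0])
y  = np.array([0.14, 0.18, 0.22, 0.25, 0.29, 0.32, 0.35, 0.39,
               0.37, 0.58, 0.73, 0.96, 1.34, 2.10, 4.39])

def f(x, k):
    return np.array([k[0] + x[0] / (k[1] * x[1] + k[2] * x[2])])  # Equation 4.21

def G(x, k):
    d = k[1] * x[1] + k[2] * x[2]
    return np.array([[1.0, -x[0] * x[1] / d**2, -x[0] * x[2] / d**2]])  # Eq. 4.22

x_data = list(np.column_stack([x1, x2, x3]))
y_data = [np.array([yi]) for yi in y]
k_hat = gauss_newton(f, G, x_data, y_data, k0=[1.0, 1.0, 1.0])
# Section 4.4.1 reports convergence to approximately [0.08241, 1.1330, 2.3437]
```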
4.3.4 Chemical Kinetics: Isomerization of Bicyclo [2,1,1] Hexane
Data on the thermal isomerization of bicyclo [2,1,1] hexane were measured by Srinivasan and Levi (1963) and are given in Table 4.4. The following nonlinear model, reproduced from Draper and Smith (1998), was proposed to describe the fraction of original material remaining (y) as a function of time (x1) and temperature (x2):
y = exp{-k1 x1 exp[-k2 (1/x2 - 1/620)]}    (4.23)

Using our standard notation [y = f(x,k)] we have:

Parameter vector:  k = [k1, k2]^T
Vector of independent variables:  x = [x1, x2]^T
Output vector:  y = [y1]
Model equation:  f = [f1], where

f1(x1, x2, k1, k2) = exp{-k1 x1 exp[-k2 (1/x2 - 1/620)]}    (4.24)
Table 4.4  Isomerization of Bicyclo [2,1,1] Hexane

Run   x1      x2    y        Run   x1      x2    y
1     120.0   600   0.900    21    120.0   620   0.673
2     60.0    600   0.949    22    60.0    620   0.802
3     60.0    612   0.886    23    60.0    620   0.802
4     120.0   612   0.785    24    60.0    620   0.804
5     120.0   612   0.791    25    60.0    620   0.794
6     60.0    612   0.890    26    60.0    620   0.804
7     60.0    620   0.787    27    60.0    620   0.799
8     30.0    620   0.877    28    30.0    631   0.764
9     15.0    620   0.938    29    45.1    631   0.688
10    60.0    620   0.782    30    30.0    631   0.717
11    45.1    620   0.827    31    30.0    631   0.802
12    90.0    620   0.696    32    45.0    631   0.695
13    150.0   620   0.582    33    15.0    639   0.808
14    60.0    620   0.795    34    30.0    639   0.655
15    60.0    620   0.800    35    90.0    639   0.309
16    60.0    620   0.790    36    25.0    639   0.689
17    30.0    620   0.883    37    60.1    639   0.437
18    90.0    620   0.712    38    60.0    639   0.425
19    150.0   620   0.576    39    30.0    639   0.638
20    90.4    620   0.715    40    30.0    639   0.659
                             41    60.0    639   0.449

Source: Srinivasan and Levi (1963).
The elements of the (1×2) sensitivity coefficient matrix G are obtained by evaluating the partial derivatives:

G11 = ∂f1/∂k1 = -x1 exp[-k2 (1/x2 - 1/620)] f1    (4.25a)

G12 = ∂f1/∂k2 = k1 x1 (1/x2 - 1/620) exp[-k2 (1/x2 - 1/620)] f1    (4.25b)
Equations 4.24 and 4.25 are used to evaluate the model response and the sensitivity coefficients that are required for setting up matrix A and vector b at
each iteration of the Gauss-Newton method.
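When analytic derivatives such as Equations 4.25a-b become tedious, the sensitivity matrix can also be approximated by finite differences. Below is a sketch for the model of Equation 4.24; the forward-difference helper and its step-size rule are our own illustrative choices, not part of the original text.

```python
import numpy as np

def f(x, k):
    """Equation 4.24: fraction of original material remaining."""
    inner = np.exp(-k[1] * (1.0 / x[1] - 1.0 / 620.0))
    return np.array([np.exp(-k[0] * x[0] * inner)])

def G_fd(x, k, h=1e-6):
    """Forward-difference approximation of the (1 x p) sensitivity matrix,
    usable in place of the analytic G of Equations 4.25a-b."""
    k = np.asarray(k, dtype=float)
    f0, cols = f(x, k), []
    for j in range(k.size):
        kp = k.copy()
        step = h * max(1.0, abs(k[j]))   # scale the step to the parameter size
        kp[j] += step
        cols.append((f(x, kp) - f0) / step)
    return np.column_stack(cols)
```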
4.3.5 Enzyme Kinetics
Let us consider the determination of two parameters, the maximum reaction rate (r_max) and the saturation constant (K_m), in an enzyme-catalyzed reaction following Michaelis-Menten kinetics. The Michaelis-Menten kinetic rate equation relates the reaction rate (r) to the substrate concentration (S) by

r = r_max S / (K_m + S)    (4.26)

The parameters are usually obtained from a series of initial rate experiments performed at various substrate concentrations. Data for the hydrolysis of benzoyl-L-tyrosine ethyl ester (BTEE) by trypsin at 30°C and pH 7.5 are given below:
S (μM)        20     15     10     5.0    2.5
r (μM/min)    330    300    260    220    110

Source: Blanch and Clark (1996).
In this case, the unknown parameter vector k is the 2-dimensional vector [r_max, K_m]^T, there is only one independent variable, x = [S], and similarly for the output vector, y = [r]. Therefore, the model in our standard notation is

y1 = f1(x1, k1, k2) = k1 x1 / (k2 + x1)    (4.27)
The (1×2) sensitivity coefficient matrix G = [G11, G12] is given by

G11 = ∂f1/∂k1 = x1 / (k2 + x1)    (4.28a)

G12 = ∂f1/∂k2 = -k1 x1 / (k2 + x1)^2    (4.28b)
Equations 4.27 and 4.28 are used to evaluate the model response and the sensitivity coefficients that are required for setting up matrix A and vector b at each iteration of the Gauss-Newton method.
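Equations 4.27 and 4.28, together with the BTEE data above, can be set up as follows (a sketch using the same conventions as before; here k[0] = r_max and k[1] = K_m):

```python
import numpy as np

S_conc = np.array([20.0, 15.0, 10.0, 5.0, 2.5])        # substrate conc. (uM)
rate   = np.array([330.0, 300.0, 260.0, 220.0, 110.0])  # initial rate (uM/min)

def f(x, k):
    return np.array([k[0] * x[0] / (k[1] + x[0])])       # Equation 4.27

def G(x, k):
    d = k[1] + x[0]
    return np.array([[x[0] / d, -k[0] * x[0] / d**2]])   # Equations 4.28a-b

x_data = [np.array([s]) for s in S_conc]
y_data = [np.array([r]) for r in rate]
```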
4.3.6 Catalytic Reduction of Nitric Oxide
As another example from chemical kinetics, we consider the catalytic reduction of nitric oxide (NO) by hydrogen, which was studied using a flow reactor operated differentially at atmospheric pressure (Ayen and Peters, 1962). The following reaction was considered to be important:

NO + H2 <-> H2O + ½ N2    (4.29)

Data were taken at 375, 400 and 425°C using nitrogen as the diluent. The reaction rate in gmol/(min·g catalyst) and the total NO conversion were measured at different partial pressures of H2 and NO.

A Langmuir-Hinshelwood reaction rate model for the reaction between an adsorbed nitric oxide molecule and an adjacently adsorbed hydrogen molecule is described by

r = k K_NO p_NO K_H2 p_H2 / (1 + K_NO p_NO + K_H2 p_H2)^2    (4.30)

where r is the reaction rate in gmol/(min·g catalyst), p_H2 is the partial pressure of hydrogen (atm), p_NO is the partial pressure of NO (atm), K_NO = A2 exp{-E2/RT} atm^-1 is the adsorption equilibrium constant for NO, K_H2 = A3 exp{-E3/RT} atm^-1 is the adsorption equilibrium constant for H2, and k = A1 exp{-E1/RT} gmol/(min·g catalyst) is the forward rate constant of the surface reaction. The data for the above problem are given in Table 4.5.
The objective of the estimation procedure is to determine the parameters k, K_H2 and K_NO (if data from only one isotherm are considered) or the parameters A1, A2, A3, E1, E2, E3 (when all the data are regressed together). The units of E1, E2, E3 are cal/mol and R is the universal gas constant (1.987 cal/(mol·K)). For the isothermal regression of the data, using our standard notation [y = f(x,k)] we have:

Parameter vector:  k = [k1, k2, k3]^T, where k1 = k, k2 = K_H2 and k3 = K_NO
Vector of independent variables:  x = [x1, x2]^T, where x1 = p_H2, x2 = p_NO
Output vector:  y = [y1], where y1 = r
Model equation:  f = [f1], where

f1(x1, x2, k1, k2, k3) = k1 k2 k3 x1 x2 / (1 + k3 x2 + k2 x1)^2    (4.31)

The elements of the (1×3) sensitivity coefficient matrix G are obtained by evaluating the partial derivatives:

G11 = ∂f1/∂k1 = k2 k3 x1 x2 / (1 + k3 x2 + k2 x1)^2    (4.32a)

G12 = ∂f1/∂k2 = k1 k3 x1 x2 (1 + k3 x2 - k2 x1) / (1 + k3 x2 + k2 x1)^3    (4.32b)

G13 = ∂f1/∂k3 = k1 k2 x1 x2 (1 + k2 x1 - k3 x2) / (1 + k3 x2 + k2 x1)^3    (4.32c)
Equations 4.31 and 4.32 are used to evaluate the model response and the sensitivity coefficients that are required for setting up matrix A and vector b at each iteration of the Gauss-Newton method.
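For the non-isothermal case, where A1, A2, A3, E1, E2, E3 are regressed from all isotherms together, the model can be reparameterized as sketched below. The pairing of the (A, E) constants with k, K_NO and K_H2 follows the definitions given after Equation 4.30; the function name and parameter-vector layout are our own illustrative choices. The sensitivities with respect to theta can then be obtained by the chain rule or by finite differences as in the previous example.

```python
import numpy as np

R = 1.987  # universal gas constant, cal/(mol K)

def rate_all_T(x, theta):
    """Equation 4.30 with k = A1*exp(-E1/RT), K_NO = A2*exp(-E2/RT) and
    K_H2 = A3*exp(-E3/RT);  x = [p_H2, p_NO, T],  theta = [A1, E1, A2, E2, A3, E3]."""
    pH2, pNO, T = x
    A1, E1, A2, E2, A3, E3 = theta
    k   = A1 * np.exp(-E1 / (R * T))
    KNO = A2 * np.exp(-E2 / (R * T))
    KH2 = A3 * np.exp(-E3 / (R * T))
    denom = 1.0 + KNO * pNO + KH2 * pH2
    return np.array([k * KNO * pNO * KH2 * pH2 / denom**2])
```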
4.3.7 Numerical Example 2
Let us consider the following nonlinear model (Hartley, 1961),

y = k1 + k2 exp(k3 x)    (4.33)
Table 4.5  Experimental Data for the Catalytic Reduction of Nitric Oxide

p_H2      p_NO      Reaction Rate, r x 10^5    Total NO
(atm)     (atm)     gmol/(min·g catalyst)      Conversion (%)

T = 375°C, weight of catalyst = 2.39 g
0.00922   0.0500    1.60     1.96
0.0136    0.0500    2.56     2.36
0.0197    0.0500    3.27     2.99
0.0280    0.0500    3.64     3.54
0.0291    0.0500    3.48     3.41
0.0389    0.0500    4.46     4.23
0.0485    0.0500    4.75     4.78
0.0500    0.00918   1.47     14.0
0.0500    0.0184    2.48     9.15
0.0500    0.0298    3.45     6.24
0.0500    0.0378    4.06     5.40
0.0500    0.0491    4.75     4.30

T = 400°C, weight of catalyst = 1.066 g
0.00659   0.0500    2.52     0.59
0.0113    0.0500    4.21     1.05
0.0228    0.0500    5.41     1.44
0.0311    0.0500    6.61     1.76
0.0402    0.0500    6.86     1.91
0.0500    0.0500    8.79     2.57
0.0500    0.0100    3.64     8.83
0.0500    0.0153    4.77     6.05
0.0500    0.0270    6.61     4.06
0.0500    0.0361    7.94     3.20
0.0500    0.0432    7.82     2.70

T = 425°C, weight of catalyst = 1.066 g
0.00474   0.0500    5.02     2.62
0.0136    0.0500    7.23     4.17
0.0290    0.0500    11.35    6.84
0.0400    0.0500    13.00    8.19
0.0500    0.0500    13.91    8.53
0.0500    0.0269    9.29     13.3
0.0500    0.0302    9.75     12.3
0.0500    0.0387    11.89    10.4

Source: Ayen and Peters (1962).
Data for the model are given below in Table 4.6. The variable y represents yields of wheat corresponding to six rates of application of fertilizer, x, on a coded scale. The model equation is often called Mitscherlich's law of diminishing returns.

According to our standard notation, the model equation is written as

y1 = f1(x1, k1, k2, k3) = k1 + k2 exp(k3 x1)    (4.34)
The elements of the (1×3) sensitivity coefficient matrix G are obtained by evaluating the partial derivatives:

G11 = ∂f1/∂k1 = 1    (4.35a)

G12 = ∂f1/∂k2 = exp(k3 x1)    (4.35b)

G13 = ∂f1/∂k3 = k2 x1 exp(k3 x1)    (4.35c)
Table 4.6  Data for Numerical Example 2

x     y
-5    127
-3    151
-1    379
 1    421
 3    460
 5    426

Source: Hartley (1961).
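The Mitscherlich model (Equations 4.34-4.35) and the data of Table 4.6 can be assembled in the same way; the sketch below uses the initial guess [100, -200, -1] employed in Section 4.4.2 and again assumes the gauss_newton sketch given earlier.

```python
import numpy as np

x = np.array([-5.0, -3.0, -1.0, 1.0, 3.0, 5.0])           # coded fertilizer rate
y = np.array([127.0, 151.0, 379.0, 421.0, 460.0, 426.0])  # wheat yield, Table 4.6

def f(xi, k):
    return np.array([k[0] + k[1] * np.exp(k[2] * xi[0])])  # Equation 4.34

def G(xi, k):
    e = np.exp(k[2] * xi[0])
    return np.array([[1.0, e, k[1] * xi[0] * e]])           # Equations 4.35a-c

x_data = [np.array([v]) for v in x]
y_data = [np.array([v]) for v in y]
k_hat = gauss_newton(f, G, x_data, y_data, k0=[100.0, -200.0, -1.0])
# Section 4.4.2 reports convergence to approximately [523.3, -156.9, -0.1997]
```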
4.4 SOLUTIONS
The solutions to Numerical Examples 1 and 2 are given here. The rest of the solutions will be presented in Chapter 16, where applications in chemical reaction engineering are illustrated.
4.4.1 Numerical Example 1
Starting with the initial guess k^(0) = [1, 1, 1]^T, the Gauss-Newton method easily converged to the parameter estimates within 4 iterations, as shown in Table 4.7. In the same table the standard error (%) in the estimation of each parameter is also shown. Bard (1970) reported the same parameter estimates [0.08241, 1.1330, 2.3437] starting from the same initial guess.

The structure of the model characterizes the shape of the region of convergence. For example, if we change the initial guess for k1 substantially, the algorithm converges very quickly, since k1 enters the model in a linear fashion. This is clearly shown in Table 4.8, where we have used k^(0) = [100000, 1, 1]^T. On the other hand, if we use for k2 a value that is just within one order of magnitude away from the optimum, the Gauss-Newton method fails to converge. For example, if k^(0) = [1, 2, 1]^T is used, the method converges within 3 iterations. If, however, k^(0) = [1, 8, 1]^T or k^(0) = [1, 10, 1]^T is used, the Gauss-Newton method fails to converge. The actual shape of the region of convergence can be fairly irregular. For example, if we use k^(0) = [1, 14, 1]^T or k^(0) = [1, 15, 1]^T, the Gauss-Newton method converges within 8 iterations in both cases. But again, when k^(0) = [1, 16, 1]^T is used, the Gauss-Newton method fails to converge.
Table 4.7  Parameter Estimates at Each Iteration of the Gauss-Newton Method for Numerical Example 1 with Initial Guess [1, 1, 1]

Iteration    LS Objective Function    k1         k2       k3
0            41.6817                  1          1        1
1            1.26470                  0.08265    1.183    1.666
2            0.03751                  0.08249    1.165    2.198
3            0.00824387               0.08243    1.135    2.338
4            0.00824387               0.08241    1.133    2.344
Standard Error (%)                    15.02      27.17    12.64
Table 4.8  Parameter Estimates at Each Iteration of the Gauss-Newton Method for Numerical Example 1 with Initial Guess [100000, 1, 1]

Iteration    LS Objective Function    k1         k2       k3
0            1.80x10^9                100000     1        1
1            1.26470                  0.08265    1.183    1.666
2            0.03751                  0.08249    1.165    2.198
3            0.00824387               0.08243    1.135    2.338
4            0.00824387               0.08241    1.133    2.344
4.4.2 Numerical Example 2
Starting with the initial guess k^(0) = [100, -200, -1]^T, the Gauss-Newton method converged to the optimal parameter estimates given in Table 4.9 in 12 iterations. The number of iterations depends heavily on the chosen initial guess. If, for example, we use k^(0) = [1000, -200, -0.2]^T as the initial guess, the Gauss-Newton method converges to the optimum within 3 iterations, as shown in Table 4.10. At the bottom of Table 4.10 we also report the standard error (%) in the parameter estimates. As expected, the uncertainty is quite high, since we are estimating 3 parameters from only 6 data points and the structure of the model naturally leads to a high correlation between k2 and k3.

Hartley (1961) also reported convergence to the same parameter values, k* = [523.3, -156.9, -0.1997]^T, using as initial guess k^(0) = [500, -140, -0.18]^T.
Table 4.9  Parameter Estimates at Each Iteration of the Gauss-Newton Method for Numerical Example 2. Initial Guess [100, -200, -1]

Iteration    LS Objective Function    k1       k2        k3
0            9.003x10^8               100      -200      -1
1            1.514x10^7               443.6    -32.98    -0.9692
2            1.501x10^7               445.1    -36.00    -0.7660
3            8.722x10^4               457.9    -62.79    -0.4572
4            2.471x10^4               494.1    -127.0    -0.1751
5            1.392x10^4               508.2    -140.0    -0.2253
6            1.346x10^4               528.9    -164.1    -0.1897
7            1.340x10^4               518.7    -151.7    -0.2041
8            1.339x10^4               524.8    -158.8    -0.1977
9            1.339x10^4               522.5    -156.1    -0.2005
10           1.339x10^4               523.6    -157.3    -0.1993
11           1.339x10^4               523.2    -156.8    -0.1998
12           1.339x10^4               523.3    -156.9    -0.1997
Table 4.10  Parameter Estimates at Each Iteration of the Gauss-Newton Method for Numerical Example 2. Initial Guess [1000, -200, -0.2]

Iteration    LS Objective Function    k1       k2        k3
0            1.826x10^7               1000     -200      -0.2
1            1.339x10^4               523.4    -157.1    -0.1993
2            1.339x10^4               523.1    -156.8    -0.1998
3            1.339x10^4               523.3    -156.9    -0.1997
Standard Error (%)                    30.4     115.2     85.2