Nonlinear Regression
I. Nonlinear model
A. Definition: A model in which one or more independent variables are not
linear in the parameters.
B. Mathematically: Assume a function f(x, θ), where x is an independent
variable and θ is a parameter vector. f(x, θ) is nonlinear if

    ∂f/∂θᵢ = g(θ)

where g is a function of the parameter vector.

Example: y = β₀ + β₁x is linear because ∂y/∂β₀ = 1 and ∂y/∂β₁ = x, neither
of which is a function of β₀ or β₁.

Example: y = β₀e^(β₁x) is nonlinear because ∂y/∂β₀ = e^(β₁x) and
∂y/∂β₁ = β₀xe^(β₁x), each of which is a function of the parameters β₀ and β₁.
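The linearity test above can be sketched symbolically. This is a minimal illustration (the helper name `is_linear_in_parameters` is mine, not from the notes): a model is linear in its parameters when no partial derivative with respect to a parameter still contains a parameter.

```python
import sympy as sp

x, b0, b1 = sp.symbols("x beta0 beta1")

def is_linear_in_parameters(expr, params):
    # Linear in the parameters iff every d(expr)/d(param)
    # is free of all parameters.
    return all(sp.diff(expr, p).free_symbols.isdisjoint(params)
               for p in params)

linear_model = b0 + b1 * x              # y = b0 + b1*x
nonlinear_model = b0 * sp.exp(b1 * x)   # y = b0*exp(b1*x)

print(is_linear_in_parameters(linear_model, {b0, b1}))      # True
print(is_linear_in_parameters(nonlinear_model, {b0, b1}))   # False
```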
C. Objective: The objective of nonlinear regression is to infer the value of a
variable (say Y) from the value of another variable (say X).
D. Method: To accomplish this objective, we must establish a relationship
between Y and X. This relationship can be stated in a stochastic
fashion by the equation

    Y = f(X, θ) + ε

where θ is a vector of unknown parameters, X is a vector of known
explanatory variables, and ε is a random variable.
II. Nonlinear regression model
A. Using the stochastic relationship, we can define a model in the form

    Yᵢ = f(xᵢ, θ) + εᵢ,    i = 1, …, n

B. Definitions and Assumptions
1. Definitions
    1. xᵢ is a known, fixed constant (predictor or input)
    2. Yᵢ = dependent variable; Yᵢ is a random variable; yᵢ = ith observed
       value of Yᵢ
    3. θ is a vector of unknown parameters
    4. εᵢ is called the disturbance term, representing variability
2. Assumptions
    5. εᵢ is a random variable distributed N(0, σ²)
    6. σ² is unknown and constant
    7. The expectation function is mᵢ = E[Yᵢ] = f(xᵢ, θ)
    8. V[Yᵢ] = σ²
    9. Yᵢ is distributed N(mᵢ, σ²)
    10. Cov(εᵢ, εⱼ) = 0 for i ≠ j
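The model and assumptions above can be sketched by simulating data. This is an illustrative sketch, not part of the notes: the "true" parameter values, σ, and the x grid are all assumed, and the exponential expectation function anticipates the worked example later in the notes.

```python
import numpy as np

# Simulate Y_i = f(x_i, theta) + e_i with e_i ~ N(0, sigma^2),
# using the exponential form f(x, theta) = t0 * exp(t1 * x).
rng = np.random.default_rng(0)
theta = (60.0, -0.04)          # assumed "true" parameters for the simulation
sigma = 2.0                    # assumed constant error standard deviation
x = np.linspace(0.0, 65.0, 15)
y = theta[0] * np.exp(theta[1] * x) + rng.normal(0.0, sigma, size=x.size)
```

Each yᵢ is then one draw from N(mᵢ, σ²) with mᵢ = 60·e^(−0.04·xᵢ), matching assumptions 5-10.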
C. Determining f(x, θ)
1. The expectation function, f(x, θ), will most likely be derived from some
physical, chemical, or other theoretical consideration.
2. It is the researcher's job to determine the simplest form of the model
and its parameter estimates to represent the data.
3. The model is often an algebraic equation of the parameters and control
variables. However, it is not necessary for the expectation function to
be an explicit function of the parameters and control variables. For
example, the model could be defined by a set of linear differential
equations.
III. Method of nonlinear regression - Principle of Least
Squares
A. Our goal is to find the simplest model that best represents a set of data.
The method of least squares will be employed to determine the "best fit"
curve for the data.
B. The least squares criterion "Q" is defined as the sum of the squared
differences between each individual data point and its respective fitted
model value. In equation form this is expressed as

    Q = Σᵢ₌₁ⁿ [yᵢ − f(xᵢ, θ)]²

    yᵢ − f(xᵢ, θ) = ith residual

The values of the parameter vector θ that minimize the quantity "Q"
define the parameters of the best fit curve for the data.
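The criterion Q can be written directly as a function. A minimal sketch, assuming the exponential expectation function f(x, θ) = θ₀e^(θ₁x) used later in these notes:

```python
import numpy as np

def f(x, theta):
    # Expectation function f(x, theta) = theta0 * exp(theta1 * x)
    return theta[0] * np.exp(theta[1] * x)

def Q(theta, x, y):
    # Least squares criterion: Q = sum_i (y_i - f(x_i, theta))^2
    resid = y - f(x, theta)
    return float(np.sum(resid ** 2))
```

Smaller Q means a better fit; the best-fit parameters are those that minimize Q over θ.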
Graphically (plot omitted):
* As shown by the plot, the idea is to minimize the sum of the squared
vertical distances from each data point to the curve.
C. Minimization of the least squares criterion "Q"
1. Analytically. Values of θᵢ that minimize Q can be found by solving
simultaneously

    ∂Q/∂θᵢ = 0,    i = 1, …, p

2. Numerically. Minimizing Q analytically is most often either very
difficult or not possible. Therefore, numerical techniques must be
employed to determine θ.
D. Numerical Techniques
1. Iteration. Several computer software packages use methods similar to
the Gauss-Newton method to iterate the values of the parameter vector
to find the minimum value of Q.
2. However, caution should be used when using computer software.
Model assumptions should always be verified before interpreting
results.
3. Many programs cannot distinguish between global and local minima.
There may be several minima defined by combinations of different
parameter values. However, there is normally only one global
minimum that defines the best solution to the problem.
Pictorially (figure omitted):
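The Gauss-Newton iteration mentioned above can be sketched for the exponential model. This is a bare-bones illustration, with no step damping or convergence safeguards, so it shares the weaknesses just described: it needs a good starting point and can land in a local minimum.

```python
import numpy as np

def gauss_newton(x, y, theta0, tol=1e-10, max_iter=100):
    # Gauss-Newton for f(x, theta) = t0 * exp(t1 * x):
    # repeatedly linearize f at the current theta and solve
    # the resulting linear least squares problem for the step.
    theta = np.asarray(theta0, dtype=float)
    for _ in range(max_iter):
        t0, t1 = theta
        resid = y - t0 * np.exp(t1 * x)
        # Analytic Jacobian of f with respect to (t0, t1)
        J = np.column_stack([np.exp(t1 * x), t0 * x * np.exp(t1 * x)])
        # Normal equations for the step: (J'J) delta = J' resid
        delta = np.linalg.solve(J.T @ J, J.T @ resid)
        theta = theta + delta
        if np.max(np.abs(delta)) < tol:
            break
    return theta
```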
E. The probability of successful convergence when solving for the
parameter vector is made greater by supplying a good initial first guess
for the parameters. Initial guesses for θ can be determined by
1. Experience
2. Related experiments
3. Plotting and interpolating data
4. Interpreting derivatives
5. Transforming the function
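Item 5 can be sketched for the exponential model: taking logs of y = b₀e^(b₁x) gives ln(y) = ln(b₀) + b₁x, so an ordinary straight-line fit of ln(y) on x supplies starting values. The helper name below is mine, not from the notes.

```python
import numpy as np

def initial_guess_exponential(x, y):
    # Fit ln(y) = intercept + slope * x by ordinary least squares;
    # then b0 ~ exp(intercept) and b1 ~ slope.  Requires y > 0.
    slope, intercept = np.polyfit(x, np.log(y), 1)
    return np.exp(intercept), slope
```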
F. Iterative software packages
1. Spreadsheet
    Example: Excel
    I. Advantages:
        1. Spreadsheets are readily available
    II. Disadvantages:
        1. Must be programmed by the user
        2. Output does not include descriptive statistics unless they are
        programmed
        3. A new spreadsheet must be created for each new model
        4. Will not distinguish between local and global minima, which
        may lead to inaccurate solutions
        5. Requires a good first guess
        6. Does not work well with highly complicated models
2. Commercial Estimation Packages
    Examples: TK Solver, KaleidaGraph
    I. Advantages:
        1. Easy to use
        2. Limited descriptive statistics provided
        3. Can compute more complicated models
    II. Disadvantages:
        1. Purchase required
        2. Still may require a good first guess
3. FORTRAN computer programs
    Example: General REGression program (GREG)
    I. Advantages:
        1. Provides detailed descriptive statistics
        2. Distinguishes between local and global minima
        3. Can compute very complicated models
    II. Disadvantages:
        1. Time-consuming programming required
IV. Assessing Assumptions
A. Assumptions:
1. εᵢ is a random variable distributed N(0, σ²)
2. σ² is unknown and constant
3. The expectation function is mᵢ = E[Yᵢ] = f(xᵢ, θ)
4. V[Yᵢ] = σ²
5. Yᵢ is distributed N(mᵢ, σ²)
6. Independence between cases; Cov(εᵢ, εⱼ) = 0, i ≠ j
B. Residual Plots (plots omitted):
1. Correct picture
2. Incorrect expectation function
3. Violates assumption of constant variance
4. Violates assumption of independence; Cov(εᵢ, εⱼ) = 0, i ≠ j
V. Inferences about nonlinear regression parameters
A. Estimating σ²
1. The error sum of squares (SSE) is equivalent to the least squares
criterion Q:

    SSE = Σᵢ₌₁ⁿ [yᵢ − f(xᵢ, θ̂)]²

2. An estimator for σ² is

    s² = SSE/(n − p) = Σᵢ₌₁ⁿ [yᵢ − f(xᵢ, θ̂)]² / (n − p)

where n − p is the degrees of freedom for error and p is the number of
parameters.
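The estimator s² = SSE/(n − p) is a one-liner given fitted values. A minimal sketch (function name mine):

```python
import numpy as np

def error_variance(y, fitted, p):
    # s^2 = SSE / (n - p), where SSE is the sum of squared residuals,
    # n the number of observations, p the number of parameters.
    resid = np.asarray(y, dtype=float) - np.asarray(fitted, dtype=float)
    sse = float(np.sum(resid ** 2))
    return sse / (len(resid) - p)
```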
B. The Coefficient of Determination, R²
1. R² is defined as the proportion of observed variation in Y explained
by the fitted nonlinear regression equation.
2. The total sum of squares (SST) is defined as the sum of squared
deviations about the sample mean of the observed y values:

    SST = Σᵢ₌₁ⁿ (yᵢ − ȳ)²

3. The total proportion of unexplained variation is equal to SSE/SST.
4. The total proportion of explained variation is equal to the
coefficient of determination:

    R² = 1 − SSE/SST
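Putting the two sums of squares together gives R² directly. A minimal sketch (function name mine):

```python
import numpy as np

def r_squared(y, fitted):
    # R^2 = 1 - SSE/SST: proportion of variation about the mean
    # of y that is explained by the fitted model.
    y = np.asarray(y, dtype=float)
    fitted = np.asarray(fitted, dtype=float)
    sse = np.sum((y - fitted) ** 2)
    sst = np.sum((y - y.mean()) ** 2)
    return float(1.0 - sse / sst)
```

A perfect fit gives R² = 1; a model that does no better than the sample mean gives R² = 0.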
C. Interval estimation of a single θᵢ
1. If the error terms in the nonlinear regression model are independent and
normally distributed, the following approximate (1 − α)100% confidence
interval may be used:

    θ̂ᵢ ± t(1 − α/2; n − p) · s(θ̂ᵢ)

2. See Bates and Watts (pp. 58-59) for a detailed definition of the
approximate derivative matrix D̂.
D. Simultaneous interval estimation of several θᵢ
1. Bonferroni joint approximate confidence intervals for m parameters with
confidence level (1 − α)100% are defined by

    θ̂ᵢ ± t(1 − α/(2m); n − p) · s(θ̂ᵢ)

where s²(θ̂) = MSE · (D̂′D̂)⁻¹.
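The interval formulas in C and D can be sketched together. This is an illustrative implementation (function name mine), taking the derivative matrix D̂ as given; the Bonferroni case is just m > 1, which widens each interval by splitting α across the m statements.

```python
import numpy as np
from scipy import stats

def approx_conf_intervals(theta_hat, D, y, fitted, alpha=0.10, m=1):
    # s^2(theta_hat) = MSE * (D'D)^(-1); standard errors are the
    # square roots of its diagonal.  m = 1 gives individual intervals,
    # m > 1 gives Bonferroni joint intervals.
    n, p = D.shape
    mse = np.sum((np.asarray(y) - np.asarray(fitted)) ** 2) / (n - p)
    cov = mse * np.linalg.inv(D.T @ D)
    se = np.sqrt(np.diag(cov))
    t = stats.t.ppf(1.0 - alpha / (2.0 * m), n - p)
    return np.column_stack([theta_hat - t * se, theta_hat + t * se])
```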
Example: Fit the following data using the model Y = β₀e^(β₁x) + ε and determine 90%
approximate confidence intervals for each parameter and 90% Bonferroni joint
approximate confidence intervals. Use 40.0 and −0.80 as initial values for b₀ and b₁.

x:  2   5   7  10  14  19  26  31  34  38  45  52  53  60  65
y: 54  50  45  37  35  25  20  16  18  13   8  11   8   4   6
Solution:
The table below summarizes the calculations for f(x, θ̂), Q, and SSE.

     x    y    f(x, θ̂)   [y − f(x, θ̂)]²
     2   54    54.1454       0.0211
     5   50    48.0823       3.676
     7   45    44.4223       0.3338
    10   37    39.4480       5.9925
    14   35    33.6710       1.7663
    19   25    27.6245       6.8882
    26   20    20.9387       0.8812
    31   16    17.1787       1.3892
    34   18    15.2550       7.5350
    38   13    13.0210       0.0004
    45    8     9.8696       3.4953
    52   11     7.4809      12.3841
    53    8     7.1905       0.6552
    60    4     5.4502       2.1032
    65    6     4.4715       2.3362

    SSE = Σᵢ₌₁ⁿ [yᵢ − f(xᵢ, θ̂)]² = 49.4503
SSE was minimized by iterating the values of b₀ and b₁ using Microsoft Excel.
The values of b₀ and b₁ that minimize SSE are

    b₀ = 58.6065
    b₁ = −0.0396

Using the solution given by Bates and Watts, the respective values of s(b₀) and s(b₁)
are determined to be approximately

    s(b₀) = 1.4720
    s(b₁) = 0.00171
An approximate 90% confidence interval for b₀ is

    58.6065 ± t(0.95; 13)(1.4720) = 58.6065 ± (1.771)(1.4720)
    (55.9996, 61.2134)

An approximate 90% confidence interval for b₁ is

    −0.0396 ± t(0.95; 13)(0.00171) = −0.0396 ± (1.771)(0.00171)
    (−0.0426, −0.0366)

An approximate 90% joint Bonferroni confidence interval for b₀ is

    58.6065 ± t(0.975; 13)(1.4720) = 58.6065 ± (2.160)(1.4720)
    (55.4270, 61.7860)

An approximate 90% joint Bonferroni confidence interval for b₁ is

    −0.0396 ± t(0.975; 13)(0.00171) = −0.0396 ± (2.160)(0.00171)
    (−0.0433, −0.0359)
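The worked example can be reproduced with scipy's curve_fit, which is a Levenberg-Marquardt fit rather than the notes' Excel iteration. The starting point below (−0.08 rather than the notes' −0.80) is my choice, to keep the exponential well scaled over the data range; the converged estimates and standard errors should agree with the values above to the precision shown.

```python
import numpy as np
from scipy.optimize import curve_fit

# Data from the worked example
x = np.array([2, 5, 7, 10, 14, 19, 26, 31, 34, 38, 45, 52, 53, 60, 65],
             dtype=float)
y = np.array([54, 50, 45, 37, 35, 25, 20, 16, 18, 13, 8, 11, 8, 4, 6],
             dtype=float)

def model(x, b0, b1):
    # Expectation function Y = b0 * exp(b1 * x)
    return b0 * np.exp(b1 * x)

popt, pcov = curve_fit(model, x, y, p0=(40.0, -0.08))
b0_hat, b1_hat = popt
se = np.sqrt(np.diag(pcov))   # approximate standard errors s(b0), s(b1)
print(b0_hat, b1_hat)         # roughly 58.6 and -0.0396
print(se)                     # roughly 1.47 and 0.0017
```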
VI. References
Bates, Douglas M., and D. G. Watts (1988), Nonlinear Regression Analysis
and Its Applications. New York: Wiley.
Neter, John, W. Wasserman, and M. H. Kutner (1983), Applied Linear
Regression Models. Homewood, Illinois: Richard D. Irwin, Inc.