Nonlinear Equations: 6.1 The Problem of Nonlinear Root-Finding
Nonlinear Equations
c Department of Engineering Science, University of Auckland, New Zealand. All rights reserved. July 11, 2002
Let $x^{(k)}$ be a sequence which converges to a root $\xi$, and let $\epsilon^{(k)} = x^{(k)} - \xi$. If there
exists a number $p$ and a non-zero constant $c$ such that
$$\lim_{k \to \infty} \frac{|\epsilon^{(k+1)}|}{|\epsilon^{(k)}|^p} = c,$$ (6.1)
then $p$ is called the order of convergence of the sequence. For $p = 1, 2, 3$ the order is said to be
linear, quadratic and cubic respectively.
$$\epsilon = \frac{\delta}{|f'(\xi)|}.$$ (6.2)
This is the best error bound for any root finding method. Note that $\epsilon$ is large when the first
derivative of the function at the root, $|f'(\xi)|$, is small. In this case the problem of finding the
root is ill-conditioned. This is shown graphically in Figure 6.1.
FIGURE 6.1: [Two plots of $f(x)$ against $x$: an ill-conditioned root ($\epsilon_1$ large) and a
well-conditioned root ($\epsilon_2$ small). Image not reproduced.]
With further iterations, rounding errors will dominate and the differences will vary irregularly.
The iterations should be terminated and $x^{(k)}$ be accepted as the estimate for the root when the
following two conditions are satisfied simultaneously:
1. $|x^{(k+1)} - x^{(k)}| \ge |x^{(k)} - x^{(k-1)}|$ and (6.3)
2. $\frac{|x^{(k)} - x^{(k-1)}|}{1 + |x^{(k)}|} < \eta$. (6.4)
$\eta$ is some coarse tolerance to prevent the iterations from being terminated before $x^{(k)}$ is close
to $\xi$, i.e. before the step size $|x^{(k)} - x^{(k-1)}|$ becomes “small”. The condition (6.4) tests the
relative offset when $|x^{(k)}|$ is much larger than 1 and the absolute offset when $|x^{(k)}|$ is much
smaller than 1. In practice, these conditions are used in conjunction with the condition that the
number of iterations not exceed some user defined limit.
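The two termination conditions can be sketched as a small Python predicate. This is only an illustration: the function name `should_stop` and the argument `tol` (the coarse tolerance of (6.4)) are hypothetical, and the relation in (6.3) is assumed to be "greater than or equal", i.e. the step sizes have stopped decreasing.

```python
def should_stop(x_prev, x_curr, x_next, tol):
    """Return True when conditions (6.3) and (6.4) hold together.

    x_prev, x_curr and x_next are the iterates x(k-1), x(k) and x(k+1);
    tol is the coarse tolerance of (6.4) (a hypothetical name).
    """
    step_prev = abs(x_curr - x_prev)  # |x(k) - x(k-1)|
    step_next = abs(x_next - x_curr)  # |x(k+1) - x(k)|
    # (6.3), assumed form: the differences no longer decrease.
    not_shrinking = step_next >= step_prev
    # (6.4): the last step is small, scaled so that it behaves as a
    # relative test for large |x(k)| and an absolute test for small |x(k)|.
    small_step = step_prev / (1.0 + abs(x_curr)) < tol
    return not_shrinking and small_step
```

In a driver loop this test would normally sit alongside a cap on the number of iterations, as the text notes.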
Let $x^{(2)} = \frac{1}{2}\left(x^{(0)} + x^{(1)}\right)$ be the mid-point of the interval
$\left(x^{(0)}, x^{(1)}\right)$. Three mutually exclusive possibilities exist:
if $f\left(x^{(2)}\right) = 0$ then the root has been found;
if $f\left(x^{(2)}\right)$ has the same sign as $f\left(x^{(0)}\right)$ then the root is in the interval $\left(x^{(2)}, x^{(1)}\right)$;
if $f\left(x^{(2)}\right)$ has the same sign as $f\left(x^{(1)}\right)$ then the root is in the interval $\left(x^{(0)}, x^{(2)}\right)$.
In the last two cases, the size of the interval bracketing the root has decreased by a factor of two.
The next iteration is performed by evaluating the function at the mid-point of the new interval.
After $k$ iterations the size of the interval bracketing the root has decreased to:
$$\frac{x^{(1)} - x^{(0)}}{2^k}$$
The process is shown graphically in Figure 6.2. The bisection method is guaranteed to converge
to a root. If the initial interval brackets more than one root, then the bisection method will find one
of them.
FIGURE 6.2: [Plot of $f(x)$ against $x$ showing the bisection iterates $x^{(0)}, x^{(1)}, \ldots, x^{(5)}$.
Image not reproduced.]
Since the interval size is reduced by a factor of two at each iteration, it is simple to calculate in
advance the number of iterations, $k$, required to achieve a given tolerance, $\epsilon_0$, in the solution:
$$k = \log_2 \frac{x^{(1)} - x^{(0)}}{\epsilon_0}.$$
The bisection method has a relatively slow linear convergence.
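As an illustration of the bisection steps and of the iteration count, here is a minimal Python sketch. The names `bisect`, `bisection_iterations` and `tol` are hypothetical, and the count is rounded up to a whole number of halvings, which is an assumption about how the formula is used in practice.

```python
import math


def bisection_iterations(x0, x1, tol):
    """Halvings needed to shrink the bracket |x1 - x0| to at most tol."""
    return max(0, math.ceil(math.log2(abs(x1 - x0) / tol)))


def bisect(f, x0, x1, tol=1e-4):
    """Bisection on a bracket (x0, x1) where f changes sign."""
    f0, f1 = f(x0), f(x1)
    if f0 == 0.0:
        return x0
    if f1 == 0.0:
        return x1
    if f0 * f1 > 0.0:
        raise ValueError("f(x0) and f(x1) must have opposite signs")
    while abs(x1 - x0) > tol:
        x2 = 0.5 * (x0 + x1)  # mid-point of the current bracket
        f2 = f(x2)
        if f2 == 0.0:
            return x2  # the root has been found exactly
        if (f2 > 0.0) == (f0 > 0.0):
            x0, f0 = x2, f2  # same sign as f(x0): root lies in (x2, x1)
        else:
            x1 = x2  # same sign as f(x1): root lies in (x0, x2)
    return 0.5 * (x0 + x1)
```

For example, `bisect(lambda x: x * x - 1.0, 0.0, 3.0)` brackets the root at 1, and `bisection_iterations(0.0, 3.0, 1e-4)` gives the number of halvings known in advance for that bracket.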
Note that the secant method requires two initial function evaluations but only one new function
evaluation is made at each iteration. The secant method is shown graphically in Figure 6.3.
FIGURE 6.3: [Plot of $f(x)$ illustrating the secant method. Image not reproduced.]
The secant method does not have the root bracketing property of the bisection method since
the new estimate, $x^{(k+1)}$, of the root need not lie within the bounds defined by $x^{(k-1)}$ and $x^{(k)}$. As
a consequence, the secant method does not always converge, but when it does so it usually does
so faster than the bisection method. It can be shown that the order of convergence of the secant
method is:
$$\frac{1 + \sqrt{5}}{2} \approx 1.618.$$
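A minimal Python sketch of the secant iteration follows. The update used is the standard secant formula, $x^{(k+1)} = x^{(k)} - f(x^{(k)}) \frac{x^{(k)} - x^{(k-1)}}{f(x^{(k)}) - f(x^{(k-1)})}$, which is assumed here because the formula itself does not appear in the text above. The names `secant`, `tol` and `max_iter`, and the scaled step test used for stopping, are illustrative choices.

```python
def secant(f, x0, x1, tol=1e-4, max_iter=50):
    """Secant iteration from two starting values x0 and x1."""
    f0, f1 = f(x0), f(x1)  # the two initial function evaluations
    for _ in range(max_iter):
        if f1 == f0:
            raise ZeroDivisionError("secant line is horizontal")
        # Root of the straight line through (x0, f0) and (x1, f1).
        x2 = x1 - f1 * (x1 - x0) / (f1 - f0)
        if abs(x2 - x1) / (1.0 + abs(x2)) < tol:
            return x2
        x0, f0 = x1, f1
        x1, f1 = x2, f(x2)  # only one new evaluation per iteration
    raise RuntimeError("secant method did not converge")
```

Note that nothing keeps the new estimate inside the interval spanned by the previous two, which is why the sketch needs an iteration cap.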
FIGURE 6.4: [Plot of $f(x)$ against $x$ showing the iterates $x^{(0)}$, $x^{(1)}$, $x^{(2)}$ and $x^{(3)}$. Image not reproduced.]
FIGURE 6.5: [Plot of $f(x)$ against $x$ showing the iterates $x^{(0)}$, $x^{(1)}$, $x^{(2)}$ and $x^{(3)}$. Image not reproduced.]
FIGURE 6.6: [Plot of $f(x)$. Image not reproduced.]
Newton’s method may be derived from the Taylor series expansion of the function:
$$f(x + \delta x) = f(x) + f'(x)\,\delta x + f''(x)\frac{(\delta x)^2}{2} + \ldots$$ (6.8)
For a smooth function and small values of $\delta x$, the function is approximated well by the first two
terms. Thus $f(x + \delta x) = 0$ implies that $\delta x = -\frac{f(x)}{f'(x)}$. Far from a root the higher order terms
are significant and Newton’s method can give highly inaccurate corrections. In such cases the Newton
iterations may never converge to a root. In order to achieve convergence the starting values must be
reasonably close to a root. An example of divergence using Newton’s method is given in Figure 6.7.
Newton’s method exhibits quadratic convergence. Thus, near a root the number of significant
digits approximately doubles with each iteration. The strong convergence makes Newton’s method
attractive in cases where the derivatives can be evaluated efficiently and the derivative is continuous
and non-zero in the neighbourhood of the root (which is not the case at a multiple root, where the
first derivative vanishes).
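The Newton correction $-f(x)/f'(x)$ derived above can be sketched in Python as follows. The function name `newton`, the parameters `tol` and `max_iter`, and the scaled step test are illustrative assumptions rather than part of the original notes.

```python
def newton(f, dfdx, x, tol=1e-4, max_iter=50):
    """Newton iteration x <- x - f(x)/f'(x) from a starting value x."""
    for _ in range(max_iter):
        slope = dfdx(x)
        if slope == 0.0:
            raise ZeroDivisionError("derivative vanished at x = %r" % x)
        dx = -f(x) / slope  # correction from the first two Taylor terms
        x += dx
        if abs(dx) / (1.0 + abs(x)) < tol:
            return x
    raise RuntimeError("Newton's method did not converge")
```

Each pass costs one function and one derivative evaluation. Started far from a root, the loop may wander or hit the iteration cap, which reflects the poor global behaviour described above.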
Whether the secant method should be used in preference to Newton’s method depends upon
the relative work required to compute the first derivative of the function. If the work required to
evaluate the first derivative is greater than 0.44 times the work required to evaluate the function,
then use the secant method, otherwise use Newton’s method.
FIGURE 6.7: [Plot of $f(x)$ against $x$ showing the iterates $x^{(0)}$ and $x^{(1)}$. Image not reproduced.]
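The factor 0.44 can be recovered with a short calculation. The sketch below assumes that the cost of one derivative evaluation is $\theta$ times the cost of one function evaluation (the symbol $\theta$ is introduced here for illustration), and compares the order of convergence achieved per unit of work: the secant method gains order $1.618$ for one unit, while Newton’s method gains order $2$ for $1 + \theta$ units.

```latex
\[
  \underbrace{1.618}_{\text{secant, per unit work}}
  \;>\;
  \underbrace{2^{1/(1+\theta)}}_{\text{Newton, per unit work}}
  \iff
  1 + \theta > \frac{\ln 2}{\ln 1.618} \approx 1.44
  \iff
  \theta > 0.44 .
\]
```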
It is easy to circumvent the poor global convergence properties of Newton’s method by combining
it with the bisection method. This hybrid method uses a bisection step whenever the Newton
method takes the solution outside the bisection bracket. Global convergence is thus assured while
retaining quadratic convergence near the root. Line searches of the Newton step from $x^{(k)}$ to $x^{(k+1)}$
are another method for achieving better global convergence properties of Newton’s method.
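One way the hybrid might be arranged is sketched below in Python. This is an assumption about the details, not the notes’ own algorithm: the bracket is tightened with every function value, and a bisection step replaces the Newton step whenever the Newton estimate falls outside the current bracket. The names `newton_bisection`, `lo`, `hi`, `tol` and `max_iter` are hypothetical.

```python
def newton_bisection(f, dfdx, lo, hi, tol=1e-4, max_iter=100):
    """Newton's method safeguarded by a bisection bracket (lo, hi)."""
    f_lo, f_hi = f(lo), f(hi)
    if f_lo == 0.0:
        return lo
    if f_hi == 0.0:
        return hi
    if f_lo * f_hi > 0.0:
        raise ValueError("f(lo) and f(hi) must have opposite signs")
    x = 0.5 * (lo + hi)
    for _ in range(max_iter):
        fx = f(x)
        if fx == 0.0:
            return x
        # Tighten the bracket; f at lo always keeps the sign of f_lo.
        if (fx > 0.0) == (f_lo > 0.0):
            lo = x
        else:
            hi = x
        slope = dfdx(x)
        x_new = x - fx / slope if slope != 0.0 else None
        if x_new is None or not (lo < x_new < hi):
            x_new = 0.5 * (lo + hi)  # Newton left the bracket: bisect
        if abs(x_new - x) / (1.0 + abs(x_new)) < tol:
            return x_new
        x = x_new
    raise RuntimeError("hybrid method did not converge")
```

A function such as $x^3 - 2x + 2$ is a useful check, since plain Newton steps from some starting values leave any sensible bracket, while the safeguarded version still reaches the root.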
6.8 Examples
In the following examples, the bisection, secant, regula falsi and Newton’s methods are applied to
find the root of the non-linear function $f(x) = x^2 - 1$ in the interval $[0, 3]$. $x_L$ and $x_R$ are the left
and right bracketing values and $x_{new}$ is the new value determined at each iteration. $\epsilon$ is the distance
from the true solution and $n_f$ is the number of function evaluations required at each iteration. The
bisection method is terminated when conditions (6.3) and (6.4) are simultaneously satisfied for
$\eta = 1 \times 10^{-4}$. The remaining methods determine the solution to the same level of accuracy as the
bisection method.
The examples show the linear convergence rate of the bisection and regula falsi methods, the
better than linear convergence rate of the secant method and the quadratic convergence rate of
Newton’s method. It is also seen that although Newton’s method converges in 5 iterations, it
requires 5 function and 5 derivative evaluations, a total of 10 evaluations. This contrasts
with the secant method, which makes 9 function evaluations over its 8 iterations.
Let:
$$G = \frac{d \ln |P_n(x)|}{dx}$$ (6.11)
$$= \frac{1}{x - x_1} + \frac{1}{x - x_2} + \ldots + \frac{1}{x - x_n}$$ (6.12)
$$= \frac{P_n'(x)}{P_n(x)}$$ (6.13)
and
$$H = -\frac{d^2 \ln |P_n(x)|}{dx^2}$$ (6.14)
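$$G = \frac{1}{\alpha} + \frac{n - 1}{\beta}$$ (6.19)
$$H = \frac{1}{\alpha^2} + \frac{n - 1}{\beta^2}.$$ (6.20)
$$\alpha = \frac{n}{G \pm \sqrt{(n - 1)(nH - G^2)}}$$ (6.21)
where the sign should be taken to yield the largest magnitude for the denominator. The new estimate
of the root is obtained from the old using the update:
$$x^{(k+1)} = x^{(k)} - \alpha.$$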
6.9.1 Example
Following is a simple example that illustrates the use of Laguerre’s method. Consider the second
order polynomial:
$$P_2(x) = x^2 - 4x + 3$$
$$P_2'(x) = 2x - 4$$
$$P_2''(x) = 2$$
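1. Let $x_1^{(0)} = 0.5$, then
$$G = \frac{2x_1^{(0)} - 4}{\left(x_1^{(0)}\right)^2 - 4x_1^{(0)} + 3} = -2.4$$
$$H = G^2 - \frac{2}{\left(x_1^{(0)}\right)^2 - 4x_1^{(0)} + 3} = 4.16$$
$$\alpha = \frac{2}{G \pm \sqrt{(2 - 1)(2H - G^2)}} = -0.5$$ (choosing $-4$ as the largest denominator)
The first root is then:
$$x_1 = x_1^{(0)} - \alpha = 0.5 + 0.5 = 1$$
2. The second root is found by dividing $P_2$ by $(x - x_1) = (x - 1)$. The trivial result is that $x_2 = 3$.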
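A single Laguerre update can be sketched in Python as below. The sketch assumes the quantities used in the example: $G = P'/P$, $H = G^2 - P''/P$, a step $n / (G \pm \sqrt{(n-1)(nH - G^2)})$ with the sign giving the larger denominator, and a new estimate equal to the old one minus that step. The function name `laguerre_step` and its argument layout (callables for $P$, $P'$ and $P''$) are hypothetical. Complex arithmetic is used so that a negative argument under the square root does not fail.

```python
import cmath


def laguerre_step(n, p, dp, ddp, x):
    """One Laguerre update for a degree-n polynomial.

    p, dp and ddp are callables returning P(x), P'(x) and P''(x).
    """
    px = p(x)
    if px == 0:
        return x  # x is already a root
    G = dp(x) / px
    H = G * G - ddp(x) / px
    root = cmath.sqrt((n - 1) * (n * H - G * G))
    # Choose the sign that gives the denominator of largest magnitude.
    denom = max(G + root, G - root, key=abs)
    return x - n / denom


# The quadratic of the example, started from 0.5 (result is complex-typed).
x1 = laguerre_step(2, lambda x: x * x - 4 * x + 3,
                   lambda x: 2 * x - 4, lambda x: 2.0, 0.5)
```

The sketch does not guard against a zero denominator, which would need separate handling in a fuller implementation.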
    $p \leftarrow a_n$;
    $p' \leftarrow 0.0$;
    $p'' \leftarrow 0.0$;
    for $i$ in $n - 1 \ldots 0$ loop
        $p'' \leftarrow x\,p'' + p'$;
        $p' \leftarrow x\,p' + p$;
        $p \leftarrow x\,p + a_i$;
    end loop;
    $p'' \leftarrow 2\,p''$;
If the polynomial and its derivatives are evaluated using the individual powers of $x$, $(n^2 + 3n)/2$
operations are required for the value, $(n^2 + n - 2)/2$ for the first derivative and $(n^2 - n - 2)/2$ for
the second derivative, i.e. a total of $(3n^2 + 3n - 4)/2$ operations. Horner’s method requires $6n + 1$
operations to evaluate the polynomial and both derivatives and thus requires fewer operations when
$n > 3$. If just the function evaluation is required, then Horner’s method will require fewer operations
for all $n$.
6.9.3 Example
Following is an example of how Horner’s method can be used to evaluate
$$P_3(x) = 1 + 2x - 3x^2 + x^3$$
and its derivatives.
1. $P = 1$, $P' = 0$ and $P'' = 0$.
2. $P'' = 0 + 0$, $P' = 0 + 1$ and $P = x - 3$.
3. $P'' = 0 + 1$, $P' = x + x - 3$ and $P = x(x - 3) + 2$.
4. $P'' = x + 2x - 3$, $P' = x(2x - 3) + x(x - 3) + 2$ and $P = x(x(x - 3) + 2) + 1$.
5. $P'' = 2(3x - 3)$
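The same steps can be carried out numerically. The Python sketch below evaluates a polynomial and its first two derivatives with Horner’s recurrence; the function name `horner` and the coefficient ordering (`coeffs[i]` holding $a_i$, lowest power first) are assumptions made for the illustration.

```python
def horner(coeffs, x):
    """Return (P(x), P'(x), P''(x)) for P(x) = sum(coeffs[i] * x**i)."""
    p, dp, ddp = coeffs[-1], 0.0, 0.0  # start from the leading coefficient
    for a in reversed(coeffs[:-1]):
        ddp = x * ddp + dp  # uses the old dp, so update order matters
        dp = x * dp + p     # uses the old p
        p = x * p + a
    return p, dp, 2.0 * ddp  # the recurrence accumulates P''/2


# P3(x) = 1 + 2x - 3x^2 + x^3 evaluated at x = 2.
value, first, second = horner([1.0, 2.0, -3.0, 1.0], 2.0)
```

At $x = 2$ the closed forms give $P_3 = 1$, $P_3' = 2 - 6x + 3x^2 = 2$ and $P_3'' = 6x - 6 = 6$, which the recurrence reproduces.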
6.9.4 Deflation
Dividing a polynomial of order $n$ by a factor $(x - x_1)$ may be performed using the following
algorithm:
    $r \leftarrow a_n$;
    $a_n \leftarrow 0.0$;
    for $i$ in $n - 1 \ldots 0$ loop
        $q \leftarrow a_i$;
        $a_i \leftarrow r$;
        $r \leftarrow x_1 r + q$;
    end loop;
The coefficients of the new polynomial are stored in the array $a_i$, and the remainder in $r$.
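The deflation loop translates almost line for line into Python. In this sketch the function name `deflate` and the convention that `coeffs[i]` holds $a_i$ are assumptions; unlike the in-place version, it returns the quotient coefficients as a new list together with the remainder.

```python
def deflate(coeffs, x1):
    """Divide P(x) = sum(coeffs[i] * x**i) by (x - x1).

    Returns (quotient_coeffs, remainder), lowest power first.
    """
    a = list(coeffs)  # work on a copy rather than in place
    n = len(a) - 1
    r = a[n]
    a[n] = 0.0
    for i in range(n - 1, -1, -1):
        q = a[i]
        a[i] = r
        r = x1 * r + q
    return a[:n], r


# (x^2 - 4x + 3) / (x - 1) gives x - 3 with zero remainder.
quotient, remainder = deflate([3.0, -4.0, 1.0], 1.0)
```

A remainder close to zero is a useful check that `x1` really is a root of the polynomial being deflated.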
$$f_i(x_1, x_2, \ldots, x_n) = 0, \quad i = 1, 2, \ldots, n.$$
Newton’s method can be generalised to $n$ dimensions by examining Taylor’s series in $n$ dimensions:
$$f_i\left(x^{(k+1)}\right) = f_i\left(x^{(k)}\right) + f'_{ij}\left(x^{(k)}\right)\left(x_j^{(k+1)} - x_j^{(k)}\right) + \ldots$$
(summation over the repeated index $j$ is implied) where $f'_{ij}\left(x^{(k)}\right)$ is an $n \times n$ matrix called
the Jacobian, having elements:
$$f'_{ij}\left(x^{(k)}\right) = \frac{\partial f_i}{\partial x_j} \approx \frac{f_i\left(x^{(k)} + \delta x_j\right) - f_i\left(x^{(k)}\right)}{|\delta x_j|}$$
The vector $\delta x_j$ is a vector of zeros, except for a small perturbation in the $j$th position.
Setting $f_i\left(x^{(k+1)}\right) = 0$ gives Newton’s formula in $n$ dimensions:
$$f'_{ij}\left(x^{(k)}\right)\left(x_j^{(k+1)} - x_j^{(k)}\right) = -f_i\left(x^{(k)}\right)$$ (6.25)
or,
$$f'_{ij}\left(x^{(k)}\right)\delta_j^{(k+1)} = -f_i\left(x^{(k)}\right)$$ (6.26)
where $\delta_j^{(k+1)} = x_j^{(k+1)} - x_j^{(k)}$ is the vector of updates. Equation (6.26) is a set of $n$ linear
equations in $n$ unknowns, $\delta_j^{(k+1)}$. If the Jacobian is non-singular, then the system of linear equations
can be solved for the updates, and the new estimate is $x_j^{(k+1)} = x_j^{(k)} + \delta_j^{(k+1)}$.
Each step of Newton’s method requires the solution of a set of linear equations. For small n the
set of linear equations may be solved using LU decomposition. For large n, alternative iterative
methods may be required. As with the one-dimensional version, Newton’s method converges
quadratically if the initial estimate, x(0) , is sufficiently close to a root. Newton’s method in multiple
dimensions suffers from the same global convergence problems as its one-dimensional counterpart.
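The following Python sketch puts the pieces of an $n$-dimensional Newton iteration together for small systems. Several choices here are assumptions made for illustration: the Jacobian is approximated by forward differences with a perturbation `h` in one position at a time, the linear system is solved by Gaussian elimination with partial pivoting (standing in for the LU decomposition mentioned above), and the names `solve_linear`, `jacobian_fd` and `newton_system` are hypothetical. The test system in the usage lines is also invented.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0.0:
            raise ValueError("Jacobian is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def jacobian_fd(f, x, fx, h=1e-7):
    """Forward-difference Jacobian: perturb one position at a time."""
    n = len(x)
    jac = [[0.0] * n for _ in range(n)]
    for j in range(n):
        xp = list(x)
        xp[j] += h
        fp = f(xp)
        for i in range(n):
            jac[i][j] = (fp[i] - fx[i]) / h
    return jac


def newton_system(f, x0, tol=1e-10, max_iter=50):
    """Newton's method for f(x) = 0, with f mapping a list to a list."""
    x = list(x0)
    for _ in range(max_iter):
        fx = f(x)
        delta = solve_linear(jacobian_fd(f, x, fx), [-v for v in fx])
        x = [xi + di for xi, di in zip(x, delta)]
        if max(abs(d) for d in delta) < tol:
            return x
    raise RuntimeError("Newton's method did not converge")


# Invented example: x^2 + y^2 = 4 and x = y, started from (1, 2).
sol = newton_system(lambda v: [v[0] ** 2 + v[1] ** 2 - 4.0, v[0] - v[1]],
                    [1.0, 2.0])
```

For large $n$ the dense elimination used here would be replaced by a sparse or iterative solver, as the text suggests.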
6.10.1 Example 1
Consider finding x and y such that the following are true:
This is a contrived problem, but it serves to illustrate the application of Newton’s method in multiple
dimensions. Solving for the simultaneous roots of these equations is equivalent to finding the roots
of the single factored equation:
Inspection shows that this equation has the root $(0, -1)$.
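The first step in applying Newton’s method is to determine the form of the Jacobian and the
right hand side function vector for calculating the update vector for each iteration. These are:
$$\begin{bmatrix} \dfrac{\partial f_1(x^{(k)}, y^{(k)})}{\partial x} & \dfrac{\partial f_1(x^{(k)}, y^{(k)})}{\partial y} \\ \dfrac{\partial f_2(x^{(k)}, y^{(k)})}{\partial x} & \dfrac{\partial f_2(x^{(k)}, y^{(k)})}{\partial y} \end{bmatrix} \begin{bmatrix} \delta x^{(k+1)} \\ \delta y^{(k+1)} \end{bmatrix} = \begin{bmatrix} -f_1(x^{(k)}, y^{(k)}) \\ -f_2(x^{(k)}, y^{(k)}) \end{bmatrix}$$ (6.31)
$$\begin{bmatrix} 2(x - y - 1) & 2(y - x - 1) \\ 2(x + y + 1) & 2(x + y + 1) \end{bmatrix} \begin{bmatrix} \delta x^{(k+1)} \\ \delta y^{(k+1)} \end{bmatrix} = \begin{bmatrix} -x^2 + 2x - y^2 + 2y + 2xy - 1 \\ -x^2 - 2x - y^2 - 2y - 2xy - 1 \end{bmatrix}$$ (6.32)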
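    $k \leftarrow 0$;
    $\|\delta\| > \epsilon$;
    specify starting point: $(x^{(k)}, y^{(k)})$;
    while $\|\delta\| > \epsilon$ and $k < N$ loop
        solve (6.32) for $\delta^{(k+1)}$; $(x^{(k+1)}, y^{(k+1)}) \leftarrow (x^{(k)}, y^{(k)}) + \delta^{(k+1)}$; $k \leftarrow k + 1$;
    end loop;
Where $N$ is some specified maximum number of iterations and $\epsilon$ is some specified convergence
tolerance.
Sequences of Newton steps converging toward the root $(0, -1)$ for the starting points $(-1, -3)$,
$(2, -3)$ and $(2, 0)$ are shown in Figures 6.8 and 6.9.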
FIGURE 6.8: Sequence of Newton steps plotted on the surface defined by Equation (6.30).
[Surface plot of $f(x, y)$ over $x$ and $y$. Image not reproduced.]
FIGURE 6.9: [Plot of $f(x, y)$ against number of iterations for the starting points $(-1, -3)$, $(2, 0)$
and $(2, -3)$. Image not reproduced.]
6.10.2 Example 2
Newton’s method for finding roots in multiple dimensions becomes very useful when finding
approximate solutions to non-linear partial differential equations. Consider the diffusion equation in
one-dimension, where the diffusivity, $D$, is an exponential function of the concentration:
$$\frac{\partial c}{\partial t} - A e^{c} \frac{\partial^2 c}{\partial x^2} = 0$$ (6.33)
Notice that this is a non-linear equation whose roots are the unknown concentrations $c_{i-1}^{n+1}$, $c_i^{n+1}$
and $c_{i+1}^{n+1}$. If there are $m$ discrete nodes in our problem, then there are $m$ equations and $m$ unknown
concentrations, so Newton’s method for multiple dimensions can be applied.
The corresponding right hand side vector entry for the Newton update is $-f_i$. Note that in contrast
to the case of the linear diffusion equation, the system of discrete finite difference equations is
now time varying and must be constructed and factorised for each time step.
If Dirichlet boundary conditions of 1 and 0 are applied at nodes 1 and $m$, respectively, then the
functions whose roots are to be found are:
$$f_1 = c_1^{n+1} - 1$$ (6.37)
$$f_m = c_m^{n+1}$$ (6.38)
The right hand side vector entries for the Newton update are $1 - c_1^{n+1}$ and $-c_m^{n+1}$, respectively.
Note that the unknown dependent variables for this non-linear problem are $c^{n+1}$. Let these
unknowns be referred to simply as $c$ so that the $k$th Newton update becomes $c^{(k)}$. The algorithm
for solving this problem is similar to that of Example 1:
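    for $t$ in $0 \ldots T$ loop
        $k \leftarrow 0$;
        $\|\delta\| > \epsilon$;
        specify starting point: $c^{(k)}$; (This will be the solution from the previous time step)
        while $\|\delta\| > \epsilon$ and $k < N$ loop
            $\delta^{(k+1)} \leftarrow -f'^{-1} f$; (LU factorisation and solution)
            $c^{(k+1)} \leftarrow c^{(k)} + \delta^{(k+1)}$;
            $k \leftarrow k + 1$;
        end loop;
    end loop;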
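A compact Python sketch of one such time step is given below. The discretisation is an assumption, since the finite difference equations are not reproduced above: backward Euler in time and central differences in space, giving $f_i = (c_i - c_i^n)/\Delta t - A e^{c_i}(c_{i+1} - 2c_i + c_{i-1})/\Delta x^2$ at interior nodes, with the Dirichlet rows $f_1 = c_1 - 1$ and $f_m = c_m$. The resulting Jacobian is tridiagonal, so the Thomas algorithm is used in place of a general LU factorisation. All function and parameter names are hypothetical.

```python
import math


def thomas(lower, diag, upper, rhs):
    """Solve a tridiagonal system (lower[0] and upper[-1] are unused)."""
    n = len(diag)
    d, r = diag[:], rhs[:]
    for i in range(1, n):  # forward elimination
        w = lower[i] / d[i - 1]
        d[i] -= w * upper[i - 1]
        r[i] -= w * r[i - 1]
    x = [0.0] * n
    x[-1] = r[-1] / d[-1]
    for i in range(n - 2, -1, -1):  # back substitution
        x[i] = (r[i] - upper[i] * x[i + 1]) / d[i]
    return x


def residual_and_jacobian(c, c_old, A, dx, dt):
    """Residual f and tridiagonal Jacobian for the assumed scheme."""
    m = len(c)
    f = [0.0] * m
    lower, diag, upper = [0.0] * m, [1.0] * m, [0.0] * m
    f[0] = c[0] - 1.0  # Dirichlet condition at the first node
    f[-1] = c[-1]      # Dirichlet condition at the last node
    for i in range(1, m - 1):
        D = A * math.exp(c[i])  # concentration-dependent diffusivity
        lap = (c[i + 1] - 2.0 * c[i] + c[i - 1]) / dx ** 2
        f[i] = (c[i] - c_old[i]) / dt - D * lap
        lower[i] = -D / dx ** 2
        upper[i] = -D / dx ** 2
        diag[i] = 1.0 / dt - D * lap + 2.0 * D / dx ** 2
    return f, lower, diag, upper


def newton_time_step(c_old, A, dx, dt, tol=1e-10, max_iter=20):
    """Advance one time step, starting Newton from the previous solution."""
    c = c_old[:]
    for _ in range(max_iter):
        f, lower, diag, upper = residual_and_jacobian(c, c_old, A, dx, dt)
        delta = thomas(lower, diag, upper, [-v for v in f])
        c = [ci + di for ci, di in zip(c, delta)]
        if math.sqrt(sum(d * d for d in delta)) < tol:
            return c
    raise RuntimeError("Newton iterations did not converge")


# Illustrative values: 11 nodes, dx = 1, dt = 1, A = exp(-1), zero start.
c_new = newton_time_step([0.0] * 11, math.exp(-1.0), 1.0, 1.0)
```

Looping `newton_time_step` over successive time levels, feeding each result back in as `c_old`, gives the outer time loop of the algorithm.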
The coefficient $A$ was chosen such that $D = 1$ m$^2$ s$^{-1}$ at $c = 1$ kg m$^{-2}$. The linear solutions were
calculated using $D = 0.632$ m$^2$ s$^{-1}$. This value was chosen as the integrated mean value of $D(c)$
from $c = 0$ kg m$^{-2}$ to $c = 1$ kg m$^{-2}$. The variation of the diffusion coefficient with concentration
is shown in Figure 6.11. The non-linear diffusion coefficient is smaller than the constant diffusion
coefficient at lower concentrations. This is observed in the solutions, as the concentration profile
for the non-linear diffusion does not advance as far as for constant diffusion over the same interval
of time.
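The two numerical values can be checked directly, taking the diffusivity to be $D(c) = A e^{c}$ as in (6.33) and leaving units aside:

```latex
\[
  D(1) = A e^{1} = 1 \;\Rightarrow\; A = e^{-1} \approx 0.368,
\]
\[
  \bar{D} = \int_0^1 A e^{c}\, dc = A\,(e - 1) = 1 - e^{-1} \approx 0.632 .
\]
```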
FIGURE 6.10: Comparative solutions for constant diffusion and non-linear diffusion.
[Left panel: concentration against $x$ (m) for the linear (L) and non-linear (NL) solutions at 5 s, 10 s
and 50 s. Right panel: concentration against time (s) for the linear and non-linear solutions.
Image not reproduced.]
FIGURE 6.11: [Diffusion coefficient against concentration for $D = 0.368 \exp(c)$ and $D = 0.632$.
Image not reproduced.]
Solving this example problem produced the following non-linear iteration information over the
course of the first time step:
Time: 1 second
Iteration: 1 L2Norm(F) = 0.1000E+01 Delta=0.1026E+01
Iteration: 2 L2Norm(F) = 0.5549E-01 Delta=0.9876E+00
Iteration: 3 L2Norm(F) = 0.8958E-03 Delta=0.3755E-01
Iteration: 4 L2Norm(F) = 0.2278E-06 Delta=0.5565E-03
Iteration: 5 L2Norm(F) = 0.1481E-13 Delta=0.1438E-06
Note that F is the right hand side vector, so its magnitude (or L2 norm) may be thought of as an
indication of how good the solution at that iteration is. The solution is accurate when F is very
small. Delta is a measure of how much the solution is changing from one iteration to the next.
The Newton iterations are clearly converging.