Rungekutta
QMW preprint DYN #91-9, Int. J. Bifurcation and Chaos, 2, 427–449, 1992
A CONICET (Consejo Nacional de Investigaciones Científicas y Técnicas de Argentina)
1. Introduction
Numerical solution of ordinary differential equations is the most
important technique in continuous time dynamics. Since most or-
dinary differential equations are not soluble analytically, numeri-
cal integration is the only way to obtain information about the trajectory.
Many different methods have been proposed and used in an attempt to
solve accurately various types of ordinary differential equations. However
there are a handful of methods known and used universally (i.e., Runge–
Kutta, Adams–Bashforth–Moulton and Backward Differentiation Formulae
methods). All these discretize the differential system to produce a differ-
ence equation or map. The methods obtain different maps from the same
differential equation, but they have the same aim; that the dynamics of
the map should correspond closely to the dynamics of the differential equa-
tion. From the Runge–Kutta family of algorithms come arguably the most
well-known and used methods for numerical integration (see, for example,
Henrici [1962] , Gear [1971] , Lambert [1973], Stetter [1973] , Chua & Lin
[1975] , Hall & Watt [1976] , Butcher [1987] , Press et al. [1988] , Parker &
Chua [1989] , or Lambert [1991] ). Thus we choose to look at Runge–Kutta
methods to investigate what pitfalls there may be in the integration of
nonlinear and chaotic systems.
We examine here the initial-value problem; the conditions on the so-
lution of the differential equation are all specified at the start of the
trajectory—they are initial conditions. This is in contrast to other prob-
lems where conditions are specified both at the start and at the end of the
trajectory, when we would have a (two-point) boundary-value problem.
Problems involving ordinary differential equations can always be re-
duced to a system of first-order ordinary differential equations by intro-
ducing new variables which are usually made to be derivatives of the original variables. Thus for generality, we consider the non-autonomous initial value problem

$$\dot{x} = f(x, t), \qquad x(t_0) = x_0, \qquad (1)$$

or, written out in components,

$$\dot{x}_i = f_i(x_1, x_2, \ldots, x_n, t), \qquad i = 1, 2, \ldots, n. \qquad (2)$$

The variable $t$ often represents time. Almost all numerical methods for
the initial value problem are addressed to solving it in the above form.¹

¹We use the unorthodox notation $\dot{x} \equiv \mathrm{d}x/\mathrm{d}t$, etc., to avoid any confusion with the iterates of a
map.
A non-autonomous system can always be transformed into
an autonomous system of dimension one higher by letting
$x_{n+1} = t$, so that we add the equation $\dot{x}_{n+1} = 1$ to the system and $x_{n+1}(t_0) = t_0$
to the initial conditions. In this case, however, we will have unbounded
solutions since $x_{n+1} \to \infty$ as $t \to \infty$. This can be prevented for non-
autonomous systems that are periodic in $t$ by identifying planes of constant
$x_{n+1}$ separated by one period, so that the system is put onto a cylinder. We
are usually interested in one of the two cases above: either an autonomous
system, or a non-autonomous system that is periodic in $t$. In these cases, we
can define the concepts of the limit sets of the system and their associated
basins of attraction which are so useful in dynamics.
It is known that sufficient conditions for a unique, continuous, differentiable function $x(t)$ to exist as a solution to this problem are that $f$ be defined and continuous and satisfy a Lipschitz condition in $x$ in the region of interest. The Lipschitz condition is that

$$\| f(x, t) - f(\tilde{x}, t) \| \le L \, \| x - \tilde{x} \|. \qquad (3)$$

Here $L$ is the Lipschitz constant, which must exist for the condition to be
satisfied. We shall always assume that such a unique solution exists.
Our aim is to investigate how well Runge–Kutta methods do at mod-
elling ordinary differential equations by looking at the resulting maps as
dynamical systems. Chaos in numerical analysis has been investigated
before: the midpoint method in the papers by Yamaguti & Ushiki [1981]
and Ushiki [1982], the Euler method by Gardini et al. [1987] , the Eu-
ler method and the Heun method by Peitgen & Richter [1986] , and the
Adams–Bashforth–Moulton methods in a paper by Prüfer [1985]. These
studies dealt with the chaotic dynamics of the maps produced in their own
right, without relating them to the original differential equations.
In recent papers by Iserles [1990] and Yee et al. [1991], the connection
is examined between a map and the differential equation that it models.
Other studies by Kloeden & Lorenz [1986] , and Beyn [1987a; 1987b] , con-
centrate on showing how the limit sets of the map are related to those of
the ordinary differential equations. Sauer & Yorke [1991] use shadowing
theory to find orbits of the map which are shadowed by trajectories of the
differential equation.
We bring together here all the strands in these different papers, and
extend the examination of the connection between the map and the differ-
ential equation from our viewpoint as dynamicists. This topic has begun
to catch the awareness of the scientific community lately (see for example
Stewart [1992]), and several of the papers we discuss appeared after the
initial submission of this work; we have included comments on them in this
revised version.
2. Derivation of Runge–Kutta methods
Runge–Kutta methods compute approximations $x_n$ to $x(t_n)$, with initial values $x_0 = x(t_0)$, where $t_n = t_0 + nh$ and $h$ is the step length, using the Taylor series expansion

$$x(t_{n+1}) = x(t_n) + h \dot{x}(t_n) + \frac{h^2}{2!} \ddot{x}(t_n) + \cdots + \frac{h^p}{p!} x^{(p)}(t_n) + O(h^{p+1}), \qquad (4)$$

truncated to give a one-step method

$$x_{n+1} = x_n + h\,\phi(x_n, t_n; h), \qquad (5)$$

where (since $\dot{x} = f(x,t)$) the increment function $\phi$ is built from $s$ evaluations $g_i$ of $f$:

$$\phi(x_n, t_n; h) = \sum_{i=1}^{s} b_i g_i, \qquad (7)$$

so that

$$x_{n+1} = x_n + h \sum_{i=1}^{s} b_i g_i, \qquad (8)$$

with

$$g_i = f\Big( x_n + h \sum_{j=1}^{i-1} a_{ij} g_j, \; t_n + h c_i \Big) \qquad (9)$$

for an explicit method, or

$$g_i = f\Big( x_n + h \sum_{j=1}^{s} a_{ij} g_j, \; t_n + h c_i \Big) \qquad (10)$$

for an implicit method. For an explicit method, Eq.(9) can be solved for
each $g_i$ in turn, but for an implicit method, Eq.(10) requires the solution
of a nonlinear system for the $g_i$ at each step. The set of explicit methods may
be regarded as a subset of the set of implicit methods with $a_{ij} = 0$, $j \ge i$.
Explicit methods are obviously more efficient to use, but we shall see that
implicit methods do have advantages in certain circumstances.
For convenience, the coefficients $a_{ij}$, $b_i$ and $c_i$ of the Runge–Kutta method
can be written in the form of a Butcher array:

$$\begin{array}{c|c} c & A \\ \hline & b^{T} \end{array} \qquad (11)$$

where $c = (c_1, c_2, \ldots, c_s)^T$, $b = (b_1, b_2, \ldots, b_s)^T$ and $A = (a_{ij})$.
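The Butcher array maps directly onto code. The following Python sketch (the function name `rk_step` and the test problem are ours, not from the paper) advances one step of any explicit method given its array $(A, b, c)$, illustrated here with the classical fourth-order tableau:

```python
import numpy as np

def rk_step(f, x, t, h, A, b, c):
    """One step of an explicit Runge-Kutta method defined by its Butcher array."""
    s = len(b)
    g = np.zeros(s)
    for i in range(s):
        # Only already-computed stages enter the sum: the explicit case of Eq.(9).
        xi = x + h * sum(A[i][j] * g[j] for j in range(i))
        g[i] = f(xi, t + c[i] * h)
    return x + h * sum(b[i] * g[i] for i in range(s))

# Classical fourth-order tableau.
A = [[0, 0, 0, 0], [1/2, 0, 0, 0], [0, 1/2, 0, 0], [0, 0, 1, 0]]
b = [1/6, 1/3, 1/3, 1/6]
c = [0, 1/2, 1/2, 1]

# Test problem: dx/dt = -x, x(0) = 1, integrated to t = 1 (exact answer 1/e).
x, t, h = 1.0, 0.0, 0.1
for _ in range(10):
    x = rk_step(lambda x, t: -x, x, t, h, A, b, c)
    t += h
print(abs(x - np.exp(-1)))  # tiny truncation error
```

Because only the strictly lower-triangular part of $A$ is used, each stage depends only on stages already computed, which is exactly what distinguishes the explicit from the implicit case.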
Runge–Kutta schemes are one-step or self-starting methods; they give
$x_{n+1}$ in terms of $x_n$ only, and thus they produce a one-dimensional map
if they are dealing with a single differential equation. This may be contrasted with other popular schemes (the Adams–Bashforth–Moulton and
Backward Differentiation Formulae methods), which are multistep methods; $x_{n+1}$ is given in terms of several previous values, $x_n$ down to $x_{n-k+1}$. Multistep methods give
rise to multi-dimensional maps from single differential equations.
A method is said to have order $p$ if $p$ is the largest integer for which

$$x(t + h) - x(t) - h\,\phi(x(t), t; h) = O(h^{p+1}). \qquad (12)$$

For a method of order $p$, we wish to find values for $a_{ij}$, $b_i$ and $c_i$ so that Eq.(8) matches the first $p + 1$ terms in Eq.(4). To
do this we Taylor expand Eq.(8) about $(x_n, t_n)$ under the assumption that
$x_n = x(t_n)$, so that all previous values are exact, and compare this with Eq.(4)
in order to equate coefficients.
For example, the (unique) first-order explicit method is the well-known
Euler scheme

$$x_{n+1} = x_n + h f(x_n, t_n). \qquad (13)$$

The general two-stage explicit method is

$$x_{n+1} = x_n + h (b_1 g_1 + b_2 g_2), \quad g_1 = f(x_n, t_n), \quad g_2 = f(x_n + h a_{21} g_1, \; t_n + h c_2), \qquad (14)$$

so expanding this,

$$x_{n+1} = x_n + h (b_1 + b_2) f + h^2 b_2 (c_2 f_t + a_{21} f f_x) + O(h^3). \qquad (15)$$
We can now equate coefficients in Eqs.(15) and (21) (the latter being the Taylor expansion of the exact solution, $x(t_{n+1}) = x_n + h f + \tfrac{h^2}{2}(f_t + f f_x) + O(h^3)$) to give:

$$h:\quad b_1 + b_2 = 1, \qquad (22)$$

$$h^2:\quad b_2 c_2 = \tfrac{1}{2}, \qquad (23)$$

$$h^2:\quad b_2 a_{21} = \tfrac{1}{2}. \qquad (24)$$

This is a system with three equations in four unknowns, so we can solve
in terms of (say) $b_2$ to give a one-parameter family of explicit two-stage,
second-order Runge–Kutta methods:

$$c_1 = 0, \qquad (25)$$

$$c_2 = a_{21} = \frac{1}{2 b_2}, \qquad (26)$$

$$b_1 = 1 - b_2. \qquad (27)$$

Well-known second-order methods are obtained with $b_2 = 1/2$, $3/4$ and $1$.
When $b_2 = 0$, the method collapses to the first-order Euler method.
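As a check, the one-parameter family can be implemented directly. The Python sketch below is our own (the identification of the free parameter with the weight $b_2$ follows our reading of the derivation above); it integrates a simple test problem and verifies second-order convergence empirically:

```python
import math

def rk2_family(f, x, t, h, b2):
    """One step of the family: b1 = 1 - b2, c2 = a21 = 1/(2*b2)."""
    g1 = f(x, t)
    g2 = f(x + h * g1 / (2 * b2), t + h / (2 * b2))
    return x + h * ((1 - b2) * g1 + b2 * g2)

# Integrate dx/dt = x from x(0) = 1 to t = 1; the exact answer is e.
errs = {}
for b2 in (0.5, 0.75, 1.0):        # three well-known members of the family
    for h in (0.1, 0.05):
        x, t = 1.0, 0.0
        for _ in range(round(1.0 / h)):
            x = rk2_family(lambda x, t: x, x, t, h, b2)
            t += h
        errs[b2, h] = abs(x - math.e)
# Halving h divides each error by about 2**2 = 4: second-order convergence.
print({b2: errs[b2, 0.1] / errs[b2, 0.05] for b2 in (0.5, 0.75, 1.0)})
```

On this linear test problem all members of the family produce the same map, so the errors coincide; the differences between members only show up on nonlinear problems.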
It is easy to see that we could not have obtained a third-order method
with two stages, and in fact it is a general result that an explicit $s$-stage
method cannot have order greater than $s$, but this is an upper bound that
is realized only for $s \le 4$. The minimum number of stages necessary for
an explicit method to attain order $p$ is still an open problem. Calling this
$s_{\min}$, the present knowledge [Butcher, 1987; Lambert, 1991] is:

$p$:        1  2  3  4  5  6  7  8
$s_{\min}$: 1  2  3  4  6  7  9  11

and for $p = 9$, $12 \le s_{\min} \le 17$; for $p = 10$, $13 \le s_{\min} \le 17$.
One can see from the table above the reason why fourth-order methods are
so popular, because after that, one has to add two more stages to the method
to obtain any increase in the order. It is not known exactly how many stages
are required to obtain a ninth-order or tenth-order explicit method. We
only know that somewhere between twelve and seventeen stages will give
us a ninth-order explicit method, and somewhere between that number
and seventeen stages will give us a tenth-order explicit method. Nothing
is known for explicit methods of order higher than ten. In contrast to
explicit Runge–Kutta methods, it is known that for an implicit $s$-stage
Runge–Kutta method, the maximum possible order is $p_{\max} = 2s$ for any
$s$. It should be noted that the order of a method can change depending on
whether it is being applied to a single equation or a system, and depending
on whether or not the problem is autonomous (see, for example, Lambert
[1991] ).
Derivation of higher-order Runge–Kutta methods using the technique
above is a process involving a large amount of tedious algebraic manipulation which is both time consuming and error prone. Using computer algebra
removes the latter problem, but not the former, since finding higher-order
methods involves solving larger and larger coupled systems of polynomial
equations. This defeats Maple running on a modern workstation at $s = 5$.
To overcome this problem a very elegant theory has been developed by
Butcher which enables one to establish the conditions for a Runge–Kutta
method, either explicit or implicit, to have a given order (for example the
conditions given in Eqs.(22)–(24)). We shall merely mention here that the
theory is based on the algebraic concept of rooted trees, and we refer you to
books by Butcher [1987] , and Lambert [1991] for further details.2
²A Mathematica package implementing Butcher’s method for obtaining order conditions
is now distributed as standard with version 2 of Mathematica.
3. Accuracy
There are two types of error involved in a Runge–Kutta step: round-off
error and truncation error (also known as discretization error).
Round-off error is due to the finite-precision (floating-point) arith-
metic usually used when the method is implemented on a computer. It
depends on the number and type of arithmetical operations used in a step.
Round-off error thus increases in proportion to the total number of integra-
tion steps used, and so prevents one from taking a very small step length.
Normally, round-off error is not considered in the numerical analysis of
the algorithm, since it depends on the computer on which the algorithm
is implemented, and thus is external to the numerical algorithm. Trunca-
tion error is present even with infinite-precision arithmetic, because it is
caused by truncation of the infinite Taylor series to form the algorithm. It
depends on the step size used, the order of the method, and the problem
being solved.
An obvious requirement for a successful numerical algorithm is that it
be possible to make the truncation error involved as small as is desired by
using a sufficiently small step length: this concept is known as convergence.
A method is said to be convergent if

$$\lim_{h \to 0,\; nh = t - t_0} x_n = x(t). \qquad (28)$$

Notice that $nh$ is kept constant, so that $t_n$ is always the same point, and a
sequence of approximations $x_n$ converges to the analytic answer $x(t)$ as the
step length is successively decreased. This is called a fixed-station limit. A
concept closely related to convergence is known as consistency; a method is
said to be consistent (with the initial value problem) if

$$\phi(x, t; 0) = f(x, t). \qquad (29)$$

For a Runge–Kutta method, $\phi(x, t; 0) = \big(\sum_{i=1}^{s} b_i\big) f(x, t)$, which gives

$$\sum_{i=1}^{s} b_i = 1 \qquad (30)$$

as the necessary and sufficient condition for Runge–Kutta methods to be
consistent. Looking back at Eq.(22), we can see that we satisfied this
condition in deriving the family of second-order explicit methods, and in
fact it turns out to be automatically satisfied when the method has order
one or higher. It is known that consistency is necessary and sufficient for
convergence of Runge–Kutta methods, so all Runge–Kutta methods are
convergent. We provide a proof of this in Appendix A.1.
The two crucial concepts in the analysis of numerical error are local
error and global error. Local error is the error introduced in a single step
of the integration routine, while global error is the overall error caused by
repeated application of the integration formula. It is obviously the global
error that we wish to know about when integrating a trajectory; however, it
is not possible to estimate anything other than bounds, which are usually
orders of magnitude too large, and so we must content ourselves with
estimating the local error. Local and global error are sometimes defined
to include round-off error and sometimes not. We do not include round-off
error and to avoid any ambiguity we term the local and global error thus
defined local and global truncation error.
Global truncation error at $t_{n+1}$ is

$$E_{n+1} = x(t_{n+1}) - x_{n+1}. \qquad (31)$$

If we assume that no previous truncation errors have occurred, so that $x_n = x(t_n)$, we obtain the local truncation error

$$e_{n+1} = x(t_{n+1}) - x(t_n) - h\,\phi(x(t_n), t_n; h). \qquad (32)$$

So if the previous truncation error is zero,
the local truncation error and the global truncation error are the same.
Comparing Eq.(32) with Eq.(12), we can see that a $p$th-order method has
local truncation error $O(h^{p+1})$. We can write Eq.(31) as

$$E_{n+1} = x(t_{n+1}) - x_n - h\,\phi(x_n, t_n; h) \qquad (33)$$

$$= x(t_{n+1}) - x(t_n) - h\,\phi(x(t_n), t_n; h) + x(t_n) - x_n + h\,[\phi(x(t_n), t_n; h) - \phi(x_n, t_n; h)] \qquad (34)$$

$$= e_{n+1} + E_n + h\,[\phi(x(t_n), t_n; h) - \phi(x_n, t_n; h)], \qquad (35)$$

so that, if $\phi$ satisfies a Lipschitz condition with constant $L_{\phi}$,

$$\|E_{n+1}\| \le \|e_{n+1}\| + (1 + h L_{\phi}) \|E_n\|. \qquad (36)$$

Thus

$$\|E_n\| \le \frac{(1 + h L_{\phi})^n - 1}{h L_{\phi}} \max_{m \le n} \|e_m\| \qquad (37)$$

$$\le \frac{\mathrm{e}^{(t_n - t_0) L_{\phi}} - 1}{h L_{\phi}} \max_{m \le n} \|e_m\|, \qquad (38)$$

since $E_0 = 0$. As $e_{n+1} = O(h^{p+1})$, we can now see that the global truncation
error is $O(h^p)$.

We can write the local truncation error as

$$e_{n+1} = \psi(x(t_n), t_n)\, h^{p+1} + O(h^{p+2}), \qquad (39)$$

where the leading term is known as the principal local truncation error. An estimate of the principal local truncation error
may be based on step doubling, otherwise known as Richardson extrapolation. Each step is taken twice, once with step length $h$, and then again
with two steps of step length $h/2$. The difference between the two new values of $x_{n+1}$
gives an estimate of the principal local truncation error of the form

$$E \approx \frac{x^{(h/2)}_{n+1} - x^{(h)}_{n+1}}{2^{p} - 1}, \qquad (40)$$

where $x^{(h)}_{n+1}$ and $x^{(h/2)}_{n+1}$ are the results of the full step and of the two half-steps respectively.
A new step length can then be used for the next step to keep the principal
local truncation error within the required bounds. A better technique, be-
cause it is less computationally expensive, is to use a Runge–Kutta method
that has been specially developed to provide an estimate of the principal
local truncation error at each step. This may be done by embedding an
$s$-stage, $p$th-order method within an $(s+1)$-stage, $(p+1)$th-order method.
Runge–Kutta–Merson and Runge–Kutta–Fehlberg are examples of algo-
rithms using this embedding estimate technique. It should however be
noted that the error estimates provided by some of the commonly used
algorithms are not valid when integrating nonlinear equations. For exam-
ple, Runge–Kutta–Merson was constructed for the special case of a linear
differential system with constant coefficients, and the error estimates it
provides are only valid in that rare case. It usually overestimates the
error, which is safe but inefficient, but sometimes it underestimates the
error, which could be disastrous. Thus some care has to be taken to ensure
that the embedding algorithm used will provide suitable error estimates.
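A minimal adaptive integrator based on step doubling can be sketched as follows (a Python sketch of ours; the controller constants and helper names are illustrative, not from the paper). It uses the classical fourth-order method, so the difference of the two results is divided by $2^4 - 1 = 15$:

```python
import math

def rk4(f, x, t, h):
    k1 = f(x, t)
    k2 = f(x + h * k1 / 2, t + h / 2)
    k3 = f(x + h * k2 / 2, t + h / 2)
    k4 = f(x + h * k3, t + h)
    return x + h * (k1 + 2 * k2 + 2 * k3 + k4) / 6

def step_doubling(f, x, t, h, tol):
    """One accepted step: compare one h-step against two h/2-steps."""
    while True:
        big = rk4(f, x, t, h)
        mid = rk4(f, x, t, h / 2)
        small = rk4(f, mid, t + h / 2, h / 2)
        est = abs(small - big) / (2**4 - 1)   # error estimate for the half-steps
        if est <= tol:
            grow = 0.9 * (tol / max(est, 1e-300)) ** 0.2
            return t + h, small, h * min(2.0, grow)
        h /= 2                                # step rejected: halve and retry

f = lambda x, t: -x
t, x, h = 0.0, 1.0, 0.5
while t < 1.0:
    h = min(h, 1.0 - t)
    t, x, h = step_doubling(f, x, t, h, tol=1e-8)
print(abs(x - math.exp(-1)))   # error stays near the requested tolerance
```

The exponent $0.2 = 1/(p+1)$ in the growth factor reflects the $h^{p+1}$ scaling of the local error; the safety factor 0.9 is a conventional choice.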
As well as varying the step length, some codes based on Runge–Kutta
methods may also change between methods of different orders depending
on the error estimates being obtained. These variable-step, variable-order
(VSVO) Runge–Kutta based codes are at present the last word in numerical
integration.
The fact that codes based on Runge–Kutta methods use estimates of the
principal local truncation error, which is proportional to $h^{p+1}$, rather than
estimates of the local truncation error, which is $O(h^{p+1})$, can be significant.
The principal local truncation error is usually large in comparison with
the other parts of the local truncation error, in which case we are justified
in using principal local truncation error estimates to set the step length.
However, this is not always the case, and so one should be wary. The phe-
nomenon of B-convergence [Lambert, 1991] shows that the other elements
in the local truncation error can sometimes overwhelm the principal local
truncation error, and the code could then produce incorrect results without
informing the user.
4. Absolute Stability
If the step length used is too small, excessive computation time and
round-off error result. We should also consider the opposite case, and
ask whether there is any upper bound on step length. Often there
is such a bound, and it is reached when the method becomes numerically
unstable: the numerical solution produced no longer corresponds qualita-
tively with the exact solution because some bifurcation has occurred.
The traditional criterion for ensuring that a numerical method is stable
is called absolute stability. Absolute-stability analysis of Runge–Kutta and
other numerical methods is carried out using the linear model problem

$$\dot{x} = \lambda x. \qquad (41)$$
Applying, for example, a two-stage second-order method to this problem, so that

$$x_{n+1} = \left( 1 + h\lambda + \frac{(h\lambda)^2}{2} \right) x_n, \qquad (45)$$

the stability function is

$$R(h\lambda) = \frac{x_{n+1}}{x_n}, \qquad (46)$$

and the fixed point at the origin is stable when

$$\left| 1 + h\lambda + \frac{(h\lambda)^2}{2} \right| \le 1. \qquad (47)$$

In general, recall that for an explicit method of order $p$, we wished to
have Eq.(8) matching Eq.(4) up to terms of $O(h^p)$. Substituting $f(x, t) = \lambda x$ into Eq.(5),
for an explicit $p$-stage method of order $p$ (which is only possible for
$p \le 4$), the stability function is

$$R(h\lambda) = 1 + h\lambda + \frac{(h\lambda)^2}{2!} + \cdots + \frac{(h\lambda)^p}{p!}. \qquad (49)$$
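These stability functions are easy to explore numerically. The short Python sketch below (our own construction) bisects along the negative real axis for the point where $|R(h\lambda)| = 1$, recovering the familiar real-axis stability boundaries:

```python
import math

def R(z, p):
    """Stability function of an explicit p-stage, pth-order method (p <= 4)."""
    return sum(z**k / math.factorial(k) for k in range(p + 1))

bounds = {}
for p in (1, 2, 3, 4):
    lo, hi = -3.0, -1.0          # |R| > 1 at -3 and |R| < 1 at -1 for p = 1..4
    for _ in range(60):
        mid = (lo + hi) / 2
        if abs(R(mid, p)) > 1:
            lo = mid
        else:
            hi = mid
    bounds[p] = lo
print(bounds)  # about {1: -2.0, 2: -2.0, 3: -2.513, 4: -2.785}
```

The fourth-order boundary near $-2.785$ is the relevant one for the stiff examples later in the paper: the step length must keep $h\lambda$ inside this interval for every eigenvalue $\lambda$.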
5. Nonlinear Absolute Stability
Using a linear model problem, the Runge–Kutta map is also linear.
This means that the Runge–Kutta method is bound to have only
one fixed point, as has the model problem. The basin of attraction
is bound to be infinite if the fixed point is attractive, and merely the point
itself otherwise, as in the model problem. This need not be the case with
a nonlinear model problem; a Runge–Kutta method which has a certain
absolute-stability region with the linear model problem could have quite
a different region of stability with a nonlinear problem. The conventional
absolute-stability analysis can be extended to nonlinear model problems
as long as they have a stable fixed point. In this case a nonlinear absolute-
stability test can be carried out in the same way as the linear absolute-
stability test, by finding the values of $h\lambda$ in complex space at which the fixed
point loses stability. For example, let us look at the simplest nonlinear
equation, the logistic equation

$$\dot{y} = \lambda y (1 - y). \qquad (51)$$
with a similar linear coordinate transformation.) Now Eq.(54) is immediately recognizable as giving the Mandelbrot set when iterated with $z$ and
$c$ complex and the initial value, $z_0$, set to the critical point of the map,
$z_0 = 0$. The Mandelbrot set is then the set of complex $c$ values for which the
orbit remains bounded. From Julia and Fatou, it is known that the basin
of attraction of any finite attractor will contain the critical point (see, for
example, Devaney [1989] ), so the Mandelbrot set catalogues the parameter
values for which a finite attractor exists. Other initial conditions may not
fall in the basin of attraction of a finite attractor even if one exists; thus the
Mandelbrot set is the maximum region in parameter space for which orbits
can remain bounded. That is to say that using other initial conditions will
lead to subsets of the Mandelbrot set.
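This membership computation is straightforward to sketch in Python (our own construction; the iteration count and escape bound are arbitrary choices): iterate the Euler–logistic map $y_{n+1} = y_n + h\lambda\, y_n(1 - y_n)$ from its critical point, which a short calculation gives as $y_0 = (1 + h\lambda)/(2h\lambda)$, and test for escape.

```python
def bounded(hl, max_iter=200, bound=1e6):
    """Does the Euler-logistic orbit from the critical point stay bounded?
    hl is the complex product h*lambda; bound and max_iter are arbitrary."""
    y = (1 + hl) / (2 * hl)              # critical point of the map
    for _ in range(max_iter):
        y = y + hl * y * (1 - y)
        if abs(y) > bound:
            return False
    return True

print(bounded(-1.0 + 0j))   # centre of the circle |hl + 1| < 1 -> True
print(bounded(1.0 + 0j))    # centre of the circle |hl - 1| < 1 -> True
print(bounded(-2.1 + 0j))   # inside a period-2 bud -> True
print(bounded(-4.5 + 0j))   # far outside the set -> False
```

Scanning `bounded` over a grid of complex $h\lambda$ values reproduces the set shown in Fig. 2.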
The Mandelbrot set for Eq.(53) is shown in Fig. 2. The set itself is shown
in red and the different coloured regions around it indicate the speed of escape to infinity. One can see the two nonlinear absolute-stability regions
mentioned earlier: the circle of radius 1 and centre $-1$, which contains all
values of $h\lambda$ for which the fixed point at $y = 0$ is stable, and the circle of
radius 1 and centre $+1$, containing the values of $h\lambda$ for which the fixed
point at $y = 1$ is stable. These circles map to the cardioid of the well-known
Mandelbrot set of Eq.(54) under the coordinate transformation given above.
The successively smaller circles further along the real axis in both directions are of periods 2, 4, 8, …; this is the well-known period-doubling
cascade of the logistic map. Off the real axis the largest buds on the main
circles are of period 3. Periods 4 and 5 are the next most prominent. We
can see that the breakdown of nonlinear absolute stability on moving from
inside to outside the main circles will not necessarily immediately lead to
divergence and overflow in the computer. The result will depend on the
point at which $h\lambda$ crosses the boundary in complex space, but it might well
enter one of the buds surrounding the main circles, for which the attractor
has a higher period. The attracting set for initial conditions other than
the critical point (which is $y_0 = (1 + h\lambda)/(2h\lambda)$ in these coordinates) is a
subset of the Mandelbrot set, since even if $h\lambda$ lies inside the boundary of
the Mandelbrot set, there is no guarantee that the orbit will converge to an
attractor other than infinity, because the basin boundaries of the attractors
are finite. The basin boundary at the point $h\lambda$ is known as the Julia set of
$h\lambda$.
Here we start to see the big difference between this model problem and
the previous linear one. We have finite basins of attraction in this nonlinear
problem, so to arrive at the required fixed-point solution, not only must one
use a sufficiently small step length, but one must also start within the Julia
set at that value of $h\lambda$. That is not all; it is possible for the Runge–
Kutta map of an autonomous problem to have a set of fixed points that is
larger than the set of fixed points of the differential equation [Iserles, 1990;
Yee et al., 1991]. This is obvious for an explicit method if $f$ in $\dot{x} = f(x)$
is a polynomial, since the Runge–Kutta map will be a higher-degree
polynomial than $f$ due to the construction of the Runge–Kutta method,
and so must have more fixed points. In fact, the fixed-point set of the
Figure 2: The Mandelbrot set for the map $y_{n+1} = y_n + h\lambda\, y_n (1 - y_n)$, which
arises from applying the Euler method to the logistic equation, is shown
in red. The different coloured regions surrounding it indicate the speed of
escape to infinity at that point of complex $h\lambda$-space. The two large circles
in this Mandelbrot set map to the prominent cardioid seen in the normal
parameterization $z_{n+1} = z_n^2 + c$ of the Mandelbrot set.
Runge–Kutta map contains the fixed-point set of the differential equation
as a subset. If $\bar{x}$ is a fixed point of the differential equation then $f(\bar{x}) = 0$. The Runge–Kutta map

$$x_{n+1} = x_n + h \sum_{i=1}^{s} b_i g_i \qquad (55)$$

has fixed points given by

$$\sum_{i=1}^{s} b_i g_i = 0, \qquad (56)$$

where

$$g_i = f\Big( x_n + h \sum_{j=1}^{i-1} a_{ij} g_j \Big) \qquad (57)$$

for an explicit method. Now if $f(x_n) = 0$, then $g_1 = 0$ and thus $g_i = 0$ for
all $i$, so every fixed point of the differential equation is a fixed point of the map; a similar argument applies for an implicit method.
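The extra fixed points are easy to exhibit numerically. For the two-stage family applied to the logistic equation, the fixed-point condition (56) is a quartic in $y$, so its four roots can be recovered by sampling and exact polynomial interpolation. The Python sketch below is our own construction (the parameter values are arbitrary); it returns the roots other than the true fixed points 0 and 1:

```python
import numpy as np

def ghost_fixed_points(h, lam=1.0, b2=1.0):
    """Roots of the fixed-point condition of the two-stage map for the
    logistic equation, excluding the true fixed points y = 0 and y = 1."""
    f = lambda y: lam * y * (1 - y)
    def phi(y):                          # fixed points satisfy phi(y) = 0
        g1 = f(y)
        g2 = f(y + h * g1 / (2 * b2))
        return (1 - b2) * g1 + b2 * g2
    ys = np.linspace(-2.0, 3.0, 5)       # phi is a quartic: 5 samples fix it
    roots = np.roots(np.polyfit(ys, phi(ys), 4))
    return [r for r in roots if min(abs(r), abs(r - 1)) > 1e-8]

print(ghost_fixed_points(h=0.5))  # two ghost roots (y = 4 and y = 5 here)
```

Repeating this for decreasing $h$ shows the ghost roots receding to infinity, in agreement with Fig. 3.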
Figure 3: (a) & (b). The two ghost fixed points of two-stage explicit Runge–Kutta maps of the logistic equation are shown as functions of $h$ and the
Runge–Kutta parameter $b_2$. Notice that they tend to infinity as $h \to 0$,
and that they are in general dependent on $h$ and $b_2$, whereas the real
fixed points of the logistic equation, 0 and 1, are independent of these
parameters.
Since consistency tells us that $\phi(x, t; 0) = f(x, t)$, we know that there
must, for an irregular method, be fewer fixed points when the step length
is zero than when it is nonzero. One can ask what happens to the ghost
fixed points as the step length tends to zero. Figure 3 shows that in this case
the ghost fixed points tend to infinity as the step length decreases. In Fig. 4
we show the nonlinear absolute-stability regions of all four fixed points for
the Runge–Kutta method with $b_2 = 1$. (It is interesting that whereas the
two absolute-stability regions of the real fixed points are independent of
$b_2$, the two regions of the ghost fixed points are not; we choose $b_2 = 1$ as
our example.) The union of these four regions is the part of the Mandelbrot
set for this map in which iterates tend to a fixed point. In addition to
these regions, the Mandelbrot set will have further regions where periodic
orbits of period greater than one are stable, similar to the buds off the most
prominent circles in the Mandelbrot set shown in Fig. 2.
In Fig. 5, we show a bifurcation diagram for a fourth-order Runge–
Kutta scheme integrating the logistic equation. What we are doing here
is just looking along the real axis of the Mandelbrot set for this case; we
are keeping $h\lambda$ real. The complete Mandelbrot set would be much more
difficult to compute than the quadratic case of Fig. 2, since one would have
to follow fifteen critical points. One can however say that it would have
the fourth-order absolute-stability region of Fig. 1 and its mirror image in
the imaginary axis as subsets, in a similar fashion to the circles of Fig. 2.
The map is a sextodecic (sixteenth-degree) polynomial in $y_n$, but one can
see that period doubling leading to chaos and eventually escape to infinity
occurs along the real axis in a similar way to the cascade in the logistic
map. Since the fourth-order absolute-stability regions are larger than the
first-order ones, the behaviour remains stable up to larger step lengths
here than in the logistic-map case.
For another example of the appearance of ghost fixed points, we integrate the equation $\dot{y} = \cos y$, which has real fixed points at $y = (2m+1)\pi/2$,
where $m$ is an integer, with a second-order explicit Runge–Kutta method.
In addition to the real fixed points, we also get ghost fixed points, which are
the roots of

$$(1 - b_2) \cos y + b_2 \cos\!\Big( y + \frac{h \cos y}{2 b_2} \Big) = 0. \qquad (61)$$

We plot a pair of these ghost fixed points against $h$ and $b_2$ in Fig. 6. The
ghost fixed points, which are stable, come together and coalesce at a nonzero
$h$. At smaller values of $h$, the ghost fixed points are imaginary even when
the other variables are real. The pattern shown in Fig. 6 is repeated
periodically in $y$ for all values of $b_2$. (There are no ghost fixed points
for $b_2 = 0$, since in this case we have the first-order Euler method.)
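The coalescence is easy to reproduce numerically. The Python sketch below is our own (it uses the two-stage family reconstructed earlier, with $b_2 = 1$); it counts sign changes of the fixed-point condition on $[0, \pi]$. For small $h$ only the real fixed point $\pi/2$ is found, while for larger $h$ the ghost pair appears beside it:

```python
import math

def F(y, h, b2=1.0):
    """Fixed-point condition of the two-stage map for dy/dt = cos(y)."""
    g1 = math.cos(y)
    g2 = math.cos(y + h * g1 / (2 * b2))
    return (1 - b2) * g1 + b2 * g2

def count_roots(h, lo=0.0, hi=math.pi, n=4000):
    """Count sign changes of F on [lo, hi]; each is a fixed point of the map."""
    count, prev = 0, F(lo, h)
    for i in range(1, n + 1):
        cur = F(lo + (hi - lo) * i / n, h)
        if prev * cur < 0:
            count += 1
        prev = cur
    return count

print(count_roots(h=1.0), count_roots(h=3.0))  # -> 1 3
```

With $b_2 = 1$ the condition reduces to $\cos(y + (h/2)\cos y) = 0$, and the ghost pair is born where $y + (h/2)\cos y$ ceases to be monotonic, i.e. at $h = 2$.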
Although ghost fixed points have been known about for some time [Yamaguti & Ushiki, 1981; Ushiki, 1982; Prüfer, 1985], it is only recently that
it has been appreciated that, in some cases, they exist for all step lengths,
i.e., at step lengths below the linear absolute-stability boundary [Yee et
al., 1991] . Thus we can see that irregularity in the numerical method can
[Figure 5: bifurcation diagram, $Y_n$ against $h\lambda$.]
be a serious problem. Convergence, because it is a limit concept, ties the
dynamics of the map from the numerical method only loosely to that of the
differential equation, leaving room for major differences to occur. These
differences manifest themselves in ghost fixed points. As we have seen,
the ghost fixed points must disappear when the step length is zero, but
they may be present for all nonzero step lengths. They can be stable for
arbitrarily small step lengths, in which case a trajectory may converge to
a fixed point which does not exist in the original system. Even if they
are unstable, they still greatly affect the dynamics of the discrete system
compared to the continuous original. The difference between linear and
nonlinear absolute-stability regions is that basin boundaries are infinite in
the linear case, but finite in the nonlinear case. Thus convergence to the
fixed point is guaranteed if $h\lambda$ is within the linear absolute-stability region,
whereas this is not true in the nonlinear case since, in addition, $y_0$ must
be inside the Julia set.
6. Stiff Problems
Often, accuracy requirements that set a bound on the local truncation error keep the step length well within the region of stability.
When this is not the case, and maximum step length is dictated by
the boundary of the stability region, the problem is said to be stiff.
Traditionally, a linear stiff system $\dot{x} = A x$ of dimension $n$ was defined by

$$\operatorname{Re} \lambda_i < 0, \qquad i = 1, 2, \ldots, n, \qquad (62)$$

with

$$\max_i |\operatorname{Re} \lambda_i| \gg \min_i |\operatorname{Re} \lambda_i|, \qquad (63)$$

where the $\lambda_i$ are the eigenvalues of $A$.
We can write this as $\dot{x} = A x$, where

$$A = \begin{pmatrix} 0 & 1 \\ -1000 & -1001 \end{pmatrix}, \qquad (66)$$

so the eigenvalues are $\lambda_1 = -1$ and $\lambda_2 = -1000$. The equation has solution

$$x = c_1 \mathrm{e}^{-t} + c_2 \mathrm{e}^{-1000t}, \qquad (67)$$

so we would expect to be able to use a large step length after the $\mathrm{e}^{-1000t}$
transient term had become insignificant in size, but in fact the presence
of the large negative eigenvalue $\lambda_2$ prevents this, since $h\lambda_2$ would then lie
outside the absolute-stability region. With appropriate initial conditions,
one could even remove the $\mathrm{e}^{-1000t}$ term from the solution entirely, but this
would not change the fact that step length is dictated here by the size of
$\lambda_2$; one would still have to use a very small step length throughout the
calculation.
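The effect of the large eigenvalue is easy to demonstrate (a Python sketch of ours; the step lengths are chosen so that $h\lambda_2$ lies just inside and just outside the fourth-order stability region, whose real-axis boundary is at about $-2.785$):

```python
import numpy as np

A = np.array([[0.0, 1.0], [-1000.0, -1001.0]])   # eigenvalues -1 and -1000

def rk4_step(x, h):
    k1 = A @ x
    k2 = A @ (x + h * k1 / 2)
    k3 = A @ (x + h * k2 / 2)
    k4 = A @ (x + h * k3)
    return x + h * (k1 + 2 * k2 + 2 * k3 + k4) / 6

norms = {}
for h in (0.001, 0.004):
    x = np.array([1.0, 0.0])
    for _ in range(round(1.0 / h)):
        x = rk4_step(x, h)
    norms[h] = float(np.linalg.norm(x))
print(norms)  # h = 0.001 decays quietly; h = 0.004 gives h*lambda_2 = -4,
              # outside the stability region, and the solution explodes
```

The exact solution decays smoothly in both cases; it is purely the numerical stability boundary, not accuracy, that forces the tiny step length.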
Now let us look at a nonlinear stiff problem. Take the logistic equation

$$\dot{y} = \lambda y (1 - y). \qquad (68)$$

This has solution

$$y = \frac{y_0}{y_0 + (1 - y_0)\mathrm{e}^{-\lambda t}}. \qquad (69)$$
Figure 7: Results of numerical integration of the van der Pol equation
$\ddot{y} - \lambda(1 - y^2)\dot{y} + y = 0$ with $\lambda = 1$ and $\lambda = 100$, using a variable-step fourth-order Runge–Kutta method. Each step is represented by a cross. The
far greater number of steps taken in the latter case, despite the greater
smoothness of the computed solution, shows the presence of stiffness. The
steps are so small at $\lambda = 100$ that the individual crosses merge to form a
continuous broad line on the graph.
Figure 8: Results of numerical integration of the forced van der Pol
equation $\ddot{y} - \lambda(1 - y^2)\dot{y} + y = A \cos \omega t$ with $\lambda = 100$, $A = 10$, and $\omega = 1$,
using a variable-step fourth-order Runge–Kutta method, with each step
represented by a cross. It is obviously stiff, since as in the $\lambda = 100$ case in
Fig. 7, the steps are so small that they have merged together in the picture.
The chaotic nature of the forced van der Pol equation is not apparent at
this timescale; the manifestation of chaos in this system lies in the random
selection of one of two possible periods for each relaxation oscillation, so
this picture, which shows only part of one cycle, cannot display chaos.
This forced van der Pol equation exhibits chaotic behaviour (see, for exam-
ple, Tomita [1986] , Thompson & Stewart [1986] , or Jackson [1989] ) and is
also stiff, as can be seen in Fig. 8. The presence of fast and slow time scales
in a problem is a characteristic of stiffness. Stiff problems are not mere
curiosities, but are common in dynamics and elsewhere [Aiken, 1985] .
When integrating a stiff problem with a variable-step Runge–Kutta
code, the initial step length chosen, which often causes the method to be
at or near numerical instability, generally leads to a large local truncation
error estimate. This then causes the routine to reduce the step length,
often substantially, until the principal local truncation error is brought
back within its prescribed bound. The routine then integrates the problem
successfully, but uses a far greater number of steps than seems reasonable,
given the smoothness of the solution. Because of this, round-off error and
computation time are a problem when using conventional techniques to
integrate stiff problems.
It would seem to be especially desirable then, for methods of integra-
tion for stiff problems, that the method be stable for all step lengths for the
parameter values where the original system is stable. For example, the linear model problem Eq.(41) is stable for $\operatorname{Re} \lambda < 0$, so the numerical method
should be stable for all $h$ for $\operatorname{Re} \lambda < 0$, i.e., the absolute-stability region
should be the left half-plane. The concept of A-stability was introduced
for this reason. A method is A-stable if its linear absolute-stability region
contains the whole of the left half-plane. This being the case, a numeri-
cal method integrating the linear model problem will converge to the fixed
point for all values of that the model problem itself does, and for all values
of . A-stability is a very severe requirement for a numerical method: we al-
ready know that explicit Runge–Kutta methods cannot fulfill this require-
ment since their absolute-stability regions are finite. It is known, how-
ever, that some implicit Runge–Kutta methods are A-stable [Butcher, 1987;
Lambert, 1991]. The drawback with implicit methods is that at each step
a system of nonlinear equations must be solved. This is usually achieved
using a Newton–Raphson algorithm, but at the expense of many more
function evaluations than are necessary in the explicit case. Consequently,
implicit Runge–Kutta methods are uneconomical compared to rival meth-
ods for integrating stiff problems. Usually, stiff problems are instead solved
using Backward Differentiation Formulae (also known as Gear) methods.
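For contrast with the explicit example above, here is a backward Euler sketch of ours (backward Euler is the simplest A-stable implicit method; for a linear problem the Newton iteration mentioned above reduces to a single linear solve per step, so the cost objection does not yet bite):

```python
import numpy as np

A = np.array([[0.0, 1.0], [-1000.0, -1001.0]])   # the stiff linear system again
I = np.eye(2)

def backward_euler_step(x, h):
    # Solve x_new = x + h*A@x_new, i.e. (I - h*A) x_new = x; for linear f
    # the Newton iteration collapses to this one linear solve.
    return np.linalg.solve(I - h * A, x)

x, h = np.array([1.0, 0.0]), 0.1   # step length far beyond the explicit limit
for _ in range(10):
    x = backward_euler_step(x, h)
print(np.linalg.norm(x))           # bounded and decaying: no instability
```

A step length of $0.1$ puts $h\lambda_2 = -100$ far outside any explicit method's stability region, yet the implicit solution decays just as the exact one does.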
A-stability is not quite what we required above however, since it is based
on linear absolute stability, and also because it allows regions in addition
to the left half-plane to be in the absolute-stability region, so that the nu-
merical method may give a convergent solution when the exact solution is
diverging. Better then is what has been called precise A-stability, which
holds that the absolute-stability region should be just the left half-plane.
Precise A-stability though is still based on linear absolute-stability theory.
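These stability notions can be checked directly from a method's stability function R(z), z = hλ: one step applied to the linear model problem multiplies the solution by R(z), so the linear absolute-stability region is the set where |R(z)| ≤ 1. A minimal sketch (the stability functions below are the standard ones for the explicit Euler, classical fourth-order Runge–Kutta, and implicit midpoint methods; the sample points are arbitrary):

```python
# Stability functions R(z), z = h*lambda, for methods applied to x' = lambda*x.
def R_euler(z):          # explicit Euler: x_{n+1} = (1 + z) x_n
    return 1 + z

def R_rk4(z):            # classical fourth-order Runge-Kutta
    return 1 + z + z**2/2 + z**3/6 + z**4/24

def R_implicit_midpoint(z):  # implicit midpoint rule, A-stable
    return (1 + z/2) / (1 - z/2)

def in_stability_region(R, z):
    return abs(R(z)) <= 1

# Sample points on the negative real axis: the model problem is stable for
# all of them, but the explicit methods leave their finite stability
# regions once |z| is large, while the implicit midpoint rule never does.
for z in [-0.5, -2.5, -3.0, -10.0]:
    print(z, in_stability_region(R_euler, z),
             in_stability_region(R_rk4, z),
             in_stability_region(R_implicit_midpoint, z))
```

On the real axis explicit Euler is stable only for −2 ≤ z ≤ 0 and RK4 only down to z ≈ −2.79, whereas |R(z)| < 1 for the implicit midpoint rule throughout the open left half-plane, which is the precise A-stability condition.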
7. A Nonlinear Stability Theory
In the past few years numerical analysts have come to realize that
linear stability theory cannot be applied to nonlinear systems. One
cannot say that the Jacobian represents the local behaviour of the
solutions except at a fixed point. This had not previously been appreciated
in numerical analysis, and there was a tendency to believe that, treating
the Jacobian at a point as constant, the nearby solutions would behave
like the linearized system produced from this ‘frozen’ Jacobian.
Numerical analysts have now recognized the failings of linear stability
theory when applied to nonlinear systems, and have constructed a new
theory of nonlinear stability.
The theory looks at systems that have a property termed contractivity:
if x₁(t) and x₂(t) are any two solutions of the system satisfying
different initial conditions, then the system is contractive if

    ‖x₂(t) − x₁(t)‖ ≤ ‖x₂(t₁) − x₁(t₁)‖  for all t ≥ t₁.  (72)

The former insists that max λᵢ ≤ 0 (λᵢ being the Lyapunov exponents of the
system).
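A hypothetical numerical check of contractivity, via the standard one-sided condition ⟨x₂ − x₁, f(x₂) − f(x₁)⟩ ≤ 0 that guarantees the distance between any two exact solutions is nonincreasing; the test system f(x) = −x is an assumption chosen for illustration:

```python
import numpy as np

def f(x):
    # A contractive test system: for f(x) = -x the one-sided condition
    # <x2 - x1, f(x2) - f(x1)> = -|x2 - x1|^2 <= 0 holds everywhere.
    return -x

def one_sided(x1, x2):
    return np.dot(x2 - x1, f(x2) - f(x1))

rng = np.random.default_rng(0)
pairs = [(rng.standard_normal(3), rng.standard_normal(3)) for _ in range(100)]
assert all(one_sided(a, b) <= 0 for a, b in pairs)

# Consequence: two trajectories started apart draw together, here under a
# small-step Euler iteration well inside the stability region (h = 0.1).
h, x1, x2 = 0.1, np.array([1.0, 0.0, 0.0]), np.array([0.0, 2.0, 0.0])
d0 = np.linalg.norm(x2 - x1)
for _ in range(50):
    x1, x2 = x1 + h * f(x1), x2 + h * f(x2)
assert np.linalg.norm(x2 - x1) < d0
```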
8. Towards a Comprehensive Stability Theory
We are still seeking a comprehensive nonlinear stability theory.
The theory of the previous section deals only with the special
case of contractive systems. Nonlinear absolute stability shows
that regularity in the numerical method is obviously a good thing, but it is
only concerned with fixed-point behaviour. Other studies have been made
on the link between the fixed points in the differential equation and those
in the map produced by the numerical method. Stetter [1973] has shown
that hyperbolic stable fixed points in the continuous system remain as
hyperbolic stable fixed points in the discrete system for sufficiently small
step lengths. Beyn [1987b] shows that hyperbolic unstable fixed points are
also correctly represented in the discrete system for sufficiently small step
lengths, and that the local stable and unstable manifolds converge to those
of the continuous system in the limit as the step length tends to zero. We
need to look at other sorts of asymptotic behaviour apart from fixed points:
invariant circles (limit cycles) and strange attractors.
Peitgen & Richter [1986] use two different Runge–Kutta methods, the
Euler method and the two-stage, second-order Heun method, to discretize
the Lotka–Volterra equations

    ẋ₁ = x₁(1 − x₂),
    ẋ₂ = x₂(x₁ − 1).  (77)
In the previous example, we have seen that a non-structurally-stable
configuration of invariant circles in a system of ordinary differential equa-
tions collapses to a system with one invariant circle under the discretization
imposed. Beyn [1987a] shows that for a continuous system with a hyper-
bolic invariant circle, the invariant circle is retained in the discretization
if the step length is small enough. He demonstrates that the continuous
and discrete invariant circles run out of phase. The relative phase shift
per revolution depends on the step length, and the global error oscillates
as the discrete and continuous systems move into and out of phase. Beyn
gives examples of integrating systems that have invariant circles with the
Euler method and with a fourth-order Runge–Kutta method. He shows
that when the step length is increased, the invariant circles in the discrete
system are transformed in the former case into a strange attractor, and in
the latter, into a stable fixed point via a Hopf bifurcation.
The work reviewed in the previous paragraphs shows that under cer-
tain reasonable conditions, the numerical method can correctly reproduce
different kinds of limit sets that are present in the differential equation.
The conditions include asking that the behaviour in the continuous system
be structurally stable. This condition is not satisfied for the Lotka–Volterra
example above, which is why the numerical methods used do not correctly
reproduce the behaviour of that system.
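The failure is easy to see numerically. Assuming the normalized form ẋ₁ = x₁(1 − x₂), ẋ₂ = x₂(x₁ − 1), the continuous flow conserves V = x₁ − ln x₁ + x₂ − ln x₂, so exact trajectories are closed curves; an Euler discretization drifts across them, as this sketch shows:

```python
import math

def lv(x1, x2):
    # Normalized Lotka-Volterra vector field (an assumed normalization).
    return x1 * (1 - x2), x2 * (x1 - 1)

def V(x1, x2):
    # Conserved quantity of the continuous flow: dV/dt = 0 on exact orbits.
    return x1 - math.log(x1) + x2 - math.log(x2)

h, x1, x2 = 0.05, 1.2, 1.0
v0 = V(x1, x2)
for _ in range(2000):
    dx1, dx2 = lv(x1, x2)
    x1, x2 = x1 + h * dx1, x2 + h * dx2
# Under Euler the 'conserved' quantity grows steadily: the family of closed
# orbits of the continuous system is destroyed by the discretization.
print(v0, V(x1, x2))
```

The per-step change in V is (h²/2) fᵀ(∇²V)f + O(h³), which is positive since ∇²V is positive definite on the positive quadrant, so the numerical orbit spirals outwards from the centre.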
The problem with discretization, however, is that it introduces new
limit-set behaviour in addition to that already existing in the continuous
system. This is highlighted by the work of Kloeden & Lorenz [1986] who
show that if a continuous system has a stable attracting set then, for
sufficiently small step lengths, the discrete system has an attracting set
which contains the continuous one. (An attracting set need not contain a
dense orbit, which is what distinguishes it from an attractor.) In particular,
this is shown by the demonstration that the fixed-point set of the continuous
system is a subset of the fixed-point set of the discrete system, with extra
ghost fixed points appearing in the discretization.
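A small experiment along these lines (the logistic vector field ẋ = x(1 − x) and the two-stage Heun method are illustrative assumptions, not an example from the text): genuine fixed points of the flow remain fixed points of the Heun map for every step length, since f = 0 makes the update vanish, but for large enough h the map acquires an extra, ghost fixed point where f(x) and f(x + hf(x)) cancel.

```python
def f(x):
    return x * (1 - x)        # logistic vector field; fixed points x = 0, 1

def heun(x, h):
    # Two-stage, second-order Heun step.
    return x + 0.5 * h * (f(x) + f(x + h * f(x)))

# Genuine fixed points persist for any step length...
for h in (0.1, 1.0, 2.1):
    assert abs(heun(0.0, h)) < 1e-12 and abs(heun(1.0, h) - 1.0) < 1e-12

# ...but at h = 2.1 the fixed-point residual f(x) + f(x + h f(x)) changes
# sign inside (0.8, 0.85), where f itself does not vanish: a ghost fixed point.
def residual(x, h):
    return f(x) + f(x + h * f(x))

h = 2.1
assert residual(0.80, h) > 0 > residual(0.85, h)

# Bisect to locate the ghost:
a, b = 0.80, 0.85
for _ in range(60):
    m = 0.5 * (a + b)
    if residual(a, h) * residual(m, h) <= 0:
        b = m
    else:
        a = m
print("ghost fixed point near x =", 0.5 * (a + b))
```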
In general, a discretized dynamical system may possess fixed points,
invariant circles, and strange attractors which are either fewer in num-
ber, or are entirely absent in the continuous system from which it arose.
For instance, strange attractors are generic in systems of dimensionality
three or more. Discretizing a system with a strange attractor will lead to
a strange attractor. The question then to be asked is whether the proper-
ties of the discrete strange attractor are similar to those of the continuous
one. Discretizing a system without a strange attractor may also lead to a
strange attractor. If a structurally-stable feature is present in the contin-
uous system, it will also be found in the discrete version, but the converse
is not true. Note that non-structurally-stable behaviour will not in general
persist under the perturbation of the system introduced by discretization.
Shadowing theory was first used to demonstrate that an orbit of a
floating-point map produced by a computer using real arithmetic can be
shadowed by a real orbit of the exact map [Hammel et al., 1988]. More
recently it has been shown that an orbit of a floating-point map can be
shadowed by a real trajectory of an ordinary differential equation [Sauer &
Yorke, 1991]. In the former case, we are investigating the effect of round-off
error, in the latter case, global error. The last result would seem to contra-
dict what has been stated before about the defects of numerical methods;
in fact, it does not. The reason is that although shadowing theory is able to
put a bound on the global error, it is only possible to do this if the numerical
method satisfies

    ‖x_{n+1} − φ_h(x_n)‖ ≤ δ.  (78)

This is a form of local error, where x_{n+1} is a point on the numerical orbit
and φ_h(x_n) is the exact time-h map applied to the previous point on the
numerical orbit. Compare this with our definition of local truncation error
in Eq.(32), which instead takes the difference between a point on the orbit
of the exact time-h map and the numerical method applied to the previous
point on the exact orbit. It is not possible to get a good bound on δ with
Runge–Kutta methods, but it is possible with the direct Taylor series
method. (The Nth-order Taylor series method is just the Taylor series
truncated at that order.) The disadvantage of the Taylor series method is
that one has to do a lot of differentiation. However, if one can satisfy this
bound, and another condition which is basically an assumption of
hyperbolicity, then an orbit {x_n} is ε-shadowed by an orbit {y_n} of the
exact time-h map:

    ‖x_n − y_n‖ ≤ ε.  (79)
Comparing this with Eq.(31), we can see that it is giving us a bound on the
global error in the method. Sauer & Yorke [1991], as an example, apply
this shadowing method to prove that a chaotic numerical orbit of the forced,
damped pendulum is shadowed by a chaotic real trajectory.
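The one-step defect ‖x_{n+1} − φ_h(x_n)‖ can be made concrete where the exact time-h map is known in closed form. For ẋ = λx, φ_h(x) = e^{λh}x, so the defect along a numerical orbit can be evaluated directly; Euler is used below purely for illustration, with no claim that it satisfies the shadowing bound:

```python
import math

lam, h = -1.0, 0.1

def euler(x):
    return x + h * lam * x          # numerical one-step map

def phi(x):
    return math.exp(lam * h) * x    # exact time-h map of x' = lam * x

# Defect along the numerical orbit: |x_{n+1} - phi_h(x_n)|.
x, defects = 1.0, []
for _ in range(20):
    x_next = euler(x)
    defects.append(abs(x_next - phi(x)))
    x = x_next

# The one-step defect is O(h^2): |(1 + lam*h) - exp(lam*h)| ~ (lam*h)^2 / 2.
bound = (lam * h) ** 2 / 2 * 1.1
assert all(d <= bound for d in defects)
```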
9. Symplectic Methods for Hamiltonian Systems
It is now well known that numerical methods such as the ordinary
Runge–Kutta methods are not ideal for integrating Hamiltonian sys-
tems, because Hamiltonian systems are not generic in the set of all dy-
namical systems, in the sense that they are not structurally stable against
non-Hamiltonian perturbations. The numerical approximation to a Hamil-
tonian system obtained from an ordinary numerical method does introduce
a non-Hamiltonian perturbation. This means that a Hamiltonian system
integrated using an ordinary numerical method will become a dissipative
(non-Hamiltonian) system, with completely different long-term behaviour,
since dissipative systems have attractors and Hamiltonian systems do not.
This problem has led to the introduction of methods of symplectic in-
tegration for Hamiltonian systems, which do preserve the features of the
Hamiltonian structure by arranging that each step of the integration be a
canonical or symplectic transformation [Menyuk, 1984; Feng, 1986; Sanz-
Serna & Vadillo, 1987; Itoh & Abe, 1988; Lasagni, 1988; Sanz-Serna, 1988;
Channell & Scovel, 1990; Forest & Ruth, 1990; MacKay, 1990; Yoshida,
1990; Auerbach & Friedmann, 1991; Candy & Rozmus, 1991; Feng &
Qin, 1991; Miller, 1991; Marsden et al., 1991; Sanz-Serna & Abia, 1991;
Maclachlan & Atela, 1992].
A symplectic transformation M satisfies

    Mᵀ J M = J,  (80)

where M is the Jacobian of the map for the integration step, and J is the
matrix

    J = (  0   I )
        ( −I   0 ),  (81)

with I being the identity matrix. Preservation of the symplectic form is
equivalent to preservation of the Poisson bracket operation, and Liouville’s
theorem is a consequence of it.
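The symplectic condition MᵀJM = J is easy to verify numerically for a given one-step map. A sketch for the symplectic Euler method applied to the harmonic oscillator H = (p² + q²)/2 (this pairing of method and Hamiltonian is an illustrative assumption):

```python
import numpy as np

h = 0.5

def step(q, p):
    # Symplectic Euler for H = (p^2 + q^2)/2: update p first, then q.
    p_new = p - h * q
    q_new = q + h * p_new
    return q_new, p_new

# Jacobian M of the step map (q, p) -> (q', p'), computed by hand:
# q' = q + h(p - h q) = (1 - h^2) q + h p,   p' = -h q + p.
M = np.array([[1 - h**2, h],
              [-h,       1.0]])

J = np.array([[0.0, 1.0],
              [-1.0, 0.0]])

# The symplectic condition M^T J M = J; for one degree of freedom it
# reduces to det M = 1, i.e. the map is area-preserving.
assert np.allclose(M.T @ J @ M, J)
assert abs(np.linalg.det(M) - 1.0) < 1e-12
```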
Many different symplectic algorithms have been developed and dis-
cussed, and many of them are Runge–Kutta methods [Lasagni, 1988; Sanz-
Serna, 1988; Channell & Scovel, 1990; Forest & Ruth, 1990; Yoshida, 1990;
Candy & Rozmus, 1991; Sanz-Serna & Abia, 1991; Maclachlan & Atela,
1992]. In particular, the implicit Gauss–Legendre Runge–Kutta methods are
symplectic. Maclachlan & Atela [1992] find these Gauss–Legendre Runge–
Kutta methods to be optimal for general Hamiltonians. Thus symplectic
integration proves to be a situation where implicit Runge–Kutta methods
find a use, despite the computational penalty involved in implementing
them compared to explicit methods.
A positive experience with practical use of these methods in a problem
from cosmology has been reported by Santillan Iturres et al. [1992]. They
have used the methods described by Channell & Scovel [1990] to integrate
a rather complex Hamiltonian, discovering a structure (suspected to be
there from nonnumerical arguments) which nonsymplectic methods were
unable to reveal.
Although symplectic methods of integration are undoubtedly to be pre-
ferred in dealing with Hamiltonian systems, it should not be supposed
that they solve all the difficulties of integrating them; they are not perfect.
Channell & Scovel [1990] give examples of local structure introduced by the
discretization. For another example, take the integration of an integrable
Hamiltonian system: one where the solution of Newton’s equations is
reducible to the solution of a set of simultaneous equations followed by
integrations over single variables, and where trajectories lie on invariant
tori. Discretization will cause a nonintegrable perturbation to such a
system. For a small perturbation, however, such
as we should get from a good symplectic integrator, the KAM theorem tells
us that most of the invariant tori will survive. Nevertheless, the dynami-
cal behaviour of the symplectic map is qualitatively different to that of the
original system, since in addition to invariant tori, the symplectic map will
possess island chains surrounded by stochastic layers. Thus the numerical
method perturbing the nongeneric integrable system restores genericity.
There is a more important reason why care is needed in integrating
Hamiltonian systems, even with symplectic maps, and that is the lack of
energy conservation in the map. It would seem to be an obvious goal for a
Hamiltonian integration method both to preserve the symplectic structure
and to conserve the energy, but it has been shown that this is in general
impossible, because the symplectic map with step length h would then
have to be the exact time-h map of the original Hamiltonian. Thus a
symplectic map which only approximates a Hamiltonian cannot conserve
energy [Zhong & Marsden, 1988; MacKay, 1990; Marsden et al., 1991].
Algorithms have been given which are energy conserving at the expense of
not being symplectic, but for most applications retaining the Hamiltonian
structure is more important than energy conservation. Marsden et al.
[1991] mention an example where using an energy-conserving algorithm
to integrate the equations of motion of a rod which can both rotate and
vibrate leads to the absurd conclusion that rotation will virtually cease
almost immediately in favour of vibration.
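The contrast between dissipative and symplectic discretizations can be seen in a few lines for the harmonic oscillator H = (q² + p²)/2; the choice of system and step length is purely illustrative. Explicit Euler multiplies the energy by exactly 1 + h² per step, while the symplectic Euler method’s energy only oscillates boundedly about its initial value:

```python
h, n = 0.1, 1000

def energy(q, p):
    return 0.5 * (q * q + p * p)

# Ordinary explicit Euler: for this system E_{n+1} = (1 + h^2) E_n exactly,
# so the energy grows without bound.
q, p = 1.0, 0.0
for _ in range(n):
    q, p = q + h * p, p - h * q
assert energy(q, p) > 100.0          # has grown by orders of magnitude

# Symplectic Euler: the energy merely oscillates, staying within O(h) of
# its initial value 0.5. (This method exactly conserves the nearby
# quadratic form (p^2 + q^2 - h p q)/2 for the harmonic oscillator.)
q, p = 1.0, 0.0
for _ in range(n):
    p = p - h * q
    q = q + h * p
assert 0.4 < energy(q, p) < 0.6
```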
In fact, the symplectic map with step length h is the exact time-h map
of a time-dependent Hamiltonian H̃ with period h, and H̃ is near to the
original Hamiltonian H. The lack of exact energy conservation is thus
not too much of a problem, because if the system is close to being integrable,
and has fewer than two degrees of freedom, there will be invariant tori in
the symplectic map which the orbits cannot cross, and so the energy can
only undergo bounded oscillations. This is in contrast to integrating the
same system with a nonsymplectic method, where there would be no bound
on the energy, which could then increase without limit. This is a major
advantage of symplectic methods. However, consider a system which has
two degrees of freedom. The phase space of the symplectic map is extended
compared to that of the original system, so that an N-degree-of-freedom
system becomes an (N + 1)-degree-of-freedom map. (The extra degree of
freedom comes from the time t and its conjugate variable, the energy.)
Now in the case where N = 2, the original
system, if it were near integrable, would have two-dimensional invariant
tori acting as boundaries to motion in the three-dimensional energy shell,
but in the map the extra degree of freedom would mean that the three-
dimensional invariant tori here would no longer be boundaries to motion in
the five-dimensional energy shell, so Arnold diffusion would occur. This is a
major qualitative difference between the original system and the numerical
approximation. It has been shown to occur for two coupled pendulums by
Maclachlan & Atela [1992], and this proves that symplectic methods should
not blindly be relied upon to provide predictions of long-time behaviour for
Hamiltonian systems.
There is a further point about symplectic maps that affects all numer-
ical methods using floating-point arithmetic, and that is round-off error.
Round-off error is a particular problem for Hamiltonian systems, because
it introduces non-Hamiltonian perturbations despite the use of symplec-
tic integrators. The fact that symplectic methods do produce behaviour
that looks Hamiltonian shows that the non-Hamiltonian perturbations are
much smaller than those introduced by nonsymplectic methods. How-
ever, it is shown by Earn & Tremaine [1992] that round-off error does
adversely affect the long-term behaviour of Hamiltonian maps like the
standard map, by introducing dissipation. To iterate the map they instead
use integer arithmetic with Hamiltonian maps on a lattice that they con-
struct to be better and better approximations to the original map as the
lattice spacing is decreased. They show that these lattice maps are superior
to floating-point maps for Hamiltonian systems. Possibly a combination of
the techniques of symplectic methods and lattice maps may lead to the
numerical integration of Hamiltonian systems being possible without any
non-Hamiltonian perturbations.
10. Conclusions
Runge–Kutta integration schemes should be applied to nonlinear
systems with knowledge of the caveats involved. The absolute-
stability boundaries may be very different from the linear case, so
a linear stability analysis may well be misleading. A problem may occur
if a reduction in step length happens to take one outside the absolute-
stability region due to the shape of the boundary. In this case, the usual
step-control schemes would have disastrous results on the problem, as
step-length reduction in an attempt to increase accuracy would have the
opposite effect.
Even inside the absolute-stability boundary, all may not be well due to
the existence of stable ghost fixed points in many problems. Since basins
of attraction are finite, starting too far from the real solution may land one
in the basin of attraction of a ghost fixed point. Contrary to expectation,
this incorrect behaviour is not prevented by insisting that the method be
convergent.
Stiffness needs a new and better definition for nonlinear systems. We
have provided a verbal description, but a mathematical definition is still
lacking. There is a lot of scope to investigate further the interaction be-
tween stiffness and chaos. Explicit Runge–Kutta schemes should not be
used for stiff problems, due to their inefficiency: Backward Differentiation
Formulae methods, or possibly implicit Runge–Kutta methods, should be
used instead.
Dynamics is concerned not only with problems having fixed-point solutions,
but also with periodic and chaotic behaviour. This is something that has not in
the past been fully appreciated by some workers in numerical analysis who
have tended to concentrate on obtaining results, such as those of nonlinear
stability theory, that require properties like contractivity which are too
restrictive for most dynamical systems.
There are results that tie the limit set of the Runge–Kutta map to
that of the ordinary differential equation from which it came, but they are
not as powerful as those which relate the dynamics of Poincaré maps to
their differential equations. Structurally-stable behaviour in the ordinary
differential equation is correctly portrayed in the Runge–Kutta map, but
additional limit-set behaviour may be found in the map that is not present
in the differential equation. Nonhyperbolic behaviour will probably not be
correctly represented by the Runge–Kutta method.
Shadowing theory offers hope that it will be possible to produce numer-
ical methods with built-in proofs of correctness of the orbits they produce,
at least for hyperbolic orbits. However, it does not seem to be possible to do
this with Runge–Kutta methods, and the Taylor series method, for which
it is possible, has some severe disadvantages. More research needs to be
done in this area.
Hamiltonian systems should be integrated with symplectic Runge–
Kutta methods so that dissipative perturbations are not introduced. Even
using symplectic integration, Hamiltonian systems still need to be handled
with care. As in dissipative systems, nongeneric behaviour like integra-
bility will not be reproduced in the numerical method. A more general
problem is that approximate symplectic integrators cannot conserve en-
ergy. Round-off error is more of a problem for symplectic integration than
in other cases, because it introduces dissipative perturbations to the system
that one is trying to avoid.
A lot more work is needed on predicting the stability and accuracy of
methods for integrating nonlinear and chaotic systems. At present, we
must make do with Runge–Kutta and other methods, but be wary of the
results they are giving us—caveat emptor!
Acknowledgements
We should like to acknowledge the helpful suggestions of David
Arrowsmith who has read previous versions of this paper. We
would also like to thank Shaun Bullett and Chris Penrose with
whom we have had useful discussions about complex maps, and Nik Buric
who has helped greatly in the clarification of various points. Thanks also
go to Carl Murray for help in producing some of the illustrations. JHEC
would like to acknowledge the support of the Science and Engineering
Research Council (SERC) and the AEJMC Foundation. OP would like to
acknowledge the support of the Wolfson Foundation and CONICET.
References
Aiken, R. C., editor [1985] Stiff Computation (Oxford University Press).
Aronson, D. G., Chory, M. A., Hall, G. R. & McGehee, R. P. [1983] “Bifur-
cations from an invariant circle for two-parameter families of maps of
the plane: A computer-assisted study,” Commun. Math. Phys. 83, 303.
Arrowsmith, D. K., Cartwright, J. H. E., Lansbury, A. N. & Place, C. M.
[1992] “The Bogdanov map: Bifurcations, mode locking, and chaos in a
dissipative system,” Int. J. Bifurcation and Chaos, submitted.
Auerbach, S. P. & Friedmann, A. [1991] “Long-term behaviour of nu-
merically computed orbits: Small and intermediate timestep analysis of
one-dimensional systems,” J. Comput. Phys. 93, 189.
Beyn, W.-J. [1987a] “On invariant closed curves for one-step methods,”
Numer. Math. 51, 103.
Beyn, W.-J. [1987b] “On the numerical approximation of phase portraits
near stationary points,” SIAM J. Num. Anal. 24, 1095.
Butcher, J. C. [1987] The Numerical Analysis of Ordinary Differential
Equations: Runge–Kutta and General Linear Methods (Wiley).
Candy, J. & Rozmus, W. [1991] “A symplectic integration algorithm for
separable Hamiltonian systems,” J. Comput. Phys. 92, 230.
Channell, P. J. & Scovel, C. [1990] “Symplectic integration of Hamilto-
nian systems,” Nonlinearity 3, 231.
Chua, L. O. & Lin, P. M. [1975] Computer-Aided Analysis of Electronic
Circuits: Algorithms and Computational Techniques (Prentice-Hall).
Devaney, R. L. [1989] An Introduction to Chaotic Dynamical Systems
(Addison–Wesley) second edition.
Earn, D. J. D. & Tremaine, S. [1992] “Exact numerical studies of Hamil-
tonian Maps: Iterating without roundoff error,” Physica 56D, 1.
Feng, K. [1986] “Difference schemes for Hamiltonian formalism and
symplectic geometry,” J. Comput. Math. 4, 279.
Feng, K. & Qin, M.–z. [1991] “Hamiltonian algorithms for Hamiltonian
systems and a comparative numerical study,” Comput. Phys. Commun.
65, 173.
Forest, E. & Ruth, R. D. [1990] “Fourth order symplectic integration,”
Physica 43D, 105.
Gardini, L., Lupini, R., Mammana, C. & Messia, M. G. [1987] “Bifur-
cations and transition to chaos in the three-dimensional Lotka–Volterra
map,” SIAM J. Appl. Math. 47, 455.
Gear, C. W. [1971] Numerical Initial Value Problems in Ordinary Differ-
ential Equations (Prentice-Hall).
Parker, T. S. & Chua, L. O. [1989] Practical Numerical Algorithms for
Chaotic Systems (Springer).
Santillan Iturres, A., Domenech, G., El Hasi, C., Vucetich, H. & Piro, O.
[1992] preprint.
Yoshida, H. [1990] “Construction of higher order symplectic integrators,”
Phys. Lett. A 150, 262.
A.1. Convergence of Runge–Kutta methods
To prove that consistency is necessary and sufficient for convergence of
Runge–Kutta methods, we follow Henrici [1962]. Let the increment function
φ(x, y; h) satisfy a Lipschitz condition with constant L, as in Eq.(3), so
that the initial-value problem has a unique solution y(x). Using the mean
value theorem, the global error e_n = y_n − y(x_n) can be bounded in terms
of the local truncation errors T_n. Let

    T(h) = max_{0 ≤ n ≤ N} ‖T_n‖.  (89)

Thus

    ‖e_{n+1}‖ ≤ ‖e_n‖ + h‖φ(x_n, y_n; h) − φ(x_n, y(x_n); h)‖ + h‖T_n‖  (92)
             ≤ (1 + hL)‖e_n‖ + h T(h),  (93)

and so, since 1 + hL and T(h) are both positive,

    ‖e_{n+1}‖ ≤ (1 + hL)^{n+1} ‖e_0‖ + [(1 + hL)^{n+1} − 1] T(h)/L.  (94)

Now e_0 = 0, so

    ‖e_n‖ ≤ [(1 + hL)^n − 1] T(h)/L ≤ (e^{L(x_n − x_0)} − 1) T(h)/L.  (95)
We now look at the fixed-station limit as in Eq.(28), in which h → 0 and
n → ∞ with nh = x − x_0 held fixed:

    lim_{h→0} ‖e_n‖ ≤ (e^{L(x − x_0)} − 1)/L · lim_{h→0} T(h).  (96)

Since φ is continuous, consistency gives

    lim_{h→0} T(h) = 0,  (97)

and hence the global error tends to zero in the fixed-station limit; the
method converges.
whenever this limit exists. Here m_i(t) are the eigenvalues of Y(t), where
Y(t) comes from the variational equation

    Ẏ(t) = J(x(t)) Y(t),  Y(0) = I,  (101)

where J is the Jacobian matrix. One can show (see, for example, Parker &
Chua [1989]) that a perturbation δx grows as

    δx(t) = Y(t) δx(0).  (102)
Taking the norm of both sides,

    ‖δx(t)‖ = ‖Y(t) δx(0)‖ ≤ ‖Y(t)‖ ‖δx(0)‖.  (103), (104)

The same construction applies to the map produced by the numerical
method, where J is now the Jacobian matrix of the map. One can show (see,
for example, Parker & Chua [1989]) that a perturbation grows as

    δx_n = Y_n δx_0.  (109)
and

    λ = lim_{t→∞} (1/t) ln (‖δx(t)‖ / ‖δx(0)‖).  (114)

Contractivity gives ‖δx(t)‖/‖δx(0)‖ ≤ 1, or (1/t) ln(‖δx(t)‖/‖δx(0)‖) ≤ 0,
so contractivity is equivalent to λ ≤ 0. Thus, from Eq.(107), contractivity
is sufficient to give nonpositive Lyapunov exponents and thence regular
motion. (Note that the reverse is not necessarily true.)
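The limit λ = lim_{t→∞} (1/t) ln(‖δx(t)‖/‖δx(0)‖) can be checked numerically against a case where it is known exactly: for ẋ = ax the perturbation obeys δx(t) = e^{at} δx(0), so λ = a. A sketch using Euler integration of the variational equation (the system and step length are illustrative assumptions):

```python
import math

a, h, n = -0.7, 0.001, 100_000   # variational equation dx' = a dx, to t = 100

dx0, dx = 1.0, 1.0
for _ in range(n):
    dx = dx + h * a * dx          # Euler step on the variational equation

t = n * h
lam = math.log(abs(dx) / abs(dx0)) / t
# For this linear system the exact exponent is a; Euler reproduces it to O(h).
assert abs(lam - a) < 0.01
```

The exponent is negative, so the perturbation contracts, consistent with the equivalence between contractivity and λ ≤ 0 noted above.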