2021 - DL For Early Warning Signals of Tipping Points

The document discusses the development of a deep learning algorithm that provides early warning signals (EWS) for tipping points in various dynamical systems, enhancing sensitivity and specificity compared to traditional indicators. By leveraging insights from bifurcation theory, the algorithm can predict qualitative aspects of new states beyond tipping points, even in systems it was not explicitly trained on. This approach aims to improve preparedness for sudden shifts in complex natural systems across multiple fields, including ecology and climatology.


Deep learning for early warning signals of tipping points
Thomas M. Bury (a,b), R. I. Sujith (c), Induja Pavithran (d), Marten Scheffer (e), Timothy M. Lenton (f), Madhur Anand (b), and Chris T. Bauch (a,1)

(a) Department of Applied Mathematics, University of Waterloo, Waterloo, ON N2L 3G1, Canada; (b) School of Environmental Sciences, University of Guelph, Guelph, ON N1G 2W1, Canada; (c) Department of Aerospace Engineering, Indian Institute of Technology Madras, Chennai 600036, India; (d) Department of Physics, Indian Institute of Technology Madras, Chennai 600036, India; (e) Department of Environmental Sciences, Wageningen University, 6708 PB Wageningen, The Netherlands; and (f) Global Systems Institute, University of Exeter, Exeter EX4 4PY, United Kingdom

Edited by Alan Hastings, University of California, Davis, CA, and approved August 4, 2021 (received for review March 30, 2021)

ECOLOGY | APPLIED MATHEMATICS

Many natural systems exhibit tipping points where slowly changing environmental conditions spark a sudden shift to a new and sometimes very different state. As the tipping point is approached, the dynamics of complex and varied systems simplify down to a limited number of possible “normal forms” that determine qualitative aspects of the new state that lies beyond the tipping point, such as whether it will oscillate or be stable. In several of those forms, indicators like increasing lag-1 autocorrelation and variance provide generic early warning signals (EWS) of the tipping point by detecting how dynamics slow down near the transition. But they do not predict the nature of the new state. Here we develop a deep learning algorithm that provides EWS in systems it was not explicitly trained on, by exploiting information about normal forms and scaling behavior of dynamics near tipping points that are common to many dynamical systems. The algorithm provides EWS in 268 empirical and model time series from ecology, thermoacoustics, climatology, and epidemiology with much greater sensitivity and specificity than generic EWS. It can also predict the normal form that characterizes the oncoming tipping point, thus providing qualitative information on certain aspects of the new state. Such approaches can help humans better prepare for, or avoid, undesirable state transitions. The algorithm also illustrates how a universe of possible models can be mined to recognize naturally occurring tipping points.

dynamical systems | machine learning | bifurcation theory | theoretical ecology | early warning signals

Significance

Early warning signals (EWS) of tipping points are vital to anticipate system collapse or other sudden shifts. However, existing generic early warning indicators designed to work across all systems do not provide information on the state that lies beyond the tipping point. Our results show how deep learning algorithms (artificial intelligence) can provide EWS of tipping points in real-world systems. The algorithm predicts certain qualitative aspects of the new state, and is also more sensitive and generates fewer false positives than generic indicators. We use theory about system behavior near tipping points so that the algorithm does not require data from the study system but instead learns from a universe of possible models.

Author contributions: T.M.B., M.S., T.M.L., M.A., and C.T.B. designed research; T.M.B. and C.T.B. performed research; T.M.B., R.I.S., I.P., M.S., T.M.L., and C.T.B. contributed new reagents/analytic tools; T.M.B. and C.T.B. analyzed data; T.M.B., R.I.S., I.P., M.S., T.M.L., M.A., and C.T.B. wrote the paper; and M.A. and C.T.B. conceived the study.
The authors declare no competing interest.
This article is a PNAS Direct Submission.
This open access article is distributed under Creative Commons Attribution-NonCommercial-NoDerivatives License 4.0 (CC BY-NC-ND).
See online for related content such as Commentaries.
1 To whom correspondence may be addressed. Email: [email protected]
This article contains supporting information online at https://www.pnas.org/lookup/suppl/doi:10.1073/pnas.2106140118/-/DCSupplemental.
Published September 20, 2021.

PNAS 2021 Vol. 118 No. 39 e2106140118 https://doi.org/10.1073/pnas.2106140118

Many natural systems alternate between states of equilibrium and flux. This has stimulated research in fields ranging from evolutionary biology (1) and statistical mechanics (2) to dynamical systems theory (3). Dynamical systems evolve over time in a state space described by a mathematical function (3). Thus, dynamical systems are extremely diverse, ranging in spatial scale from the expanding universe (4) to quantum systems (5) and everything in between (6–8).

Different dynamical systems exhibit vastly different levels of complexity, and correspondingly diverse behavior far from equilibrium states. Sometimes, a system that is close to equilibrium may experience slowly changing external conditions that move it toward a tipping point where its qualitative behavior changes (we note that “tipping point” has been used to refer to a variety of phenomena (9), but here we will treat it as being synonymous with a local bifurcation point). In these circumstances, dynamical systems theory predicts that even very high-dimensional systems will simplify to follow low-dimensional dynamics (10, 11). Moreover, there exist a limited number of typical bifurcations of steady states, each of which may be described by a “normal form,” a canonical example capturing the dynamical features of the bifurcation (Box 1) (3). For instance, in a fold bifurcation, the system exhibits an abrupt transition to a very different state. A transcritical bifurcation usually causes a smooth transition, although it may sometimes cause an abrupt transition (12). Or, a Hopf bifurcation can lead the system into a state of oscillatory behavior via a smooth (supercritical) or abrupt (subcritical) transition.

Different bifurcation types correspond to distinct types of dynamical behavior. Moreover, other behaviors can emerge near the bifurcation that are common to many normal forms. For example, all local bifurcations, that is, those where eigenvalues of the respective matrices cross the imaginary axis, are accompanied by critical slowing down (3, 13). This is where system dynamics become progressively less resilient to perturbations as the transition approaches, causing dynamics to become more variable and autocorrelated. As a result, statistical indicators such as rising variance and lag-1 autocorrelation (AC) of a time series often precede tipping points in a variety of systems (14–16). These generic early warning indicators have been found to precede catastrophic regime shifts in systems including epileptic seizures, Earth’s paleoclimate system, and lake manipulation experiments (17–19).

Mathematically, critical slowing down occurs when the real part of the dominant eigenvalue (a measure of system resilience; Box 2) diminishes and eventually passes through zero at the bifurcation point. This happens for fold, Hopf, and transcritical bifurcations, and thus critical slowing down is manifested before these three bifurcation types (16). Generic early warning indicators are intended to work across a range of different types of systems by detecting critical slowing down. But this strength is


also their weakness, since these indicators do not tell us which type of bifurcation to expect (16).

The dominant eigenvalue is derived from a first-order approximation to dynamics near the equilibrium. Higher-order approximations can distinguish between different types of bifurcations. But they are not often used to develop early warning indicators because 1) the first-order approximation dominates dynamics sufficiently close to the equilibrium, causing critical slowing down to generate the strongest signal, and 2) the first-order approximation is more tractable to mathematical analysis of stochastic systems than the higher-order approximations (20). However, as a system gets closer to a bifurcation, it can drift farther from equilibrium due to critical slowing down. As a consequence, the higher-order terms become significant and may be large enough to provide clues about the type of transition that will occur. Statistical measures such as skew and kurtosis reflect the influence of these highest-order terms, for instance (20–22). Higher-order terms could be associated with features in time series data that are subtle but detectable, if we knew what to look for. Knowing qualitative information about the tipping point (such as whether it will be sudden or gradual) and the state that lies beyond it (such as whether it will oscillate or be stable) based on predicting the bifurcation type could be valuable in a range of applications.

Deep Learning and Bifurcation Theory

Generic early warning indicators such as variance and lag-1 AC use insights from dynamical systems theory to detect patterns that emerge before a bifurcation (Box 2). Supervised DL algorithms can also detect patterns (features) in time series, and have achieved state of the art in time series classification (23), that is, the ability to classify time series based on characteristic features in the data. We hypothesized that DL algorithms can detect both critical slowing down and other subtle features that emerge in time series prior to each type of bifurcation, such as the features generated by higher-order terms.

Box 1. Dimension Reduction Close to a Bifurcation

As a high-dimensional dynamical system approaches a bifurcation, its dynamics simplify according to the center manifold theorem (10). That is, the dynamics converge to a lower-dimensional space, which exhibits dynamics topologically equivalent to those of the normal form of that bifurcation. Examples of a fold, (supercritical) Hopf, and transcritical bifurcation are shown in Box 1a–c. Dynamics close to the bifurcation (gray box) are topologically equivalent to the normal forms

\[ \frac{dx}{dt} = \mu - x^2, \tag{1} \]

\[ \frac{dx}{dt} = \mu x - y - x(x^2 + y^2), \qquad \frac{dy}{dt} = x + \mu y - y(x^2 + y^2), \tag{2} \]

\[ \frac{dx}{dt} = \mu x - x^2, \tag{3} \]

respectively, where x and y are state variables that depend on time t, and µ is the bifurcation parameter. The bifurcation of each system occurs at µ = 0. These normal forms are contained within the set of two-dimensional dynamical systems with third-order polynomial right-hand sides, motivating this as the framework for training the deep learning (DL) algorithm (see Materials and Methods).
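To make the normal forms of Box 1 concrete, the following is a minimal sketch (not the authors' code) of a stochastic trajectory of the fold normal form, Eq. 1, whose bifurcation parameter is ramped linearly toward the bifurcation at µ = 0. It uses an Euler–Maruyama step of 0.01 and additive white noise, as described in Materials and Methods. The function names, the starting value µ = 1, the noise amplitude, the run length, and the early-stop rule are all assumptions chosen for illustration. Because the stable branch of Eq. 1 exists for µ > 0, the ramp here decreases µ.

```python
import math
import random


def fold_rhs(x, mu):
    """Right-hand side of the fold normal form dx/dt = mu - x**2 (Eq. 1)."""
    return mu - x * x


def simulate_forced_fold(mu_start=1.0, mu_end=0.0, t_max=100.0, dt=0.01,
                         sigma=0.01, seed=0):
    """Euler-Maruyama run with mu ramped linearly from mu_start to mu_end.

    Returns (mus, xs). All default values are illustrative assumptions.
    The run stops early if noise carries the state well past the unstable
    branch, i.e. the system has tipped before reaching the bifurcation.
    """
    rng = random.Random(seed)
    n_steps = int(round(t_max / dt))
    x = math.sqrt(mu_start)              # start on the stable branch x* = +sqrt(mu)
    escape_level = -math.sqrt(mu_start)  # below every unstable equilibrium on the ramp
    mus, xs = [], []
    for k in range(n_steps):
        mu = mu_start + (mu_end - mu_start) * k / n_steps
        if x < escape_level:
            break
        mus.append(mu)
        xs.append(x)
        noise = sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)  # additive white noise
        x += fold_rhs(x, mu) * dt + noise
    return mus, xs


if __name__ == "__main__":
    mus, xs = simulate_forced_fold()
    print(len(xs), "points; final mu =", mus[-1])
```

Replacing `fold_rhs` with the right-hand side of Eq. 3, or extending the state to (x, y) for Eq. 2, would give the transcritical and Hopf cases under the same assumptions.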
However, supervised DL algorithms require many thousands of time series to learn classifications, something we do not have for many empirical systems (24). And they can only classify time series similar to the type of data they were trained on. Here, we propose that simplification of dynamical patterns near a bifurcation point provides a way to address the problem of limited empirical data, and allows us to relax the restriction that DL algorithms can only classify time series from systems that they were trained on. Our first hypothesis (H1) is that, if we train a DL algorithm on a sufficiently large training set generated from a sufficiently diverse library of possible dynamical systems, the relevant features of any empirical system approaching a tipping point will be represented somewhere in that library. Therefore, the trained algorithm will provide early warning signals (EWS) in empirical systems that are not explicitly represented in the training set. Thus, even a relatively limited library might contain the right kinds of features that characterize higher-order terms in real-world time series. Our second hypothesis (H2) is that the DL algorithm will detect EWS with greater sensitivity (more true positives detected) and specificity (fewer false positives) than generic early warning indicators. Our third hypothesis (H3) is that the DL algorithm will also predict qualitative information about the new state that lies beyond the tipping point, on account of being able to recognize patterns associated with higher-order terms. All three hypotheses are based on the simplification of complex dynamics close to a bifurcation point (Boxes 1 and 2).

To test these hypotheses, we developed a DL algorithm to provide EWS for tipping points in systems it was not trained upon. We used a CNN-LSTM architecture (convolutional neural network–long short-term memory network; see Materials and Methods). CNN-LSTM sandwiches two different types of neural network layers. The CNN layer reads in subsequences of the time series and extracts features that appear in those subsequences. The LSTM layer then reads in the output of the CNN and interprets those features. The LSTM layer loops back on itself to generate memory, enabling the layer to recognize the same feature appearing at different times in a long time series. As a result, this approach excels at pattern recognition and sequence prediction (25, 26).

We created a training set consisting of simulations from a randomly generated library of mathematical models exhibiting local bifurcations (see Materials and Methods). Specifically, we generated three classes of simulations eventually going through a fold, Hopf, or transcritical bifurcation, and a fourth neutral class that never goes through a bifurcation. (We note that other types of state transitions are possible, such as those caused by a global bifurcation (27), but we restrict attention to local
codimension-one bifurcations in this paper.) Then, we trained the CNN-LSTM algorithm on the training set to classify any given time series into one of the four categories based on the prebifurcation portion of the simulation time series. The F1 score of the algorithm, a combined measure of precision (how many positive classifications are true positives) and sensitivity/recall (how many of the true positives are detected), tested against a hold-out portion of the training set, was 88.2% when training on time series of length 1,500 data points, and was 84.2% when training on time series of length 500 data points.

We evaluated the out-of-sample predictive performance of the algorithm, using data from study systems that were not included in the training set. We tested three model systems and three empirical systems. The model systems included a simple harvesting model consisting of a single equation that exhibits a fold bifurcation (28); a system of two equations representing a consumer−resource (predator−prey) system exhibiting both Hopf and transcritical bifurcations (29); and a system of five equations representing the coupled dynamics of infection transmission and vaccine opinion propagation, and exhibiting a transcritical bifurcation (12). The three empirical datasets consisted of data from paleoclimate transitions (17), transitions to thermoacoustic instability in a horizontal Rijke tube, which is a prototypical thermoacoustic system (30), and sedimentary archives capturing episodes of anoxia in the eastern Mediterranean (31) (see Materials and Methods). We selected these empirical datasets because, in each case, they have been previously argued to show critical slowing down before a tipping point, based on lag-1 AC and/or variance trends, followed by observation of a state transition. We compared the performance of the DL algorithm against lag-1 AC and variance for all six study systems.

Box 2. Significance of Higher-Order Terms Close to a Bifurcation

The local behavior of a dynamical system about an equilibrium point is often well described by a linear approximation of the equations that govern its dynamics. However, for systems nearing a bifurcation, higher-order terms become significant. We illustrate this for a one-dimensional system dx/dt = f(x) with equilibrium x*, that is, f(x*) = 0. The dynamics about equilibrium following a perturbation by ε satisfy

\[ \frac{d(x^* + \epsilon)}{dt} = f(x^* + \epsilon) = f(x^*) + \left.\frac{\partial f}{\partial x}\right|_{x^*} \epsilon + \frac{1}{2} \left.\frac{\partial^2 f}{\partial x^2}\right|_{x^*} \epsilon^2 + \cdots = \lambda_1 \epsilon + \lambda_2 \epsilon^2 + \cdots, \]

where λ1, λ2, ... are coefficients of the Taylor expansion, and λ1 is referred to as the dominant eigenvalue. The potential landscape of this system centered on x* is given by

\[ V(\epsilon) = \int f(x^* + \epsilon)\, d\epsilon = \frac{1}{2} \lambda_1 \epsilon^2 + \frac{1}{3} \lambda_2 \epsilon^3 + \cdots, \]

where we have dropped the arbitrary integration constant. Far from a bifurcation in a regime of small noise, displacement from equilibrium (ε) is small, and so the visited part of the potential landscape is well described by the first-order (linear) approximation (Box 2a). As a local bifurcation is approached, λ1 → 0, which corresponds to critical slowing down, and a flattening of the first-order approximation to the potential landscape (Box 2b). This allows noise to push the system farther from equilibrium, where higher-order terms become significant.

Results

The EWS provided by lag-1 AC, variance, and the DL algorithm can be compared as progressively more of the time series leading up to the bifurcation is made available for their computation, as might occur in real-world settings where a variable is monitored over time. A clear trend in variance or lag-1 AC is taken to provide an early warning signal of an upcoming state transition (32). For the two ecological models exhibiting the fold, Hopf, and transcritical bifurcations (Fig. 1 A–C), the lag-1 AC and variance increase progressively before all three transition types except for the Hopf bifurcation, where lag-1 AC decreases due to the presence of an oscillatory component to the motion (20) (Fig. 1 D–I). Hence the trends in these two indicators suggest that a transition will occur.

The DL algorithm assigns a probability for each of the four possible outcomes (fold, transcritical, Hopf, and neutral) that the time series will culminate in that outcome. Therefore, a heightened probability assigned to one of the outcomes compared to the other three is taken to provide an early warning signal of that outcome. According to this criterion, the DL algorithm also provides early warning of a transition in the two ecological models, and correctly predicts the type of bifurcation in each of the three cases (Fig. 1 J–L). Inspection of the time series provides supporting evidence for our first two hypotheses. Firstly, these model equations were not used to develop our training library (although we note that our hypothesis relies on the training library including a representative type of dynamics from the models, such as fold bifurcations). Secondly, the algorithm initially assigns similar probabilities to all three transition types in the earlier part of the time series, but, after a specific time point, the algorithm becomes highly confident in picking one of the three bifurcation types as the most probable outcome. This is consistent with the algorithm being able to distinguish features based on higher-order terms that are held in common between dynamical systems exhibiting each bifurcation type, but that distinguish the bifurcation types from one another. Examples of these time series for the other four study systems appear in SI Appendix, Figs. S1–S10.

These time series, however, do not address how the approaches might perform when faced with a neutral time series where no transition occurs, and whether they might mistakenly generate a false positive prediction of an oncoming state transition (33). Hence, we compared the performance of these approaches with respect to both true and false positives through a receiver operating characteristic (ROC) curve. The ROC curve plots the true positive rate against the false positive rate as a discrimination threshold that determines whether a classifier predicts a given outcome (such as transition versus no transition) is varied. The area under the ROC curve (AUC) determines how well the classifier does with respect to both sensitivity/recall (how many true positives are detected) and specificity (how many false positives are avoided). The AUC is one for a perfect classifier, and 0.5 for a classifier that is no better than random. For variance and lag-1 AC, higher positive values of the Kendall τ statistic indicate a more strongly increasing trend. Therefore, these indicators were taken to predict a given outcome when the Kendall τ statistic exceeded the discrimination threshold. The DL algorithm was taken to predict a given outcome simply when the


probability assigned to that outcome exceeded the discrimination threshold.

We compared ROC curves for the criterion of predicting any transition for lag-1 AC, variance, and the DL algorithm, for eight comparisons across all six study systems (Fig. 2). In support of our second hypothesis, the DL algorithm strongly outperforms lag-1 AC and variance in six of the comparisons. There are two interesting comparisons where the performance of the DL algorithm is similar to that of lag-1 AC or variance. For the SEIRx (Susceptible-Exposed-Infectious-Removed-vaccinator) coupled behavior−disease model, all three classifiers are little better than random in the model variable for the number of infectious persons (I; Fig. 2E). This occurs due to nonnormality of the system associated with differing timescales for demographic and epidemiological processes (34). However, the early warning signal is apparent in the variable x for the prevalence of provaccine opinion, and the DL algorithm outperforms both lag-1 AC and variance in this respect. Also, for the paleoclimate data, the DL algorithm performs about as well as lag-1 AC, and both perform better than variance (Fig. 2H). This may occur because the variance actually decreases before the transition in several of the empirical time series, because the sampling data does not have high enough resolution, or because the system was forced too quickly.

In support of our third hypothesis, we note that the DL algorithm usually predicts the correct type of bifurcation in all eight comparisons (Fig. 2). An exception occurs for the thermoacoustic system (Fig. 2G), where the frequency of the favored DL probability for the Hopf bifurcation is only slightly higher than for the fold bifurcation. This could be due to our down-sampling of the data to enable the time series to be accommodated by the DL algorithm code.

Fig. 1. Trends in indicators prior to three different bifurcations in ecological models. (A–C) Trajectory (gray) and smoothing (black) of a simulation of an ecological model going through a fold, Hopf, and transcritical bifurcation, respectively. (D–F) Lag-1 AC computed over a rolling window (arrow) of width 0.25. (G–I) Variance. (J–L) Probabilities assigned to the fold (purple), Hopf (orange), and transcritical (cyan) bifurcation by the DL algorithm. The vertical dashed line marks the time at which the system crosses the bifurcation.
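The generic indicators shown in Fig. 1 D–I can be sketched in a few lines. The code below is an illustrative assumption rather than the authors' implementation: it computes variance and lag-1 AC over a rolling window whose width is a fraction (0.25, as in Fig. 1) of the series length. The function names are hypothetical, the window width is interpreted here as a fraction of the number of data points, and no detrending or smoothing step is applied, although one would normally precede this calculation.

```python
def variance(window):
    """Population variance of the values in one window."""
    mean = sum(window) / len(window)
    return sum((v - mean) ** 2 for v in window) / len(window)


def lag1_autocorrelation(window):
    """Lag-1 autocorrelation of the values in one window (0.0 if constant)."""
    mean = sum(window) / len(window)
    denom = sum((v - mean) ** 2 for v in window)
    if denom == 0:
        return 0.0
    numer = sum((window[i] - mean) * (window[i + 1] - mean)
                for i in range(len(window) - 1))
    return numer / denom


def rolling_indicator(series, indicator, width_fraction=0.25):
    """Apply `indicator` to every full window of the series.

    The window holds width_fraction * len(series) points (at least 2), so
    the output has len(series) - width + 1 entries.
    """
    width = max(2, int(round(width_fraction * len(series))))
    return [indicator(series[i - width:i])
            for i in range(width, len(series) + 1)]


if __name__ == "__main__":
    # Toy series whose fluctuations grow over time (not data from the paper).
    series = [(k / 100) * (-1) ** k for k in range(100)]
    print(rolling_indicator(series, variance)[:3])
    print(rolling_indicator(series, lag1_autocorrelation)[:3])
```

A rising trend in either output series is what the text treats as a generic early warning signal.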

Fig. 2. ROC curves for predictions using 80 to 100% of the pretransition time series for model and empirical data. ROC curves compare the performance of the DL algorithm (blue), variance (red), and lag-1 AC (green) in predicting an upcoming transition. The area under the curve (AUC), abbreviated to A, is a measure of performance. Insets show the frequency of the favored DL probability among the forced trajectories: (F)old, (T)ranscritical, (H)opf, or (N)eutral. (A) May’s harvesting model going through a fold bifurcation; (B and C) consumer−resource model going through a (B) Hopf and (C) transcritical bifurcation; (D and E) behavior−disease model going through a transcritical bifurcation using data from (D) provaccine opinion (x) and (E) total infectious (I); (F) sediment data showing rapid transitions to an anoxic state in the Mediterranean Sea; (G) data of a thermoacoustic system undergoing a Hopf bifurcation; and (H) ice core records showing rapid transitions in paleoclimate data. The diagonal dashed line marks where a classifier works no better than a random coin toss.
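The ROC comparison summarized in Fig. 2 can be illustrated with a small sketch. It assumes each trajectory has been reduced to one score: the Kendall τ of an indicator series for variance and lag-1 AC, or the assigned probability for the DL algorithm. "Forced" trajectories are treated as positives and "null" trajectories as negatives. The function names are hypothetical, the τ computed is the simple tau-a against the time index, and a score equal to the threshold is counted as a positive prediction. None of this is taken from the authors' code.

```python
def kendall_tau(values):
    """Kendall tau-a of a series against its time index (1 = strictly rising)."""
    n = len(values)
    concordant = discordant = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            if values[j] > values[i]:
                concordant += 1
            elif values[j] < values[i]:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)


def roc_points(scores_forced, scores_null):
    """(false positive rate, true positive rate) pairs as the threshold drops."""
    thresholds = sorted(set(scores_forced) | set(scores_null), reverse=True)
    points = [(0.0, 0.0)]
    for thr in thresholds:
        tpr = sum(s >= thr for s in scores_forced) / len(scores_forced)
        fpr = sum(s >= thr for s in scores_null) / len(scores_null)
        points.append((fpr, tpr))
    return points  # the last point is always (1.0, 1.0)


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


if __name__ == "__main__":
    # Made-up scores for illustration only.
    forced = [kendall_tau([1, 2, 3, 5]), kendall_tau([1, 3, 2, 4])]
    null = [kendall_tau([2, 1, 3, 2]), kendall_tau([3, 1, 2, 1])]
    print(auc(roc_points(forced, null)))
```

With real data, the same `roc_points` and `auc` calls would be repeated for each of the three classifiers and each study system.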
These ROC curves came from classifiers with access to 80 to 100% of the time series (in other words, using the last 20% of the time series). We also computed the ROC curves when the three classifiers had access to 60 to 80% (the fourth quintile) of the time series (SI Appendix, Fig. S11). This allows us to assess the reliability of the approaches when they are required to provide early warning for a system that is still far from the tipping point. We observe that the DL algorithm provides early warning with greater sensitivity and specificity than either lag-1 AC or variance, in all comparisons except the I variable of the SEIRx model, where they perform equally poorly. This result suggests that the DL algorithm can provide greater forewarning of coming state transitions, although additional statistical tests would be required to show this conclusively.

Discussion

We tested our DL algorithm on data from systems that exhibited critical slowing down before a local bifurcation. However, other types of transitions are possible, such as global bifurcations that do not depend on changes to the local stability of equilibria (27). EWS of global bifurcations are more challenging to detect. State transitions may also occur through codimension-two bifurcations where two forcing parameters are varied simultaneously (16, 35, 36), or bifurcations of periodic orbits (22), for which EWS are more apparent. In general, DL algorithms only work for the specific problems they are trained to solve. In order for our DL algorithm to provide early warning of other such bifurcations, we speculate that the training set would need to be expanded to include simulated data exhibiting those dynamics.

Bifurcations are not inherent to real-world systems but rather are a property of our mathematical model of the systems. We trained the DL algorithm on data from mathematical models, but we applied it to empirical data from systems that have been previously studied in the literature on EWS of tipping points. The algorithm still detects bifurcations in the empirical systems because that is what it was trained to do. However, it could be said that the algorithm is really predicting the type of bifurcation that researchers would use to describe an observed transition in the real-world system. This reflects the more general issue of how humans leave their imprint, for better or worse, on the classifications provided by supervised machine learning algorithms (37).

Early warning indicators generally require high-resolution data from a sufficiently long time series leading up to the tipping point (38). This applies to DL algorithms as well as to lag-1 AC and variance. We did not analyze how the performance of the DL algorithms, lag-1 AC, and variance compare as the time series becomes shorter. Similarly, none of these approaches can predict exactly when a transition will occur. This task lies in the domain of time series forecasting instead of classification and is a difficult undertaking, given that stochasticity could cause a system to jump prematurely to a new basin of attraction even before the system has reached the tipping point (39). Also worth noting is that we generated a training set based on models with two state variables and second-order polynomial model equations. This limits its ability to detect features such as deterministic chaos (22), which require at least three state variables (40).

We did not analyze whether the DL algorithm is using the higher-order terms in the normal form equations, or whether it is primarily relying on some other features in the data. This could be addressed in future work by controlled tests of whether the algorithm can distinguish supercritical and subcritical Hopf bifurcations (which differ in the cubic term), for instance. Finally, we note that, even though the DL algorithm can predict certain qualitative features of the new regime (such as oscillations after a Hopf bifurcation, or a stable state after a fold bifurcation), it

cannot say much else about the new regime. For instance, a fold bifurcation could lead an ecosystem into either a stable collapsed state or a stable healthy state (41).

Other early warning approaches have been developed to predict the type of bifurcation (20, 22, 34). However, these approaches tend to be system specific. Our results show that DL algorithms not only can improve the sensitivity and specificity of EWS for regime shifts but also apply with a great degree of generality across different systems. Moreover, as long as the generic dynamical features (the normal forms) of the system near a tipping point are represented in the training set (Boxes 1 and 2), data from the study system are not required to train the algorithm. In summary, by combining dynamical systems insights with DL approaches, our results show how to obtain EWS of tipping points with much greater sensitivity, specificity, and generalizability across systems than is currently possible, as well as predicting the type of tipping point, and thus providing specific qualitative information about the new state that lies beyond the tipping point. This information is important to know for both theoretical and practical purposes, since tipping points in many systems can lead to undesirable collapse (14). Improved EWS can help us better prevent or prepare for such state transitions (41).

Materials and Methods

Generation of Training Data for the DL Classifier. Training data consist of simulations of randomly generated, two-dimensional dynamical systems of the form

$$\dot{x} = \sum_{i=1}^{10} a_i \, p_i(x, y), \qquad [4]$$

$$\dot{y} = \sum_{i=1}^{10} b_i \, p_i(x, y), \qquad [5]$$

where $x$ and $y$ are state variables, $a_i$ and $b_i$ are parameters, and $\vec{p}(x, y)$ is a vector containing all polynomials in $x$ and $y$ up to third order,

$$\vec{p}(x, y) = (1, x, y, x^2, xy, y^2, x^3, x^2 y, x y^2, y^3).$$

An individual model is generated by drawing each $a_i$ and $b_i$ from a normal distribution with zero mean and unit variance. Then, half of these parameters are selected at random and set to zero. The parameters for the cubic terms are set to the negative of their absolute value to encourage models with bounded solutions.

For a DL algorithm to be effective, the training data should cover a wide representation of the possible dynamics that could occur in unseen data. For this reason, we generate many versions of the model in Eqs. 4 and 5, each with a different set of parameter values. We continue to generate models until a desired number of each type of bifurcation has been found. For each bifurcation, we run simulations that are used as training data for the DL algorithm. In this study, we consider codimension-one bifurcations of steady states, including the fold, Hopf, and transcritical bifurcation. The pitchfork bifurcation is another example; however, it only occurs in models with symmetrical dynamics that are not often found in ecological models.

We generated two different training sets: one consisting of 500,000 time series of length 500 data points, and one consisting of 200,000 time series of length 1,500 data points. This was done because the time series lengths in the three empirical and three model systems are highly variable. The algorithm was trained separately on these two training sets (see next subsection), resulting in a “500-classifier” and a “1,500-classifier.” The 500-classifier was used on shorter time series, while the 1,500-classifier was used on the longer time series. For the model time series in Fig. 1, we use the 1,500-classifier. For the ROC curves in Fig. 2, we used the 500-classifier for the paleoclimate data and the ecological models, and used the 1,500-classifier for the thermoacoustic data, anoxia data, and disease model.

Upon generation of a model, we simulate it for 10,000 time steps from a randomly drawn initial condition and test for convergence to an equilibrium point. Convergence is required in order to search for bifurcations. The simulation uses the odeint function from the Python package SciPy (42) with a step size of 0.01. We say the model has converged if the maximum difference between the final 10 points of the simulation is less than $10^{-8}$. Models that do not converge are discarded. For models that converge, we use AUTO-07P (43) to identify bifurcations along the equilibrium branch as each nonzero parameter is varied within the interval [−5, 5]. For each bifurcation identified, we run a corresponding “null” and a “forced” stochastic simulation of the model with additive white noise. Null simulations keep all parameters fixed. Forced simulations increase the bifurcation parameter linearly in time from its original value up to the bifurcation point. Stochastic simulations are run using the Euler-Maruyama method with a step size of 0.01, an initial condition given by the model’s equilibrium value and a burn-in period of 100 units of time. The noise amplitude is drawn from a triangular distribution centered at 0.01 with upper and lower bounds 0.0125 and 0.0075, respectively, and weighted by an approximation of the dominant eigenvalue of the model (SI Appendix, Supplementary Note).

For each simulation, we set a sampling rate $f_s$, the number of data points collected per unit of time, which is drawn randomly from {1, 2, . . . , 10}. Using a varied sampling rate provided a wider distribution of lag-1 AC among the training data entries, which is important for representing a wide range of systems and timescales. The simulation is then run for $700/f_s$ time units for the 500-classifier and $1{,}700/f_s$ time units for the 1,500-classifier, providing 700 and 1,700 points, respectively, when sampled at a frequency $f_s$.

Due to noise, the simulations often transition to a new regime before the bifurcation point is reached. We only want the DL classifier to see data prior to the transition. Therefore, we use a change-point detection algorithm contained in the Python package ruptures (44) to locate a transition point if one exists. If a transition point is detected, the preceding 500 (1,500) points are taken as training data. If the transition occurs earlier than 500 (1,500) data points into the simulation, the model is discarded. If no transition point is detected, the final 500 (1,500) points are taken as training data.

DL Algorithm Architecture and Training. We used a CNN-LSTM DL algorithm (25, 26). We also experimented with a residual network, functional convolutional network, and recurrent neural network but found that the CNN-LSTM architecture yielded the highest precision and recall on our training set. The code was written using TensorFlow 2.0 in Anaconda 2020.02. The CNN-LSTM architecture appears in Fig. 3. The algorithm was trained for 1,500 epochs with a learning rate of 0.0005, and the hyperparameters were tuned through a series of grid sweeps. The same hyperparameter values were used for training on both 500-classifier and 1,500-classifier (see previous subsection).

The simulation output time series from the random dynamical systems were detrended using Lowess smoothing with a span of 0.2 to obtain the residual time series that formed the training set. Each residual time series was normalized by dividing each time series data point by the average absolute value of the residuals across the entire time series. We used a train/validation/test split of 0.95/0.04/0.01 for both the 500- and 1,500-classifiers. The test set was chosen as a small percentage because a test set of a few thousand time series is adequate to provide a representative estimate of the precision and recall. The f1 score, precision, and recall for an ensemble of ten 500-classifier models were 84.2%, 84.4%, and 84.2%, respectively. The f1 score, precision, and recall for an ensemble of ten 1,500-classifier models were 88.2%, 88.3%, and 88.3%, respectively.

For testing the ability of the DL algorithm to provide EWS of bifurcations, we developed variants where the algorithm was trained on censored versions of the training time series. For the 500 (1,500) length classifier, one variant was trained on a version of the training set where the residuals of the simulation time series were padded on both the left and right by between 0 and 225 (725) zeroes, with the padding length chosen randomly from a uniform distribution. This allowed the algorithm to train on time series as short as 50 (50), not necessarily representing the time phase just before the transition. The intention was to boost the performance of the DL algorithm for detecting EWS features from shorter time series and from the middle sections of time series. The second variant was trained on a version of the training set where the residuals of the simulation time series were padded only on the left, by between 0 and 450 (1,450) zeroes, where the padding length was chosen randomly from a uniform distribution. This allowed the algorithm to train on time series as short as 50 (50), representing time series of various lengths that lead up to the bifurcation (except for the neutral class). As a result, the classifier could better detect features that emerge most strongly right before the

bifurcation. Ten trained models of each variant were ensembled by taking their average prediction at each point to generate all of our reported results.

Fig. 3. CNN-LSTM architecture.

Theoretical Models Used for Testing. We use models of low and intermediate complexity to test the DL classifier. Models are simulated using the Euler-Maruyama method with a step size of 0.01 unless otherwise stated. To test detection of a fold bifurcation, we use May’s harvesting model (28) with additive white noise. This is given by

$$\frac{dx}{dt} = rx\left(1 - \frac{x}{k}\right) - h\,\frac{x^2}{s^2 + x^2} + \sigma \xi(t),$$

where $x$ is biomass of some population, $k$ is its carrying capacity, $h$ is the harvesting rate, $s$ characterizes the nonlinear dependence of harvesting output on current biomass, $r$ is the intrinsic per capita growth rate of the population, $\sigma$ is the noise amplitude, and $\xi(t)$ is a Gaussian white noise process. We use parameter values r = 1, k = 1, s = 0.1, h ∈ [0.15, 0.27], and σ = 0.01. In this configuration, a fold bifurcation occurs at h = 0.26. The parameter h is kept fixed at its lower bound for null simulations and is increased linearly to its upper bound in forced simulations.

To test the Hopf and transcritical bifurcations, we use the Rosenzweig-MacArthur consumer-resource model (29) with additive white noise. This is given by

$$\frac{dx}{dt} = rx\left(1 - \frac{x}{k}\right) - \frac{axy}{1 + ahx} + \sigma_1 \xi_1(t),$$

$$\frac{dy}{dt} = \frac{eaxy}{1 + ahx} - my + \sigma_2 \xi_2(t),$$

where $r$ is the intrinsic per capita growth rate of the resource ($x$), $k$ is its carrying capacity, $a$ is the attack rate of the consumer ($y$), $e$ is the conversion factor, $h$ is the handling time, $m$ is the per capita consumer mortality rate, $\sigma_1$ and $\sigma_2$ are noise amplitudes, and $\xi_1(t)$ and $\xi_2(t)$ are independent Gaussian white noise processes. We fix the parameter values r = 4, k = 1.7, e = 0.5, h = 0.15, m = 2, σ1 = 0.01, and σ2 = 0.01. In this configuration, the deterministic system has a transcritical bifurcation at a = 5.60 and a Hopf bifurcation at a = 15.69. For the transcritical bifurcation, we simulate null trajectories with a = 2 and forced trajectories with a ∈ [2, 6]; for the Hopf bifurcation, we use a = 12 and a ∈ [12, 16], respectively.

To test the DL algorithm on a model of higher dimensionality than the ecological models, we used a stochastic version of the SEIRx model that captures interactions between disease dynamics and population vaccinating behavior (12, 45) given by

$$\frac{dS}{dt} = \mu N(1 - x) - \mu S - \beta S I / N + \sigma_1 \xi_1(t),$$

$$\frac{dE}{dt} = \beta S I / N - (\sigma + \mu) E + \sigma_2 \xi_2(t),$$

$$\frac{dI}{dt} = \sigma E - (\gamma + \mu) I + \sigma_3 \xi_3(t),$$

$$\frac{dR}{dt} = \mu x + \gamma I - \mu R + \sigma_4 \xi_4(t),$$

$$\frac{dx}{dt} = \kappa x (1 - x)\bigl(-\omega + I + \delta(2x - 1)\bigr) + \sigma_5 \xi_5(t),$$

where S is the number of susceptible individuals, E is the number of exposed (infected but not yet infectious) individuals, I is the number of infectious individuals, R is the number of recovered/immune individuals, x is the number of individuals with provaccine sentiment, µ is the per capita birth and death rate, β is the transmission rate, σ is the per capita rate at which exposed individuals become infectious, γ is the per capita rate of recovery from infection, κ is the social learning rate, δ is the strength of injunctive social norms, and ω is the perceived relative risk of vaccination versus infection. For our simulations, we used µ = 0.02/y, β = 1.5/d (based on $R_0 \approx \beta/\gamma = 15$), σ = 0.1/d, γ = 0.1/d, κ = 0.001/d, δ = 50, and N = 100,000, representing a typical pediatric infectious disease (12). Simulations were perturbed weekly with $\sigma_i = 5$ for i = 1, . . . , 4, and $\sigma_5 = 5 \times 10^{-4}$. On account of the large timescale difference in vital dynamics and infection processes, the system is nonnormal (34). We note that S + E + I + R = 1, and therefore, since R can be obtained as R = 1 − S − E − I, the model is four-dimensional. The forcing parameter ω was gradually forced from 0 to 100. As perceived vaccine risk increases along the (1, 0, 0, 0, 1) branch corresponding to full vaccine coverage, the model has a transcritical bifurcation at ω = δ (12), which leads to a critical transition corresponding to a drop in the proportion of individuals with provaccine sentiment and a return of endemic infection.

Empirical Systems Used for Testing. We use three different sources of empirical data to test the DL classifier.

1) The first source is sedimentary archives from the Mediterranean Sea (46). These provide high-resolution reconstructions of oxygen dynamics in the eastern Mediterranean Sea. Rapid transitions between oxic and anoxic states occurred regularly in this region in the geological past. A recent study has shown that EWS exist prior to the transitions (31). The data consist of output from three cores that, together, span eight anoxic events. Variables include molybdenum (Mo) and uranium (U), proxies for anoxic and suboxic conditions, respectively, giving us a total of 26 time series for anoxic events (some are captured by multiple cores). The sampling rate provides ∼10- to 50-y resolution depending on the core, with an almost regular spacing between data points. We perform the same data preprocessing as Hennekam et al. (31). Interpolation is not done, as most data points are equidistant, and it can give rise to aliasing effects that strongly affect variance and AC. Data 10 ky prior to each transition are analyzed for EWS. Null time series of the same length are generated from an AR(1) (autoregressive lag 1) process fit to the initial 20% of the data. Residuals are obtained from smoothing the data with a Gaussian kernel with a bandwidth of 900 y, and EWS are computed using a rolling window of 0.5.

2) The second source is thermoacoustic instability. Thermoacoustic systems often exhibit a critical transition to a state of self-sustained large-amplitude oscillations in the system variables, known as thermoacoustic instability. The establishment of a positive feedback between the heat

release rate fluctuations and the acoustic field in the system is often the cause for this transition. We perform experiments in a horizontal Rijke tube which consists of an electrically heated wire mesh in a rectangular duct (30). We pass a constant mass flow rate of air through the duct and control the voltage applied across the wire mesh to attain the transition to thermoacoustic instability via subcritical Hopf bifurcation as the voltage is increased. We have data for 19 forced trajectories where the voltage is increased over time at different rates (2 mV/s to 24,000 mV/s). We also have 10 steady-state trajectories where the voltage is kept at a fixed value between 0 and 4 V. Experimental runs at fixed higher voltages are not used, as they exhibit limit cycle oscillations. We downsample the data, recorded at 4 kHz to 10 kHz in experiments, to 2 kHz. Transition times are picked by eye. For each forced time series, we analyze data 1,500 points prior to the transition. From the steady-state time series, we extract two random sections of length 1,500 to serve as null time series, giving a total of 20 null time series. Data are detrended using Lowess smoothing with a span of 0.2 and degree 1. EWS are computed from residuals using a rolling window of 0.5.

3) The third source is paleoclimate transitions (47–50). We use data for seven out of the eight climate transitions that were previously analyzed for EWS by Dakos et al. (17). The time series for the desertification of North Africa was not included due to insufficient data. We use the same data preprocessing as Dakos et al. (17), which involves using linear interpolation to make the data equidistant and detrending with a Gaussian kernel smoothing function. The bandwidth of the kernel is specified for each time series (17) to remove long-term trends while not overfitting. For each time series, we generate 10 null time series of the same length from an AR(1) process fit to the initial 20% of the residuals, yielding a total of 70 null time series and 7 forced time series.

Computing Early Warning Indicators and Comparing Predictions with the DL Classifier. Generic early warning indicators are computed using the Python package ewstools (20), which implements established methods (51). This first involves detrending the time series to obtain residual dynamics. This is done using Lowess smoothing (52) with span 0.2 and degree 1 unless stated otherwise. Variance and lag-1 AC are then computed over a rolling window of length 0.5. To assess the presence of an early warning signal, we use the Kendall τ value, which serves as a measure of increasing or decreasing trend. The Kendall τ value at a given time is computed over all of the preceding data.

To compare predictions made between variance, lag-1 AC, and the DL classifier, we use ROC curves. The ROC curve plots the true positive rate vs. the false positive rate as a discrimination threshold is varied. For variance and lag-1 AC, the discrimination threshold is taken as the Kendall τ value, whereas, for the DL classifier, the discrimination threshold is taken as the DL probability.

Data Availability. Thermoacoustic data and machine learning and training set code have been deposited in GitHub (https://github.com/ThomasMBury/deep-early-warnings-pnas). The geochemical data (46) are available at the PANGAEA repository (https://doi.pangaea.de/10.1594/PANGAEA.923197). The paleoclimate data (47) are available from the World Data Center for Paleoclimatology, National Geophysical Data Center, Boulder, Colorado (http://www.ncdc.noaa.gov/paleo/data.html).

ACKNOWLEDGMENTS. We thank Ryan Kinnear for helpful discussions on time series analysis and Rick Hennekam and Gert-Jan Reichart for providing the anoxia dataset. This research was supported by Natural Sciences and Engineering Research Council Discovery Grants to C.T.B. and M.A.
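As an illustration of the null and forced simulations described under Theoretical Models Used for Testing, the following is a minimal sketch of an Euler-Maruyama integration of May's harvesting model with the stated parameter values (r = 1, k = 1, s = 0.1, σ = 0.01, step size 0.01). It is not the authors' code. The function name `simulate_may`, the initial condition `x0`, the run length `t_max`, and the clipping of biomass at zero are assumptions made here for illustration only.

```python
import math
import random


def simulate_may(h_start, h_end, t_max=500.0, dt=0.01, r=1.0, k=1.0,
                 s=0.1, sigma=0.01, x0=0.8, seed=0):
    """Euler-Maruyama integration of May's harvesting model.

    The harvesting rate h is ramped linearly from h_start to h_end.
    Setting h_start == h_end gives a "null" run with fixed parameters.
    """
    rng = random.Random(seed)
    n_steps = int(round(t_max / dt))
    x = x0
    trajectory = []
    for step in range(n_steps):
        # Linear forcing of the bifurcation parameter.
        h = h_start + (h_end - h_start) * step / n_steps
        # Deterministic drift: logistic growth minus saturating harvest.
        drift = r * x * (1.0 - x / k) - h * x ** 2 / (s ** 2 + x ** 2)
        # Additive white noise scales with sqrt(dt) in Euler-Maruyama.
        x += drift * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        # Assumption: clip at zero so biomass stays non-negative.
        x = max(x, 0.0)
        trajectory.append(x)
    return trajectory


if __name__ == "__main__":
    null_run = simulate_may(0.15, 0.15, seed=1)    # h fixed at lower bound
    forced_run = simulate_may(0.15, 0.27, seed=1)  # h ramped past the fold
    print("final null state:", null_run[-1])
    print("final forced state:", forced_run[-1])
```

The null run stays near the upper equilibrium, while the forced run drifts to lower biomass as h approaches the fold at h = 0.26. Whether the forced run fully collapses within the run depends on the ramp speed and the noise realization.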

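The generic indicators described under Computing Early Warning Indicators and Comparing Predictions with the DL Classifier can be sketched in plain Python as follows. This is an illustrative sketch, not the ewstools implementation. It assumes a centered moving average as a simple stand-in for Lowess detrending, interprets the rolling window of 0.5 as a fraction of the series length, and uses the tau-a form of the Kendall statistic against the time index. All function names are hypothetical.

```python
import statistics


def moving_average_residuals(series, half_width):
    """Detrend with a centered moving average (stand-in for Lowess)."""
    n = len(series)
    residuals = []
    for i in range(n):
        lo = max(0, i - half_width)
        hi = min(n, i + half_width + 1)
        residuals.append(series[i] - statistics.fmean(series[lo:hi]))
    return residuals


def lag1_autocorrelation(window):
    """Pearson correlation between the window and itself shifted by one."""
    a, b = window[:-1], window[1:]
    mean_a, mean_b = statistics.fmean(a), statistics.fmean(b)
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # undefined for a constant window; report zero
    return cov / (var_a * var_b) ** 0.5


def rolling_indicators(residuals, window_fraction=0.5):
    """Variance and lag-1 AC over a rolling window of the residuals."""
    width = max(3, int(window_fraction * len(residuals)))
    variances, autocorrs = [], []
    for end in range(width, len(residuals) + 1):
        window = residuals[end - width:end]
        variances.append(statistics.pvariance(window))
        autocorrs.append(lag1_autocorrelation(window))
    return variances, autocorrs


def kendall_tau(values):
    """Kendall tau-a of values against time: +1 if strictly increasing."""
    n = len(values)
    score = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = values[j] - values[i]
            score += (diff > 0) - (diff < 0)
    return score / (n * (n - 1) / 2)


if __name__ == "__main__":
    # Toy residuals whose fluctuations grow over time.
    residuals = [(-1) ** i * i for i in range(20)]
    variances, autocorrs = rolling_indicators(residuals, 0.5)
    print("Kendall tau of variance:", kendall_tau(variances))
```

To mirror the statement that the Kendall τ value at a given time is computed over all preceding data, `kendall_tau` can be applied to successive prefixes of an indicator series, for example `kendall_tau(variances[:t])` for each `t` of at least 2.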
1. S. J. Gould, N. Eldredge, Punctuated equilibrium comes of age. Nature 366, 223–227 (1993).
2. I. Prigogine, Time, structure, and fluctuations. Science 201, 777–785 (1978).
3. S. H. Strogatz, Nonlinear Dynamics and Chaos: With Applications to Physics, Biology, Chemistry, and Engineering (Westview, 2014).
4. J. Wainwright, G. F. R. Ellis, Dynamical Systems in Cosmology (Cambridge University Press, 2005).
5. S. Ostlund, R. Pandit, D. Rand, H. J. Schellnhuber, E. D. Siggia, One-dimensional Schrödinger equation with an almost periodic potential. Phys. Rev. Lett. 50, 1873 (1983).
6. B. T. Grenfell, O. N. Bjørnstad, J. Kappey, Travelling waves and spatial hierarchies in measles epidemics. Nature 414, 716–723 (2001).
7. A. Hastings et al., Transient phenomena in ecology. Science 361, (2018).
8. T. M. Bury, C. T. Bauch, M. Anand, Charting pathways to climate change mitigation in a coupled socio-climate model. PLOS Comput. Biol. 15, e1007000 (2019).
9. E. H. van Nes et al., What do you mean, ‘tipping point’? Trends Ecol. Evol. 31, 902–904 (2016).
10. Y. A. Kuznetsov, Elements of Applied Bifurcation Theory (Applied Mathematical Sciences, Springer Science & Business Media, 2013), vol. 112.
11. S. A. Campbell, “Calculating centre manifolds for delay differential equations using maple” in Delay Differential Equations, T. Erneux, Ed. (Springer, 2009), pp. 1–24.
12. T. Oraby, V. Thampi, C. T. Bauch, The influence of social norms on the dynamics of vaccinating behaviour for paediatric infectious diseases. Proc. R. Soc. B Biol. Sci. 281, 20133172 (2014).
13. C. Wissel, A universal law of the characteristic return time near thresholds. Oecologia 65, 101–107 (1984).
14. M. Scheffer et al., Anticipating critical transitions. Science 338, 344–348 (2012).
15. C. Boettiger, N. Ross, A. Hastings, Early warning signals: The charted and uncharted territories. Theor. Ecol. 6, 255–264 (2013).
16. S. Kéfi, V. Dakos, M. Scheffer, E. H. Van Nes, M. Rietkerk, Early warning signals also precede non-catastrophic transitions. Oikos 122, 641–648 (2013).
17. V. Dakos et al., Slowing down as an early warning signal for abrupt climate change. Proc. Natl. Acad. Sci. U.S.A. 105, 14308–14312 (2008).
18. V. L. Butitta, S. R. Carpenter, L. C. Loken, M. L. Pace, E. H. Stanley, Spatial early warning signals in a lake manipulation. Ecosphere 8, e01941 (2017).
19. C. Meisel, C. Kuehn, Scaling effects and spatio-temporal multilevel dynamics in epileptic seizures. PLoS One 7, e30371 (2012).
20. T. M. Bury, C. T. Bauch, M. Anand, Detecting and distinguishing tipping points using spectral early warning signals. J. R. Soc. Interface 17, 20200482 (2020).
21. S. R. Carpenter, W. A. Brock, Early warnings of regime shifts in spatial dynamics using the discrete Fourier transform. Ecosphere 1, 1–15 (2010).
22. K. Wiesenfeld, Virtual Hopf phenomenon: A new precursor of period-doubling bifurcations. Phys. Rev. A Gen. Phys. 32, 1744–1751 (1985).
23. H. I. Fawaz, G. Forestier, J. Weber, L. Idoumghar, P.-A. Muller, Deep learning for time series classification: A review. Data Min. Knowl. Discov. 33, 917–963 (2019).
24. M. Scheffer, S. R. Carpenter, V. Dakos, E. H. van Nes, Generic indicators of ecological resilience: Inferring the chance of a critical transition. Annu. Rev. Ecol. Evol. Syst. 46, 145–167 (2015).
25. R. Mutegeki, D. S. Han, “A CNN-LSTM approach to human activity recognition” in 2020 International Conference on Artificial Intelligence in Information and Communication (ICAIIC), J. H. Lee, S. Park, Eds. (Institute of Electrical and Electronics Engineers, 2020), pp. 362–366.
26. A. Vidal, W. Kristjanpoller, Gold volatility prediction using a CNN-LSTM approach. Expert Syst. Appl. 157, 113481 (2020).
27. M. W. Adamson, J. H. P. Dawes, A. Hastings, F. M. Hilker, Forecasting resilience profiles of the run-up to regime shifts in nearly-one-dimensional systems. J. R. Soc. Interface 17, 20200566 (2020).
28. R. M. May, Thresholds and breakpoints in ecosystems with a multiplicity of stable states. Nature 269, 471–477 (1977).
29. M. L. Rosenzweig, R. H. MacArthur, Graphical representation and stability conditions of predator-prey interactions. Am. Nat. 97, 209–223 (1963).
30. I. Pavithran, R. I. Sujith, Effect of rate of change of parameter on early warning signals for critical transitions. Chaos 31, 013116 (2021).
31. R. Hennekam et al., Early-warning signals for marine anoxic events. Geophys. Res. Lett. 47, e2020GL089183 (2020).
32. V. Dakos, S. R. Carpenter, E. H. van Nes, M. Scheffer, Resilience indicators: Prospects and limitations for early warnings of regime shifts. Philos. Trans. R. Soc. Lond. B Biol. Sci. 370, 20130263 (2015).
33. C. Boettiger, A. Hastings, Early warning signals and the prosecutor’s fallacy. Proc. R. Soc. B Biol. Sci. 279, 4734–4739 (2012).
34. S. M. O’Regan, E. B. O’Dea, P. Rohani, J. M. Drake, Transient indicators of tipping points in infectious diseases. J. R. Soc. Interface 17, 20200094 (2020).
35. M. S. Williamson, T. M. Lenton, Detection of bifurcations in noisy coupled systems from multiple time series. Chaos 25, 036407 (2015).
36. M. Baurmann, T. Gross, U. Feudel, Instabilities in spatially extended predator-prey systems: Spatio-temporal patterns in the neighborhood of Turing-Hopf bifurcations. J. Theor. Biol. 245, 220–229 (2007).
37. J. Zou, L. Schiebinger, AI can be sexist and racist—It’s time to make it fair. Nature 559, 324–326 (2018).
38. N. Boers, Early-warning signals for Dansgaard-Oeschger events in a high-resolution ice core record. Nat. Commun. 9, 2556 (2018).
39. R. Wang et al., Flickering gives early warning signals of a critical transition to a eutrophic lake state. Nature 492, 419–422 (2012).
40. J. Pathak, B. Hunt, M. Girvan, Z. Lu, E. Ott, Model-free prediction of large spatiotemporally chaotic systems from data: A reservoir computing approach. Phys. Rev. Lett. 120, 024102 (2018).
41. C. T. Bauch et al., Early warning signals of regime shifts in coupled human–environment systems. Proc. Natl. Acad. Sci. U.S.A. 113, 14560–14567 (2016).
42. P. Virtanen et al., SciPy 1.0: Fundamental algorithms for scientific computing in Python. Nat. Methods 17, 261–272 (2020).
43. E. J. Doedel et al., Auto-07p: Continuation and bifurcation software for ordinary differential equations. https://re.public.polimi.it/handle/11311/560353. Accessed 7 September 2021.
44. C. Truong, L. Oudre, N. Vayatis, Selective review of offline change point detection methods. Signal Processing 167, 107299 (2020).

45. D. A. Pananos et al., Critical dynamics in population vaccinating behavior. Proc. Natl. Acad. Sci. U.S.A. 114, 13762–13767 (2017).
46. R. Hennekam et al., Calibrated XRF-scanning data (mm resolution) and calibration data (ICP-OES and ICP-MS) for elements Al, Ba, Mo, Ti, and U in Mediterranean cores MS21, MS66, and 64PE406E1. PANGAEA. https://doi.org/10.1594/PANGAEA.923197. Accessed 21 September 2020.
47. K. Hughen et al., Cariaco Basin 2000 deglacial 14C and grey scale data. IGBP PAGES/World Data Center A for Paleoclimatology Data Contribution Series #2000-069. NOAA/NGDC Paleoclimatology Program. https://www.ncei.noaa.gov/access/metadata/landing-page/bin/iso?id=noaa-ocean-5853. Accessed 7 January 2021.
48. J. R. Petit et al., Vostok ice core data for 420,000 years. IGBP PAGES/World Data Center for Paleoclimatology Data Contribution Series #2001-76. ftp://ftp.ncdc.noaa.gov/pub/data/paleo/icecore/antarctica/vostok/dustnat.txt. Accessed 7 January 2021.
49. R. Alley, GISP2 Ice core temperature and accumulation data, IGBP PAGES/World Data Center for Paleoclimatology Data Contribution Series no. 2004-013. NOAA/NGDC Paleoclimatology Program. ftp://ftp.ncdc.noaa.gov/pub/data/paleo/icecore/greenland/summit/gisp2/isotopes/gisp2_temp_accum_alley2000.txt. Accessed 7 January 2021.
50. A. Tripati et al., Eocene greenhouse-icehouse transition carbon cycle Data. IGBP PAGES/World Data Center for Paleoclimatology Data Contribution Series no. NOAA/NGDC Paleoclimatology Program. https://www.ncei.noaa.gov/pub/data/paleo/contributions_by_author/tripati2005/tripati2005.txt. Accessed 7 January 2021.
51. V. Dakos et al., Methods for detecting early warnings of critical transitions in time series illustrated using simulated ecological data. PLoS One 7, e41010 (2012).
52. W. S. Cleveland, Robust locally weighted regression and smoothing scatterplots. J. Am. Stat. Assoc. 74, 829–836 (1979).
