
Hiscock BMC Bioinformatics (2019) 20:214

https://doi.org/10.1186/s12859-019-2788-3

METHODOLOGY ARTICLE Open Access

Adapting machine-learning algorithms to design gene circuits

Tom W. Hiscock 1,2

Correspondence: [email protected]
1 Cancer Research UK Cambridge Institute, Li Ka Shing Centre, Robinson Way, Cambridge CB2 0RE, UK
2 Wellcome Trust/Cancer Research UK Gurdon Institute, University of Cambridge, Cambridge, UK

© The Author(s). 2019. Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).

Abstract
Background: Gene circuits are important in many aspects of biology, and perform a wide variety of different
functions. For example, some circuits oscillate (e.g. the cell cycle), some are bistable (e.g. as cells differentiate), some
respond sharply to environmental signals (e.g. ultrasensitivity), and some pattern multicellular tissues (e.g. Turing’s
model). Often, one starts from a given circuit, and using simulations, asks what functions it can perform. Here we
want to do the opposite: starting from a prescribed function, can we find a circuit that executes this function?
Whilst simple in principle, this task is challenging from a computational perspective, since gene circuit models are
complex systems with many parameters. In this work, we adapted machine-learning algorithms to significantly
accelerate gene circuit discovery.
Results: We use gradient-descent optimization algorithms from machine learning to rapidly screen and design
gene circuits. With this approach, we found that we could rapidly design circuits capable of executing a range of
different functions, including those that: (1) recapitulate important in vivo phenomena, such as oscillators, and (2)
perform complex tasks for synthetic biology, such as counting noisy biological events.
Conclusions: Our computational pipeline will facilitate the systematic study of natural circuits in a range
of contexts, and allow the automatic design of circuits for synthetic biology. Our method can be readily
applied to biological networks of any type and size, and is provided as an open-source and easy-to-use
Python module, GeneNet.
Keywords: Gene circuits, Machine learning, Numerical screens

Background
Biological networks – sets of carefully regulated and interacting components – are essential for the proper functioning of biological systems [1, 2]. Networks coordinate many different processes within a cell, facilitating a vast array of complex cell behaviors that are robust to noise yet highly sensitive to environmental cues. For example, transcription factor networks program the differentiation of cells into different cell types [3–5], orchestrate the patterning of intricate structures during development [6, 7], and allow cells to respond to dynamic and combinatorial inputs from their external environment [8, 9]. In addition to transcriptional regulation, many other processes form biological networks, including protein-protein interactions [10], post-translational modifications [11], phosphorylation [12] and metabolism [13, 14].

Understanding how these networks execute biological functions is central to many areas of modern biology, including cell biology, development and physiology. Whilst the network components differ between these disciplines, the principles of network function are often remarkably similar. This manifests itself as the recurrence of common network designs (“network motifs”) in transcriptional, metabolic, neuronal and even social networks [15–19]. For example, negative feedback is a network design that achieves homeostasis and noise resilience, whether that be in the regulation of glucose levels, body temperature [20], stem cell number [21], or gene expression levels [22].

A major challenge to understanding and ultimately engineering biological networks is that they are complex dynamical systems, and therefore difficult to predict and highly non-intuitive [23]. Consequently, for anything other than the simplest networks, verbal descriptions are insufficient, and we rely on computational models, combined with quantitative data, to make progress. For example, after decades of genetics, biochemistry, quantitative microscopy and mathematical modeling, a fairly complete, and predictive, description of Drosophila anterior-posterior patterning is emerging [24–29].
Quantitative approaches have also proven useful in the rational design of circuits for synthetic biology [30, 31]. If we are to successfully engineer biological processes (e.g. for the production of biomaterials, for use as biosensors in synthetic biology, or for regenerative medicine [31, 32]), then we need clear design principles to construct networks. This requires formal rules that determine which components to use and how they should interact. Mathematical models, combined with an expanding molecular biology toolkit, have enabled the construction of gene circuits that oscillate [33], have stable memory [34], or form intricate spatial patterns [35].

One approach to analyze and design gene circuits is to propose a network and, through computational analysis, ask whether it (i) fits some observed data, or (ii) performs some desired function; and if not, modify parameters until it can. For example, Elowitz and Leibler [33] proposed the repressilator circuit, used simulations to show it should oscillate, and demonstrated its successful operation in E. coli. This approach critically relies on starting with a “good” network, i.e. one that is likely to succeed. How do you choose a “good” network? In the study of natural networks, this can be guided by what is known mechanistically about the system (e.g. from chromatin immunoprecipitation sequencing data). However, often the complete network is either unknown or too complicated to model, and therefore researchers must make an educated guess as to which parts of the network are relevant. For synthetic circuits, one can emulate natural designs and/or use intuition and mathematical modeling to guide network choice. In both cases, these approaches start from a single network – either based on some understanding of the mechanism, or on some intuition of the researcher, or both – and then ask what function this network performs. (Note, throughout we use the term “function” to refer to two things: (1) some real biological function, e.g. patterning the Drosophila embryo, or (2) some engineered function in synthetic biology, e.g. an oscillator circuit.)

Here we aim to do the exact opposite, namely to ask: given a prescribed function, what network(s) can perform this function? Equivalently, this means considering the most general network architecture possible (i.e. all genes can activate/repress all other genes), and then determining for what parameters (i.e. what strengths of activation/repression) the network executes the desired function. Such numerical screens have discovered a wide range of interesting gene circuits, including: fold change detectors [36], robust oscillators [37], stripe-forming motifs [38], polarization generators [39], robust morphogen patterning [40], networks that can adapt [41], gradients that scale [42] and biochemical timers [43].

These studies demonstrate that unbiased and comprehensive in silico screens of gene circuits can generate novel and useful insights into circuit function. However, the drawback of such an approach is that it is computationally expensive, and becomes prohibitively slow as the network size is increased, due to the high dimensional parameter spaces involved. For example, consider a gene circuit consisting of N genes, where each gene can activate or repress any other gene. There are then N² interactions in this network, i.e. at least N² parameters. It is therefore challenging to scan through this high dimensional parameter space to find parameter regimes where the network performs well.

This motivates more efficient algorithms to search through parameter space. One example is Monte Carlo methods and their extensions, which randomly change parameters and then enrich for changes that improve network performance [44, 45]. Another approach that has had great success is evolutionary algorithms [46]. Here, populations of gene circuits are ‘evolved’ in a process that mimics natural selection in silico, whereby at each step of the algorithm, there is (1) selection of the ‘fittest’ networks (those that best perform the desired function), followed by (2) mutation / random changes to the circuit parameters. Evolutionary algorithms have been successfully used to design circuits that exhibit oscillations, bistability, biochemical adaptation and even form developmental patterns [47–51].

Here we designed an alternative approach inspired by gradient-descent algorithms, which underpin many of the advances in modern machine learning. We find that such approaches can significantly accelerate the computational screening of gene circuits, allowing for the design of larger circuits that can perform more complex functions. In machine learning, complex models (typically ‘neural networks’) with a large number of parameters (typically millions) are fit to data to perform some prescribed function [52, 53]. For example, in computer vision, this function could be to detect a human face in a complex natural scene [54]. Many of the successes in machine learning have been underpinned by advances in the algorithms to fit parameters to data in high dimensions. Central to these algorithms is the principle of “gradient descent”, where instead of exhaustively screening parameter space, or randomly moving within it, parameters are changed in the direction that most improves the model performance [55]. An analogy for gradient descent is to imagine you are walking on a mountain range in the fog and wish to descend quickly. An effective strategy is to walk in the direction of steepest downhill, continuously changing direction as the terrain varies, until you reach the base. Analogously, gradient descent works by iteratively changing parameters in the “steepest” direction with respect to improving model performance.
A major challenge is to efficiently compute these directions in high dimensions. This relies on being able to differentiate the outputs of a complex model with respect to its many parameters. A key advance in this regard has been to perform differentiation automatically using software packages such as Theano [56] and Tensorflow [57]. Here, gradients are not calculated using pen and paper, but instead algorithmically, and therefore can be computed for models of arbitrary complexity.

We realized that training neural networks is in many ways similar to designing biological circuits. Specifically, we start with some prescribed function (or data), and we then must fit a model with a large number of parameters to perform the function (fit the data). We thus reasoned that we could use exactly the same tools as in machine learning to design gene circuits, namely advanced gradient descent, Adam [58], to fit parameters, and automatic differentiation with Theano/Tensorflow to calculate gradients. We found that such an approach could effectively and rapidly generate circuits that perform a range of different functions, using a fairly simple Python module, “GeneNet”, which we make freely available.

Results
Algorithm overview
We seek an algorithm that can robustly fit the parameters of complex gene circuit models. We start by considering a simple, but generic, model of a transcriptional gene circuit that has been well-studied across a range of biological contexts [26, 38, 59]. (Later, we will show that our algorithm works just as well for different models.) The model comprises N transcription factors, whose concentrations are represented by the N-component vector, y. We assume that all interactions between genes are possible; this is parameterized by an N x N matrix, W. Thus each element Wij specifies how the transcription of gene i is affected by gene j – if Wij is positive, then j activates i; if Wij is negative, then j inhibits i. We further assume that each gene is degraded with rate ki. Together this specifies an ordinary differential equation (ODE) model of the network:

$$\frac{dy_i}{dt} = \phi\left(\sum_j W_{ij} y_j + I_i\right) - k_i y_i \qquad (1)$$

Here, ϕ(x) is a nonlinear function, ensuring that transcription rates are always positive and saturate at high levels of the input. The task of network design is thus to find the parameters W and k such that the network operates as desired, translating the input I to a specified output.

To fit Eq. 1, we start with an analogy to neural networks. Neural networks are highly flexible models with large numbers of parameters that are capable of performing functions of arbitrary complexity [60], whose parameters must be fit to ensure the network performs some designed function. In fact, this correspondence is more than an analogy if one considers recurrent neural networks (RNNs) [61]. RNNs differ from canonical feedforward neural networks in that connections between the nodes form a directed cycle, allowing the representation of dynamic behavior (e.g. the leaky-integrate-and-fire RNN model, which is similar to Eq. 1). Therefore, we wondered whether the algorithms used to fit RNNs could be straightforwardly adapted to fit gene circuit parameters.

To do this, we start with the simplest example, where we wish the network to compute some input-output function, y = f(x). In this case, we allow one of the genes, y_1 ≡ x, to respond to external input, and examine the output of another gene, y_N ≡ y. We then define a “cost”, C, which tracks how closely the actual output of the network, y, matches the desired output of the network, ŷ. First, this involves specifying what the desired output is; as our first example, we consider the case where we want the network output to respond in an ultrasensitive, switch-like manner to the level of some input, x, i.e. y = 0 for x < x* and y = 1 for x > x*, as in Fig. 1a. Then, we choose the mean squared error as the form of our cost, i.e.

$$C = \sum_x \left(y(x) - \hat{y}(x)\right)^2$$

The goal is then to find the parameters that minimize this cost and therefore specify the network that best gives the desired output. To do this rapidly and efficiently in high dimensional parameter spaces, we use gradient descent. In gradient descent, parameters $p_i$ are updated in the direction that maximally reduces the cost, i.e. $\delta p_i = -lr \, \partial_{p_i} C$, where $lr$ is the learning rate. Intuitively, for a two-dimensional parameter set, this corresponds to moving directly “downhill” on the cost surface (Fig. 1a).

For classic gradient descent, the learning rate lr is set to a constant value. However, this can present problems when optimizing a highly complex cost function in high dimensions. Intuitively, and as shown in Fig. 2, we would like the learning rate to adapt as optimization proceeds (and also to be different for each parameter). A more sophisticated version of gradient descent, Adaptive moment estimation, or Adam [58], has been established to overcome these difficulties and is widely used to fit complex neural network models [62]. For this reason, we choose Adam as our default optimization algorithm, which we will later show to be effective for the examples considered in this manuscript.
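To make the pieces concrete – the ODE model (Eq. 1), the switch-like target, the mean squared error cost, and the Adam update – the following is a minimal sketch of the approach in modern TensorFlow 2. It is illustrative rather than the released GeneNet code (which uses Theano/TensorFlow 1); the network size, step size, choice of a sigmoid for ϕ, the set of trained parameters (W only) and the learning rate are all assumptions made for the example.

```python
import tensorflow as tf

N, T, DT = 3, 200, 0.05                           # genes, Euler steps, step size (illustrative)
W = tf.Variable(0.1 * tf.random.normal((N, N)))   # interaction weights, to be learned
k = tf.ones(N)                                    # degradation rates, held fixed here

def simulate(x):
    # Euler-integrate Eq. 1 for a batch of scalar input levels x (shape [B]);
    # gene 1 receives the input, gene N is read out as the output.
    y = 0.1 * tf.ones((tf.shape(x)[0], N))
    I = tf.stack([x] + [tf.zeros_like(x)] * (N - 1), axis=1)  # input drives gene 1 only
    for _ in range(T):
        dydt = tf.sigmoid(y @ tf.transpose(W) + I) - k * y    # sigmoid stands in for phi
        y = y + DT * dydt
    return y[:, -1]                               # final level of gene N

x = tf.linspace(0.0, 2.0, 20)                     # training inputs
y_hat = tf.cast(x > 1.0, tf.float32)              # desired ultrasensitive, switch-like output

opt = tf.keras.optimizers.Adam(learning_rate=0.1)
for step in range(500):
    with tf.GradientTape() as tape:
        cost = tf.reduce_sum((simulate(x) - y_hat) ** 2)      # mean squared error cost
    grads = tape.gradient(cost, [W])              # autodiff straight through the solver
    opt.apply_gradients(zip(grads, [W]))
```

Because the Euler loop is built entirely from differentiable TensorFlow operations, `tape.gradient` differentiates straight through the solver – the key trick elaborated below.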
Fig. 1 Overview of GeneNet. a The optimization algorithm consists of three parts: defining a cost function (left), updating parameters to minimize the cost via gradient descent (middle), and analyzing the learned networks (right). b Regularization selects networks with varying degrees of complexity. c Final design of an ultrasensitive switch. Upper: the final values of each of the three genes as a function of different levels of input. Lower: time traces illustrating network dynamics for three representative values of the input level

Minimizing the cost is, in principle, exactly the same whether training neural networks or screening gene circuits. The difference arises when computing the gradient of the cost function with respect to the parameters. For the algebraic equations in feedforward neural networks, the computation is fairly straightforward and can be written down explicitly. For a gene circuit ODE model, however, this is much more difficult. One approach is to estimate gradients using a finite difference procedure [63, 64], in which you compare the model output when each of the system parameters is changed by a small amount. Alternatively, forward sensitivity analysis specifies the time-evolution of the gradients as a set of coupled ODEs. However, both these approaches scale poorly as the number of parameters increases [63].

We realized that machine learning libraries, such as Theano and Tensorflow, could provide a much faster and more direct way of computing gradients, since they permit automatic (or “algorithmic”) differentiation of computer programs. Specifically, by implementing a simple differential equation solver in Theano/Tensorflow, we can “differentiate” the solver in a single line of code and thereby compute gradients rapidly, even in high dimensions. Moreover, whilst Eq. 1 resembles a neural network model, this is not at all necessary. Rather, any dynamics that can be specified algorithmically (in code) can be accommodated. The general procedure is to write an ODE solver/simulator for the model in Theano/Tensorflow code. Since the solver algorithm consists of many elementary operations (addition, multiplication) combined, each of which can be differentiated, the entire solver can be differentiated by automatically combining these using the product and chain rules. Automatic differentiation in Theano/Tensorflow is analogous to calculating the adjoint state in adjoint sensitivity analysis [63], but with the advantage that it can be algorithmically calculated for any model.

Together, the gene network model, the cost function, and the gradient descent algorithm define a procedure to design gene circuits (see Methods). We first tested our pipeline by asking it to generate an ultrasensitive switch, a circuit that is seen in vivo [65], and has also been rationally engineered [66]. Indeed, we find that as we step through repeated iterations of the gradient descent, we efficiently minimize the cost function, and so generate parameters of a gene network model that responds sensitively to its input (Fig. 1a).
Fig. 2 Adam is an effective gradient descent algorithm for ODEs. a Using a constant learning rate in gradient descent creates difficulties in the optimization process. If the learning rate is too low (upper schematic), the algorithm ‘gets stuck’ in plateau regions with a shallow gradient (saddle points in high dimensions). If instead the learning rate is too high (lower schematic), important features are missed and/or the learning algorithm won’t converge. b An adaptive learning rate substantially improves optimization (schematic). Intuitively, learning speeds up when traversing a shallow, but consistent, gradient. c Cost minimization plotted against iteration number, comparing classic gradient descent (red) with the Adam algorithm (blue). Left: an ultrasensitive switch. Right: a medium pass filter

To train this circuit, we have used a sophisticated version of gradient descent, Adam. Could a simpler algorithm – namely classic gradient descent with a constant learning rate – also work? As shown in Fig. 2, we find that we can generate ultrasensitive switches using classic gradient descent, albeit more slowly and after some fine tuning of the learning rate parameter. We emphasize that, in contrast, Adam works well with default parameters. Furthermore, we find that as we consider functions of increasing complexity, classic gradient descent fails to learn, whilst Adam still can. These results echo findings from the machine learning community, where Adam (and other related algorithms) significantly outperforms classic gradient descent [58].
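In the training sketch above, testing this comparison is a one-line change of optimizer; the learning rates below are illustrative only:

```python
# Classic gradient descent: one constant learning rate shared by all
# parameters, which typically needs manual tuning for each problem.
opt = tf.keras.optimizers.SGD(learning_rate=0.01)

# Adam: adaptive, per-parameter learning rates; robust with default settings.
opt = tf.keras.optimizers.Adam()
```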
Whilst the network in Fig. 1a performs well, its main drawback is that it is complicated and has many parameters. Firstly, this makes it difficult to interpret exactly what the network is doing, since it is not obvious which interactions are critical for forming the switch. Secondly, it would make engineering such a network more complicated. Therefore, we modified the cost in an attempt to simplify gene networks, inspired by the techniques of “regularization” used to simplify neural networks [61]. Specifically, we find that if we add the L1 norm of the parameter set to the cost, i.e.

$$C = \sum_x \left(y(x) - \hat{y}(x)\right)^2 + \lambda \sum_{p_i} |p_i|$$

we can simplify networks without significantly compromising their performance. Intuitively, the extra term penalizes models that have many non-zero parameters, i.e. more complex models [67]. By varying the strength of the regularization, λ, we find networks of varying degrees of complexity (Fig. 1b).

The final output of our algorithm is a simplified gene network that defines a dynamical system whose response is a switch-like function of its inputs (Fig. 1c). Therefore, we have demonstrated that machine-learning algorithms can successfully train gene networks to perform a certain task. In the remainder of this work, we show the utility of our pipeline by designing more realistic and complex biological circuits.

Applications
First, we consider three design objectives for which there already exist known networks, so that we can be sure our algorithm is working well. We find that we can rapidly and efficiently design gene circuits for each of the three objectives by modifying just a few lines of code that specify the objective, and without changing any details or parameters of the learning algorithm. Further, we can screen for functional circuits within several minutes of compute time on a laptop. In each case, the learned network is broadly similar to the networks described previously, lending support to our algorithm.

French-flag circuit
The first design objective is motivated by the French-Flag model of patterning in developmental biology [68]. Here, a stripe of gene expression must be positioned at some location within a tissue or embryo, in response to the level of some input. This input is typically a secreted molecule, or “morphogen”, which is produced at one location, and forms a gradient across the tissue. In order to form a stripe, cells must then respond to intermediate levels of the input. To identify gene circuits capable of forming stripes, we ran our algorithm using the desired final state as shown in Fig. 3a. The algorithm converges on a fairly simple network, where the input directly represses, and indirectly activates, the output, thus responding at intermediate levels of the input (Fig. 3a). Exactly the same network was described in a large-scale screen of stripe-forming motifs [38], and has been observed in early Drosophila patterning [69], suggesting that our learned design may be a common strategy.

Pulse detection
In our second example, we consider a more complicated type of input, namely pulses of varying duration. In many cases, cells respond not just to the level of some input, but also to its duration [70, 71]. We sought a circuit design to measure duration, such that once an input exceeding a critical duration is received, the output is irreversibly activated (Fig. 3b). As before, by changing a few lines of code, and within a few minutes of laptop compute time, we can efficiently design such a circuit (Fig. 3b). This circuit shares features (such as double inhibition and positive feedback) with networks identified in a comprehensive screen of duration detection motifs [43].

Oscillator
Our third example takes a somewhat different flavor: instead of training a network to perform a specific input/output function, we train a network to self-generate a certain dynamical pattern – oscillations [72, 73]. In this case, the cost is not just dependent on the final state, but on the entire dynamics, and is implemented by the equation

$$C = \sum_t \left(y(t) - A\cos(\omega t)\right)^2$$

where A is the amplitude and ω the frequency of the oscillator. Minimizing this cost yields a network that gives sustained oscillations, and is reminiscent of the repressilator network motif that first demonstrated synthetic oscillations [33]. Interestingly, when plotting how the cost changed in the minimization algorithm, we saw a precipitous drop at a certain point (Additional file 1: Figure S1), demonstrating visually the transition through a bifurcation to produce oscillations.

Extensions
Networks of increased size
To illustrate the scalability of GeneNet, we considered larger networks (up to 9 nodes, with 81 parameters) and asked whether they could also be successfully screened. As shown in Fig. 4a, we find that the median number of iterations required to train a French Flag circuit is largely insensitive to the network size, demonstrating that GeneNet is scalable to models of increased complexity.

More complex / realistic ODE models
Whilst Equation 1 is a good description for transcriptional gene circuits [38], we asked whether different molecular interactions and model complexities could be incorporated into our pipeline.
Fig. 3 Using GeneNet to learn gene circuits. a A French-Flag circuit responds to intermediate levels of input to generate a stripe. The red node corresponds to the output gene, and the blue node the input. b A “duration-detector” which is irreversibly activated when stimulated by pulses exceeding a certain critical duration. As above, the red node corresponds to the output gene, and the blue node the input. c An oscillator

One simple extension is to allow pairs of molecules to bind, facilitating their degradation; this has been shown to be useful when evolving oscillator circuits in silico [48]. To include dimerization-based decay, we modified Eq. 1 to the following:

$$\frac{dy_i}{dt} = \phi\left(\sum_j W_{ij} y_j\right) - k_i y_i - \sum_j \Gamma_{ij} y_i y_j \qquad (2)$$

Here, Γij is a symmetric matrix that represents the degradation rate of each possible dimer pair. We find that, without modifying the optimization algorithm, we can train this more complicated gene circuit model; an example output for training an oscillator is shown in Fig. 4b.

Another possibility is that the additive input model specified in Eq. 1 must be extended, since gene circuits often rely on the coincidence of multiple independent events. Therefore, we considered a rather different type of circuit, built of multiple, independent repressors. In this case, circuit dynamics are described by a product of Hill functions:

$$\frac{dy_i}{dt} = \prod_j \frac{1}{1 + \left(W_{ij} y_j\right)^n} + I_i - k_i y_i \qquad (3)$$

where we set the Hill coefficient, n, to be 3. Again, without modifying either the structure or the parameters of the learning algorithm, GeneNet is capable of designing a co-operative, repressor-only medium pass (French Flag) circuit (Fig. 4c).

Together, these results suggest that GeneNet can be straightforwardly adapted to work on a range of different biological models.

Alternative cost functions
The final extension we consider is the form of the cost function. We have so far focused on the mean squared error as our measure of the model’s goodness-of-fit. However, there are other options, and work from the field of evolutionary algorithms suggests that the choice of cost function can have an impact on the efficacy of circuit design [46]. To determine if GeneNet can be modified to work with different cost functions, we trained a switch-like circuit using the negative cross-entropy cost function, commonly used in image classification problems. Again, without changing the optimization algorithm, we can efficiently learn circuit parameters (Fig. 4d).
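Swapping the design objective in this framework amounts to swapping the cost function. As a hypothetical illustration (one common form of this cost, not the paper's code), a binary cross-entropy cost for the switch-like target could replace the mean squared error in the earlier training sketch:

```python
def cross_entropy_cost(y, y_hat, eps=1e-7):
    # Cross-entropy between the network output y (clipped into (0, 1) for
    # numerical safety) and the desired binary output y_hat; like the mean
    # squared error, lower values indicate a better fit.
    y = tf.clip_by_value(y, eps, 1.0 - eps)
    return -tf.reduce_sum(y_hat * tf.math.log(y)
                          + (1.0 - y_hat) * tf.math.log(1.0 - y))
```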
Fig. 4 Performance and generality of GeneNet. a Scalability of GeneNet. Left: schematic of N(C) – the number of iterations required to reach a given network performance. Right: for the same desired circuit function as in Fig. 3a, we train networks of varying sizes and provide a boxplot of the number of iterations required to achieve a cost, C = 0.4. The median is roughly constant. b Example output after training a 2-node network oscillator using the equation provided. c Example output, and circuit, after training a 3-node French Flag circuit using the independent repressor circuit. d A switch-like network is learned by minimizing the negative cross-entropy function. e Comparing computational efficiencies of different circuit design algorithms: GeneNet, evolutionary algorithms and comprehensive enumeration. Each algorithm is run 10 times; the shaded area corresponds to the mean ± standard deviation of the cost value. We see that the cost is rapidly, and reproducibly, minimized by GeneNet

Comparison to other algorithms
As outlined in the introduction, there are several other methods for in silico circuit design, including: (1) comprehensive enumeration of circuit parameters / topologies; and (2) evolutionary algorithms. How does our method compare?

Firstly, it is worth noting that there are key shared features. For each approach, one defines: (1) an ODE-based model of a gene circuit, and (2) a cost function that selects networks to execute a specific function. Therefore, these methods can be used somewhat interchangeably. However, differences arise in exactly how the algorithms select the networks with high performance.

Comprehensive screens consider all possible parameters and then identify those that (globally) minimize the cost. The advantage of an exhaustive screen is that one can be certain to find the globally optimal solution; however, this comes with the drawback of being computationally expensive, and prohibitively slow for larger gene circuits.

In contrast, evolutionary algorithms iteratively improve circuit performance by randomly mutating parameters across a population of circuits, and retaining the circuits with the lowest cost. This approach can generate functioning circuits with fewer computations than a comprehensive screen, and has the added advantage of providing some insight into how gene circuits might evolve during natural selection.
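For contrast with the gradient-based update, a bare-bones evolutionary loop might look as follows. This is a generic sketch under simple assumptions (Gaussian mutations, truncation selection), not the specific algorithm of [46] nor the paper's Tensorflow comparison implementation; `simulate_cost` is a hypothetical stand-in for any function that simulates a circuit from a parameter vector and returns its cost.

```python
import numpy as np

def evolve(simulate_cost, n_params, pop_size=50, n_keep=10,
           sigma=0.1, n_generations=200):
    # Truncation-selection evolutionary loop: evaluate the population,
    # keep the lowest-cost ("fittest") circuits, and refill the population
    # with mutated copies of the survivors.
    pop = 0.1 * np.random.randn(pop_size, n_params)
    for _ in range(n_generations):
        costs = np.array([simulate_cost(p) for p in pop])
        survivors = pop[np.argsort(costs)[:n_keep]]               # selection
        children = survivors[np.random.randint(n_keep, size=pop_size - n_keep)]
        children = children + sigma * np.random.randn(*children.shape)  # mutation
        pop = np.vstack([survivors, children])
    return pop[np.argmin([simulate_cost(p) for p in pop])]        # best circuit found
```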
GeneNet is neither exhaustive, nor does it provide an insight into the evolvability of gene circuits. However, its key advantage is speed; in particular, its excellent scalability to larger networks (Fig. 4a) and thus a capacity to train circuits to execute more complicated functions. We performed a side-by-side comparison in training a French Flag circuit for each of the three approaches (comprehensive screens, evolutionary algorithms and GeneNet) and, consistent with our expectation, see that GeneNet significantly outperforms the other two in terms of speed (Fig. 4e). To demonstrate the real utility of our approach, we end by considering a more complicated design objective.

A more complex circuit: a robust biological counter
In our final example, we attempt a more ambitious design – a biological counter – to demonstrate that GeneNet can also design circuits to perform more complex computations and functions. Whilst counters are found in some biological systems (such as in telomere length regulation [74]), we focus our aim on designing a counter for synthetic biology. There are numerous applications for such counters, two of which are: (1) a safety mechanism programming cell death after a specified number of cell cycles, and (2) biosensors that non-invasively count the frequency of certain stimuli, particularly low frequency events [75].

We consider an analog counter, where we wish some input set of pulses to result in an output equal (or proportional) to the number of pulses (Fig. 5a). For example, the “input” here could be the level of a cell cycle related protein, to count divisions [76]. As is shown in Fig. 5b, the simplest way to count would be simply to integrate the levels of the input over time. One implementation of this used an analog memory device driven by CRISPR mutagenesis [77], i.e. when the stimulus is present, Cas9 is active and mutations arise. However, a major shortcoming of such a circuit is that it is unreliable and sensitive to the variations in pulse amplitude and duration that are often present (Fig. 5b).

Therefore, we sought to systematically design a novel gene circuit to count pulses that would be robust to their amplitude and duration. To do this, we provided a complex ensemble of input stimuli, each containing a different number of pulses, of varying amplitudes and durations. For each input we then defined the desired output to be equal to the number of pulses present, and trained the network to minimize the mean squared error cost, as before.

Strikingly, this procedure uncovers a network that is highly robust in counting pulse number (Fig. 5c). Looking more deeply into the network, we see that it has learned a very interesting way to count, with two key ingredients. Firstly, it acts as an “off-detector”. Specifically, examining the dynamic time traces reveals that the network responds after the pulse has occurred, as it turns off. Mechanistically, when the input increases, this allows build-up of the purple node and repression of the orange node. However, activation of the downstream green node is only possible once the input pulse has ended, and the repression from the purple node has been alleviated. In this way, the circuit responds to the termination of the pulse, and is thus robust to its duration.

Secondly, the network uses “digital encoding” to be robust to the level of the input. This is achieved by having the green node undergo an “excitable pulse” of stereotyped amplitude and duration, which is then integrated over time by the red node to complete the counter. By using “digital” pulses of activity with an excitable system, the circuit is therefore insensitive to the precise levels of the input. Together, this forms a circuit that reliably counts despite large variations in input stimulus.

We emphasize that these behaviors have not been hard-coded by rational design, but rather have emerged when training the network to perform a complex task. This example therefore shows that a more challenging design objective can be straightforwardly accommodated into our gene network framework, and that it is possible to learn rather unexpected and complex designs.

Discussion
By combining ODE-based models of gene networks with the optimization methods of machine learning, we present an approach to efficiently design gene circuits. Whilst we have focused on gene networks and transcriptional regulation as a proof of principle, our algorithm is rather general and could easily be extended to learn other networks, such as phosphorylation, protein-protein interaction and metabolic networks, so long as they are described by ordinary differential equations. Further, whilst the networks we have focused on are relatively small and have been trained on a personal laptop, our Theano/Tensorflow pipelines can be easily adapted to run much faster on GPUs, and therefore we expect that large networks could also be trained effectively [56].

Our approach could also be extended to incorporate other types of differential equation, such as partial differential equations. This would allow us to understand how networks operate in a spatial, multicellular context, throughout an entire tissue, and thus provide useful insights into how different structures and patterns are formed during development [32]. Other extensions would be to use real data as inputs to the learning algorithm, in which case more sophisticated algorithms would be required to deal with parameter uncertainty [78, 79].
Fig. 5 Designing a robust biological counter using GeneNet. a Desired input/output function of a robust biological counter. b Simply integrating the input over time yields an analog counter with significant error, as shown by the spread in the input/output function (upper) and the relative errors between the two (lower). c GeneNet learns an “off-detector” network (left), with substantially reduced error rates (right). Inspection of gene dynamics (middle) shows that an excitable “digital” pulse of green activity is initiated at the end of the input pulse (shaded blue line)

One drawback of our approach is that it selects only a single gene circuit out of many, and thus may ignore alternative circuits that may also be useful or relevant. A natural extension would therefore be to combine the speed of GeneNet’s parameter optimization with a comprehensive enumeration of different network topologies, thus generating a complete ‘atlas’ of gene circuits [38].

Finally, one concern with machine learning methods is that the intuition behind the models is hidden within a “black box” and opaque to researchers, i.e. the machine, not the researcher, learns. We would like to offer a slightly different perspective. Instead of replacing the researcher, our algorithm acts as a highly efficient way to screen models. In this sense, one shouldn’t view it as a tool to solve problems, but rather as an efficient way to generate new hypotheses. The role of the scientist is then to: (1) cleverly design the screen (i.e. the cost) such that the algorithm can effectively learn the desired function, and (2) carefully analyze the learned circuits and the extent to which they recapitulate natural phenomena. The distinct advantage of our approach over neural networks is that we learn real biological models – gene circuits – which are both directly interpretable as mechanism, and provide specific assembly instructions for synthetic circuits.

Conclusions
In this work, we have developed an algorithm to learn gene circuits that perform complex tasks (e.g. count pulses), compute arbitrary functions (e.g. detect pulses of a certain duration) or resemble some real biological phenomenon (e.g. a French-flag circuit). We have demonstrated that these networks can be trained efficiently on a personal laptop, and require neither fine-tuning of algorithm parameters nor extensive coding to adapt to different network designs. This ease-of-use means that researchers addressing questions in basic biology can quickly generate models and hypotheses to explain their data, without investing a lot of time carefully building and simulating specific models. Equally, our approach should also allow synthetic biologists to rapidly generate circuit designs for a variety of purposes.
Methods
Gene network model
We considered the following set of coupled ODEs as our gene circuit model, motivated by [26, 27, 38]:

$$\frac{dy_i}{dt} = k_i\left[\phi\left(\sum_j W_{ij} y_j + I_i\right) - y_i\right]$$

Here, yi denotes the concentration of gene i, where i = 1 … N for an N-node network. Wij is a matrix corresponding to the network weights: Wij > 0 means that gene j activates gene i, and Wij < 0 means that gene j represses gene i. ki is a vector that represents the (assumed linear) degradation of gene i. Ii is the prescribed input to the system, which we assume directly causes transcription of gene i. The function ϕ is a nonlinearity chosen such that the gene network saturates; we choose the sigmoid ϕ(x) = 1/(1 + exp(−x)), as in [38]. y0i represent the initial conditions of the system. Note, this is a non-dimensionalized version of Eq. 1, whereby gene concentrations are normalized to their maximal level. Note, also, that we add further terms for Fig. 4b and c, as discussed in the text.

Algorithm details
We coded the algorithm using Python v2.7.5 and the machine-learning library Theano v0.8.2, which performs static optimizations for speed and permits automatic differentiation. (We also provide an implementation in Tensorflow.)

The key steps in the algorithm are:

1. Define an ODE solver
We choose a simple Euler integration method, whereby the differential equation $\dot{y} = f(y, t)$ is solved iteratively: $y_{n+1} = y_n + f(y_n, t_n)\,\delta t$. We set δt = 0.01. We implement the solver using the theano.scan feature, which optimizes the computation of loops. Note that it is straightforward to implement other solvers, e.g. stiff solvers or adaptive step sizes.

2. Define the desired function of the network
This consists of two items: (1) a collection of different inputs to train the network on, and (2) for each input, a desired output. For example, in Fig. 3b, we must include inputs of varying durations, and a function that computes whether the pulse exceeds some critical duration. The input can be static in time (as in Fig. 2a), dynamic in time (as in Fig. 3b) or zero (as in Fig. 3c). The desired output can be the final state of the network (as in Fig. 3b), or the complete time dynamics of the network (Fig. 3c). See Additional file 2: Table S1 for further details.

3. Define the cost to be minimized
As discussed in the main text, we use the mean squared error as the cost. Since we are often not concerned with the absolute levels of any gene, but rather the relative levels, we modify the cost such that the network output can be rescaled by a factor A, which is a learnable parameter, i.e. y → Ay. For regularization, we add a term λ∑i,j |Wij| to the cost, with the aim of simplifying the network matrix Wij.

4. Define the parameters that are to be fit
For all simulations, we fit the network weights, Wij, to the data. For Figs. 1, 3a and b, we do not allow ki to change, and instead set ki = 1. For Figs. 3c and 4c, we use the ki as learnable parameters. For Fig. 3c, we must also allow the initial conditions of the network to be learned, such that the oscillator has the correct phase.

5. Define the optimization algorithm
We use the Adam optimizer [58] with parameters lr = 0.1, b1 = 0.02, b2 = 0.001, $\epsilon = 10^{-8}$ in all cases.

6. Initialize the model parameters
We set ki to be one and the network weights to be normally distributed random numbers with mean zero and standard deviation 0.1. Initial conditions (except for Fig. 3c) are set as y0i = 0.1.

7. Train the network
We iteratively perform the optimization procedure to update parameters, initially setting λ = 0. At each step, we train the network using a subset of the total input/output data, using a “batch” of B input/output pairs. The idea is that batching the data in this way adds stochasticity to the optimization and therefore avoids local minima.

8. Regularize the network
Once a network has been trained, we regularize it by increasing λ and re-running the optimization procedure. A range of different λ values is used until a network of desired complexity is achieved.

9. “Prune” the network
In the final step, we retrain a simplified network. Specifically, starting from the regularized network, we set any small network weights to be exactly zero, i.e. $W_{ij}^{(\mathrm{prune})} = 0 \;\; \forall \, i, j : |W_{ij}| < \epsilon$, and then optimize over the remaining weights, using λ = 0.

10. Save parameters
See Additional file 2: Table S1 for the parameter values of the networks learned.
Implementation details
Network design was performed on a MacBook Air, 1.3 GHz Intel Core i5, with 8 GB 1600 MHz DDR3 RAM. We repeated the learning algorithm for each of the designs in the paper several times, with different regularization levels λ, and found similar, or often identical, network topologies to be learned in each case. In the figures we report a representative network, where λ has been chosen manually to give a minimal network that still performs the function well.

Additional file 2: Table S2 gives details of the algorithm implementation specific to the networks learned. For the comparative and extension studies in Fig. 4, we developed a Tensorflow implementation of an evolutionary algorithm and compared its speed to the Tensorflow implementation of GeneNet, using the same cost function. We repeated this for a Tensorflow implementation of a comprehensive screen, for which we randomly sample parameters and retain the (global) minimum cost value. These were performed on a MacBook Pro, 2.7 GHz Intel Core i5, with 8 GB 1867 MHz DDR3 RAM.

Additional files
Additional file 1: Figure S1. Cost minimization. Example traces of the cost minimization during the optimization procedure. Note, in all cases (and particularly in the oscillator), there are sharp drop-offs in cost, which likely reflect bifurcation points in the dynamics. (PDF 177 kb)
Additional file 2: Table S1. Parameter values for networks learned in the main text. Table S2. Algorithm implementation parameters. Table S3. Speed tests. (DOCX 70 kb)

Abbreviations
Adam: Adaptive moment estimation; ODE: Ordinary differential equation; RNN: Recurrent neural network

Acknowledgements
I thank John Ingraham for inspiring this project over breakfast and dinnertime conversations, and for his enthusiasm and generosity in teaching me machine learning. I thank Sean Megason for advice and mentoring.

Funding
This work was supported by an EMBO Long-term Fellowship, ALTF 606–2018. This funding source did not play any role in study design, data collection/analysis, or manuscript preparation.

Availability of data and materials
GeneNet (Theano and Tensorflow versions) is available on github: https://github.com/twhiscock/GeneNet-

Author’s contributions
TWH designed and performed the research, and wrote the paper. The author read and approved the final manuscript.

Ethics approval and consent to participate
Not applicable.

Consent for publication
Not applicable.

Competing interests
The author declares that he has no competing interests.

Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Received: 18 March 2019. Accepted: 2 April 2019.

References
1. Barabasi AL, Oltvai ZN. Network biology: understanding the cell’s functional organization. Nat Rev Genet. 2004;5(2):101–13.
2. Zhu X, Gerstein M, Snyder M. Getting connected: analysis and principles of biological networks. Genes Dev. 2007;21(9):1010–24.
3. Goode DK, et al. Dynamic gene regulatory networks drive hematopoietic specification and differentiation. Dev Cell. 2016;36(5):572–87.
4. Plath K, Lowry WE. Progress in understanding reprogramming to the induced pluripotent state. Nat Rev Genet. 2011;12(4):253–65.
5. Cahan P, et al. CellNet: network biology applied to stem cell engineering. Cell. 2014;158(4):903–15.
6. Davidson EH. Emerging properties of animal gene regulatory networks. Nature. 2010;468(7326):911–20.
7. Rhee DY, et al. Transcription factor networks in Drosophila melanogaster. Cell Rep. 2014;8(6):2031–43.
8. Lopez-Maury L, Marguerat S, Bahler J. Tuning gene expression to changing environments: from rapid responses to evolutionary adaptation. Nat Rev Genet. 2008;9(8):583–93.
9. Mangan S, Alon U. Structure and function of the feed-forward loop network motif. Proc Natl Acad Sci U S A. 2003;100(21):11980–5.
10. Stelzl U, et al. A human protein-protein interaction network: a resource for annotating the proteome. Cell. 2005;122(6):957–68.
11. Minguez P, et al. Deciphering a global network of functionally associated post-translational modifications. Mol Syst Biol. 2012;8:599.
12. Linding R, et al. Systematic discovery of in vivo phosphorylation networks. Cell. 2007;129(7):1415–26.
13. Jeong H, et al. The large-scale organization of metabolic networks. Nature. 2000;407(6804):651–4.
14. Fiehn O. Combining genomics, metabolome analysis, and biochemical modelling to understand metabolic networks. Comp Funct Genomics. 2001;2(3):155–68.
15. Alon U. Network motifs: theory and experimental approaches. Nat Rev Genet. 2007;8(6):450–61.
16. Alon U. Biological networks: the tinkerer as an engineer. Science. 2003;301(5641):1866–7.
17. Shen-Orr SS, et al. Network motifs in the transcriptional regulation network of Escherichia coli. Nat Genet. 2002;31(1):64–8.
18. Milo R, et al. Network motifs: simple building blocks of complex networks. Science. 2002;298(5594):824–7.
19. Rosenfeld N, Elowitz MB, Alon U. Negative autoregulation speeds the response times of transcription networks. J Mol Biol. 2002;323(5):785–93.
20. Simon E, Pierau FK, Taylor DC. Central and peripheral thermal control of effectors in homeothermic temperature regulation. Physiol Rev. 1986;66(2):235–300.
21. Shraiman BI. Mechanical feedback as a possible regulator of tissue growth. Proc Natl Acad Sci U S A. 2005;102(9):3318–23.
22. Lestas I, Vinnicombe G, Paulsson J. Fundamental limits on the suppression of molecular fluctuations. Nature. 2010;467(7312):174–8.
23. Alon U. An Introduction to Systems Biology: Design Principles of Biological Circuits. Chapman & Hall/CRC; 2006.
24. Fowlkes CC, et al. A quantitative spatiotemporal atlas of gene expression in the Drosophila blastoderm. Cell. 2008;133(2):364–74.
25. Gregor T, et al. Stability and nuclear dynamics of the bicoid morphogen gradient. Cell. 2007;130(1):141–52.
26. Jaeger J, et al. Dynamical analysis of regulatory interactions in the gap gene system of Drosophila melanogaster. Genetics. 2004;167(4):1721–37.
27. Jaeger J, et al. Dynamic control of positional information in the early Drosophila embryo. Nature. 2004;430(6997):368–71.
28. Manu, et al. Canalization of gene expression in the Drosophila blastoderm by gap gene cross regulation. PLoS Biol. 2009;7(3):e1000049.
29. Manu, et al. Canalization of gene expression and domain shifts in the Drosophila blastoderm by dynamical attractors. PLoS Comput Biol. 2009;5(3):e1000303.
30. Mukherji S, van Oudenaarden A. Synthetic biology: understanding biological design from synthetic circuits. Nat Rev Genet. 2009;10(12):859–71.
31. Khalil AS, Collins JJ. Synthetic biology: applications come of age. Nat Rev Genet. 2010;11(5):367–79.
32. Davies J. Using synthetic biology to explore principles of development. Development. 2017;144(7):1146–58.
33. Elowitz MB, Leibler S. A synthetic oscillatory network of transcriptional regulators. Nature. 2000;403(6767):335–8.
34. Gardner TS, Cantor CR, Collins JJ. Construction of a genetic toggle switch in Escherichia coli. Nature. 2000;403(6767):339–42.
35. Liu C, et al. Sequential establishment of stripe patterns in an expanding cell population. Science. 2011;334(6053):238–41.
36. Adler M, et al. Optimal regulatory circuit topologies for fold-change detection. Cell Syst. 2017;4(2):171–181 e8.
37. Li Z, Liu S, Yang Q. Incoherent inputs enhance the robustness of biological oscillators. Cell Syst. 2017;5(1):72–81 e4.
38. Cotterell J, Sharpe J. An atlas of gene regulatory networks reveals multiple three-gene mechanisms for interpreting morphogen gradients. Mol Syst Biol. 2010;6:425.
39. Chau AH, et al. Designing synthetic regulatory networks capable of self-organizing cell polarization. Cell. 2012;151(2):320–32.
40. Eldar A, et al. Robustness of the BMP morphogen gradient in Drosophila embryonic patterning. Nature. 2002;419(6904):304–8.
41. Ma W, et al. Defining network topologies that can achieve biochemical adaptation. Cell. 2009;138(4):760–73.
42. Ben-Zvi D, et al. Scaling of the BMP activation gradient in Xenopus embryos. Nature. 2008;453(7199):1205–11.
43. Gerardin J, Lim WA. The design principles of biochemical timers: circuits that discriminate between transient and sustained stimulation. bioRxiv preprint. 2017. https://doi.org/10.1101/100651.
44. Perkins TJ, et al. Reverse engineering the gap gene network of Drosophila melanogaster. PLoS Comput Biol. 2006;2(5):e51.
45. Crombach A, et al. Efficient reverse-engineering of a developmental gene regulatory network. PLoS Comput Biol. 2012;8(7):e1002589.
46. Francois P. Evolving phenotypic networks in silico. Semin Cell Dev Biol. 2014;35:90–7.
47. Francois P, Siggia ED. A case study of evolutionary computation of biochemical adaptation. Phys Biol. 2008;5(2):026009.
48. Francois P, Hakim V. Design of genetic networks with specified functions by evolution in silico. Proc Natl Acad Sci U S A. 2004;101(2):580–5.
49. Francois P, Hakim V, Siggia ED. Deriving structure from evolution: metazoan segmentation. Mol Syst Biol. 2007;3:154.
50. Noman N, et al. Evolving robust gene regulatory networks. PLoS One. 2015;10(1):e0116258.
51. Smith RW, van Sluijs B, Fleck C. Designing synthetic networks in silico: a generalised evolutionary algorithm approach. BMC Syst Biol. 2017;11(1):118.
52. Hinton GE, Salakhutdinov RR. Reducing the dimensionality of data with neural networks. Science. 2006;313(5786):504–7.
53. LeCun Y, Bengio Y, Hinton G. Deep learning. Nature. 2015;521(7553):436–44.
54. Li H, Lin Z, Shen X, Brandt J, Hua G. A convolutional neural network cascade for face detection. In: Proceedings of the IEEE conference on computer vision and pattern recognition; 2015. p. 5325–34.
55. Amari S-i. Backpropagation and stochastic gradient descent method. Neurocomputing. 1993;5(4):185–96.
56. Bergstra J, et al. Theano: a CPU and GPU math compiler in Python. In: Proc. 9th Python in Science Conf. 2010;1:3–10.
57. Abadi M, et al. Tensorflow: large-scale machine learning on heterogeneous distributed systems. arXiv preprint arXiv:1603.04467. 2016.
58. Kingma D, Ba J. Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980. 2014.
59. Molinelli EJ, et al. Perturbation biology: inferring signaling networks in cellular systems. PLoS Comput Biol. 2013;9(12):e1003290.
60. Hornik K, Stinchcombe M, White H. Multilayer feedforward networks are universal approximators. Neural Netw. 1989;2(5):359–66.
61. Goodfellow I, Bengio Y, Courville A. Deep learning. MIT Press; 2016. https://www.deeplearningbook.org/.
62. Ruder S. An overview of gradient descent optimization algorithms. arXiv preprint arXiv:1609.04747. 2016.
63. Frohlich F, et al. Scalable parameter estimation for genome-scale biochemical reaction networks. PLoS Comput Biol. 2017;13(1):e1005331.
64. Uzkudun M, Marcon L, Sharpe J. Data-driven modelling of a gene regulatory network for cell fate decisions in the growing limb bud. Mol Syst Biol. 2015;11(7):815.
65. Tyson JJ, Chen KC, Novak B. Sniffers, buzzers, toggles and blinkers: dynamics of regulatory and signaling pathways in the cell. Curr Opin Cell Biol. 2003;15(2):221–31.
66. Palani S, Sarkar CA. Synthetic conversion of a graded receptor signal into a tunable, reversible switch. Mol Syst Biol. 2011;7:480.
67. Brunton SL, Proctor JL, Kutz JN. Discovering governing equations from data by sparse identification of nonlinear dynamical systems. Proc Natl Acad Sci U S A. 2016;113(15):3932–7.
68. Wolpert L. Positional information and the spatial pattern of cellular differentiation. J Theor Biol. 1969;25(1):1–47.
69. Clyde DE, et al. A self-organizing system of repressor gradients establishes segmental complexity in Drosophila. Nature. 2003;426(6968):849–53.
70. Hopfield JJ. Kinetic proofreading: a new mechanism for reducing errors in biosynthetic processes requiring high specificity. Proc Natl Acad Sci. 1974;71(10):4135–9.
71. Mangan S, Zaslaver A, Alon U. The coherent feedforward loop serves as a sign-sensitive delay element in transcription networks. J Mol Biol. 2003;334(2):197–204.
72. Novak B, Tyson JJ. Design principles of biochemical oscillators. Nat Rev Mol Cell Biol. 2008;9(12):981–91.
73. Stricker J, et al. A fast, robust and tunable synthetic gene oscillator. Nature. 2008;456(7221):516–9.
74. Marcand S, Gilson E, Shore D. A protein-counting mechanism for telomere length regulation in yeast. Science. 1997;275(5302):986–90.
75. Friedland AE, et al. Synthetic gene networks that count. Science. 2009;324(5931):1199–202.
76. Slomovic S, Pardee K, Collins JJ. Synthetic biology devices for in vitro and in vivo diagnostics. Proc Natl Acad Sci U S A. 2015;112(47):14429–35.
77. Perli SD, Cui CH, Lu TK. Continuous genetic recording with self-targeting CRISPR-Cas in human cells. Science. 2016;353(6304):aag0511.
78. Liepe J, et al. A framework for parameter estimation and model selection from experimental data in systems biology using approximate Bayesian computation. Nat Protoc. 2014;9(2):439–56.
79. Calderhead B, Girolami M, Lawrence ND. Accelerating Bayesian inference over nonlinear differential equations with Gaussian processes. Adv Neural Inf Proces Syst. 2009;21:217–24.
