Walk-through example
- 1) Introduction
- 2) Pre-processor include directives
- 3) Read command line parameters
- 4) Instantiate the pseudo-random number generator (PRNG)
- 5) Initialize the population
- 6) Create a lineage and a coalescence tree, and add the roots
- 7) Run the evolutionary algorithm
- 8) Final step: extracting information from the trees
- 9) Results
The objective of puutools is to update the structure of the lineage tree dynamically and efficiently during a simulation, so that the amount of data held in memory is minimized. puutools algorithms have been optimized for this purpose.
First consult the main documentation to install puutools on your device.
puutools is distributed as an external static library. Once installed, the header must be included with the standard #include directive:
#include <puutools.h>
The main structure that will be manipulated is the class puu_tree, which instantiates a dynamic representation of a lineage or coalescence tree. puu_tree is a template class:
template <typename selection_unit>
class puu_tree
where selection_unit can be any class of your own. The only constraint is that its copy constructor must be fully implemented.
For example, if your individual class is Cell, the tree will be instantiated as:
puu_tree<Cell> my_tree;
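To illustrate the copy-constructor constraint, here is a minimal sketch of what such a class could look like. The class Cell below, its members and its methods are hypothetical and chosen only for illustration; they are not part of puutools. The sketch assumes the individual owns a dynamically allocated array, which is the typical case where a fully implemented (deep) copy constructor matters.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical individual class (illustration only, not part of puutools).
class Cell
{
public:
  // Build a cell with a trait value and a zero-initialized genome.
  Cell( double trait, std::size_t genome_size )
  : _trait(trait), _genome_size(genome_size), _genome(new double[genome_size])
  {
    std::fill(_genome, _genome+_genome_size, 0.0);
  }

  // Fully implemented copy constructor: the genome array is duplicated,
  // so the copy stays valid even if the original is modified or deleted.
  Cell( const Cell& other )
  : _trait(other._trait), _genome_size(other._genome_size), _genome(new double[other._genome_size])
  {
    std::copy(other._genome, other._genome+other._genome_size, _genome);
  }

  // Copy assignment is disabled to keep the sketch short.
  Cell& operator=( const Cell& ) = delete;

  ~Cell( void ) { delete[] _genome; }

  double get_trait( void ) const { return _trait; }

  double get_gene( std::size_t i ) const
  {
    assert(i < _genome_size);
    return _genome[i];
  }

  void set_gene( std::size_t i, double value )
  {
    assert(i < _genome_size);
    _genome[i] = value;
  }

private:
  double      _trait;       // phenotypic trait
  std::size_t _genome_size; // number of genes
  double*     _genome;      // owned array, deep-copied by the copy constructor
};
```

With a shallow copy, the original and the copy would share the same array, and deleting one would invalidate the other.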
In this example, we will implement a simple evolutionary simulation algorithm with a constant population of size
We will now walk through puutools classes step by step.
We first include the necessary standard library (std) utilities and the puutools library:
#include <iostream>
#include <fstream>
#include <vector>
#include <tuple>
#include <cstdlib>
#include <ctime>
#include <assert.h>
#include <puutools.h>
We then include three classes that have been pre-implemented on purpose for this tutorial (see the example folder of this repository):
#include "Prng.h"
#include "Individual.h"
#include "Simulation.h"
The Prng class contains several random number generation functions based on the GNU Scientific Library. The class Individual contains the structure of an individual (one phenotypic trait and one fitness value, plus a few methods). This class will be used by puutools to instantiate trees, so it must have a properly defined copy constructor. The class Simulation contains all the code to run an evolutionary simulation.
We need to define five parameters:
- The initial trait value $x_0$;
- The simulation time $T$ (in generations);
- The population size $N$;
- The mutation rate $m$;
- The mutation size $s$.
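Before running a simulation, it can be useful to check that these five parameters are consistent. The following is a minimal sketch, not part of the tutorial code: the struct SimulationParameters and the function are_parameters_valid are hypothetical names, and the accepted ranges are assumptions (in particular, the mutation rate $m$ is assumed to be a probability in $[0, 1]$).

```cpp
#include <cassert>

// Hypothetical container for the five tutorial parameters (illustration only).
struct SimulationParameters
{
  double initial_trait_value; // x_0
  int    simulation_time;     // T, in generations
  int    population_size;     // N
  double mutation_rate;       // m
  double mutation_size;       // s
};

// Return true if all parameters lie in the assumed valid ranges.
// The comparisons are written so that NaN values are rejected.
bool are_parameters_valid( const SimulationParameters& p )
{
  if (p.simulation_time < 0)
  {
    return false; // assumption: T >= 0
  }
  if (p.population_size <= 0)
  {
    return false; // assumption: N > 0
  }
  if (!(p.mutation_rate >= 0.0 && p.mutation_rate <= 1.0))
  {
    return false; // assumption: m is a probability
  }
  if (!(p.mutation_size >= 0.0))
  {
    return false; // assumption: s >= 0
  }
  return true;
}
```

Such a check could be called right after the parameters are read, so that a typo on the command line fails early rather than silently producing a meaningless run.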
Let's implement a piece of code to read our parameters from the command line:
int main( int argc, char const** argv )
{
/*~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~*/
/* 1) Read simulation parameters */
/*~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~*/
assert(argc==6);
(void)argc;
double initial_trait_value = atof(argv[1]);
int simulation_time = atoi(argv[2]);
int population_size = atoi(argv[3]);
double mutation_rate = atof(argv[4]);
double mutation_size = atof(argv[5]);
std::cout << "> Running a simulation with the following parameters:" << std::endl;
std::cout << " • Initial trait value: " << initial_trait_value << std::endl;
std::cout << " • Simulation time : " << simulation_time << std::endl;
std::cout << " • Population size : " << population_size << std::endl;
std::cout << " • Mutation rate : " << mutation_rate << std::endl;
std::cout << " • Mutation size : " << mutation_size << std::endl;
We also instantiate a PRNG:
/*~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~*/
/* 2) Create the prng */
/*~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~*/
Prng prng(time(0));
This step creates the simulation and initializes the population:
/*~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~*/
/* 3) Create the simulation */
/*~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~*/
Simulation simulation(&prng, initial_trait_value, population_size, mutation_rate, mutation_size);
simulation.initialize_population();
We will create two trees:
- A lineage tree, containing parent-children relationships at every generation,
- A coalescence tree, which will only contain common ancestors.
/*~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~*/
/* 4) Create trees and add roots */
/*~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~*/
puu_tree<Individual> lineage_tree;
puu_tree<Individual> coalescence_tree;
for (int i = 0; i < population_size; i++)
{
lineage_tree.add_root(simulation.get_individual(i));
coalescence_tree.add_root(simulation.get_individual(i));
}
We first instantiate two trees with the class Individual. It is not mandatory to name your individual class "Individual".
We then create a root in both trees for each individual of the initial population, using the method add_root(*individual). It is essential to root a tree at the beginning of a simulation.
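To make the notion of "common ancestor" concrete, here is a toy sketch that finds the most recent common ancestor of two individuals by following parent pointers. It is a conceptual illustration under simplified assumptions, not puutools' algorithm: the struct AncestryNode and both functions are hypothetical names.

```cpp
#include <cassert>

// Hypothetical node with a single parent pointer (illustration only).
struct AncestryNode
{
  int                 id;     // arbitrary identifier
  const AncestryNode* parent; // nullptr for a root
};

// Number of parent links between a node and its root.
int depth( const AncestryNode* node )
{
  assert(node != nullptr);
  int d = 0;
  while (node->parent != nullptr)
  {
    node = node->parent;
    d++;
  }
  return d;
}

// Return the most recent common ancestor of a and b,
// or nullptr if they descend from different roots.
const AncestryNode* most_recent_common_ancestor( const AncestryNode* a, const AncestryNode* b )
{
  int depth_a = depth(a);
  int depth_b = depth(b);
  // Bring both nodes to the same depth
  while (depth_a > depth_b) { a = a->parent; depth_a--; }
  while (depth_b > depth_a) { b = b->parent; depth_b--; }
  // Climb in lockstep until the two paths meet (or both run out of ancestors)
  while (a != b)
  {
    a = a->parent;
    b = b->parent;
  }
  return a;
}
```

In this picture, a coalescence tree keeps only the nodes where such paths meet, while a lineage tree keeps every intermediate parent-child link.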
This is the core of our algorithm. The different tasks have been written as separate loops for clarity, but the code can be optimized by merging several loops together. At each generation:
- The next generation of individuals is created;
- All reproduction events are added to the trees;
- The previous generation is "inactivated" in the trees (i.e. parents die);
- The population is updated with next generation's individuals;
- Tree structures are updated.
/*~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~*/
/* 5) Evolve the population */
/*~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~*/
for (int generation = 1; generation <= simulation_time; generation++)
{
if (generation%1000==0)
{
std::cout << ">> Generation " << generation << "\n";
}
/* STEP 1 : Create the next generation
------------------------------------ */
simulation.create_next_generation();
/* STEP 2 : Add reproduction events
--------------------------------- */
Individual* parent;
Individual* descendant;
std::tie(parent, descendant) = simulation.get_first_parent_descendant_pair();
while (parent != NULL)
{
lineage_tree.add_reproduction_event(parent, descendant, (double)generation);
coalescence_tree.add_reproduction_event(parent, descendant, (double)generation);
std::tie(parent, descendant) = simulation.get_next_parent_descendant_pair();
}
/* STEP 3 : Inactivate parents
---------------------------- */
for (int i = 0; i < population_size; i++)
{
lineage_tree.inactivate(simulation.get_individual(i), true);
coalescence_tree.inactivate(simulation.get_individual(i), false);
}
/* STEP 4 : Replace the current population with the new one
--------------------------------------------------------- */
simulation.update_population();
/* STEP 5: Update the lineage and coalescence trees
------------------------------------------------- */
lineage_tree.update_as_lineage_tree();
coalescence_tree.update_as_coalescence_tree();
}
At STEP 2, we register every reproduction event in the trees to add the new node relationships.
This is done with the method add_reproduction_event(*parent, *child, time).
At STEP 3, we must tell each tree which individuals from the previous generation are now dead, using the method inactivate(*individual, copy). The parameter copy is a boolean (true/false). If true, the tree creates a copy of the individual and stores it independently of the main population algorithm (this is why puutools requires a copy constructor). When to call inactivate(*individual, copy) depends on your algorithm: a parent and its children may both remain alive at the next generation (e.g. in a bacterial population). Using this method is nevertheless mandatory, because the structure of a tree can only be manipulated on inactivated nodes.
Note also that at STEP 3, we copy the dead individuals into the lineage tree, but not into the coalescence tree. Indeed, we will later recover the evolution of the phenotypic trait and the fitness from the lineage tree, whereas we will only extract the structure of the coalescence tree.
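The role of inactivation can be illustrated with a toy sketch of dead-branch pruning: only nodes that are both inactive and childless can be removed, and removing one may in turn free its parent. This is a conceptual illustration under simplified assumptions, not puutools' implementation; the struct ToyNode and the functions add_child and prune_dead_branches are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical tree node stored in a flat vector (illustration only).
struct ToyNode
{
  int  parent;   // index of the parent node, -1 for a root
  bool active;   // true while the individual is alive
  bool removed;  // true once the node has been pruned
  int  children; // number of children that have not been pruned
};

// Append an active node under the given parent (-1 creates a root)
// and return its index.
int add_child( std::vector<ToyNode>& tree, int parent )
{
  tree.push_back({parent, true, false, 0});
  if (parent >= 0)
  {
    tree[parent].children++;
  }
  return static_cast<int>(tree.size())-1;
}

// Remove every inactive node without remaining children, climbing towards
// the root as parents become childless. Return the number of pruned nodes.
std::size_t prune_dead_branches( std::vector<ToyNode>& tree )
{
  std::size_t pruned = 0;
  for (std::size_t i = 0; i < tree.size(); i++)
  {
    int current = static_cast<int>(i);
    while (current >= 0 && !tree[current].removed &&
           !tree[current].active && tree[current].children == 0)
    {
      tree[current].removed = true;
      pruned++;
      int parent = tree[current].parent;
      if (parent >= 0)
      {
        tree[parent].children--;
      }
      current = parent;
    }
  }
  return pruned;
}
```

An active node is never removed in this sketch, which mirrors the requirement that nodes be inactivated before the tree structure can be reduced.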
💡 TIP: It is not mandatory to run STEP 5 at each generation. Updating a tree more often increases the computational load; updating it less often increases the memory load (trees grow at each generation before being pruned and shortened). Users must choose the update period according to the performance of their own code.
💡 TIP: The size of a coalescence tree is approximately constant over time (
Now that the simulation has reached its end, we will extract some information from the trees. We call the update functions one last time to ensure a correct final structure:
/*~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~*/
/* 6) Save lineage and coalescence data */
/*~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~*/
lineage_tree.update_as_lineage_tree();
coalescence_tree.update_as_coalescence_tree();
We first retrieve the lineage of the last best individual. To do so, we get the best individual's node using the method get_node_by_selection_unit(*individual). We then trace back the lineage of this node using the method get_parent(), until the root of the tree is reached. Along the way, we write statistics to a file:
/* Save the lineage of the last best individual
--------------------------------------------- */
std::ofstream file("./output/lineage_best.txt", std::ios::out | std::ios::trunc);
file << "generation mutation_size trait fitness" << std::endl;
puu_node<Individual>* best_node = lineage_tree.get_node_by_selection_unit(simulation.get_best_individual());
while (best_node != NULL)
{
file << best_node->get_insertion_time() << " ";
file << best_node->get_selection_unit()->get_mutation_size() << " ";
file << best_node->get_selection_unit()->get_trait() << " ";
file << best_node->get_selection_unit()->get_fitness() << std::endl;
best_node = best_node->get_parent();
file.flush();
}
file.close();
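The trace-back pattern used above (follow the parent link until it is NULL) can be sketched on a standalone toy structure. The struct LineageNode and the function trace_back_traits are hypothetical and independent of puutools; they only illustrate the traversal order, from the most recent individual back to the root.

```cpp
#include <cassert>
#include <vector>

// Hypothetical lineage node (illustration only, not puu_node).
struct LineageNode
{
  double             insertion_time; // generation at which the node was added
  double             trait;          // phenotypic trait of the individual
  const LineageNode* parent;         // nullptr for the root
};

// Collect trait values from the given node back to the root.
// The first element belongs to the starting node, the last one to the root.
std::vector<double> trace_back_traits( const LineageNode* node )
{
  std::vector<double> traits;
  while (node != nullptr)
  {
    traits.push_back(node->trait);
    node = node->parent;
  }
  return traits;
}
```

Because the walk starts from the most recent node, the collected values are in reverse chronological order.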
We then save the data over the whole lineage tree. To do so, we use the methods get_first() and get_next(). When the last node is reached, the function returns NULL.
/* Save the lineage of all alive individuals
------------------------------------------ */
file.open("./output/lineage_all.txt", std::ios::out | std::ios::trunc);
file << "generation mutation_size trait fitness" << std::endl;
puu_node<Individual>* node = lineage_tree.get_first();
while (node != NULL)
{
file << node->get_insertion_time() << " ";
file << node->get_selection_unit()->get_mutation_size() << " ";
file << node->get_selection_unit()->get_trait() << " ";
file << node->get_selection_unit()->get_fitness() << std::endl;
file.flush();
node = lineage_tree.get_next();
}
file.close();
Finally, we save the structure of the coalescence tree in Newick format (.phb extension):
/* Save the coalescence tree
-------------------------- */
coalescence_tree.write_newick_tree("./output/coalescence_tree.phb");
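For readers unfamiliar with the Newick format, it encodes a tree as nested parentheses: the children of a node are listed between parentheses and separated by commas, each node may carry a label and a branch length after a colon, and the whole tree ends with a semicolon. The sketch below builds such a string for a toy tree. It is independent of puutools: the struct NewickNode and both functions are hypothetical, and the exact labels written by puutools may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical tree node used only to illustrate the Newick format.
struct NewickNode
{
  std::string             name;          // node label
  double                  branch_length; // length of the branch to the parent
  std::vector<NewickNode> children;      // empty for a leaf
};

// Recursively write "(child1,child2,...)name:length" for a subtree.
void write_subtree( const NewickNode& node, std::ostringstream& out )
{
  if (!node.children.empty())
  {
    out << "(";
    for (std::size_t i = 0; i < node.children.size(); i++)
    {
      if (i > 0)
      {
        out << ",";
      }
      write_subtree(node.children[i], out);
    }
    out << ")";
  }
  out << node.name << ":" << node.branch_length;
}

// Return the Newick string of the whole tree, terminated by a semicolon.
std::string to_newick( const NewickNode& root )
{
  std::ostringstream out;
  write_subtree(root, out);
  out << ";";
  return out.str();
}
```

For a root R with two leaves A and B at distances 1 and 2, this yields a string of the form (A:1,B:2)R:0; which most phylogenetic viewers can read.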
This simulation example is available in the folder example of this repository, and can be compiled with CMake. Navigate to the folder example/cmake with a terminal and run the following command:
sh make_release.sh
The binary executable puutools_example is located in the folder example/build/bin.
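As an example, a simulation has been run with an initial trait value $x_0 = 2.0$, a simulation time $T = 10000$ generations, a population size $N = 200$, a mutation rate $m = 0.02$ and a mutation size $s = 0.02$: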
../build/bin/puutools_example 2.0 10000 200 0.02 0.02
Output files are written in the folder example/output, which also contains an R script to generate a figure. Here, we can see that the population evolved towards the optimum. Since we recover the lineage of the last best individual, we also have access to the size of fixed mutations.


