Code for Simulated metabolic profiles unveil biases in pathway analysis methods
All instructions, code, and data used to run the simulations, analyses, and plots can be found in this repository.
- Clone the repository
git clone https://github.com/juliette-cooke/simulatedPA.git
- Install Python packages
Python 3.7 was used for this project.
cd simulatedPA/
pip3 install -r requirements.txt
- Install R libraries
R 4.2.2 was used for this project. Other versions should work, provided the required packages are available for them.
In this repository you will find the renv.lock, .Rprofile, and renv/activate.R files. Launch the R project, and renv should automatically install itself. If not, run install.packages('renv'). Then, use renv::restore() to set up the project library on your machine.
SAMBA is described here and can be run locally (on a powerful computer) or, more commonly, on a computing cluster (e.g. one managed with SLURM). You can download it from GitLab here.
SAMBA takes a metabolic network as input. You can also provide a list of reactions or genes to knock out in the model, or set a scenario type that SAMBA will then generate (e.g. knock out all reactions in each pathway individually).
This results in a simulated metabolic profile for each modelled condition.
We have run SAMBA on Human1 and Recon2.2, simulating each individual pathway knockout. As the resulting files can be large, they have been made available here to avoid overloading this GitHub repository. If you wish to work on a human network, you can use them to run the rest of this analysis without running SAMBA.
- Metabolic network (SBML)
- z-scores file (generated by SAMBA)
- Pathway supercategories (pathway_db.tsv) for the chosen network
- List of side compounds for the chosen network
- Pathway category colours (optional)
- List of blocked reactions for the model (can be generated through Python)
The Python notebook can be run using the SBML network and the z-scores. It generates several files used in the R analysis and, most importantly, runs ORA and GSEA on the results from SAMBA.
- Pathway analysis raw results and metrics files
- Metabolite dictionary
- Pathway dictionary
- Pathway states file
- Metabolite to pathways file
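For readers unfamiliar with ORA (over-representation analysis), the following is a minimal sketch of the general technique, not the notebook's actual implementation. It assumes a right-tailed hypergeometric test, and the function names (`hypergeom_sf`, `ora`) and the toy metabolite and pathway identifiers are hypothetical, chosen only for illustration.

```python
from math import comb


def hypergeom_sf(k, M, n, N):
    """Return P(X >= k) for X ~ Hypergeometric.

    M: background size, n: pathway members in the background,
    N: number of significant metabolites drawn, k: observed overlap.
    """
    upper = min(n, N)
    total = comb(M, N)
    hits = sum(comb(n, i) * comb(M - n, N - i) for i in range(k, upper + 1))
    return hits / total


def ora(significant, background, pathways):
    """Return {pathway name: p-value} for over-representation.

    All identifiers are restricted to the background set before testing.
    Pathways with no member in the background are skipped.
    """
    background = set(background)
    significant = set(significant) & background
    results = {}
    for name, members in pathways.items():
        members = set(members) & background
        if not members:
            continue
        overlap = len(significant & members)
        results[name] = hypergeom_sf(
            overlap, len(background), len(members), len(significant)
        )
    return results


# Toy example (hypothetical identifiers, not data from this repository)
background = [f"m{i}" for i in range(10)]
pathways = {"A": ["m0", "m1", "m2"], "B": ["m6", "m7", "m8", "m9"]}
significant = ["m0", "m1", "m5"]
pvalues = ora(significant, background, pathways)
```

In this toy example, pathway A shares two of its three metabolites with the significant set, while pathway B shares none. In practice, p-values across pathways would typically also be corrected for multiple testing.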
This step requires Met4J 1.5.1 or higher, which can be downloaded from: https://forgemia.inra.fr/metexplore/met4j
Run PathwayNet from the Met4J toolbox:
java -cp path/to/met4j/met4j-toolbox/target/met4j-toolbox-1.5.1-SNAPSHOT.jar fr.inrae.toulouse.metexplore.met4j_toolbox.networkAnalysis.PathwayNet -s path/to/model_no_blocked_reactions.xml -o path/to/output/model_distance_network.gml -sc path/to/side_compounds.txt -ncw
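If you prefer to launch this step from Python, the command above could be wrapped as in the sketch below. The flags and class name are copied from the command shown. The helper names (`pathwaynet_command`, `run_pathwaynet`) are hypothetical and not part of this repository or of Met4J.

```python
import subprocess

# Fully qualified class name, taken from the command above
PATHWAYNET_CLASS = (
    "fr.inrae.toulouse.metexplore.met4j_toolbox.networkAnalysis.PathwayNet"
)


def pathwaynet_command(jar, sbml, output_gml, side_compounds):
    """Build the argument list for the PathwayNet call (hypothetical helper)."""
    return [
        "java", "-cp", str(jar), PATHWAYNET_CLASS,
        "-s", str(sbml),             # model without blocked reactions (SBML)
        "-o", str(output_gml),       # output network (GML)
        "-sc", str(side_compounds),  # side compounds list
        "-ncw",
    ]


def run_pathwaynet(jar, sbml, output_gml, side_compounds):
    """Run PathwayNet and raise CalledProcessError if Java exits with an error."""
    subprocess.run(
        pathwaynet_command(jar, sbml, output_gml, side_compounds), check=True
    )
```

Passing the arguments as a list avoids shell quoting problems when paths contain spaces.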
Import the previously generated network into Cytoscape. Import the pathway_db as a column in the node table. Filter out the Miscellaneous pathways. Select the layout by going to Layout > Edge-weighted spring layout > Weight.
The R notebook takes the outputs from the Python notebook and runs the analyses shown in the paper.
- Figures 1-5
- Supplementary figures
- Various analyses [add list]