Merged
31 commits
22e9570
Increased font size in plots
itrujnara Apr 29, 2024
3187ac4
Fixed a bug if report is generated without MSA
itrujnara Apr 29, 2024
59bccc4
Linting fix - single quotes in resource label
itrujnara Apr 29, 2024
0e6098f
Linting fix again - single quotes
itrujnara Apr 29, 2024
8e39174
Merge branch 'dev' of https://github.com/itrujnara/reportho into dev
itrujnara Apr 29, 2024
9e113c1
Linting fix - wrong resource label
itrujnara Apr 29, 2024
40b5100
Linting fix - added missing containers
itrujnara Apr 29, 2024
8da78a2
Linting fix - nonexistent container version
itrujnara Apr 29, 2024
6338f0f
Removed fulfilled TODOs from readme
itrujnara Apr 29, 2024
cdaf4f6
Tweak to contributor names in readme
itrujnara Apr 29, 2024
ab103aa
Changed modules run script name
itrujnara Apr 29, 2024
b492331
Tweaked usage.md to match the pipeline
itrujnara Apr 29, 2024
c8db09a
Updated output.md to match pipeline.
itrujnara Apr 29, 2024
9f1f1bf
Set test.config to correct values
itrujnara Apr 29, 2024
5798f5c
Tweaked report to work with skipped downstream
itrujnara Apr 29, 2024
c30a037
Added options to skip plotting
itrujnara Apr 29, 2024
5fe68ed
Minor fixes to plotting
itrujnara Apr 29, 2024
2c27b50
Added stats aggregation
itrujnara Apr 30, 2024
c290759
Added file name for CSV merge
itrujnara Apr 30, 2024
fd1957e
Added config for full test
itrujnara Apr 30, 2024
e591d9d
Tweaks to full test
itrujnara Apr 30, 2024
9796e30
Added citations
itrujnara Apr 30, 2024
5acc848
Linting fix - default params
itrujnara Apr 30, 2024
2bac397
Switched parameters to be false by default
itrujnara May 2, 2024
a70c77b
Added copyright info to scripts
itrujnara May 2, 2024
1abec30
Missing copyright note
itrujnara May 2, 2024
f960b85
Removed a fulfilled TODO
itrujnara May 2, 2024
fbfbde6
Update bin/yml2csv.py
itrujnara May 6, 2024
cf3d0dd
Added FASTA example to the usage document
itrujnara May 6, 2024
f848a1f
Merge branch 'dev' of https://github.com/itrujnara/reportho into dev
itrujnara May 6, 2024
8fc2254
Refactored Orthoinspector parameter names
itrujnara May 6, 2024
46 changes: 42 additions & 4 deletions CITATIONS.md
@@ -10,13 +10,51 @@

## Pipeline tools

- [FastQC](https://www.bioinformatics.babraham.ac.uk/projects/fastqc/)
- [OMA](https://omabrowser.org)

> Andrews, S. (2010). FastQC: A Quality Control Tool for High Throughput Sequence Data [Online].
> Adrian M Altenhoff, Clément-Marie Train, Kimberly J Gilbert, Ishita Mediratta, Tarcisio Mendes de Farias, David Moi, Yannis Nevers, Hale-Seda Radoykova, Victor Rossier, Alex Warwick Vesztrocy, Natasha M Glover, Christophe Dessimoz, OMA orthology in 2021: website overhaul, conserved isoforms, ancestral gene order and more, Nucleic Acids Research, Volume 49, Issue D1, 8 January 2021, Pages D373–D379, https://doi.org/10.1093/nar/gkaa1007

- [MultiQC](https://pubmed.ncbi.nlm.nih.gov/27312411/)
- [PANTHER](https://pantherdb.org)

> Ewels P, Magnusson M, Lundin S, Käller M. MultiQC: summarize analysis results for multiple tools and samples in a single report. Bioinformatics. 2016 Oct 1;32(19):3047-8. doi: 10.1093/bioinformatics/btw354. Epub 2016 Jun 16. PubMed PMID: 27312411; PubMed Central PMCID: PMC5039924.
> Thomas PD, Ebert D, Muruganujan A, Mushayahama T, Albou L-P, Mi H. PANTHER: Making genome-scale phylogenetics accessible to all. Protein Science. 2022; 31: 8–22. https://doi.org/10.1002/pro.4218

- [OrthoInspector](https://lbgi.fr/orthoinspector)

> Yannis Nevers, Arnaud Kress, Audrey Defosset, Raymond Ripp, Benjamin Linard, Julie D Thompson, Olivier Poch, Odile Lecompte, OrthoInspector 3.0: open portal for comparative genomics, Nucleic Acids Research, Volume 47, Issue D1, 08 January 2019, Pages D411–D418, https://doi.org/10.1093/nar/gky1068

- [EggNOG](https://eggnog5.embl.de)

> Jaime Huerta-Cepas, Damian Szklarczyk, Davide Heller, Ana Hernández-Plaza, Sofia K Forslund, Helen Cook, Daniel R Mende, Ivica Letunic, Thomas Rattei, Lars J Jensen, Christian von Mering, Peer Bork, eggNOG 5.0: a hierarchical, functionally and phylogenetically annotated orthology resource based on 5090 organisms and 2502 viruses, Nucleic Acids Research, Volume 47, Issue D1, 08 January 2019, Pages D309–D314, https://doi.org/10.1093/nar/gky1085

- [UniProt](https://uniprot.org)

> The UniProt Consortium, UniProt: the Universal Protein Knowledgebase in 2023, Nucleic Acids Research, Volume 51, Issue D1, 6 January 2023, Pages D523–D531, https://doi.org/10.1093/nar/gkac1052

- [UniProt ID Mapping](https://uniprot.org/id-mapping)

> Huang H, McGarvey PB, Suzek BE, Mazumder R, Zhang J, Chen Y, Wu CH. A comprehensive protein-centric ID mapping service for molecular data integration. Bioinformatics. 2011 Apr 15;27(8):1190-1. doi: 10.1093/bioinformatics/btr101. PMID: 21478197; PMCID: PMC3072559.

- [AlphaFold](https://deepmind.google/technologies/alphafold)

> Jumper, J., Evans, R., Pritzel, A. et al. Highly accurate protein structure prediction with AlphaFold. Nature 596, 583–589 (2021). https://doi.org/10.1038/s41586-021-03819-2

- [AlphaFold Database](https://alphafold.ebi.ac.uk)

> Mihaly Varadi, Stephen Anyango, Mandar Deshpande, Sreenath Nair, Cindy Natassia, Galabina Yordanova, David Yuan, Oana Stroe, Gemma Wood, Agata Laydon, Augustin Žídek, Tim Green, Kathryn Tunyasuvunakool, Stig Petersen, John Jumper, Ellen Clancy, Richard Green, Ankur Vora, Mira Lutfi, Michael Figurnov, Andrew Cowie, Nicole Hobbs, Pushmeet Kohli, Gerard Kleywegt, Ewan Birney, Demis Hassabis, Sameer Velankar, AlphaFold Protein Structure Database: massively expanding the structural coverage of protein-sequence space with high-accuracy models, Nucleic Acids Research, Volume 50, Issue D1, 7 January 2022, Pages D439–D444, https://doi.org/10.1093/nar/gkab1061

- [T-COFFEE](https://tcoffee.org)

> Notredame C, Higgins DG, Heringa J. T-Coffee: A novel method for fast and accurate multiple sequence alignment. J Mol Biol. 2000 Sep 8;302(1):205-17. doi: 10.1006/jmbi.2000.4042. PMID: 10964570.

- [IQTREE](https://iqtree.org)

> B.Q. Minh, H.A. Schmidt, O. Chernomor, D. Schrempf, M.D. Woodhams, A. von Haeseler, R. Lanfear (2020) IQ-TREE 2: New models and efficient methods for phylogenetic inference in the genomic era. Mol. Biol. Evol., 37:1530-1534. https://doi.org/10.1093/molbev/msaa015

> D.T. Hoang, O. Chernomor, A. von Haeseler, B.Q. Minh, L.S. Vinh (2018) UFBoot2: Improving the ultrafast bootstrap approximation. Mol. Biol. Evol., 35:518–522. https://doi.org/10.1093/molbev/msx281

- [FastME](https://atgc-montpellier.fr/fastme/)

> Vincent Lefort, Richard Desper, Olivier Gascuel, FastME 2.0: A Comprehensive, Accurate, and Fast Distance-Based Phylogeny Inference Program, Molecular Biology and Evolution, Volume 32, Issue 10, October 2015, Pages 2798–2800, https://doi.org/10.1093/molbev/msv150

## Software packaging/containerisation tools

16 changes: 4 additions & 12 deletions README.md
@@ -27,8 +27,6 @@

![nf-core-reportho tube map](docs/images/reportho_tube_map.svg?raw=true "nf-core-reportho tube map")

<!-- TODO nf-core: Fill in short bullet-pointed list of the default steps in the pipeline -->

1. **Obtain Query Information**: (depends on provided input) identification of Uniprot ID and taxon ID for the query or its closest homolog.
2. **Fetch Orthologs**: fetching of ortholog predictions from public databases, either through API or from local snapshot.
3. **Compare and Assemble**: calculation of agreement statistics, creation of ortholog lists, selection of the consensus list.
@@ -66,8 +64,6 @@ If using the latter format, you must set `--uniprot_query` to true.

Now, you can run the pipeline using:

<!-- TODO nf-core: update the following command to include all required parameters for a minimal example -->

```bash
nextflow run nf-core/reportho \
-profile <docker/singularity/.../institute> \
@@ -89,15 +85,13 @@ For more details about the output files and reports, please refer to the

## Credits

nf-core/reportho was originally written by itrujnara.
nf-core/reportho was originally written by Igor Trujnara (@itrujnara).

We thank the following people for their extensive assistance in the development of this pipeline:

@lsantus

@avignoli

@JoseEspinosa
- Luisa Santus (@lsantus)
- Alessio Vignoli (@avignoli)
- Jose Espinosa-Carrasco (@JoseEspinosa)

## Contributions and Support

@@ -110,8 +104,6 @@ For further information or help, don't hesitate to get in touch on the [Slack `#
<!-- TODO nf-core: Add citation for pipeline after first release. Uncomment lines below and update Zenodo doi and badge at the top of this file. -->
<!-- If you use nf-core/reportho for your analysis, please cite it using the following doi: [10.5281/zenodo.XXXXXX](https://doi.org/10.5281/zenodo.XXXXXX) -->

<!-- TODO nf-core: Add bibliography of tools and data used in your pipeline -->

An extensive list of references for the tools used by the pipeline can be found in the [`CITATIONS.md`](CITATIONS.md) file.

You can cite the `nf-core` publication as follows:
5 changes: 2 additions & 3 deletions assets/samplesheet.csv
@@ -1,3 +1,2 @@
sample,fastq_1,fastq_2
SAMPLE_PAIRED_END,/path/to/fastq/files/AEG588A1_S1_L002_R1_001.fastq.gz,/path/to/fastq/files/AEG588A1_S1_L002_R2_001.fastq.gz
SAMPLE_SINGLE_END,/path/to/fastq/files/AEG588A4_S4_L003_R1_001.fastq.gz,
id,query
BicD2,Q8TD16
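
The new samplesheet has two columns, `id` and `query`, in place of the template's FASTQ columns. As a minimal sketch of how a file in this layout could be read and checked in Python: the function name `read_samplesheet` is hypothetical and not part of the pipeline, which may validate its input differently.

```python
import csv
import io


def read_samplesheet(handle):
    """Return (id, query) pairs from a CSV handle with 'id' and 'query' columns.

    Hypothetical helper for illustration only; not taken from the pipeline.
    """
    reader = csv.DictReader(handle)
    # Fail early if the header does not match the expected layout.
    missing = {"id", "query"} - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"samplesheet is missing columns: {sorted(missing)}")
    return [(row["id"].strip(), row["query"].strip()) for row in reader]


# The two lines added to assets/samplesheet.csv in this diff.
example = io.StringIO("id,query\nBicD2,Q8TD16\n")
rows = read_samplesheet(example)
print(rows)
```

A file that still uses the old `sample,fastq_1,fastq_2` header would raise `ValueError` in this sketch rather than being silently misread.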
3 changes: 3 additions & 0 deletions bin/clustal2fasta.py
@@ -1,5 +1,8 @@
#!/usr/bin/env python3

# Written by Igor Trujnara, released under the MIT license
Review comment (Member): Not sure if this is needed in every single script, since the repository already has the MIT license and your attribution can be found in the README.

Reply (Collaborator, Author): The checklist mentions this, and I saw some pipelines (e.g. rnaseq) do this, so I put those notes there just in case.

Reply (Member): Fair enough.

# See https://opensource.org/license/mit for details

import sys

from Bio import SeqIO
3 changes: 3 additions & 0 deletions bin/clustal2phylip.py
@@ -1,5 +1,8 @@
#!/usr/bin/env python3

# Written by Igor Trujnara, released under the MIT license
# See https://opensource.org/license/mit for details

import sys

from Bio import SeqIO
3 changes: 3 additions & 0 deletions bin/csv_adorn.py
@@ -1,5 +1,8 @@
#!/usr/bin/env python3

# Written by Igor Trujnara, released under the MIT license
# See https://opensource.org/license/mit for details

import sys


3 changes: 3 additions & 0 deletions bin/ensembl2uniprot.py
@@ -1,5 +1,8 @@
#!/usr/bin/env python3

# Written by Igor Trujnara, released under the MIT license
# See https://opensource.org/license/mit for details

import sys

import requests
3 changes: 3 additions & 0 deletions bin/fetch_afdb_structures.py
@@ -1,5 +1,8 @@
#!/usr/bin/env python3

# Written by Igor Trujnara, released under the MIT license
# See https://opensource.org/license/mit for details

import sys

import requests
3 changes: 3 additions & 0 deletions bin/fetch_inspector_group.py
@@ -1,5 +1,8 @@
#!/usr/bin/env python3

# Written by Igor Trujnara, released under the MIT license
# See https://opensource.org/license/mit for details

import sys

import requests
3 changes: 3 additions & 0 deletions bin/fetch_oma_by_sequence.py
@@ -1,5 +1,8 @@
#!/usr/bin/env python3

# Written by Igor Trujnara, released under the MIT license
# See https://opensource.org/license/mit for details

import sys
from warnings import warn

3 changes: 3 additions & 0 deletions bin/fetch_oma_group.py
@@ -1,5 +1,8 @@
#!/usr/bin/env python3

# Written by Igor Trujnara, released under the MIT license
# See https://opensource.org/license/mit for details

import sys

import requests
3 changes: 3 additions & 0 deletions bin/fetch_oma_groupid.py
@@ -1,5 +1,8 @@
#!/usr/bin/env python3

# Written by Igor Trujnara, released under the MIT license
# See https://opensource.org/license/mit for details

import sys

from utils import fetch_seq
3 changes: 3 additions & 0 deletions bin/fetch_oma_taxid_by_id.py
@@ -1,5 +1,8 @@
#!/usr/bin/env python3

# Written by Igor Trujnara, released under the MIT license
# See https://opensource.org/license/mit for details

import sys

from utils import fetch_seq
3 changes: 3 additions & 0 deletions bin/fetch_panther_group.py
@@ -1,5 +1,8 @@
#!/usr/bin/env python3

# Written by Igor Trujnara, released under the MIT license
# See https://opensource.org/license/mit for details

import sys

import requests
3 changes: 3 additions & 0 deletions bin/fetch_sequences.py
@@ -1,5 +1,8 @@
#!/usr/bin/env python3

# Written by Igor Trujnara, released under the MIT license
# See https://opensource.org/license/mit for details

import sys

import requests
3 changes: 3 additions & 0 deletions bin/filter_fasta.py
@@ -1,5 +1,8 @@
#!/usr/bin/env python3

# Written by Igor Trujnara, released under the MIT license
# See https://opensource.org/license/mit for details

import sys

from Bio import SeqIO
3 changes: 3 additions & 0 deletions bin/get_oma_version.py
@@ -1,5 +1,8 @@
#!/usr/bin/env python3

# Written by Igor Trujnara, released under the MIT license
# See https://opensource.org/license/mit for details

import requests


3 changes: 3 additions & 0 deletions bin/make_score_table.py
@@ -1,5 +1,8 @@
#!/usr/bin/env python3

# Written by Igor Trujnara, released under the MIT license
# See https://opensource.org/license/mit for details

import csv
import re
import sys
3 changes: 3 additions & 0 deletions bin/make_stats.py
@@ -1,5 +1,8 @@
#!/usr/bin/env python3

# Written by Igor Trujnara, released under the MIT license
# See https://opensource.org/license/mit for details

import csv
import sys

3 changes: 3 additions & 0 deletions bin/map_uniprot.py
@@ -1,5 +1,8 @@
#!/usr/bin/env python3

# Written by Igor Trujnara, released under the MIT license
# See https://opensource.org/license/mit for details

import sys

from ensembl2uniprot import ensembl2uniprot
3 changes: 3 additions & 0 deletions bin/oma2uniprot_local.py
@@ -1,5 +1,8 @@
#!/usr/bin/env python3

# Written by Igor Trujnara, released under the MIT license
# See https://opensource.org/license/mit for details

import gzip
import sys

18 changes: 11 additions & 7 deletions bin/plot_orthologs.R
@@ -1,5 +1,8 @@
#!/usr/bin/env Rscript

# Written by Igor Trujnara, released under the MIT license
# See https://opensource.org/license/mit for details

suppressMessages(library(ggplot2))
suppressMessages(library(reshape2))
suppressMessages(library(dplyr))
@@ -15,6 +18,7 @@ if (length(args) < 2) {
# Styles
text_color <- "#DDDDDD"
bg_color <- "transparent"
font_size <- 16

# Load the data
data <- read.csv(args[1], header = TRUE, stringsAsFactors = FALSE)
@@ -38,9 +42,9 @@ p <- ggplot(melted_crosstable, aes(x = method, y = count, fill = score)) +
labs(title = "Support for predictions", x = "Database", y = "Number of orthologs", fill = "Support") +
scale_fill_manual(values = c("#59B4C3", "#74E291", "#8F7AC2", "#EFF396", "#FF9A8D")) +
theme(legend.position = "right",
text = element_text(size = 12, color = text_color),
axis.text.x = element_text(color = text_color),
axis.text.y = element_text(color = text_color),
text = element_text(size = font_size, color = text_color),
axis.text.x = element_text(size = font_size, color = text_color),
axis.text.y = element_text(size = font_size, color = text_color),
plot.background = element_rect(color = bg_color, fill = bg_color),
panel.background = element_rect(color = bg_color, fill = bg_color))

@@ -54,7 +58,7 @@ for (i in colnames(data)[4:ncol(data)-1]) {
}
venn.plot <- ggVennDiagram(venn.data, set_color = text_color) +
theme(legend.position = "none",
text = element_text(size = 12, color = text_color),
text = element_text(size = font_size, color = text_color),
plot.background = element_rect(color = bg_color, fill = bg_color),
panel.background = element_rect(color = bg_color, fill = bg_color))
ggsave(paste0(args[2], "_venn.png"), plot = venn.plot, width = 6, height = 6, dpi = 300)
@@ -81,9 +85,9 @@ p <- ggplot(jaccard, aes(x = method1, y = method2, fill = jaccard)) +
theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
labs(title = "Jaccard Index", x = "", y = "", fill = "Jaccard Index") +
theme(legend.position = "right",
text = element_text(size = 12, color = text_color),
axis.text.x = element_text(color = text_color),
axis.text.y = element_text(color = text_color),
text = element_text(size = font_size, color = text_color),
axis.text.x = element_text(size = font_size, color = text_color),
axis.text.y = element_text(size = font_size, color = text_color),
plot.background = element_rect(color = bg_color, fill = bg_color),
panel.background = element_rect(color = bg_color, fill = bg_color))
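
The last plot in this script is a heatmap of a Jaccard index for each pair of methods. As a rough illustration of that quantity, here is a small Python sketch (Python being the language of most of the pipeline's scripts). The function names, method keys and accession placeholders are hypothetical, and returning 0.0 for two empty sets is an assumed convention; the pipeline's own calculation may differ.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index |A ∩ B| / |A ∪ B| of two collections of identifiers."""
    a, b = set(a), set(b)
    union = a | b
    # Assumed convention: two empty sets score 0.0 instead of dividing by zero.
    return len(a & b) / len(union) if union else 0.0


def pairwise_jaccard(predictions):
    """Map each unordered pair of method names to the Jaccard index of their sets."""
    return {
        (m1, m2): jaccard(predictions[m1], predictions[m2])
        for m1, m2 in combinations(sorted(predictions), 2)
    }


# Hypothetical ortholog predictions per method (placeholder identifiers).
predictions = {
    "oma": {"P1", "P2", "P3"},
    "panther": {"P2", "P3", "P4"},
    "inspector": {"P3"},
}
scores = pairwise_jaccard(predictions)
for pair, value in scores.items():
    print(pair, round(value, 3))
```

A value of 1 means two methods predict identical ortholog sets, and a value of 0 means they share none.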

3 changes: 3 additions & 0 deletions bin/plot_tree.R
@@ -1,5 +1,8 @@
#!/usr/bin/env Rscript

# Written by Igor Trujnara, released under the MIT license
# See https://opensource.org/license/mit for details

library(treeio)
library(ggtree)
library(ggplot2)
3 changes: 3 additions & 0 deletions bin/refseq2uniprot.py
@@ -1,5 +1,8 @@
#!/usr/bin/env python3

# Written by Igor Trujnara, released under the MIT license
# See https://opensource.org/license/mit for details

import sys

import requests
3 changes: 3 additions & 0 deletions bin/score_hits.py
@@ -1,5 +1,8 @@
#!/usr/bin/env python3

# Written by Igor Trujnara, released under the MIT license
# See https://opensource.org/license/mit for details

import csv
import sys

3 changes: 3 additions & 0 deletions bin/uniprot2oma_local.py
@@ -1,5 +1,8 @@
#!/usr/bin/env python3

# Written by Igor Trujnara, released under the MIT license
# See https://opensource.org/license/mit for details

import gzip
import sys

3 changes: 3 additions & 0 deletions bin/uniprot2uniprot.py
@@ -1,5 +1,8 @@
#!/usr/bin/env python3

# Written by Igor Trujnara, released under the MIT license
# See https://opensource.org/license/mit for details

import sys

import requests
3 changes: 3 additions & 0 deletions bin/uniprotize_oma_local.py
@@ -1,5 +1,8 @@
#!/usr/bin/env python3

# Written by Igor Trujnara, released under the MIT license
# See https://opensource.org/license/mit for details

import gzip
import sys

3 changes: 3 additions & 0 deletions bin/uniprotize_oma_online.py
@@ -1,5 +1,8 @@
#!/usr/bin/env python3

# Written by Igor Trujnara, released under the MIT license
# See https://opensource.org/license/mit for details

import sys

from map_uniprot import map_uniprot
4 changes: 4 additions & 0 deletions bin/utils.py
@@ -1,3 +1,7 @@
# Written by Igor Trujnara, released under the MIT license
# See https://opensource.org/license/mit for details
# Includes code written by UniProt contributors published under CC-BY 4.0 license

import time
from typing import Any
