Bactopia v2 Overview #233

@rpetit3

Bactopia v2 Overview

Thanks to tremendous effort by @Mxrcon and @abhi18av, the foundation for migrating Bactopia to DSL2 has been laid. This transition is the key milestone for pushing Bactopia to v2! (Super excited about this, Davi and Abhinav!)

Switching to DSL2 opens the door to creating custom Bactopia workflows. For example, let's say you have some Staphylococcus aureus samples, and you want to run Bactopia and then the Bactopia Tool staph-typer. Instead of running these as two separate steps, with DSL2 we can create a sub-workflow (e.g. Staphopia) that automatically runs Bactopia and then staph-typer. In other words, we can start creating organism-specific sub-workflows, as well as sub-workflows that only include certain steps, such as assembly.

I think this is also a good time to start cleaning up some things and adding features that will make long-term maintenance more sustainable.

House Cleaning

These changes are meant to reduce the burden of maintaining Bactopia long-term. They are really about standardizing things in such a way that we can automate them. For example, printing usage for each of the workflows can be driven by config files (e.g. the nf-core JSON schema). There are also a lot of shared functions for checking inputs, creating channels, etc. that are duplicated across workflows. These duplications are no longer necessary in DSL2.
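
To illustrate the schema-driven usage idea, here is a minimal Python sketch that renders a help message from a small JSON document laid out like an nf-core parameter schema (groups under `definitions`, each with `properties`). The schema contents, the `render_usage` helper, and the `samples` and `outdir` entries are assumptions made for illustration; only `--skip_compression` is taken from this issue, and none of this is Bactopia's actual implementation.

```python
import json

# Hypothetical, trimmed-down schema in the style of an nf-core
# nextflow_schema.json; the parameter entries are illustrative only.
SCHEMA_JSON = """
{
  "definitions": {
    "input_options": {
      "title": "Input options",
      "properties": {
        "samples": {"type": "string", "description": "File listing the samples to process"},
        "outdir": {"type": "string", "default": "./", "description": "Directory to write results to"}
      }
    },
    "output_options": {
      "title": "Output options",
      "properties": {
        "skip_compression": {"type": "boolean", "default": false, "description": "Do not compress outputs"}
      }
    }
  }
}
"""


def render_usage(schema: dict) -> str:
    """Build a usage message from the parameter groups in a schema."""
    lines = []
    for group in schema.get("definitions", {}).values():
        lines.append(group.get("title", "Options"))
        for name, spec in group.get("properties", {}).items():
            # Only mention a default when the schema declares one.
            default = f" (default: {spec['default']})" if "default" in spec else ""
            kind = spec.get("type", "string")
            text = spec.get("description", "")
            lines.append(f"  --{name:<18} [{kind}] {text}{default}")
        lines.append("")  # blank line between groups
    return "\n".join(lines).rstrip()


usage = render_usage(json.loads(SCHEMA_JSON))
print(usage)
```

Because the help text is derived from the schema, each workflow would only need to ship its own schema file rather than its own usage-printing code.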

Additional Features

  • Support for Nanopore reads
  • Drop support for some uncompressed outputs (e.g. assemblies)
    • Defaults to compressed outputs, --skip_compression disables this feature
  • GenBank compatible assembly
    • Currently does not like gnl|
  • Outputs from the tutorial https://doi.org/10.6084/m9.figshare.17097156.v1

Implement pytest for testing

I'd like to create a suite of tests that are run with pytest and pytest-workflow. The nf-core/modules team has a testing framework that can be extended to Bactopia.

  • Setup Test-Data repo - Done! https://github.com/bactopia/bactopia-tests
  • Setup walk through for testing - Done! https://github.com/bactopia/bactopia/tree/dsl2/tests
  • Add tests to GitHub Actions - Done! https://github.com/bactopia/bactopia/actions/runs/1256507737
  • Create Tests for Bactopia Modules
    • annotate_genome
    • antimicrobial_resistance
    • ariba_analysis
    • assemble_genome
    • assembly_qc
    • blast
    • call_variants
    • count_31mers (merged into minmer_sketch)
    • download_references (merged into call_variants)
    • estimate_genome_size (merged into gather_samples)
    • fastq_status (merged into gather_samples)
    • gather_samples
    • mapping_query
    • minmer_query
    • minmer_sketch
    • qc_reads
    • sequence_type
  • Create Tests for Bactopia Tool Modules
    • agrvate
    • bakta
    • ectyper
    • emmtyper
    • eggnog
    • fastani
    • hicap
    • ismapper
    • kleborate
    • lissero
    • mashtree
    • meningotype
    • ngmaster
    • pangenome
    • seqsero2
    • spatyper
    • staph-typer
    • staphopiasccmec
    • tbprofiler
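
As a rough picture of what each module test would verify, the sketch below checks that expected output files exist and, optionally, match a known MD5 checksum. pytest-workflow itself declares such checks in YAML test files; this Python version only illustrates the underlying check and is not the pytest-workflow API. The helper names and the file names `sample.fna` and `sample.gff` are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def md5sum(path: Path) -> str:
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(8192), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_outputs(outdir: Path, expected: dict) -> list:
    """Compare files under outdir with expected checksums.

    `expected` maps a relative path to an MD5 digest, or to None when
    only the file's existence should be checked. Returns a list of
    human-readable problems; an empty list means every check passed.
    """
    problems = []
    for relpath, checksum in expected.items():
        target = outdir / relpath
        if not target.is_file():
            problems.append(f"missing: {relpath}")
        elif checksum is not None and md5sum(target) != checksum:
            problems.append(f"checksum mismatch: {relpath}")
    return problems


# Self-contained demonstration using a throwaway directory in place of
# real module output. The file names are hypothetical.
with tempfile.TemporaryDirectory() as tmp:
    outdir = Path(tmp)
    content = b">contig1\nACGT\n"
    (outdir / "sample.fna").write_bytes(content)
    good = hashlib.md5(content).hexdigest()
    problems_ok = check_outputs(outdir, {"sample.fna": good})
    problems_bad = check_outputs(
        outdir, {"sample.fna": "0" * 32, "sample.gff": None}
    )

print(problems_ok)
print(problems_bad)
```

Checking existence only (a `None` checksum) is useful for outputs that embed timestamps or versions and therefore never hash identically between runs.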

Convert some processes to nf-core/modules

There are a few tools used by Bactopia that are the only tool in their process. Most of these are in the Bactopia Tools. I think it's best that these tools be transferred to nf-core/modules. Many of them will need to be added to nf-core first, but nf-core is in need of some bacterial genomics tool love, so it's OK!

Curated Datasets

I think one of the best features of Bactopia is the ability to include public datasets. This works great for general datasets, but organism-specific datasets are kind of lost. I think it would be great to start a set of curated datasets that users can add data to.

Here's an example of a curated Staphylococcus aureus Bactopia Dataset. This dataset can easily be imported and allow users to rapidly analyze their samples with a curated dataset specific to their organism.

I think it would also be nice if these curated datasets included SRA accessions linked to publications. But this exceeds my capabilities and would require extensive community support.

Species-Specific Workflows

With DSL2, we can create species-specific workflows by combining the main Bactopia workflow with some Bactopia Tools. The main example, which will also act as a proof of concept, will be Staphopia. Staphopia is essentially Bactopia + the Bactopia Tool staph-typer.

  • Create a Staphopia workflow
