Added

Python 3.12 support

Fixed

Issue with checking for the presence of dwell information in fastq files.

Changed

Behaviour of medaka_consensus with the --bacteria option: if the basecaller model cannot
be parsed or is not compatible with the bacterial polishing model, exit with an error instead of
falling back to the default model.
Replaced pkg_resources with importlib
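
For readers making the same migration in their own code, the sketch below shows standard-library replacements for two common pkg_resources calls. The helper names `package_version` and `resource_exists` are hypothetical and are not part of medaka; how medaka itself uses importlib is not described in this changelog.

```python
from importlib import metadata, resources


def package_version(name):
    """Return the installed version of a distribution, or None if absent.

    Stands in for pkg_resources.get_distribution(name).version.
    """
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def resource_exists(package, filename):
    """Check that a file ships inside an importable package.

    Stands in for pkg_resources.resource_exists(package, filename).
    Requires Python 3.9+ for importlib.resources.files.
    """
    return resources.files(package).joinpath(filename).is_file()
```

For example, `resource_exists("json", "decoder.py")` checks for a file inside the standard-library `json` package, which makes the helper easy to try without installing anything.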

Removed

Python 3.8 support (EOL).

Fixed

Updated documentation with inference and sequence command renaming.
Changed default model resolved from bam file from variant to consensus.
Fixed issue with initializing inference in Medaka tandem model.
Fixed a memory leak in the Medaka C library and removed redundant memory objects to reduce the footprint.

Changed

Fully refactored and redesigned medaka tandem code and optimised CPU-based execution.
Read-level models cannot be used with medaka tandem.
get_trimmed_reads now also returns the phase-set, hap and read ids.

Added

Consensus models for v5.2.0 basecaller models.
Added support for read-level consensus models for v5.0.0 and v5.2.0 basecaller models.
Models dna_r10.4.1_e8.2_5khz_400bps_sup and dna_r10.4.1_e8.2_5khz_400bps_hac added
as aliases to those without _5khz_ in their names.
Added -B option to medaka_consensus to allow passing a bed file or region to polish
via medaka inference --regions.
Added --cpu option to medaka inference to force CPU and avoid searching for GPUs.
New output format for medaka tandem tailored for population studies.
New fields to medaka tandem output: depth, read lengths, read names, phase sets, and MAD of read lengths.
Read length–based outlier detection in medaka tandem.
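
The changelog does not say how medaka tandem detects read-length outliers, nor whether MAD here means the median or the mean absolute deviation. As an illustration only, the sketch below assumes the median absolute deviation and a hypothetical threshold of three MADs; the function names and the threshold are assumptions, not medaka's implementation.

```python
from statistics import median


def mad(values):
    """Median absolute deviation of a non-empty sequence of numbers."""
    centre = median(values)
    return median(abs(v - centre) for v in values)


def length_outliers(read_lengths, threshold=3.0):
    """Return indices of reads whose length is far from the median.

    A read is flagged when |length - median| > threshold * MAD.
    If the MAD is zero, any read differing from the median is flagged.
    """
    centre = median(read_lengths)
    spread = mad(read_lengths)
    return [
        i for i, n in enumerate(read_lengths)
        if abs(n - centre) > threshold * spread
    ]
```

The MAD is often preferred over the standard deviation for this kind of filter because a single very long or very short read barely shifts it.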

Fixed

medaka smolecule was broken by change from medaka consensus to medaka inference.

Changed

Improved error message when model is not found.

Switched from tensorflow to pytorch.

Existing models for recent basecallers have been converted to the new format.
Pytorch format models contain a _pt suffix in the filename.

Changed

Inference is now performed using PyTorch instead of TensorFlow.
The medaka consensus command has been renamed to medaka inference to reflect
its function in running an arbitrary model and avoid confusion with medaka_consensus.
The medaka stitch command has been renamed to medaka sequence to reflect its
function in creating a consensus sequence.
The medaka variant command has been renamed to medaka vcf to reflect its function
in consolidating variants and avoid confusion with medaka_variant.
Order of arguments to medaka vcf has been changed to be more consistent
with medaka sequence.
The helper script medaka_haploid_variant has been renamed medaka_variant to
save typing.
Make --ignore_read_groups option available to more medaka subcommands including inference.
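
The renames listed above can be summarised as a lookup table, which may help when updating existing pipelines. The sketch below is a hypothetical helper (`updated_name` is not part of medaka) that rewrites only the program or subcommand name. It leaves arguments untouched, so calls that become medaka vcf still need their argument order checked by hand.

```python
# Renames taken from the changelog entries above.
RENAMED_SUBCOMMANDS = {
    "consensus": "inference",
    "stitch": "sequence",
    "variant": "vcf",
}
RENAMED_SCRIPTS = {"medaka_haploid_variant": "medaka_variant"}


def updated_name(argv):
    """Return a copy of argv with an old program or subcommand name replaced.

    Only names are rewritten; all other arguments are passed through as-is.
    """
    if not argv:
        return []
    program, rest = argv[0], list(argv[1:])
    if program in RENAMED_SCRIPTS:
        return [RENAMED_SCRIPTS[program]] + rest
    if program == "medaka" and rest and rest[0] in RENAMED_SUBCOMMANDS:
        rest[0] = RENAMED_SUBCOMMANDS[rest[0]]
    return [program] + rest
```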

Removed

The medaka snp command has been removed. This was long defunct as diploid SNP calling
had been deprecated, and medaka variant is used to create VCFs for current models.
Loading models in hdf format has been deprecated.
Deleted minimap2 and racon wrappers in medaka/wrapper.py.

Added

Release conda packages for Linux (x86 and aarch64) and macOS (arm64).
Option --lr_schedule allows using cosine learning rate schedule in training.
Option --max_valid_samples to set number of samples in a training validation batch.
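
For context on the --lr_schedule option, a cosine schedule lowers the learning rate smoothly from a maximum to a minimum over the course of training. The sketch below shows the standard cosine annealing formula; the function name and parameters are illustrative assumptions and do not reflect how medaka implements the option.

```python
import math


def cosine_lr(step, total_steps, lr_max, lr_min=0.0):
    """Learning rate at `step` under cosine annealing from lr_max to lr_min."""
    # Clamp the step so the rate stays within [lr_min, lr_max].
    progress = min(max(step, 0), total_steps) / total_steps
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * progress))
```

The rate starts at `lr_max`, passes the midpoint of the two bounds halfway through training, and reaches `lr_min` at the final step.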

Fixed

Training models with DiploidLabelScheme uses categorical cross-entropy loss
instead of binary cross-entropy.

(Probably) final version of medaka using tensorflow. Future versions will use
pytorch instead.

Fixed

medaka_consensus: only keep bam tags if input file matches joint polishing pipeline.
Pin numpy to <2.0.0.

Added

Consensus and variant models lookup for v3.5.1 Dorado models.

Fixed

tandem: Use haplotag 0 in unphased mode.
tandem: Don't run consensus if regions set is empty.

Added

Models for version 5 basecaller models.
Expose sym_indels option for training.
Expose --min_mapq minimum mapping quality alignment filtering option for medaka consensus.
tandem: Option --ignore_read_groups to ignore read groups present in input file.
Wrapper script medaka_consensus_joint and convenience tools (prepare_tagged_bam,
get_model_dtypes) to facilitate joint polishing with multiple datatypes.

Added

Consensus and variant models for v4.3.0 dorado models.

Added

Parsing model information from fastq headers output by Guppy and MinKNOW.

Changed

Additional explanatory information in VCF INFO fields concerning depth calculations.

Fixed

Do not exit if model cannot be interpreted, use the default instead.
An issue with co-ordinate handling in computing variants from alignments.

Added

Ability to use basecaller model name as --model argument.
Better handling of errors when running abpoa.

Fixed

Correct suffix of consensus file when medaka_consensus outputs a fastq.

Added

Choice of model file can be introspected from input files. For BAM files the
read group (RG) headers are searched according to the dorado specification,
whilst for .fastq files the comment section of a number of reads are checked
for corresponding read group information. In the latter case see README for
information on correctly converting basecaller output to .fastq whilst
maintaining the relevant meta information.
medaka tools resolve_model can display the model that would automatically
be used for a given input file.
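
To illustrate the BAM case, the sketch below scans header text for read group lines and extracts a basecaller model name. It assumes the RG line's DS field holds space-separated key=value pairs including a `basecall_model=` entry, as in dorado output; the function name is hypothetical and medaka's actual lookup may differ.

```python
def basecall_models_from_header(header_text):
    """Collect basecaller model names from @RG lines of a SAM/BAM header.

    Assumes the DS field carries a `basecall_model=<name>` entry among
    space-separated key=value pairs. Names are returned in order of first
    appearance, without duplicates.
    """
    models = []
    for line in header_text.splitlines():
        if not line.startswith("@RG"):
            continue
        for field in line.split("\t")[1:]:
            if not field.startswith("DS:"):
                continue
            for item in field[3:].split():
                key, _, value = item.partition("=")
                if key == "basecall_model" and value and value not in models:
                    models.append(value)
    return models
```

Finding more than one distinct model name in a file is a useful signal that reads from different basecaller versions have been mixed.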

Changed

If no model is provided on the command line (medaka consensus,
medaka_consensus, and medaka_haploid_variant), an attempt is made to choose
the appropriate model automatically.

Releases: nanoporetech/medaka

v2.1.1

Added

Fixed

Changed

Removed

v2.1.0

Fixed

Changed

Added

v2.0.1

Fixed

Changed

v2.0.0

Changed

Removed

Added

Fixed

v1.12.1

Fixed

Added

v1.12.0

Fixed

Added

v1.11.3

Added

v1.11.2

Added

Changed

v1.11.1

Fixed

Added

v1.11.0

Fixed

Added

Changed
