Releases: nanoporetech/medaka
v2.1.1
Added
- Python 3.12 support
Fixed
- Issue with checking for the presence of dwell information in fastq files.
Changed
- Behaviour of medaka_consensus with --bacteria option: if the basecaller model cannot be parsed or is not compatible with the bacterial polishing model, exit with an error instead of falling back to default model.
- Replaced pkg_resources with importlib.
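For readers maintaining code that still calls pkg_resources, the sketch below shows what such a replacement typically looks like with the standard library. The release note does not say which calls were swapped in medaka itself, so the helper names package_file and installed_version are hypothetical, and the "json" package is only a stand-in target.

```python
from importlib import metadata, resources


def package_file(package: str, filename: str):
    """Return a handle to a file shipped inside an importable package.

    Plays the role of pkg_resources.resource_filename(package, filename).
    Requires Python 3.9+ for importlib.resources.files.
    """
    return resources.files(package).joinpath(filename)


def installed_version(distribution: str, default: str = "unknown") -> str:
    """Return the installed version of a distribution, or a default.

    Plays the role of pkg_resources.get_distribution(distribution).version.
    """
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return default


# The standard-library "json" package is used purely as a stand-in target.
init_file = package_file("json", "__init__.py")
print(init_file.is_file())
print(installed_version("a-distribution-that-is-not-installed"))
```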
Removed
- Python 3.8 support (EOL).
v2.1.0
Fixed
- Updated documentation with inference and sequence command renaming.
- Changed default model resolved from bam file from variant to consensus.
- Fixed issue with initializing inference in Medaka tandem model.
- Fixed a memory leak in the Medaka C library and removed redundant memory objects to reduce the footprint.
Changed
- Fully refactored and redesigned medaka tandem code and optimised CPU-based execution.
- Read-level models cannot be used with medaka tandem.
- get_trimmed_reads now also returns the phase-set, hap and read ids.
Added
- Consensus models for v5.2.0 basecaller models.
- Added support for read-level consensus models for v5.0.0 and v5.2.0 basecaller models.
- Models dna_r10.4.1_e8.2_5khz_400bps_sup and dna_r10.4.1_e8.2_5khz_400bps_hac added as aliases to those without _5khz_ in their names.
- Added -B option to medaka_consensus to allow passing a bed file or region to polish via medaka inference --regions.
- Added --cpu option to medaka inference to force CPU and avoid searching for GPUs.
- New output format for medaka tandem tailored for population studies.
- New fields to medaka tandem output: depth, read lengths, read names, phase sets, and MAD of read lengths.
- Read length–based outlier detection in medaka tandem.
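The model-name aliases in this release can be pictured as a simple lookup. The sketch below assumes the alias targets are obtained by dropping the 5khz segment from the name, which is inferred from the wording of the entry rather than taken from medaka's code; strip_5khz and ALIASES are hypothetical names.

```python
def strip_5khz(model_name: str) -> str:
    """Map a model name containing '_5khz_' to the same name without it."""
    return model_name.replace("_5khz_", "_", 1)


# Assumed alias table: each 5 kHz name points at its counterpart without
# the segment. The real table in medaka may be built differently.
ALIASES = {
    name: strip_5khz(name)
    for name in (
        "dna_r10.4.1_e8.2_5khz_400bps_sup",
        "dna_r10.4.1_e8.2_5khz_400bps_hac",
    )
}

for alias, target in ALIASES.items():
    print(alias, "->", target)
```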
v2.0.1
Fixed
- medaka smolecule was broken by change from medaka consensus to medaka inference.
Changed
- Improved error message when model is not found.
v2.0.0
Switched from tensorflow to pytorch.
Existing models for recent basecallers have been converted to the new format.
Pytorch format models contain a _pt suffix in the filename.
Changed
- Inference is now performed using PyTorch instead of TensorFlow.
- The medaka consensus command has been renamed to medaka inference to reflect its function in running an arbitrary model and avoid confusion with medaka_consensus.
- The medaka stitch command has been renamed to medaka sequence to reflect its function in creating a consensus sequence.
- The medaka variant command has been renamed to medaka vcf to reflect its function in consolidating variants and avoid confusion with medaka_variant.
- Order of arguments to medaka vcf has been changed to be more consistent with medaka sequence.
- The helper script medaka_haploid_variant has been renamed medaka_variant to save typing.
- Make --ignore_read_groups option available to more medaka subcommands including inference.
Removed
- The medaka snp command has been removed. This was long defunct as diploid SNP calling had been deprecated, and medaka variant is used to create VCFs for current models.
- Loading models in hdf format has been deprecated.
- Deleted minimap2 and racon wrappers in medaka/wrapper.py.
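When updating older pipelines, the renames and the removal listed above can be captured in a small translation table. The sketch below is a hypothetical migration helper, not part of medaka: it rewrites only the command name and leaves every argument untouched.

```python
# Subcommand and script renames taken from the v2.0.0 notes above.
RENAMED_SUBCOMMANDS = {
    "consensus": "inference",
    "stitch": "sequence",
    "variant": "vcf",
}
RENAMED_SCRIPTS = {"medaka_haploid_variant": "medaka_variant"}
REMOVED_SUBCOMMANDS = {"snp"}


def migrate_command(argv):
    """Translate a pre-2.0.0 command line (list of strings) to new names.

    Only the program or subcommand name is rewritten; arguments are
    passed through unchanged.
    """
    if not argv:
        return []
    program, rest = argv[0], list(argv[1:])
    if program in RENAMED_SCRIPTS:
        return [RENAMED_SCRIPTS[program]] + rest
    if program == "medaka" and rest:
        subcommand = rest[0]
        if subcommand in REMOVED_SUBCOMMANDS:
            raise ValueError(f"'medaka {subcommand}' was removed in v2.0.0")
        rest[0] = RENAMED_SUBCOMMANDS.get(subcommand, subcommand)
    return [program] + rest


print(migrate_command(["medaka", "stitch", "in.hdf", "draft.fa", "out.fa"]))
```

Because the order of arguments to medaka vcf also changed, a line translated from medaka variant still needs its arguments checked by hand against the current help text.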
Added
- Release conda packages for Linux (x86 and aarch64) and macOS (arm64).
- Option --lr_schedule allows using cosine learning rate schedule in training.
- Option --max_valid_samples to set number of samples in a training validation batch.
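As background on the cosine learning rate schedule mentioned above, the sketch below shows the general textbook form of cosine annealing. It is not medaka's implementation; the function name cosine_lr and the parameters base_lr, min_lr and total_steps are assumptions made for illustration.

```python
import math


def cosine_lr(step: int, total_steps: int, base_lr: float, min_lr: float = 0.0) -> float:
    """Learning rate at a given step under a generic cosine annealing schedule.

    The rate starts at base_lr, follows half a cosine wave, and reaches
    min_lr at total_steps. Steps outside [0, total_steps] are clamped.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    progress = min(max(step, 0), total_steps) / total_steps
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


# Print the rate at a few points of a 100-step schedule.
for step in (0, 25, 50, 75, 100):
    print(step, cosine_lr(step, total_steps=100, base_lr=1e-3))
```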
Fixed
- Training models with DiploidLabelScheme uses categorical cross-entropy loss
instead of binary cross-entropy.
v1.12.1
(Probably) final version of medaka using tensorflow. Future versions will use
pytorch instead.
Fixed
- medaka_consensus: only keep bam tags if input file matches joint polishing pipeline.
- Pin numpy to <2.0.0.
Added
- Consensus and variant models lookup for v3.5.1 Dorado models.
v1.12.0
Fixed
- tandem: Use haplotag 0 in unphased mode.
- tandem: Don't run consensus if regions set is empty.
Added
- Models for version 5 basecaller models.
- Expose sym_indels option for training.
- Expose --min_mapq minimum mapping quality alignment filtering option for medaka consensus.
- tandem: Option --ignore_read_groups to ignore read groups present in input file.
- Wrapper script medaka_consensus_joint and convenience tools (prepare_tagged_bam, get_model_dtypes) to facilitate joint polishing with multiple datatypes.
v1.11.3
Added
- Consensus and variant models for v4.3.0 dorado models.
v1.11.2
Added
- Parsing model information from fastq headers output by Guppy and MinKNOW.
Changed
- Additional explanatory information in VCF INFO fields concerning depth calculations.
v1.11.1
Fixed
- Do not exit if model cannot be interpreted, use the default instead.
- An issue with co-ordinate handling in computing variants from alignments.
Added
- Ability to use basecaller model name as --model argument.
- Better handling of errors when running abpoa.
v1.11.0
Fixed
- Correct suffix of consensus file when medaka_consensus outputs a fastq.
Added
- Choice of model file can be introspected from input files. For BAM files the read group (RG) headers are searched according to the dorado specification, whilst for .fastq files the comment section of a number of reads are checked for corresponding read group information. In the latter case see README for information on correctly converting basecaller output to .fastq whilst maintaining the relevant meta information.
- medaka tools resolve_model can display the model that would automatically be used for a given input file.
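To make the read group lookup more concrete, the sketch below pulls a basecaller model name out of a single @RG header line. It assumes the DS field holds space-separated key=value pairs including a basecall_model key, which is how the dorado output is understood here; the function name and the example header line are illustrative only, and medaka's own parsing may differ.

```python
from typing import Optional


def basecall_model_from_rg(header_line: str) -> Optional[str]:
    """Return the basecall_model value from one @RG header line, if any.

    Assumes the DS field looks like 'basecall_model=<name> runid=<id>'.
    """
    if not header_line.startswith("@RG"):
        return None
    # SAM header fields are tab-separated TAG:VALUE pairs.
    for field in header_line.rstrip("\n").split("\t")[1:]:
        tag, _, value = field.partition(":")
        if tag != "DS":
            continue
        for item in value.split():
            key, _, val = item.partition("=")
            if key == "basecall_model" and val:
                return val
    return None


# Illustrative header line, not copied from a real file.
example = "@RG\tID:run1_model\tDS:basecall_model=dna_r10.4.1_e8.2_400bps_sup@v5.0.0 runid=run1\tPL:ONT"
print(basecall_model_from_rg(example))
```

For real input files, medaka tools resolve_model remains the way to see which model would be chosen.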
Changed
- If no model is provided on command-line interface (medaka consensus,
medaka_consensus, and medaka_haploid_variant) automatic attempts will be made
to choose the appropriate model.