Mod-Minimizer-Minimap2

This is a fork of minimap2 that replaces the minimizer algorithm with the mod-minimizer scheme, based on the paper:

The mod-minimizer: a simple and efficient sampling algorithm for long k-mers
Ragnar Groot Koerkamp, Giulio Ermanno Pibiri
bioRxiv 2024.05.25.595898; doi: 10.1101/2024.05.25.595898

Changes

This fork modifies the minimap2 code to select (w,k)-minimizers on DNA sequences with the mod-minimizer algorithm instead of the standard minimizer. According to the paper, the mod-minimizer is a simple sampling scheme that achieves a lower density than classical random minimizers when k is large compared to w.

Note: With the same parameters, this implementation samples fewer minimizers than the original minimap2 (lower minimizer density). To get approximately the same minimizer density as minimap2's default settings, use the flag -w 8.

Mod-Minimizer Algorithm Overview

The mod-minimizer algorithm finds (w,k)-minimizers on a DNA sequence using the following procedure:

Notation:

tmer: The entering t-mer, formed from the previous t-mer by removing its first base and appending the new base.
W: The t-mers (stored as their hashes) in the current window, ordered from oldest (W_0) to newest.
tmer_i: The i-th t-mer of the window.
kmer_i: The i-th k-mer of the window.
h(a): Hash of sequence a.
rc(a): Reverse complement of sequence a.
pos_W(a): Position of a within window W (0-indexed).
M: List of minimizers.
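
As an illustration of the notation, the sketch below shows one way to maintain tmer and rc(tmer) incrementally with a 2-bit base encoding and to compute min( h(tmer), h(rc(tmer)) ) for every t-mer. It is written in Python for brevity and is not the fork's C code. The names ENC, h, revcomp and canonical_tmer_hashes, and the mixing function used for h, are assumptions made for this example; the input is assumed to contain only A, C, G and T, and t is assumed to be at most 32.

```python
# Illustrative sketch only; names and the hash are assumptions, not the fork's code.
ENC = {"A": 0, "C": 1, "G": 2, "T": 3}
MASK64 = (1 << 64) - 1


def h(x):
    """Stand-in 64-bit mixing hash (the README does not specify h)."""
    x = (x * 0x9E3779B97F4A7C15) & MASK64
    return x ^ (x >> 32)


def revcomp(seq):
    """Reverse complement of a DNA string over A, C, G, T."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def canonical_tmer_hashes(seq, t):
    """Return min(h(tmer), h(rc(tmer))) for every t-mer of seq, left to right."""
    mask = (1 << (2 * t)) - 1
    shift = 2 * (t - 1)
    fwd = rev = 0
    out = []
    for i, base in enumerate(seq):
        c = ENC[base]
        # Forward strand: drop the first base, append the new base.
        fwd = ((fwd << 2) | c) & mask
        # Reverse strand: drop the last base, prepend the complement of the new base.
        rev = (rev >> 2) | ((3 - c) << shift)
        if i >= t - 1:  # a full t-mer is available
            out.append(min(h(fwd), h(rev)))
    return out
```

Because a t-mer and its reverse complement get the same value, calling canonical_tmer_hashes on a sequence and on its reverse complement yields the same list in reverse order.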

Procedure:

For each new base, as the window slides along the sequence:

Construct the entering t-mer:
tmer = previous t-mer without its first base, with the new base appended
Update Window (remove the oldest t-mer from W):
W = W \ {W_0}
Compute t-mer Info:
info = min( h(tmer), h(rc(tmer)) )
Add t-mer to Window:
W = W ∪ {info}
Find Minimal t-mer:
m = min(W)
Compute Position:
p = pos_W(m) mod w
Select k-mer:
kmer = min( h(kmer_p), h(rc(kmer_p)) )
Update Minimizers:
M = M ∪ {kmer}
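
The procedure above can be sketched in a few lines of Python. This is a minimal, non-optimized illustration under stated assumptions, not the fork's C implementation: the README does not say how t is chosen or how many t-mers a window holds, so the sketch assumes a window of w + k - t t-mers (the span of w consecutive k-mers) and, by default, the choice t = r + ((k - r) mod w) described in the paper, with r = 4 assumed here. Ties are broken by taking the leftmost minimal t-mer, BLAKE2b stands in for the unspecified hash h, and the names h, canon and mod_minimizers are hypothetical.

```python
import hashlib

COMP = str.maketrans("ACGT", "TGCA")


def h(s):
    """Stand-in hash of a DNA string (assumption: the README leaves h unspecified)."""
    return int.from_bytes(hashlib.blake2b(s.encode(), digest_size=8).digest(), "big")


def canon(s):
    """min( h(a), h(rc(a)) ): strand-independent hash of a t-mer or k-mer."""
    return min(h(s), h(s.translate(COMP)[::-1]))


def mod_minimizers(seq, w, k, t=None, r=4):
    """Return sorted (position, canonical k-mer hash) pairs sampled from seq."""
    if t is None:
        r = min(r, k)
        t = r + (k - r) % w          # assumed default, following the paper
    n_tmers = w + k - t              # t-mers per window (assumption, see lead-in)
    tm = [canon(seq[i:i + t]) for i in range(len(seq) - t + 1)]
    picked = {}                      # M, keyed by k-mer start position
    for start in range(len(seq) - (w + k - 1) + 1):
        W = tm[start:start + n_tmers]
        x = W.index(min(W))          # pos_W(min(W)), leftmost on ties
        p = x % w                    # position of the sampled k-mer in the window
        pos = start + p
        picked[pos] = canon(seq[pos:pos + k])
    return sorted(picked.items())


if __name__ == "__main__":
    demo = "ACGGTCATTGCAAGCTTAGGCATCGATCGGATTACAGCTAGCTAACGT"
    sampled = mod_minimizers(demo, w=8, k=15)
    # Density: sampled k-mers divided by the total number of k-mers.
    print(len(sampled) / (len(demo) - 15 + 1))
```

Since p is always smaller than w, every window of w consecutive k-mers contributes one sampled k-mer. With t = k the window holds exactly w t-mers, the mod has no effect, and the sketch reduces to an ordinary minimizer. The results are collected by position because the sampled position can move backwards between consecutive windows.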

For more details on the algorithm and its implementation, please refer to the original paper.

Additional Information

For all other details, usage instructions, and documentation, please refer to the original minimap2 repository. This modification and the corresponding benchmarks are based on minimap2 Release 2.28-r1209 (27 March 2024).
