Releases: TimoLassmann/kalign
Kalign 3.5.1
Bugfix release following v3.5.0. All new features from 3.5.0 are included — this release fixes issues found during the initial release.
What's new in 3.5 (since 3.4)
- Three alignment modes: default (balanced), fast (same speed as v3.4), precise (ensemble, highest accuracy)
- Ensemble alignment (--precise or --ensemble N): multiple runs with varied parameters, POAR consensus
- Per-column confidence scores from ensemble mode
- Alignment refinement (--refine): iterative realignment for improved accuracy
- PFASUM substitution matrices (--type pfasum, pfasum43, pfasum60)
- Variable Scoring Matrix (VSM): adaptive scoring based on sequence divergence
- Python package on PyPI: pip install kalign-python
Fixes in 3.5.1
- Fix memory leak in build_tree_from_pairwise (affected realign and ensemble modes)
- Move seaborn from core dependency to optional (pip install kalign-python[analysis])
- Benchmark workflow handles unavailable datasets gracefully
- Code formatting fixes
Install
Python
pip install kalign-python
CMake (from source)
mkdir build && cd build
cmake ..
make
make test
sudo make install
Zig (cross-compilation)
zig build
Homebrew (macOS)
brew install kalign
Quick start
Command line
kalign input.fasta -o aligned.fasta
# Fast mode (same as v3.4)
kalign --fast input.fasta -o aligned.fasta
# Precise mode (ensemble, highest accuracy)
kalign --precise input.fasta -o aligned.fasta
Python
import kalign
# Default mode
aligned = kalign.align(["MKTAYIAK...", "MKTAYIAKQ..."])
# Precise mode (ensemble)
aligned = kalign.align(sequences, mode="precise")
Kalign v3.4.6-rc1 (Pre-release)
Added Python API
Version 3.4.0
- Added a simple sequence simulator for testing
- Fixed an issue where aligning the same set of sequences with different numbers of threads produced slightly different alignments.
Minor feature update
Kalign now detects and ignores empty entries in fasta inputs.
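As a rough illustration of what ignoring empty entries means, here is a minimal Python sketch of a FASTA reader that drops records with a header but no residues. This is not Kalign's own parser (which is written in C); the function name read_fasta_skip_empty is hypothetical and chosen only for this example.

```python
def read_fasta_skip_empty(text):
    """Parse FASTA text and return (name, sequence) pairs, dropping empty entries.

    Hypothetical helper for illustration only; not part of Kalign.
    """
    records = []
    name, chunks = None, []
    # A sentinel header at the end flushes the final record.
    for line in text.splitlines() + [">"]:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            seq = "".join(chunks)
            # Keep the previous record only if it has at least one residue.
            if name is not None and seq:
                records.append((name, seq))
            name, chunks = line[1:].strip(), []
        else:
            chunks.append(line)
    return records


example = ">a\nMKT\nAYI\n>empty\n>b\n\nMKQ\n"
print(read_fasta_skip_empty(example))  # [('a', 'MKTAYI'), ('b', 'MKQ')]
```

The entry named "empty" has a header but no sequence lines, so it is skipped rather than passed on to the aligner.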
Version 3.3.4: switched to cmake
Switched to cmake
Additional changes:
- added a Kalign library to make it easier to use Kalign from other projects
- added a block version of Gene Myers' bit-parallel string matching code (described here: Myers, Gene. "A fast bit-vector algorithm for approximate string matching based on dynamic programming." Journal of the ACM (JACM) 46.3 (1999): 395-415). This means Kalign will now run equivalently on processors with and without AVX2 instructions (e.g. Apple M1 / M2 and other ARM chips).
- alignment types giving users more control over alignment parameters.
- multi-threading
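For readers unfamiliar with the bit-parallel idea behind the Myers algorithm mentioned above, the following Python sketch computes a global edit distance with bit-vector operations. It is an assumption-laden illustration, not Kalign's code: it uses the single-word formulation adapted for global distance (as described by Hyyrö) and relies on Python's arbitrary-precision integers instead of the fixed-width blocks that a C block version would use. The function name bit_parallel_edit_distance is hypothetical.

```python
def bit_parallel_edit_distance(pattern, text):
    """Global (Levenshtein) edit distance using Myers-style bit vectors.

    Illustrative sketch only; Python ints act as unbounded bit vectors.
    """
    m = len(pattern)
    if m == 0:
        return len(text)

    # peq[c] has bit i set where pattern[i] == c.
    peq = {}
    for i, c in enumerate(pattern):
        peq[c] = peq.get(c, 0) | (1 << i)

    mask = (1 << m) - 1   # keep only the m bits that represent the column
    high = 1 << (m - 1)   # bit for the last row of the DP matrix
    pv, mv = mask, 0      # vertical deltas: +1 everywhere in the first column
    score = m             # distance between pattern and the empty prefix of text

    for c in text:
        eq = peq.get(c, 0)
        xv = eq | mv
        xh = (((eq & pv) + pv) ^ pv) | eq
        ph = mv | ~(xh | pv)   # horizontal delta +1
        mh = pv & xh           # horizontal delta -1

        # The last row's horizontal delta updates the running score.
        if ph & high:
            score += 1
        elif mh & high:
            score -= 1

        # Shift in +1 at row 0: the top DP row grows by one per text character.
        ph = ((ph << 1) | 1) & mask
        mh = (mh << 1) & mask
        pv = (mh | ~(xv | ph)) & mask
        mv = ph & xv & mask

    return score


print(bit_parallel_edit_distance("kitten", "sitting"))  # 3
```

Each text character is processed with a constant number of word operations, which is what makes the approach fast without relying on SIMD instructions such as AVX2.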
Kalign v3.3.2
version 3.3.2 - Bug Fix
There was a bug in building a guide tree from highly similar sequences. The fix
involved distributing identical sequences equally among branches. The problem only happened
in cases when there were thousands of identical sequences.
In addition, Kalign now compiles on Apple's M1 chip and possibly on other ARM architectures
as well (although I did not test the latter).
v3.3.1 - Minor improvements
The previous version of kalign checked only the first 50 sequences of the input to determine
whether the sequences were already aligned. If those 50 sequences were not aligned
but later sequences contained gaps (or other characters!), kalign could crash. In this
version (3.3.1) kalign checks all sequences, thereby avoiding this issue.
To alert users to the situation described above and to warn users about the presence of
odd characters, kalign now produces a warning message like this:
[Date Time] : LOG : Start io tests.
[Date Time] : LOG : reading: dev/data/a2m.good.1
[Date Time] : LOG : Detected protein sequences.
[Date Time] : WARNING : -------------------------------------------- (rwalign.c line 505)
[Date Time] : WARNING : The input sequences contain gap characters: (rwalign.c line 506)
[Date Time] : WARNING : "-" : 36 found (rwalign.c line 510)
[Date Time] : WARNING : BUT the sequences do not seem to be aligned! (rwalign.c line 514)
[Date Time] : WARNING : (rwalign.c line 515)
[Date Time] : WARNING : Kalign will remove the gap characters and (rwalign.c line 516)
[Date Time] : WARNING : align the sequences. (rwalign.c line 517)
[Date Time] : WARNING : -------------------------------------------- (rwalign.c line 518)
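The behaviour described in this release (scan every sequence, count gap characters, and strip them if the input does not look aligned) can be sketched in Python as follows. This is only an illustration under stated assumptions: the release notes do not say how kalign decides that input is aligned, so the rule used here (gaps present and all sequences of equal length) is an assumption, as are the names GAP_CHARS, inspect_input and strip_gaps.

```python
GAP_CHARS = {"-"}  # assumption: only "-" is treated as a gap in this sketch


def inspect_input(sequences):
    """Scan every sequence and return (gap_count, looks_aligned).

    Assumed rule: the input looks aligned if it contains gaps and all
    sequences have the same length. Kalign's real test may differ.
    """
    gap_count = sum(1 for seq in sequences for ch in seq if ch in GAP_CHARS)
    same_length = len({len(seq) for seq in sequences}) <= 1
    return gap_count, gap_count > 0 and same_length


def strip_gaps(sequences):
    """Remove gap characters so the sequences can be aligned from scratch."""
    return ["".join(ch for ch in seq if ch not in GAP_CHARS) for seq in sequences]


seqs = ["MK-T", "MKQTA"]
gaps, aligned = inspect_input(seqs)
if gaps and not aligned:
    # Mirrors the situation in the warning: gaps found, but input not aligned.
    seqs = strip_gaps(seqs)
print(gaps, aligned, seqs)  # 1 False ['MKT', 'MKQTA']
```

Because the scan covers all sequences rather than a fixed prefix, a gap that first appears late in the input is still detected.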
v3.3 - Multi-threading
Kalign now runs pairwise distance estimation, guide tree building and alignments in parallel.
Memory optimisations.
Optimised bi-sectional K-means algorithm.
Added -clean option to check for sequences with identical names but different sequences.
Fixed minor bug in alignment I/O module.
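The kind of check the -clean option is described as performing (identical names, different sequences) could look roughly like the Python sketch below. What kalign does once it finds such a conflict is not stated in these notes, so this example only reports the conflicting names; the function name find_name_conflicts is hypothetical.

```python
from collections import defaultdict


def find_name_conflicts(records):
    """Return the names that appear with more than one distinct sequence.

    `records` is an iterable of (name, sequence) pairs. Illustration only;
    this is not Kalign's implementation.
    """
    seqs_by_name = defaultdict(set)
    for name, seq in records:
        seqs_by_name[name].add(seq)
    return sorted(name for name, seqs in seqs_by_name.items() if len(seqs) > 1)


records = [("a", "MKT"), ("b", "MKQ"), ("a", "MKT"), ("b", "MKA")]
print(find_name_conflicts(records))  # ['b']
```

Exact duplicates (same name, same sequence, like "a" here) are not reported, since they do not conflict.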
v3.2.3
Replaced timing code.
v3.2.2
Minor bugs fixed, including:
- bug in alignment write test