A Genomic Language Model for Detecting WGA Chimeric Artifacts
A deep learning-powered tool to identify chimeric artifacts introduced by whole genome amplification (WGA).
No installation required! Try ChimeraLM instantly in your browser:
🤗 Launch Web Demo on Hugging Face Spaces
Perfect for:
- 🧪 Testing with individual sequences
- 📊 Visualizing prediction confidence scores
- 🎓 Learning about chimeric artifact detection
- 🔬 Quick validation before batch processing
For production use with BAM files and batch processing, install the CLI tool below.
pip install chimeralm
Requirements: Python 3.10, 3.11, or 3.12
For GPU support, installation instructions, and troubleshooting, see the Installation Guide.
# Predict chimeric reads (CPU)
chimeralm predict your_data.bam
# Predict with GPU acceleration
chimeralm predict your_data.bam --gpus 1 --batch-size 24
# Filter BAM to remove chimeric reads
chimeralm filter your_data.bam your_data.predictions
Output:
- Predictions: Tab-separated file with read names and labels (0=biological, 1=chimeric)
- Filtered BAM: {input}.filtered.sorted.bam with chimeric reads removed
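The predictions file described above can be summarized with a few lines of Python. This is a minimal sketch, not part of ChimeraLM itself: it assumes the file has two tab-separated columns (read name, then label) and skips any row whose second column is not 0 or 1, in case a header row is present. The read names in the demo are made up.

```python
import csv
import io
from collections import Counter


def load_predictions(handle):
    """Map read name -> label (0 = biological, 1 = chimeric).

    Assumes two tab-separated columns: read name, then label.
    """
    labels = {}
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank lines and rows whose label is not 0/1 (e.g. a header).
        if len(row) < 2 or row[1].strip() not in {"0", "1"}:
            continue
        labels[row[0]] = int(row[1])
    return labels


def summarize(labels):
    """Count biological and chimeric reads and compute the chimeric fraction."""
    counts = Counter(labels.values())
    total = len(labels)
    chimeric = counts.get(1, 0)
    return {
        "total": total,
        "biological": counts.get(0, 0),
        "chimeric": chimeric,
        "chimeric_fraction": chimeric / total if total else 0.0,
    }


# Demo on in-memory data; the read names here are hypothetical.
example = io.StringIO("read_001\t0\nread_002\t1\nread_003\t0\nread_004\t1\n")
labels = load_predictions(example)
summary = summarize(labels)
print(summary)
```

For a real file, pass an open handle instead, for example `open("your_data.predictions", newline="")`, adjusting the path to wherever your predictions were written.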
Need more help? See the Quick Start Tutorial for a complete walkthrough.
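When several BAM files need the same prediction step, the commands shown above can be generated programmatically. The sketch below only builds the argument lists for `chimeralm predict` using the `--gpus` and `--batch-size` flags from the examples above. The helper names and the sample file names are hypothetical, and nothing is executed unless you pass each command to `subprocess.run` yourself.

```python
import shlex
from pathlib import Path


def build_predict_command(bam, gpus=None, batch_size=None):
    """Return the argv list for one `chimeralm predict` call.

    Only the flags shown in this README are used; both are optional.
    """
    cmd = ["chimeralm", "predict", str(bam)]
    if gpus is not None:
        cmd += ["--gpus", str(gpus)]
    if batch_size is not None:
        cmd += ["--batch-size", str(batch_size)]
    return cmd


def build_batch(bams, **options):
    """Build one predict command per BAM file, sharing the same options."""
    return [build_predict_command(bam, **options) for bam in bams]


# Hypothetical input files, for illustration only.
bams = [Path("sample_a.bam"), Path("sample_b.bam")]
commands = build_batch(bams, gpus=1, batch_size=24)

for cmd in commands:
    # Print the shell-quoted command; to run it, use
    # subprocess.run(cmd, check=True) with ChimeraLM installed.
    print(shlex.join(cmd))
```

Building argument lists rather than shell strings avoids quoting problems when file names contain spaces.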
Full documentation is available at ylab-hi.github.io/ChimeraLM
Key Resources:
- Installation Guide - Setup with pip, conda, uv, or from source
- Quick Start Tutorial - Your first prediction in 15 minutes
- CLI Reference - Complete command documentation
- BAM Filtering Tutorial - Comprehensive filtering guide
- Performance Optimization - Speed up your analysis
- Troubleshooting - Common issues and solutions
Features:
- 🌐 Interactive Web Demo: Try it online at HuggingFace Spaces - no installation needed!
- 🎯 High Accuracy: Deep learning model trained on real WGA data
- ⚡ GPU Accelerated: Optimized for CUDA, MPS (Apple Silicon), and CPU
- 🚀 Easy to Use: Simple CLI with sensible defaults
- 📦 Fast Processing: Batch inference with configurable parallelism
- 🖥️ Local Web Interface: Run the web UI locally with chimeralm ui
- 🏭 Production Ready: Includes filtering, sorting, and indexing of BAM files
Contributions are welcome! See our Contributing Guide for development setup and guidelines.
If you use ChimeraLM in your research, please cite:
@software{chimeralm2025,
title={ChimeraLM: A genomic language model to identify chimera artifacts},
author={Li, Yangyang and Guo, Qingxiang and Yang, Rendong},
year={2025},
url={https://github.com/ylab-hi/ChimeraLM}
}
Apache License 2.0 - see LICENSE for details.