MarinDNA

Open development of genomic language models — data, modeling, and evaluation.

Inspired by Marin.

Experiments

Experiments are tracked as GitHub issues. See the experiment-labeled issues.

Leaderboard

Variant effect prediction leaderboards (under construction): openathena.ai/marin-dna.

Installation

uv sync

Optional installs (all opt-in)

| Selector | Purpose |
| --- | --- |
| --group dev | Pre-commit, ruff, pytest, snakefmt. |
| --extra marin | marin / marin-levanter / marin-iris / marin-zephyr / marin-rigging, for marin-launched DNA experiments under experiments/. Lives as an extra (not a group) so iris workers can install it via uv sync --extra marin. |
| --group enhancer-classification | AlphaGenome-Pytorch, Lightning, py2bit, for the enhancer-classification training path. |
| --group alphagenome-eval | AlphaGenome, for AlphaGenome eval pipelines. |
| --group aws-cli | awscli, for snakemake rules that shell out to aws s3 cp (e.g. evals/ldscore_download). |

The marin extra and aws-cli group are mutually exclusive (awscli pins fsspec/s3fs older than marin's requirements). For TPU training under marin, also pass --extra tpu:

uv sync --extra marin --extra tpu
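
Because the marin extra and the aws-cli group cannot be installed together, it can help to check a selector combination before syncing. The following is a minimal sketch, not part of the repository: the function name marin_dna_sync_cmd is hypothetical, and it only prints the uv sync command it would run, so it is safe to try without changing the environment.

```shell
#!/bin/sh
# Hypothetical helper (an assumption, not shipped with marin-dna).
# Builds a `uv sync` command line from the given selectors and rejects the
# one combination the README describes as mutually exclusive.
marin_dna_sync_cmd() {
    has_marin=0
    has_aws=0
    prev=""
    for arg in "$@"; do
        # Look at each flag together with the value that follows it.
        case "$prev $arg" in
            "--extra marin")   has_marin=1 ;;
            "--group aws-cli") has_aws=1 ;;
        esac
        prev=$arg
    done
    if [ "$has_marin" -eq 1 ] && [ "$has_aws" -eq 1 ]; then
        echo "error: --extra marin and --group aws-cli are mutually exclusive" >&2
        return 1
    fi
    # Print the command rather than running it.
    echo "uv sync $*"
}

marin_dna_sync_cmd --extra marin --extra tpu
marin_dna_sync_cmd --group dev --group aws-cli
marin_dna_sync_cmd --extra marin --group aws-cli || echo "rejected"
```

The first two calls print the corresponding uv sync command; the third prints an error to stderr and then "rejected". To actually install, the final echo could be replaced with uv sync "$@".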

Development

# Install dev dependencies and pre-commit hooks
uv sync --group dev
uv run pre-commit install

# Run quality checks
uv run pre-commit run

# Run tests
uv run pytest

Project Structure

See AGENTS.md.

Community

Join the Marin Discord; MarinDNA discussion happens in the #dna channel.

Citation

If you find datasets, models, or experiments from this repo useful, please cite:

MarinDNA: open development of genomic language models. Open Athena, 2026. https://github.com/Open-Athena/marin-dna

BibTeX:

@misc{marin-dna,
  title  = {MarinDNA: open development of genomic language models},
  author = {{Open Athena}},
  year   = {2026},
  url    = {https://github.com/Open-Athena/marin-dna},
}
