This is the Biomni repository for the NUS Red Cell Engineering Lab.
- `app/`: Core application to run Biomni as a dockerized agent receiving messages via WebSocket.
- The rest of the codebase is inherited from the original Biomni repository.
- This repo is dockerized into 2 separate images:
  - `biomni_env`: The base conda environment with all dependencies installed. The Dockerfile is in `biomni_env/Dockerfile`.
  - `biomni_app`: The app code that runs the WebSocket server and the agent. The Dockerfile is in `app/Dockerfile`.
- Each time a user wants to start a new session, the frontend sends a request to the backend, which asks the PBS job scheduler on the HPC cluster to start a new Docker container from the `biomni_app` image.
- The container starts an agent instance; all communication between the container and the backend is done via standard input/output.
- The frontend sends user messages to the backend via WebSocket, and the backend forwards them to the container via standard input.
- All files generated during the session are stored in a shared volume mounted to the container. When the session ends, the container is destroyed, but the files are kept in the shared volume.
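The stdin/stdout bridge described above can be sketched as a simple line-delimited loop. This is purely illustrative: the JSON envelope (`type`/`content` fields) is an assumed message format, not the actual protocol used by `app/`.

```python
import json
import sys

def handle_message(line: str) -> str:
    """Turn one line from the backend into one reply line.

    The {"type": ..., "content": ...} envelope is hypothetical;
    the real app/ protocol may use a different message format.
    """
    msg = json.loads(line)
    reply = {"type": "agent_response", "content": f"echo: {msg['content']}"}
    return json.dumps(reply)

def run_loop() -> None:
    # Inside the container: read backend messages from stdin, one per
    # line, and write replies to stdout for the backend to relay.
    for line in sys.stdin:
        if line.strip():
            print(handle_message(line), flush=True)

# Demo round-trip without a backend attached:
print(handle_message('{"type": "user_message", "content": "hello"}'))
```

A real agent would replace the `echo:` reply with a call into Biomni and stream intermediate results the same way.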
- First, make sure you have created a separate conda environment on the HPC cluster. You can do so by running:

```bash
conda env create -f biomni_env/environment.yml
conda activate biomni_e1
pip install biomni --upgrade
```

- This is so that you can run the `create_data_lake.py` script to download the data lake (11GB) to the shared volume. You only need to do this once:

```bash
python create_data_lake.py
```

- Next, pull the `biomni_app` and `biomni_env` images from Docker Hub:
```bash
docker pull uylulu/biomni_app:latest
docker pull uylulu/biomni_env:latest
```

- Then, create a new directory to store the files generated during the session. This directory will be mounted to the container as a shared volume:
```bash
mkdir /path/to/generated_files
```

- Finally, you can run the `biomni_app` image with the following command:

```bash
docker run -it -v /path/to/data_lake:/app/data -v /path/to/generated_files:/app/generated_files uylulu/biomni_app:latest
```

- You can check if the container is running by using:
```bash
curl http://localhost:8000/health
```

Our software environment is massive, and we provide a single `setup.sh` script to set it up. Follow this file to set up the environment first.
Then activate the environment E1:

```bash
conda activate biomni_e1
```

Then install the official Biomni pip package:

```bash
pip install biomni --upgrade
```

For the latest updates, install from the GitHub source, or do:

```bash
pip install git+https://github.com/snap-stanford/Biomni.git@main
```

Lastly, configure your API keys using one of the following methods:
Create a `.env` file in your project directory:

```bash
# Copy the example file
cp .env.example .env
# Edit the .env file with your actual API keys
```

Your `.env` file should look like:
```bash
# Required: Anthropic API key for Claude models
ANTHROPIC_API_KEY=your_anthropic_api_key_here

# Optional: OpenAI API key (if using OpenAI models)
OPENAI_API_KEY=your_openai_api_key_here

# Optional: Azure OpenAI API key (if using Azure OpenAI models)
OPENAI_API_KEY=your_azure_openai_api_key
OPENAI_ENDPOINT=https://your-resource-name.openai.azure.com/

# Optional: AI Studio Gemini API key (if using Gemini models)
GEMINI_API_KEY=your_gemini_api_key_here

# Optional: Groq API key (if using Groq as model provider)
GROQ_API_KEY=your_groq_api_key_here

# Optional: Set the source of your LLM, for example:
# "OpenAI", "AzureOpenAI", "Anthropic", "Ollama", "Gemini", "Bedrock", "Groq", "Custom"
LLM_SOURCE=your_LLM_source_here

# Optional: AWS Bedrock configuration (if using AWS Bedrock models)
AWS_BEARER_TOKEN_BEDROCK=your_bedrock_api_key_here
AWS_REGION=us-east-1

# Optional: Custom model serving configuration
# CUSTOM_MODEL_BASE_URL=http://localhost:8000/v1
# CUSTOM_MODEL_API_KEY=your_custom_api_key_here

# Optional: Biomni data path (defaults to ./data)
# BIOMNI_DATA_PATH=/path/to/your/data

# Optional: Timeout settings (defaults to 600 seconds)
# BIOMNI_TIMEOUT_SECONDS=600
```

Alternatively, configure your API keys in your bash profile `~/.bashrc`:
```bash
export ANTHROPIC_API_KEY="YOUR_API_KEY"
export OPENAI_API_KEY="YOUR_API_KEY" # optional if you just use Claude
export OPENAI_ENDPOINT="https://your-resource-name.openai.azure.com/" # optional unless you are using Azure
export AWS_BEARER_TOKEN_BEDROCK="YOUR_BEDROCK_API_KEY" # optional, for AWS Bedrock models
export AWS_REGION="us-east-1" # optional, defaults to us-east-1 for Bedrock
export GEMINI_API_KEY="YOUR_GEMINI_API_KEY" # optional, if you want to use a Gemini model
export GROQ_API_KEY="YOUR_GROQ_API_KEY" # optional, set this to use models served by Groq
export LLM_SOURCE="Groq" # optional, set this to use models served by Groq
```
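Whichever method you choose, you can quickly confirm that the variables actually reached your environment before launching the agent. A minimal stdlib-only sketch (the variable names mirror the examples above; this helper is not part of Biomni itself):

```python
import os

REQUIRED = ["ANTHROPIC_API_KEY"]  # required for the default Claude models

def missing_keys(env, required=REQUIRED):
    """Return the names of required variables that are unset or empty."""
    return [name for name in required if not env.get(name)]

missing = missing_keys(os.environ)
if missing:
    print("Missing:", ", ".join(missing))
else:
    print("All required API keys are set.")
```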
Some Python packages are not installed by default in the Biomni environment due to dependency conflicts. If you need these features, you must install the packages manually and may need to uncomment relevant code in the codebase. See the up-to-date list and details in docs/known_conflicts.md.
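One way to guard code paths that depend on those manually installed packages is to probe for them before use. A small sketch using only the standard library (the package name in the second check is a placeholder, not a real Biomni dependency):

```python
from importlib.util import find_spec

def has_package(name: str) -> bool:
    """True if a top-level package can be imported in this environment."""
    return find_spec(name) is not None

# json ships with Python, so this branch always runs:
if has_package("json"):
    print("json is available")

# Placeholder name standing in for a conflicting optional dependency:
if not has_package("some_optional_biomni_extra"):
    print("optional feature disabled; install the package to enable it")
```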
Once inside the environment, you can start using Biomni:
```python
from biomni.agent import A1

# Initialize the agent with a data path; the data lake will be
# automatically downloaded on first run (~11GB)
agent = A1(path='./data', llm='claude-sonnet-4-20250514')

# Execute biomedical tasks using natural language
agent.go("Plan a CRISPR screen to identify genes that regulate T cell exhaustion, generate 32 genes that maximize the perturbation effect.")
agent.go("Perform scRNA-seq annotation at [PATH] and generate meaningful hypothesis")
agent.go("Predict ADMET properties for this compound: CC(C)CC1=CC=C(C=C1)C(C)C(=O)O")
```

If you plan on using Azure for your model, always prefix the model name with `azure-` (e.g. `llm='azure-gpt-4o'`).
Biomni includes a centralized configuration system that provides flexible ways to manage settings. You can configure Biomni through environment variables, runtime modifications, or direct parameters.
```python
from biomni.config import default_config
from biomni.agent import A1

# RECOMMENDED: Modify global defaults for consistency
default_config.llm = "gpt-4"
default_config.timeout_seconds = 1200

# All agents AND database queries use these defaults
agent = A1()  # Everything uses gpt-4, 1200s timeout
```

Note: Direct parameters to `A1()` only affect that agent's reasoning, not database queries. For consistent configuration across all operations, use `default_config` or environment variables.
For detailed configuration options, see the Configuration Guide.
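To illustrate the precedence idea, here is a sketch of how a setting like the timeout could be resolved from an environment variable with a fallback default. It mirrors the `BIOMNI_TIMEOUT_SECONDS` variable from the `.env` example, but it is a simplified illustration, not Biomni's actual resolution code (which lives in `biomni.config`):

```python
import os

def resolve_timeout(env, default=600):
    """Environment variable wins over the built-in default."""
    raw = env.get("BIOMNI_TIMEOUT_SECONDS")
    return int(raw) if raw else default

print(resolve_timeout({}))                                  # falls back to 600
print(resolve_timeout({"BIOMNI_TIMEOUT_SECONDS": "1200"}))  # env override wins
print(resolve_timeout(os.environ))                          # whatever your shell has set
```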
Generate PDF reports of execution traces:
```python
from biomni.agent import A1

# Initialize agent
agent = A1(path='./data', llm='claude-sonnet-4-20250514')

# Run your task
agent.go("Your biomedical task here")

# Save conversation history as PDF
agent.save_conversation_history("my_analysis_results.pdf")
```

PDF Generation Dependencies:
For optimal PDF generation, install one of these packages:

```bash
# Option 1: WeasyPrint (recommended for best layout control)
# Conda environment (recommended)
conda install weasyprint
# System installation
brew install weasyprint  # macOS
apt install weasyprint   # Linux
# See the WeasyPrint Installation Guide for detailed instructions:
# https://doc.courtbouillon.org/weasyprint/stable/first_steps.html

# Option 2: markdown2pdf (Rust-based, fast and reliable)
# macOS:
brew install theiskaa/tap/markdown2pdf
# Windows/Linux (using Cargo):
cargo install markdown2pdf
# Or download prebuilt binaries from:
# https://github.com/theiskaa/markdown2pdf/releases/latest

# Option 3: Pandoc (pip installation)
pip install pandoc
```

Biomni supports MCP servers for external tool integration:
```python
from biomni.agent import A1

agent = A1()
agent.add_mcp(config_path="./mcp_config.yaml")
agent.go("Find FDA active ingredient information for ibuprofen")
```

Built-in MCP Servers:
For usage and implementation details, see the MCP Integration Documentation and examples in tutorials/examples/add_mcp_server/ and tutorials/examples/expose_biomni_server/.
Biomni-R0 is our first reasoning model for biology, built on Qwen-32B with reinforcement learning from agent interaction data. It's designed to excel at tool use, multi-step reasoning, and complex biological problem-solving through iterative self-correction.
- 🤗 Model: biomni/Biomni-R0-32B-Preview
- 📝 Technical Report: biomni.stanford.edu/blog/biomni-r0-technical-report
To use Biomni-R0 for agent reasoning while keeping database queries on your usual provider (recommended), run a local SGLang server and pass the model to A1() directly.
- Launch SGLang with Biomni-R0:

```bash
python -m sglang.launch_server --model-path RyanLi0802/Biomni-R0-Preview --port 30000 --host 0.0.0.0 --mem-fraction-static 0.8 --tp 2 --trust-remote-code --json-model-override-args '{"rope_scaling":{"rope_type":"yarn","factor":1.0,"original_max_position_embeddings":32768}, "max_position_embeddings": 131072}'
```

- Point the agent to your SGLang endpoint for reasoning:
```python
from biomni.config import default_config
from biomni.agent import A1

# Database queries (indexes, retrieval, etc.) use default_config
default_config.llm = "claude-3-5-sonnet-20241022"
default_config.source = "Anthropic"

# Agent reasoning uses Biomni-R0 served via SGLang (OpenAI-compatible API)
agent = A1(
    llm="biomni/Biomni-R0-32B-Preview",
    source="Custom",
    base_url="http://localhost:30000/v1",
    api_key="EMPTY",
)
agent.go("Plan a CRISPR screen to identify genes regulating T cell exhaustion")
```

Biomni-Eval1 is a comprehensive evaluation benchmark for assessing biological reasoning capabilities across diverse tasks. It contains 433 instances spanning 10 biological reasoning tasks, from gene identification to disease diagnosis.
Tasks Included:
- GWAS causal gene identification (3 variants)
- Lab bench Q&A (2 variants)
- Patient gene detection
- Screen gene retrieval
- GWAS variant prioritization
- Rare disease diagnosis
- CRISPR delivery method selection
Resources:
- 🤗 Dataset: biomni/Eval1
- 💻 Quick Start:

```python
from biomni.eval import BiomniEval1

evaluator = BiomniEval1()
score = evaluator.evaluate('gwas_causal_gene_opentargets', 0, 'BRCA1')
```

Biomni is an open-science initiative that thrives on community contributions. We welcome:
- 🔧 New Tools: Specialized analysis functions and algorithms
- 📊 Datasets: Curated biomedical data and knowledge bases
- 💻 Software: Integration of existing biomedical software packages
- 📋 Benchmarks: Evaluation datasets and performance metrics
- 📚 Misc: Tutorials, examples, and use cases
- 🔧 Update existing tools: many current tools are not optimized; fixes and replacements are welcome!
Check out this Contributing Guide on how to contribute to the Biomni ecosystem.
If you have particular tool/database/software in mind that you want to add, you can also submit to this form and the biomni team will implement them.
Biomni-E1 only scratches the surface of what’s possible in the biomedical action space.
Now, we’re building Biomni-E2 — a next-generation environment developed with and for the community.
We believe that by collaboratively defining and curating a shared library of standard biomedical actions, we can accelerate science for everyone.
Join us in shaping the future of biomedical AI agents.
- Contributors with significant impact (e.g., 10+ significant & integrated tool contributions or equivalent) will be invited as co-authors on our upcoming paper in a top-tier journal or conference.
- All contributors will be acknowledged in our publications.
- More contributor perks...
Let’s build it together.
Biomni 101 - Basic concepts and first steps
More to come!
Experience Biomni through our no-code web interface at biomni.stanford.edu.
- Security warning: Biomni currently executes LLM-generated code with full system privileges. If you want to use it in production, run it in isolated/sandboxed environments. The agent can access files, the network, and system commands, so be careful with sensitive data or credentials.
- This release was frozen as of April 15 2025, so it differs from the current web platform.
- Biomni itself is Apache 2.0-licensed, but certain integrated tools, databases, or software may carry more restrictive commercial licenses. Review each component carefully before any commercial use.
```bibtex
@article{huang2025biomni,
  title={Biomni: A General-Purpose Biomedical AI Agent},
  author={Huang, Kexin and Zhang, Serena and Wang, Hanchen and Qu, Yuanhao and Lu, Yingzhou and Roohani, Yusuf and Li, Ryan and Qiu, Lin and Zhang, Junze and Di, Yin and others},
  journal={bioRxiv},
  pages={2025--05},
  year={2025},
  publisher={Cold Spring Harbor Laboratory}
}
```