Important: OriGene is an open-source, self-evolving multi-agent system that acts as a virtual disease biologist. We also introduce the TRQA benchmark, a set of 1,921 expert-level questions for evaluating biomedical AI agents. OriGene launched at the 2025 WAIC!
| Try OriGene | Paper | Code | Hugging Face Benchmark |
1. Public online launch – OriGene is now live and available to try at https://origene.lglab.ac.cn/.
2. Open-source release – The entire OriGene codebase and benchmark are now available. Fork away!
3. Officially presented at the 2025 World Artificial Intelligence Conference (WAIC).
Therapeutic target discovery remains one of the most critical yet intuition-driven stages in drug development. We present OriGene, a self-evolving multi-agent system that functions as a virtual disease biologist to
identify and prioritize therapeutic targets at scale.
- Deploy the MCP Server

  OriGene relies on the MCP Server, which aggregates more than 600 bioinformatics tools. Follow the guidelines in OriGene MCP to deploy the MCP service and record the server endpoint (for example, `http://127.0.0.1:8788`).
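  Before moving on, it can help to confirm that the endpoint is reachable. A minimal sketch (assuming only that the server listens on the recorded host and port; `MCP_URL` is a placeholder for your endpoint):

  ```python
  import socket
  from urllib.parse import urlparse

  MCP_URL = "http://127.0.0.1:8788"  # replace with your recorded endpoint

  parsed = urlparse(MCP_URL)
  host, port = parsed.hostname, parsed.port or 80

  # A plain TCP connect verifies the service is listening without
  # assuming any particular HTTP route on the server.
  with socket.create_connection((host, port), timeout=5):
      print(f"MCP server reachable at {host}:{port}")
  ```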
- Configure OriGene

  Edit `src/local_deep_research/_settings/.secrets.toml` and fill in the MCP server URL together with your LLM API keys. Because OriGene is model-agnostic, you can freely switch between different base models or customize additional settings.
  ```toml
  [mcp]
  server_url = "Enter your mcp url"

  [embedding]
  api_key = "Enter your api key (match url: https://api.siliconflow.cn/v1/embeddings)"
  cache = "embedding_cache.pkl"

  [template]
  api_base = "https://ark.cn-beijing.volces.com/api/v3"
  api_key = "Enter your api key"

  [openai]
  api_base = "https://api.openai-proxy.org/v1"
  api_key = "Enter your api key"

  [deepseek]
  api_base = "https://api.deepseek.com"
  api_key = "Enter your api key"
  ```
- Install dependencies

  ```bash
  cd src
  uv sync
  ```

- Activate the virtual environment

  ```bash
  source ./.venv/bin/activate
  ```

- (Optional) Add the project root to `PYTHONPATH`

  ```bash
  export PYTHONPATH=$(pwd):$PYTHONPATH
  ```

Launch the interactive assistant:

```bash
uv run -m local_deep_research.main
```

You will see a prompt similar to the following:
```text
Welcome to the Advanced Research System
Type 'quit' to exit
Select output type:
1) Analysis (few minutes, answers questions, summarizes findings)
2) Detailed Report (more time, generates a comprehensive report with deep analysis)
Enter number (1 or 2):
```
After selecting an output type, enter your research query and OriGene will return the results.
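For scripted, non-interactive runs, you can drive the same CLI through a pipe. A minimal sketch (assuming, as the prompt above suggests, that the program reads the output type and then the query from stdin; adjust the working directory to wherever you ran `uv sync`):

```python
import subprocess

QUERY = "Prioritize therapeutic targets for triple-negative breast cancer"

# "1\n" selects the Analysis output type; the research query follows.
result = subprocess.run(
    ["uv", "run", "-m", "local_deep_research.main"],
    input=f"1\n{QUERY}\n",
    capture_output=True,
    text=True,
    cwd="src",  # directory containing the synced virtual environment
)
print(result.stdout)
```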
Run the benchmark to generate agent answers (you can use either command):
# From the project root (this directory), after activating the venv
uv run -m local_deep_research.evaluate_local
# Or using python
python -m local_deep_research.evaluate_localThen score the generated results (replace paths if you changed dataset/output names):
```bash
# Example: score TRQA-lit-choice core set results
python local_deep_research/score_evaluation_results.py \
    --agent_results benchmark/TRQA_lit_choice/agent_answers_test.txt \
    --original_data benchmark/TRQA_lit_choice/TRQA-lit-choice-172-coreset.csv \
    --model_name "OriAgent"

# Or using uv
uv run -m local_deep_research.score_evaluation_results \
    --agent_results benchmark/TRQA_lit_choice/agent_answers_test.txt \
    --original_data benchmark/TRQA_lit_choice/TRQA-lit-choice-172-coreset.csv \
    --model_name "OriAgent"
```
To evaluate performance, we constructed TRQA (Target Research-related Question Answering), a benchmark of 1,921 questions specific to therapeutic target identification tasks across multiple disease areas. It is a comprehensive evaluation benchmark designed to assess the capabilities of OriGene and similar systems in biomedical reasoning and target discovery.
TRQA evaluates core competencies including:
- Scientific planning
- Information retrieval
- Tool selection
- Reasoning toward biological conclusions
- Critical self-evolution
It spans domains such as fundamental biology, disease biology, pharmacology, and clinical medicine, integrating both scientific literature and real-world data from drug development pipelines and clinical trials.
TRQA includes two subsets:
- TRQA-lit: Focuses on recent research findings. Includes 172 multiple-choice questions (for rapid model/human comparison) and 1,108 short-answer questions covering key biomedical areas.
- TRQA-db: Centers on competitive landscape analysis. Includes 641 short-answer questions that evaluate the ability to retrieve, integrate, and reason over data related to drug R&D and clinical trials.
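The 172-question core set ships with the repository (it is the `--original_data` file in the scoring example above). A minimal sketch for inspecting it locally, assuming pandas is installed and the file is a standard CSV; the schema is printed rather than assumed:

```python
import pandas as pd

df = pd.read_csv("benchmark/TRQA_lit_choice/TRQA-lit-choice-172-coreset.csv")
print(df.shape)             # expect 172 rows, one per core-set question
print(df.columns.tolist())  # inspect the schema rather than assuming it
print(df.head(3))
```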
Target Research-related Question Answering (TRQA) benchmark leaderboard
| Method | TRQA-lit Choice (Core Set) | TRQA-lit Short-Answer | TRQA-db |
|---|---|---|---|
| OriGene | 0.601 | 0.826 | 0.721 |
| o3-mini | 0.578 | 0.720 | 0.487 |
| Claude-3.7-Sonnet | 0.558 | 0.695 | 0.504 |
| DeepSeek-R1 | 0.548 | 0.714 | 0.446 |
| DeepSeek-V3 | 0.541 | 0.768 | 0.466 |
| GPT-4o-search | 0.531 | 0.651 | 0.493 |
| Gemini-2.5-pro | 0.529 | 0.678 | 0.359 |
| GPT-4o | 0.512 | 0.696 | 0.392 |
| TxAgent | 0.190 | 0.472 | 0.426 |
| Human Group 3 (PhD + 3-5 year exp.) | 0.523 | ✗ | ✗ |
| Human Group 2 (PhD + 1-3 year exp.) | 0.378 | ✗ | ✗ |
| Human Group 1 (senior PhD candidates) | 0.215 | ✗ | ✗ |
OriGene integrates over 600 tools to support target discovery and biomedical reasoning.

- On the left, tools are grouped by multi-omics domains (e.g., genomics, transcriptomics, proteomics, phenomics, clinical evidence), highlighting OriGene’s ability to process biological data across scales.
- On the right, the same tools are reorganized by biomedical knowledge domains (fundamental biology, disease biology, pharmacology, and competitive landscape), reflecting how OriGene supports expert-level reasoning across diverse therapeutic tasks.
Any publication that discloses findings arising from the use of this source code, the model parameters, or outputs produced by them should cite:
```bibtex
@article{origene,
  title={{OriGene}: A Self-Evolving Virtual Disease Biologist Automating Therapeutic Target Discovery},
  author={Zhang, Zhongyue and Qiu, Zijie and Wu, Yingcheng and Li, Shuya and Wang, Dingyan and Zhou, Zhuomin and An, Duo and Chen, Yuhan and Li, Yu and Wang, Yongbo and Ou, Chubin and Wang, Zichen and Chen, Jack Xiaoyu and Zhang, Bo and Hu, Yusong and Zhang, Wenxin and Wei, Zhijian and Ma, Runze and Liu, Qingwu and Dong, Bo and He, Yuexi and Feng, Qiantai and Bai, Lei and Gao, Qiang and Sun, Siqi and Zheng, Shuangjia},
  journal={bioRxiv},
  year={2025},
  publisher={Cold Spring Harbor Laboratory}
}
```
This code repository is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0) (the "License"); you may not use these files except in compliance with the License. You may obtain a copy of the License at https://github.com/GENTEL-lab/OriGene/blob/main/LICENSE.
If you have any questions, please raise an issue or contact us at [email protected] or [email protected].
Thanks to DeepSeek, ChatGPT, Claude, and Gemini for providing powerful language models that made this project possible.
Special thanks to the human experts who assisted us in benchmarking and evaluating the agent's performance!