Official source code of the KDD 2025 paper: AlphaAgent: LLM-Driven Alpha Mining with Regularized Exploration to Counteract Alpha Decay
AlphaAgent is an autonomous framework that effectively integrates LLM agents for mining interpretable and decay-resistant alpha factors through three specialized agents.
- Idea Agent: Proposes market hypotheses to guide factor creation based on financial theories or emerging trends.
- Factor Agent: Constructs factors based on hypotheses while incorporating regularization mechanisms to avoid duplication and overfitting.
- Eval Agent: Validates practicality, performs backtesting, and iteratively refines factors via feedback loops.
This repository follows the implementation of RD-Agent. You can find its repository at: https://github.com/microsoft/RD-Agent. We would like to extend our sincere gratitude to the RD-Agent team for their pioneering work and contributions to the community.
- Create a new conda environment with Python (3.10 and 3.11 are well-tested in our CI):
conda create -n alphaagent python=3.10
- Activate the environment:
conda activate alphaagent
- Install AlphaAgent:
pip install -e .
- First, clone the Qlib source code to run backtests locally:
# Clone Qlib source code
git clone https://github.com/microsoft/qlib.git
cd qlib
pip install .
cd ..
- Then, manually download Chinese stock data via baostock and dump it into the Qlib format:
# Download or update stock data from 2015-01-01 until now from baostock
python prepare_cn_data.py
cd qlib
# Convert CSV to Qlib format. Check that the paths are correct before running.
python scripts/dump_bin.py dump_all ... \
  --include_fields open,high,low,close,preclose,volume,amount,turn,factor \
  --csv_path ~/.qlib/qlib_data/cn_data/raw_data_now \
  --qlib_dir ~/.qlib/qlib_data/cn_data \
  --date_field_name date \
  --symbol_field_name code
# Collect calendar data
python scripts/data_collector/future_calendar_collector.py --qlib_dir ~/.qlib/qlib_data/cn_data/ --region cn
# Download the CSI500/CSI300/CSI100 stock universe
python scripts/data_collector/cn_index/collector.py --index_name CSI500 --qlib_dir ~/.qlib/qlib_data/cn_data/ --method parse_instruments
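Before running the conversion step, it can help to confirm that the raw CSV files carry every column the dump command refers to. The following is a minimal sketch, not part of this repository: the helper name `missing_columns` and the sample row are hypothetical, and the required set is simply the `--include_fields` list plus the `date` and `code` columns named in the command above.

```python
import csv
import io

# Fields passed to --include_fields, plus the columns named by
# --date_field_name and --symbol_field_name in the dump command.
REQUIRED_COLUMNS = {
    "date", "code", "open", "high", "low", "close", "preclose",
    "volume", "amount", "turn", "factor",
}


def missing_columns(csv_text: str) -> set[str]:
    """Return the required columns that are absent from the CSV header row."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])
    return REQUIRED_COLUMNS - {name.strip() for name in header}


# Hypothetical sample with made-up values; real files live under raw_data_now.
sample = (
    "date,code,open,high,low,close,preclose,volume,amount,turn\n"
    "2015-01-05,sh.600000,1,2,3,4,5,6,7,8\n"
)
print(missing_columns(sample))  # the sample lacks the "factor" column
```

To check a real file, read its text with `Path(...).read_text()` and pass it to `missing_columns`; an empty set means the header matches the fields used above.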
- Alternatively, stock data (outdated) will be automatically downloaded to ~/.qlib/qlib_data/cn_data.
- You can modify the backtest configuration files, which are located at:
- Baseline: alphaagent/scenarios/qlib/experiment/factor_template/conf.yaml
- For newly proposed factors: alphaagent/scenarios/qlib/experiment/factor_template/conf_cn_combined.yaml
- To change the train/val/test periods, first remove all cache files in ./git_ignore_folder and ./pickle_cache.
- To change the market, remove the cache files in ./git_ignore_folder and ./pickle_cache. Then delete daily_pv_all.h5 and daily_pv_debug.h5 in the directory alphaagent/scenarios/qlib/experiment/factor_data_template/.
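The cache cleanup required when changing periods or the market can also be scripted. This is a sketch under stated assumptions rather than a utility shipped with AlphaAgent: the function name `clear_caches` and the `market_changed` flag are hypothetical, while the folder and file names are the ones listed in this README. It empties the two cache folders without deleting the folders themselves.

```python
import shutil
from pathlib import Path

CACHE_DIRS = ("pickle_cache", "git_ignore_folder")
MARKET_FILES = ("daily_pv_all.h5", "daily_pv_debug.h5")
DATA_TEMPLATE = Path("alphaagent/scenarios/qlib/experiment/factor_data_template")


def clear_caches(repo_root: Path, market_changed: bool = False) -> list[Path]:
    """Empty the cache folders and return the paths that were removed."""
    removed: list[Path] = []
    for name in CACHE_DIRS:
        cache_dir = repo_root / name
        if not cache_dir.is_dir():
            continue  # nothing cached yet
        for entry in cache_dir.iterdir():
            if entry.is_dir() and not entry.is_symlink():
                shutil.rmtree(entry)
            else:
                entry.unlink()
            removed.append(entry)
    if market_changed:
        # Only needed when switching markets, per the notes in this README.
        for name in MARKET_FILES:
            target = repo_root / DATA_TEMPLATE / name
            if target.exists():
                target.unlink()
                removed.append(target)
    return removed


# Example call from the repository root (commented out because it deletes files):
# clear_caches(Path("."), market_changed=True)
```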
- For an OpenAI-compatible API, ensure both OPENAI_BASE_URL and OPENAI_API_KEY are configured in the .env file.
- REASONING_MODEL is used by the idea agent and factor agent, while CHAT_MODEL is used for debugging factors and generating feedback.
- Slow-thinking models, such as o3-mini, are preferred for the REASONING_MODEL.
- To run the project in a local environment (instead of Docker), add USE_LOCAL=True to the .env file.
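A quick way to catch a missing setting before a run is to scan the .env text for the variables mentioned above. The sketch below is illustrative only: it uses a simplified KEY=VALUE parser rather than the full dotenv syntax, the helper names `parse_env` and `missing_keys` are hypothetical, the values are placeholders, and treating all four variables as required is an assumption made for this example.

```python
REQUIRED_KEYS = ("OPENAI_BASE_URL", "OPENAI_API_KEY", "REASONING_MODEL", "CHAT_MODEL")


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(text: str) -> list[str]:
    """List the required keys that are absent or empty."""
    env = parse_env(text)
    return [key for key in REQUIRED_KEYS if not env.get(key)]


# Placeholder values only; substitute your own endpoint, key, and model names.
example = """
OPENAI_BASE_URL=https://example.invalid/v1
OPENAI_API_KEY=sk-placeholder
REASONING_MODEL=o3-mini
USE_LOCAL=True
"""
print(missing_keys(example))  # CHAT_MODEL is not set in this example
```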
- Run AlphaAgent based on the Qlib backtesting framework:
alphaagent mine --potential_direction "<YOUR_MARKET_HYPOTHESIS>"
- Alternatively, run the following command:
dotenv run -- python alphaagent/app/qlib_rd_loop/factor_alphaagent.py --direction "<YOUR_MARKET_HYPOTHESIS>"
After running the command, log out and log back in for the changes to take effect.
- Multi-factor backtesting:
alphaagent backtest --factor_path "<PATH_TO_YOUR_CSV_FILE>"
Your factors need to be stored in a .csv file. Here is an example:
factor_name,factor_expression
MACD_Factor,"MACD($close)"
RSI_Factor,"RSI($close)"
- If you need to rerun the baseline results or update the backtest configs, remove the cache folders:
rm -r ./pickle_cache/*
rm -r ./git_ignore_folder/*
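For the factor CSV used in multi-factor backtesting, a small pre-flight check can catch a malformed file early. This is a sketch under assumptions, not AlphaAgent's own loader: the function name `load_factors` is hypothetical, and the checks (exact header, non-empty fields, unique names) are sanity checks chosen for this example rather than documented requirements. The sample text is the example from this README.

```python
import csv
import io

EXPECTED_HEADER = ["factor_name", "factor_expression"]


def load_factors(csv_text: str) -> dict[str, str]:
    """Read factor rows into a name -> expression mapping with basic checks."""
    reader = csv.DictReader(io.StringIO(csv_text))
    if reader.fieldnames != EXPECTED_HEADER:
        raise ValueError(f"unexpected header: {reader.fieldnames}")
    factors: dict[str, str] = {}
    for row in reader:
        name = (row["factor_name"] or "").strip()
        expression = (row["factor_expression"] or "").strip()
        if not name or not expression:
            raise ValueError(f"incomplete row: {row}")
        if name in factors:
            raise ValueError(f"duplicate factor name: {name}")
        factors[name] = expression
    return factors


example = (
    "factor_name,factor_expression\n"
    'MACD_Factor,"MACD($close)"\n'
    'RSI_Factor,"RSI($close)"\n'
)
print(load_factors(example))
```

Quoting the expression column, as the example does, keeps expressions that contain commas in a single field.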
- You can run the following command to launch our demo program and view the run logs. Note that this entry point is deprecated.
alphaagent ui --port 19899 --log_dir log/
If you find this work helpful, please cite our paper:
@misc{tang2025alphaagentllmdrivenalphamining,
title={AlphaAgent: LLM-Driven Alpha Mining with Regularized Exploration to Counteract Alpha Decay},
author={Ziyi Tang and Zechuan Chen and Jiarui Yang and Jiayao Mai and Yongsen Zheng and Keze Wang and Jinrui Chen and Liang Lin},
year={2025},
eprint={2502.16789},
archivePrefix={arXiv},
primaryClass={cs.CE},
url={https://arxiv.org/abs/2502.16789},
}