This repository implements an improved Monte Carlo Tree Search (MCTS) framework tailored for symbolic regression. The primary enhancements to the standard MCTS procedure are:
- Extreme Bandit Allocation Strategy – an adaptive sampling scheme (UCB-extreme) to allocate more simulations to regions with high potential for optimal expressions.
- Evolution‑Inspired State‑Jumping Actions – incorporating genetic‑programming‑style mutations and crossovers within the MCTS to diversify exploration and accelerate convergence.
The main implementation of the algorithm resides in the iMCTS directory.
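The extreme-bandit idea above can be illustrated with a small sketch. It is not the repository's UCB-extreme rule, whose exact formula is not given here. It assumes a simplified variant in which each arm is scored by the best reward it has produced so far (rather than the mean) plus a standard UCB exploration bonus. The names `ucb_extreme_score` and `run_bandit` and the constant `c` are hypothetical.

```python
import math
import random

def ucb_extreme_score(best_reward, visits, total_pulls, c=1.0):
    """Hypothetical score: best reward seen so far plus an exploration bonus."""
    if visits == 0:
        return math.inf  # make sure every arm is tried at least once
    return best_reward + c * math.sqrt(math.log(total_pulls) / visits)

def run_bandit(arms, budget, c=1.0, seed=0):
    """Spend `budget` pulls on `arms`, tracking the maximum reward per arm.

    Each arm is a callable that takes a random.Random instance and
    returns a reward in [0, 1].
    """
    rng = random.Random(seed)
    best = [0.0] * len(arms)   # best (extreme) reward observed per arm
    visits = [0] * len(arms)   # number of pulls per arm
    for t in range(1, budget + 1):
        scores = [ucb_extreme_score(best[i], visits[i], t, c)
                  for i in range(len(arms))]
        i = max(range(len(arms)), key=scores.__getitem__)
        reward = arms[i](rng)
        visits[i] += 1
        best[i] = max(best[i], reward)
    return best, visits

# Two toy arms: one with rewards capped at 0.5, one that is usually low
# but occasionally produces rewards close to 1.
arms = [
    lambda rng: 0.5 * rng.random(),
    lambda rng: rng.random() ** 4,
]
best, visits = run_bandit(arms, budget=200)
print(best, visits)
```

Scoring by the maximum rather than the mean reflects the goal of symbolic regression, where a single excellent expression matters more than the average quality of a region.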
Key features:
- Custom Reward Functions
  Users may supply arbitrary reward functions, provided that each returns a scalar in the interval [0, 1] and accepts three arguments:
  - `x` (input variable)
  - `y` (target variable)
  - `f` (candidate symbolic expression)
- Support for Complex Constants
  Expressions may include complex‑valued constants. In the symbol set, `R` denotes the real subspace and `C` the complex subspace.
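As an illustration of the reward-function contract described above, here is a minimal sketch. It assumes that `f` is passed as a callable that maps the input array `x` to predictions; the actual type of the candidate expression in this repository may differ. The name `nrmse_reward` and the choice of 1 / (1 + NRMSE) are hypothetical, and how a custom reward is registered with the regressor is not shown here.

```python
import numpy as np

def nrmse_reward(x, y, f):
    """Map the normalized RMSE of candidate `f` on (x, y) to a reward in [0, 1]."""
    y = np.asarray(y)
    with np.errstate(all="ignore"):  # silence warnings from invalid candidates
        y_pred = np.broadcast_to(np.asarray(f(x)), y.shape)
    if not np.all(np.isfinite(y_pred)):
        return 0.0  # NaN or inf predictions get the lowest reward
    # np.abs keeps the error real even if the candidate is complex-valued.
    rmse = np.sqrt(np.mean(np.abs(y_pred - y) ** 2))
    scale = np.std(y)
    nrmse = rmse / scale if scale > 0 else rmse
    return float(1.0 / (1.0 + nrmse))

# Toy data in the same layout as the demo: x has shape (variables, samples).
x = np.random.uniform(0, 2, (1, 20))
y = np.log(x[0] + 1) + np.log(x[0] ** 2 + 1)

exact = nrmse_reward(x, y, lambda x: np.log(x[0] + 1) + np.log(x[0] ** 2 + 1))
rough = nrmse_reward(x, y, lambda x: x[0])
invalid = nrmse_reward(x, y, lambda x: np.log(-x[0] - 1))
print(exact, rough, invalid)
```

A perfect fit yields the maximum reward of 1, and larger errors push the reward toward 0, so the function stays within the required interval.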
Requires Python ≥ 3.11.5.
To install all required dependencies, execute:

    pip install -r requirements.txt

| Script | Description |
|---|---|
| benchmark_runner.py | Runs the MCTS algorithm on a suite of symbolic‑regression benchmarks. Outputs are saved under results/. |
| demo.py | Demonstrates a run on the Nguyen-7 benchmark with default settings. You may modify the equation being tested. |
Refer to demo.py; you can run the algorithm as follows:

    import numpy as np
    from iMCTS.regressor import Regressor  # import path assumed from iMCTS/regressor.py; see demo.py

    var_num = 1
    X = np.random.uniform(0, 2, (var_num, 20))  # shape: (variables, samples)

    def f(x):
        return np.log(x[0] + 1) + np.log(x[0]**2 + 1)

    Y = f(X)
    model = Regressor(
        x_train=X,
        y_train=Y,
        ops=['+', '-', '*', '/', 'sin', 'cos', 'exp', 'log'],  # If constants are needed, add 'R'
        verbose=True,  # Prints detailed runtime logs
    )
    sym_exp, vec_exp, evaluations, path = model.fit()

The four outputs correspond, respectively, to the simplified expression, the unsimplified expression, the number of valid equations generated by the algorithm, and the corresponding symbol sequence. Note that the fit method in iMCTS/regressor.py limits the maximum runtime to 48 hours; you may modify this as needed.
To evaluate performance on the Nguyen benchmark suite, for example, run:

    python benchmark_runner.py --benchmark Nguyen --config ./configs/basic.yaml

By default, this command uses 10 parallel processes. You can modify this setting in ./benchmarks/run.py.
The --benchmark option may be set to any of the following:
- Nguyen
- NguyenC
- Jin
- Livermore
- Blackbox
Important: Prior to using the `Blackbox` benchmark, download the `datasets/` directory from the PMLB repository and place it in the project root.
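To run several suites in sequence, the documented command can be wrapped in a small helper. The sketch below only assembles the command shown above for each benchmark name. It assumes that `./configs/basic.yaml` is an acceptable configuration for every suite, which may not hold. The names `build_command` and `run_all` are hypothetical and not part of the repository.

```python
import subprocess
import sys

BENCHMARKS = ["Nguyen", "NguyenC", "Jin", "Livermore", "Blackbox"]

def build_command(benchmark, config="./configs/basic.yaml"):
    """Return the benchmark_runner.py command line for one suite."""
    if benchmark not in BENCHMARKS:
        raise ValueError(f"unknown benchmark: {benchmark!r}")
    return [sys.executable, "benchmark_runner.py",
            "--benchmark", benchmark, "--config", config]

def run_all(benchmarks=BENCHMARKS, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for name in benchmarks:
        cmd = build_command(name)
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)  # stop at the first failing suite

# Dry run: shows the commands without launching any benchmark.
run_all(dry_run=True)
```

Call `run_all(dry_run=False)` from the project root to launch the runs.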
Configuration files reside in the configs/ directory. You may adjust algorithm parameters (e.g. exploration constant, mutation rate, UCB-extreme parameters) by editing the corresponding YAML file.
Zhengyao Huang, Daniel Zhengyu Huang, Tiannan Xiao, Dina Ma, Zhenyu Ming, Hao Shi, Yuanhui Wen. "Improving Monte Carlo Tree Search for Symbolic Regression."