We provide the implementation for reproducing the results of our work. The pipeline consists of two main stages:
1. Data Preparation (via DataFlow)
We use the DataFlow framework for dataset preparation. Please follow the steps below:
# Step 1: Install the OpenDataFlow environment
pip install open-dataflow
# Step 2: Prepare the data using the provided scripts
python mathfusion.py
Make sure the processed dataset follows the format expected by our training pipeline.
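The expected format is not specified here, so the following is only a minimal sketch of how such a check could look. It assumes the processed dataset is a JSON Lines file (one JSON object per line), and the field names `instruction` and `output` are hypothetical placeholders, as is the helper `find_format_errors`; replace them with whatever the training pipeline actually requires once it is released.

```python
import json
from pathlib import Path

# Hypothetical field names (assumption): replace with the keys the
# training pipeline actually expects.
REQUIRED_KEYS = ("instruction", "output")


def find_format_errors(path, required_keys=REQUIRED_KEYS):
    """Return a list of (line_number, message) for records that fail the check."""
    errors = []
    with Path(path).open(encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            if not line.strip():
                continue  # ignore blank lines
            try:
                record = json.loads(line)
            except json.JSONDecodeError as exc:
                errors.append((line_number, f"invalid JSON: {exc.msg}"))
                continue
            if not isinstance(record, dict):
                errors.append((line_number, "record is not a JSON object"))
                continue
            missing = [key for key in required_keys if key not in record]
            if missing:
                errors.append((line_number, f"missing keys: {', '.join(missing)}"))
    return errors
```

Calling `find_format_errors` on the processed file returns an empty list when every record passes; otherwise each entry names the offending line, which makes it easy to spot a malformed record before training starts.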
We will release the training scripts and configuration files shortly.