A lightweight Python package for hydrological model calibration and evaluation, featuring the XinAnJiang (XAJ) model.
- Free software: GNU General Public License v3
- Documentation: https://OuyangWenyu.github.io/hydromodel
hydromodel is a Python implementation of conceptual hydrological models, with a focus on the XinAnJiang (XAJ) model, one of the most widely used rainfall-runoff models, especially in China and other parts of Asia.
Key Features:
- XAJ Model Variants: Standard XAJ and optimized versions (xaj_mz with MizuRoute)
- Multiple Calibration Algorithms:
  - SCE-UA: Shuffled Complex Evolution with spotpy
  - GA: Genetic Algorithm with DEAP
  - scipy: L-BFGS-B, SLSQP, and other gradient-based methods
- Multi-Basin Support: Efficient calibration and evaluation for multiple basins simultaneously
- Unified Results Format: All algorithms save results in standardized JSON + CSV format
- Comprehensive Evaluation Metrics: NSE, KGE, RMSE, PBIAS, and more
- Unified API: Consistent interfaces for calibration, evaluation, and simulation
- Flexible Data Integration: Seamless support for CAMELS datasets via hydrodataset and custom data via hydrodatasource
- Configuration-Based Workflow: YAML configuration for reproducibility
- Progress Tracking: Real-time progress display and intermediate results saving
For Researchers:
- Battle-tested XAJ implementations used in published research
- Configuration-based workflow ensures reproducibility
- Easy to extend with new models or calibration algorithms
For Practitioners:
- Simple YAML configuration, minimal coding required
- Handles multi-basin calibration efficiently
- Integration with global CAMELS series datasets (20+ variants)
- Clear documentation and examples
pip install hydromodel hydrodataset hydrodatasource
Or using uv (faster):
uv pip install hydromodel hydrodataset hydrodatasource
For developers, it is recommended to use uv to manage the environment, as this project has local dependencies (e.g., hydroutils, hydrodataset, hydrodatasource).
- Clone the repository:
git clone https://github.com/OuyangWenyu/hydromodel.git
cd hydromodel
- Sync the environment with uv. This command installs all dependencies, including the local editable packages:
uv sync --all-extras
No configuration needed! hydromodel automatically uses default paths:
Default data directory:
- Windows: C:\Users\YourUsername\hydromodel_data\
- macOS/Linux: ~/hydromodel_data/
The default structure (aqua_fetch automatically creates uppercase dataset directories):
~/hydromodel_data/
├── datasets-origin/
│ ├── CAMELS_US/ # CAMELS US dataset (created by aqua_fetch)
│ ├── CAMELS_AUS/ # CAMELS Australia dataset (if used)
│ └── ... # Other datasets
├── datasets-interim/ # Your custom basin data
└── ...
Create ~/hydro_setting.yml to specify custom paths:
local_data_path:
  root: 'D:/data'
  datasets-origin: 'D:/data' # For CAMELS datasets (aqua_fetch adds CAMELS_US automatically)
  datasets-interim: 'D:/data/my_basins' # For custom data
Important: For CAMELS datasets, provide only the datasets-origin directory. The system automatically appends the uppercase dataset directory name (e.g., CAMELS_US, CAMELS_AUS). If your data is in D:/data/CAMELS_US/, set datasets-origin: 'D:/data'.
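The path rule described above can be illustrated with a small sketch. The helper `resolve_dataset_dir` is hypothetical and is not part of hydromodel, hydrodataset, or aqua_fetch; it only mimics the documented behaviour of appending the uppercase dataset name to the datasets-origin directory and falling back to the default data directory when no custom path is given.

```python
from pathlib import Path
from typing import Optional


def resolve_dataset_dir(dataset_name: str, datasets_origin: Optional[str] = None) -> Path:
    """Hypothetical helper mimicking the documented rule; not hydromodel's API."""
    if datasets_origin is None:
        # Documented default location when ~/hydro_setting.yml is absent
        base = Path.home() / "hydromodel_data" / "datasets-origin"
    else:
        base = Path(datasets_origin)
    # The dataset directory name is appended in uppercase (e.g. CAMELS_US)
    return base / dataset_name.upper()


custom = resolve_dataset_dir("camels_us", "D:/data")
default = resolve_dataset_dir("camels_aus")
print(custom.as_posix())
print(default.as_posix())
```

With `datasets-origin: 'D:/data'`, the sketch points at `D:/data/CAMELS_US`, which is why the setting must not already include the dataset folder.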
Using CAMELS Datasets (hydrodataset):
Getting public datasets using hydrodataset
pip install hydrodataset
Run the following code to download data to your directory:
from hydrodataset.camels_us import CamelsUs
# Auto-downloads if not found. Provide datasets-origin directory (e.g., "D:/data")
# aqua_fetch automatically appends dataset name, creating "D:/data/CAMELS_US/"
data_path = "D:/data"  # your datasets-origin directory
ds = CamelsUs(data_path)
basin_ids = ds.read_object_ids()  # Get basin IDs
Note: First-time download may take some time. The complete CAMELS dataset is approximately 70 GB (including zipped and unzipped files).
Available datasets: please see README.md in hydrodataset
Using Custom Data (hydrodatasource):
To read your own data with hydrodatasource, prepare it in the selfmadehydrodataset format:
pip install hydrodatasource
Data structure:
/path/to/your_data_root/
└── my_custom_dataset/ # your dataset name
├── attributes/
│ └── attributes.csv
├── shapes/
│ └── basins.shp
└── timeseries/
├── 1D/ # One sub folder per time resolution (e.g. 1D/3h/1h)
│ ├── basin_01.csv
│ ├── basin_02.csv
│ └── ...
└── 1D_units_info.json # JSON file containing unit information
Required files and formats:
- attributes/attributes.csv: Basin metadata with required columns
  - basin_id: Unique basin identifier (e.g., "basin_001")
  - area: Basin area in km² (mapped to basin_area internally)
  - Additional columns: Any basin attributes (e.g., elevation, slope)
- shapes/basins.shp: Basin boundary shapefiles (all 4 files required: .shp, .shx, .dbf, .prj)
  - Must contain BASIN_ID column (uppercase) matching basin IDs in attributes.csv
  - Geometries: Polygon features defining basin boundaries
  - Coordinate system: Any valid CRS (e.g., EPSG:4326 for WGS84)
- timeseries/{time_scale}/{basin_id}.csv: Time series data for each basin
  - time: Datetime column (e.g., "2010-01-01")
  - Variable columns: prcp, PET, streamflow (or your chosen variable names)
  - Format: CSV with header row
- timeseries/{time_scale}_units_info.json: Variable units metadata
  - JSON format: {"variable_name": "unit"} (e.g., {"prcp": "mm/day"})
  - Must match variable names in time series files
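As a rough illustration of the layout above, the following sketch writes the tabular parts of a one-basin dataset using only the standard library. The function name `write_minimal_dataset`, the basin ID, the numeric values, and the units for PET and streamflow are all made up for illustration. The shapefile is not generated here, since `shapes/basins.shp` has to be produced with a GIS tool or library.

```python
import csv
import json
import tempfile
from pathlib import Path


def write_minimal_dataset(root: Path, dataset_name: str = "my_custom_dataset") -> Path:
    """Create the CSV/JSON part of the layout for one toy basin (illustrative only)."""
    base = root / dataset_name
    (base / "attributes").mkdir(parents=True)
    (base / "shapes").mkdir()  # basins.shp (+ .shx, .dbf, .prj) must be added separately
    (base / "timeseries" / "1D").mkdir(parents=True)

    # attributes/attributes.csv: basin_id and area (km^2) are the required columns
    with open(base / "attributes" / "attributes.csv", "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["basin_id", "area"])
        writer.writerow(["basin_01", 1250.0])

    # timeseries/1D/basin_01.csv: a time column plus one column per variable
    with open(base / "timeseries" / "1D" / "basin_01.csv", "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["time", "prcp", "PET", "streamflow"])
        writer.writerow(["2010-01-01", 3.2, 1.1, 15.4])  # made-up values
        writer.writerow(["2010-01-02", 0.0, 1.3, 14.8])

    # timeseries/1D_units_info.json: keys must match the variable column names
    units = {"prcp": "mm/day", "PET": "mm/day", "streamflow": "m^3/s"}  # assumed units
    (base / "timeseries" / "1D_units_info.json").write_text(json.dumps(units, indent=2))
    return base


with tempfile.TemporaryDirectory() as tmp:
    dataset_dir = write_minimal_dataset(Path(tmp))
    created = sorted(
        p.relative_to(dataset_dir).as_posix() for p in dataset_dir.rglob("*") if p.is_file()
    )

print(created)
```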
For detailed format specifications and examples, see:
- Data Guide - Complete guide for both CAMELS and custom data
- hydrodatasource documentation - Source package
- configs/example_config_selfmade.yaml: Complete configuration example for custom datasets
Option 1: Use Command-Line Scripts (Recommended for Beginners)
We provide ready-to-use scripts for model calibration, evaluation, simulation, and visualization:
# 1. Calibration (saves config files by default)
python scripts/run_xaj_calibration.py --config configs/example_config.yaml
# 2. Evaluation on test period
python scripts/run_xaj_evaluate.py --calibration-dir results/xaj_mz_SCE_UA
# 3. Simulation with custom parameters (no calibration required!)
python scripts/run_xaj_simulate.py --config configs/example_simulate_config.yaml --param-file configs/example_xaj_params.yaml --plot
# 4. Visualization (time series plots with precipitation and streamflow)
python scripts/visualize.py --eval-dir results/xaj_mz_SCE_UA/evaluation_test
# Visualize specific basins
python scripts/visualize.py --eval-dir results/xaj_mz_SCE_UA/evaluation_test --basins 01013500
Configuration Files:
Edit the appropriate configuration file for your data type:
- configs/example_config.yaml: For continuous time series data (e.g., CAMELS datasets)
- configs/example_config_selfmade.yaml: For custom data and flood event datasets
All configuration options work with the same unified API. For detailed flood event data usage, see Usage Guide - Flood Event Data.
Option 2: Use Python API (For Advanced Users)
from hydromodel.trainers.unified_calibrate import calibrate
from hydromodel.trainers.unified_evaluate import evaluate
config = {
"data_cfgs": {
"data_source_type": "camels_us",
"basin_ids": ["01013500"],
"train_period": ["1985-10-01", "1995-09-30"],
"test_period": ["2005-10-01", "2014-09-30"],
"warmup_length": 365,
"variables": ["precipitation", "potential_evapotranspiration", "streamflow"]
},
"model_cfgs": {
"model_name": "xaj_mz",
},
"training_cfgs": {
"algorithm_name": "SCE_UA",
"algorithm_params": {"rep": 5000, "ngs": 1000},
"loss_config": {"type": "time_series", "obj_func": "RMSE"},
"output_dir": "results",
"experiment_name": "my_experiment",
},
"evaluation_cfgs": {
"metrics": ["NSE", "KGE", "RMSE"],
},
}
results = calibrate(config) # Calibrate
evaluate(config, param_dir="results/my_experiment", eval_period="test")  # Evaluate
Results are saved in the results/ directory.
The unified API uses a configuration dictionary with four main sections:
config = {
"data_cfgs": {
"data_source_type": "camels_us", # Dataset type
"basin_ids": ["01013500"], # Basin IDs to calibrate
"train_period": ["1990-10-01", "2000-09-30"],
"test_period": ["2000-10-01", "2010-09-30"],
"warmup_length": 365, # Warmup days
"variables": ["precipitation", "potential_evapotranspiration", "streamflow"],
},
"model_cfgs": {
"model_name": "xaj_mz", # Model variant
"model_params": {
"source_type": "sources",
"source_book": "HF",
"kernel_size": 15, # Muskingum routing kernel
},
},
"training_cfgs": {
"algorithm_name": "GA", # Algorithm: SCE_UA, GA, or scipy
# Algorithm-specific parameters (choose one based on algorithm_name)
# For SCE-UA (Shuffled Complex Evolution):
"SCE_UA": {
"rep": 1000, # Iterations (5000+ recommended)
"ngs": 1000, # Number of complexes
"kstop": 500, # Stop if no improvement
"peps": 0.1, # Parameter convergence
"pcento": 0.1, # Percentage change allowed
"random_seed": 1234,
},
# For GA (Genetic Algorithm):
"GA": {
"pop_size": 80, # Population size
"n_generations": 50, # Generations (100+ recommended)
"cx_prob": 0.7, # Crossover probability
"mut_prob": 0.2, # Mutation probability
"random_seed": 1234,
},
# For scipy (gradient-based optimization):
"scipy": {
"method": "SLSQP", # L-BFGS-B, SLSQP, TNC, etc.
"max_iterations": 500, # Maximum iterations
},
"loss_config": {
"type": "time_series",
"obj_func": "RMSE", # RMSE, NSE, or KGE
},
"output_dir": "results",
"experiment_name": "my_exp",
"save_config": True, # Save config files to output directory (default: True)
},
"evaluation_cfgs": {
"metrics": ["NSE", "KGE", "RMSE", "PBIAS"],
},
}
Configuration for custom datasets:
See configs/example_config_selfmade.yaml for a complete example. Custom datasets require additional parameters:
"data_cfgs": {
    "dataset": "selfmadehydrodataset",  # or "floodevent" for flood event data
    "dataset_name": "my_basin_data",    # Your dataset folder name (REQUIRED)
    "time_unit": ["1D"],                # Time resolution (e.g., ["1h"], ["3h"], ["1D"])
    "datasource_kwargs": {              # Optional additional parameters
        "offset_to_utc": False,         # Whether to convert local time to UTC
    },
    "is_event_data": True,              # Whether the data are flood event data
    # ... other standard parameters (basin_ids, variables, periods, etc.)
}
Key differences from CAMELS datasets:
- dataset_name: Specifies your custom dataset folder name (required)
- time_unit: Must match the subdirectory names in the timeseries/ folder
- datasource_kwargs: Optional parameters for data preprocessing
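Because time_unit has to match the subdirectory names under timeseries/, a quick check before calibration can catch mismatches early. The sketch below is one possible way to do this with the standard library; `missing_time_units` is a hypothetical helper, not a hydromodel or hydrodatasource function.

```python
import tempfile
from pathlib import Path


def missing_time_units(dataset_dir: Path, time_units: list) -> list:
    """Return the time_unit entries that have no matching timeseries/ subfolder."""
    ts_dir = dataset_dir / "timeseries"
    available = {p.name for p in ts_dir.iterdir() if p.is_dir()}
    return [unit for unit in time_units if unit not in available]


with tempfile.TemporaryDirectory() as tmp:
    # Toy dataset folder with only a daily (1D) subfolder
    dataset_dir = Path(tmp) / "my_basin_data"
    (dataset_dir / "timeseries" / "1D").mkdir(parents=True)
    ok = missing_time_units(dataset_dir, ["1D"])
    bad = missing_time_units(dataset_dir, ["1D", "3h"])

print(ok, bad)  # [] ['3h']
```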
from hydromodel.trainers.unified_calibrate import calibrate
results = calibrate(config)
Output: Calibration results saved to {output_dir}/{experiment_name}/
Saved files:
results/my_exp/
├── calibration_results.json # Best parameters for all basins (unified format)
├── {basin_id}_sceua.csv # SCE-UA detailed iteration history
├── {basin_id}_ga.csv # GA generation history with parameters
├── {basin_id}_scipy.csv # scipy iteration history with parameters
├── calibration_config.yaml # Configuration used (saved if save_config=True)
└── param_range.yaml # Parameter ranges for current model only (saved if save_config=True)
Notes:
- calibration_results.json: Always saved, contains best parameters
- calibration_config.yaml and param_range.yaml: Only saved if save_config=True (default)
- param_range.yaml: Contains parameter ranges for the current model only (e.g., only xaj_mz, not all models)
- In calibration_config.yaml, param_range_file is set to the actual saved path
Available algorithms:
- SCE_UA / sceua: Shuffled Complex Evolution (recommended for global optimization)
- GA / genetic_algorithm: Genetic Algorithm with DEAP (flexible, handles complex landscapes)
- scipy / scipy_minimize: scipy.optimize methods (fast for smooth objectives)
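To make the relationship between algorithm_name and the algorithm-specific parameter blocks concrete, here is a minimal sketch. It assumes the layout of the full configuration example, where each algorithm's parameters sit under a key named after the algorithm. The alias table and the function `select_algorithm_params` are illustrative assumptions and may not match how hydromodel resolves names internally.

```python
# Aliases as listed above; this mapping is an illustrative assumption
ALGORITHM_ALIASES = {
    "SCE_UA": "SCE_UA", "sceua": "SCE_UA",
    "GA": "GA", "genetic_algorithm": "GA",
    "scipy": "scipy", "scipy_minimize": "scipy",
}


def select_algorithm_params(training_cfgs: dict) -> tuple:
    """Return (canonical algorithm name, its parameter block) from training_cfgs."""
    name = training_cfgs["algorithm_name"]
    if name not in ALGORITHM_ALIASES:
        raise ValueError(f"Unknown algorithm: {name!r}")
    canonical = ALGORITHM_ALIASES[name]
    # Only the block matching the chosen algorithm is used; others are ignored
    return canonical, training_cfgs.get(canonical, {})


training_cfgs = {
    "algorithm_name": "genetic_algorithm",
    "SCE_UA": {"rep": 1000, "ngs": 1000},
    "GA": {"pop_size": 80, "n_generations": 50},
}
name, params = select_algorithm_params(training_cfgs)
print(name, params)  # GA {'pop_size': 80, 'n_generations': 50}
```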
from hydromodel.trainers.unified_evaluate import evaluate
# Evaluate on test period
test_results = evaluate(config, param_dir="results/my_exp", eval_period="test")
# Evaluate on training period
train_results = evaluate(config, param_dir="results/my_exp", eval_period="train")
# Evaluate on custom period
custom_results = evaluate(
config,
param_dir="results/my_exp",
eval_period="custom",
custom_period=["2010-10-01", "2015-09-30"]
)
Output: Evaluation results in {param_dir}/evaluation_{period}/
- basins_metrics.csv: Performance metrics
- basins_norm_params.csv: Calibrated parameters (normalized [0,1])
- basins_denorm_params.csv: Denormalized parameters (physical values)
- xaj_mz_evaluation_results.nc: Full simulation results (NetCDF)
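The normalized and denormalized parameter files are related by a rescaling between each parameter's lower and upper bound. The sketch below assumes a simple linear mapping and uses made-up ranges. The actual ranges are those in param_range.yaml, and hydromodel's own conversion may differ in detail.

```python
def denormalize(norm_params: dict, param_range: dict) -> dict:
    """Map values in [0, 1] to physical values, assuming linear scaling."""
    physical = {}
    for name, value in norm_params.items():
        lower, upper = param_range[name]
        physical[name] = lower + value * (upper - lower)
    return physical


# Illustrative ranges only; not the ranges shipped with hydromodel
param_range = {"K": (0.1, 1.0), "B": (0.1, 0.4)}
norm_params = {"K": 0.5, "B": 1.0}

physical = denormalize(norm_params, param_range)
print(physical)
```

A normalized value of 0 maps to the lower bound and 1 to the upper bound, so the two CSV files carry the same information in different units.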
Parameter Loading Priority:
1. calibration_results.json (⭐ Recommended, works for all algorithms)
2. {basin_id}_ga.csv (GA algorithm CSV)
3. {basin_id}_scipy.csv (scipy algorithm CSV)
4. {basin_id}_sceua.csv (SCE-UA algorithm CSV)
5. {basin_id}_calibrate_params.txt (Legacy format)
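The loading priority above amounts to checking candidate files in a fixed order and using the first one that exists. A minimal sketch of that lookup follows; `find_param_file` is a hypothetical name and this is not the code hydromodel uses internally.

```python
import tempfile
from pathlib import Path
from typing import Optional


def find_param_file(param_dir: Path, basin_id: str) -> Optional[Path]:
    """Return the first existing parameter file, following the documented order."""
    candidates = [
        "calibration_results.json",
        f"{basin_id}_ga.csv",
        f"{basin_id}_scipy.csv",
        f"{basin_id}_sceua.csv",
        f"{basin_id}_calibrate_params.txt",
    ]
    for name in candidates:
        path = param_dir / name
        if path.is_file():
            return path
    return None


with tempfile.TemporaryDirectory() as tmp:
    param_dir = Path(tmp)
    (param_dir / "01013500_sceua.csv").touch()
    (param_dir / "01013500_ga.csv").touch()
    first = find_param_file(param_dir, "01013500").name
    # Once the JSON file exists, it takes precedence over every CSV
    (param_dir / "calibration_results.json").touch()
    second = find_param_file(param_dir, "01013500").name

print(first, second)  # 01013500_ga.csv calibration_results.json
```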
Available metrics: NSE, KGE, RMSE, PBIAS, FHV, FLV, FMS
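For readers unfamiliar with these metrics, the sketch below gives textbook definitions of RMSE, NSE, KGE (the 2009 formulation with correlation, variability ratio, and bias ratio), and PBIAS in plain Python. These are reference formulas, not hydromodel's implementation. In particular, the PBIAS sign convention varies between packages, and the one used here (positive means overestimation) is an assumption.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def rmse(obs, sim):
    """Root mean square error."""
    return math.sqrt(_mean([(s - o) ** 2 for o, s in zip(obs, sim)]))


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is perfect, 0 matches the observed mean."""
    mean_obs = _mean(obs)
    num = sum((s - o) ** 2 for o, s in zip(obs, sim))
    den = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - num / den


def pbias(obs, sim):
    """Percent bias; positive means overestimation under this sign convention."""
    return 100.0 * sum(s - o for o, s in zip(obs, sim)) / sum(obs)


def kge(obs, sim):
    """Kling-Gupta efficiency from correlation, variability ratio and bias ratio."""
    mo, ms = _mean(obs), _mean(sim)
    so = math.sqrt(_mean([(o - mo) ** 2 for o in obs]))
    ss = math.sqrt(_mean([(s - ms) ** 2 for s in sim]))
    r = _mean([(o - mo) * (s - ms) for o, s in zip(obs, sim)]) / (so * ss)
    alpha, beta = ss / so, ms / mo
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


qobs = [1.0, 2.0, 3.0, 4.0]  # made-up observed flows
qsim = [1.1, 1.9, 3.2, 3.8]  # made-up simulated flows
print(rmse(qobs, qsim), nse(qobs, qsim), kge(qobs, qsim), pbias(qobs, qsim))
```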
calibration_results.json structure:
{
"01013500": {
"convergence": "success",
"objective_value": 1.234567,
"best_params": {
"xaj": {
"K": 0.567890,
"B": 0.234567,
"IM": 0.045678,
...
}
},
"algorithm_info": {
"generations": 50,
"population_size": 80,
...
}
}
}
CSV files (GA/scipy) structure:
generation,objective_value,param_K,param_B,param_IM,...
0,3.456,0.567,0.234,0.045,...
1,2.345,0.589,0.256,0.047,...
Why two formats?
- JSON: Best parameters only, works with all algorithms, used by evaluation
- CSV: Full iteration/generation history, useful for convergence analysis
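The two formats can be read with the standard library alone. The sketch below uses in-memory strings that mirror the structures shown above, with values that are illustrative and trimmed to three parameters. It pulls the best parameters from the JSON and the objective trace from the CSV, and it assumes the objective is minimized, as with RMSE.

```python
import csv
import io
import json

# Stand-ins for calibration_results.json and {basin_id}_ga.csv (illustrative values)
results_json = """
{"01013500": {"convergence": "success", "objective_value": 2.345,
              "best_params": {"xaj": {"K": 0.589, "B": 0.256, "IM": 0.047}}}}
"""
history_csv = """generation,objective_value,param_K,param_B,param_IM
0,3.456,0.567,0.234,0.045
1,2.345,0.589,0.256,0.047
"""

# JSON: best parameters only, keyed by basin ID and then by model name
best_params = json.loads(results_json)["01013500"]["best_params"]["xaj"]

# CSV: one row per generation, useful for checking convergence
rows = list(csv.DictReader(io.StringIO(history_csv)))
objective_trace = [float(row["objective_value"]) for row in rows]
best_row = min(rows, key=lambda row: float(row["objective_value"]))

print(best_params)
print(objective_trace)
print(best_row["generation"])
```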
Important: Simulation does NOT require prior calibration!
UnifiedSimulator provides a flexible interface for running model simulations with any parameter values:
from hydromodel.trainers.unified_simulate import UnifiedSimulator
from hydromodel.datasets.unified_data_loader import UnifiedDataLoader
# Load data
data_loader = UnifiedDataLoader(config["data_cfgs"])
p_and_e, qobs = data_loader.load_data()
# Define parameters (from calibration, literature, or custom values)
parameters = {
"K": 0.75, "B": 0.25, "IM": 0.06,
"UM": 18.0, "LM": 80.0, "DM": 95.0,
# ... other parameters
}
# Create simulator
model_config = {
"model_name": "xaj_mz",
"parameters": parameters
}
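# basin_config holds basin-level settings and is not defined in this snippet;
# see the Usage Guide for its structure
simulator = UnifiedSimulator(model_config, basin_config)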
# Run simulation
results = simulator.simulate(
inputs=p_and_e,
qobs=qobs,
warmup_length=365
)
# Extract results
qsim = results["qsim"]  # Simulated streamflow
Command-line usage:
# Using custom parameters (works with any parameter values)
python scripts/run_xaj_simulate.py \
--config configs/example_simulate_config.yaml \
--param-file configs/example_xaj_params.yaml \
--output simulation_results.csv \
--plot
# Using calibrated parameters from SCE-UA (CSV format)
python scripts/run_xaj_simulate.py \
--param-file results/xaj_mz_SCE_UA/01013500_sceua.csv \
--plot
Use cases:
- Parameter sensitivity analysis
- Model comparison
- Scenario testing with custom parameters
- Literature parameter validation
For detailed API documentation and advanced usage, see Usage Guide - Model Simulation.
hydromodel/
├── hydromodel/
│ ├── models/ # Model implementations
│ │ ├── xaj.py # Standard XAJ model
│ │ ├── gr4j.py # GR4J model
│ │ └── ...
│ ├── trainers/ # Calibration, evaluation, and simulation
│ │ ├── unified_calibrate.py # Calibration API
│ │ ├── unified_evaluate.py # Evaluation API
│ │ └── unified_simulate.py # Simulation API
│ └── datasets/ # Data preprocessing and visualization
│ ├── unified_data_loader.py # Data loader
│ ├── data_visualize.py # Visualization functions
│ └── ...
├── scripts/ # Command-line interface scripts
│ ├── run_xaj_calibration.py # Calibration script
│ ├── run_xaj_evaluate.py # Evaluation script
│ ├── run_xaj_simulate.py # Simulation script
│ └── visualize.py # Visualization CLI
├── configs/ # Configuration files
└── docs/ # Documentation
- Quick Start: docs/quickstart.md
- Usage Guide: docs/usage.md
- API Reference: https://OuyangWenyu.github.io/hydromodel
- Allen, R.G., L. Pereira, D. Raes, and M. Smith, 1998. Crop Evapotranspiration, Food and Agriculture Organization of the United Nations, Rome, Italy. FAO publication 56. ISBN 92-5-104219-5. 290p.
- Duan, Q., Sorooshian, S., and Gupta, V. (1992), Effective and efficient global optimization for conceptual rainfall-runoff models, Water Resour. Res., 28(4), 1015–1031, doi:10.1029/91WR02985.
- François-Michel De Rainville, Félix-Antoine Fortin, Marc-André Gardner, Marc Parizeau, and Christian Gagné. 2012. DEAP: a python framework for evolutionary algorithms. In Proceedings of the 14th annual conference companion on Genetic and evolutionary computation (GECCO '12). Association for Computing Machinery, New York, NY, USA, 85–92. DOI:https://doi.org/10.1145/2330784.2330799
- Houska T, Kraft P, Chamorro-Chavez A, Breuer L (2015) SPOTting Model Parameters Using a Ready-Made Python Package. PLoS ONE 10(12): e0145180. https://doi.org/10.1371/journal.pone.0145180
- Mizukami, N., Clark, M. P., Sampson, K., Nijssen, B., Mao, Y., McMillan, H., Viger, R. J., Markstrom, S. L., Hay, L. E., Woods, R., Arnold, J. R., and Brekke, L. D.: mizuRoute version 1: a river network routing tool for a continental domain water resources applications, Geosci. Model Dev., 9, 2223–2238, https://doi.org/10.5194/gmd-9-2223-2016, 2016.
- Zhao, R.J., Zhuang, Y. L., Fang, L. R., Liu, X. R., Zhang, Q. S. (ed) (1980) The Xinanjiang model, Hydrological Forecasting Proc., Oxford Symp., IAHS Publication, Wallingford, U.K.
- Zhao, R.J., 1992. The xinanjiang model applied in China. J Hydrol 135 (1–4), 371–381.
Related Projects:
- hydrodataset - CAMELS and other datasets
- hydrodatasource - Data preparation utilities
- torchhydro - PyTorch-based hydrological models
If you use hydromodel in your research, please cite:
@software{hydromodel,
author = {Ouyang, Wenyu},
title = {hydromodel: A Python Package for Hydrological Model Calibration},
year = {2025},
url = {https://github.com/OuyangWenyu/hydromodel}
}
Contributions are welcome! For major changes, please open an issue first.
git clone https://github.com/OuyangWenyu/hydromodel.git
cd hydromodel
uv sync --all-extras
pytest tests/
GNU General Public License v3.0 - see LICENSE file.
- Author: Wenyu Ouyang
- Email: [email protected]
- GitHub: https://github.com/OuyangWenyu/hydromodel
- Issues: https://github.com/OuyangWenyu/hydromodel/issues