
Add MultiHorizonTimeSeriesSplit for Multi-Horizon Time Series Cross-Validation #31344

Open
@andrelrodriguess

Describe the workflow you want to enable

The current TimeSeriesSplit in scikit-learn yields a single contiguous test window per split, which limits its use when a model must be evaluated at several specific future steps (e.g., predicting 1, 3, and 5 days ahead). I propose adding a new class, MultiHorizonTimeSeriesSplit, to enable cross-validation with multiple prediction horizons in a single split.

This would allow users to:

  • Specify a list of horizons (e.g., [1, 3, 5]) to generate train-test splits where the test set includes indices for multiple future steps.
  • Evaluate time series models for short, medium, and long-term forecasts simultaneously.
  • Simplify workflows for applications like demand forecasting, financial modeling, or weather prediction, avoiding manual splitting.

Example usage with daily temperatures:

from sklearn.model_selection import MultiHorizonTimeSeriesSplit
import numpy as np

# Daily temperatures for 10 days (in °C)
X = np.array([20, 21, 22, 23, 24, 25, 26, 27, 28, 29])
# Two expanding-window splits, each tested 1 and 2 steps ahead
cv = MultiHorizonTimeSeriesSplit(n_splits=2, horizons=[1, 2])
for train_idx, test_idx in cv.split(X):
    print(f"Train indices: {train_idx}, Test indices: {test_idx}")

Expected output:

Train indices: [0 1 2 3 4], Test indices: [5 6]
Train indices: [0 1 2 3 4 5 6], Test indices: [7 8]

Describe your proposed solution

I propose implementing a new class, MultiHorizonTimeSeriesSplit, inheriting from TimeSeriesSplit. The class will:

  • Add a horizons parameter (list of integers) to specify prediction steps.
  • Modify the split method to generate test indices for each horizon while preserving temporal order.
  • Include input validation to ensure valid horizons and split counts (a rough sketch is given below).
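
A minimal sketch of one possible implementation is shown below. The fold layout (the training window grows by max(horizons) per split and the last fold is anchored at the end of the series) and the constructor signature are assumptions for illustration, so the exact indices may differ slightly from the example output above:

import numpy as np
from sklearn.model_selection import TimeSeriesSplit


class MultiHorizonTimeSeriesSplit(TimeSeriesSplit):
    """Sketch: expanding-window CV where each fold is tested at several horizons."""

    def __init__(self, n_splits=5, *, horizons=(1,)):
        super().__init__(n_splits=n_splits)
        self.horizons = list(horizons)

    def split(self, X, y=None, groups=None):
        n_samples = len(X)
        horizons = sorted(self.horizons)
        if not horizons or any(h < 1 for h in horizons):
            raise ValueError("horizons must be a non-empty list of positive integers")
        max_h = horizons[-1]
        # Assumed layout: anchor the last fold at the end of the series; each
        # earlier fold's training window is shorter by max(horizons) samples.
        last_train_end = n_samples - max_h
        first_train_end = last_train_end - (self.n_splits - 1) * max_h
        if first_train_end < 1:
            raise ValueError(
                f"Cannot make {self.n_splits} splits with horizons {horizons} "
                f"from {n_samples} samples."
            )
        indices = np.arange(n_samples)
        for i in range(self.n_splits):
            train_end = first_train_end + i * max_h
            # One test index per requested horizon, counted from the end of training.
            test_idx = np.asarray([train_end + h - 1 for h in horizons])
            yield indices[:train_end], test_idx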

To verify the correctness of MultiHorizonTimeSeriesSplit, we will add unit tests covering various configurations and edge cases. For benchmarking, we will compare the new class against manual splitting on synthetic time series, measuring split-generation time and memory usage on a personal laptop to evaluate scalability.
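
As an illustration of the kind of unit test intended (test names are hypothetical, and the expected indices follow the sketch above rather than a settled API):

import numpy as np
import pytest


def test_multi_horizon_split_preserves_temporal_order():
    X = np.arange(50).reshape(-1, 1)
    horizons = [1, 3, 5]
    cv = MultiHorizonTimeSeriesSplit(n_splits=4, horizons=horizons)
    for train_idx, test_idx in cv.split(X):
        # Every training index must precede every test index.
        assert train_idx.max() < test_idx.min()
        # One test index per horizon, offset from the last training sample.
        np.testing.assert_array_equal(test_idx, train_idx[-1] + np.asarray(horizons))


def test_invalid_horizons_raise():
    cv = MultiHorizonTimeSeriesSplit(n_splits=2, horizons=[0])
    with pytest.raises(ValueError):
        list(cv.split(np.arange(10)))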

Describe alternatives you've considered, if relevant

No response

Additional context

No response
