
Add TimeSeriesCV and HomogeneousTimeSeriesCV #6322


Open
amueller opened this issue Feb 9, 2016 · 23 comments
Labels
help wanted · Moderate · module:model_selection · New Feature

Comments

@amueller (Member)

amueller commented Feb 9, 2016

I get this asked about once a day, so I think we should just add it.
Many people work with time series, and adding cross-validation for them would be really easy.
The standard strategy is described for example here

There are basically two cases: homogeneous time series (one sample every X seconds / days), or heterogeneous time series, where each sample has a time stamp.

For the homogeneous case, we can just put the first n_samples // n_folds in the first fold etc, so it's a very simple variation of KFold. Fixed in #6586.
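A minimal numpy-only sketch of that homogeneous strategy (an expanding training window over equal-size blocks; the function name here is illustrative, not the merged API):

```python
import numpy as np

def homogeneous_time_series_splits(n_samples, n_folds):
    """Yield (train, test) index arrays: fold k tests on block k + 1
    and trains on every earlier block."""
    indices = np.arange(n_samples)
    fold_sizes = np.full(n_folds, n_samples // n_folds)
    fold_sizes[: n_samples % n_folds] += 1  # spread the remainder over early folds
    boundaries = np.cumsum(fold_sizes)
    for start, stop in zip(boundaries[:-1], boundaries[1:]):
        yield indices[:start], indices[start:stop]

for train, test in homogeneous_time_series_splits(10, 5):
    print("train:", train, "test:", test)
```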

For the heterogeneous case, we need to take a labels array and split accordingly. If we cast that to integers, people could actually provide pandas time series and they would be handled correctly (they would be converted to nanoseconds).
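A quick sketch of that casting step, using numpy's datetime64 rather than pandas so it stays dependency-light (pandas datetimes would arrive as datetime64[ns] arrays too; the data here is illustrative):

```python
import numpy as np

# per-sample time stamps with nanosecond resolution
stamps = np.array(["2016-01-01", "2016-01-02", "2016-02-01"],
                  dtype="datetime64[ns]")
labels = stamps.astype("int64")  # nanoseconds since the Unix epoch
print(labels)  # consecutive days differ by 86400 * 10**9
```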

I remember arguing against this addition, but I changed my mind ;)

@yenchenlin (Contributor)

Hello @amueller ,
May I take this issue?

Recently I've been playing with time series data and found cross-validation for it problematic.
I think it would be really useful if scikit-learn added this new feature.

Thanks!

@raghavrv (Member)

Please go ahead :)

@raghavrv (Member)

cc: @agramfort and @jasmainak for suggestions as this would be immensely useful for mne-python too.

@jasmainak (Contributor)

Yes, I agree this can be useful if you want to tune parameters in a BCI setting. This way you also avoid the problem of testing on data correlated with your training samples. Although changing the size of the training set for each fold looks a bit weird to me.

@yenchenlin (Contributor)

Hello @amueller and @rvraghav93 ,
In the homogeneous case, are we expecting to see something that works like below?

cv = HomogeneousTimeSeriesCV(5, n_folds=5)
for train, test in cv:
    print("train:")
    print(train)
    print("test:")
    print(test)

which will output

train:
[0]
test:
[1]
train:
[0 1]
test:
[2]
train:
[0 1 2]
test:
[3]
train:
[0 1 2 3]
test:
[4]

Sorry if this is a stupid question.

@raghavrv (Member)

htcv = HomogeneousTimeSeriesCV(5, n_folds=5)

(nitpick) You should be developing this new CV as per our new cross-validator API, which is data-independent (see the model_selection module...)

train:
...
[4]

Yes, I think you are headed in the right direction! Ping @agramfort for more suggestions.

Sorry if this is a stupid question.

No question is stupid unless it remains unasked. At least, that's my philosophy towards asking questions :p

@lesshaste

Here are some links which I hope are helpful.

http://robjhyndman.com/hyndsight/tscvexample/ has a worked example in R.

For those with university subscriptions, "On the use of cross-validation for time series predictor evaluation" http://www.sciencedirect.com/science/article/pii/S0020025511006773 contains experimental results for different time series cross-validation approaches with a particular emphasis on machine learning.

The data and conclusions of that paper are available at http://dicits.ugr.es/papers/CV-TS/ .

@yenchenlin (Contributor)

@lesshaste Thanks for the information!

@yenchenlin (Contributor)

Hello @amueller , sorry to bother you.
Can you elaborate a bit more on heterogeneous time series CV?
An example would definitely help me a lot!

Thanks!

@yenchenlin (Contributor)

Hello @amueller ,
Do you think we can separate HomogeneousTimeSeriesCV and HeterogeneousTimeSeriesCV into two PRs?
So far, I've already completed the HomogeneousTimeSeriesCV case in #6351 .

@amueller (Member, Author)

amueller commented Mar 3, 2016

Well, imagine you have time stamps with each data point, and you want to use time ranges instead of numbers of samples for the cross-validation. The time stamps would be passed as the labels argument (maybe it could be called something else, but probably not).

Imagine you have 5 days of data, and each day has a different number of samples (each with a measurement time in seconds). You want to do CV over the days (train on day 1, test on day 2; train on days 1-2, test on day 3; etc.).
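A minimal sketch of that day-wise scheme, assuming a per-sample `days` label array (the data here is hypothetical):

```python
import numpy as np

# hypothetical: 5 days of data with a different number of samples per day
days = np.array([1, 1, 1, 2, 2, 3, 3, 3, 3, 4, 5, 5])

for test_day in np.unique(days)[1:]:
    train = np.flatnonzero(days < test_day)
    test = np.flatnonzero(days == test_day)
    print(f"train on days < {test_day}: {train}  test on day {test_day}: {test}")
```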

@amueller (Member, Author)

amueller commented Nov 30, 2016

Replying to @cbrummitt who commented here: #6586 (comment)

(we should keep the discussion in the open issue, not in a closed and merged pull request)

We wanted to deal with a single time series. Processing of the data is separate from splitting it into training and test sets (in the scikit-learn API / conventions).
It looks like you resampled your data to evenly spaced times (like every year). After you do that, you can just use the HomogeneousTimeSeries.

The idea of the HeterogeneousTimeSeries is to deal with event data, say at second resolution over multiple years. You could resample that to even intervals, but if your events are rare, that means your data is mostly missing everywhere, and that's not a very good representation. Instead, you can represent each measurement as having a timestamp. Now I want to do cross-validation over years: training on data from year 1 and testing on year 2, then training on years 1 and 2 and testing on year 3. I need to know which data points are in which year to do that.

Maybe this case is too specific. You could always use HomogeneousTimeSeries to do that, so maybe that was a bad name :( - the issue is that you wouldn't have semantic cut-off dates then; instead you'd train on the first 1k events and test on the next 1k events, etc.
If your data has strong long-term periodicity, this might be a bad idea because the distribution of the data over the different splits might be very different.

Maybe that's too specific a use-case for sklearn, though...

@cbrummitt (Contributor)

In this post, I'm going to consider two dimensions along which time-series data can differ:

  1. Samples are evenly spaced in time (I'll denote this by E) or samples are heterogeneously spaced in time (I'll denote this by H).
  2. There is a single "trajectory" (I'll denote this by S for single) or there are multiple trajectories (I'll denote this by M for multiple). Calling this "multiple trajectories" is probably not standard; it's also called panel data or longitudinal cohort data.

Here are examples of the four possible types of time-series, just to clarify and fix the ideas:

  • ES: The number of hours that Alice slept each night for 40 nights.
  • HS: The number of hours that Alice slept on some of the past 40 nights (i.e., some days are missing data).
  • EM: Number of hours slept by Alice and by Bob every night during the past 40 nights.
  • HM: Number of hours slept by Alice and by Bob on some of the past 40 nights (i.e., they are missing data on potentially different days; e.g., Bob is missing data on January 1 but Alice has data on January 1.)

(To clarify, by "Multiple trajectories" I have in mind a pandas Panel, with items = ['Alice', 'Bob'], time on the major_axis, and features on the minor_axis. In the above example there's only one feature, 'num_hours_slept', but I have in mind the general case of multiple features. By "Single trajectory" I have in mind a panel with one item, 'Alice', and potentially many features on the minor_axis, which you can also consider simply a two-dimensional array.)


Now let's try to split all four kinds of time-series data into cross-validation splits (or "folds") that are successively nested in time.

I'm thinking out loud here, still fleshing out these ideas, so please consider the proposal below a rough draft. Comments are welcome!

One may want to create cross-validation splits in one of two ways:

  1. by='samples': the folds have (roughly) the same number of samples, but not necessarily the same amount of time in each fold.

(By "roughly" I mean that the number of samples in the folds may differ from each another by at most 1 depending on n_samples % n_folds.)

Here's how TimeSeriesSplit(by='samples') can be implemented with existing tools and with tools under development:

  • ES: We can do this with the existing TimeSeriesSplit because no data is missing. ✅
  • HS: We can do this with the existing TimeSeriesSplit if the user first removes samples whose features are all missing (e.g., dataframe.dropna(how='all')). ✅ (Would it be innocuous to have TimeSeriesSplit drop all samples that have every feature missing?)
  • EM: If no data is missing, then use the existing TimeSeriesSplit. ✅ But if some trajectories are missing data at times when other trajectories are not, then you'd want to pass in a list of time-stamps in a times keyword argument; compute equally spaced quantiles of the time-stamps; and use those quantiles to make the nested splits. For a visualization of this idea, please see my comment [MRG+2?] Add homogeneous time series cross validation #6586 (comment)
  • HM: We cannot use the existing TimeSeriesSplit because there are different numbers of samples at different time-stamps. My current implementation is to convert the pandas Panel to a MultiIndex DataFrame; remove rows that are missing all their features (in the example above, we'd drop (January 1, na) from Bob's time-series); and then split the data based on the quantiles of the distribution of time-stamps (as described in the previous bullet point).

In general, TimeSeriesSplit(by='samples') could take a 2D array X and a list of time-stamps times (which contains the time-stamps of the rows); drop rows in X that are missing data in every column (and drop the corresponding time-stamps); and then split the data based on quantiles of the time-stamp distribution. This approach would be backward compatible with the current behavior of TimeSeriesSplit for all data matrices X without any rows that are missing all their columns. Splitting on the quantiles of equally spaced times is the same as the current behavior of TimeSeriesSplit, which splits on the indices of the rows of X.
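A rough numpy sketch of the quantile step described above (the time-stamps are illustrative; this is not an actual implementation of the proposal):

```python
import numpy as np

times = np.array([0.0, 0.5, 1.0, 4.0, 4.5, 9.0, 9.5, 9.9])  # sample time-stamps
n_folds = 4

# equally spaced quantiles of the time-stamp distribution as fold cut points
cuts = np.quantile(times, np.linspace(0, 1, n_folds + 1))[1:-1]
fold_of_sample = np.searchsorted(cuts, times, side="right")

# nested splits: train on all earlier folds, test on fold k
for k in range(1, n_folds):
    train = np.flatnonzero(fold_of_sample < k)
    test = np.flatnonzero(fold_of_sample == k)
    print("train:", train, "test:", test)
```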

  2. by='time': the folds have (roughly) the same amount of time in each, but not necessarily the same number of samples.

Here's how TimeSeriesSplit(by='time') can be implemented with existing tools and with tools under development:

  • ES: We can do this with the existing TimeSeriesSplit. ✅
  • HS: I think this is the case that @amueller has in mind when he discusses HeterogeneousTimeSeries. I wrote a rough draft of an implementation below.
  • EM: We can do this with the existing TimeSeriesSplit. It doesn't matter if some trajectories are missing data at times when other trajectories are not. ✅
  • HM: As for HS, we would need @amueller's HeterogeneousTimeSeries.

Here's a simple implementation of @amueller's HeterogeneousTimeSeries, or what I've been calling TimeSeriesSplit(by='time').

import numpy as np

times = np.array([0, 1, 5, 19, 20, 30, 31, 33])
values = np.array([2, 3, 1, 3, 4, 5, 1, 3])
n_folds = 4
bin_size = max(times) // n_folds

bin_right_endpoints = np.arange(
    # Make the first bin as large as possible
    bin_size + max(times) % n_folds,
    max(times) + 1,
    bin_size)
# bin_right_endpoints in this example is array([ 9, 17, 25, 33])

successive_train_test_time_endpoints = zip(
    bin_right_endpoints[:-1], bin_right_endpoints[1:])

for i, (train_max_time, test_max_time) in enumerate(successive_train_test_time_endpoints):
    mask_train = times <= train_max_time
    mask_test = ~mask_train & (times <= test_max_time)
    template = "Fold {i}: time span {timespan:>8}. Train: {train:<20} Test: {test}"
    print(template.format(
        i=i,
        timespan=str((train_max_time, test_max_time)),
        train=str(times[mask_train]),
        test=str(times[mask_test])))

Output:

Fold 0: time span  (9, 17). Train: [0 1 5]              Test: []
Fold 1: time span (17, 25). Train: [0 1 5]              Test: [19 20]
Fold 2: time span (25, 33). Train: [ 0  1  5 19 20]     Test: [30 31 33]

As we see in this example, for TimeSeriesSplit(by='time') the user may choose a sufficiently large number of folds that one of the train or test sets is empty 😟


I hope this post helps to fix ideas a bit. In summary, I think TimeSeriesSplit could be improved in a mostly backward compatible way, with new optional parameters by and times, so that it could handle missing data, heterogeneous time-stamps, and panel data.

@cbrummitt (Contributor)

Another feature to consider adding to TimeSeriesSplit is an optional argument drop (or drop_first?) that drops the first k samples from the train and test sets. The reason is that in time-series analysis one often regresses the time-series at time t on the time-series at times t - 1, t - 2, ..., t - n_lags. For cross-validation you want the training and test sets to be independent. If, for example, the time-series is stationary with significant autocorrelation up to 4 lags and negligible autocorrelation beyond 4 lags, then one would create a model with n_lags set to 4; moreover, the first 4 values in the test set are not independent of the training set, so you'd want to drop the first 4 values from the test set.

This example is illustrated in Figure 3(b) of Ref. [1], a paper that is behind a paywall. I copied Fig. 3(b) below. The train, validation, and test sets are marked by blue triangles, red squares, and green squares, respectively. Notice how each of these sets has 4 unmarked points to the left of them; these 12 unmarked points have been dropped from those sets.

(Figure 3(b) of Ref. [1]; image not reproduced here.)

[1] Bergmeir, C., & Benítez, J. M. (2012). On the use of cross-validation for time series predictor evaluation. Information Sciences, 191, 192–213. http://doi.org/10.1016/j.ins.2011.12.028

This drop keyword argument could be 0 by default so that it's backward compatible.
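(For the record, later scikit-learn releases added a `gap` parameter to `TimeSeriesSplit` that serves this purpose by trimming the end of each training set.) A numpy-only sketch of the variant described here, which instead drops the first `n_lags` points of each test set; the sizes are illustrative:

```python
import numpy as np

n_samples, n_folds, n_lags = 30, 4, 4
indices = np.arange(n_samples)
test_size = n_samples // (n_folds + 1)  # same sizing rule as TimeSeriesSplit

for test_start in range(n_samples - n_folds * test_size, n_samples, test_size):
    train = indices[:test_start]
    # drop the first n_lags test points: they are correlated with the train set
    test = indices[test_start + n_lags : test_start + test_size]
    print("train size:", len(train), "test:", test)
```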

@jnothman (Member)

jnothman commented Dec 3, 2016 via email

@amueller (Member, Author)

amueller commented Dec 3, 2016

I'm not sure the concept of multiple trajectories makes sense within scikit-learn. We don't even know about indices right now, and I don't think we'll support panels any time soon. This only matters in the HM case, though; EM is basically "a single time-series with multiple features". Actually, EM seems the most natural case for scikit-learn.

Apart from the fact that panels don't really fit into the scikit-learn framework, it's pretty hard to even pass around a single time-series through the scikit-learn API. We could abuse "groups". For a real solution we might need #4497. But even with that, in the HM case there is really no concept of a sample, is there? (Unless you consider the time-series themselves samples and you want to learn from some time-series and generalize to new time-series, but that sounds like a structured prediction problem.)

@amueller (Member, Author)

amueller commented Dec 3, 2016

Or did you want to transform HM into EM by creating "missing value" entries so that it's kind of like HS?
I'm OK with adding by and times, though I'm not sure whether times needs to be called groups. We cannot use any pandas data structures internally, though (though we can accept them).

@cbrummitt (Contributor)

I would reserve groups for the names of multiple time-series (i.e., the items of a pandas Panel). For example, in a longitudinal cohort dataset on 1000 patients with 10 measurements taken every year for 20 years, the groups would be the 1000 patient IDs, and the times would be the 20 years (expressed as integers, floats, or maybe as datetime objects?). An example close to this one is used in the user guide 3.1.5. Cross-validation iterators for grouped data:

An example would be when there is medical data collected from multiple patients, with multiple samples taken from each patient. And such data is likely to be dependent on the individual group. In our example, the patient id for each sample will be its group identifier.

The guide mentions "multiple samples taken from each patient" but doesn't focus on time. If I understand correctly, the existing grouped cross-validation iterators (GroupKFold, LeaveOneGroupOut, LeavePGroupsOut, GroupShuffleSplit) are used in situations where samples within the same group are correlated with each other, but the outcome probabilities don't vary in time. It seems natural to go beyond splitting by just groups and to create CV iterators for situations in which groups and time both matter.

I think the most elegant solution would be for users to put their (panel or time-series) dataset into long format: a 2D array with features on the columns and with each row corresponding to an observation (of some group at some time). Let (n_observations, n_features) be the shape of this 2D array. The user would need to compute as many as two 1D arrays of shape (n_observations,): groups and times. These two arrays could be passed into the k-fold and shuffle-split split methods. For many such methods, times would be ignored and would exist only for compatibility (like groups is now for KFold, StratifiedKFold, and so on). Only in TimeSeriesSplit.split() would the times parameter be used. In the future there may be more time-series CV splitters (e.g., with a rolling window rather than nested splits?) that use the times parameter.
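A toy illustration of the long format and a times-based nested split (the patient/year data is hypothetical; `groups` is unused by the time split, exactly as proposed for splitters that ignore it):

```python
import numpy as np

# long format: one row per (patient, year) observation
groups = np.array([101, 101, 101, 202, 202, 303, 303, 303])  # patient IDs
times = np.array([2000, 2001, 2002, 2000, 2002, 2001, 2002, 2003])

# nested split on times: train on years <= y, test on year y + 1
for y in np.unique(times)[:-1]:
    train = np.flatnonzero(times <= y)
    test = np.flatnonzero(times == y + 1)
    print(f"train years <= {y}: {train}  test year {y + 1}: {test}")
```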

It would be great to be able to do nested cross-validation with, say, a groups-based split in the outer loop and a times-based split in the inner loop. (Getting this to work may require fixing #7646 .) If we used groups to pass in times in TimeSeriesSplit.split(), then the user couldn't simply write groups=patient_ids in the outer and inner loop. Instead the user would have to write groups=patient_ids in one loop and groups=years in the other, which seems a bit cognitively incongruent. I don't know whether there are cross-validation methods for panel datasets that use both groups and times simultaneously.

@jnothman (Member)

jnothman commented Dec 4, 2016 via email

@amueller (Member, Author)

amueller commented Dec 6, 2016

Yeah @cbrummitt, I was not asking whether this is different from groups, but what will break if we add something that is not called groups. And I think the answer is "everything". So yeah, we might want to revisit #4497 before we can properly implement this.

@howardbandy

For models of financial time series, as in stock market prediction, there are two techniques for defining the training period (referred to as "in-sample" in the trading-system world). One is "anchored", where the initial date of the training period is held constant and the length of the training period expands. The second is "rolling", where the length of the training period is held constant and both the start and end dates move forward. The out-of-sample data includes both validation (if used) and test data and is always more recent than the in-sample data.
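A numpy sketch contrasting the two schemes (the sizes are illustrative; scikit-learn's TimeSeriesSplit covers the rolling case via its max_train_size parameter):

```python
import numpy as np

n_samples, test_size, window = 12, 2, 4
indices = np.arange(n_samples)

for test_start in range(window, n_samples, test_size):
    test = indices[test_start : test_start + test_size]
    anchored = indices[:test_start]  # expanding: start date fixed at 0
    rolling = indices[test_start - window : test_start]  # fixed-length window
    print("anchored:", anchored, " rolling:", rolling, " test:", test)
```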

@mitar (Contributor)

mitar commented Apr 18, 2019

Just to see if I understand current state correctly. So we now have TimeSeriesSplit for homogeneous time-series, but there is nothing yet for heterogeneous?

Moreover, if my pandas DataFrame contains multiple time-series in one DataFrame, the current TimeSeriesSplit works well if all time-series span the same time range (for example, the last 30 years of stock market data), but not if they come from different time ranges (one time-series from 2015 and one from 2016, where I would want the first 10 months of each time-series as training data).

@adrinjalali (Member)

Note that now that we have SLEP006 / metadata routing, we can easily implement these if there's an interest.

@adrinjalali added the help wanted and Moderate labels Aug 4, 2023