[ENH] major speedup for _predict_out_of_sample for RecursiveReductionForeca… #7380

ericjb · 2024-11-11T17:33:51Z

Speedup _predict_out_of_sample in Issue #3224 - see @fkiraly comment there from Jul 2, 2023

I rewrote the method _predict_out_of_sample in class RecursiveReductionForecaster. The rewrite involved copying a similar method from a different class and modifying it to work with this class.

This PR is a bit hacky:

I left in much of the original code for reference, surrounded by `if False:'
I did not address the pooling = "global" case (I don't really know what that is)
I did a careful comparison that the numbers are identical pre/post - but that was only done on one example.

…ster

fkiraly · 2024-11-12T07:50:07Z

For information: the pooling="global" case covers the case of fitting the regression model on multiple time series, i.e., on Panel typed containers.

The implementation however seems buggy at the moment, this is a different known issue.

fkiraly

Nice!

I think we need to enable tests for this estimator to ensure that nothing is broken.

What I wlil do, I will enable the tests and add a skip for the known failure in the "global" case. Then you can merge from main and we see what happens.

Further, do you have any profiling results that you can share? Pre/post?

ericjb · 2024-11-12T12:40:32Z

Attached: code I used to do the profiling and the profiling outputs. "Pre" = before my changes, "Post" = after my changes. The profiling was done only for the generation of the out-of-sample forecasts. Let me know if you have any questions.
example_PR_7380.py.gz
profile_7380_post.txt
profile_7380_pre.txt

fkiraly · 2024-11-12T15:23:26Z

That's ... a factor of 1000???

Marvelous! Did you also understand where the time was lost? It looks like unnecessary copies of objects were made, and too many function calls?

fkiraly · 2024-11-12T15:24:27Z

Re activating the tests, I got stuck there with a failed test run to identify the failing tests - I have now restarted.

fkiraly · 2024-11-12T18:17:17Z

Btw, I would invite you to check the discord - another contributor (Julian) also wants to fix the issues with the global setting!

We could try to meet in one of the Friday 13 UTC meetups.

fkiraly · 2024-11-12T19:38:32Z

Ok, I switched on the tests!

ericjb · 2024-11-14T13:35:48Z

Btw, I would invite you to check the discord - another contributor (Julian) also wants to fix the issues with the global setting!

We could try to meet in one of the Friday 13 UTC meetups.

Questions:

what does it mean to "check the discord" ? (esp. with reference to Julian)
How does one join the Friday meetups? I think I tried in the past and failed. Also, Friday 13 UTC is generally a very awkward time for me (with some exceptions) due to prior commitments. Frustrating, as I'd definitely like to be a regular attendee.

ericjb · 2024-11-14T13:49:36Z

That's ... a factor of 1000???

Marvelous! Did you also understand where the time was lost? It looks like unnecessary copies of objects were made, and too many function calls?

Too many function calls is an understatement - 250 million function calls! The routine is supposed to take a vector of 12 numbers, do essentially a few matrix multiplies against this vector, yielding a scalar. This gets repeated 36 times X 20 forecasters, i.e. less than 1,000 calculations. It should be essentially instantaneous. Almost everywhere one looks there are unnecessary actions being done. Why the copies in the first place? All the info from the fitted model is only being read, not written to. And is there really a need for checking the need for imputation? The model has already been fitted at this point. Also why grab the entire original time series, when only the last window is needed. And on it goes. [Also, even if a copy or check for imputation is needed, it should be needed only once, before the first time through the recursive loop.]

fkiraly

Excellent!

Could you kindly add the test cases we discussed in a file forecasting.compose.tests.test_reduce_2nd? With asserts optimally where differences were before the fix.

ericjb · 2024-11-26T07:42:43Z

Note that these tests do not include the additional tests that I did which confirmed that the actual numerical results - forecasts and fitted values - were unchanged by the faster implementation. (It is not possible to automate this test as it requires different versions of the same file, or something equivalent.)

fkiraly · 2024-11-26T08:46:58Z

Understood, makes sense - thanks!

ericjb · 2024-11-30T09:53:18Z

Understood, makes sense - thanks!

What remains to be done here? Can it be completed?

…al repo

… (does not seem to do that, though)

…RegressionForecaster; fix some index issues for _predict_multiple

…tiindex - all related to DirectTabularRegressionForecaster

… (exog)

…regular time series (hence no freq)

…lobal pooling, etc

major speedup for _predict_out_of_sample for RecursiveReductionForeca…

dd07dd8

…ster

ericjb requested review from XinyuWuu, achieveordie, benHeid, fkiraly, fnhirwa, geetu040, pranavvp16 and yarnabrina as code owners November 11, 2024 17:33

ericjb mentioned this pull request Nov 11, 2024

[ENH] reducer design for rework #3224

Open

fkiraly requested changes Nov 12, 2024

View reviewed changes

fkiraly added module:forecasting forecasting module: forecasting, incl probabilistic and hierarchical forecasting enhancement Adding new functionality labels Nov 12, 2024

Merge branch 'main' into pr/7380

1a584f4

fix handling of missing values (now consistent with pre-changes)

7133299

fkiraly requested changes Nov 25, 2024

View reviewed changes

ericjb added 2 commits November 26, 2024 09:08

added test robust to missing index values

b51ea7d

added test handles hierarchical data with unequal indices

09d024f

Merge branch 'main' into pr/7380

dd69dc3

ericjb added 25 commits September 30, 2025 10:09

intermediate version

5ef3b3d

passes more unit tests than before

723dc33

fixed issues with skpro (soft dependency)

278c236

fixed typo in skpro params

fcc4839

added helper method _cutoff_scalar

4b7a80f

move helper _cutoff_scalar to _ReducerMixin

b2de7b4

fix setting of rtol and atol in a test (they were both zero)

80c4f07

use _cutoff_scalar() in an additional statement

798071e

broader replacement of self.cutoff by self._cutoff_scalar()

9742401

modify _cutoff_scalar() to provide usable freq in most (all?) cases

981d8d4

modify _get_expected_pred_idx to handle fh is None

6a29efb

handle various corner cases for setting freq

ef702f0

get DirectTabularRegressionForecaster to be included when testing loc…

f6a8ea8

…al repo

fixed bug in DirectTabularRegressionForecaster with Multiindex data

80adbf5

added some tags to try to get local CI to run same tests as GitHub CI…

b4789f8

… (does not seem to do that, though)

Merge branch 'main' into speedupRecursiveReductionForecaster

cb4c9ea

address CI errors: set capability:pred_int to False for DirectTabular…

5c18bba

…RegressionForecaster; fix some index issues for _predict_multiple

made changes to pass more CI tests - usually indexing related for mul…

7b299e6

…tiindex - all related to DirectTabularRegressionForecaster

fixed a CI error related to slice_at_ix and absolute ForecastingHorizon

f5f5399

fix a CI discovered bug related to Multiindex y and time index only X…

ff1c44d

… (exog)

fix a CI discovered bug related to inferring freq from a Multiindex

5fc490c

fixed a CI error in DirectReductionForecaster related to y not being …

1da996f

…regular time series (hence no freq)

completes unittest5 but with debugging trace

7034e7e

commented out debugging trace

ebc665e

I fixed a bug I introduced for ForecastingHorizon.to_relative

399c5cb

felipeangelimvieira mentioned this pull request Oct 16, 2025

[ENH] Unify Direct Reduction and Recursive Reduction #8975

Open

ericjb added 4 commits October 16, 2025 22:24

modifications to pass CI tests that begin with a WIDE y

467d918

comment out debugging trace

4b776a6

additional bug fixes related to passing CI tests: panel data, local/g…

df2c952

…lobal pooling, etc

remove debugging trace

8cfaac9

Uh oh!

[ENH] major speedup for _predict_out_of_sample for RecursiveReductionForeca… #7380

Are you sure you want to change the base?

[ENH] major speedup for _predict_out_of_sample for RecursiveReductionForeca… #7380

Uh oh!

Conversation

ericjb commented Nov 11, 2024

Uh oh!

fkiraly commented Nov 12, 2024

Uh oh!

fkiraly left a comment

Choose a reason for hiding this comment

Uh oh!

ericjb commented Nov 12, 2024

Uh oh!

fkiraly commented Nov 12, 2024

Uh oh!

fkiraly commented Nov 12, 2024

Uh oh!

fkiraly commented Nov 12, 2024

Uh oh!

fkiraly commented Nov 12, 2024

Uh oh!

ericjb commented Nov 14, 2024

Uh oh!

ericjb commented Nov 14, 2024

Uh oh!

fkiraly left a comment

Choose a reason for hiding this comment

Uh oh!

ericjb commented Nov 26, 2024

Uh oh!

fkiraly commented Nov 26, 2024

Uh oh!

ericjb commented Nov 30, 2024

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

6 participants