[ENH] Out-of-Sample-Residual Wrapper for Forecasters #8936

Open

geetu040 wants to merge 8 commits into sktime:main from geetu040:oos

Member

geetu040 commented Oct 8, 2025

Reference Issues/PRs

Fixes #8762

Does your contribution introduce a new dependency? If yes, which one?

No

What should a reviewer concentrate their feedback on?

predict methods

Did you add any tests for the change?

Not yet

PR checklist

For all contributions

Optionally, for added estimators: I've added myself and possibly to the maintainers tag - do this if you want to become the owner or maintainer of an estimator you added.
See here for further details on the algorithm maintainer role.
The PR title starts with either [ENH], [MNT], [DOC], or [BUG]. [BUG] - bugfix, [MNT] - CI, test framework, [ENH] - adding or improving code, [DOC] - writing or improving documentation or docstrings.

For new estimators

I've added the estimator to the API reference - in docs/source/api_reference/taskname.rst, follow the pattern.
I've added one or more illustrative usage examples to the docstring, in a pydocstyle compliant Examples section.

geetu040 requested review from benHeid, felipeangelimvieira, fkiraly and yarnabrina as code owners

October 8, 2025 13:13

geetu040 commented

Member Author

geetu040 left a comment

FYI: @marrov @ankurankan @fkiraly
This is the draft implementation. Please take a look at _predict to see if it's going in the right direction. I am implementing the other methods in the meantime. I've also left a few questions in the review; please take a look at them as well.

fkiraly changed the title ~~[ENH] Implements Out-of-Sample-Residual Wrapper for Sktime Forecasters~~ [ENH] Out-of-Sample-Residual Wrapper for Forecasters

fkiraly assigned geetu040

fkiraly added this to May - Sep 2025 mentee projects

fkiraly moved this to PR in progress in May - Sep 2025 mentee projects

geetu040 force-pushed the oos branch from b6d82d4 to 335c8b7 Compare

October 17, 2025 03:51

geetu040 added 8 commits

October 20, 2025 07:58


upload initial code (7e5affe)
new implementation (455869c)
fix for multi-index (7bc3da0)
fix estimator checks (95f4e21)
enable capability:pred_int (59fe39e)
shorten var names (7335c1b)
rename: OosResidualsWrapper->OosForecaster (39a09e9)
update docstrings (a27bc4d)

geetu040 force-pushed the oos branch from 538d12f to a27bc4d Compare

October 20, 2025 03:10

Member Author

geetu040 commented Oct 20, 2025

This PR is ready for review. Please take a look at your convenience.
FYI: @fkiraly @ankurankan @marrov

marrov reviewed

sktime/forecasting/compose/_oos.py


		return in_fh, oos_fh

		def _fit(self, y, X, fh):

Collaborator

marrov Oct 23, 2025

If the user only asks for in-sample points and runs fit, oos_fh will be None. However, we still run self._oos_forecaster.fit(..., fh=None). Any wrapped forecaster with requires-fh-in-fit=True would crash here, right? Our own tag currently claims fh is optional. Should we mirror the wrapped tag and skip the out-of-sample fit when there is no out-of-sample horizon?
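A minimal sketch of the guard being suggested, not taken from the PR: `StubForecaster` and `fit_oos_if_needed` are hypothetical names, and the stub only imitates a forecaster that insists on receiving fh in fit.

```python
class StubForecaster:
    """Stand-in for a wrapped forecaster that requires fh in fit (not sktime code)."""

    def __init__(self):
        self.is_fitted = False

    def fit(self, y, X=None, fh=None):
        # Imitates the failure mode described above: fh=None is rejected.
        if fh is None:
            raise ValueError("fh must be passed in fit")
        self.is_fitted = True
        return self


def fit_oos_if_needed(forecaster, y, X, oos_fh):
    """Fit the out-of-sample forecaster only if an out-of-sample horizon exists."""
    if oos_fh is None or len(oos_fh) == 0:
        # Nothing to forecast out of sample, so skip the fit entirely.
        return None
    return forecaster.fit(y=y, X=X, fh=oos_fh)


# In-sample-only request: the fit is skipped instead of raising.
skipped = fit_oos_if_needed(StubForecaster(), [1.0, 2.0, 3.0], None, None)
# Request with an out-of-sample horizon: the forecaster is fitted as usual.
fitted = fit_oos_if_needed(StubForecaster(), [1.0, 2.0, 3.0], None, [1, 2])
```

Whether the wrapper should also copy the wrapped forecaster's requires-fh-in-fit tag is a separate decision that this sketch does not address.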

marrov reviewed

sktime/forecasting/compose/_oos.py

+                          index = pd.MultiIndex.from_product(
+                              _y.index.levels[:-1] + [index], names=_y.index.names
+                          )
+                      preds = pd.DataFrame(0.0, index=index, columns=columns)

Collaborator

marrov Oct 23, 2025

Initialising the result frame with zeros means every in-sample timestamp that the splitter can’t cover inherits a 0 forecast/residual. That could distort downstream residual analysis. I think these slots need to stay NaN (and ideally raise a warning) rather than fabricate zeros.
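For illustration only, a small pandas sketch of the NaN-based alternative. The helper names `empty_prediction_frame` and `warn_if_uncovered` are hypothetical and do not appear in the PR, and the toy index stands in for the real in-sample horizon.

```python
import warnings

import numpy as np
import pandas as pd


def empty_prediction_frame(index, columns):
    """Hypothetical helper: result frame whose unfilled slots stay NaN."""
    return pd.DataFrame(np.nan, index=index, columns=columns)


def warn_if_uncovered(preds):
    """Hypothetical helper: warn about rows that no cv split filled in."""
    missing = preds.index[preds.isna().all(axis=1)]
    if len(missing) > 0:
        warnings.warn(
            f"{len(missing)} in-sample points are not covered by any cv split; "
            "their predictions are NaN.",
            stacklevel=2,
        )
    return missing


# Toy case: five in-sample points, the splitter only covers the last three.
index = pd.RangeIndex(5)
preds = empty_prediction_frame(index, columns=["y"])
preds.loc[index[2:], "y"] = [10.0, 11.0, 12.0]

# The first two rows remain NaN and are reported rather than shown as 0.0.
missing = warn_if_uncovered(preds)
```

With NaN as the fill value, residual summaries that skip missing values ignore the uncovered points instead of treating them as genuine forecasts of 0.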

marrov reviewed

sktime/forecasting/compose/_oos.py

+                      columns = self._get_columns(method=method_name, **method_kwargs)
+                      index = fh.to_absolute_index(self.cutoff)
+                      if isinstance(_y.index, pd.MultiIndex):
+                          index = pd.MultiIndex.from_product(

Collaborator

marrov Oct 23, 2025

Are you sure this works properly for all multiindex dfs? It would create the full Cartesian product of hierarchy levels with fh so I am concerned about what happens if there are missing combinations. Wouldn't it be safer to construct the multiindex using self._y.index.droplevel(-1).unique()?
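The concern can be reproduced with a toy index. This is a sketch under assumptions: the level names and values are made up, and the second construction is just one possible way to build the index from droplevel(-1).unique(), assuming at least two hierarchy levels above the time level.

```python
import pandas as pd

# Toy hierarchical index with a missing combination: ("B", "y") never occurs.
y_index = pd.MultiIndex.from_tuples(
    [("A", "x", 0), ("A", "y", 0), ("B", "x", 0)],
    names=["region", "store", "time"],
)
horizon = pd.Index([1, 2], name="time")

# Cartesian product of the hierarchy levels with the horizon.
# This also creates rows for the unobserved ("B", "y") instance.
product = pd.MultiIndex.from_product(
    list(y_index.levels[:-1]) + [horizon], names=y_index.names
)

# Only the instances that actually occur, each repeated over the horizon.
instances = y_index.droplevel(-1).unique()
observed = pd.MultiIndex.from_tuples(
    [(*inst, t) for inst in instances for t in horizon],
    names=y_index.names,
)

n_product = len(product)    # 2 regions * 2 stores * 2 time points
n_observed = len(observed)  # 3 observed instances * 2 time points
```

Here the product index contains two extra rows for ("B", "y"), which is the kind of fabricated instance the comment warns about.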

marrov reviewed

sktime/forecasting/compose/_oos.py

+                  In standard forecasters, in-sample predictions are typically obtained directly
+                  from fitted values, which reuse information from the target observation itself.
+                  In contrast, this wrapper enforces a strictly causal prediction regime by

Collaborator

marrov Oct 23, 2025

Not sure "strictly causal prediction regime" is the right wording here. Just say something like "provides in-sample predictions as if they were out-of-sample (i.e. unseen by the forecaster)".

marrov reviewed

sktime/forecasting/compose/_oos.py

+                     - Aggregate predictions across all splits to form the complete
+                       in-sample forecast.
+. Any in-sample points not covered by a ``cv`` split are filled with zeros

Collaborator

marrov Oct 23, 2025

As mentioned before, I'd do NaNs here.

marrov reviewed

sktime/forecasting/compose/_oos.py

+                      self._oos_forecaster = clone(self.forecaster)
+                      self._oos_forecaster.fit(y=y, X=X, fh=oos_fh)
+                  def _custom_predict(self, fh, X, method_name, **method_kwargs):

Collaborator

marrov Oct 23, 2025

I'm ok with sparse use of docstrings elsewhere, but this method is quite long and is also the key one, so I'd add a docstring to explain the logic.

marrov reviewed

sktime/forecasting/compose/_oos.py

+                          self._in_forecaster.fit(y=new_y, X=new_X, fh=cv.get_fh())
+                          # update on all training windows
+                          for window, horizon in cv.split(_y):

Collaborator

marrov Oct 23, 2025

I may not have followed the more recent discussions, but it does feel like a shame that we cannot reuse this logic from elsewhere (evaluate, etc). Can you refresh my memory here: why is re-implementing the rolling update logic manually the best option?

marrov reviewed

sktime/forecasting/compose/_oos.py

+                              new_X = _X.iloc[window] if _X is not None else None
+                              new__X = _X.iloc[horizon] if _X is not None else None
+                              self._in_forecaster.update(y=new_y, X=new_X, update_params=True)

Collaborator

marrov Oct 23, 2025

I think we've already discussed this, but if it used refit instead of update this could be parallelised, no? Isn't that a worthwhile benefit @fkiraly? I recall we also said we might have update as the base and potentially expand on it in other PRs. Is that the goal here?

Collaborator

marrov commented Oct 23, 2025

I think it would be good to have a dedicated test set for this wrapper and not rely only on the default ones.

Reviewers

benHeid: awaiting requested review (code owner)
felipeangelimvieira: awaiting requested review (code owner)
fkiraly: awaiting requested review (code owner)
yarnabrina: awaiting requested review (code owner)
marrov: left review comments

At least 1 approving review is required to merge this pull request.

Labels

None yet