
Conversation

@geetu040
Member

@geetu040 geetu040 commented Oct 8, 2025

Reference Issues/PRs

Fixes #8762

Does your contribution introduce a new dependency? If yes, which one?

No

What should a reviewer concentrate their feedback on?

predict methods

Did you add any tests for the change?

Not yet

PR checklist

For all contributions
  • Optionally, for added estimators: I've added myself, and possibly others, to the maintainers tag - do this if you want to become the owner or maintainer of an estimator you added.
    See here for further details on the algorithm maintainer role.
  • The PR title starts with either [ENH], [MNT], [DOC], or [BUG]. [BUG] - bugfix, [MNT] - CI, test framework, [ENH] - adding or improving code, [DOC] - writing or improving documentation or docstrings.
For new estimators
  • I've added the estimator to the API reference - in docs/source/api_reference/taskname.rst, follow the pattern.
  • I've added one or more illustrative usage examples to the docstring, in a pydocstyle compliant Examples section.

Member Author

@geetu040 geetu040 left a comment


FYI: @marrov @ankurankan @fkiraly
This is the draft implementation. Please take a look at _predict to see if it's going in the right direction. I am implementing other methods meanwhile. Also, I've left a few questions in the review; please take a look at them as well.

@fkiraly fkiraly changed the title [ENH] Implements Out-of-Sample-Residual Wrapper for Sktime Forecasters [ENH] Out-of-Sample-Residual Wrapper for Forecasters Oct 10, 2025
@fkiraly fkiraly moved this to PR in progress in May - Sep 2025 mentee projects Oct 13, 2025
@geetu040
Member Author

This PR is ready for review. Please take a look at your convenience.
FYI: @fkiraly @ankurankan @marrov


return in_fh, oos_fh

def _fit(self, y, X, fh):
Collaborator


If the user only asks for in-sample points and runs fit, oos_fh will be None. However, we still run self._oos_forecaster.fit(..., fh=None). Any wrapped forecaster with requires-fh-in-fit=True would crash here, right? Our own tag currently claims fh is optional. Should we mirror the wrapped tag and skip the out-of-sample fit when there is no out-of-sample horizon?
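A minimal, self-contained sketch of the suggested guard. `StubForecaster` and `fit_oos` are hypothetical stand-ins for illustration only, not sktime API or the PR's actual code:

```python
# Illustrative sketch only: `StubForecaster` and `fit_oos` are hypothetical
# stand-ins, not sktime API.

class StubForecaster:
    """Mimics a wrapped forecaster with requires-fh-in-fit=True."""

    def fit(self, y, X=None, fh=None):
        if fh is None:
            raise ValueError("fh must be passed in fit")
        self.fitted_fh = fh
        return self


def fit_oos(forecaster, y, oos_fh):
    """Skip fitting the out-of-sample forecaster when there is no OOS horizon."""
    if oos_fh is None:
        return None  # nothing out-of-sample to predict, so do not fit
    return forecaster.fit(y=y, fh=oos_fh)


# no crash despite fh being required in fit
assert fit_oos(StubForecaster(), y=[1, 2, 3], oos_fh=None) is None
```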

index = pd.MultiIndex.from_product(
    _y.index.levels[:-1] + [index], names=_y.index.names
)
preds = pd.DataFrame(0.0, index=index, columns=columns)
Collaborator


Initialising the result frame with zeros means every in-sample timestamp that the splitter can’t cover inherits a 0 forecast/residual. That could distort downstream residual analysis. I think these slots need to stay NaN (and ideally raise a warning) rather than fabricate zeros.
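A small sketch of what the NaN-based initialisation could look like, with a warning for uncovered slots. The toy index, column name, and covered positions are illustrative assumptions, not the PR's code:

```python
import warnings

import numpy as np
import pandas as pd

# Toy setup: four monthly in-sample timestamps, one prediction column.
index = pd.period_range("2020-01", periods=4, freq="M")
columns = ["y_pred"]

# NaN, not 0.0: slots the splitter cannot cover stay visibly missing
preds = pd.DataFrame(np.nan, index=index, columns=columns)

# suppose the cv splits only cover the last two timestamps
preds.loc[index[2:], "y_pred"] = [1.5, 1.7]

uncovered = preds["y_pred"].isna()
if uncovered.any():
    warnings.warn(
        f"{int(uncovered.sum())} in-sample points not covered by cv splits"
    )
```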

columns = self._get_columns(method=method_name, **method_kwargs)
index = fh.to_absolute_index(self.cutoff)
if isinstance(_y.index, pd.MultiIndex):
    index = pd.MultiIndex.from_product(
Collaborator


Are you sure this works properly for all multiindex dfs? It would create the full Cartesian product of hierarchy levels with fh so I am concerned about what happens if there are missing combinations. Wouldn't it be safer to construct the multiindex using self._y.index.droplevel(-1).unique()?
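The concern can be demonstrated with a toy hierarchical series: after subsetting, `MultiIndex.levels` still carries the removed instance, so `from_product` over `levels[:-1]` would fabricate rows for an instance that is no longer in the data, whereas `droplevel(-1).unique()` reflects only the instances actually present:

```python
import pandas as pd

# Toy demo of the stale-levels gotcha behind the review comment.
mi = pd.MultiIndex.from_product(
    [["a", "b"], [1, 2]], names=["instance", "time"]
)
y = pd.Series([1.0, 2.0, 3.0, 4.0], index=mi)

y_sub = y.loc[["a"]]  # subset to instance "a" only

# .levels still remembers "b", so from_product over levels[:-1]
# would create rows for an instance absent from the data
stale = list(y_sub.index.levels[0])                  # ["a", "b"]
present = list(y_sub.index.droplevel(-1).unique())   # ["a"]
```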


In standard forecasters, in-sample predictions are typically obtained directly
from fitted values, which reuse information from the target observation itself.
In contrast, this wrapper enforces a strictly causal prediction regime by
Collaborator


Not sure "strictly causal prediction regime" is the right wording here. Just say something like "provides in-sample predictions as if they were out-of-sample (i.e., unseen by the forecaster)".

- Aggregate predictions across all splits to form the complete
  in-sample forecast.

4. Any in-sample points not covered by a ``cv`` split are filled with zeros
Collaborator


As mentioned before, I'd do NaNs here.

self._oos_forecaster = clone(self.forecaster)
self._oos_forecaster.fit(y=y, X=X, fh=oos_fh)

def _custom_predict(self, fh, X, method_name, **method_kwargs):
Collaborator


I'm ok with sparse use of docstrings elsewhere, but this method is quite long and is the key one, so I'd add a docstring to explain the logic.

self._in_forecaster.fit(y=new_y, X=new_X, fh=cv.get_fh())

# update on all training windows
for window, horizon in cv.split(_y):
Collaborator

@marrov marrov Oct 23, 2025


I may not have followed the more recent discussions, but it does feel like a shame that we cannot reuse this logic from elsewhere (evaluate, etc.). Can you refresh my memory: why is re-implementing the rolling update logic manually the best option?

new_X = _X.iloc[window] if _X is not None else None
new__X = _X.iloc[horizon] if _X is not None else None

self._in_forecaster.update(y=new_y, X=new_X, update_params=True)
Collaborator


I think we've already discussed this, but if it used refit instead of update, this could be parallelised, no? Isn't that a worthwhile benefit, @fkiraly? I recall we also said to maybe have update as the base and potentially expand with other PRs. Is that the goal here?
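The refit-vs-update trade-off can be illustrated with a toy example (not sktime code; `MeanForecaster` is a stand-in): with refit, each split's fit depends only on its own training window, so the splits are independent and can run in parallel, whereas update must consume the previous state sequentially.

```python
from concurrent.futures import ThreadPoolExecutor

# Toy illustration of why refit-per-split parallelises.
class MeanForecaster:
    """Stand-in forecaster: predicts the mean of its training window."""

    def fit(self, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self):
        return self.mean_


y = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
windows = [y[:3], y[:4], y[:5]]  # expanding training windows


def refit_predict(window):
    # each call is fully independent of the others
    return MeanForecaster().fit(window).predict()


# independent refits: trivially parallelisable
with ThreadPoolExecutor() as ex:
    preds = list(ex.map(refit_predict, windows))
# preds == [2.0, 2.5, 3.0]
```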

@marrov
Collaborator

marrov commented Oct 23, 2025

I think it would be good to have a dedicated test set for this wrapper and not only rely on the default ones.



Development

Successfully merging this pull request may close these issues.

[ENH] forecaster wrapper that ensures in-sample forecasts are out-of-sample rolling forecasts of a given horizon
