SLEP006: introduction of metadata routing through a feature flag #26045

Open
adrinjalali opened this issue Apr 1, 2023 · 2 comments
@adrinjalali
Member

In #25776 we started talking about introducing SLEP6 through a feature flag. This would mean the user can control whether SLEP6 is enabled or not. This would be done via our global configs:

import sklearn
sklearn.set_config(slep6="enabled")
# or
sklearn.set_config(enable_slep6=True)
# or ...

This also allows users to use this in a context manager, via config_context, if they wish to.
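To make the flag and context-manager idea concrete, here is a minimal stand-in written with the standard library only. It is not scikit-learn's implementation: `_global_config` and the flag name `enable_slep6` (one of the candidate spellings above) are assumptions for illustration, and the sketch ignores thread-safety.

```python
import contextlib

# Hypothetical global config; the flag defaults to off so existing code is unaffected.
_global_config = {"enable_slep6": False}


def get_config():
    """Return a copy of the current configuration."""
    return dict(_global_config)


def set_config(enable_slep6=None):
    """Update the flag globally; None leaves the current value unchanged."""
    if enable_slep6 is not None:
        _global_config["enable_slep6"] = bool(enable_slep6)


@contextlib.contextmanager
def config_context(**new_config):
    """Temporarily change the config, restoring the old values on exit."""
    old_config = get_config()
    set_config(**new_config)
    try:
        yield
    finally:
        set_config(**old_config)


with config_context(enable_slep6=True):
    inside = get_config()["enable_slep6"]   # flag is on only inside the block
outside = get_config()["enable_slep6"]      # restored to the default afterwards
```

The `try`/`finally` restores the previous value even if the body raises, which is what makes the scoped form safe to use around a single `fit` call.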

Default value

We can decide on the default value and the path forward. Probably the easiest option here would be for the default value of the setting to be off/False, which means at first the introduction of SLEP6 wouldn't break anything and users' code runs as it would before.

Transition

The following steps are required to completely switch to SLEP6:

  1. Land the implementation on main, and have it released in version a
  2. In version b, start warning users who pass any metadata that they should switch to SLEP6, that it will be enabled by default in version c, and that the old routing will be removed in version d
    • This can be done by raising a warning wherever we do old routing and observe a metadata/fit param that is not None (with the warning set to be raised only once)
  3. Remove old routing in version d
  4. Deprecate sklearn.set_config(enable_slep6=True) in version e, and remove it in version f, since once the old routing is removed from the code base, it doesn't make sense to keep the flag in the configs.
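The warn-once behaviour in step 2 could look roughly like the sketch below. The helper name `warn_on_legacy_metadata`, the module-level guard, and the message text are all hypothetical; the issue does not prescribe how the warning should be implemented.

```python
import warnings

# Hypothetical module-level guard so the warning is emitted only once per process.
_already_warned = False


def warn_on_legacy_metadata(**fit_params):
    """Warn once if any metadata reaches the old routing path with a non-None value.

    Returns the sorted names of the metadata that were actually passed.
    """
    global _already_warned
    passed = sorted(name for name, value in fit_params.items() if value is not None)
    if passed and not _already_warned:
        _already_warned = True
        warnings.warn(
            f"Metadata {passed} was routed with the old mechanism. "
            "Switch to SLEP6 routing; the default will change in a future release.",
            FutureWarning,
            stacklevel=2,
        )
    return passed
```

An explicit guard is used here instead of relying on the `"once"` warning filter so that the behaviour does not depend on the filters the user has configured.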

Constraints and open questions:

  • a <= b: they can be the same release, i.e. we start raising the warning as soon as the feature is shipped.
  • b < c: and our usual deprecation cycle is two releases, i.e. c = b + 2, but we can allow a longer deprecation cycle here if we want.
  • c <= d: we could have c and d happen at the same time, but we would probably want some buffer in between.
  • d <= e < f: since setting the flag's value doesn't do anything, and setting it to False is not allowed once we remove the old routing, we should probably have e == d, but we can remove the flag after a generous deprecation period.

My proposal here:

  • b == a: warn immediately to start the migration cycle
  • c == b + 3: give 1.5 years for users to move to the new syntax
  • d == c + 1: 6 months period here
  • d == e: deprecate the flag immediately since users can't really use it at this point anyway
  • f == e + 4: let the config not raise for 2 years
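As a quick sanity check, the proposal can be written out as release offsets relative to version a and tested against the constraints listed earlier. The two-releases-per-year cadence is an assumption inferred from "b + 3" being described as 1.5 years.

```python
RELEASES_PER_YEAR = 2  # assumption: implied by "3 releases == 1.5 years"

# Offsets are counted in releases after version a.
a = 0
b = a          # warn immediately
c = b + 3      # SLEP6 enabled by default
d = c + 1      # old routing removed
e = d          # flag deprecated
f = e + 4      # flag removed

schedule = {"a": a, "b": b, "c": c, "d": d, "e": e, "f": f}

# Constraints from the list above: a <= b, b < c, c <= d, d <= e < f.
assert a <= b < c <= d <= e < f

years_after_a = {name: offset / RELEASES_PER_YEAR for name, offset in schedule.items()}
for name, offset in schedule.items():
    print(f"version {name}: a + {offset} releases (~{years_after_a[name]} years)")
```

Under that assumption the whole transition, from first release to removal of the flag, spans 8 releases, i.e. about 4 years.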

Implementation

The implementation follows the same pattern as mentioned in #25776 (comment):

class MyMetaEstimator(...):

    def fit(self, X, y, **fit_params):
        if SLEP006_ON:
            params = process_routing(self, "fit", fit_params)
        else:
            # define params using backward compatible semantics
            params = ...

        self.estimator_ = clone(self.estimator).fit(X, y, **params.estimator.fit)

with the added complexity that:

  • Introducing SLEP6 also adds metadata to other methods such as transform, etc., which means these would be accepted and added to the signature of metaestimators such as Pipeline. If the user passes anything to those newly introduced parameters without enabling the feature flag, we raise.
  • Introducing SLEP6 also deprecates certain parameters of functions such as cross_val_score, e.g. fit_params -> metadata/params. This deprecation message and the switch will depend on how we decide on the transition as explained in the previous section.

We shall also implement this for the PRs for metaestimators that are already open, and apply this to the ones we've already merged into the feature branch (sample-props).
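For readers who want something they can run, here is a toy version of the pattern above that also covers the first bullet (raising when metadata is passed to a newly added parameter while the flag is off). Everything in it is a stand-in: `CONFIG`, `ToyScaler`, its `requests` dict, and `toy_process_routing` are hypothetical, and the filtering is a drastic simplification of what process_routing does.

```python
import copy

CONFIG = {"enable_slep6": False}  # hypothetical flag, off by default


class ToyScaler:
    # Hypothetical declaration of which metadata each method wants.
    requests = {"fit": {"sample_weight"}, "transform": {"factor"}}

    def fit(self, X, y=None, sample_weight=None):
        self.sample_weight_ = sample_weight
        return self

    def transform(self, X, factor=1.0):
        return [x * factor for x in X]


def toy_process_routing(estimator, method, params):
    """Simplified stand-in: keep only the metadata the estimator requested."""
    requested = estimator.requests.get(method, set())
    return {key: value for key, value in params.items() if key in requested}


class ToyMetaEstimator:
    def __init__(self, estimator):
        self.estimator = estimator

    def fit(self, X, y=None, **fit_params):
        if CONFIG["enable_slep6"]:
            params = toy_process_routing(self.estimator, "fit", fit_params)
        else:
            # Backward compatible semantics: forward everything as before.
            params = fit_params
        self.estimator_ = copy.deepcopy(self.estimator).fit(X, y, **params)
        return self

    def transform(self, X, **params):
        # Metadata in transform is new, so it is only accepted with the flag on.
        if params and not CONFIG["enable_slep6"]:
            raise ValueError("Passing metadata to transform requires enable_slep6=True.")
        routed = toy_process_routing(self.estimator_, "transform", params)
        return self.estimator_.transform(X, **routed)
```

With the flag off, `fit` behaves exactly as before while `transform(X, factor=2.0)` raises; with the flag on, both methods go through the routing helper.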

Third party developers

They can decide if they want to follow our way of introducing the new syntax, or to introduce it with a breaking change. For them this introduces an added complexity. They can ideally tie things to how we change settings in our config.
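One way a third-party library could tie its behaviour to our config is to read the flag defensively, as sketched below. `sklearn.get_config()` returns a dict, so `.get` with a default keeps this working on versions that do not know the flag. The helper name `routing_enabled` is hypothetical, and `enable_slep6` is only a placeholder key taken from the candidates above; the key that is finally shipped may differ.

```python
def routing_enabled(flag="enable_slep6"):
    """Return True if the (placeholder) routing flag is set in scikit-learn's config.

    Falls back to False when scikit-learn is missing or does not define the flag,
    so callers keep their old behaviour by default.
    """
    try:
        import sklearn
    except ImportError:
        return False
    return bool(sklearn.get_config().get(flag, False))


def fit_with_optional_routing(fit_new, fit_legacy, *args, **kwargs):
    """Dispatch to the new or the legacy code path based on the flag."""
    if routing_enabled():
        return fit_new(*args, **kwargs)
    return fit_legacy(*args, **kwargs)
```

Defaulting to the legacy path mirrors the off-by-default choice discussed in the "Default value" section.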


This came as a result of the meeting we held on 31.03.2023: #25776 (comment)

cc @scikit-learn/core-devs

@github-actions github-actions bot added the Needs Triage Issue requires triage label Apr 1, 2023
@adrinjalali adrinjalali added API and removed Needs Triage Issue requires triage labels Apr 1, 2023
@haiatn
Contributor

haiatn commented Jul 29, 2023

Can this be closed due to #26103 being merged?

@adrinjalali
Member Author

We have introduced the feature, but we still haven't enabled it by default, and the rest of the deprecation is still pending. This will take a couple of years to finish.
