RFC SLEP006: allow users to enable a "strict" mode in metadata routing #23920
Comments
Personally, I would prefer the final API to raise an error. In your example:

    pipeline = make_pipeline(LogisticRegression().set_fit_request(sample_weight=True)).fit(X, y)
    pipeline = make_pipeline(LogisticRegression()).fit(X, y)

That feels weird to me. Do you think the final API should be strict by default?
I don't mind having it strict by default, but I might be missing some edge cases. Do you have any objection to a default strict mode in mind, @jnothman?
Basically, by not being strict we're allowing someone to provide us with a canned composite estimator with By saying
Some options out there:
I'm personally happy with either of the above solutions.
I agree that this example is weird, and to make it even more pointed, the two below are identical:

    pipeline = make_pipeline(LogisticRegression().set_fit_request(sample_weight=True)).fit(X, y)
    pipeline = make_pipeline(LogisticRegression().set_fit_request(sample_weight=False)).fit(X, y)

I personally would prefer some explicit parameter that will raise.

To illustrate with an example, suppose someone is investigating the hypothesis of: "does

Research code is often hasty and you might expect that setting

Of course this is an example of user error, but sklearn is very good at validating correctness, and the expectations for this new feature would be no different. From experience, experimentation code in machine learning research is riddled with these subtle bugs, and sklearn is great at informing users about them.

As for the exact semantics of specifying

One API recommendation, to limit new methods or new parameters, is to simply introduce a new value type that determines behaviour, namely

    if param["sample_weight"]:  # Relies on implicit "truthiness"
        # do...

The above proposal would not work in this case as

EDIT: The above API with the "strict" proposal would technically conflict with the semantics defined in 1.1.4. Advanced: Different scoring and fitting weights. It would be odd for someone to use it, but it means that simply checking for a
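The truthiness concern in this comment can be made concrete with a small sketch. It assumes a hypothetical `"strict"` request value sitting alongside `True`, `False`, and string aliases; `STRICT` and `resolve_request` are illustrative names, not part of scikit-learn. Because every non-empty string is truthy, a bare `if request:` cannot tell these cases apart, so the sketch dispatches on the value explicitly.

```python
STRICT = "strict"  # hypothetical sentinel value; not a scikit-learn constant


def resolve_request(request, name, provided):
    """Return the value to forward for `name`, or None to forward nothing.

    Toy model only. `request` is the per-metadata request value and
    `provided` is the dict of metadata the caller actually passed.
    """
    if request is False or request is None:
        return None  # not requested: never forwarded
    if request == STRICT:
        # Assumed strict semantics: requested metadata must be present.
        if name not in provided:
            raise ValueError(f"{name!r} was requested strictly but not provided")
        return provided[name]
    if request is True:
        return provided.get(name)  # lenient: silently skip if missing
    if isinstance(request, str):
        return provided.get(request)  # string alias: look up under another key
    raise TypeError(f"unsupported request value: {request!r}")


# All of these are truthy, so `if request:` treats them the same.
print([bool(v) for v in (True, "fit_weight", STRICT)])
```

Note that in this sketch a user alias that happens to be spelled `"strict"` would be read as the sentinel, which is the kind of collision the EDIT above points at.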
ref: https://github.com/scikit-learn/scikit-learn/pull/22986/files#r862344847
Along the way, we've discussed whether we should raise if metadata is requested but not provided.
Mirroring the existing behavior, SLEP006 and the implementation won't raise in the following case:
Should we allow a strict mode, either globally or on the estimator level, which would make the above code raise?
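As a rough illustration of the question, the toy function below models routing with an optional strict check. It is a sketch under stated assumptions, not scikit-learn's implementation: `route_metadata`, its `strict` flag, and the dict-based request format are hypothetical names invented here.

```python
def route_metadata(requests, provided, strict=False):
    """Toy model of metadata routing (hypothetical; not scikit-learn's API).

    requests: dict mapping metadata name -> True if the estimator requests it
    provided: dict of metadata the caller actually passed to fit
    strict:   if True, raise when requested metadata is missing
    """
    requested = [name for name, wanted in requests.items() if wanted is True]
    missing = [name for name in requested if name not in provided]
    if strict and missing:
        raise ValueError(f"requested but not provided: {missing}")
    # Non-strict behaviour: forward whatever was both requested and provided.
    return {name: provided[name] for name in requested if name in provided}


# Non-strict: the missing sample_weight is silently dropped.
print(route_metadata({"sample_weight": True}, {}))

# Strict: the same call raises instead.
try:
    route_metadata({"sample_weight": True}, {}, strict=True)
except ValueError as exc:
    print("strict mode raised:", exc)
```

Whether such a flag would live in global configuration or on each estimator is exactly the open question here; the sketch takes no position on that.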