MNT Param validation: Add a common test for param validation of public functions #23514
This is something that we should probably address when refactoring the common tests. For the moment, IMO it is fine.
A couple of comments but overall I think this is fine.
LGTM
…ad of having it separated from the func
@glemaitre the code has changed significantly after IRL discussions. You might want to take another look.
LGTM after resolving the coverage.
# The dict of parameter constraints is set as an attribute of the function
# to make it possible to dynamically introspect the constraints for
# automatic testing.
setattr(func, "_skl_parameter_constraints", parameter_constraints)
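To make the purpose of this attribute concrete, here is a minimal sketch, not the scikit-learn implementation. It assumes constraints are plain Python types checked with `isinstance`, and `repeat` and `check_param_validation` are hypothetical names used only for illustration. It shows how a generic check can read the constraints back from the decorated function.

```python
import functools
from inspect import signature


def validate_params(parameter_constraints):
    """Toy decorator: constraints map a parameter name to an accepted type."""

    def decorator(func):
        # Same idea as in the diff: record the constraints on the function.
        setattr(func, "_skl_parameter_constraints", parameter_constraints)

        # functools.wraps copies func.__dict__, so the wrapper exposes
        # the attribute as well.
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            bound = signature(func).bind(*args, **kwargs)
            bound.apply_defaults()
            for name, value in bound.arguments.items():
                allowed = parameter_constraints.get(name)
                if allowed is not None and not isinstance(value, allowed):
                    raise TypeError(
                        f"{name!r} of {func.__qualname__} must be an "
                        f"instance of {allowed}, got {value!r}"
                    )
            return func(*args, **kwargs)

        return wrapper

    return decorator


@validate_params({"label": str, "n": int})
def repeat(label, n=2):
    return label * n


def check_param_validation(func, valid_kwargs):
    """Hypothetical common check: list the parameters that reject a bad value."""
    constraints = func._skl_parameter_constraints
    # Every constrained name must be a real parameter of the function.
    assert set(constraints) <= set(signature(func).parameters)
    rejected = []
    for name in constraints:
        bad_kwargs = {**valid_kwargs, name: object()}
        try:
            func(**bad_kwargs)
        except TypeError:
            rejected.append(name)
    return rejected


print(check_param_validation(repeat, {"label": "ab", "n": 3}))
```

The check never needs to know which function it is given; everything it requires is discovered through the attribute.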
Now that we are storing the constraints in the function itself, I'd prefer the function to become a "callable class", i.e. a class that defines `__call__`.
To me, using `setattr` on a function feels like a hack.
I think there's an issue with that approach. The decorator works with functions and methods. If the decorator returns an object I think it can't work as expected to replace a method. But maybe I just don't know how to do it. Here's what I came up with:
import functools
from inspect import signature


def validate_params(parameter_constraints):
    def decorator(func):
        @functools.wraps(func, updated=())
        class wrapper:
            def __init__(self):
                self._skl_parameter_constraints = parameter_constraints

            def __call__(self, *args, **kwargs):
                func_sig = signature(func)

                # Map *args/**kwargs to the function signature
                params = func_sig.bind(*args, **kwargs)
                params.apply_defaults()

                # ignore self/cls and positional/keyword markers
                to_ignore = [
                    p.name
                    for p in func_sig.parameters.values()
                    if p.kind in (p.VAR_POSITIONAL, p.VAR_KEYWORD)
                ]
                to_ignore += ["self", "cls"]
                params = {
                    k: v for k, v in params.arguments.items() if k not in to_ignore
                }

                validate_parameter_constraints(
                    self._skl_parameter_constraints,
                    params,
                    caller_name=func.__qualname__,
                )
                return func(*args, **kwargs)

        return wrapper()

    return decorator
Maybe someone has an idea?
Also, I'm not sure it's that much less hackish than setting an attribute on a function 😄
My understanding is that replacing a method by an object that is callable doesn't make it a method but makes it an attribute of the class that is callable. Thus, when calling it, `self` is not passed as first argument.
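The binding behaviour described here can be reproduced in a few lines. This is a minimal sketch with hypothetical names (`CallableWrapper`, `function_wrapper`, `Estimator`): a plain function stored on a class is a descriptor and gets bound to the instance, while an instance of a class that only defines `__call__` is not.

```python
class CallableWrapper:
    """Callable object without __get__: it is never bound to an instance."""

    def __init__(self, func):
        self.func = func

    def __call__(self, *args, **kwargs):
        return self.func(*args, **kwargs)


def function_wrapper(func):
    """Plain function wrapper: functions implement __get__, so binding works."""

    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)

    return wrapper


class Estimator:
    def __init__(self, value):
        self.value = value

    @function_wrapper
    def as_function(self):
        return self.value

    @CallableWrapper
    def as_object(self):
        return self.value


est = Estimator(3)
print(est.as_function())  # `self` is passed automatically

try:
    est.as_object()  # `self` is NOT passed: the wrapped method gets no argument
except TypeError as exc:
    print(f"TypeError: {exc}")

print(est.as_object(est))  # only works when the instance is passed by hand
```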
Discussing IRL with @jeremiedbb, it seems that making it callable makes things more complex. I would be inclined to keep setting the attribute on the function.
It is also possible to use the decorator on methods. I can imagine validating kernels (of Gaussian processes) or splitters, which are not proper estimators. So it could be handy to keep it possible to validate parameters for such classes.
I'm okay with leaving this as is. I think to actually use a "callable class" with a decorator, it would end up looking something like `available_if`, using descriptors:
scikit-learn/sklearn/utils/metaestimators.py
Line 143 in 2f787f4
def available_if(check):
On that note, can you check to make sure that the current implementation does not run into issues like #21344 which was fixed in #23077?
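For reference, a descriptor-based variant could look roughly like the sketch below. It is written in the spirit of the descriptor idea only and is not a copy of `available_if`; `ConstrainedCallable` and `ToySplitter` are hypothetical names, the validation step is left as a comment, and edge cases such as classmethods or pickling are not examined.

```python
from types import MethodType


class ConstrainedCallable:
    """Hypothetical callable class that also implements the descriptor protocol."""

    def __init__(self, func, parameter_constraints):
        self.func = func
        self._skl_parameter_constraints = parameter_constraints

    def __call__(self, *args, **kwargs):
        # Parameter validation would happen here before delegating.
        return self.func(*args, **kwargs)

    def __get__(self, obj, owner=None):
        if obj is None:
            # Accessed on the class: return the descriptor itself.
            return self
        # Accessed on an instance: bind it so that `obj` is passed as `self`.
        return MethodType(self, obj)


def validate_params(parameter_constraints):
    def decorator(func):
        return ConstrainedCallable(func, parameter_constraints)

    return decorator


class ToySplitter:
    def __init__(self, n_splits):
        self.n_splits = n_splits

    @validate_params({"n_samples": int})
    def get_n_splits(self, n_samples):
        return min(self.n_splits, n_samples)


splitter = ToySplitter(5)
print(splitter.get_n_splits(3))
print(ToySplitter.get_n_splits._skl_parameter_constraints)
```

The extra `__get__` is what restores method binding, at the cost of a second protocol to maintain compared with a single attribute on a function.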
Thanks for the hint, I forgot about `available_if`!
Let me try to implement this and we'll choose the best solution.
EDIT: this would only work on methods but no longer on functions. I guess there's no simple way to make it work on both; it would require implementing two versions of the decorator, making the whole thing a lot more complex. So I'm also keen on leaving the PR as is 😄
I also think the attribute solution is the easiest even if it populates the callables' namespaces.
Thank you, @jeremiedbb.
lgtm
…c functions (scikit-learn#23514) Co-authored-by: Julien Jerphanion <[email protected]>
This PR adds a test for checking param validation of public functions, similar to the one for testing estimators:
scikit-learn/sklearn/utils/estimator_checks.py
Line 4042 in 8ea2997
Usually we define a list of all the functions/estimators we want tested and comment them all out, but here it was not very clear to me which functions are really considered public, so I chose to define an empty list that we can fill incrementally. We can still switch to the other option later.
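As an illustration of the "empty list filled incrementally" idea, here is a minimal sketch. The names `FUNCTIONS_TO_CHECK`, `load_function` and `find_unvalidated` are hypothetical and not taken from the PR; functions are identified by dotted path and the check only looks for the `_skl_parameter_constraints` attribute.

```python
from importlib import import_module

# Hypothetical registry of dotted paths. It starts empty and grows as
# functions get decorated with validate_params.
FUNCTIONS_TO_CHECK = []


def load_function(dotted_path):
    """Import 'package.module.func' and return the function object."""
    module_name, func_name = dotted_path.rsplit(".", 1)
    return getattr(import_module(module_name), func_name)


def find_unvalidated(dotted_paths):
    """Return the paths whose function carries no recorded constraints."""
    return [
        path
        for path in dotted_paths
        if not hasattr(load_function(path), "_skl_parameter_constraints")
    ]


# An empty registry passes trivially, so the test can be merged first
# and entries added one function at a time afterwards.
print(find_unvalidated(FUNCTIONS_TO_CHECK))
```

In a pytest suite, the same list could be fed to a parametrized test so that each entry is reported separately.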
I did not find an obvious existing test file for this test, so I added a new one in `sklearn/tests`. Maybe I missed an obvious location?