MNT Param validation: Add a common test for param validation of public functions #23514

jeremiedbb · 2022-06-01T14:54:19Z

This PR adds a test for checking param validation of public functions, similar to the one for testing estimators

scikit-learn/sklearn/utils/estimator_checks.py

Line 4042 in 8ea2997

def check_param_validation(name, estimator_orig):

Usually we define a list of all the functions/estimators we want to be tested and comment them all out, but here I did not find it very clear which functions are really considered public, so I chose to define an empty list that we can fill incrementally. We can still switch to the other option later.

I did not find an obvious existing test file for this test, so I added a new one in sklearn/tests. Maybe I missed an obvious location?
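The incrementally filled list described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the names PARAM_VALIDATION_FUNCTION_LIST, resolve and functions_missing_constraints are hypothetical, and the attribute name _skl_parameter_constraints is borrowed from the diff discussed later in this thread rather than from the first version of the PR.

```python
from importlib import import_module

# Hypothetical registry: dotted paths of the public functions whose parameters
# should be validated. It starts empty and is meant to be filled incrementally.
PARAM_VALIDATION_FUNCTION_LIST = []

# Attribute name taken from the diff quoted later in the conversation.
CONSTRAINTS_ATTR = "_skl_parameter_constraints"


def resolve(dotted_path):
    """Import and return the object designated by 'package.module.name'."""
    module_name, _, func_name = dotted_path.rpartition(".")
    return getattr(import_module(module_name), func_name)


def functions_missing_constraints(dotted_paths):
    """Return the paths whose function does not expose any constraints."""
    return [
        path
        for path in dotted_paths
        if not hasattr(resolve(path), CONSTRAINTS_ATTR)
    ]
```

With an empty registry the check passes trivially, which is what allows functions to be added one at a time as their validation is implemented.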

glemaitre · 2022-06-03T06:17:17Z

I did not find an obvious existing test file for this test, so I added a new one in sklearn/tests. Maybe I missed an obvious location?

This is something that we should probably address when refactoring the common tests. For the moment, IMO it is fine.

glemaitre

A couple of comments but overall I think this is fine.

sklearn/tests/test_public_functions.py

sklearn/utils/_param_validation.py

sklearn/utils/tests/test_param_validation.py

sklearn/utils/_param_validation.py

sklearn/utils/tests/test_param_validation.py

sklearn/tests/test_public_functions.py

sklearn/cluster/_kmeans.py

sklearn/tests/test_public_functions.py

glemaitre

LGTM

jeremiedbb · 2022-06-08T12:30:27Z

@glemaitre the code has changed significantly after IRL discussions. You might want to take another look.

jjerphan

LGTM after resolving the coverage.

thomasjpfan · 2022-06-08T13:44:06Z

sklearn/utils/_param_validation.py

+        # The dict of parameter constraints is set as an attribute of the function
+        # to make it possible to dynamically introspect the constraints for
+        # automatic testing.
+        setattr(func, "_skl_parameter_constraints", parameter_constraints)


Now that we are storing the constraints in the function itself, I'd prefer that the function become a "class that is callable", i.e. a class that defines __call__.

To me, using setattr on a function feels like a hack.

I think there's an issue with that approach. The decorator works with functions and methods. If the decorator returns an object I think it can't work as expected to replace a method. But maybe I just don't know how to do it. Here's what I came up with:

    def validate_params(parameter_constraints):
        def decorator(func):
            @functools.wraps(func, updated=())
            class wrapper:
                def __init__(self):
                    self._skl_parameter_constraints = parameter_constraints

                def __call__(self, *args, **kwargs):
                    func_sig = signature(func)

                    # Map *args/**kwargs to the function signature
                    params = func_sig.bind(*args, **kwargs)
                    params.apply_defaults()

                    # ignore self/cls and positional/keyword markers
                    to_ignore = [
                        p.name
                        for p in func_sig.parameters.values()
                        if p.kind in (p.VAR_POSITIONAL, p.VAR_KEYWORD)
                    ]
                    to_ignore += ["self", "cls"]
                    params = {k: v for k, v in params.arguments.items() if k not in to_ignore}

                    validate_parameter_constraints(
                        self._skl_parameter_constraints, params, caller_name=func.__qualname__
                    )
                    return func(*args, **kwargs)

            return wrapper()

        return decorator

Maybe someone has an idea?

Also, I'm not sure it's much less hackish than setting an attribute on a function 😄

My understanding is that replacing a method with a callable object doesn't make it a method; it becomes a class attribute that happens to be callable. Thus, when it is called, self is not passed as the first argument.
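The behaviour described above can be reproduced with plain Python, independently of scikit-learn. In this minimal sketch, CallableWrapper, as_function and Estimator are hypothetical names used only for illustration: a plain function stored on a class is bound to the instance on attribute access, whereas an instance of a class that only defines __call__ is returned unchanged, so self never reaches the wrapped function.

```python
class CallableWrapper:
    """Callable object without __get__, so it is never bound to an instance."""

    def __init__(self, func):
        self.func = func

    def __call__(self, *args, **kwargs):
        return self.func(*args, **kwargs)


def as_function(func):
    """Wrap func in a plain function, which Python binds like any method."""

    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)

    return wrapper


class Estimator:
    def _describe(self):
        return "bound"

    via_object = CallableWrapper(_describe)
    via_function = as_function(_describe)


est = Estimator()

# The function wrapper receives the instance as its first argument.
function_result = est.via_function()

# The callable object is called with no arguments at all, so the wrapped
# function is missing `self` and raises a TypeError.
try:
    est.via_object()
    object_error = None
except TypeError as exc:
    object_error = exc
```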

Discussing IRL with @jeremiedbb, it seems that making it callable makes things more complex. I would be inclined to keep setting the attribute on the function.

The decorator can also be used on methods. For instance, we could validate kernels (of Gaussian processes) or splitters, which are not proper estimators. So it would be handy to keep it possible to validate parameters for such classes.

I'm okay with leaving this as is. I think that to actually use a "class callable" with a decorator, it would end up looking something like available_if and using descriptors:

scikit-learn/sklearn/utils/metaestimators.py

Line 143 in 2f787f4

def available_if(check):

On that note, can you check to make sure that the current implementation does not run into issues like #21344 which was fixed in #23077?
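For readers unfamiliar with the descriptor mechanism mentioned above, here is a minimal sketch of how an object can define __get__ so that it binds to an instance when accessed as a method. BindingWrapper and Kernel are hypothetical names; this is not the available_if implementation and not what the PR adopted, which kept the attribute on the function.

```python
from types import MethodType


class BindingWrapper:
    """Callable object that implements __get__ to bind like a method."""

    def __init__(self, func):
        self.func = func

    def __call__(self, *args, **kwargs):
        return self.func(*args, **kwargs)

    def __get__(self, obj, owner=None):
        # Accessed on the class itself: return the wrapper unchanged.
        if obj is None:
            return self
        # Accessed on an instance: bind it so `obj` is passed as first argument.
        return MethodType(self, obj)


class Kernel:
    def __init__(self, scale):
        self.scale = scale

    @BindingWrapper
    def scaled(self, x):
        return self.scale * x


result = Kernel(2).scaled(3)
```

The extra __get__ logic is the kind of added complexity the thread weighs against simply setting an attribute on the function.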

Thanks for the hint, I forgot about available_if!
~~Let me try to implement this and we'll choose the best solution~~ EDIT: this would only work on methods but no longer on functions. I guess there's no simple way to make it work on both; it would require implementing two versions of the decorator, making the whole thing a lot more complex. So I'm also keen on leaving the PR as is 😄

I also think the attribute solution is the easiest even if it populates the callables' namespaces.
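The attribute-based approach favoured in this thread can be illustrated with a simplified stand-in. In the sketch below, the constraint format (a tuple of accepted types per parameter) and the toy_cluster function are assumptions made for the example; the real validate_params delegates to validate_parameter_constraints and supports richer constraints. Only the attribute name _skl_parameter_constraints comes from the diff above.

```python
import functools
from inspect import signature


def validate_params(parameter_constraints):
    """Toy decorator: constraints map parameter names to accepted types."""

    def decorator(func):
        # Expose the constraints on the function so that a common test can
        # introspect them, as in the diff quoted above.
        setattr(func, "_skl_parameter_constraints", parameter_constraints)

        # functools.wraps also copies func.__dict__, so the attribute set
        # above is visible on the returned wrapper as well.
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            bound = signature(func).bind(*args, **kwargs)
            bound.apply_defaults()
            for name, value in bound.arguments.items():
                if name in ("self", "cls") or name not in parameter_constraints:
                    continue
                allowed = parameter_constraints[name]
                if not isinstance(value, allowed):
                    raise ValueError(
                        f"The {name!r} parameter of {func.__qualname__} must be "
                        f"an instance of {allowed}. Got {value!r} instead."
                    )
            return func(*args, **kwargs)

        return wrapper

    return decorator


# Hypothetical public function used only to exercise the decorator.
@validate_params({"n_clusters": (int,), "tol": (int, float)})
def toy_cluster(X, n_clusters=8, tol=1e-4):
    return n_clusters
```

Because the wrapper is a plain function, the same decorator also binds normally when applied to a method, which is the property the callable-class variant lacked.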

sklearn/utils/_param_validation.py

jjerphan

Thank you, @jeremiedbb.

jjerphan · 2022-06-14T08:46:24Z

I also think the attribute solution is the easiest even if it populates the callables' namespaces.

glemaitre · 2022-06-20T14:16:33Z

lgtm

jeremiedbb added 4 commits May 30, 2022 17:02

wip

91580c3

add common test for testing param validation of functions

9c2ae21

Merge remote-tracking branch 'upstream/main' into test-public-functio…

6d8fe96

…ns-validate-params

improve docstring

7051a1f

github-actions bot added module:cluster module:utils labels Jun 1, 2022

jeremiedbb added the No Changelog Needed label Jun 1, 2022

jeremiedbb mentioned this pull request Jun 1, 2022

MNT Use _validate_params in NMF and MiniBatchNMF #23463

Merged

glemaitre reviewed Jun 3, 2022

jjerphan reviewed Jun 3, 2022

sklearn/cluster/_kmeans.py (outdated, resolved)

address review comments

4e5b80a

glemaitre reviewed Jun 8, 2022

sklearn/tests/test_public_functions.py (outdated, resolved)

glemaitre approved these changes Jun 8, 2022

jeremiedbb added 4 commits June 8, 2022 11:55

apply suggestion

f61f492

fix

021747e

make the decorator set the constraints as attribute of the func inste…

a485e59

…ad of having it separated from the func

lint

6e3852b

jjerphan reviewed Jun 8, 2022

thomasjpfan reviewed Jun 8, 2022

glemaitre reviewed Jun 8, 2022

sklearn/utils/_param_validation.py (outdated, resolved)

jeremiedbb added 3 commits June 9, 2022 09:57

improve coverage and fix

4822dc3

Merge remote-tracking branch 'upstream/main' into test-public-functio…

01ff905

…ns-validate-params

Merge remote-tracking branch 'upstream/main' into test-public-functio…

60ce2eb

…ns-validate-params

jeremiedbb added the Validation related to input validation label Jun 13, 2022

jjerphan approved these changes Jun 14, 2022

Merge branch 'main' into test-public-functions-validate-params

7a04da8

glemaitre merged commit cc6806b into scikit-learn:main Jun 20, 2022

ogrisel pushed a commit to ogrisel/scikit-learn that referenced this pull request Jul 11, 2022

MNT Param validation: Add a common test for param validation of publi…

ea92c5e

…c functions (scikit-learn#23514) Co-authored-by: Julien Jerphanion <[email protected]>
