It is possible to specify that a method will raise an error if some metadata is provided, but there is no converse option to raise an error if it is not provided.
I've also read SLEP006 and the milestone issue #22893 and could not find a mention of this.
I am okay with this if that's the case but I would rather it be explicitly stated somewhere in the linked docs that this is not a feature that's supported.
Reason for clarification:
Testing a custom evaluator that routes metadata fails with a ValueError when it uses a GroupKFold splitter and groups= was never specified. I would have expected this to raise some specific XXXMetaDataError, similar to UnsetMetadataPassedError, which is raised when some metadata is passed that is not required.
import pytest
from sklearn.model_selection import GroupKFold

def test_custom_evaluator_forwards_splitter_params_correctly():
    custom_evaluator = CustomEvaluator(..., splitter=GroupKFold(...), params={})

    # Failing due to required metadata (an exceptional case of metadata for Group* splitters)
    # However it's a generic `ValueError` and not a MetaData kind of error.
    with pytest.raises(ValueError, match="The 'groups' parameter should not be None."):
        custom_evaluator.blub(...)

    # All good, parameters specified
    custom_evaluator = CustomEvaluator(..., splitter=GroupKFold(...), params={"groups": groups})
    custom_evaluator.blub(...)
Naturally, I wanted to test if this works for an estimator too.
import pytest
from sklearn.dummy import DummyClassifier

def test_custom_evaluator_forwards_estimator_params_correctly():
    estimator = DummyClassifier()

    # True, False, None, "sample_weight" can't indicate that this **needs** sample weights
    estimator.set_fit_request(sample_weight=...)
    custom_evaluator = CustomEvaluator(estimator, params={})

    # No error will be raised in any case, can't use this to test that sample_weight actually
    # got passed
    with pytest.raises(SomeMetaDataError):
        custom_evaluator.blub(...)

    # Will pass regardless, I just don't know if my CustomEvaluator actually did what it
    # was meant to
    custom_evaluator = CustomEvaluator(estimator, params={"sample_weight": sample_weight})
    custom_evaluator.blub(...)
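For illustration, here is a minimal sketch of the kind of "required metadata" check the two tests above are looking for, done on the caller's side. Both `MissingRequiredMetadataError` and `check_required_metadata` are hypothetical names invented for this example; nothing like them exists in scikit-learn as far as this issue is aware, and a custom evaluator would have to call such a helper itself before routing.

```python
class MissingRequiredMetadataError(ValueError):
    """Hypothetical error type; NOT part of scikit-learn."""


def check_required_metadata(params, required):
    """Raise if any name in `required` is absent (or None) in `params`.

    Hypothetical helper: `params` is the dict of metadata a caller passed,
    `required` is an iterable of metadata names that must be present.
    """
    missing = sorted(name for name in required if params.get(name) is None)
    if missing:
        raise MissingRequiredMetadataError(
            f"Required metadata not provided: {missing}"
        )
    return params


# Usage: the failure is now a dedicated error a test can match on.
try:
    check_required_metadata({}, required=["groups"])
except MissingRequiredMetadataError as exc:
    caught_message = str(exc)
```

With something along these lines, a test could use `pytest.raises(MissingRequiredMetadataError)` instead of matching on a generic `ValueError`. Whether such a check belongs in user code or in the routing machinery is exactly the open question here.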
Suggest a potential alternative/fix
If there is no way to specify that some metadata is required, then explicitly document this.
I would propose adding the following last bullet point to the documentation:
Here value can be:
* True: method requests a sample_weight. This means if the metadata is provided, it will be used, otherwise no error is raised.
* False: method does not request a sample_weight.
* None: router will raise an error if sample_weight is passed. This is in almost all cases the default value when an object is instantiated and ensures the user sets the metadata requests explicitly when a metadata is passed. The only exception are Group*Fold splitters.
* "param_name": if this estimator is used in a meta-estimator, the meta-estimator should forward "param_name" as sample_weight to this estimator. This means the mapping between the metadata required by the object, e.g. sample_weight and what is provided by the user, e.g. my_weights is done at the router level, and not by the object, e.g. estimator, itself.
# This line
# ------------------------
* It is not possible to indicate that a method **requires** metadata to be provided.
# ------------------------
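The four request values above can be summarised in a small toy model. This is a simplified sketch written only from the bullet points, not scikit-learn's actual routing code; `resolve_request` is a hypothetical function, and real routers perform additional validation that is omitted here.

```python
def resolve_request(name, request, provided):
    """Toy model: which kwargs get forwarded for one metadata `name`.

    `request` is the value given to set_*_request (True, False, None or an
    alias string); `provided` is the dict of metadata the user passed.
    Hypothetical helper, NOT scikit-learn's implementation.
    """
    if request is True:
        # Used if provided, silently skipped otherwise.
        return {name: provided[name]} if name in provided else {}
    if request is False:
        # Never forwarded.
        return {}
    if request is None:
        # Error only when the metadata IS passed.
        if name in provided:
            raise ValueError(
                f"{name!r} was passed but is not explicitly requested or rejected"
            )
        return {}
    if isinstance(request, str):
        # Alias: the user-facing key `request` is forwarded as `name`.
        return {name: provided[request]} if request in provided else {}
    raise TypeError(f"Unsupported request value: {request!r}")


# With nothing provided, every request value resolves to "forward nothing".
results_when_missing = [
    resolve_request("sample_weight", value, provided={})
    for value in (True, False, None, "my_weights")
]
```

In this model no branch raises when the metadata is absent, which is the gap the proposed bullet point is meant to document.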
eddiebergman changed the title from "[Question, Documentation] Metadata Routing, required metadata" to "[Question, Documentation] Metadata Routing, indicate metadata is **required** by a method" on Jan 31, 2024
eddiebergman changed the title from "[Question, Documentation] Metadata Routing, indicate metadata is **required** by a method" to "[Question, Documentation] Metadata Routing, indicate metadata is required by a method" on Jan 31, 2024
Describe the issue linked to the documentation
From my understanding, there is no way to specify that some metadata is required with set_*_request(...).
Doc: https://scikit-learn.org/stable/metadata_routing.html#api-interface