ENH Deprecates _pairwise attribute and adds pairwise to estimator tags #18143
Thank you, I think this is an important cleanup.
I agree this should be deprecated.
Updated PR to deprecate the |
LGTM
sklearn/base.py
Outdated
@@ -37,6 +37,7 @@
     'binary_only': False,
     'requires_fit': True,
     'requires_y': False,
+    'pairwise': False
-    'pairwise': False
+    'pairwise': False,
To have one diff less for the next tag :)
sklearn/kernel_ridge.py
Outdated
     @property
     def _pairwise(self):
-        return self.kernel == "precomputed"
+        return self.kernel == 'precomputed'
Double quotes were fine :)
similar methods consists of pairwise measures over samples rather than a
feature representation for each sample. It is usually `True` where an
estimator has a `metric` or `affinity` or `kernel` parameter with value
'precomputed'. Its primary purpose is that when a :term:`meta-estimator`
when using cross-validation or when a meta-estimator?
I don't think GridSearchCV is the first meta-estimator people think of and I think cross_val_score etc are an important use-case.
I don't think the deprecation is correct. What we want for a third party is that if they implement |
It doesn't hurt to deprecate the attribute too, but I agree that deprecating the behaviour is the main thing.
I updated the estimator_checks to actually use the tag now. This means some of the estimator checks will fail if the tag is not set correctly.
If a third party estimator inherits from

    if hasattr(estimator, '_pairwise'):
        pairwise = estimator._pairwise
    else:
        pairwise = estimator._get_tags().get("pairwise", False)

which would be backward compatible with third party estimators. The downside of this approach is that third party estimators that use
Isn't it something that we want to avoid as part of the deprecation? I mean that the code of third-party will fail if they use |
Uhm actually, we have the default flag so the issue would only be if one does not inherit from |
Should we make a warning every time a call to I think the issue is that |
I had discussed this with @thomasjpfan two days ago, not sure if that was before or after he commented. My suggestion is to check in
The last one is maybe a bit surprising but I think should be fine? We need to catch any deprecation warnings raised by accessing This requires that all our estimators implement the tag, and backwards-compatibility for the estimators (in case someone else implemented cross-validation, maybe a bit far-fetched but whatever) dictates that we still have the |
Sounds good.
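For readers following the thread, the lookup discussed above could be sketched roughly as follows. This is only an illustration, not the helper that was merged: the name `is_pairwise_sketch`, the two toy classes, and the warning text are hypothetical, and the choice to prefer the attribute when it disagrees with the tag is an assumption drawn from the backward-compatibility concern raised in this conversation.

```python
import warnings


def is_pairwise_sketch(estimator):
    """Return True if `estimator` expects pairwise (sample-by-sample) input.

    Hypothetical helper for illustration; not the merged implementation.
    """
    # Silence any deprecation warning the estimator raises when the old
    # attribute is accessed, so callers are not warned on every lookup.
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", category=FutureWarning)
        has_attribute = hasattr(estimator, "_pairwise")
        attribute_value = getattr(estimator, "_pairwise", False)

    # Assumption: objects without _get_tags fall back to the default tag.
    get_tags = getattr(estimator, "_get_tags", None)
    tag_value = get_tags().get("pairwise", False) if get_tags else False

    if has_attribute:
        if attribute_value != tag_value:
            # Assumption: keep honouring the attribute during deprecation,
            # but tell the author that the tag is out of sync.
            warnings.warn(
                "_pairwise and the pairwise tag disagree; using _pairwise.",
                FutureWarning,
            )
        return attribute_value
    return tag_value


class LegacyEstimator:
    _pairwise = True  # old-style attribute only, no tag


class TaggedEstimator:
    def _get_tags(self):
        return {"pairwise": True}  # new-style tag only
```

With this sketch, `LegacyEstimator()` is still treated as pairwise but triggers a `FutureWarning`, while `TaggedEstimator()` is treated as pairwise silently.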
looks good apart from nitpick for test.
doc/developers/develop.rst
Outdated
@@ -226,6 +226,11 @@ the dataset, e.g. when ``X`` is a precomputed kernel matrix. Specifically,
the :term:`_pairwise` property is used by ``utils.metaestimators._safe_split``
to slice rows and columns.

.. deprecated:: 0.24

    The _pairwise attribute is deprecated in 0.24. From 0.26 and onward,
-The _pairwise attribute is deprecated in 0.24. From 0.26 and onward,
+The _pairwise attribute is deprecated in 0.24. From 0.26 onward,
doc/glossary.rst
Outdated
.. deprecated:: 0.24

    The _pairwise attribute is deprecated in 0.24. From 0.26
    and onward, the `pairwise` estimator tag should be used
-and onward, the `pairwise` estimator tag should be used
+onward, the `pairwise` estimator tag should be used
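As a rough illustration of what the deprecation note asks of estimator authors, a third-party estimator could expose the tag instead of the attribute along these lines. The class name is hypothetical, and the sketch assumes the `_more_tags` hook that scikit-learn estimators of this era use to declare tags; in real code the class would normally inherit from `BaseEstimator` so that `_get_tags()` merges this value with the defaults.

```python
class PrecomputedKernelEstimator:
    """Hypothetical third-party estimator; not part of scikit-learn."""

    def __init__(self, kernel="linear"):
        self.kernel = kernel

    def _more_tags(self):
        # New style: report pairwise input through the estimator tags
        # instead of defining a `_pairwise` property.
        return {"pairwise": self.kernel == "precomputed"}
```

The tag stays dynamic, so it follows the `kernel` parameter exactly as the old property did.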
sklearn/manifold/_mds.py
Outdated
     @property
     def _pairwise(self):
-        return self.kernel == "precomputed"
+        return self.metric == "precomputed"
was this a bug?!
@@ -189,7 +195,7 @@ def _safe_split(estimator, X, y, indices, train_indices=None):
         Indexed targets.
 
     """
-    if getattr(estimator, "_pairwise", False):
+    if _is_pairwise(estimator):
This here is the main usage, right? I think it might be nice to have a direct test of this, or of cross_validate or anything like that.
Right now you're only testing the helper (extensively) but you're not testing anything that actually uses the helper, right?
It might be a bit overkill to test all of these places but maybe one?
This was done in cfeaa09
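To make concrete why the pairwise flag matters at this call site, here is a minimal sketch of slicing rows and columns of a precomputed matrix during a train/test split. It is an assumption-laden illustration: `safe_split_sketch` is a hypothetical name, it ignores `y`, and it uses plain nested lists to stay dependency-free, whereas the real `_safe_split` works on arrays.

```python
def safe_split_sketch(X, indices, train_indices=None, pairwise=False):
    """Select the samples in `indices` from X, given as a list of rows."""
    if pairwise:
        # X is square (sample-by-sample). Rows are the requested samples;
        # columns are the samples the estimator was (or will be) fitted on.
        if train_indices is None:
            train_indices = indices
        return [[X[i][j] for j in train_indices] for i in indices]
    # Ordinary feature matrix: only rows are sliced.
    return [X[i] for i in indices]


# Toy 3x3 precomputed kernel matrix.
K = [
    [1.0, 0.2, 0.1],
    [0.2, 1.0, 0.5],
    [0.1, 0.5, 1.0],
]
train, test = [0, 2], [1]
K_train = safe_split_sketch(K, train, pairwise=True)
K_test = safe_split_sketch(K, test, train_indices=train, pairwise=True)
```

If the flag were wrongly `False`, only rows would be sliced and the training matrix would no longer be square, which is the failure mode a direct test of the cross-validation path would catch.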
do you wanna fix conflicts?
otherwise lgtm
# TODO: Remove in 0.26 when the _pairwise attribute is removed
def test_validation_pairwise():
    # Correctly warns with pairwise tags
Update this comment
Synced this PR up with master.
Reference Issues/PRs
Related to #17806
What does this implement/fix? Explain your changes.
Currently, this PR deprecates the _pairwise attribute and places it into estimator tags.