-
-
Notifications
You must be signed in to change notification settings - Fork 26.4k
ENH Adds warning & docs for splitters that do not support goups #28210
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ENH Adds warning & docs for splitters that do not support goups #28210
Conversation
| ) | ||
| digits = load_digits() | ||
|
|
||
| pytestmark = pytest.mark.filterwarnings( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is a bit weird for me to have this filter globally set for the file.
We should have our test catching the warning to make sure that we raise it when we expect or otherwise, we should raise an error.
Is there a particular reason to go fine grain here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I converted this pytestmark to error instead and updated the tests.
sklearn/model_selection/_split.py
Outdated
|
|
||
| class BaseShuffleSplit(_MetadataRequester, metaclass=ABCMeta): | ||
| """Base class for ShuffleSplit and StratifiedShuffleSplit.""" | ||
| """Base class for ShuffleSplit and StratifiedShuffleSplit. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and the GroupShuffleSplit as well. Maybe "*ShuffleSplit is enough.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it makes sense. I was actually not aware of the predefined splitter. I think that we don't document it in our API documentation.
I'll open a PR.
We do have it documented.
adrinjalali
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a secret plan to make all splitters support group and strata (#26821) to all splitters and to simply the interface a bit.
But in the meantime this is a nice fix.
glemaitre
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks @thomasjpfan
…it-learn#28210) Co-authored-by: Loïc Estève <[email protected]>
Reference Issues/PRs
Closes #22848
What does this implement/fix? Explain your changes.
This PR:
_UnsupportedGroupCVMixinmixin that updates the docstrings of CV splitters that do not support groups. It also includes a warning if groups are passed in.splitmethod, this PR adds a warning there. ForPredefinedSplitterandTimeSeriesSplitter, it raises a warning duringsplitwhen the generator is created.Any other comments?
Implementation wise, I moved
ShuffleSplit._iter_indicestoBaseShuffleSplit._iter_indicies, so thatGroupShuffleSplitcan directly inherit fromBaseShuffleSplit. This wayGroupShuffleSplitis not inheriting from a splitter that does not support groups.