FEA Add RepeatedStratifiedGroupKFold as a new splitter #24227
No worries, I can address the review @glemaitre |
I think for this it makes sense to have a |
@adrinjalali I think that we can add the class and the deprecation can come as a whole where we move into parameters instead of separate class. |
In general, it looks good. Just a couple of nitpicks. |
It seems silly to me to introduce a class which we already know we'd like to deprecate, while we can simply add an arg to |
@@ -1242,6 +1253,39 @@ def test_repeated_stratified_kfold_determinstic_split():
next(splits)


def test_repeated_stratified_group_kfold_determinstic_split():
I think that we should have an additional test to be sure that we don't get the same split when we use a different random state.
There is also a typo in "deterministic".
Was this what you had in mind, @glemaitre?
Fixed the typo (it was also misspelled in test_repeated_stratified_kfold_deterministic_split, so I fixed it there, too).
X = [[1, 2], [3, 4], [5, 6], [7, 8], [9, 10]]
y = [1, 1, 1, 0, 0]
groups = [0, 0, 1, 1, 1]
random_state = 1944695409
We could use 0 rather than a fancy number.
we use 0 and 1 now.
* Bump versionadded
* Fixed two typos of deterministic in tests
* Added multiple random_state parameters to tests
I like the functionality, but as I said before, I'd rather not introduce it via a new class.
Thanks for the feedback. Is there a PR or issue discussing deprecating the |
As for repeating, I agree that But generally, we also want to have a And generally, we can have a single |
closes #24247
Reference Issues/PRs
This functionality was discussed in #13621.
What does this implement/fix? Explain your changes.
This adds a splitter class to model_selection that repeats StratifiedGroupKFold n times.
Any other comments?