StratifiedShuffleSplit still buggy #6471
Comments
I can add a few more test cases to this. With the following data
All manner of crazy things happen
I'm not that bothered about getting 1 extra instance in the test or training set, but getting 10x the amount of data that was requested is not optimal.
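For anyone who wants to check their own splits for this kind of problem, here is a minimal sketch of a sanity check. The helper name `check_split_sizes` and its `tolerance` parameter are hypothetical (not part of scikit-learn), and the index lists in the demo are made up for illustration rather than taken from the data in this report.

```python
def check_split_sizes(train_idx, test_idx, n_samples, test_size, tolerance=1):
    """Return a list of human-readable problems found in one train/test split.

    Hypothetical helper, not a scikit-learn function. `test_size` may be a
    float (fraction of n_samples) or an int (absolute number of samples).
    """
    problems = []

    # Work out how many test samples were requested.
    if isinstance(test_size, float):
        expected_test = round(test_size * n_samples)
    else:
        expected_test = test_size

    # Allow an off-by-one (rounding), but flag anything larger.
    if abs(len(test_idx) - expected_test) > tolerance:
        problems.append(
            "test set has %d samples, expected about %d"
            % (len(test_idx), expected_test)
        )

    # Train and test indices should never share a sample.
    overlap = set(train_idx) & set(test_idx)
    if overlap:
        problems.append("%d indices appear in both sets" % len(overlap))

    return problems


# Made-up index lists: a well-behaved split and an oversized one.
good = check_split_sizes(list(range(18)), [18, 19], n_samples=20, test_size=0.1)
bad = check_split_sizes(list(range(10)), list(range(20)), n_samples=20, test_size=0.1)
print(good)
print(bad)
```

In practice the index arrays would come from the splitter itself, e.g. by iterating over `StratifiedShuffleSplit(...).split(X, y)` from `sklearn.model_selection` in current releases, and calling the check on each `(train, test)` pair.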
Right, so the problem is this line
https://github.com/scikit-learn/scikit-learn/blob/51a765a/sklearn/cross_validation.py#L1005 If This seems to have already been fixed on master though.
Sorry, I may have missed something here. Have you found a bug that has not already been fixed in master or that was not the one for which the issue was created?
@lesteve no, after adding the comments I noticed that on the upstream master the issue has been fixed, so this should probably be closed. The link above is to the 0.17 sources (linked to from the API docs). Sorry for the confusion.
This is a follow-up on #6379 and a sign that the logic is too complex. I'm doing a rewrite to fix this bug, but it is not as clean as I'd like.