repeated k-fold #7948
Do you want to be repeating the *same* splits? In all of 0.17, 0.18.0,
0.18.1 and master this will produce different splits in each call to
cross_val_score, but not if random_state is an int. The difference in
0.18.0 is that in GridSearchCV each parameter would use different splits. I
think #7935 would change this to perform repeated KFold with the *same*
splits. These differences are clearly very subtle but important
semantically.
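To make the distinction concrete, here is a minimal sketch of how `random_state` affects the folds. It assumes the current `sklearn.model_selection` API rather than the specific releases listed above, and `fold_indices` is a hypothetical helper used only for this illustration.

```python
import numpy as np
from sklearn.model_selection import KFold

X = np.arange(40).reshape(20, 2)  # toy data with 20 samples


def fold_indices(cv):
    """Return the test indices of every fold as a list of lists."""
    return [test.tolist() for _, test in cv.split(X)]


# random_state=None: each call to split() draws a fresh shuffle,
# so two passes over the same object give different folds.
unseeded = KFold(n_splits=5, shuffle=True)
print(fold_indices(unseeded) == fold_indices(unseeded))

# Integer random_state: every call to split() reproduces the same folds.
seeded = KFold(n_splits=5, shuffle=True, random_state=0)
print(fold_indices(seeded) == fold_indices(seeded))
```

With an unseeded splitter the first comparison is almost surely `False`; with an integer seed the second is `True`, which is why repeating `cross_val_score` with a fixed integer seed just repeats the same evaluation.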
On 29 November 2016 at 10:26, Andreas Mueller wrote:
We should add repeated k-fold cross-validation. It's something a lot of
people use, and it's tricky to implement in scikit-learn. In particular
with the recent changes to cross-validation,
cv = KFold(shuffle=True)
results = []
for i in range(10):
    results.append(cross_val_score(est, X, y, cv=cv))
yields repeated cross-validation in 0.17 and 0.18, but not in 0.18.1 or
dev (I think?)
We could do this with a wrapper that sets the random_state or we could
explicitly write RepeatedKFold and RepeatedStratifiedKFold.
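The wrapper option could look roughly like the sketch below. `repeated_kfold_splits` and its `seed` parameter are hypothetical names made up for this illustration, not an existing scikit-learn API, and the code assumes the `sklearn.model_selection` version of `KFold`.

```python
import numpy as np
from sklearn.model_selection import KFold


def repeated_kfold_splits(X, n_splits=5, n_repeats=10, seed=0):
    """Yield (train, test) index arrays for n_repeats shuffled K-fold runs.

    Hypothetical helper: each repeat gets its own integer random_state,
    so folds differ between repeats but the whole sequence is reproducible.
    """
    for repeat in range(n_repeats):
        cv = KFold(n_splits=n_splits, shuffle=True, random_state=seed + repeat)
        yield from cv.split(X)


X = np.zeros((20, 3))  # toy data with 20 samples
splits = list(repeated_kfold_splits(X, n_splits=5, n_repeats=3))
print(len(splits))  # 15 splits: 5 folds x 3 repeats
```

Since `cross_val_score` accepts an iterable of (train, test) index pairs as `cv`, the resulting list can be passed to it directly.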
You are right, this is still the same as it has been in 0.17, but would change with #7935.
We could do this with a wrapper that sets the random_state or we could
explicitly write RepeatedKFold and RepeatedStratifiedKFold.
+1 for explicit RepeatedKFold and RepeatedStratifiedKFold.
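For reference, later scikit-learn releases do ship classes with these names in `sklearn.model_selection`. The following is a usage sketch against that later API, not the design being discussed in this thread; the iris data and `LogisticRegression` are arbitrary choices for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score

X, y = load_iris(return_X_y=True)

# 5 stratified folds, reshuffled and repeated 10 times.
cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=10, random_state=0)
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=cv)

print(scores.shape)  # (50,): one score per fold per repeat
```

An explicit splitter keeps the repetition inside the `cv` object, so it also works unchanged with anything else that takes a `cv` argument.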
Yeah that's where I'm leaning to, too.
I would like to work on this one.
@neerajgangwar go ahead!
@amueller I pushed initial commit. Can you please check and see if I am going in the right direction? :)