[MRG+1] Fix shuffle not passed in MLP #12582
Conversation
Please make each change a separate PR. Ideally add a test for the shuffle issue.
Alright, the test was added and the shuffle argument is working properly now.
    allow_unlabeled=False,
    random_state=0)

    # Ensure shuffling happens or doesn't happen when passed as an argument
"ensure shuffling parameter has some effect" is all you are testing
This is fair; the test could be more direct. The most direct effect of shuffle is on the resulting coefficients, so I rewrote the test to look at them instead.
I can't really think of a better way to test it than that, but I made sure this new test consistently works across different sample sizes and random states, so it is a genuine test of the shuffle behavior.
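For reference, a minimal sketch of what such a coefficient-based check could look like (the dataset and estimator settings below are illustrative, not necessarily those used in the actual test):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

# Two classifiers identical except for `shuffle`; both start from the same
# random initialization, so any difference in the fitted weights can only
# come from the order in which the minibatches visit the samples.
X, y = make_classification(n_samples=100, random_state=0)
mlp_shuffled = MLPClassifier(hidden_layer_sizes=(10,), solver='sgd',
                             batch_size=10, max_iter=5,
                             shuffle=True, random_state=0)
mlp_unshuffled = MLPClassifier(hidden_layer_sizes=(10,), solver='sgd',
                               batch_size=10, max_iter=5,
                               shuffle=False, random_state=0)
mlp_shuffled.fit(X, y)
mlp_unshuffled.fit(X, y)

# With the bug, `shuffle` was ignored and these weights came out identical.
assert not np.array_equal(mlp_shuffled.coefs_[0], mlp_unshuffled.coefs_[0])
```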
I didn't really have a problem with the test... I had a problem with the comment.
    mlp1.fit(X, y)
    mlp2.fit(X, y)

    assert not np.array_equal(mlp1.coefs_[0], mlp2.coefs_[0])
should we be testing that they are not even allclose?
I'm not sure of a good tolerance level for closeness here; they will be pretty close because the difference is essentially just random noise that comes from shuffling.
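To illustrate the distinction being discussed: np.array_equal fails on any difference at all, while np.allclose tolerates small ones, so asserting "not allclose" is the stronger requirement. A toy example (the values are made up):

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = a + 1e-9  # differs from `a` only by tiny noise

print(np.array_equal(a, b))  # False: any difference breaks exact equality
print(np.allclose(a, b))     # True: within the default rtol/atol they count as close
```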
I think this PR is complete unless there are additional concerns. The current test seems sufficient to me for testing whether the data is being shuffled.
@samwaterbury I was running into the shuffle issue, too, and just wanted to say thanks for addressing it.
jnothman left a comment:
Please add an entry to the change log at doc/whats_new/v0.21.rst. Like the other entries there, please reference this pull request with :issue: and credit yourself (and other contributors if applicable) with :user:
Sorry, I got sort of sidetracked; I should've circled back to this PR sooner. I just updated whats_new. I'm assuming the CircleCI failure for Python 2 can be ignored. Once this gets merged I will fix the merge conflicts in my other open PR #12605.
Thanks. Merging with master will make the CircleCI failure disappear. Another reviewer here for a simple bug fix?
I think it's too far down the open PR list for any other reviewers to see; maybe someone can be tagged?
@rth, a quick one?
rth left a comment:
Thanks @samwaterbury! LGTM.
The CircleCI failure is due to an outdated setup, but I don't think there is anything in this PR that could make it fail on master. Merging.
This reverts commit aa33ccd.
Reference Issues/PRs
Addresses #12505 (comment)
Alright, so I listed a few issues with MLPClassifier (which also affect MLPRegressor) in #12505. The most basic issue was that the shuffle argument was being ignored, which is fixed in this first commit. The rest of the issues I want some input on before I continue. I posted a long message in the original issue thread, but here's the tl;dr. They basically boil down to documenting the odd behavior vs. fixing it:
1. The method partial_fit is not documented as being incompatible with LBFGS; however, if you attempt to use it, it raises an error saying that it is incompatible. If I remove the code that raises this error, it works fine with LBFGS as far as I can tell. Unless anyone knows why we shouldn't allow LBFGS with partial_fit, I'd say we could remove that restriction. Otherwise, it should be documented.
2. When warm_start=True, the fit method breaks after a single iteration of training, just like partial_fit does, so the two perform identically. I don't understand why fit should stop after one iteration when warm_start=True. In my opinion it should be changed, but this would change its behavior. Again, if it isn't changed, it should be documented. (A rough reproduction sketch follows below.)
3. Not as serious as the other points, but the documentation for partial_fit renders oddly because it is implemented as a @property. It could probably be fixed, but I'm not sure if it's important or not. I'm not sure of the logic behind why it was implemented this way.

Each of these is an easy fix whichever way we decide to go (3 possibly excluded?). Input would be appreciated!
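As a rough reproduction sketch for point 2 above (the dataset and settings are arbitrary and only illustrate the reported behavior; this is not code from the PR):

```python
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=200, random_state=0)

clf = MLPClassifier(warm_start=True, max_iter=100, random_state=0)
clf.fit(X, y)
first_fit_iters = clf.n_iter_             # iterations used by the initial fit

clf.fit(X, y)                             # warm-started refit on the same data
refit_iters = clf.n_iter_ - first_fit_iters

# The report above is that `refit_iters` comes out as 1, i.e. the warm-started
# fit() gives up after a single additional iteration instead of training fully.
print(first_fit_iters, refit_iters)
```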
Also pinging @jnothman since you raised some points in the issue.