[MRG+1] Fix shuffle not passed in MLP #12582
Conversation
Please make each change a separate PR. Ideally add a test for the shuffle issue.
Alright, the test was added and the shuffle argument is working properly now.
    allow_unlabeled=False,
    random_state=0)

    # Ensure shuffling happens or doesn't happen when passed as an argument
"ensure shuffling parameter has some effect" is all you are testing
This is fair; the test could be more direct. The most direct effect of shuffle is on the resulting coefficients, so I rewrote the test to look at them instead.
I can't really think of a better way to test it than that, but I made sure this new test consistently works across different sample sizes and random states, so it is a genuine test of the shuffle behavior.
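For reference, a minimal sketch of what such a coefficient-based check could look like (the dataset and estimator settings below are illustrative, not necessarily those used in the actual test):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

# Two classifiers identical except for `shuffle`; both start from the same
# random initialization, so any difference in the fitted weights can only
# come from the order in which the minibatches visit the samples.
X, y = make_classification(n_samples=100, random_state=0)
mlp_shuffled = MLPClassifier(hidden_layer_sizes=(10,), solver='sgd',
                             batch_size=10, max_iter=5,
                             shuffle=True, random_state=0)
mlp_unshuffled = MLPClassifier(hidden_layer_sizes=(10,), solver='sgd',
                               batch_size=10, max_iter=5,
                               shuffle=False, random_state=0)
mlp_shuffled.fit(X, y)
mlp_unshuffled.fit(X, y)

# With the bug, `shuffle` was ignored and these weights came out identical.
assert not np.array_equal(mlp_shuffled.coefs_[0], mlp_unshuffled.coefs_[0])
```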
I didn't really have a problem with the test... I had a problem with the comment.
    mlp1.fit(X, y)
    mlp2.fit(X, y)

    assert not np.array_equal(mlp1.coefs_[0], mlp2.coefs_[0])
should we be testing that they are not even allclose?
I'm not sure of a good tolerance level for closeness here; they will be pretty close because the difference is essentially just random noise that comes from shuffling.
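To illustrate the distinction being discussed: np.array_equal fails on any difference at all, while np.allclose tolerates small ones, so asserting "not allclose" is the stronger requirement. A toy example (the values are made up):

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = a + 1e-9  # differs from `a` only by tiny noise

print(np.array_equal(a, b))  # False: any difference breaks exact equality
print(np.allclose(a, b))     # True: within the default rtol/atol they count as close
```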
I think this PR is complete unless there are additional concerns. The current test seems sufficient to me for testing whether the data is being shuffled.
@samwaterbury I was running into the shuffle issue, too, and just wanted to say thanks for addressing it.
jnothman left a comment:
Please add an entry to the change log at doc/whats_new/v0.21.rst. Like the other entries there, please reference this pull request with :issue: and credit yourself (and other contributors if applicable) with :user:
Sorry, I got sort of sidetracked; I should've circled back to this PR sooner. I just updated whats_new. I'm assuming the CircleCI failure for Python 2 can be ignored. Once this gets merged I will fix the merge conflicts in my other open PR #12605.
Thanks. Merging with master will make the CircleCI failure disappear. Another reviewer here for a simple bug fix?
I think it's too far down the open PR list for any other reviewers to see; maybe someone can be tagged?
@rth, a quick one?
rth left a comment:
Thanks @samwaterbury! LGTM.
The CircleCI failure is due to an outdated setup, but I don't think there is anything in this PR that could make it fail on master. Merging.
This reverts commit aa33ccd.
Reference Issues/PRs
Addresses #12505 (comment)
Alright, so I listed a few issues with MLPClassifier (which also affect MLPRegressor) in #12505. The most basic issue was that the shuffle argument was being ignored, which is fixed in this first commit. The rest of the issues I want some input on before I continue. I posted a long message in the original issue thread, but here's the tl;dr. They basically boil down to documenting the odd behavior vs. fixing it:
1. The method partial_fit is not documented as being incompatible with LBFGS; however, if you attempt to use it, it raises an error saying that it is incompatible. If I remove the code that raises this error, it works fine with LBFGS as far as I can tell. Unless anyone knows why we shouldn't allow LBFGS with partial_fit, I'd say we could remove that restriction. Otherwise, it should be documented.
2. When warm_start=True, the fit method breaks after a single iteration of training, just like partial_fit does, so the two perform identically. I don't understand why fit should stop after one iteration when warm_start=True. In my opinion it should be changed, but this would change its behavior. Again, if it isn't changed, it should be documented. (A rough reproduction sketch follows below.)
3. Not as serious as the other points, but the documentation for partial_fit renders oddly because it is implemented as a @property. It could probably be fixed, but I'm not sure if it's important or not. I'm not sure of the logic behind why it was implemented this way.

Each of these is an easy fix whichever way we decide to go (3 possibly excluded?). Input would be appreciated!
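As a rough reproduction sketch for point 2 above (the dataset and settings are arbitrary and only illustrate the reported behavior; this is not code from the PR):

```python
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=200, random_state=0)

clf = MLPClassifier(warm_start=True, max_iter=100, random_state=0)
clf.fit(X, y)
first_fit_iters = clf.n_iter_             # iterations used by the initial fit

clf.fit(X, y)                             # warm-started refit on the same data
refit_iters = clf.n_iter_ - first_fit_iters

# The report above is that `refit_iters` comes out as 1, i.e. the warm-started
# fit() gives up after a single additional iteration instead of training fully.
print(first_fit_iters, refit_iters)
```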
Also pinging @jnothman since you raised some points in the issue.