
Conversation

@samwaterbury
Contributor

@samwaterbury samwaterbury commented Nov 14, 2018

Reference Issues/PRs

Addresses #12505 (comment)

Alright, so I listed a few issues with MLPClassifier (which also affect MLPRegressor) in #12505. The most basic issue was that the shuffle argument was being ignored, which is fixed in this first commit.

The rest of the issues I want some input on before I continue. I posted a long message in the original issue thread but here's the tl;dr. They basically boil down to documenting the odd behavior vs. fixing it:

  1. The method partial_fit is not documented as being incompatible with LBFGS; however, if you attempt to use the two together, it raises an error saying they are incompatible. If I remove the code that raises this error, partial_fit works fine with LBFGS as far as I can tell. Unless anyone knows a reason we shouldn't allow LBFGS with partial_fit, I'd say we could remove that restriction. Otherwise, it should be documented.

  2. When warm_start=True, the fit method stops after a single iteration of training, just like partial_fit does, so the two perform identically. I don't understand why fit should stop after one iteration when warm_start=True. In my opinion it should be changed, but that would be a behavior change. Again, if it isn't changed, it should be documented.

  3. Not as serious as the other points, but the documentation for partial_fit renders oddly because it is implemented as a @property. This could probably be fixed, but I'm not sure whether it's important, and I don't know the reasoning behind implementing it this way.

Each of these is an easy fix whichever way we decide to go (3 possibly excluded?). Input would be appreciated!
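For reference, the shuffle bug can be demonstrated with a quick check like the following (a sketch — the data setup and hyperparameters here are my own, not from the PR; the idea is just that two otherwise-identical fits should diverge once shuffling is actually applied):

```python
import warnings

import numpy as np
from sklearn.exceptions import ConvergenceWarning
from sklearn.neural_network import MLPRegressor

warnings.simplefilter("ignore", ConvergenceWarning)  # few iterations on purpose

# Toy regression data; batch_size < n_samples so that the sample
# order actually influences the minibatch gradients.
rng = np.random.RandomState(0)
X = rng.rand(100, 5)
y = X.sum(axis=1)

mlp_shuffled = MLPRegressor(shuffle=True, solver="sgd", batch_size=10,
                            max_iter=20, random_state=0)
mlp_ordered = MLPRegressor(shuffle=False, solver="sgd", batch_size=10,
                           max_iter=20, random_state=0)
mlp_shuffled.fit(X, y)
mlp_ordered.fit(X, y)

# If `shuffle` were silently ignored, both runs would be bit-identical.
assert not np.array_equal(mlp_shuffled.coefs_[0], mlp_ordered.coefs_[0])
```

Note that with the default batch_size='auto' the whole toy set would fit in one batch and shuffling would have no visible effect, which is why batch_size is set explicitly here.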

Also pinging @jnothman since you raised some points in the issue.

@jnothman
Member

Please make each change a separate PR. Ideally add a test for the shuffle issue.

@jnothman
Member

partial_fit is a property so that it disappears (hasattr(mlp, 'partial_fit') is False) if not a stochastic optimiser. This allows duck typing to be used... Yes, it's convoluted. No, we do not currently have a nice solution for the docstrings. You could include the true method signature at the top of the docstring as specified in numpydoc.
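For anyone following along, the pattern described here looks roughly like this (a simplified sketch, not the actual scikit-learn implementation):

```python
# `partial_fit` is exposed via @property so that attribute lookup fails
# for non-stochastic solvers, making hasattr(est, "partial_fit") False.
class MiniMLP:
    def __init__(self, solver="adam"):
        self.solver = solver

    @property
    def partial_fit(self):
        if self.solver not in ("sgd", "adam"):
            raise AttributeError(
                "partial_fit is only available for stochastic solvers"
            )
        return self._partial_fit

    def _partial_fit(self, X, y):
        # one incremental training step would go here
        return self

hasattr(MiniMLP(solver="lbfgs"), "partial_fit")  # False: the property raises
hasattr(MiniMLP(solver="adam"), "partial_fit")   # True
```

This works because hasattr swallows the AttributeError raised inside the property, which is what makes the duck typing possible — at the cost of the odd docstring rendering mentioned above.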

@samwaterbury
Contributor Author

Alright, the test was added and the shuffle argument is working properly now.

allow_unlabeled=False,
random_state=0)

# Ensure shuffling happens or doesn't happen when passed as an argument
Member

"ensure shuffling parameter has some effect" is all you are testing

Contributor Author

This is fair; the test could be more direct. The most direct effect of shuffle is on the resulting coefficients, so I rewrote the test to look at those instead.

I can't really think of a better way to test it than that, but I made sure the new test consistently passes across different sample sizes and random states, so it is a genuine test of the shuffle behavior.

Member

I didn't really have a problem with the test... I had a problem with the comment.

@samwaterbury samwaterbury changed the title [WIP] Fixes for multilayer perceptron issues [MRG] Fixes for multilayer perceptron issues Nov 15, 2018
@samwaterbury samwaterbury changed the title [MRG] Fixes for multilayer perceptron issues [MRG] Fix shuffle option for multilayer perceptron Nov 17, 2018
mlp1.fit(X, y)
mlp2.fit(X, y)

assert not np.array_equal(mlp1.coefs_[0], mlp2.coefs_[0])
Member

should we be testing that they are not even allclose?

Contributor Author

I'm not sure of a good tolerance for closeness here; the two sets of coefficients will be pretty close, because the difference is essentially just random noise introduced by the shuffling.
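To illustrate the distinction being discussed (toy numbers of my own, just to show why picking an allclose tolerance would be awkward):

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = a + 1e-9  # a tiny perturbation, like noise from reordering samples

np.array_equal(a, b)  # False: any bitwise difference counts
np.allclose(a, b)     # True: within the default rtol=1e-05, atol=1e-08
```

array_equal asserts the weights are not bit-identical (which is all the shuffle fix guarantees), whereas allclose would additionally require choosing tolerances for how far apart shuffled and unshuffled fits should land.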

@samwaterbury
Contributor Author

I think this PR is complete unless there are additional concerns. The current test seems sufficient to me for testing whether the data is being shuffled.

@nils-wisiol

@samwaterbury I was running into the shuffle issue, too, and just wanted to say thanks for addressing it.

@jnothman jnothman left a comment

Please add an entry to the change log at doc/whats_new/v0.21.rst. Like the other entries there, please reference this pull request with :issue: and credit yourself (and any other contributors) with :user:.
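For reference, entries in doc/whats_new/v0.21.rst follow roughly this shape (a sketch — the exact wording is assumed, and this PR's number is left as a placeholder):

```rst
- Fixed a bug in :class:`neural_network.MLPClassifier` and
  :class:`neural_network.MLPRegressor` where the ``shuffle`` parameter
  was being ignored. :issue:`<this PR's number>` by
  :user:`Sam Waterbury <samwaterbury>`.
```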

@samwaterbury
Contributor Author

Sorry, I got sidetracked; I should've circled back to this PR sooner.

I just updated whats_new. I'm assuming the CircleCI failure for Python 2 can be ignored. Once this gets merged, I will fix the merge conflicts in my other open PR #12605.

@jnothman jnothman added the Bug label Jan 20, 2019
@jnothman
Member

Thanks. Merging with master will make the Circle failure disappear.

Another reviewer here for a simple bug fix?

@samwaterbury
Contributor Author

I think it's too far down the open PR list for any other reviewers to see; maybe someone can be tagged?

@samwaterbury samwaterbury changed the title [MRG] Fix shuffle option for multilayer perceptron [MRG+1] Fix shuffle option for multilayer perceptron Jan 24, 2019
@jnothman jnothman changed the title [MRG+1] Fix shuffle option for multilayer perceptron [MRG+1] Fix shuffle not passed in MLP Jan 25, 2019
@jnothman
Member

@rth, a quick one?

@rth rth left a comment

Thanks @samwaterbury! LGTM.

The CircleCI failure is due to an outdated setup, but I don't think there's anything in this PR that could make it fail on master. Merging.

@rth rth merged commit dca9156 into scikit-learn:master Jan 25, 2019
thomasjpfan pushed a commit to thomasjpfan/scikit-learn that referenced this pull request Feb 7, 2019
@samwaterbury samwaterbury deleted the mlp branch February 14, 2019 21:26
xhluca pushed a commit to xhluca/scikit-learn that referenced this pull request Apr 28, 2019
koenvandevelde pushed a commit to koenvandevelde/scikit-learn that referenced this pull request Jul 12, 2019