Pipeline should use fit_transform in fit_predict #7558

Closed
gszpak opened this issue Oct 3, 2016 · 12 comments

@gszpak
Contributor

gszpak commented Oct 3, 2016

Description

Pipeline's fit_predict implementation is inconsistent with its docstring. According to the docstring, each
intermediate step should call fit_transform, but the implementation calls only transform.
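To make the difference concrete, here is a minimal sketch that does not use scikit-learn at all. `ToyScaler`, `ToyClusterer`, `fit_predict_buggy` and `fit_predict_fixed` are hypothetical names invented for illustration; they only mimic the behaviour described above and are not the library's code.

```python
class ToyScaler:
    """Hypothetical stand-in for a transformer such as a scaler."""

    def fit(self, X):
        self.mean_ = sum(X) / len(X)
        return self

    def transform(self, X):
        if not hasattr(self, "mean_"):
            raise RuntimeError("ToyScaler is not fitted yet")
        return [x - self.mean_ for x in X]

    def fit_transform(self, X):
        return self.fit(X).transform(X)


class ToyClusterer:
    """Hypothetical final step: labels each point by its sign."""

    def fit_predict(self, X):
        return [int(x > 0) for x in X]


def fit_predict_buggy(steps, X):
    # Behaviour reported here: intermediate steps are only transformed.
    for step in steps[:-1]:
        X = step.transform(X)
    return steps[-1].fit_predict(X)


def fit_predict_fixed(steps, X):
    # Behaviour the docstring describes: intermediate steps are fitted first.
    for step in steps[:-1]:
        X = step.fit_transform(X)
    return steps[-1].fit_predict(X)


data = [1.0, 2.0, 3.0, 10.0]

try:
    fit_predict_buggy([ToyScaler(), ToyClusterer()], data)
    buggy_raised = False
except RuntimeError:
    buggy_raised = True

labels = fit_predict_fixed([ToyScaler(), ToyClusterer()], data)
print(buggy_raised, labels)
```

With a fresh, unfitted first step, the transform-only variant fails, while the fit_transform variant fits the step and then predicts.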

@amueller
Member

amueller commented Oct 3, 2016

Wow, this is a pretty bad bug. Is fit_predict not tested?

@amueller amueller added the Bug label Oct 3, 2016
@gszpak
Contributor Author

gszpak commented Oct 3, 2016

It is tested, but not properly in my opinion.
The Pipeline object in this test uses the same scaler object that was used earlier to calculate the expected result. Because that scaler is already fitted, the check_is_fitted call doesn't fail and the test passes.
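A test that avoids this masking could look roughly like the sketch below. It is not the actual test from the repository: the synthetic data, the step names and the KMeans settings are assumptions chosen for illustration. The key point is that the pipeline gets its own unfitted estimators, separate from the ones used to compute the expected result.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.RandomState(0)
# Two well-separated synthetic blobs (made-up data for illustration).
X = np.vstack([
    rng.normal(0.0, 1.0, size=(20, 2)),
    rng.normal(8.0, 1.0, size=(20, 2)),
])

# Expected result: fit the scaler and the clusterer by hand.
scaler = StandardScaler()
km = KMeans(n_clusters=2, n_init=10, random_state=0)
separate_pred = km.fit_predict(scaler.fit_transform(X))

# The pipeline gets fresh estimators, so an unfitted scaler cannot be
# hidden behind one that was already fitted above.
pipe = Pipeline([
    ("scaler", StandardScaler()),
    ("kmeans", KMeans(n_clusters=2, n_init=10, random_state=0)),
])
pipeline_pred = pipe.fit_predict(X)

assert np.array_equal(pipeline_pred, separate_pred)
```

If fit_predict only called transform on the scaler step, the fresh StandardScaler would be unfitted and the call would raise instead of passing silently.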

@amueller
Member

amueller commented Oct 3, 2016

Do you want to fix this, and the test?

Hm @jnothman I am somewhat confused. I thought we'd be cloning estimators on pipeline construction. Was it a deliberate choice not to clone when we set the steps?

@gszpak
Contributor Author

gszpak commented Oct 3, 2016

Sure, I'll fix both.

@jnothman
Member

jnothman commented Oct 4, 2016

I don't think you should fix the lack of cloning. @amueller, I consider this to be among Pipeline's "legacy" behaviour. It's been around for a long time and we'd break lots of code by changing it. Pipeline also modifies a constructor parameter in fit rather than setting a new attribute with a trailing underscore.
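The lack of cloning can be observed with a short sketch like the following. The data and step names are made up, and the behaviour shown is what this thread describes for a Pipeline constructed without caching; other configurations or releases may differ.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0, 0, 1, 1])

scaler = StandardScaler()
pipe = Pipeline([("scaler", scaler), ("clf", LogisticRegression())])

# The pipeline holds the very object that was passed in, not a copy.
same_object_before = pipe.named_steps["scaler"] is scaler

pipe.fit(X, y)

# Fitting the pipeline fits that same object in place.
fitted_in_place = hasattr(scaler, "mean_")
print(same_object_before, fitted_in_place)
```

This is why reusing an estimator instance both inside and outside a pipeline can leak fitted state between the two uses.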

Yes, fix fit_predict.

gszpak added a commit to gszpak/scikit-learn that referenced this issue Oct 4, 2016
@amueller
Member

amueller commented Oct 5, 2016

Ugh I should remember that. Should we add that to the docs somewhere?

@jnothman
Member

jnothman commented Oct 5, 2016

> Should we add that to the docs somewhere?

Perhaps we should document it; perhaps we should ensure it's tested; and it should at least be commented.

@gszpak
Contributor Author

gszpak commented Oct 6, 2016

I can add it to test_pipeline_init (or add a separate test) if you don't mind.

@jnothman
Member

jnothman commented Oct 6, 2016

Perhaps as a separate PR. I'd rather merge the fix first.

@jnothman
Member

jnothman commented Oct 6, 2016

Thanks for catching and fixing this, @gszpak

@jnothman jnothman closed this as completed Oct 6, 2016
jnothman pushed a commit that referenced this issue Oct 6, 2016
…t_predict (#7585)

* BUGFIX Calling fit_transform instead of transform in Pipeline's fit_predict (#7558)

* PEP8 fixes in test_fit_predict_on_pipeline

* Added comment explaining separate estimators for pipeline in test_fit_predict_on_pipeline
@amueller
Member

amueller commented Oct 7, 2016

@gszpak if you like you can open an issue and/or PR for the tests and doc. That would be much appreciated.

@gszpak
Contributor Author

gszpak commented Oct 9, 2016

Sure, will do:)
