[MRG + 1] FIX Calling fit_transform instead of transform in Pipeline's fit_predict #7585
thanks, lgtm :)
pep8 ;)
# first compute the transform and clustering step separately
scaled = scaler.fit_transform(iris.data)
separate_pred = km.fit_predict(scaled)

# use a pipeline to do the transform and clustering in one step
pipe = Pipeline([('scaler', scaler), ('Kmeans', km)])
pipe = Pipeline([('scaler', scaler_for_pipeline), ('Kmeans', km_for_pipeline)])
this line's too long now :)
thanks!
This is actually fixing a regression, and an error on my part :( Tagging with 0.18.1 for backport. Almost LGTM.
@@ -277,14 +277,19 @@ def test_fit_predict_on_pipeline():
# transform and clustering steps separately
iris = load_iris()
scaler = StandardScaler()
scaler_for_pipeline = StandardScaler()
Please add a comment that this is necessary since Pipeline
does not clone the estimators.
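To illustrate the reviewer's point, here is a minimal sketch in plain Python. `RecordingScaler` and `HoldingPipeline` are hypothetical toy classes, not scikit-learn code; they only assume that a pipeline stores the step objects it is given rather than copies of them.

```python
class RecordingScaler:
    """Hypothetical toy transformer that scales by the maximum seen in fit."""

    def __init__(self):
        self.fit_calls = 0

    def fit_transform(self, X):
        # Learn state from X and count how often this instance was fitted.
        self.fit_calls += 1
        self.max_ = max(X)
        return [x / self.max_ for x in X]


class HoldingPipeline:
    """Hypothetical toy pipeline that keeps references to its steps (no copy)."""

    def __init__(self, steps):
        self.steps = steps

    def fit_transform(self, X):
        for _name, step in self.steps:
            X = step.fit_transform(X)
        return X


shared = RecordingScaler()
shared.fit_transform([1.0, 2.0, 4.0])        # fitted on its own first

pipe = HoldingPipeline([("scaler", shared)])
pipe.fit_transform([1.0, 2.0, 8.0])          # refits the very same object

# The standalone scaler and the pipeline step are one object, so the
# pipeline's fit overwrote the state learned in the standalone fit.
same_object = pipe.steps[0][1] is shared
```

With a shared instance, the separate computation and the pipeline computation are not independent, which is why the test uses `scaler_for_pipeline` and `km_for_pipeline`.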
…_predict_on_pipeline
LGTM, thanks
…t_predict (scikit-learn#7585) * BUGFIX Calling fit_transform instead of transform in Pipeline's fit_predict (scikit-learn#7558) * PEP8 fixes in test_fit_predict_on_pipeline * Added comment explaining separate estimators for pipeline in test_fit_predict_on_pipeline
Reference Issue
This PR fixes issue #7558.
What does this implement/fix? Explain your changes.
As discussed in #7558, each transformer in the pipeline should call `fit_transform` instead of `transform` in `fit_predict`. `test_fit_predict_on_pipeline` is also fixed.
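The intended behaviour can be sketched with a toy pipeline in plain Python. This is not scikit-learn's implementation: `MeanCenterer`, `SignClusterer`, and `TinyPipeline` are hypothetical stand-ins used only to show why the intermediate steps need `fit_transform` inside `fit_predict`.

```python
class MeanCenterer:
    """Hypothetical toy transformer: subtracts the mean learned in fit."""

    def fit(self, X):
        self.mean_ = sum(X) / len(X)
        return self

    def transform(self, X):
        # Raises AttributeError if fit was never called (mean_ is missing).
        return [x - self.mean_ for x in X]

    def fit_transform(self, X):
        return self.fit(X).transform(X)


class SignClusterer:
    """Hypothetical toy final step: labels each value 0 or 1 by its sign."""

    def fit_predict(self, X):
        self.labels_ = [int(x > 0) for x in X]
        return self.labels_


class TinyPipeline:
    """Hypothetical toy pipeline with only fit_predict."""

    def __init__(self, steps):
        self.steps = steps

    def fit_predict(self, X):
        Xt = X
        # Fit AND transform every intermediate step; calling transform
        # alone would fail here because nothing has been fitted yet.
        for _name, transformer in self.steps[:-1]:
            Xt = transformer.fit_transform(Xt)
        return self.steps[-1][1].fit_predict(Xt)


data = [1.0, 2.0, 3.0, 10.0]
pipe = TinyPipeline([("center", MeanCenterer()), ("cluster", SignClusterer())])
labels = pipe.fit_predict(data)

# Reference result computed step by step with separate instances.
expected = SignClusterer().fit_predict(MeanCenterer().fit_transform(data))
```

In this sketch `labels` matches `expected`, mirroring the comparison that `test_fit_predict_on_pipeline` makes between the separate steps and the pipeline.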