-
-
Notifications
You must be signed in to change notification settings - Fork 25.8k
[WIP] Add verbose option for Pipeline and Feature Union [rebased 8568] #9668
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] Add verbose option for Pipeline and Feature Union [rebased 8568] #9668
Conversation
- Each line printed by Pipeline and FeatureUnion, when their verbosity mode is on, will be 70 characters long.
The |
sklearn/pipeline.py
Outdated
print('%s%s%s' % (name, '.' * (70 - len(name + elapsed)), elapsed)) | ||
|
||
|
||
class _BasePipeline(six.with_metaclass(ABCMeta, _BaseComposition)): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems that it has been badly merged. This class has been removed in favor of _BaseComposition
.
sklearn/pipeline.py
Outdated
@@ -10,23 +10,98 @@ | |||
# License: BSD | |||
|
|||
from collections import defaultdict | |||
from abc import ABCMeta, abstractmethod |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not needed any more with _BaseComposition
sklearn/pipeline.py
Outdated
|
||
|
||
class FeatureUnion(_BaseComposition, TransformerMixin): | ||
class FeatureUnion(_BasePipeline, TransformerMixin): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should also be subclassed from BaseComposition
sklearn/pipeline.py
Outdated
|
||
def __init__(self, transformer_list, n_jobs=1, transformer_weights=None, | ||
verbose=False): | ||
self.transformer_list = tosequence(transformer_list) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it should be list as previously stated
sklearn/pipeline.py
Outdated
@@ -773,10 +935,10 @@ def transform(self, X): | |||
|
|||
def _update_transformer_list(self, transformers): | |||
transformers = iter(transformers) | |||
self.transformer_list[:] = [ | |||
self.transformer_list = tosequence([ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
check master
Fix error message bug
916663e
to
3938adc
Compare
@petrushev Could you solve the PEP8 issue: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jnothman this PR has been resurrect and it seems to have been extensively reviewed previously.
Having a quick look, I was thinking that a decorator could be used in multiple places instead of repeating the same line around the processing. WDYT?
doc/whats_new.rst
Outdated
@@ -5754,3 +5766,5 @@ David Huard, Dave Morrill, Ed Schofield, Travis Oliphant, Pearu Peterson. | |||
|
|||
.. _Neeraj Gangwar: http://neerajgangwar.in | |||
.. _Arthur Mensch: https://amensch.fr | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove this blank line.
doc/whats_new.rst
Outdated
- Added optional parameter ``verbose`` in functions `pipeline.make_pipeline` | ||
and `pipeline.make_union` to extend the same functionality as the | ||
corresponding classes. :issue:`9668` by | ||
:user:`Baze Petrushev <petrushev>`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if you could also add @karandesai-96 in the authorship since that it did a good part of the job as well :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @glemaitre and @petrushev
Current What's New file on this branch has these two lines:
Miscellaneous
- Added optional parameter ``verbose`` in :class:`pipeline.Pipeline` and
:class:`pipeline.FeatureUnion` for showing progress and timing of each
step. :issue:`8568` by :user:`Karan Desai <karandesai-96>`.
- Added optional parameter ``verbose`` in functions `pipeline.make_pipeline`
and `pipeline.make_union` to extend the same functionality as the
corresponding classes. :issue:`9668` by
:user:`Baze Petrushev <petrushev>`.
I think they should be unified outlining the feature, issue, PR and contributors.
sklearn/pipeline.py
Outdated
|
||
import numpy as np | ||
from scipy import sparse | ||
|
||
from .base import clone, TransformerMixin | ||
from .externals.joblib import Parallel, delayed | ||
from .externals import six | ||
from .utils.metaestimators import if_delegate_has_method | ||
from .utils.metaestimators import if_delegate_has_method, _BaseComposition |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
revert to the version of master with line 20 and line 24. It is equivalent to master in bring 2 diffs.
sklearn/pipeline.py
Outdated
|
||
|
||
def _fit_one_transformer(transformer, X, y): | ||
return transformer.fit(X, y) | ||
def _fit_one_transformer(trans, X, y, verbose=False, idx=None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
replace trans
by transformer
in all this function
sklearn/pipeline.py
Outdated
total_steps=None, name=None): | ||
# idx, total_steps and name are not required when verbosity is disabled | ||
step_start_time = time.time() | ||
trans = trans.fit(X, y) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
trans
-> transformer
sklearn/pipeline.py
Outdated
return res, transformer | ||
return res * weight, transformer | ||
return res, trans | ||
return res * weight, trans |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
trans
-> transformer
sklearn/pipeline.py
Outdated
@@ -705,9 +792,19 @@ def fit(self, X, y=None): | |||
This estimator | |||
""" | |||
self._validate_transformers() | |||
all_transformers = [(name, trans, weight) for name, trans, weight in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This diff seems weird to me. It is not in master. I would be checking that this is not from the previous version before _BaseComposition
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems this was done, along with the change in _fit_one_transform
, for the benefit of the verbosity, so not in master.
Here 's the commit in the original PR.
sklearn/pipeline.py
Outdated
@@ -729,10 +826,19 @@ def fit_transform(self, X, y=None, **fit_params): | |||
sum of n_components (output dimension) over transformers. | |||
""" | |||
self._validate_transformers() | |||
all_transformers = [(name, trans, weight) for name, trans, weight in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The same as the previous statement
sklearn/tests/test_pipeline.py
Outdated
@@ -67,12 +67,12 @@ def set_params(self, **params): | |||
|
|||
|
|||
class NoInvTransf(NoTrans): | |||
def transform(self, X): | |||
def transform(self, X, y=None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove y=None
. This is from the previous transformer version
sklearn/tests/test_pipeline.py
Outdated
return X | ||
|
||
|
||
class Transf(NoInvTransf): | ||
def transform(self, X): | ||
def transform(self, X, y=None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove y=None
Hi @petrushev, thanks for taking forward my PR. |
28740b6
to
efb4aac
Compare
Thanks for taking this on, @petrushev. As stated in the previous PR, I had hoped we could share the logging logic with |
@jnothman Sure, that seems like the way to go. I'll see what I can do about it. |
39a0736
to
2e5b6cc
Compare
@jnothman I wonder if this is something like you had in mind? The |
it is, at a glance, thanks! I'll give it a more thorough review soon.
|
could you please add a unit test for the helper, making sure it doesn't
explode, for instance, if the text is too long. The standard library
textwrap may prove useful for such things.
…On 6 Sep 2017 9:27 am, "Joel Nothman" ***@***.***> wrote:
it is, at a glance, thanks! I'll give it a more thorough review soon.
|
of course, i just wanted to make sure i'm on the right track |
are you still working on this? |
@amueller unfortunately i can't really promise i'll commit myself to it any time soon :/ |
thanks for your work so far
|
@petrushev now worries and thanks for an honest assessment :) |
Reference Issue
Rebases #8568
Fixes #5298, #5321
What does this implement/fix? Explain your changes.
Rebases #8568 which has the following description:
Also, it extends the
make_pipeline
helper to support theverbose
kwarg.The snippets in the original PR produce the expected output.
Also checked with the example
feature_selection/plot_feature_selection_pipeline.py
which produces a nice output: