Thanks to visit codestin.com
Credit goes to github.com

Skip to content

[WIP] Add verbose option for Pipeline and Feature Union [rebased 8568] #9668

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

petrushev
Copy link

Reference Issue

Rebases #8568
Fixes #5298, #5321

What does this implement/fix? Explain your changes.

Rebases #8568 which has the following description:

This PR adds a new argument verbose to Pipeline and FeatureUnion, which is False by default. Setting this True will print useful information during execution.

Also, it extends the make_pipeline helper to support the verbose kwarg.

The snippets in the original PR produce the expected output.

Also checked with the example feature_selection/plot_feature_selection_pipeline.py which produces a nice output:

[Pipeline] (step 1 of 2) selectkbest ........................ 0.00057s
[Pipeline] (step 2 of 2) svc ................................ 0.00067s
[Pipeline] Total time elapsed: .............................. 0.00123s

@petrushev
Copy link
Author

The whats new was not rebased correctly, will fix it.

print('%s%s%s' % (name, '.' * (70 - len(name + elapsed)), elapsed))


class _BasePipeline(six.with_metaclass(ABCMeta, _BaseComposition)):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that it has been badly merged. This class has been removed in favor of _BaseComposition.

@@ -10,23 +10,98 @@
# License: BSD

from collections import defaultdict
from abc import ABCMeta, abstractmethod
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not needed any more with _BaseComposition



class FeatureUnion(_BaseComposition, TransformerMixin):
class FeatureUnion(_BasePipeline, TransformerMixin):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should also be subclassed from BaseComposition


def __init__(self, transformer_list, n_jobs=1, transformer_weights=None,
verbose=False):
self.transformer_list = tosequence(transformer_list)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it should be list as previously stated

@@ -773,10 +935,10 @@ def transform(self, X):

def _update_transformer_list(self, transformers):
transformers = iter(transformers)
self.transformer_list[:] = [
self.transformer_list = tosequence([
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

check master

@petrushev petrushev force-pushed the pipeline-verbose-rebased branch from 916663e to 3938adc Compare September 1, 2017 12:56
@glemaitre
Copy link
Member

@petrushev Could you solve the PEP8 issue:
https://travis-ci.org/scikit-learn/scikit-learn/jobs/270816940

Copy link
Member

@glemaitre glemaitre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jnothman this PR has been resurrect and it seems to have been extensively reviewed previously.

Having a quick look, I was thinking that a decorator could be used in multiple places instead of repeating the same line around the processing. WDYT?

@@ -5754,3 +5766,5 @@ David Huard, Dave Morrill, Ed Schofield, Travis Oliphant, Pearu Peterson.

.. _Neeraj Gangwar: http://neerajgangwar.in
.. _Arthur Mensch: https://amensch.fr

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove this blank line.

- Added optional parameter ``verbose`` in functions `pipeline.make_pipeline`
and `pipeline.make_union` to extend the same functionality as the
corresponding classes. :issue:`9668` by
:user:`Baze Petrushev <petrushev>`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you could also add @karandesai-96 in the authorship since that it did a good part of the job as well :)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @glemaitre and @petrushev
Current What's New file on this branch has these two lines:

Miscellaneous

- Added optional parameter ``verbose`` in :class:`pipeline.Pipeline` and
  :class:`pipeline.FeatureUnion` for showing progress and timing of each
     step. :issue:`8568` by :user:`Karan Desai <karandesai-96>`.

- Added optional parameter ``verbose`` in functions `pipeline.make_pipeline`
  and `pipeline.make_union` to extend the same functionality as the
  corresponding classes. :issue:`9668` by
  :user:`Baze Petrushev <petrushev>`.

I think they should be unified outlining the feature, issue, PR and contributors.


import numpy as np
from scipy import sparse

from .base import clone, TransformerMixin
from .externals.joblib import Parallel, delayed
from .externals import six
from .utils.metaestimators import if_delegate_has_method
from .utils.metaestimators import if_delegate_has_method, _BaseComposition
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

revert to the version of master with line 20 and line 24. It is equivalent to master in bring 2 diffs.



def _fit_one_transformer(transformer, X, y):
return transformer.fit(X, y)
def _fit_one_transformer(trans, X, y, verbose=False, idx=None,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

replace trans by transformer in all this function

total_steps=None, name=None):
# idx, total_steps and name are not required when verbosity is disabled
step_start_time = time.time()
trans = trans.fit(X, y)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

trans -> transformer

return res, transformer
return res * weight, transformer
return res, trans
return res * weight, trans
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

trans -> transformer

@@ -705,9 +792,19 @@ def fit(self, X, y=None):
This estimator
"""
self._validate_transformers()
all_transformers = [(name, trans, weight) for name, trans, weight in
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This diff seems weird to me. It is not in master. I would be checking that this is not from the previous version before _BaseComposition

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems this was done, along with the change in _fit_one_transform, for the benefit of the verbosity, so not in master.
Here 's the commit in the original PR.

@@ -729,10 +826,19 @@ def fit_transform(self, X, y=None, **fit_params):
sum of n_components (output dimension) over transformers.
"""
self._validate_transformers()
all_transformers = [(name, trans, weight) for name, trans, weight in
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same as the previous statement

@@ -67,12 +67,12 @@ def set_params(self, **params):


class NoInvTransf(NoTrans):
def transform(self, X):
def transform(self, X, y=None):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove y=None. This is from the previous transformer version

return X


class Transf(NoInvTransf):
def transform(self, X):
def transform(self, X, y=None):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove y=None

@kdexd
Copy link

kdexd commented Sep 3, 2017

Hi @petrushev, thanks for taking forward my PR.
@jnothman: I was amidst a lot of segfaults and failing tests of my codebase since past few weeks due to GSoC. I planned to work the remaining part of my PR after final evaluation, but it was hard to pull myself out. Nevertheless, I'm glad that this unmerged feature is approaching completion!

@petrushev petrushev force-pushed the pipeline-verbose-rebased branch from 28740b6 to efb4aac Compare September 3, 2017 21:06
@petrushev petrushev changed the title Add verbose option for Pipeline and Feature Union [rebased 8568] [WIP] Add verbose option for Pipeline and Feature Union [rebased 8568] Sep 3, 2017
@jnothman
Copy link
Member

jnothman commented Sep 4, 2017

Thanks for taking this on, @petrushev.

As stated in the previous PR, I had hoped we could share the logging logic with sklearn/model_selection/_validation.py's _fit_and_score by putting them in a sklearn/utils/ helper function. In particular, I thought we could share the formatting features, including the use of logger.short_format_time. Formatting times as %0.5f seems particularly weird, because I don't think 10 microseconds is a meaningful precision to report. It's this kind of discrepancy that makes me uncomfortable about reinventing the wheel, preferring to share the code.

@petrushev
Copy link
Author

@jnothman Sure, that seems like the way to go. I'll see what I can do about it.

@petrushev petrushev force-pushed the pipeline-verbose-rebased branch from 39a0736 to 2e5b6cc Compare September 5, 2017 22:30
@petrushev
Copy link
Author

@jnothman I wonder if this is something like you had in mind? The _fit_and_score is refactored in both _validation and cross_validation.

@jnothman
Copy link
Member

jnothman commented Sep 5, 2017 via email

@jnothman
Copy link
Member

jnothman commented Sep 5, 2017 via email

@petrushev
Copy link
Author

of course, i just wanted to make sure i'm on the right track

@amueller
Copy link
Member

are you still working on this?

@petrushev
Copy link
Author

@amueller unfortunately i can't really promise i'll commit myself to it any time soon :/

@jnothman
Copy link
Member

jnothman commented Oct 25, 2017 via email

@jnothman jnothman added Easy Well-defined and straightforward way to resolve help wanted Stalled labels Oct 25, 2017
@amueller
Copy link
Member

@petrushev now worries and thanks for an honest assessment :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Easy Well-defined and straightforward way to resolve help wanted Stalled
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Verbose option on Pipeline
5 participants