Error raised during grid search on pipeline with None for transformer step #18815

Closed
jrbourbeau opened this issue Nov 11, 2020 · 3 comments
@jrbourbeau
Contributor

Describe the bug

When performing a grid search on a pipeline that has None for a transformer step, an AttributeError is raised. The snippet below ran successfully with scikit-learn==0.23.2 but no longer works with 0.24.dev0.

Steps/Code to Reproduce

from sklearn.datasets import load_iris
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

iris = load_iris()
X, y = iris.data, iris.target

pipe = Pipeline([("setup", None), ("svc", SVC(kernel="linear", random_state=0))])

param_grid = [
    {"svc__C": [0.1, 0.1]},
    {"setup": [StandardScaler()]},
]

gs = GridSearchCV(pipe, param_grid=param_grid, return_train_score=True, cv=3)
gs.fit(X, y)

Expected Results

The GridSearchCV.fit call completes successfully.

Actual Results

The following error is raised (I've included the full traceback further down):

  File "/Users/james/miniforge3/envs/dask-ml/lib/python3.8/site-packages/sklearn/base.py", line 863, in _is_pairwise
    pairwise_tag = estimator._get_tags().get('pairwise', False)
  File "/Users/james/miniforge3/envs/dask-ml/lib/python3.8/site-packages/sklearn/base.py", line 348, in _get_tags
    more_tags = base_class._more_tags(self)
  File "/Users/james/miniforge3/envs/dask-ml/lib/python3.8/site-packages/sklearn/pipeline.py", line 626, in _more_tags
    estimator_tags = self.steps[0][1]._get_tags()
AttributeError: 'NoneType' object has no attribute '_get_tags'

It appears that the _is_pairwise check doesn't work as expected when applied to a pipeline that has None as a transformer step.
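The failure mode in the traceback can be illustrated without scikit-learn. The sketch below uses toy stand-ins (`ToyEstimator`, `ToyPipeline`, `is_pairwise`, and `try_is_pairwise` are hypothetical names, not library code) and assumes only what the traceback shows: the pipeline derives its tags by calling `_get_tags()` on its first step.

```python
# Toy stand-ins only: these classes are hypothetical and do not reproduce
# scikit-learn's real implementation.


class ToyEstimator:
    """Minimal object exposing a _get_tags method."""

    def _get_tags(self):
        return {"pairwise": False}


class ToyPipeline:
    """Holds (name, step) pairs and derives its tags from the first step."""

    def __init__(self, steps):
        self.steps = steps

    def _get_tags(self):
        # Same pattern as in the traceback: the first step is assumed to be
        # an estimator, so a None placeholder gets dereferenced.
        return dict(self.steps[0][1]._get_tags())


def is_pairwise(estimator):
    """Look up the 'pairwise' tag, defaulting to False when it is absent."""
    return estimator._get_tags().get("pairwise", False)


def try_is_pairwise(estimator):
    """Return the tag value, or the error message if the lookup fails."""
    try:
        return is_pairwise(estimator)
    except AttributeError as exc:
        return str(exc)


ok = try_is_pairwise(
    ToyPipeline([("setup", ToyEstimator()), ("svc", ToyEstimator())])
)
failed = try_is_pairwise(
    ToyPipeline([("setup", None), ("svc", ToyEstimator())])
)
print(ok)
print(failed)
```

With a real first step the lookup succeeds; with None in the first slot it fails with the same kind of AttributeError as in the report.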

Full traceback:
Traceback (most recent call last):
  File "test-pipeline.py", line 18, in <module>
    gs.fit(X, y)
  File "/Users/james/miniforge3/envs/dask-ml/lib/python3.8/site-packages/sklearn/utils/validation.py", line 60, in inner_f
    return f(*args, **kwargs)
  File "/Users/james/miniforge3/envs/dask-ml/lib/python3.8/site-packages/sklearn/model_selection/_search.py", line 841, in fit
    self._run_search(evaluate_candidates)
  File "/Users/james/miniforge3/envs/dask-ml/lib/python3.8/site-packages/sklearn/model_selection/_search.py", line 1288, in _run_search
    evaluate_candidates(ParameterGrid(self.param_grid))
  File "/Users/james/miniforge3/envs/dask-ml/lib/python3.8/site-packages/sklearn/model_selection/_search.py", line 795, in evaluate_candidates
    out = parallel(delayed(_fit_and_score)(clone(base_estimator),
  File "/Users/james/miniforge3/envs/dask-ml/lib/python3.8/site-packages/joblib/parallel.py", line 1048, in __call__
    if self.dispatch_one_batch(iterator):
  File "/Users/james/miniforge3/envs/dask-ml/lib/python3.8/site-packages/joblib/parallel.py", line 866, in dispatch_one_batch
    self._dispatch(tasks)
  File "/Users/james/miniforge3/envs/dask-ml/lib/python3.8/site-packages/joblib/parallel.py", line 784, in _dispatch
    job = self._backend.apply_async(batch, callback=cb)
  File "/Users/james/miniforge3/envs/dask-ml/lib/python3.8/site-packages/joblib/_parallel_backends.py", line 208, in apply_async
    result = ImmediateResult(func)
  File "/Users/james/miniforge3/envs/dask-ml/lib/python3.8/site-packages/joblib/_parallel_backends.py", line 572, in __init__
    self.results = batch()
  File "/Users/james/miniforge3/envs/dask-ml/lib/python3.8/site-packages/joblib/parallel.py", line 262, in __call__
    return [func(*args, **kwargs)
  File "/Users/james/miniforge3/envs/dask-ml/lib/python3.8/site-packages/joblib/parallel.py", line 262, in <listcomp>
    return [func(*args, **kwargs)
  File "/Users/james/miniforge3/envs/dask-ml/lib/python3.8/site-packages/sklearn/utils/fixes.py", line 222, in __call__
    return self.function(*args, **kwargs)
  File "/Users/james/miniforge3/envs/dask-ml/lib/python3.8/site-packages/sklearn/model_selection/_validation.py", line 585, in _fit_and_score
    X_train, y_train = _safe_split(estimator, X, y, train)
  File "/Users/james/miniforge3/envs/dask-ml/lib/python3.8/site-packages/sklearn/utils/metaestimators.py", line 198, in _safe_split
    if _is_pairwise(estimator):
  File "/Users/james/miniforge3/envs/dask-ml/lib/python3.8/site-packages/sklearn/base.py", line 863, in _is_pairwise
    pairwise_tag = estimator._get_tags().get('pairwise', False)
  File "/Users/james/miniforge3/envs/dask-ml/lib/python3.8/site-packages/sklearn/base.py", line 348, in _get_tags
    more_tags = base_class._more_tags(self)
  File "/Users/james/miniforge3/envs/dask-ml/lib/python3.8/site-packages/sklearn/pipeline.py", line 626, in _more_tags
    estimator_tags = self.steps[0][1]._get_tags()
AttributeError: 'NoneType' object has no attribute '_get_tags'

Versions

System:
    python: 3.8.6 | packaged by conda-forge | (default, Oct  7 2020, 18:42:56)  [Clang 10.0.1 ]
executable: /Users/james/miniforge3/envs/dask-ml/bin/python3.8
   machine: macOS-10.15.5-x86_64-i386-64bit

Python dependencies:
          pip: 20.2.4
   setuptools: 49.6.0.post20201009
      sklearn: 0.24.dev0
        numpy: 1.19.4
        scipy: 1.5.3
       Cython: None
       pandas: 1.1.4
   matplotlib: None
       joblib: 0.17.0
threadpoolctl: 2.1.0

Built with OpenMP: True
@NicolasHug
Member

Thanks for the report @jrbourbeau, we can reproduce. We're investigating the best solution in the different issues linked above, if you're interested.

@NicolasHug NicolasHug added this to the 0.24 milestone Nov 14, 2020
@NicolasHug
Member

I'll mark it as a blocker because the error will appear not just when using None, but when using any step that doesn't have a _get_tags attribute (likely because it doesn't inherit from BaseEstimator).
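One way to guard against this whole class of steps is to look the method up defensively and fall back to default tags. The sketch below is only an illustration under stated assumptions: `safe_tags`, `DEFAULT_TAGS`, `DuckTypedScaler`, and `TaggedKernelStep` are hypothetical names, and this is not necessarily the approach taken in the actual fix.

```python
# Assumed default for illustration; the real set of default tags is larger.
DEFAULT_TAGS = {"pairwise": False}


def safe_tags(step):
    """Return the step's tags, falling back to defaults when unavailable."""
    get_tags = getattr(step, "_get_tags", None)
    if callable(get_tags):
        # Tags reported by the step override the defaults.
        return {**DEFAULT_TAGS, **get_tags()}
    # Covers None, the string "passthrough", and duck-typed steps.
    return dict(DEFAULT_TAGS)


class DuckTypedScaler:
    """Hypothetical transformer that does not inherit from BaseEstimator."""

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        return X


class TaggedKernelStep:
    """Hypothetical step that reports itself as pairwise."""

    def _get_tags(self):
        return {"pairwise": True}


results = {
    "none": safe_tags(None)["pairwise"],
    "passthrough": safe_tags("passthrough")["pairwise"],
    "duck_typed": safe_tags(DuckTypedScaler())["pairwise"],
    "tagged": safe_tags(TaggedKernelStep())["pairwise"],
}
print(results)
```

The `getattr` lookup treats None, placeholder strings, and third-party objects uniformly, so the caller never has to special-case each kind of step.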

@ogrisel
Member

ogrisel commented Dec 2, 2020

Fixed by #18797. Thanks for the timely bug report @jrbourbeau.

@ogrisel ogrisel closed this as completed Dec 2, 2020