Conversation

@amueller (Member) commented Aug 16, 2017

I haven't removed the stuff that's in model_selection now, in case we want to give that deprecation another version.
Some other things still need to be done as well.

Remove code tagged to be removed in v0.20. Stole the todo from #10094.

Things to remove:

  • Classes (reported here)

    • cross_validation.KFold
    • cross_validation.LabelKFold
    • cross_validation.LeaveOneLabelOut
    • cross_validation.LeaveOneOut
    • cross_validation.LeavePOut
    • cross_validation.LeavePLabelOut
    • cross_validation.LabelShuffleSplit
    • cross_validation.ShuffleSplit
    • cross_validation.StratifiedKFold
    • cross_validation.StratifiedShuffleSplit
    • cross_validation.PredefinedSplit
    • decomposition.RandomizedPCA
    • gaussian_process.GaussianProcess
    • grid_search.ParameterGrid
    • grid_search.ParameterSampler
    • grid_search.GridSearchCV
    • grid_search.RandomizedSearchCV
    • mixture.DPGMM
    • mixture.GMM
    • mixture.VBGMM
  • From the what's new (reported here)

    • Linear, kernelized and related models

      • residual_metric has been deprecated in :class:linear_model.RANSACRegressor. Use loss instead. By Manoj Kumar_.
      • Access to public attributes .X_ and .y_ has been deprecated in :class:isotonic.IsotonicRegression. By :user:Jonathan Arfa <jarfa>.
    • Decomposition, manifold learning and clustering

      • The old :class:mixture.DPGMM is deprecated in favor of the new :class:mixture.BayesianGaussianMixture (with the parameter weight_concentration_prior_type='dirichlet_process'). The new class solves the computational problems of the old class and computes the Gaussian mixture with a Dirichlet process prior faster than before. :issue:7295 by :user:Wei Xue <xuewei4d> and :user:Thierry Guillemot <tguillemot>.
      • The old :class:mixture.VBGMM is deprecated in favor of the new :class:mixture.BayesianGaussianMixture (with the parameter weight_concentration_prior_type='dirichlet_distribution'). The new class solves the computational problems of the old class and computes the Variational Bayesian Gaussian mixture faster than before. :issue:6651 by :user:Wei Xue <xuewei4d> and :user:Thierry Guillemot <tguillemot>.
      • The old :class:mixture.GMM is deprecated in favor of the new :class:mixture.GaussianMixture. The new class computes the Gaussian mixture faster than before and some of computational problems have been solved. :issue:6666 by :user:Wei Xue <xuewei4d> and :user:Thierry Guillemot <tguillemot>.
    • Model evaluation and meta-estimators

      • The :mod:sklearn.cross_validation, :mod:sklearn.grid_search and :mod:sklearn.learning_curve have been deprecated and the classes and functions have been reorganized into the :mod:sklearn.model_selection module. Ref :ref:model_selection_changes for more information. :issue:4294 by Raghav RV_.

      • The grid_scores_ attribute of :class:model_selection.GridSearchCV and :class:model_selection.RandomizedSearchCV is deprecated in favor of the attribute cv_results_. Ref :ref:model_selection_changes for more information. :issue:6697 by Raghav RV_.

      • The parameters n_iter or n_folds in old CV splitters are replaced by the new parameter n_splits since it can provide a consistent and unambiguous interface to represent the number of train-test splits. :issue:7187 by :user:YenChen Lin <yenchenlin>.

      • classes parameter was renamed to labels in :func:metrics.hamming_loss. :issue:7260 by :user:Sebastián Vanrell <srvanrell>.

      • The splitter classes LabelKFold, LabelShuffleSplit, LeaveOneLabelOut and LeavePLabelOut are renamed to :class:model_selection.GroupKFold, :class:model_selection.GroupShuffleSplit, :class:model_selection.LeaveOneGroupOut and :class:model_selection.LeavePGroupsOut respectively. Also the parameter labels in the :func:split method of the newly renamed splitters :class:model_selection.LeaveOneGroupOut and :class:model_selection.LeavePGroupsOut is renamed to groups. Additionally in :class:model_selection.LeavePGroupsOut, the parameter n_labels is renamed to n_groups. :issue:6660 by Raghav RV_.

      • Error and loss names for scoring parameters are now prefixed by 'neg_', such as neg_mean_squared_error. The unprefixed versions are deprecated and will be removed in version 0.20. :issue:7261 by :user:Tim Head <betatim>.
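The renames above can be summarized in a short migration sketch (a minimal example written for this summary, not code from the PR; it assumes scikit-learn >= 0.18, where the model_selection module and GaussianMixture exist):

```python
import numpy as np
from sklearn.model_selection import KFold, GroupKFold  # replace cross_validation.KFold / LabelKFold
from sklearn.mixture import GaussianMixture            # replaces mixture.GMM

rng = np.random.RandomState(0)
X = rng.randn(10, 2)
groups = np.repeat(np.arange(5), 2)

# New-style splitters take n_splits (not n_folds / n_iter) at construction
# and receive the data in split() instead of __init__:
kf = KFold(n_splits=5)
n_kf = sum(1 for _ in kf.split(X))

# LabelKFold -> GroupKFold, and the `labels` argument is now `groups`:
gkf = GroupKFold(n_splits=5)
n_gkf = sum(1 for _ in gkf.split(X, groups=groups))

# mixture.GMM -> mixture.GaussianMixture:
gm = GaussianMixture(n_components=2, random_state=0).fit(X)
print(n_kf, n_gkf, gm.means_.shape)
```

The same construction-time pattern applies to the other renamed splitters (GroupShuffleSplit, LeaveOneGroupOut, LeavePGroupsOut).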

Files with remaining deprecated lines:

  • sklearn/isotonic.py
  • sklearn/metrics/ranking.py
  • sklearn/metrics/scorer.py
  • sklearn/tree/export.py
  • sklearn/decomposition/online_lda.py
  • sklearn/cluster/hierarchical.py
  • sklearn/utils/testing.py
  • sklearn/svm/classes.py
  • sklearn/linear_model/least_angle.py
  • sklearn/linear_model/base.py
  • sklearn/tests/test_base.py
  • sklearn/model_selection/_search.py
  • sklearn/model_selection/tests/test_search.py
  • sklearn/base.py

@amueller (Member, Author)

(and yes, I'm just trying to hack my lines added / lines deleted ratio on github, you got me ;)

@amueller amueller force-pushed the 0_20_deprecations branch from c9776c7 to 4b7aa69 Compare May 22, 2018 17:49
@sklearn-lgtm

This pull request fixes 2 alerts when merging f114920 into 20cb37e - view on LGTM.com

fixed alerts:

  • 1 for Non-callable called
  • 1 for Non-iterable used in for loop

Comment posted by LGTM.com

@sklearn-lgtm

This pull request fixes 2 alerts when merging fab56a3 into f049ec7 - view on LGTM.com

fixed alerts:

  • 1 for Non-callable called
  • 1 for Non-iterable used in for loop

Comment posted by LGTM.com

@jnothman (Member) commented Jun 5, 2018

CI failures

@glemaitre (Member)

@amueller Do you mind if I solve the conflicts and make the CI happy?

```python
results_dict = {k: benchmark(est, data) for k, est in [('pca', pca),
                                                       ('rpca', rpca)]}
rpca = PCA(n_components=n_components, svd_solver='randomized', random_state=1999)
results_dict = {k: benchmark(est, data) for k, est in [('pca', pca)]}
```

This example does not work.
We should either keep rpca here or remove it everywhere.


@TomDLT Is the suggestion of @jnothman a few lines above to use PCA(svd_solver='randomized') what you mean by keeping rpca?


Yes, or we can just drop it.
My comment was meant to mention that the example is currently broken.
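For reference, the suggestion amounts to replacing the removed RandomizedPCA with PCA's randomized solver (a minimal sketch of the substitution, not the benchmark script itself; the shapes and seed are made up for illustration):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.RandomState(1999)
data = rng.randn(100, 20)

# PCA(svd_solver='randomized') is the replacement for the removed RandomizedPCA
rpca = PCA(n_components=5, svd_solver='randomized', random_state=1999)
transformed = rpca.fit_transform(data)
print(transformed.shape)
```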

@rth rth added the Blocker label Jun 14, 2018
@jorisvandenbossche (Member) left a comment


Did a quick skim through the diff as well, and given the earlier positive reviews, I would suggest updating this with master and merging it.
Then we can further investigate and clean up the remaining deprecation warnings on master.

```python
    return '"tree.dot"'


SENTINEL = Sentinel()
```

this sentinel class can be removed now
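For context, a sentinel like this is typically used during a deprecation cycle to tell "argument not passed" apart from any real value, including None. The sketch below is hypothetical (the `export_tree` helper and warning text are made up for illustration), showing why the class becomes dead code once the deprecation period ends:

```python
import warnings


class Sentinel:
    """Unique marker distinguishing 'not passed' from any real value."""
    def __repr__(self):
        return '"tree.dot"'  # shown in generated docs as the old default


SENTINEL = Sentinel()


def export_tree(out_file=SENTINEL):
    # Hypothetical helper: warn only when the caller relied on the
    # deprecated default, not when a value (even None) was passed.
    if out_file is SENTINEL:
        warnings.warn("the out_file default is deprecated; pass it explicitly",
                      DeprecationWarning)
        out_file = "tree.dot"
    return out_file
```

Once the deprecation period is over, the default can be hard-coded and the sentinel deleted, which is what this review comment asks for.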


@jnothman (Member)

There is currently a test failure (as well as a flake8 failure and merge conflicts):

=================================== FAILURES ===================================
 test_non_meta_estimators[GaussianProcessRegressor-GaussianProcessRegressor-check_estimators_unfitted] 
name = 'GaussianProcessRegressor'
Estimator = <class 'sklearn.gaussian_process.gpr.GaussianProcessRegressor'>
check = <function check_estimators_unfitted at 0x7f000a3d11b8>
    @pytest.mark.parametrize(
            "name, Estimator, check",
            _generate_checks_per_estimator(_yield_all_checks,
                                           _tested_non_meta_estimators()),
            ids=_rename_partial
    )
    def test_non_meta_estimators(name, Estimator, check):
        # Common tests for non-meta estimators
        estimator = Estimator()
        set_checking_parameters(estimator)
>       check(name, estimator)
Estimator  = <class 'sklearn.gaussian_process.gpr.GaussianProcessRegressor'>
check      = <function check_estimators_unfitted at 0x7f000a3d11b8>
estimator  = GaussianProcessRegressor(alpha=1e-10, copy_X_train=True, kernel=None,
        ..., normalize_y=False,
             optimizer='fmin_l_bfgs_b', random_state=None)
name       = 'GaussianProcessRegressor'
/home/travis/build/scikit-learn/scikit-learn/sklearn/tests/test_common.py:96: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
/home/travis/build/scikit-learn/scikit-learn/sklearn/utils/testing.py:328: in wrapper
    return fn(*args, **kwargs)
/home/travis/build/scikit-learn/scikit-learn/sklearn/utils/estimator_checks.py:1510: in check_estimators_unfitted
    est.predict, X)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
exceptions = (<type 'exceptions.AttributeError'>, <type 'exceptions.ValueError'>)
message = 'fit'
function = <bound method GaussianProcessRegressor.predict of GaussianProcessRegressor(alp... normalize_y=False,
             optimizer='fmin_l_bfgs_b', random_state=None)>
args = (array([[-0.45364538, -0.47282444, -1.20608008, ..., -0.75500806,
         0.25... 0.45314754, -1.1924583 , ..., -1.72110924,
         0.27199858, -1.38161571]]),)
kwargs = {}, names = 'AttributeError or ValueError'
    def assert_raise_message(exceptions, message, function, *args, **kwargs):
        """Helper function to test the message raised in an exception.
    
        Given an exception, a callable to raise the exception, and
        a message string, tests that the correct exception is raised and
        that the message is a substring of the error thrown. Used to test
        that the specific message thrown during an exception is correct.
    
        Parameters
        ----------
        exceptions : exception or tuple of exception
            An Exception object.
    
        message : str
            The error message or a substring of the error message.
    
        function : callable
            Callable object to raise error.
    
        *args : the positional arguments to `function`.
    
        **kwargs : the keyword arguments to `function`.
        """
        try:
            function(*args, **kwargs)
        except exceptions as e:
            error_message = str(e)
            if message not in error_message:
                raise AssertionError("Error message does not include the expected"
                                     " string: %r. Observed error message: %r" %
                                     (message, error_message))
        else:
            # concatenate exception names
            if isinstance(exceptions, tuple):
                names = " or ".join(e.__name__ for e in exceptions)
            else:
                names = exceptions.__name__
    
            raise AssertionError("%s not raised by %s" %
>                                (names, function.__name__))
E           AssertionError: AttributeError or ValueError not raised by predict
args       = (array([[-0.45364538, -0.47282444, -1.20608008, ..., -0.75500806,
         0.25... 0.45314754, -1.1924583 , ..., -1.72110924,
         0.27199858, -1.38161571]]),)
exceptions = (<type 'exceptions.AttributeError'>, <type 'exceptions.ValueError'>)
function   = <bound method GaussianProcessRegressor.predict of GaussianProcessRegressor(alp... normalize_y=False,
             optimizer='fmin_l_bfgs_b', random_state=None)>
kwargs     = {}
message    = 'fit'
names      = 'AttributeError or ValueError'
/home/travis/build/scikit-learn/scikit-learn/sklearn/utils/testing.py:404: AssertionError

Any idea why this failure is occurring?
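For context, check_estimators_unfitted calls predict on an estimator that has not been fitted and expects an AttributeError or ValueError; the failure above means GaussianProcessRegressor.predict returned normally instead. The guard the check looks for follows this pattern (a minimal pure-Python sketch mirroring, but not importing, scikit-learn's check_is_fitted; ToyRegressor is made up for illustration):

```python
class NotFittedError(ValueError, AttributeError):
    """Raised before fit; subclasses both types the common test accepts."""


class ToyRegressor:
    def fit(self, X, y):
        # By convention, attributes set during fit end with an underscore.
        self.coef_ = [0.0] * len(X[0])
        return self

    def predict(self, X):
        # check_estimators_unfitted passes when this guard is present:
        if not hasattr(self, "coef_"):
            raise NotFittedError(
                "This ToyRegressor instance is not fitted yet. Call 'fit' first.")
        return [0.0 for _ in X]
```

The message contains the substring 'fit', which is what assert_raise_message above checks for.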

@jnothman (Member)

I say we merge on green.

@sklearn-lgtm

This pull request fixes 3 alerts when merging ee5710d into 62301aa - view on LGTM.com

fixed alerts:

  • 1 for Non-callable called
  • 1 for Unused import
  • 1 for Non-iterable used in for loop

Comment posted by LGTM.com

@jnothman jnothman merged commit eec7649 into scikit-learn:master Jun 24, 2018
@jnothman (Member)

Thanks @amueller

@jnothman (Member)

And thanks @massich!!

@jnothman (Member)

Cue lots of complaints about cross_validation and grid_search disappearing... :)

@jnothman (Member)

We just lost 10,200 lines of code :D

@amueller (Member, Author)

Thanks for the fixes @jnothman! Sorry I was absent; I'm so glad this is in!

@amueller amueller deleted the 0_20_deprecations branch June 27, 2018 16:14
@jnothman (Member)

jnothman commented Jun 27, 2018 via email

@amueller (Member, Author)

What's your preferred timeline now?

@jnothman (Member)

My preferred timeline? I'm feeling very full up of work things at the moment and can't see myself being able to do anything focused towards release... but mostly there are a handful of things that we should still be trying to squeeze into release (deprecations, bug fixes, maybe a MissingIndicator, etc.), several of which are awaiting second review. The key features are in.

@jnothman (Member)

I would also personally like to see some of #9599 merged to help libraries extend the search approach in BaseSearchCV...

@amueller (Member, Author)

I really would like the tags but not sure it's worth delaying the release...
Can you / have you flagged stuff for release?

@glemaitre (Member)

We did flag the issues/PRs for the release with the 0.20 milestone.
Of course, we might have missed some, and some others could be too challenging.
