[MRG+1] Deprecate axis parameter in imputer #10558

Merged
merged 10 commits into from
Feb 8, 2018

qinhanmin2014
Member

@qinhanmin2014 qinhanmin2014 commented Jan 31, 2018

Reference Issues/PRs

Fixes #9463
Closes #9672

What does this implement/fix? Explain your changes.

We have been unable to contact the author of the original PR, so I am trying to complete it.
Improvements (according to the reviews in #9672):
(1) Improve the deprecation message
(2) Add a test
(3) Correct what's new
(4) Ignore deprecation warnings in the tests (since there are too many warnings)
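Points (2) and (4) can be illustrated with the standard library alone. The following is a minimal sketch, not the test code from this PR: `legacy_call`, `call_and_record`, and `call_silently` are hypothetical names, and `legacy_call` merely stands in for an estimator method whose `axis` parameter is deprecated.

```python
import warnings


def legacy_call(axis=None):
    """Hypothetical stand-in for a method with a deprecated 'axis' parameter."""
    if axis is not None:
        warnings.warn("Parameter 'axis' has been deprecated.",
                      DeprecationWarning)
        return axis
    return 0


def call_and_record(axis):
    """Run legacy_call and return its result plus the warning categories seen."""
    with warnings.catch_warnings(record=True) as caught:
        # "always" makes sure the warning is not swallowed by a prior filter.
        warnings.simplefilter("always")
        result = legacy_call(axis=axis)
    return result, [w.category for w in caught]


def call_silently(axis):
    """Run legacy_call with DeprecationWarning suppressed, as a noisy test might."""
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", DeprecationWarning)
        return legacy_call(axis=axis)
```

A test for the deprecation would assert on the recorded categories, while tests that only exercise existing behavior could use the silent variant to keep the output readable.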

Any other comments?

@glemaitre
Member

LGTM

@glemaitre glemaitre changed the title from "[MRG] Deprecate axis parameter in imputer" to "[MRG+1] Deprecate axis parameter in imputer" Feb 7, 2018
@glemaitre
Member

@jnothman Since you reviewed the original PR, I think this is ready to be merged.

Member

@jnothman jnothman left a comment

doc/modules/preprocessing.rst sets axis explicitly and will now raise a warning. examples/plot_missing_values.py does too. They should be changed.

warnings.warn("Parameter 'axis' has been deprecated in 0.20 and "
              "will be removed in 0.22. Future (and default) "
              "behavior is equivalent to 'axis=0' (impute along "
              "columns).", DeprecationWarning)
Member

We should probably say "Row-wise imputation can be performed with FunctionTransformer." Could even specify FunctionTransformer(lambda X: Imputer(...).fit_transform(X.T).T)
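The transpose trick suggested here could look roughly like the sketch below. It rests on assumptions that are not part of this PR: it uses `SimpleImputer` from `sklearn.impute` (available in scikit-learn 0.20 and later) rather than the `Imputer` class discussed above, the helper name `impute_rows` is hypothetical, and `validate=False` is passed so that `FunctionTransformer` does not reject the NaN entries before the function runs.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import FunctionTransformer


def impute_rows(X):
    """Hypothetical helper: impute each row by its own mean.

    Transposing turns rows into columns, so the usual column-wise
    imputation fills every sample from its own observed values.
    """
    return SimpleImputer(strategy="mean").fit_transform(X.T).T


# validate=False keeps FunctionTransformer from checking for NaN itself.
row_imputer = FunctionTransformer(impute_rows, validate=False)

X = np.array([[1.0, np.nan, 3.0],
              [4.0, 6.0, np.nan]])
X_filled = row_imputer.fit_transform(X)
```

Because the statistics are recomputed from whatever array is passed in, this transformer is stateless, which matches the per-sample behavior of the old axis=1 mode. One caveat to keep in mind: a row with no observed values at all has no mean to fall back on, so how it is handled depends on the imputer's settings.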

Member

@jnothman jnothman left a comment

Thanks

@jnothman jnothman merged commit 52aaf82 into scikit-learn:master Feb 8, 2018
@qinhanmin2014 qinhanmin2014 deleted the deprecate_imputer_axis branch February 8, 2018 06:57
@qinhanmin2014
Member Author

Thanks @glemaitre @jnothman :)

glemaitre pushed a commit to glemaitre/scikit-learn that referenced this pull request Feb 8, 2018
@sergeyf
Contributor

sergeyf commented Feb 12, 2018

FYI, this seems to have broken a test: test_permutation_test_score_allow_nans, which is here:

def test_permutation_test_score_allow_nans():

The issue is that the axis check was not changed to use _axis, so statistics_ is not set when axis is left at its default of None:

        if self.axis is None:
            self._axis = 0
        else:
            warnings.warn("Parameter 'axis' has been deprecated in 0.20 and "
                          "will be removed in 0.22. Future (and default) "
                          "behavior is equivalent to 'axis=0' (impute along "
                          "columns). Row-wise imputation can be performed "
                          "with FunctionTransformer.", DeprecationWarning)
            self._axis = self.axis

        if self._axis not in [0, 1]:
            raise ValueError("Can only impute missing values on axis 0 and 1, "
                             " got axis={0}".format(self._axis))

        # Since two different arrays can be provided in fit(X) and
        # transform(X), the imputation data will be computed in transform()
        # when the imputation is done per sample (i.e., when axis=1).
        if self.axis == 0:
            faf = 'allow-nan' if self.missing_values == 'NaN' else True
            X = check_array(X, accept_sparse='csc', dtype=np.float64,
                            force_all_finite=faf)

            if sparse.issparse(X):
                self.statistics_ = self._sparse_fit(X,
                                                    self.strategy,
                                                    self.missing_values,
                                                    self._axis)
            else:
                self.statistics_ = self._dense_fit(X,
                                                   self.strategy,
                                                   self.missing_values,
                                                   self._axis)

I can fix this in the MICE PR, as this change also affects how MICE will make use of Imputer.
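For illustration, the intended pattern can be reduced to a toy class using only the standard library. This is a sketch under stated assumptions, not scikit-learn code: `ToyImputer` is a hypothetical name, it only computes column means of a list of lists, and the point is simply that every check after the deprecation block reads the resolved `self._axis` instead of the raw `self.axis`.

```python
import warnings


class ToyImputer:
    """Hypothetical, heavily simplified stand-in for an imputer."""

    def __init__(self, axis=None):
        # None means "not set by the user"; the effective default is 0.
        self.axis = axis

    def fit(self, X):
        if self.axis is None:
            self._axis = 0
        else:
            warnings.warn("Parameter 'axis' has been deprecated.",
                          DeprecationWarning)
            self._axis = self.axis

        if self._axis not in [0, 1]:
            raise ValueError("Can only impute missing values on axis 0 "
                             "and 1, got axis={0}".format(self._axis))

        # Compare against the resolved value. Testing self.axis == 0 here
        # would be False for the default (None) and skip this branch.
        if self._axis == 0:
            columns = list(zip(*X))
            self.statistics_ = [sum(col) / len(col) for col in columns]
        return self
```

With the default `axis=None`, `fit` sets `statistics_`; comparing `self.axis` to 0 instead would leave the attribute undefined, which is the failure mode described above.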

@sergeyf
Contributor

sergeyf commented Feb 12, 2018

The fix is here: 8350981

@jnothman
Member

Please submit a separate PR... why is this not failing in master?

@sergeyf
Contributor

sergeyf commented Feb 12, 2018

OK will do. No idea why it's not failing in master.

@sergeyf
Contributor

sergeyf commented Feb 12, 2018

Hmm, it looks correct in master. It's probably a mistake I made while merging the MICE PR to master, but I'm not sure how that would have happened. My apologies for the bother!

Successfully merging this pull request may close these issues.

Deprecate Imputer with axis=1
5 participants