[MRG+1] Deprecate Imputer.axis argument #9672
dee866d to c900dd5
These changes should make the CI happy.
Also, put [WIP] in the title while working on the PR and change it to [MRG] when it is ready for review.
doc/whats_new.rst
Outdated
@@ -47,6 +47,11 @@ Model evaluation and meta-estimators

- A scorer based on :func:`metrics.brier_score_loss` is also available.
  :issue:`9521` by :user:`Hanmin Qin <qinhanmin2014>`.
- The ``axis`` parameter in
Add a blank line
sklearn/preprocessing/imputation.py
Outdated
self.verbose = verbose
self.copy = copy

self.axis = axis
if axis is not None:
The warning should be raised in the fit method. Validation is postponed until fit for the SearchCV objects: http://scikit-learn.org/stable/developers/contributing.html#instantiation
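A minimal sketch of this point, assuming a hypothetical `ToyImputer` class rather than scikit-learn's actual `Imputer`: `__init__` only stores the parameters, and both the deprecation warning and the validation run in `fit`. The class name and the warning text are illustrative assumptions.

```python
import warnings


class ToyImputer:
    """Hypothetical stand-in for an estimator; not scikit-learn's Imputer."""

    def __init__(self, strategy="mean", axis=None):
        # __init__ only stores parameters: no validation, no warnings.
        self.strategy = strategy
        self.axis = axis

    def fit(self, X, y=None):
        # The deprecation warning is raised here, once fitting starts.
        if self.axis is not None:
            warnings.warn("'axis' is deprecated.", DeprecationWarning)
        # Parameter validation is also deferred to fit.
        if self.axis not in (None, 0, 1):
            raise ValueError(
                "Can only impute missing values on axis 0 and 1, "
                "got axis={0}".format(self.axis)
            )
        return self
```

With this layout, constructing `ToyImputer(axis=1)` is silent, and the warning only appears when `fit` is called.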
sklearn/preprocessing/imputation.py
Outdated
@@ -169,8 +178,12 @@ def fit(self, X, y=None):

def _sparse_fit(self, X, strategy, missing_values, axis):
    """Fit the transformer on sparse data."""
    if axis is None:
No need for this check, since self.axis_ will already be 0 or 1.
sklearn/preprocessing/imputation.py
Outdated
# Imputation is done "by column", so if we want to do it
# by row we only need to convert the matrix to csr format.

Remove the blank line. We try to keep changes that bring nothing out of the diff.
sklearn/preprocessing/imputation.py
Outdated
@@ -249,6 +262,9 @@ def _sparse_fit(self, X, strategy, missing_values, axis):

def _dense_fit(self, X, strategy, missing_values, axis):
    """Fit the transformer on dense data."""
    if axis is None:
No need for this check, since self.axis_ will already be 0 or 1.
sklearn/preprocessing/imputation.py
Outdated
@@ -306,7 +322,7 @@ def transform(self, X):
  X : {array-like, sparse matrix}, shape = [n_samples, n_features]
      The input data to complete.
  """
- if self.axis == 0:
+ if self.axis is None or self.axis == 0:
No need for this check, since self.axis_ will already be 0 or 1.
sklearn/preprocessing/imputation.py
Outdated
@@ -341,7 +357,7 @@ def transform(self, X):
  valid_statistics_indexes = np.where(valid_mask)[0]
  missing = np.arange(X.shape[not self.axis])[invalid_mask]

- if self.axis == 0 and invalid_mask.any():
+ if (self.axis is None or self.axis == 0) and invalid_mask.any():
No need for this check, since self.axis_ will already be 0 or 1.
sklearn/preprocessing/imputation.py
Outdated
@@ -366,7 +382,7 @@ def transform(self, X):
  n_missing = np.sum(mask, axis=self.axis)
  values = np.repeat(valid_statistics, n_missing)

- if self.axis == 0:
+ if self.axis is None or self.axis == 0:
No need for this check, since self.axis_ will already be 0 or 1.
sklearn/preprocessing/imputation.py
Outdated
      raise ValueError("Can only impute missing values on axis 0 and 1, "
                       " got axis={0}".format(self.axis))

  # Since two different arrays can be provided in fit(X) and
  # transform(X), the imputation data will be computed in transform()
  # when the imputation is done per sample (i.e., when axis=1).
- if self.axis == 0:
+ if self.axis == 0 or self.axis is None:
No need for this check, since self.axis_ will already be 0 or 1.
sklearn/preprocessing/imputation.py
Outdated
The axis along which to impute.

- If `axis=0`, then impute along columns.
- If `axis=1`, then impute along rows.

.. deprecated:: 0.20
    ``axis`` will be removed from ``Imputer``, and it will only impute
    along columns (axis=0) in 0.22.
(axis=0) -> (i.e., ``axis=0``)
…and equals to 0 when axis is None.
LGTM. @jnothman could you have a look?
sklearn/preprocessing/imputation.py
Outdated
@@ -143,27 +147,35 @@ def fit(self, X, y=None):
" got strategy={1}".format(allowed_strategies,
self.strategy))

if self.axis not in [0, 1]:
if self.axis is None:
    self.axis_ = 0
This attribute implicitly becomes part of the public API. Either use a private attribute (beginning _) or use a local variable
Oops, my bad... I should have advised making it private.
LGTM
Small comment on the deprecation message.
It would be great to add a test to make sure we get the DeprecationWarning when axis is set explicitly (test with axis=0, axis=1, or both).
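One way such a test could look, as a rough sketch only: it uses a hypothetical `ToyImputer` stand-in (not the real `Imputer`) and the standard library's `warnings.catch_warnings` instead of any scikit-learn testing helper. The class, helper name, and warning text are assumptions for illustration.

```python
import warnings


class ToyImputer:
    """Hypothetical stand-in; not scikit-learn's Imputer."""

    def __init__(self, axis=None):
        self.axis = axis

    def fit(self, X, y=None):
        if self.axis is None:
            # Default: silently impute along columns.
            self._axis = 0
        else:
            # An explicitly set axis triggers the deprecation warning.
            warnings.warn("'axis' is deprecated.", DeprecationWarning)
            self._axis = self.axis
        return self


def deprecation_warnings_on_fit(axis):
    """Return the DeprecationWarnings recorded while fitting with `axis`."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        ToyImputer(axis=axis).fit([[1.0, 2.0], [3.0, 4.0]])
    return [w for w in caught if issubclass(w.category, DeprecationWarning)]


def test_axis_deprecation():
    # Explicit axis values warn; the default (None) does not.
    assert len(deprecation_warnings_on_fit(0)) == 1
    assert len(deprecation_warnings_on_fit(1)) == 1
    assert deprecation_warnings_on_fit(None) == []
```

In a pytest-based suite, the same check could be written with `pytest.warns(DeprecationWarning)`.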
if self.axis is None:
    self._axis = 0
else:
    warnings.warn("'axis' will be removed from Imputer, and it will "
The convention is to state the version in which the deprecation happened as well as the one in which it will be removed.
From http://scikit-learn.org/stable/developers/contributing.html#deprecation:
As in these examples, the warning message should always give both the version in which the deprecation happened and the version in which the old behavior will be removed.
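A sketch of a message that follows this convention, using the versions from the docstring in this PR (0.20 and 0.22). The helper name `warn_axis_deprecated` and the exact wording are hypothetical, not the text that was merged.

```python
import warnings

# Versions taken from the docstring in this PR; the helper name is hypothetical.
DEPRECATED_IN = "0.20"
REMOVED_IN = "0.22"


def warn_axis_deprecated():
    """Emit a DeprecationWarning that names both versions and return its text."""
    message = (
        "Parameter 'axis' has been deprecated in {0} and will be removed "
        "in {1}. Imputation will then be done column-wise only "
        "(i.e., axis=0).".format(DEPRECATED_IN, REMOVED_IN)
    )
    warnings.warn(message, DeprecationWarning)
    return message
```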
ping @petrushev This looks good overall. Could you please do the following (see also the comments above):
(1) Resolve the conflict
(2) Improve the deprecation message
(3) Add a test to ensure the warning is raised
(4) Move the what's new entry to the API changes summary
Thanks :)
@petrushev Would you be able to address the minor comments above? This is almost good to merge :)
@qinhanmin2014 Do we really want to add a test for each deprecation warning?
@rth I don't know too much about the history, but we have tests for most API changes in 0.20. I think at least it's not harmful to test the deprecation warning. WDYT? Also see lesteve's comment above (#9672 (review)):
Good to know, thanks, I missed that.
@petrushev I've continued the work in #10558 since we've not heard from you for several months; I hope you won't mind.
Reference Issue
Fixes: #9463
What does this implement/fix? Explain your changes.
Deprecated the argument axis on the Imputer class.