Deprecate Imputer with axis=1 #9463

Closed
jnothman opened this issue Jul 30, 2017 · 7 comments · Fixed by #10558

Comments

@jnothman
Member

jnothman commented Jul 30, 2017

After having tried to deal with a few issues related to extending Imputer behaviour, I believe we should be removing the axis parameter from Imputer.

  • It seems a strange feature to support in a machine learning context, except perhaps where the features represent something like a time series.
  • It is not stateful and can be performed with a FunctionTransformer. (We could even provide a row_impute function, if we felt it necessary, which would roughly be defined as def row_impute(X, **kwargs): return Imputer(**kwargs).fit_transform(X.T).T.)
  • It unnecessarily complicates the implementation, which already has a bunch of weird edge cases (handling sparse data where missing values are indicated by 0, which is an inefficient use of a sparse data structure, and handling non-NaN missingness indicators).
  • It is often nonsensical to extend further features to the axis=1 case.
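
The row_impute idea from the second bullet can be sketched roughly as follows. This is a minimal sketch, not the project's implementation: it assumes a scikit-learn version that provides sklearn.impute.SimpleImputer (a column-wise imputer, used here in place of Imputer) and wraps the function in a FunctionTransformer. The toy arrays and the chosen strategy are illustrative only.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import FunctionTransformer


def row_impute(X, **kwargs):
    """Fill each row's missing values from that row's own observed values.

    Transposing turns rows into columns, so a column-wise imputer does the
    row-wise work; the result is transposed back to the original layout.
    """
    return SimpleImputer(**kwargs).fit_transform(X.T).T


X = np.array([[1.0, np.nan, 3.0],
              [4.0, 6.0, np.nan]])

# validate=False lets the NaN entries reach row_impute unchanged.
row_imputer = FunctionTransformer(
    row_impute, validate=False, kw_args={"strategy": "mean"}
)
X_filled = row_imputer.fit_transform(X)
print(X_filled)  # rows become [1, 2, 3] and [4, 6, 5]

# Nothing is learned in fit: new rows are filled from their own values only.
X_new = np.array([[10.0, np.nan, 20.0]])
X_new_filled = row_imputer.transform(X_new)
print(X_new_filled)  # row becomes [10, 15, 20]
```

Because every row is filled from its own observed entries, the result of transform does not depend on the data seen in fit, which is what makes the operation stateless. One caveat of this sketch: a row that is entirely missing becomes an all-missing column after transposing, and SimpleImputer drops such columns by default, so the output would lose that row.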

Do others agree?

@amueller
Member

It could be stateful for KNN, right? That might not be totally useless. But not sure if that's something that people are doing.
But yeah, it's a strange feature, and I wouldn't be opposed to removing it.

@jnothman
Member Author

jnothman commented Aug 1, 2017 via email

@amueller
Member

amueller commented Aug 1, 2017

Well, you could learn which feature is most common to which other feature, and then impute using a distance-weighted average of these features.
You could learn something like "this feature is always the average of these other two features" or "these features are perfectly correlated".
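
To make the stateful case concrete, here is a hypothetical sketch of what such a feature-wise imputer might look like. The class name FeatureNeighborImputer, its n_neighbors parameter, and the use of absolute Pearson correlation as the similarity weight are all assumptions made for illustration; none of them come from scikit-learn. The sketch also assumes the features share a scale, so that one feature's value can stand in for another's.

```python
import numpy as np


class FeatureNeighborImputer:
    """Hypothetical sketch: fill a missing feature from the features that
    were most similar to it in the training data."""

    def __init__(self, n_neighbors=1):
        self.n_neighbors = n_neighbors

    def fit(self, X):
        X = np.asarray(X, dtype=float)
        # State learned at fit time: per-feature means (fallback) and the
        # feature-to-feature similarity, computed on complete rows only.
        complete = X[~np.isnan(X).any(axis=1)]
        self.means_ = np.nanmean(X, axis=0)
        self.similarity_ = np.abs(np.corrcoef(complete, rowvar=False))
        np.fill_diagonal(self.similarity_, 0.0)
        return self

    def transform(self, X):
        X = np.asarray(X, dtype=float)
        missing = np.isnan(X)
        out = X.copy()
        for i, j in zip(*np.nonzero(missing)):
            # Features missing in this row get zero weight.
            weights = np.where(missing[i], 0.0, self.similarity_[j])
            top = np.argsort(weights)[-self.n_neighbors:]
            top = top[weights[top] > 0]
            if top.size:
                # Similarity-weighted average of the observed neighbours.
                w = weights[top]
                out[i, j] = np.dot(w, X[i, top]) / w.sum()
            else:
                out[i, j] = self.means_[j]
        return out


# Feature 1 always equals feature 0 in training; feature 2 is unrelated.
X_train = np.array([[1.0, 1.0, 5.0],
                    [2.0, 2.0, 3.0],
                    [3.0, 3.0, 6.0],
                    [4.0, 4.0, 2.0]])
X_new = np.array([[7.0, np.nan, 4.0]])

imputer = FeatureNeighborImputer(n_neighbors=1).fit(X_train)
X_new_filled = imputer.transform(X_new)
print(X_new_filled)  # the missing entry is filled from feature 0, giving about 7
```

Unlike plain row-wise mean imputation, the filled value here depends on relationships learned in fit, so the transformer carries state between fit and transform.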

@jnothman
Member Author

jnothman commented Aug 1, 2017 via email

@amueller
Member

amueller commented Aug 2, 2017

yeah I agree.

@jnothman changed the title from "Deprecate Imputer with axis=1?" to "Deprecate Imputer with axis=1" on Aug 2, 2017
@petrushev

Since the only other sensible value would be axis=0, does this mean we should deprecate the parameter completely?

@jnothman
Member Author

jnothman commented Sep 2, 2017 via email

3 participants