sklearn.preprocessing.MinMaxScaler not preserving symmetry / Add axis=None #4892

Closed · alessiob opened this issue Jun 24, 2015 · 17 comments

@alessiob

MinMaxScaler does not preserve symmetry.

scikit-learn 0.15.2 and 0.16.1
Windows 7 SP1, 64-bit
Python 2.7.9, 32-bit

An affected numpy matrix and the script to reproduce the problem are available at: https://www.dropbox.com/s/vkcuq71wa69jrw7/sklearn-bug.tar?dl=0

@TomDLT (Member) commented Jun 24, 2015

# A simpler example:
import numpy as np
from sklearn.preprocessing import MinMaxScaler

X = np.array([[1., 2.],
              [2., 10.]])
# The symmetric X is transformed into the non-symmetric:
MinMaxScaler().fit_transform(X)
# array([[0., 0.],
#        [1., 1.]])

This is not a bug: MinMaxScaler scales each feature (column) individually.
The docstring says:

    This estimator scales and translates each feature individually such
    that it is in the given range on the training set, i.e. between zero and one.

@jnothman (Member)

Perhaps we should consider supporting axis=None? Ping @untom.

@untom (Contributor) commented Jun 25, 2015

Is there a common enough use case to justify adding axis=None? I can't think of one.

In a pinch, the same result can be had by using ravel() on the input and reshape() on the result of the scaler.
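
As a minimal sketch of that workaround (assuming a dense 2-D input; the array X below is illustrative):

import numpy as np
from sklearn.preprocessing import MinMaxScaler

X = np.array([[1., 2.],
              [2., 10.]])

# Flatten to a single column so the scaler sees one "feature",
# then restore the original shape afterwards; symmetry is preserved.
X_scaled = MinMaxScaler().fit_transform(X.ravel().reshape(-1, 1)).reshape(X.shape)
# array([[0.        , 0.11111111],
#        [0.11111111, 1.        ]])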

@alessiob (Author)

Thanks for the answers and my apologies.
axis=None would be very useful in my case.

@jnothman (Member)

ravel and reshape is not a pretty operation to achieve in a pipeline!


@amueller (Member) commented Jul 1, 2015

Is this for a pairwise distance matrix? Preprocessors other than KernelCenterer are not really supposed to be used on that.
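
For illustration, if the input really is a pairwise (kernel) matrix, here is a rough sketch of the KernelCenterer route mentioned above (the data and the kernel choice are assumptions, not from the thread):

import numpy as np
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.preprocessing import KernelCenterer

X = np.array([[1., 2.], [2., 10.], [3., 4.]])
K = rbf_kernel(X)                              # symmetric pairwise similarity matrix
K_centered = KernelCenterer().fit_transform(K)
np.allclose(K_centered, K_centered.T)          # True: centering preserves symmetry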

@jnothman (Member) commented Jul 2, 2015

Is there a reason not to support axis=None, @amueller (except in sparse, where it requires real additional work)?


@amueller (Member)

No, I think it would actually be cool.

@amueller added the Easy, Documentation, and Need Contributor labels on Jul 11, 2015
@amueller changed the title from "sklearn.preprocessing.MinMaxScaler not preserving symmetry" to "sklearn.preprocessing.MinMaxScaler not preserving symmetry / Add axis=None" on Jul 11, 2015
@stephen-hoover (Contributor)

I can work on this, but it appears that none of the Scalers accept an "axis" argument. All of them operate only on single features independent of the other features. Should I add an "axis" argument to all of them, accepting inputs of [0, ..., ndim-1] or None (defaulting to 0)?

@amueller (Member)

ndim is always 2. I thought we had an axis argument in the scalers, but it seems that exists only in the function interface, which feels slightly odd.
Maybe just allow 0 or None, defaulting to 0 for now. That would make sense for both MinMaxScaler and StandardScaler.

@untom (Contributor) commented Jul 11, 2015

I once tried introducing an axis argument to all scalers back in #2514, but IIRC the problem was that axis=1 does not make sense for scalers and is likely overengineering (see #3639 (comment)). Since then, I have come to agree with that viewpoint: most of sklearn assumes a samples-by-features format, and normalizing along anything other than features doesn't make much sense. For the rare cases where it is needed, there are the scaling functions (see the example below).
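
For reference, a small example of the function interface referred to here (minmax_scale; the axis values shown are just for illustration, and axis=None is not supported):

import numpy as np
from sklearn.preprocessing import minmax_scale

X = np.array([[1., 2.],
              [2., 10.]])

minmax_scale(X, axis=0)  # scale each column (feature) independently
# array([[0., 0.],
#        [1., 1.]])

minmax_scale(X, axis=1)  # scale each row (sample) independently
# array([[0., 1.],
#        [0., 1.]])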

@amueller (Member)

We could add axis=None to the function interface? Not sure, though.

@untom (Contributor) commented Jul 11, 2015

I'm not sure how that affects the original issue in this thread (e.g. whether the application scenario involves fitting a scaler on a training set and applying it to the test data or not).
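
To make the fit/transform concern concrete, here is a minimal sketch (not sklearn API; the class name GlobalMinMaxScaler and its details are assumptions) of what an axis=None scaler would have to do so it can be fit on training data and applied to test data:

import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin

class GlobalMinMaxScaler(BaseEstimator, TransformerMixin):
    """Scale all entries to [0, 1] using one global min/range learned from the training matrix."""

    def fit(self, X, y=None):
        X = np.asarray(X, dtype=float)
        self.data_min_ = X.min()                    # global minimum over all entries
        self.data_range_ = X.max() - self.data_min_
        return self

    def transform(self, X):
        X = np.asarray(X, dtype=float)
        return (X - self.data_min_) / self.data_range_

# Fit on training data, then reuse the same global min/range on the test data.
X_train = np.array([[1., 2.], [2., 10.]])
X_test = np.array([[3., 4.], [4., 5.]])
scaler = GlobalMinMaxScaler().fit(X_train)
scaler.transform(X_test)
# array([[0.22222222, 0.33333333],
#        [0.33333333, 0.44444444]])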

@stephen-hoover (Contributor)

I think that it's useful to allow the "axis=None" option, but that might not be the best option name. What if the Scalers took an option "grouped=False"?

@thomasjpfan (Member)

@amueller I do not see a use case for axis=None. Without a use case, I am overall -1 on adding this feature.

@thomasjpfan added the Needs Decision - Include Feature and New Feature labels and removed the Easy and Documentation labels on Jul 21, 2022
@TomDLT (Member) commented Jul 21, 2022

Apart from numerical reasons, I don't see any use case either for scaling all the features in the same way. I would be surprised if any estimator behaved differently depending on the global scale of the features.

-1 as well

@thomasjpfan (Member)

Given the comments #4963 (comment), #4892 (comment), and #4892 (comment), I do not think we will include this feature.

@thomasjpfan closed this as not planned (won't fix, can't repro, duplicate, stale) on Jul 22, 2022
Repository owner moved this from Todo 📬 to Done 🚀 in Quansight's scikit-learn Project Board on Jul 22, 2022