[MRG] DOC Clarify RobustScaler behavior with sparse input #8858

Merged 1 commit on Jul 29, 2017
2 changes: 1 addition & 1 deletion doc/modules/preprocessing.rst
@@ -199,7 +199,7 @@ matrices as input, as long as ``with_mean=False`` is explicitly passed
to the constructor. Otherwise a ``ValueError`` will be raised as
silently centering would break the sparsity and would often crash the
execution by allocating excessive amounts of memory unintentionally.
-:class:`RobustScaler` cannot be fited to sparse inputs, but you can use
+:class:`RobustScaler` cannot be fitted to sparse inputs, but you can use
Member:

Ah that was the typo, good catch

the ``transform`` method on sparse inputs.

Note that the scalers accept both Compressed Sparse Rows and Compressed
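The documentation hunk above says sparse input is accepted only when ``with_mean=False`` is passed, and that a ``ValueError`` is raised otherwise. A minimal sketch of that behavior, assuming the truncated sentence refers to `StandardScaler` (whose constructor takes `with_mean`) and that SciPy and scikit-learn are installed; the sample matrix is made up for illustration:

```python
import numpy as np
from scipy import sparse
from sklearn.preprocessing import StandardScaler

# Small illustrative sparse matrix (hypothetical data).
X_sparse = sparse.csr_matrix(np.array([[1.0, 0.0], [0.0, 2.0], [3.0, 0.0]]))

# Scaling without centering keeps implicit zeros as zeros,
# so the result can stay sparse.
scaled = StandardScaler(with_mean=False).fit_transform(X_sparse)

# The default (with_mean=True) would require centering, which is rejected
# for sparse input instead of silently densifying the matrix.
try:
    StandardScaler().fit(X_sparse)
    centering_rejected = False
except ValueError:
    centering_rejected = True
```

The output of the first call is still a sparse matrix with the same number of stored values as the input.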
13 changes: 8 additions & 5 deletions sklearn/preprocessing/data.py
@@ -909,9 +909,9 @@ class RobustScaler(BaseEstimator, TransformerMixin):
and the 3rd quartile (75th quantile).

Centering and scaling happen independently on each feature (or each
-sample, depending on the `axis` argument) by computing the relevant
+sample, depending on the ``axis`` argument) by computing the relevant
statistics on the samples in the training set. Median and interquartile
-range are then stored to be used on later data using the `transform`
+range are then stored to be used on later data using the ``transform``
method.

Standardization of a dataset is a common requirement for many
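The docstring in this hunk describes computing a median and interquartile range per feature on the training set, then reusing them on later data. A standard-library sketch of that idea for a single feature; it is not scikit-learn's implementation, and the helper names `fit_stats` and `apply_stats` are hypothetical:

```python
import statistics

def fit_stats(train_values):
    """Return (median, IQR) of one feature, computed on training data."""
    # method="inclusive" interpolates linearly between data points.
    q1, median, q3 = statistics.quantiles(train_values, n=4, method="inclusive")
    return median, q3 - q1

def apply_stats(values, median, iqr, with_centering=True):
    """Scale values using statistics stored from the training set."""
    center = median if with_centering else 0.0
    return [(v - center) / iqr for v in values]

train = [1.0, 2.0, 3.0, 4.0, 100.0]   # 100.0 is an outlier
median, iqr = fit_stats(train)         # median 3.0, IQR 4.0 - 2.0 = 2.0

train_scaled = apply_stats(train, median, iqr)
later_scaled = apply_stats([5.0], median, iqr)  # reuses the stored statistics
```

Because the median and quartiles ignore how far the outlier lies from the rest, the value 100.0 does not distort the scale applied to the other points.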
@@ -928,7 +928,7 @@ class RobustScaler(BaseEstimator, TransformerMixin):
----------
with_centering : boolean, True by default
If True, center the data before scaling.
-This does not work (and will raise an exception) when attempted on
+This will cause ``transform`` to raise an exception when attempted on
@lesteve (Member), May 11, 2017:

I think .fit raises an exception if X is sparse. Edit: scratch this as well.

Contributor Author:

Yeah, fit on sparse raises regardless of any option. You can transform on sparse after fitting on dense, apparently?

sparse matrices, because centering them entails building a dense
matrix which in common use cases is likely to be too large to fit in
memory.
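The `with_centering` description above says that centering a sparse matrix entails building a dense one. A standard-library sketch of why, using one made-up column and an arbitrary scale factor of 2.0 (both hypothetical): dividing by a constant keeps zeros at zero, while subtracting a nonzero center turns every zero into a stored value.

```python
import statistics

def count_nonzero(values):
    """Number of entries a sparse format would have to store explicitly."""
    return sum(1 for v in values if v != 0.0)

column = [0.0, 0.0, 0.0, 4.0, 6.0, 6.0, 8.0, 9.0]
center = statistics.median(column)          # 5.0, a nonzero center

scaled_only = [v / 2.0 for v in column]     # zeros remain zeros
centered = [v - center for v in column]     # zeros become -5.0

nnz_before = count_nonzero(column)
nnz_scaled = count_nonzero(scaled_only)
nnz_centered = count_nonzero(centered)
```

Here scaling leaves 5 stored values, while centering produces 8, one for every row. Applied to a large sparse matrix, the same effect is what makes the result too large to fit in memory.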
@@ -1023,11 +1023,14 @@ def fit(self, X, y=None):
return self

def transform(self, X):
-"""Center and scale the data
+"""Center and scale the data.

+Can be called on sparse input, provided that ``RobustScaler`` has been
+fitted to dense input and ``with_centering=False``.
+
Parameters
----------
Member:

The type description for X here site [should] mention sparse

Contributor Author:

👍

-X : array-like
+X : {array-like, sparse matrix}
Contributor Author:

@jnothman if I read your feedback correctly, does this address the comment?

Member:

Yes. Sorry for the autocorrect typo.

The data used to scale along the specified axis.
"""
if self.with_centering:
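The updated ``transform`` docstring says it can be called on sparse input once ``RobustScaler`` has been fitted to dense input with ``with_centering=False``. A minimal sketch of that workflow, assuming NumPy, SciPy, and scikit-learn are installed; the data is made up, and releases later than the one this PR targeted may treat sparse input to ``fit`` differently:

```python
import numpy as np
from scipy import sparse
from sklearn.preprocessing import RobustScaler

# Hypothetical training data, kept dense for fitting.
X_dense = np.array([
    [1.0, 0.0, 2.0],
    [2.0, 0.0, 0.0],
    [4.0, 3.0, 0.0],
    [8.0, 0.0, 6.0],
])

# Disable centering so that transforming sparse data never has to densify it.
scaler = RobustScaler(with_centering=False).fit(X_dense)

# The same values in CSR format; transform scales the stored entries only.
X_sparse = sparse.csr_matrix(X_dense)
X_scaled = scaler.transform(X_sparse)
```

The result stays sparse, and each column is simply divided by the interquartile range stored in `scaler.scale_` during fitting.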