[MRG] ENH validate sample_weight with _check_sample_weight in IsotonicRegression #16203


Status: Closed — wants to merge 2 commits.
16 changes: 5 additions & 11 deletions sklearn/isotonic.py
@@ -8,6 +8,7 @@
     from scipy.stats import spearmanr
     from .base import BaseEstimator, TransformerMixin, RegressorMixin
     from .utils import check_array, check_consistent_length
    +from .utils.validation import _check_sample_weight
     from ._isotonic import _inplace_contiguous_isotonic_regression, _make_unique
     import warnings
     import math
@@ -121,10 +122,7 @@ def isotonic_regression(y, sample_weight=None, y_min=None, y_max=None,
         order = np.s_[:] if increasing else np.s_[::-1]
         y = check_array(y, ensure_2d=False, dtype=[np.float64, np.float32])
         y = np.array(y[order], dtype=y.dtype)
    -    if sample_weight is None:
    -        sample_weight = np.ones(len(y), dtype=y.dtype)
    -    else:
    -        sample_weight = np.array(sample_weight[order], dtype=y.dtype)
    +    sample_weight = _check_sample_weight(sample_weight, y, dtype=y.dtype)
Member:
Thanks!

You still need to apply the order indexing, maybe as:

    if sample_weight is None:
        sample_weight = np.ones(len(y), dtype=y.dtype)
    else:
        sample_weight = _check_sample_weight(sample_weight, y, dtype=y.dtype)
        sample_weight = sample_weight[order]

Member:

Don't you think

    sample_weight = _check_sample_weight(sample_weight, y, dtype=y.dtype)
    sample_weight = sample_weight[order]

is enough? It makes the code simpler; reordering an array of ones has no effect, and efficiency is not a problem there.
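The no-op claim is easy to verify directly (a quick numpy sketch, not part of the PR):

```python
import numpy as np

# Reordering a uniform weight vector is a no-op, so validating before
# or after slicing with `order` gives the same result in the default
# (sample_weight=None, all-ones) case.
w = np.ones(5)
order = np.s_[::-1]  # decreasing-order slice, as used in isotonic_regression
assert np.array_equal(w[order], w)

# With non-uniform weights the reordering does matter, so the
# `sample_weight = sample_weight[order]` line is still required.
w2 = np.array([1.0, 2.0, 3.0])
assert not np.array_equal(w2[order], w2)
```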

Member:

> It makes the code simpler and reordering ones has no effect and efficiency is not a problem there.

If you prefer. Slicing an array with int indexing does have some cost, but it should indeed be minimal.

Member:

I thought the same :)


         _inplace_contiguous_isotonic_regression(y, sample_weight)
         if y_min is not None or y_max is not None:
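For readers without the scikit-learn source at hand, the behaviour of the private `_check_sample_weight` helper discussed above can be approximated as follows. This is a simplified, hypothetical stand-in, not the actual implementation (the real helper also handles scalar weights and additional dtype cases):

```python
import numpy as np

def check_sample_weight_sketch(sample_weight, X, dtype=np.float64):
    """Simplified stand-in for sklearn's private _check_sample_weight."""
    n_samples = len(X)
    if sample_weight is None:
        # Default: uniform weights of the requested dtype.
        return np.ones(n_samples, dtype=dtype)
    sample_weight = np.asarray(sample_weight, dtype=dtype)
    if sample_weight.shape != (n_samples,):
        raise ValueError("sample_weight.shape == %r, expected (%d,)"
                         % (sample_weight.shape, n_samples))
    return sample_weight

# Validate first, then apply the order, as settled in the review:
y = np.array([3.0, 1.0, 2.0])
order = np.s_[::-1]
sample_weight = check_sample_weight_sketch([0.5, 1.0, 2.0], y, dtype=y.dtype)
sample_weight = sample_weight[order]  # now [2.0, 1.0, 0.5]
```

Validating before reordering keeps the shape check against the original `y` straightforward; the slicing afterwards is cheap, as noted in the discussion.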
@@ -261,13 +259,9 @@ def _build_y(self, X, y, sample_weight, trim_duplicates=True):

         # If sample_weights is passed, removed zero-weight values and clean
         # order
    -    if sample_weight is not None:
    -        sample_weight = check_array(sample_weight, ensure_2d=False,
    -                                    dtype=X.dtype)
    -        mask = sample_weight > 0
    -        X, y, sample_weight = X[mask], y[mask], sample_weight[mask]
    -    else:
    -        sample_weight = np.ones(len(y), dtype=X.dtype)
    +    sample_weight = _check_sample_weight(sample_weight, X, dtype=X.dtype)
    +    mask = sample_weight > 0
    +    X, y, sample_weight = X[mask], y[mask], sample_weight[mask]

         order = np.lexsort((y, X))
         X, y, sample_weight = [array[order] for array in [X, y, sample_weight]]
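The effect of the masking and sorting in this hunk can be sketched with plain numpy (illustrative values, not taken from the PR):

```python
import numpy as np

X = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 0.0, 4.0, 3.0])
sample_weight = np.array([1.0, 0.0, 2.0, 1.0])

# Drop zero-weight samples; after this PR the mask is applied on every
# call, since sample_weight is always a validated array by this point.
mask = sample_weight > 0
X, y, sample_weight = X[mask], y[mask], sample_weight[mask]

# lexsort sorts by its last key first, so this orders primarily by X
# and breaks ties by y.
order = np.lexsort((y, X))
X, y, sample_weight = [array[order] for array in [X, y, sample_weight]]
# X == [1., 3., 4.]: the zero-weight sample at X == 2.0 is gone
```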