Thanks to visit codestin.com
Credit goes to github.com

Skip to content

[MRG] ENH validate sample_weight with _check_sample_weight in IsotonicRegression #16203

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 2 commits into from
Closed

Conversation

marijavlajic
Copy link
Contributor

Reference Issues/PRs

Fixes #15358 for IsotonicRegression.

What does this implement/fix? Explain your changes.

Replaces custom validation logic with standardized method utils.validation._check_sample_weight.

Any other comments?

Worked with @gelavizh1 @lschwetlick at scikit-learn sprint.

Worked also on the same issue in naive_bayes.py / for BaseDiscreteNB but it looks like the appropriate changes have already been made.

CC @adrinjalali @noatamir #WiMLDS

Copy link
Member

@glemaitre glemaitre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks @marijavlajic

@glemaitre glemaitre changed the title [MRG] Fix "Use _check_sample_weight to consistently validate sample_weight" in IsotonicRegression [MRG] ENH validate sample_weight with _check_sample_weight in IsotonicRegression Jan 26, 2020
sample_weight = np.ones(len(y), dtype=y.dtype)
else:
sample_weight = np.array(sample_weight[order], dtype=y.dtype)
sample_weight = _check_sample_weight(sample_weight, y, dtype=y.dtype)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

You need to apply the order parameter, maybe as,

    if sample_weight is None:
        sample_weight = np.ones(len(y), dtype=y.dtype)
    else:
        sample_weight = _check_sample_weight(sample_weight, y, dtype=y.dtype)
        sample_weight = sample_weight[order]

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't you think

sample_weight = _check_sample_weight(sample_weight, y, dtype=y.dtype)
sample_weight = sample_weight[order]

is enough ? It makes the code simpler and reordering ones has no effect and efficiency is not a problem there.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It makes the code simpler and reordering ones has no effect and efficiency is not a problem there.

If you prefer. Slicing an array with int indexing does have some cost, but it should indeed be minimal.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought the same :)

@jeremiedbb
Copy link
Member

It has been done in #16322. Sorry @marijavlajic for the confusion but someone did the same thing and his PR got merged first. We still need to be better in coordination :/

@jeremiedbb jeremiedbb closed this Jan 30, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Use _check_sample_weight to consistently validate sample_weight
4 participants