Thanks to visit codestin.com
Credit goes to github.com

Skip to content

Conversation

@yuanx749
Copy link
Contributor

@yuanx749 yuanx749 commented May 1, 2024

Reference Issues/PRs

Fixes #28898

What does this implement/fix? Explain your changes.

HistGradientBoosting uses ColumnTransformer to preprocess the input X, which places the categorical features at the beginning of X. This PR adjusts the feature order in monotonic_cst correspondingly, to avoid the error when calling TreeGrower.

Any other comments?

@github-actions
Copy link

github-actions bot commented May 1, 2024

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: ec90944. Link to the linter CI: here

@ogrisel
Copy link
Member

ogrisel commented May 2, 2024

Thanks for the PR @yuanx749, the fix and the test look good. Can you please add an entry to the change log (doc/whats_new/1.5.rst)?

@yuanx749
Copy link
Contributor Author

yuanx749 commented May 2, 2024

Thank you for reviewing @ogrisel. I added the entry to whats_new.

@ogrisel
Copy link
Member

ogrisel commented May 2, 2024

Actually, I think it would be worth making the test stronger by expanding test_predictions to insert some extra columns f_cat_a = rng.randint(low=0, high=9, size=n_samples) (and similarly f_b and f_c, before, between and after f_0 and f_1 and no monotonicity constraint on those columns (and no impact on y either).

This way we could better check the correctness of the monotonicity constraint remapping.

@ogrisel
Copy link
Member

ogrisel commented May 2, 2024

/cc @thomasjpfan.

@ogrisel ogrisel added this to the 1.5 milestone May 2, 2024
@yuanx749
Copy link
Contributor Author

yuanx749 commented May 3, 2024

Actually, I think it would be worth making the test stronger by expanding test_predictions to insert some extra columns f_cat_a = rng.randint(low=0, high=9, size=n_samples) (and similarly f_b and f_c, before, between and after f_0 and f_1 and no monotonicity constraint on those columns (and no impact on y either).

This way we could better check the correctness of the monotonicity constraint remapping.

Good point. I updated test_predictions, and all tests pass locally.

Then I think the additional non-regression test I added before can be removed.

Copy link
Member

@thomasjpfan thomasjpfan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR @yuanx749 ! The fix and additional tests looks good to me.

Copy link
Member

@ogrisel ogrisel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's delete the redundant test but otherwise LGTM! Thanks for the fix.

@ogrisel ogrisel enabled auto-merge (squash) May 3, 2024 14:40
@ogrisel ogrisel merged commit 7bc1dbb into scikit-learn:main May 3, 2024
@yuanx749 yuanx749 deleted the hist-mono branch May 5, 2024 06:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

HistGradientBoostingClassifier raise error with monotonic constraints and categorical features

3 participants