@rcap107 rcap107 commented Dec 10, 2025

Bug Fix Pull Request

This PR addresses a bug that may occur when using plot_parallel_coord if the cross-validation results contain invalid values (for example, NaN).

Description

The PR addresses the problem by adding a small eps and removing nan values from what is used in _add_jitter.
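As a rough illustration of the idea described above (dropping NaNs and padding the range with a small eps before jittering), here is a minimal sketch. The function name `jitter_bounds_sketch`, the default `eps`, and the all-NaN fallback are assumptions made for this example; this is not the code in skrub's `_add_jitter`.

```python
import numpy as np


def jitter_bounds_sketch(values, eps=1e-9):
    """Return finite (low, high) bounds suitable for uniform sampling.

    Hypothetical helper for illustration only; not part of skrub.
    """
    arr = np.asarray(values, dtype=float)
    # Keep only finite entries so NaN and inf cannot leak into the bounds.
    finite = arr[np.isfinite(arr)]
    if finite.size == 0:
        # Nothing usable: fall back to an arbitrary, tiny finite range.
        return 0.0, eps
    # The eps keeps the range non-degenerate when all values are equal.
    return float(finite.min()), float(finite.max()) + eps
```

With this sketch, a column such as `[1.0, nan, 3.0]` yields bounds just around `[1.0, 3.0]`, and an all-NaN column still yields a finite range.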

How Has This Been Tested?

A new test has been added to test_parallel_coord.py to check the values that are passed to _add_jitter.

Steps to Reproduce (original bug)

The bug first appeared in the CI testing the optuna example:

https://app.circleci.com/pipelines/github/skrub-data/skrub/7618/workflows/eef20e21-776e-42d2-b24a-3031c0c05614/jobs/14823


@rcap107 rcap107 left a comment

I am not sure the solution I am proposing is correct.

I was not able to reproduce the problem that triggered the CI failure in the first place. np.random.uniform should handle the case in which min_value == max_value, but it fails if either is np.nan or np.inf.
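The behaviour described here can be checked with a short script. This is a sketch that assumes the sampling goes through NumPy's `Generator.uniform`; the helper name `try_uniform` is made up for the example, and the exact exception type may depend on the NumPy version (recent versions raise `OverflowError` when `high - low` is not finite, as far as I know).

```python
import numpy as np

rng = np.random.default_rng(0)


def try_uniform(low, high):
    """Draw one sample, or return the name of the exception that was raised."""
    try:
        return float(rng.uniform(low, high))
    except (OverflowError, ValueError) as exc:
        return type(exc).__name__


# Degenerate but finite range: the only possible sample is the bound itself.
print(try_uniform(1.0, 1.0))
# Non-finite bounds: the range cannot be computed, so sampling fails.
print(try_uniform(np.nan, 1.0))
print(try_uniform(0.0, np.inf))
```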

However, the presence of np.nan should have been handled in _prepare_numeric_column, which replaces NaNs with a placeholder value.

I don't know if this bug is related to the fact that it was the optuna example.

@rcap107 added the bug label Dec 10, 2025

rcap107 commented Dec 10, 2025

It seems the problem occurs when the column has an integer dtype and all of its values are np.nan.

@rcap107 rcap107 marked this pull request as ready for review December 10, 2025 14:40

rcap107 commented Dec 10, 2025

In the end, the problem was in the function _prepare_numeric_column. In some edge cases NaNs were not handled correctly, and now they are.
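To make the kind of edge case concrete, here is a minimal sketch of replacing NaNs with a placeholder while also covering the all-NaN column. The function `fill_missing_sketch`, its `offset` parameter, and the choice of placeholder are assumptions for illustration; the actual logic in skrub's `_prepare_numeric_column` is not shown in this conversation and may differ.

```python
import numpy as np


def fill_missing_sketch(values, offset=1.0):
    """Replace NaNs with a finite placeholder below the observed minimum.

    Hypothetical helper for illustration only; not skrub's implementation.
    """
    # Casting to float first lets integer-like inputs carry NaN.
    arr = np.asarray(values, dtype=float)
    missing = np.isnan(arr)
    if missing.all():
        # Edge case: there is no observed value to anchor the placeholder on,
        # so use an arbitrary finite constant instead of min() over nothing.
        placeholder = 0.0
    else:
        placeholder = arr[~missing].min() - offset
    return np.where(missing, placeholder, arr), missing
```

Returning the mask alongside the filled array lets a caller still tell placeholder values apart from real ones.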

@jeromedockes jeromedockes left a comment

great! thanks for the fix @rcap107

@rcap107 rcap107 changed the title from "WIP - fixing a bug in _add_jitter" to "BUG - Fixing a few edge cases in the generation of the parallel coordinate plot" Dec 11, 2025
@rcap107 rcap107 merged commit c0c926a into skrub-data:main Dec 11, 2025
30 checks passed
@rcap107 rcap107 deleted the fix-jitter-parallel-coord-plot branch December 11, 2025 08:58