[MRG+1] split data using _safe_split in _permutaion_test_scorer to fix error… #5697
Thanks for fixing this! Could you also add this fix to the
Also, could you edit the PR title to - And the PR description to "Fixes #5696"
Yeah, a test would be great, using the mock data frame or an actual DataFrame if pandas is installed.
First of all, sorry for the late reply. I also added the fix to model_selection/_validation.py. Moreover, I added a
Is there additional work that needs to be done before this can be merged? I'm still encountering the issue that this addresses (#5696), which I'm working around by calling .values on my series. Is it just the merge conflict that's hairy? If so, I'm happy to take a crack.
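To illustrate why the `.values` workaround mentioned above helps, here is a small sketch, assuming pandas and NumPy are installed. The data and index labels are made up for illustration, and this may not be the exact failure reported in #5696. It shows that plain `[]` indexing on a Series with an integer index looks up labels, while cross-validation splits produce positions.

```python
import numpy as np
import pandas as pd

# Made-up target with an integer index that does not start at 0.
y_ser = pd.Series([10, 20, 30, 40], index=[100, 101, 102, 103])

# Positional indices, as a cross-validation splitter would produce them.
train = np.array([0, 2])

# Plain [] indexing is label-based here, and labels 0 and 2 do not exist.
try:
    y_ser[train]
    label_lookup_failed = False
except KeyError:
    label_lookup_failed = True

# Positional alternatives: .iloc on the Series, or the underlying NumPy array.
y_train_iloc = y_ser.iloc[train].to_numpy()
y_train_values = y_ser.values[train]

print(label_lookup_failed, y_train_iloc, y_train_values)
```

Calling `.values` hands the function a NumPy array, so every later `[]` lookup is positional. Splitting with a helper that uses positional indexing makes that conversion unnecessary.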
Wow, sorry, this one has been lying around for a bit. It needs a rebase and an entry to
This actually fixes another issue, namely using precomputed kernels / distance matrices with
check_df = lambda x: isinstance(x, InputFeatureType)
check_series = lambda x: isinstance(x, TargetType)
clf = CheckingClassifier(check_X=check_df, check_y=check_series)
permutation_test_score(clf, X_df, y_ser)
Ideally we should check that, for fixed random_state, results are identical regardless of input type.
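A rough sketch of the check suggested in this review comment, assuming scikit-learn, pandas, and NumPy are installed. The estimator, the column names, `cv=3`, and `n_permutations=20` are illustrative choices, not taken from the PR's test.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import permutation_test_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# The same data wrapped in pandas containers (column names are made up).
X_df = pd.DataFrame(X, columns=["f0", "f1", "f2", "f3"])
y_ser = pd.Series(y)


def run(X_in, y_in):
    """Run the permutation test with a fixed seed on the given inputs."""
    clf = KNeighborsClassifier(n_neighbors=5)
    return permutation_test_score(
        clf, X_in, y_in, cv=3, n_permutations=20, random_state=0
    )


score_np, perm_np, pvalue_np = run(X, y)
score_pd, perm_pd, pvalue_pd = run(X_df, y_ser)

# With the same random_state, both input types should give the same results.
same = bool(
    np.isclose(score_np, score_pd)
    and np.allclose(perm_np, perm_pd)
    and np.isclose(pvalue_np, pvalue_pd)
)
print("identical results:", same)
```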
At the moment, such input tests are not present for the other validation methods either; there are only test_cross_val_score_pandas() and test_cross_val_predict_pandas().
I wonder, is that really a necessity?
@equialgo Sorry for the extreme delay. Let us know if you have time soon to complete the changes.
Hi guys, I rebased the fix; it should be relatively easy to merge now (if it passes the checks). @amueller I do not really get why this also fixes the issue using precomputed kernels / distance matrices with
You've done something strange with your commit history. Perhaps a rebase is in order.
210d663 to f0bc2cd
@jnothman Oops, kinda messed up there! Redid the rebase; the commit history should be okay now.
…hed commits)
Squashed commits:
[94fd9f4] split data using _safe_split in _permutaion_test_scorer
[522053b] adding test case test_permutation_test_score_pandas() to check if permutation_test_score plays nice with pandas dataframe/series
[21b23ce] running test_permutation_test_score_pandas on iris data to prevent warnings.
[15a48bf] adding safe_indexing to _shuffle function
[9ea5c9e] adding test case test_permutation_test_score_pandas() to check if permutation_test_score plays nice with pandas dataframe/series
[3cf5e8f] split data using _safe_split in _permutaion_test_scorer to fix error when using Pandas DataFrame/Series
f0bc2cd to de14dfa
+1 to merge this guy (there is a flake8 failure, but it's because of lambda expressions, and here they seem legit). Let's try to merge this fast; it's been lying around for too long.
@raghavrv: are you +1 for merge? If so, let's merge.
I think this needs a whatsnew entry? I can send a quick PR after this gets merged if you want. Otherwise LGTM... That pep8 error needs to be added to ignore list... (#8131)
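For context on the flake8 failure discussed above: it is presumably pycodestyle's E731 ("do not assign a lambda expression, use a def"), though the thread does not name the code. A minimal sketch of the two ways to satisfy the linter follows. `InputFeatureType` and `TargetType` are hypothetical stand-ins here (`list` and `tuple`), not the types used in the PR's test.

```python
# Hypothetical stand-ins for the types used in the test under review.
InputFeatureType = list
TargetType = tuple

# Option 1: keep the lambda and silence the warning on that line.
check_df_lambda = lambda x: isinstance(x, InputFeatureType)  # noqa: E731


# Option 2: use a named function, which E731 does not flag.
def check_df(x):
    return isinstance(x, InputFeatureType)


def check_series(x):
    return isinstance(x, TargetType)


print(check_df([1, 2]), check_series((1, 2)), check_df_lambda((1, 2)))
```

The thread instead settles on keeping the lambdas and adding the error to the ignore list, which avoids touching the test code at all.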
Pending whatsnew... (Feel free to merge and later add a whatsnew...)
Okay... I'm merging this... I'll make a whatsnew entry for this and another PR tomorrow...
Thanks @equialgo!
Thanks for merging, you guys!
…arn#5697)
Squashed commits:
[94fd9f4] split data using _safe_split in _permutaion_test_scorer
[522053b] adding test case test_permutation_test_score_pandas() to check if permutation_test_score plays nice with pandas dataframe/series
[21b23ce] running test_permutation_test_score_pandas on iris data to prevent warnings.
[15a48bf] adding safe_indexing to _shuffle function
[9ea5c9e] adding test case test_permutation_test_score_pandas() to check if permutation_test_score plays nice with pandas dataframe/series
[3cf5e8f] split data using _safe_split in _permutaion_test_scorer to fix error when using Pandas DataFrame/Series
… when using Pandas DataFrame/Series
Related to issue #5696