[MRG+1] split data using _safe_split in _permutaion_test_scorer to fix error… #5697

Merged
merged 1 commit into from
Dec 29, 2016

Conversation

equialgo
Contributor

@equialgo equialgo commented Nov 2, 2015

… when using Pandas DataFrame/Series

Related to issue #5696

@raghavrv
Member

thanks for fixing this!

Could you also add this fix to the model_selection module and add a non-regression test (NRT) to model_selection/tests/test_validation.py? :)

@raghavrv
Member

Also, could you edit the PR title to:
[MRG] split data using _safe_split in _permutaion_test_scorer to fix error when using Pandas DataFrame/Series

And the PR description to "Fixes #5696"

@amueller amueller changed the title split data using _safe_split in _permutaion_test_scorer to fix error… [MRG] split data using _safe_split in _permutaion_test_scorer to fix error… Dec 10, 2015
@amueller
Member

Yeah, a test would be great, using the mock data frame or an actual DataFrame if pandas is installed.

@equialgo
Contributor Author

equialgo commented Jan 3, 2016

First of all, sorry for the late reply.

I added the fix also to model_selection/_validation.py.

Moreover, I added a test_permutation_test_score_pandas() test case to model_selection/tests/test_validation.py (based on the test_cross_val_score_pandas() test case).

@dankessler

Is there additional work that needs to be done before this can be merged? I'm still encountering the issue that this addresses (#5696), which I'm working around by calling .values on my series. Is it just the merge conflict that's hairy? If so, I'm happy to take a crack.
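
The `.values` workaround mentioned above can be sketched as follows. This is an illustration only: the toy `X_df`, `y_ser`, and `train` names are made up here, and it assumes (the thread does not spell this out) that the failure comes from indexing a pandas object with an array of integer positions.

```python
import numpy as np
import pandas as pd

# Toy data, invented for this sketch.
X_df = pd.DataFrame({"a": [1.0, 2.0, 3.0, 4.0], "b": [5.0, 6.0, 7.0, 8.0]})
y_ser = pd.Series([0, 1, 0, 1])
train = np.array([0, 2])  # positional row indices, as a CV splitter yields

# Plain [] on a DataFrame looks up *columns*, so integer positions
# are treated as column labels and the lookup fails.
try:
    X_df[train]
    frame_indexing_failed = False
except KeyError:
    frame_indexing_failed = True

# Workaround: drop down to the underlying NumPy arrays first.
X_train = X_df.values[train]
y_train = y_ser.values[train]

# Positional indexing that keeps the pandas type.
X_train_iloc = X_df.iloc[train]
```

Converting with `.values` loses the index and column names, which is usually acceptable for cross-validation but is the reason a fix inside the library is preferable.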

@amueller amueller added the Bug label Dec 14, 2016
@amueller
Member

Wow sorry this one has been lying around for a bit. It needs a rebase and an entry to whatsnew.rst - and reviews, though that should be reasonably quick.

@amueller
Member

This actually fixes another issue, namely using precomputed kernels / distance matrices with permutation_score. It would be great if there was also a regression test for that.
Otherwise looks good to me. Sorry for the slow turn-around.
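
One way to read the remark about precomputed kernels: a square kernel matrix has to be sliced along both rows and columns, which plain row indexing does not do. The sketch below illustrates that idea with a hypothetical helper, `split_precomputed`; it is not scikit-learn's `_safe_split`, only a simplified stand-in written for this example.

```python
import numpy as np


def split_precomputed(K, indices, train_indices=None):
    """Slice a square kernel matrix for a train or test fold (hypothetical helper).

    Rows are the samples being scored; columns are always the training
    samples, because that is what a fitted kernel estimator expects.
    """
    K = np.asarray(K)
    if K.ndim != 2 or K.shape[0] != K.shape[1]:
        raise ValueError("a precomputed kernel must be a square matrix")
    cols = indices if train_indices is None else train_indices
    return K[np.ix_(indices, cols)]


rng = np.random.RandomState(0)
X = rng.rand(6, 3)
K = X @ X.T  # linear kernel on toy data

train = np.array([0, 1, 2, 3])
test = np.array([4, 5])

K_train = split_precomputed(K, train)       # shape (4, 4)
K_test = split_precomputed(K, test, train)  # shape (2, 4)
naive = K[train]                            # shape (4, 6): columns not restricted
```

With naive row indexing, the training block keeps columns for samples outside the fold, so the matrix is no longer square and an estimator expecting a precomputed kernel cannot use it.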

check_df = lambda x: isinstance(x, InputFeatureType)
check_series = lambda x: isinstance(x, TargetType)
clf = CheckingClassifier(check_X=check_df, check_y=check_series)
permutation_test_score(clf, X_df, y_ser)
Member

Ideally we should check that, for fixed random_state, results are identical regardless of input type.
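
A check along these lines might look like the sketch below. It is not the test that was merged; the estimator, the `cv` and `n_permutations` values, and the `run` helper are choices made for this example, and it assumes a scikit-learn version that already contains the fix.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import permutation_test_score

iris = load_iris()
X, y = iris.data, iris.target

# Same data wrapped in pandas containers.
X_df = pd.DataFrame(X, columns=iris.feature_names)
y_ser = pd.Series(y)


def run(X_in, y_in):
    """Run the permutation test with a fixed seed (helper for this sketch)."""
    clf = LogisticRegression(max_iter=1000)
    return permutation_test_score(
        clf, X_in, y_in, cv=3, n_permutations=10, random_state=0
    )


score_np, perm_np, p_np = run(X, y)
score_pd, perm_pd, p_pd = run(X_df, y_ser)

# With the same seed, the input container should not change the outcome.
same = (
    np.isclose(score_np, score_pd)
    and np.allclose(perm_np, perm_pd)
    and np.isclose(p_np, p_pd)
)
```

Comparing with a tolerance rather than exact equality leaves room for small floating-point differences between the two code paths.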

Contributor Author

At the moment, such input tests are not present for the other validation methods either; there are only test_cross_val_score_pandas() and test_cross_val_predict_pandas().

I wonder whether that is really necessary.

@jnothman
Member

@equialgo Sorry for the extreme delay. Let us know if you have time soon to complete the changes.

@equialgo
Contributor Author

equialgo commented Dec 24, 2016

Hi guys, I rebased the fix; it should be relatively easy to merge now (if it passes the checks).

@amueller I do not really get why this also fixes the issue using precomputed kernels / distance matrices with permutation_score. If it is trivial please elaborate, otherwise it might be better to add a new ticket for that one.

@jnothman
Member

You've done something strange with your commit history. Perhaps a rebase is in order.

@equialgo
Contributor Author

@jnothman oops, kinda messed-up there! Re-did the rebase; commit history should be okay now.

…hed commits)

Squashed commits:
[94fd9f4] split data using _safe_split in _permutaion_test_scorer
[522053b] adding test case test_permutation_test_score_pandas() to check if permutation_test_score plays nice with pandas dataframe/series
[21b23ce] running test_permutation_test_score_pandas on iris data to prevent warnings.
[15a48bf] adding safe_indexing to _shuffle function
[9ea5c9e] adding test case test_permutation_test_score_pandas() to check if permutation_test_score plays nice with pandas dataframe/series
[3cf5e8f] split  data using _safe_split in _permutaion_test_scorer to fix error when using Pandas DataFrame/Series
@GaelVaroquaux GaelVaroquaux changed the title [MRG] split data using _safe_split in _permutaion_test_scorer to fix error… [MRG+1] split data using _safe_split in _permutaion_test_scorer to fix error… Dec 28, 2016
@GaelVaroquaux
Member

+1 to merge this guy (there is a flake8 failure, but it's because of lambda expressions, and here they seem legit).

Let's try to merge this fast, it's been lying around for too long.
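
The thread does not name the flake8 code involved. Assuming it is pycodestyle's E731 ("do not assign a lambda expression, use a def"), the two forms compare as in the sketch below; `ExpectedType` is a stand-in for the pandas types used in the real test.

```python
# Stand-in for the DataFrame type used in the test; any class works here.
ExpectedType = dict

# Style checkers typically flag this form: a lambda bound to a name.
check_lambda = lambda x: isinstance(x, ExpectedType)  # noqa: E731


# Equivalent named function, which avoids the warning.
def check_def(x):
    return isinstance(x, ExpectedType)
```

Both callables behave identically, so the choice is purely stylistic; in a short test, the lambda form keeps the check next to the place where it is used.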

@GaelVaroquaux
Member

@raghavrv : are you +1 for merge? If so, let's merge

@raghavrv
Member

raghavrv commented Dec 28, 2016

I think this needs a whatsnew entry? I can send a quick PR after this gets merged if you want. Otherwise LGTM... That pep8 error needs to be added to ignore list... (#8131)

Member

@raghavrv raghavrv left a comment

Pending whatsnew... (Feel free to merge and later add a whatsnew...)

@raghavrv raghavrv merged commit 986a49b into scikit-learn:master Dec 29, 2016
@raghavrv
Member

raghavrv commented Dec 29, 2016

Okay... I'm merging this... I'll make a whatsnew entry for this and another PR tomorrow...

@raghavrv
Member

Thanks @equialgo!

@equialgo
Contributor Author

Thanks for merging, guys!

raghavrv pushed a commit to raghavrv/scikit-learn that referenced this pull request Jan 5, 2017
sergeyf pushed a commit to sergeyf/scikit-learn that referenced this pull request Feb 28, 2017
@Przemo10 Przemo10 mentioned this pull request Mar 17, 2017
Sundrique pushed a commit to Sundrique/scikit-learn that referenced this pull request Jun 14, 2017
NelleV pushed a commit to NelleV/scikit-learn that referenced this pull request Aug 11, 2017
paulha pushed a commit to paulha/scikit-learn that referenced this pull request Aug 19, 2017
maskani-moh pushed a commit to maskani-moh/scikit-learn that referenced this pull request Nov 15, 2017