[MRG] Refactor CV and grid search#2736
Conversation
|
Thank you! I've been trying to get to another attempt at this. I can't review right away, but I'll get there soon. |
|
Are there other related pull requests like #2079? |
Its predecessor, #1787, also did this refactoring. They also hoped to make a consistent extensible output from I also consider this related to some of the CV API consistency stuff @eshilts is propsing in #2733. |
|
Now, |
There was a problem hiding this comment.
You appear to have dropped the verbose output here. Why not report time and score in _cross_val_score?
|
I think this is looking very good, so I'll just do some nitpicking :P |
There was a problem hiding this comment.
I know this is how I named it, but I think others will prefer camel-case: _PassthroughScorer
There was a problem hiding this comment.
I thought that would be a convention. :) I will change that.
|
Apart from the previous comments on error messages checks in the tests and private helpers usage and naming, this PR LGTM. |
There was a problem hiding this comment.
Maybe this function needs renaming. I called it originally "_cross_val_score" because it was the inner loop of the cross_val_score function, but now that it is used in a different setting, I find that the name is confusing, as it doesn't really do a cross-validation. Maybe something like "_fit_and_score"?
PS: Sorry for nitpicking about the names, but they are important as they control the readability of a codebase.
There was a problem hiding this comment.
As I said before: I like the nitpicking. It is especially helpful to find meaningful names and naming is the most important thing to create readable software. I think you do a great job. :)
I renamed it to fit_and_score.
There was a problem hiding this comment.
@GaelVaroquaux, do you think fit_and_score should be public? Personally, I don't think either it or fit_grid_point should be public (and while not _-prefixed, fit_grid_point isn't in classes.rst, nor is it any longer the right name for that function). It makes it much harder to add functionality to grid search etc. since every modification to its output is an API change.
There was a problem hiding this comment.
Also, I don't understand what the objection is to calling this cross-validation... I think that's what fitting on one chunk of data and scoring on another is.
|
I will work on addressing the final comments and merging. |
|
I merged by rebase. I haven't made fit_grid_point private yet. It has been there for a long time so I would prefer to wait for other opinions before changing anything. If we decide to make it private, we can keep the public one for two releases. |
|
Hurrah! Thank you @AlexanderFabisch for some long-needed maintenance, and @mblondel for pushing it through. I look forward to your and other extensions to this to get more returned from |
|
Indeed, thanks for your recent contributions, @AlexanderFabisch. |
|
I removed the I think this makes the code better. |
There was a problem hiding this comment.
I messed this up. The order of test_score and train_score is wrong in the docstring.
|
Oh, I did not expect the PR to get merged so soon after all the difficulties. Unfortunately I found the bug in the documentation too late. @mblondel could you fix this? Anyway, thanks for the compliments. :) |
|
I just realized that |
|
+1 to deprecate in case it's in private use On 16 January 2014 19:20, Mathieu Blondel [email protected] wrote:
|
|
Please deprecate it as this method had a public name (no leading underscore). |
While implementing #2701 I have seen some duplicate code in
grid_searchandcross_validation. Before I will implement the rest of issue #2584 I would like to clean that up a little bit.Todo:
cross_validation.cross_val_scoreandgrid_search.fit_grid_point