[MRG] Better format for storing the search results, to support multiple metric. (grid_scores and _CVScoreTuple) #6
8f7bea9 to a2c3922
The reason we've not done this before was at least in part because there was disagreement on the form of |
I agree with @jnothman, the format might have to change to accommodate multiple metrics. As a sidenote, I think the current |
This less so than training scores or timing information that have been previously proposed.
Both. I think it should be a dict in a form that calling |
(Sorry if this is lame) We chose a named tuple instead of a dict to make it memory efficient, correct?
That's my understanding, based on the very good comment in the source. I wonder how relevant the memory concern is. Was there an issue prompting it?
I think I might have introduced it to give it more of a fixed structure. I don't think it was a good idea and I don't think there was a particular concern.
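To make the "fixed structure" trade-off concrete, here is a minimal sketch, not the project's actual code. `ScoreTuple`, `ScoreTupleWithF1`, and all field names are hypothetical stand-ins chosen for illustration. It shows how positional unpacking of a namedtuple stops working once a field for a second metric is added, while lookups by name keep working.

```python
from collections import namedtuple

# Hypothetical record type; the field names are illustrative assumptions.
ScoreTuple = namedtuple("ScoreTuple", ["parameters", "mean_score", "fold_scores"])

record = ScoreTuple({"C": 1.0}, 0.9, [0.88, 0.92])
# Positional unpacking works only while there are exactly three fields.
params, mean_score, fold_scores = record

# Adding a field for a second (hypothetical) metric changes the tuple length.
ScoreTupleWithF1 = namedtuple("ScoreTupleWithF1", ScoreTuple._fields + ("mean_f1",))
extended = ScoreTupleWithF1({"C": 1.0}, 0.9, [0.88, 0.92], 0.85)

try:
    params, mean_score, fold_scores = extended
    unpack_ok = True
except ValueError:
    # Too many values to unpack: code written against the old shape breaks.
    unpack_ok = False

# Access by name is unaffected by the extra field.
as_dict = extended._asdict()
print(unpack_ok, as_dict["mean_score"])
```

Code that reads fields by name survives the extra metric. Code that unpacks by position does not.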
Ok so now we are naming the |
Maybe the |
Where was I don't think we want |
That should work as well making the proposed |
+1 for having a row per fold with a column for every parameter, and a column for every metric whenever multiple-metric support is added.
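One way to picture the layout suggested above is a dict of equal-length columns, with one row per (parameter setting, fold) pair. This is only a sketch under stated assumptions: the column names (`fold`, `param_*`, `score_accuracy`), the grid, and `placeholder_score` are all hypothetical, and the scores are dummy values rather than real results.

```python
from itertools import product

# Hypothetical parameter grid and fold count, for illustration only.
param_grid = {"C": [0.1, 1.0], "kernel": ["linear", "rbf"]}
n_folds = 3


def placeholder_score(params, fold):
    # Dummy value standing in for a real cross-validation score.
    return 0.0


names = sorted(param_grid)

# One list per column; every list ends up with the same length.
results = {"fold": [], "score_accuracy": []}
for name in names:
    results["param_" + name] = []

for values in product(*(param_grid[name] for name in names)):
    params = dict(zip(names, values))
    for fold in range(n_folds):
        results["fold"].append(fold)
        for name in names:
            results["param_" + name].append(params[name])
        results["score_accuracy"].append(placeholder_score(params, fold))

n_rows = len(results["fold"])
print(n_rows, sorted(results))
```

Supporting another metric would mean adding one more key (for example a hypothetical `score_f1`) without touching the existing columns. A dict of equal-length columns like this can also be passed to `pandas.DataFrame` when a tabular view is wanted.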
@amueller wrote:
A dict of arrays, or a list of dicts? I supposed the namedtuple was introduced because the incumbent tuple (wasn't it) was not self-documenting. But namedtuples persist in some of the inflexibility of tuples, particularly with unpacking iteration ( |
Two problems here:
I also think it's very strange that we're not having this discussion in a scikit-learn space, but in @rvraghav93's. But please also see scikit-learn#1787 where This Discussion Was Had Before. (Struct arrays and dataframes both have their advantages, but I agree with @amueller that we are best off giving users a more familiar and universal structure.)
I am sorry, I started this as a trivial PR which renames |
@jnothman one benefit of starting the discussion here is that I saw it because it wasn't caught by my scikit-learn filter ;) I think this is a very important discussion.
Hahaha so now we know how to get your attention!
I've raised an issue at scikit-learn#6686 that references all the relevant issues/PRs and notes the important conclusions and my proposed solution. Kindly take a look!
@MechCoder @amueller @vene @jnothman
I'll do this (multiple metric support) in incremental steps. Will merge the trivial PRs as soon as I get a +1.
This is a very trivial PR. Please take a look.