-
-
Notifications
You must be signed in to change notification settings - Fork 26k
FEA return final cross-validation score in SequentialFeatureSelector #31483
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR @cboseak
doc/whats_new/upcoming_changes/sklearn.feature_selection/31483.feature.rst
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks @cboseak
See latest changes to address your comments |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd need more opinions on this to see if we'd like to include it.
cc @scikit-learn/core-devs
doc/whats_new/upcoming_changes/sklearn.feature_selection/31483.feature.rst
Outdated
Show resolved
Hide resolved
….feature.rst update based on PR suggestions Co-authored-by: Adrin Jalali <[email protected]>
…_cv_score` to return raw values instead of mean
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't mind the implementation as is, but I do wonder its usecases and whether it's useful to enough users.
Tagging for a second opinion: @OmarManzoor @adam2392
scores : ndarray of shape (n_splits,) | ||
Array of cross-validation scores for each split. | ||
""" | ||
_raise_for_params(params, self, "get_final_cv_score") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please have a test for this.
When I approved this I considered it being added as an attribute but since that increases the fit time I am not so sure about having a separate function that will still need to be called separately. Wouldn't that kind of be similar to just calling the code within the function? I guess if it adds some convenience to users we can add it. |
Not sure which function you mean. |
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIUC, this is purely a convenience function right?
The computation time to get the answer that you'd want is the same with or without the function.
In that case, my main criterion would be looking at whether this makes the API more usable. Is this function name also present in other feature selectors? If so, let's add it imo. If not, shouldn't we consolidate?
@@ -193,6 +193,21 @@ def __init__( | |||
self.cv = cv | |||
self.n_jobs = n_jobs | |||
|
|||
def _get_cv(self, y): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see why this function is needed. Perhaps I'm missing something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was a suggestion in one of the comments but basically we had duplicate code in 2 places (cv = check_cv(self.cv, y, classifier=is_classifier(self.estimator))
) so it was moved into a function to clean it up
Reference Issues/PRs
What does this implement/fix? Explain your changes.