FEA return final cross-validation score in SequentialFeatureSelector #31483

Open
wants to merge 15 commits into
base: main
from

Conversation

@cboseak cboseak commented Jun 4, 2025

Reference Issues/PRs

What does this implement/fix? Explain your changes.

  • Added an attribute (e.g., final_cv_score_) that stores the mean cross-validation score of the final model with the selected features. This avoids having to run another cross-validation externally to get the final performance score.
    • Currently, SequentialFeatureSelector performs cross-validation internally to decide which features to select, based on the scoring function. However, the SFS object does not return the final cross-validation score (e.g., recall).
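
For context, the external workaround that the description refers to can be sketched as follows. This is a minimal illustration, assuming the iris dataset, a LogisticRegression estimator, and the default scoring; none of these choices come from the PR itself.

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Assumed example data and estimator (not taken from the PR).
X, y = load_iris(return_X_y=True)
estimator = LogisticRegression(max_iter=1000)

# SFS cross-validates internally to choose features but does not keep the scores.
sfs = SequentialFeatureSelector(estimator, n_features_to_select=2, cv=5)
sfs.fit(X, y)

# Workaround: run a second, external cross-validation on the selected columns.
X_selected = sfs.transform(X)
scores = cross_val_score(estimator, X_selected, y, cv=5)
final_cv_score = scores.mean()

print("selected mask:", sfs.get_support())
print("mean CV score on selected features:", final_cv_score)
```

The second cross-validation repeats work that the selector has already done for the final feature subset, which is the cost the PR aims to hide from the user.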

github-actions bot commented Jun 4, 2025

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: 1944c07. Link to the linter CI: here

@betatim betatim changed the title per issue 31473, return final cross-validation score in SequentialFea… FEA return final cross-validation score in SequentialFea… Jun 6, 2025
Contributor

@OmarManzoor OmarManzoor left a comment

Thanks for the PR @cboseak

Contributor

@OmarManzoor OmarManzoor left a comment

LGTM. Thanks @cboseak

@OmarManzoor OmarManzoor added the "Waiting for Second Reviewer" label (First reviewer is done, need a second one!) Jun 11, 2025
@adrinjalali adrinjalali removed the "Waiting for Second Reviewer" label (First reviewer is done, need a second one!) Jun 12, 2025
@adrinjalali adrinjalali changed the title FEA return final cross-validation score in SequentialFea… FEA return final cross-validation score in SequentialFeatureSelector Jun 12, 2025
@cboseak
Author

cboseak commented Jun 12, 2025

See latest changes to address your comments

Member

@adrinjalali adrinjalali left a comment

I'd need more opinions on this to see if we'd like to include it.

cc @scikit-learn/core-devs

Member

@adrinjalali adrinjalali left a comment

I don't mind the implementation as is, but I do wonder about its use cases and whether it's useful to enough users.

Tagging for a second opinion: @OmarManzoor @adam2392

        scores : ndarray of shape (n_splits,)
            Array of cross-validation scores for each split.
        """
        _raise_for_params(params, self, "get_final_cv_score")
Member

please have a test for this.

@OmarManzoor
Contributor

When I approved this, I was considering it as an added attribute. Since that increases the fit time, though, I am not so sure about having a separate function that still needs to be called separately. Wouldn't that kind of be similar to just calling the code within the function? I guess if it adds some convenience for users, we can add it.

@adrinjalali
Member

> Wouldn't that kind of be similar to just calling the code within the function?

Not sure which function you mean.

@OmarManzoor
Contributor

> Not sure which function you mean.

get_final_cv_score, the one that is added in this PR.
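
To make the point concrete, here is a rough sketch of what such a convenience wrapper amounts to. The free function `final_cv_scores` is a hypothetical stand-in written for this illustration; it is not the PR's actual implementation and is not part of scikit-learn. The dataset and estimator are likewise assumptions.

```python
import numpy as np
from sklearn.base import is_classifier
from sklearn.datasets import load_iris
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import check_cv, cross_val_score


def final_cv_scores(selector, X, y):
    """Hypothetical helper: per-split CV scores on the selected features."""
    # Resolve the splitter from the selector's own cv setting.
    cv = check_cv(selector.cv, y, classifier=is_classifier(selector.estimator))
    return cross_val_score(
        selector.estimator,
        selector.transform(X),
        y,
        cv=cv,
        scoring=selector.scoring,
        n_jobs=selector.n_jobs,
    )


X, y = load_iris(return_X_y=True)
sfs = SequentialFeatureSelector(
    LogisticRegression(max_iter=1000), n_features_to_select=2, cv=5
).fit(X, y)

by_helper = final_cv_scores(sfs, X, y)
# The same computation written out by hand, without the helper.
by_hand = cross_val_score(sfs.estimator, sfs.transform(X), y, cv=5)

print(np.allclose(by_helper, by_hand))
```

Under these assumptions the helper only saves the caller from repeating the `cv`, `scoring`, and `n_jobs` settings; the cross-validation cost is unchanged.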

Member

@adam2392 adam2392 left a comment

IIUC, this is purely a convenience function, right?

The computation time to get the answer that you'd want is the same with or without the function.

In that case, my main criterion would be looking at whether this makes the API more usable. Is this function name also present in other feature selectors? If so, let's add it imo. If not, shouldn't we consolidate?

@@ -193,6 +193,21 @@ def __init__(
         self.cv = cv
         self.n_jobs = n_jobs

     def _get_cv(self, y):
Member

I don't see why this function is needed. Perhaps I'm missing something?

Author

It was a suggestion in one of the comments. Basically, we had the same code in two places (cv = check_cv(self.cv, y, classifier=is_classifier(self.estimator))), so it was moved into a function to clean it up.
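
The duplicated expression can be illustrated in isolation. The sketch below uses a hypothetical free function, `resolve_cv`, in place of the `_get_cv` method from the diff, and toy targets chosen for this example; it only shows what `check_cv` returns for a classifier versus a regressor.

```python
import numpy as np
from sklearn.base import is_classifier
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.model_selection import KFold, StratifiedKFold, check_cv


def resolve_cv(cv, estimator, y):
    """Hypothetical free-function version of the `_get_cv` helper."""
    return check_cv(cv, y, classifier=is_classifier(estimator))


# Toy targets (assumed for illustration): binary labels and continuous values.
y_class = np.array([0, 1] * 10)
y_reg = np.linspace(0.0, 1.0, 20)

clf_cv = resolve_cv(5, LogisticRegression(), y_class)
reg_cv = resolve_cv(5, LinearRegression(), y_reg)

# An integer cv becomes a stratified splitter for classifiers with
# binary or multiclass targets, and a plain K-fold splitter otherwise.
print(type(clf_cv).__name__, type(reg_cv).__name__)
```

Keeping this call in one place means both the feature-selection loop and any later scoring step resolve the splitter the same way.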

Development

Successfully merging this pull request may close these issues.

Add option to return final cross-validation score in SequentialFeatureSelector
4 participants