
Conversation

@GaetandeCast GaetandeCast (Contributor) commented Sep 22, 2025

Reference Issues/PRs

Fixes #32201

What does this implement/fix? Explain your changes.

To be used in RFE and RFECV, permutation_importance needs to know which features the procedure has already eliminated, so that it can restrict its test dataset accordingly.
This PR adds a feature_indices parameter to sklearn.feature_selection._base._get_feature_importances, which is passed on to the importance_getter so that it knows which features to compute importances for.
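A rough sketch of the usage pattern this enables (the closure over a held-out validation set and all variable names are illustrative; the updated RFECV doc example is the authoritative version):

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFECV
from sklearn.inspection import permutation_importance
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1_000, n_features=20, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)

def importance_getter(estimator, feature_indices):
    # feature_indices (the new argument from this PR) holds the indices, in
    # the full dataset, of the features RFECV has not eliminated yet; use it
    # to align the validation set with the fitted estimator.
    result = permutation_importance(
        estimator, X_val[:, feature_indices], y_val, random_state=0
    )
    return result.importances_mean

rfecv = RFECV(LogisticRegression(max_iter=1_000), importance_getter=importance_getter)
rfecv.fit(X_train, y_train)
```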

Any other comments?

The new feature is added to the test suite and illustrated in the RFECV doc example.

@glemaitre @ogrisel

github-actions bot commented Sep 22, 2025

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: 3854954.

@GaetandeCast GaetandeCast changed the title from "base logic and test" to "FEAT allow RFE(CV) to be used with permutation_importance" Sep 22, 2025
@ogrisel ogrisel (Member) left a comment


Thanks for the PR. This is very useful. Please indeed update the example and add a changelog entry.

EDIT: here are the instructions for the changelog entry: https://github.com/scikit-learn/scikit-learn/blob/main/doc/whats_new/upcoming_changes/README.md

Comment on lines 7 to 8
This allows methods like :func:`permutation_importance` to extract the relevant features
from its test set.

Suggested change
This allows methods like :func:`permutation_importance` to extract the relevant features
from its test set.
This allows methods like :func:`permutation_importance` and similar tools to
iteratively extract the previously selected features from a test set.

@GaetandeCast (Contributor, Author):

Yeah, I realized it was not very clear, so I planned to change it to:
"This allows methods that need a test set, like :func:`permutation_importance`, to know which
features to use in their predictions."

@GaetandeCast GaetandeCast marked this pull request as ready for review September 25, 2025 14:13
@ogrisel ogrisel (Member) left a comment


LGTM besides the following:

@ogrisel ogrisel (Member) left a comment


I found a small problem in the example + small suggestions for further improvement in the docstrings.

Also @ArturoAmorQ might be interested in reviewing this PR (the updated example in particular).

If `callable`, overrides the default feature importance getter.
The callable is passed with the fitted estimator and it should
return importance for each feature.
return importance for each feature. When it accepts it, the callable is passed

Suggested change
return importance for each feature. When it accepts it, the callable is passed
return importance for each feature. When it accepts it, the callable is passed

return importance for each feature.
return importance for each feature. When it accepts it, the callable is passed
`feature_indices` which stores the index of the features in the full dataset
that have not been eliminated yet.

Suggested change
that have not been eliminated yet.
that have not yet been eliminated in previous iterations.


shown at the end of
:ref:`sphx_glr_auto_examples_feature_selection_plot_rfe_with_cross_validation.py`.
.. versionadded:: 0.24

I think we should add a .. versionchanged:: 1.8 and mention the support for passing feature_indices to the callable when it is part of its signature.
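For instance, the directive could read as follows (the wording is illustrative; only the version number comes from this comment):

```
.. versionchanged:: 1.8
    If `importance_getter` is a callable that accepts a `feature_indices`
    argument, the indices of the features that have not yet been eliminated
    are passed to it.
```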


And similarly for the docstring of the other class.

n_classes=8,
n_clusters_per_class=1,
class_sep=0.8,
random_state=0,

We need to make sure that we do not sample from the training set:

Suggested change
random_state=0,
random_state=1, # Use a different seed to sample different points.


The centroid positions of make_classification depend on the random_state generator:

centroids = _generate_hypercube(n_clusters, n_informative, generator).astype(

i.e. changing to random_state=1 would not sample from the same distribution. Let's instead do a train_test_split right after the first make_classification (with n_samples=1_000).

On a side note, X_test/y_test feel awkward for computing something (the permutation importances) that is later used during fit. Should we call them X_val/y_val instead (similar to the notation of early stopping in HGBT)?
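A sketch of that suggestion (n_informative=8 and the split parameters are assumptions on top of the keyword arguments quoted above):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Draw a single dataset and split it, instead of calling
# make_classification twice with different seeds.
X, y = make_classification(
    n_samples=1_000,
    n_informative=8,  # assumption: 8 classes need at least log2(8) informative features
    n_classes=8,
    n_clusters_per_class=1,
    class_sep=0.8,
    random_state=0,
)
# X_val/y_val rather than X_test/y_test, per the naming suggestion above.
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)
```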

@ArturoAmorQ ArturoAmorQ (Member) left a comment


Thanks for the PR @GaetandeCast, this is certainly a very nice addition! A few comments regarding documentation only.


Comment on lines +120 to +125
# Under the hood, `RFECV` uses importance scores derived from the coefficients of the
# linear model we used, to choose which feature to eliminate. We show here how to use
# `permutation_importance` as an alternative way to measure the importance of features.
# For that, we use a callable in the `importance_getter` parameter of RFECV.
# This callable accepts a fitted model and an array containing the indices of the
# features that have not been eliminated yet.

Let's rephrase to introduce the importance_getter earlier in the sentence. By doing so it's easier to justify the "under the hood" statement. We can use something similar to:

The importance_getter parameter in RFE and RFECV uses by default the coef_ (e.g. in linear models) or the feature_importances_ attributes of an estimator to derive the feature importance. We show here how to use a callable instead to compute the permutation_importance. This callable accepts a fitted model and an array containing the indices of the features that remain after elimination.

Comment on lines +127 to +129
return importance for each feature. When it accepts it, the callable is passed
`feature_indices` which stores the index of the features in the full dataset
that have not yet been eliminated in previous iterations.

"When it accepts it" is a bit vague. How about being more explicit? Something in the line of

[...] it should an importance value for each feature. If the callable accepts an additional argument feature_indices, it should contain the indices of the features in the full dataset that that remain after elimination in previous iterations.

@@ -0,0 +1,9 @@
- :class:`feature_selection.RFE` and :class:`feature_selection.RFECV`
now support the use of :func:`permutation_importance` as an :attr:`importance_getter`.
When a callable, and when it can accept it, the :attr:`importance_getter` is passed

Same comment about "it can accept it": it is unclear what each "it" refers to.

Successfully merging this pull request may close these issues:

[FEAT] Let importance_getter from RFE accept test data for permutation importance