[MRG] Allow sample weights and other fit_params for RFE #7333
rfe.fit(X, y, sample_weight=w)
ranking_1 = rfe.ranking_.copy()

# Case 2 - duplicate the features of one class
duplicate the samples?
Agreed, I'll change this.
Looks good apart from comment about the test.
I do consider this a work in progress though, due to the remaining additions mentioned at the end of the first post.

- Test that the weighted feature ranking is different from the original feature ranking
- Clearer comments and variable names
I have updated the tests with respect to @amueller's comments, but won't work on
I think you also need to add the
Any updates?
This needs a merge with master.
Any updates?
@g-rutter – any updates on this? |
For anyone looking for RFE with gradient boosting models such as LGBM or XGB, I suggest shap-hypetune, a Python package for simultaneous hyperparameter tuning and feature selection for gradient boosting models. It supports RFE (also with SHAP feature ranking) and accepts every fit parameter, as in the standard algorithm API.
@ijpulidos and I are taking over this PR to finish it up against the current main branch.
Reference Issue
Fixes #7308
What does this implement/fix? Explain your changes.
Adds support for passing sample weights into `RFE.fit()` and having them used by the estimator's `fit` method.
Adds a test for this with the iris dataset, where sample weights are used to give one class double its normal weight and compare that to doubling the samples in that class. The test passes if both these approaches produce the same feature ranking. This feature ranking is different from the one which arises when all samples have the same weight.
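The idea behind the test described above can be illustrated with a small, dependency-free sketch. This is not the scikit-learn implementation: `toy_importances` and `toy_rfe_ranking` are hypothetical stand-ins for an estimator's `fit` and for the RFE elimination loop, the data is made up, and integer sample weights are assumed so that exact fractions can be used. It only shows that forwarding `sample_weight` through the loop gives the same ranking as duplicating the samples, and a different ranking from the unweighted fit.

```python
from fractions import Fraction


def toy_importances(X, y, sample_weight=None):
    """Hypothetical stand-in for estimator.fit: the importance of a feature
    is |weighted covariance with y|. Assumes integer weights (exact math)."""
    w = [1] * len(X) if sample_weight is None else list(sample_weight)
    total = sum(w)
    y_mean = Fraction(sum(wi * yi for wi, yi in zip(w, y)), total)
    importances = []
    for j in range(len(X[0])):
        x_mean = Fraction(sum(wi * row[j] for wi, row in zip(w, X)), total)
        cov = sum(wi * (row[j] - x_mean) * (yi - y_mean)
                  for wi, row, yi in zip(w, X, y)) / total
        importances.append(abs(cov))
    return importances


def toy_rfe_ranking(X, y, **fit_params):
    """Hypothetical RFE loop: drop the least important feature each round,
    forwarding fit_params (e.g. sample_weight) to every fit."""
    remaining = list(range(len(X[0])))
    ranking = [0] * len(remaining)
    rank = len(remaining)
    while len(remaining) > 1:
        X_sub = [[row[j] for j in remaining] for row in X]
        imps = toy_importances(X_sub, y, **fit_params)
        worst = min(range(len(remaining)), key=lambda k: imps[k])
        ranking[remaining.pop(worst)] = rank  # eliminated first = worst rank
        rank -= 1
    ranking[remaining[0]] = 1  # last surviving feature gets rank 1
    return ranking


# Made-up data: three classes (0, 1, 2), two samples each, three features.
X = [[-1, 1, 0], [1, -1, 0], [-1, 9, 1], [1, 11, -1], [3, 4, 0], [5, 6, 2]]
y = [0, 0, 1, 1, 2, 2]

# Case 0: all samples weighted equally.
ranking_unweighted = toy_rfe_ranking(X, y)
# Case 1: class 2 gets double weight.
ranking_weighted = toy_rfe_ranking(X, y, sample_weight=[1, 1, 1, 1, 2, 2])
# Case 2: the samples of class 2 are duplicated instead.
ranking_duplicated = toy_rfe_ranking(X + X[4:], y + y[4:])

print(ranking_unweighted)   # [2, 1, 3]
print(ranking_weighted)     # [1, 2, 3]
print(ranking_duplicated)   # [1, 2, 3]
```

Because the toy importance is computed per feature, the recursion here is simpler than with a real multivariate estimator; the point is only the forwarding of `**fit_params` and the weight/duplication equivalence.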
Any other comments?
I will look into adding the same support for `RFECV`, and to `RFE.score()`.