RFE/RFECV doesn't work with sample weights #7308

Closed
1 of 2 tasks
g-rutter opened this issue Aug 31, 2016 · 12 comments · Fixed by #20380 or #29312

@g-rutter

g-rutter commented Aug 31, 2016

As far as I can tell, sklearn.feature_selection.RFE has no way to pass sample weights to the estimator alongside the data.

I have fixed this in my code with:

index bbe0cda..f5072b2 100644
--- a/sklearn/feature_selection/rfe.py
+++ b/sklearn/feature_selection/rfe.py
@@ -120,7 +120,7 @@ class RFE(BaseEstimator, MetaEstimatorMixin, SelectorMixin):
     def _estimator_type(self):
         return self.estimator._estimator_type

-    def fit(self, X, y):
+    def fit(self, X, y, **fit_params):
         """Fit the RFE model and then the underlying estimator on the selected
            features.

@@ -132,9 +132,9 @@ class RFE(BaseEstimator, MetaEstimatorMixin, SelectorMixin):
         y : array-like, shape = [n_samples]
             The target values.
         """
-        return self._fit(X, y)
+        return self._fit(X, y, **fit_params)

-    def _fit(self, X, y, step_score=None):
+    def _fit(self, X, y, step_score=None, **fit_params):
         X, y = check_X_y(X, y, "csc")
         # Initialization
         n_features = X.shape[1]
@@ -166,7 +166,7 @@ class RFE(BaseEstimator, MetaEstimatorMixin, SelectorMixin):
             if self.verbose > 0:
                 print("Fitting estimator with %d features." % np.sum(support_))

-            estimator.fit(X[:, features], y)
+            estimator.fit(X[:, features], y, **fit_params)

             # Get coefs
             if hasattr(estimator, 'coef_'):

Would this be a worthwhile contribution to scikit-learn?
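
The pattern in the diff above can be shown in isolation: a meta-estimator accepts extra keyword arguments in `fit` and hands them unchanged to the wrapped estimator. The following is only a minimal sketch in plain Python; `WeightedMean` and `ForwardingWrapper` are hypothetical names made up for illustration and are not part of scikit-learn.

```python
class WeightedMean:
    """Toy estimator (hypothetical): stores the weighted mean of y."""

    def fit(self, X, y, sample_weight=None):
        # Fall back to uniform weights when none are given.
        if sample_weight is None:
            sample_weight = [1.0] * len(y)
        total = sum(sample_weight)
        self.mean_ = sum(w * t for w, t in zip(sample_weight, y)) / total
        return self


class ForwardingWrapper:
    """Toy meta-estimator (hypothetical) that forwards extra fit arguments."""

    def __init__(self, estimator):
        self.estimator = estimator

    def fit(self, X, y, **fit_params):
        # Same idea as the patch: pass **fit_params through to the inner fit.
        self.estimator.fit(X, y, **fit_params)
        return self


X = [[0.0], [1.0]]
y = [0.0, 4.0]

plain = ForwardingWrapper(WeightedMean()).fit(X, y)
weighted = ForwardingWrapper(WeightedMean()).fit(X, y, sample_weight=[1.0, 3.0])

print(plain.estimator.mean_)     # unweighted mean of y
print(weighted.estimator.mean_)  # mean pulled toward the heavier sample
```

Without the `**fit_params` pass-through, the wrapper would have no way to receive `sample_weight` at all, which is the limitation reported here.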

Versions

In [1]: import platform; print(platform.platform())
Linux-3.13.0-63-generic-x86_64-with-Ubuntu-14.04-trusty

In [2]: import sys; print("Python", sys.version)
('Python', '2.7.6 (default, Jun 22 2015, 17:58:13) \n[GCC 4.8.2]')

In [3]: import numpy; print("NumPy", numpy.__version__)
('NumPy', '1.11.0')

In [4]: import scipy; print("SciPy", scipy.__version__)
('SciPy', '0.17.1')

In [5]: import sklearn; print("Scikit-Learn", sklearn.__version__)
('Scikit-Learn', '0.18.dev0')

TODO:

  • Add support for sample_weight in RFE
  • Add support for sample_weight in RFECV
@amueller
Member

yeah I think you can open a PR for that. You need to add tests, though.

@jnothman
Member

Note that testing for sample weights usually involves checking that weights
correspond to repeating samples.
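
A minimal sketch of that kind of check, assuming integer weights: fitting with `sample_weight=[2, 1, 3]` should give the same result as fitting on data where each sample is repeated 2, 1 and 3 times. The estimator here is a hypothetical weighted least-squares slope written in plain Python, not scikit-learn code; an actual test for RFE would compare the selected features or fitted coefficients in the same way.

```python
import math


def fit_slope(x, y, sample_weight=None):
    """Weighted least-squares slope through the origin (toy estimator)."""
    if sample_weight is None:
        sample_weight = [1.0] * len(x)
    num = sum(w * xi * yi for w, xi, yi in zip(sample_weight, x, y))
    den = sum(w * xi * xi for w, xi in zip(sample_weight, x))
    return num / den


def repeat_by_weight(values, weights):
    """Repeat each value as many times as its integer weight."""
    return [v for v, w in zip(values, weights) for _ in range(w)]


x = [1.0, 2.0, 3.0]
y = [1.5, 3.5, 7.0]
weights = [2, 1, 3]

# Fit once with weights, once on the explicitly repeated samples.
weighted = fit_slope(x, y, sample_weight=weights)
repeated = fit_slope(repeat_by_weight(x, weights), repeat_by_weight(y, weights))

# The two fits should agree up to floating-point rounding.
assert math.isclose(weighted, repeated)
```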


@g-rutter
Author

g-rutter commented Sep 1, 2016

Great, want to assign this to me @amueller? I'll sort out a PR with tests.

@jnothman
Member

jnothman commented Sep 1, 2016

We don't use the "assignee" feature much, and for some reason it's not letting me assign. Go ahead and open a PR, reference this issue, then ping one of us after we've released 0.18: it's not likely to be looked at before then.

@amueller
Member

you can only "assign" contributors

@fbidu
Contributor

fbidu commented Jun 27, 2021

Hey all, just an FYI: @ijpulidos and I worked on this in #20380

@nathanwalker-sp

@fbidu thank you so much for this! For what it's worth, I think it would be fairly simple to add this to RFECV as well (which just calls RFE), but I understand if that's outside the scope of what you're working on (since you're already building/trying to merge).

@fbidu
Contributor

fbidu commented Jul 24, 2021

@nathanwalker-sp you're welcome! I agree with you, and it should indeed be a simple change, but I'm not familiar with the practices this project uses to track these.

@glemaitre as the reviewer for my PR, what do you think? May I just go ahead and implement this there or is it better if we first merge #20380 and then create a new issue/PR pair to track RFECV?

@fbidu
Contributor

fbidu commented Jul 28, 2021

@nathanwalker-sp yeah, @glemaitre already merged it. Can you please create a new issue referring back to this in order to track RFECV?

@glemaitre
Member

I am reopening this issue and will rename the title. Feel free to open a new PR.

glemaitre reopened this on Jul 28, 2021
glemaitre changed the title from "RFE doesn't work with sample weights" to "RFE/RFECV doesn't work with sample weights" on Jul 28, 2021

@max-franceschi

Hello,
Is this still open for RFECV?

@glemaitre
Member

@max-franceschi metadata routing needs to be implemented for the CV. Check #22893
