KBinsDiscretizer.transform mutates the _encoder attribute #12490
Comments
Yes, this is a problem.
Yes, although I am not sure it's necessary, because we pass the categories to the constructor, so the […]
I think our […] I'm wondering why this is not detected by the common test, but I don't have time to investigate now. So contributors, please try to investigate our common test.
Hey, I'd like to work on this if no one else has taken it up yet!
An issue is never "taken" :) If you don't see any linked PR, feel free to give it a try. I am not sure this is a good first issue, though. In particular, you need to be very familiar with the details of the contract of the estimator API: http://scikit-learn.org/stable/developers/contributing.html#apis-of-scikit-learn-objects
If we do `fit_transform` in `transform`, is there any need to store the encoder?
We also need it in […]
So there's actually a bug in our common test (i.e., we use […])
@ogrisel I now recall that we chose to fit the OneHotEncoder in `transform` because we only determine the bins in `fit`. We need to put the data into the different bins before feeding it to the OneHotEncoder.
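To make the problem concrete, here is a toy sketch of the problematic pattern, using hypothetical classes (`ToyOneHot`, `ToyDiscretizer`) rather than scikit-learn's actual code: bin edges are learned in `fit`, but the internal one-hot encoder is (re-)fitted inside `transform`, so calling `transform` mutates fitted state and the output width depends on whatever data was last transformed.

```python
# Toy sketch (hypothetical names) of the buggy pattern discussed above.

class ToyOneHot:
    """A minimal one-hot encoder standing in for sklearn's OneHotEncoder."""

    def fit(self, column):
        self.categories_ = sorted(set(column))
        return self

    def transform(self, column):
        return [[1 if v == c else 0 for c in self.categories_] for v in column]

    def fit_transform(self, column):
        return self.fit(column).transform(column)


class ToyDiscretizer:
    """A minimal discretizer standing in for KBinsDiscretizer."""

    def __init__(self, bin_edges):
        self.bin_edges = bin_edges

    def fit(self, values):
        # Only the bins are determined here; the encoder is left unfitted.
        self._encoder = ToyOneHot()
        return self

    def transform(self, values):
        bins = [sum(v >= e for e in self.bin_edges) for v in values]
        # BUG: fit_transform re-fits the encoder on every call, so the
        # learned categories (and the output width) depend on the data
        # passed to transform, not on the data seen during fit.
        return self._encoder.fit_transform(bins)


disc = ToyDiscretizer(bin_edges=[1.0, 2.0]).fit([0.5, 1.5, 2.5])
wide = disc.transform([0.5, 1.5, 2.5])  # 3 bins seen -> width 3
narrow = disc.transform([0.5, 0.6])     # only bin 0 seen -> width 1!
```

The second `transform` call both changes `disc._encoder.categories_` and shrinks the output, which is exactly the kind of state mutation the estimator API forbids after `fit`.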
FYI I proposed an awkward solution in #12514 without refactoring the code. |
I don't understand what you mean here. |
https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/preprocessing/_discretization.py#L234-L270
I think we should call `self._encoder.transform` instead of `self._encoder.fit_transform`.
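Continuing the toy sketch above (hypothetical classes, not scikit-learn's actual implementation), the fix is possible because all bin indices `0 .. n_bins` are known at `fit` time: the encoder can be fully fitted in `fit` (or constructed with fixed categories), and `transform` then only calls the encoder's `transform`, leaving fitted state untouched.

```python
# Hedged sketch of the proposed fix: fit the internal encoder once in
# fit() on the known bin indices, then only call transform() later.

class FixedOneHot:
    """Toy encoder whose categories are passed up front, so no re-fitting
    is ever needed (analogous to passing `categories` to OneHotEncoder)."""

    def __init__(self, categories):
        self.categories_ = list(categories)

    def transform(self, column):
        return [[1 if v == c else 0 for c in self.categories_] for v in column]


class FixedDiscretizer:
    def __init__(self, bin_edges):
        self.bin_edges = bin_edges

    def fit(self, values):
        # With len(bin_edges) edges there are len(bin_edges) + 1 possible
        # bin indices, all known here, so the encoder is fully set up in fit.
        self._encoder = FixedOneHot(range(len(self.bin_edges) + 1))
        return self

    def transform(self, values):
        bins = [sum(v >= e for e in self.bin_edges) for v in values]
        # Only transform: no fitted state is mutated, and the output width
        # is stable regardless of which values are transformed.
        return self._encoder.transform(bins)


disc = FixedDiscretizer(bin_edges=[1.0, 2.0]).fit([0.5, 1.5, 2.5])
out = disc.transform([0.5, 0.6])  # width stays 3 even though only bin 0 appears
```

With this structure, repeated `transform` calls are idempotent with respect to the estimator's attributes, which is what the common tests should be able to check.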