[RFC] Stateless transformers requiring fit? #12616
Comments
Also consider HashingVectorizer, and CountVectorizer with given vocabulary,
and OneHotEncoder and OrdinalEncoder with given categories...
+1. This can avoid awkward things like #12514 and may be more user-friendly.
This issue is still open. Any workaround when we have "known" categories in OneHotEncoder?
We are dealing with this issue now with the parameter validation: some stateless estimators would like to check the validity of some input parameters, usually done at I would be in favour of always having to call At least, having this behaviour will not make these estimators different from others but would still have the tag to ensure some mathematical consistency regarding their stateless aspect.
I see that I forgot to mention that in one of the meetings we proposed to always do parameter validation in
Right now there are some estimators that don't require calling `fit`, two that I'm aware of: `Normalizer` and `FunctionTransformer`. They do input validation if `fit` is called.

There's one estimator that is stateless but requires calling `fit` for no real reason I can see, `AdditiveChi2Sampler`.

My questions are:

Should we remove the requirement to call `fit` if it can be avoided?

If `fit` is called, should we ensure that the number of features is the same in `fit` and `transform`, to avoid user errors, even though that's not required by the algorithm?