[RFC] Stateless transformers requiring fit? #12616

Closed
amueller opened this issue Nov 19, 2018 · 6 comments
@amueller
Member

amueller commented Nov 19, 2018

Right now there are some estimators that don't require calling "fit"; I'm aware of two: Normalizer and FunctionTransformer. They do input validation if fit is called.
There's one estimator, AdditiveChi2Sampler, that is stateless but requires calling fit for no real reason that I can see.

My questions are:

  • Should we remove the requirement to call fit if it can be avoided?

  • If fit is called, should we ensure, to avoid user errors, that the number of features is the same in fit and transform, even though the algorithm doesn't require it?
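The second question can be made concrete with a toy example. The sketch below is plain Python rather than scikit-learn code: `RowL2Normalizer` and the way it records `n_features_in_` are hypothetical, chosen only for illustration. It shows a transformer whose `transform` works without `fit`, but which rejects input with a different number of features once `fit` has been called.

```python
import math


class RowL2Normalizer:
    """Toy stateless transformer (hypothetical, not the scikit-learn class)."""

    def fit(self, X, y=None):
        # Nothing is learned from X; only its width is recorded so that
        # transform can catch shape mismatches later.
        self.n_features_in_ = len(X[0])
        return self

    def transform(self, X):
        expected = getattr(self, "n_features_in_", None)
        out = []
        for row in X:
            # Only enforce the feature count if fit was called.
            if expected is not None and len(row) != expected:
                raise ValueError(
                    f"X has {len(row)} features, but fit saw {expected}."
                )
            norm = math.sqrt(sum(v * v for v in row)) or 1.0
            out.append([v / norm for v in row])
        return out


# transform works without fit because there is no learned state.
print(RowL2Normalizer().transform([[3.0, 4.0]]))  # [[0.6, 0.8]]

# After fit, a different number of features is treated as a user error.
fitted = RowL2Normalizer().fit([[1.0, 2.0]])
try:
    fitted.transform([[1.0, 2.0, 3.0]])
except ValueError as exc:
    print(exc)
```

The feature-count check is not needed by the normalization itself; it exists purely to surface a likely mistake in the calling code.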

amueller added the API label Nov 19, 2018
@jnothman
Member

jnothman commented Nov 19, 2018 via email

@qinhanmin2014
Member

Should we remove the requirement to call fit if it can be avoided?

+1. This would avoid awkward things like #12514 and may be more user-friendly.

@itanvir

itanvir commented Apr 25, 2022

This issue is still open. Any workaround when we have "known" categories in OneHotEncoder?

@glemaitre
Member

We are dealing with this issue now with the parameter validation: some stateless estimators need to validate their input parameters, which is usually done in fit, even though nothing is learned from X.

I would be in favour of always requiring a call to fit to validate those parameters, and of keeping "stateless" to mean that the estimator does not extract from the training X any information needed to transform another X.

With this behaviour, these estimators would not differ from the others, but they would still carry the tag, which ensures some mathematical consistency regarding their stateless aspect.

@glemaitre
Member

This issue can be closed, since #25190 solves it and defines the behaviour of stateless estimators. #24230 also defines a new stateless transformer.

@glemaitre
Member

I see that I forgot to mention that, in one of the meetings, we proposed to always do parameter validation in fit, but without requiring that fit be called. We added a common test to ensure that this behavior is consistent across scikit-learn.
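A minimal sketch of that convention, in plain Python with hypothetical names (`ScaleBy` and its `_check_params` helper are illustrative and are not scikit-learn's implementation): parameters are validated in `fit`, while `transform` stays usable without a prior `fit` call.

```python
class ScaleBy:
    """Toy stateless transformer (hypothetical, for illustration only)."""

    def __init__(self, factor=1.0):
        # Following the scikit-learn convention, __init__ only stores parameters.
        self.factor = factor

    def _check_params(self):
        # Reject anything that is not a real number (bool is excluded on purpose).
        if isinstance(self.factor, bool) or not isinstance(self.factor, (int, float)):
            raise TypeError(f"factor must be a real number, got {self.factor!r}.")

    def fit(self, X, y=None):
        # fit learns nothing from X; it only validates the parameters.
        self._check_params()
        return self

    def transform(self, X):
        # No "is fitted" check: calling fit first is optional.
        return [[v * self.factor for v in row] for row in X]


# transform without fit is allowed.
print(ScaleBy(factor=2).transform([[1, 2], [3, 4]]))  # [[2, 4], [6, 8]]

# An invalid parameter is reported when fit is called.
try:
    ScaleBy(factor="2").fit([[1, 2]])
except TypeError as exc:
    print(exc)
```

In this sketch, skipping `fit` also skips validation, so an invalid `factor` would go unreported by `transform` alone; that is the trade-off of not requiring the call.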
