scikit-learn custom transformer is raising NotFitted Error #19953
Comments
Our I assume that we should fix our |
@glemaitre what I understood from your point is that Pipeline and RandomizedSearchCV make a shallow copy, which means the parameters passed to the constructor cannot be cloned. But why was it working with 0.21.1 (was this change made after that release)? Is there any workaround for the current situation?
I would need to investigate more. I would not expect the behaviour to have changed. |
@glemaitre is there any workaround for the above issue? Also, did you check version 0.21.1? Please let me know if there is a specific file I can look at to understand why I am seeing this behavior.
I assume that the thing that could have changed is the
def __sklearn_is_fitted__(self):
    return True
Adding this method allows you to bypass the error: check_is_fitted will then see your estimator as always fitted, which is fine if it is stateless. It is equivalent to our |
Even after adding |
@glemaitre can you please suggest the workaround since even after adding I'm getting the same error. |
In simple, how can we pass the fitted transformer (in the above case it is |
@amitmeel can you please try to make your reproducer as minimal as possible? I am under the impression that there are many unnecessary steps in your code, which makes it hard for us to understand what is causing you trouble. https://scikit-learn.org/dev/developers/minimal_reproducer.html#minimal-reproducer
Something that I find weird in your code is that the |
Same comment for |
@ogrisel if you look closely at the code, we are passing a fitted transformer/vectorizer to a custom vectorizer.
In the above snippet, tf_content and tf_keyphrase are fitted instances of TfidfVectorizer, that's why I'm not doing anything in
When I run the same code in version 0.21.1 and check whether tf_content is fitted or not using
Reproducible code: code
In short, how can we pass fitted transformers (in the above case tf_content and tf_keyphrase, which need to be passed to CustomVectorizer) to a custom transformer so that, when it is used with RandomizedSearchCV or VotingClassifier, it does not throw NotFittedError and performs the transformations as expected?
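One possible alternative, sketched here under assumptions, is to not pass a fitted vectorizer at all. Instead, pass an unfitted one and fit a copy inside fit, which survives cloning because clone rebuilds an estimator from its constructor parameters. The class FitsOwnVectorizer and the sample documents are hypothetical. This also changes the semantics: the vectorizer is fitted on whatever data reaches fit (for example each cross-validation training split) rather than on a corpus chosen in advance.

```python
from sklearn.base import BaseEstimator, TransformerMixin, clone
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.utils.validation import check_is_fitted


class FitsOwnVectorizer(TransformerMixin, BaseEstimator):
    """Hypothetical alternative to the CustomVectorizer from the issue."""

    def __init__(self, vectorizer=None):
        # Store the unfitted template untouched, as clone() expects.
        self.vectorizer = vectorizer

    def fit(self, X, y=None):
        template = TfidfVectorizer() if self.vectorizer is None else self.vectorizer
        # Learned state lives in a trailing-underscore attribute.
        self.vectorizer_ = clone(template).fit(X)
        return self

    def transform(self, X):
        check_is_fitted(self, "vectorizer_")
        return self.vectorizer_.transform(X)


docs = ["red apple", "green apple", "red car"]
cloned = clone(FitsOwnVectorizer(TfidfVectorizer()))
features = cloned.fit(docs).transform(docs)  # one row per document
```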
@ogrisel @glemaitre any update on the above issue?
We still don't have a minimal reproducible here. You can use |
Describe the bug
I was experimenting with scikit-learn after updating scikit-learn from 0.21.1 to 1.0.2, and found that the custom transformer had stopped working. I wonder what might have changed in version 1.0.2 which caused this issue. Is there a workaround to resolve this issue?
Below are code snippets of the same to reproduce the issue:
This was working in scikit-learn 0.21.1 as expected and giving the below output:
but in scikit-learn 1.0.2, I'm getting the below error:
Also, when I defined a new classifier and used it in a voting classifier as shown below, I got a NotFittedError in scikit-learn 1.0.2, but the same code was working with scikit-learn 0.21.1:
I am wondering what changed when we call it through the voting classifier. Does it clone the estimator, so that the fitted instance of tfidf is no longer passed to the custom vectorizer?
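The cloning hypothesis can be checked in isolation with a small sketch. The class WrapsFittedVectorizer and the sample documents are hypothetical stand-ins for the reporter's CustomVectorizer and data, which are not shown in the issue. It relies on clone recursively cloning any constructor parameter that is itself an estimator, so a fitted vectorizer passed to __init__ comes back unfitted.

```python
from sklearn.base import BaseEstimator, TransformerMixin, clone
from sklearn.exceptions import NotFittedError
from sklearn.feature_extraction.text import TfidfVectorizer


class WrapsFittedVectorizer(TransformerMixin, BaseEstimator):
    """Hypothetical stand-in for the CustomVectorizer from the issue."""

    def __init__(self, vectorizer):
        self.vectorizer = vectorizer

    def fit(self, X, y=None):
        # Nothing is learned here: the wrapped vectorizer is assumed fitted.
        return self

    def transform(self, X):
        return self.vectorizer.transform(X)


docs = ["red apple", "green apple", "red car"]
tf_content = TfidfVectorizer().fit(docs)

original = WrapsFittedVectorizer(tf_content)
original.fit(docs).transform(docs)  # works: tf_content is fitted

# RandomizedSearchCV and VotingClassifier clone the estimators they are
# given before fitting them; clone() is called directly here to mimic that.
cloned = clone(original)

try:
    cloned.fit(docs).transform(docs)
    clone_failed = False
except NotFittedError:
    # The nested vectorizer was rebuilt from its parameters, so it is unfitted.
    clone_failed = True
```

This sketch only shows what clone does to a nested fitted estimator. It does not establish why the original code behaved differently under 0.21.1.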
Versions
scikit-learn=0.24.1
numpy=1.19.2
scipy=1.6.0
pandas=1.2.1
platform: Windows_x64
Python=3.6.10
Note: the code was working fine with scikit-learn 0.21.1, numpy 1.18.1, scipy 1.3.1, and pandas 0.25.1.
Reproducible code: code