Version 1.6.X: ClassifierMixIn failing with new __sklearn_tags__ function #30479
Comments
@DaMuBo According to the official documentation, I think your estimator should look like this:

```python
import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin


class MyEstimator(ClassifierMixin, BaseEstimator):
    def __init__(self, *, param=1):
        self.param = param

    def fit(self, X, y=None):
        self.is_fitted_ = True
        return self

    def predict(self, X):
        return np.full(shape=X.shape[0], fill_value=self.param)
```

The code snippet above runs fine with your …
@gunsodo Thanks for the hint about `check_estimator`, we'll look into it. But I also think `ClassifierMixin` shouldn't use a method it can't execute.
I would still consider this breakage a regression. We made an effort to raise a proper deprecation warning so that users who had defined compatible estimators got one version to change their code. While we strongly advocate for using `BaseEstimator`, we should probably have a try/except around the `super().__sklearn_tags__()` call.
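A rough sketch of what such a try/except guard could look like, using toy stand-in classes rather than scikit-learn's actual implementation (the fallback tags dict and warning text here are illustrative assumptions):

```python
import warnings


class ClassifierMixin:
    def __sklearn_tags__(self):
        try:
            # Cooperative call: expects a base class in the MRO to provide tags.
            tags = super().__sklearn_tags__()
        except AttributeError:
            # No base class defines __sklearn_tags__ (e.g. BaseEstimator is
            # missing from the MRO): warn and fall back instead of crashing.
            warnings.warn(
                "Inheriting from ClassifierMixin without BaseEstimator is "
                "deprecated; falling back to default tags."
            )
            tags = {"estimator_type": None}  # hypothetical default
        tags["estimator_type"] = "classifier"
        return tags


class LegacyEstimator(ClassifierMixin):  # no BaseEstimator in the MRO
    pass


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    tags = LegacyEstimator().__sklearn_tags__()

print(tags)  # {'estimator_type': 'classifier'}
```

The estimator still works, and the author gets a warning telling them to fix their inheritance.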
@adrinjalali do you think that it is a reasonable way forward?
I'd rather make … We certainly shouldn't "fix" the issue as proposed in #30480, since it breaks code if the inheritance order is wrong. The code written here in the OP is simply "not supported"; however, it's the kind of thing that was supported before, albeit dysfunctional and wrong. I'd suggest two things:
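The inheritance-order point can be illustrated with toy classes (illustrative names, not scikit-learn's actual code): when tags are resolved through a cooperative `super()` chain, putting the mixin after the base class means the base class's method wins in the MRO and the mixin's contribution is silently skipped.

```python
# Toy model of cooperative tag resolution (not scikit-learn's real classes).
class Base:
    def tags(self):
        return {"estimator_type": None}


class Mixin:
    def tags(self):
        t = super().tags()  # relies on the next class in the MRO
        t["estimator_type"] = "classifier"
        return t


class Right(Mixin, Base):  # documented order: mixin first
    pass


class Wrong(Base, Mixin):  # reversed order: Base.tags() wins, Mixin is skipped
    pass


print(Right().tags())  # {'estimator_type': 'classifier'}
print(Wrong().tags())  # {'estimator_type': None}  -- classifier tags silently lost
```

This is why a fix that only works for one inheritance order is dangerous: the wrong order doesn't crash, it just produces wrong tags.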
I still have this issue with TransformerMixin in 1.6.1. What is the intended fix? My code is something like:

```python
import pandas as pd
import xgboost as xgb
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.pipeline import Pipeline


class TransformCoerce(TransformerMixin, BaseEstimator):
    """Coerce input data to all numeric for xgboost."""

    def fit(self, X, y):
        return self

    def transform(self, X):
        cols = X.columns.tolist()
        value_cols = list(filter(lambda x: x.endswith("value"), cols))
        X[value_cols] = X[value_cols].apply(lambda x: pd.to_numeric(x, errors="coerce"))
        return X


pipeline = Pipeline([("transformer", TransformCoerce()),
                     ("xgb", xgb.XGBClassifier(enable_categorical=True))])

# df is a pandas DataFrame with a "GT" target column (not shown here)
X, y = df.drop("GT", axis=1), df["GT"]
clf = pipeline.fit(X, y)
# in-sample predictions for simplicity
y_pred_prob = clf.predict_proba(X)
print(y_pred_prob)
```

All the online tutorials say to inherit from TransformerMixin and BaseEstimator, in particular https://scikit-learn.org/stable/modules/generated/sklearn.base.TransformerMixin.html. Is it an issue to do with xgboost?
@jxu this is not a reproducer, so I can't really answer. But xgboost doesn't have any issues if you use the latest xgboost and scikit-learn versions.
Yes, it is all fixed with the latest versions. Per https://stackoverflow.com/questions/79290968/super-object-has-no-attribute-sklearn-tags, I either had to downgrade to scikit-learn 1.5.2 or upgrade xgboost to 3.0.
Describe the bug
Hi,
we are using scikit-learn in our projects for different classification training methods at production level. In the dev stage we upgraded to the latest release and our training failed due to changes in the `ClassifierMixin` class. We use it in combination with a sklearn `Pipeline`.
In 1.6.X the following function was introduced:
It calls the `__sklearn_tags__` method of its parent class via `super()`. But `ClassifierMixin` doesn't have a parent class that defines it, so the call fails with `AttributeError: 'super' object has no attribute '__sklearn_tags__'`.
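Since the report elides the actual snippet, here is a minimal pure-Python sketch of the mechanism with toy stand-in classes (not scikit-learn's real code): a mixin whose `__sklearn_tags__` cooperatively calls `super()` raises `AttributeError` when no base class in the MRO provides the method.

```python
class ClassifierMixin:
    def __sklearn_tags__(self):
        # Cooperative call: assumes another class in the MRO defines the method.
        tags = super().__sklearn_tags__()
        tags["estimator_type"] = "classifier"
        return tags


class BaseEstimator:
    def __sklearn_tags__(self):
        return {"estimator_type": None}


class Works(ClassifierMixin, BaseEstimator):  # super() resolves to BaseEstimator
    pass


class Broken(ClassifierMixin):  # super() resolves to object: no such method
    pass


print(Works().__sklearn_tags__())  # {'estimator_type': 'classifier'}
try:
    Broken().__sklearn_tags__()
except AttributeError as exc:
    print(exc)  # 'super' object has no attribute '__sklearn_tags__'
```

The same failure appears in real code when an estimator inherits from `ClassifierMixin` (or `TransformerMixin`) without also inheriting from `BaseEstimator`, or when a third-party base class predates the 1.6 tags API.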
Steps/Code to Reproduce
Expected Results
A prediction is returned.
Actual Results
Versions