Enable mixed ensembles with estimators that do & don't accept the sample_weight fit_param #20167

Closed
@ajcallegari

Description

I need to make a VotingRegressor ensemble with some estimators that accept sample weights during fitting and some that don't. Currently, mixed ensembles raise an exception:

from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.ensemble import RandomForestRegressor, VotingRegressor
from sklearn.neighbors import KNeighborsRegressor

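X, y = make_regression()
weights = abs(y)  # per-sample weights

# KNeighborsRegressor.fit() has no sample_weight parameter; the other two accept it.
rgr = VotingRegressor(estimators=[('LR', LinearRegression()),
                                  ('KNN', KNeighborsRegressor()),
                                  ('RF', RandomForestRegressor())])

rgr.fit(X, y, sample_weight=weights)  # raises TypeError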

result:
TypeError: Underlying estimator KNeighborsRegressor does not support sample weights.

A possible solution would be to have the ensemble classes (e.g. VotingRegressor, VotingClassifier, StackingRegressor, StackingClassifier) read the fit() signatures of their estimators and not pass sample_weight to estimators that don't accept it. Or, more realistically, they could catch exceptions raised by calls to fit() with the sample_weight parameter and fall back to calling fit() without it. This behavior could be the default, or it could be activated by a flag like "enable_mixed_sample_weight" in the ensemble class's __init__ method. If it's important to notify the user when an estimator doesn't accept sample_weight, the notification and the exception currently in place could be enabled with a flag like "enforce_sample_weight".
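The signature-reading idea can be sketched in plain Python, without touching scikit-learn internals. Everything below is hypothetical and only illustrates the proposal: `accepts_sample_weight`, `fit_mixed`, and the two toy estimators are made-up names, not part of any library. (scikit-learn does ship a similar check, `sklearn.utils.validation.has_fit_parameter`, which could presumably take the place of the helper here.)

```python
import inspect


def accepts_sample_weight(estimator):
    """Return True if estimator.fit declares a sample_weight parameter."""
    return "sample_weight" in inspect.signature(estimator.fit).parameters


def fit_mixed(estimators, X, y, sample_weight=None):
    """Fit each (name, estimator) pair, passing weights only where accepted.

    Returns the names of estimators that were fitted without the weights,
    so the caller can warn about them or raise if that is not acceptable.
    """
    skipped = []
    for name, est in estimators:
        if sample_weight is not None and accepts_sample_weight(est):
            est.fit(X, y, sample_weight=sample_weight)
        else:
            est.fit(X, y)
            if sample_weight is not None:
                skipped.append(name)
    return skipped


class WeightedMean:
    """Toy estimator (hypothetical) whose fit() accepts sample_weight."""

    def fit(self, X, y, sample_weight=None):
        w = sample_weight if sample_weight is not None else [1.0] * len(y)
        self.mean_ = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
        return self


class PlainMean:
    """Toy estimator (hypothetical) whose fit() has no sample_weight."""

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self


X = [[0.0], [1.0], [2.0]]
y = [1.0, 2.0, 6.0]
weights = [1.0, 1.0, 2.0]

ensemble = [("weighted", WeightedMean()), ("plain", PlainMean())]
skipped = fit_mixed(ensemble, X, y, sample_weight=weights)
print(skipped)  # names of estimators fitted without the weights
```

Returning the skipped names keeps the notification decision with the caller, which is roughly what a flag like "enforce_sample_weight" would control.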

As a workaround, I'm using the Ensemble class from the pipecaster library (https://github.com/ajcallegari/pipecaster), which allows mixed ensembles by catching exceptions raised by fit() and then falling back to a fit() call without the sample_weight parameter. This Ensemble class has the scikit-learn interface and supports classification, regression, voting, and model stacking.
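The catch-and-fall-back behavior described above might look roughly like the sketch below. It is an illustration of the general pattern only, not pipecaster's actual code; `fit_with_fallback` and the two toy classes are hypothetical names. One assumption worth flagging: it treats any TypeError from fit() as "sample_weight not supported", which would also mask an unrelated TypeError raised inside fit().

```python
def fit_with_fallback(estimator, X, y, sample_weight=None):
    """Try fit() with sample_weight, then retry without it on TypeError.

    Returns True if the weights were actually used, False otherwise.
    """
    if sample_weight is None:
        estimator.fit(X, y)
        return False
    try:
        estimator.fit(X, y, sample_weight=sample_weight)
        return True
    except TypeError:
        # Assumption: the TypeError came from the unexpected keyword argument.
        estimator.fit(X, y)
        return False


class WithWeights:
    """Toy estimator (hypothetical) that accepts sample_weight."""

    def fit(self, X, y, sample_weight=None):
        self.saw_weights_ = sample_weight is not None
        return self


class WithoutWeights:
    """Toy estimator (hypothetical) that does not accept sample_weight."""

    def fit(self, X, y):
        self.saw_weights_ = False
        return self


X_demo = [[0.0], [1.0]]
y_demo = [0.0, 1.0]
w_demo = [1.0, 3.0]

est_a, est_b = WithWeights(), WithoutWeights()
used_a = fit_with_fallback(est_a, X_demo, y_demo, sample_weight=w_demo)
used_b = fit_with_fallback(est_b, X_demo, y_demo, sample_weight=w_demo)
```

Compared with inspecting the fit() signature, this approach also handles estimators that take `**kwargs`, at the cost of the ambiguity about where the TypeError came from.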
