Describe the bug
I'm not sure if it is a proper bug, or my lack of understanding of the metadata routing API ;)
When enable_metadata_routing=True, the score method of a LogisticRegressionCV estimator will ignore sample_weight.
set_config(enable_metadata_routing=True)
logreg_cv = LogisticRegressionCV().fit(X, y)
logreg_cv.score(X, y, sample_weight=sw) == logreg_cv.score(X, y)  # True: both are the unweighted accuracy
I found it surprising, because the score method works fine when enable_metadata_routing=False, so the same piece of code behaves differently depending on the metadata routing config.
set_config(enable_metadata_routing=False)
logreg_cv = LogisticRegressionCV().fit(X, y)
logreg_cv.score(X, y, sample_weight=sw)  # weighted accuracy
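For reference, here is a minimal pure-Python sketch of what "weighted accuracy" means in this report: the total weight of the correctly classified samples divided by the total weight. The helper name `weighted_accuracy_ref` and the toy arrays are made up for illustration; this is not scikit-learn's implementation.

```python
def weighted_accuracy_ref(y_true, y_pred, sample_weight=None):
    """Share of total sample weight that falls on correct predictions."""
    if sample_weight is None:
        # Without weights every sample counts equally (plain accuracy).
        sample_weight = [1.0] * len(y_true)
    correct = sum(w for t, p, w in zip(y_true, y_pred, sample_weight) if t == p)
    return correct / sum(sample_weight)


# Toy data (hypothetical): the only misclassified sample carries a large weight.
y_true = [0, 1, 1, 0]
y_pred = [0, 1, 0, 0]
weights = [1, 1, 4, 2]

unweighted = weighted_accuracy_ref(y_true, y_pred)        # 3 of 4 samples correct
weighted = weighted_accuracy_ref(y_true, y_pred, weights)  # (1 + 1 + 2) / 8
```

When the two values differ like this, a score call that silently drops sample_weight returns the first number where the caller expected the second.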
If I understood the metadata routing API correctly, to make the score method sample_weight-aware we need to explicitly pass a scorer that requests it:
set_config(enable_metadata_routing=True)
weighted_accuracy = make_scorer(accuracy_score).set_score_request(sample_weight=True)
logreg_cv = LogisticRegressionCV(scoring=weighted_accuracy).fit(X, y)
logreg_cv.score(X, y, sample_weight=sw)  # weighted accuracy
If it's the intended behavior of the metadata routing API, maybe we should warn the user or raise an error in the first case, instead of silently ignoring sample_weight?
Steps/Code to Reproduce
from sklearn import set_config
from sklearn.metrics import make_scorer, accuracy_score
from sklearn.linear_model import LogisticRegressionCV
import numpy as np
rng = np.random.RandomState(22)
n_samples, n_features = 10, 4
X = rng.rand(n_samples, n_features)
y = rng.randint(0, 2, size=n_samples)
sw = rng.randint(0, 5, size=n_samples)
set_config(enable_metadata_routing=True)
logreg_cv = LogisticRegressionCV()
logreg_cv.fit(X, y)
# sample_weight is silently ignored in logreg_cv.score
assert logreg_cv.score(X, y) == logreg_cv.score(X, y, sample_weight=sw)
# accuracy_score expects (y_true, y_pred); the result is not the weighted accuracy
assert logreg_cv.score(X, y, sample_weight=sw) != accuracy_score(y, logreg_cv.predict(X), sample_weight=sw)
Expected Results
Either logreg_cv.score(X, y, sample_weight=sw) raises an error/warning or the assertions are false.
Actual Results
The assertions are true.
Versions
sklearn: 1.7.dev0