Description
Not sure if it's relevant to the motivation behind the implementation as discussed in #4324, but a two-class y array with both of its classes listed in the classes param gives weights whose sum equals the number of classes:
compute_class_weight('auto', [0, 1], iris.target[0:100])
array([ 1., 1.])
A three-class y array with only two of its classes listed in the classes param, however, gives weights that no longer sum to the number of classes passed:
compute_class_weight('auto', [0, 1], iris.target[0:120])
array([ 0.66666667, 0.66666667])
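For what it's worth, both outputs are consistent with normalising the reciprocal class frequencies by their mean over every label found in y, rather than over the labels in classes. The sketch below is an assumption about the heuristic and not the scikit-learn source; `auto_weights_sketch` is a hypothetical name, and the list `y` stands in for iris.target (50 samples each of classes 0, 1 and 2, in order).

```python
from collections import Counter


def auto_weights_sketch(classes, y):
    """Hypothetical re-derivation of the 'auto' heuristic (not scikit-learn code)."""
    counts = Counter(y)
    # Reciprocal frequency of every label that occurs in y.
    recip_freq = {label: 1.0 / n for label, n in counts.items()}
    # Assumption: the mean is taken over all labels in y, not only `classes`.
    mean_recip = sum(recip_freq.values()) / len(recip_freq)
    return [recip_freq[c] / mean_recip for c in classes]


# Stand-in for iris.target: 50 samples each of classes 0, 1, 2, in order.
y = [0] * 50 + [1] * 50 + [2] * 50

two_class = auto_weights_sketch([0, 1], y[:100])    # only classes 0 and 1 in y
three_class = auto_weights_sketch([0, 1], y[:120])  # class 2 also in y
print(two_class)
print(three_class)
```

Under this assumption, the 20 samples of class 2 raise the mean reciprocal frequency from 1/50 to 3/100, which is what pulls both returned weights down to 2/3.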
I had sidestepped this in compute_sample_weight in #4190 by determining the present classes from y itself. I'm happy to open a PR to remove the param, and was going to, but while the function is somewhat private, it is exposed in partial_fit in BaseSGDClassifier:
"In order to use 'auto' weights, use compute_class_weight('auto', classes, y)."
So does this need a deprecation warning? Some more discussion?
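To make the proposal concrete, here is a rough sketch of what dropping the param could look like, with the classes taken from y itself. `auto_weights_from_y` is a hypothetical helper and not an existing scikit-learn function, and the normalisation by the mean reciprocal frequency is the same assumption about the 'auto' heuristic as before.

```python
from collections import Counter


def auto_weights_from_y(y):
    """Hypothetical helper: infer the classes from y, so no `classes` param is needed."""
    counts = Counter(y)
    present = sorted(counts)  # the classes actually present in y
    recip_freq = [1.0 / counts[c] for c in present]
    # Assumed normalisation: divide by the mean reciprocal frequency.
    mean_recip = sum(recip_freq) / len(recip_freq)
    return present, [r / mean_recip for r in recip_freq]


# Same class counts as iris.target[0:120]: 50, 50 and 20 samples.
y = [0] * 50 + [1] * 50 + [2] * 20
classes, weights = auto_weights_from_y(y)
print(classes, weights, sum(weights))
```

With the classes inferred this way, the weights sum to the number of classes by construction, so the mismatch shown above cannot occur.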