compute_class_weight() class param behaviour #4327

@trevorstephens

Not sure if it's relevant to the motivation behind the implementation discussed in #4324, but when y has two classes and both are listed in the classes param, the returned weights sum to the number of classes:

compute_class_weight('auto', [0, 1], iris.target[0:100])
array([ 1.,  1.])

But when y has three classes and only two of them are listed in the classes param, the result is something different altogether, and the weights no longer sum to the number of classes passed:

compute_class_weight('auto', [0, 1], iris.target[0:120])
array([ 0.66666667,  0.66666667])
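The two results above can be reproduced with a small stand-alone sketch. It assumes the 'auto' heuristic takes the inverse frequency of every class found in y, divides by the mean of those inverse frequencies, and only then selects the entries named in classes. The helper name `auto_weights_sketch` is hypothetical, and the label lists stand in for iris.target[0:100] and iris.target[0:120] (50/50 and 50/50/20 samples per class); this is not the library's code.

```python
from collections import Counter


def auto_weights_sketch(classes, y):
    """Hypothetical re-implementation of the assumed 'auto' heuristic."""
    counts = Counter(y)
    present = sorted(counts)
    # Inverse frequency for every class that occurs in y.
    recip_freq = {c: 1.0 / counts[c] for c in present}
    # Normalisation uses ALL classes in y, not just the requested ones.
    mean_recip = sum(recip_freq.values()) / len(recip_freq)
    # Only afterwards are the requested classes picked out.
    return [recip_freq[c] / mean_recip for c in classes]


# Stand-ins for iris.target[0:100] and iris.target[0:120].
y_two = [0] * 50 + [1] * 50
y_three = [0] * 50 + [1] * 50 + [2] * 20

w_two = auto_weights_sketch([0, 1], y_two)
w_three = auto_weights_sketch([0, 1], y_three)
print(w_two)
print(w_three)
```

Under that assumption, the class that is present in y but absent from classes still contributes to the normalising mean, which is why the two returned weights shrink to 2/3 each in the second case.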

I had sidestepped this in compute_sample_weight in #4190 by determining the present classes from y itself. I'm happy to open a PR to remove the param, and was going to, but although the function is somewhat private, it is exposed in partial_fit in BaseSGDClassifier:

"In order to use 'auto' weights, use compute_class_weight('auto', classes, y)."

So does this need a deprecation warning? Some more discussion?
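For reference, the workaround described above (taking the classes from y instead of from a separate param) might look roughly like the following. This is a minimal sketch under the same assumed inverse-frequency heuristic; `weights_from_y_sketch` is a hypothetical name and this is not the code from #4190.

```python
from collections import Counter


def weights_from_y_sketch(y):
    """Hypothetical helper: derive the classes from y, then weight them."""
    counts = Counter(y)
    classes = sorted(counts)
    # Inverse frequency per class, normalised by the mean over the same classes.
    recip = [1.0 / counts[c] for c in classes]
    mean_recip = sum(recip) / len(recip)
    weights = [r / mean_recip for r in recip]
    return classes, weights


# Stand-in for iris.target[0:120]: 50, 50 and 20 samples per class.
y = [0] * 50 + [1] * 50 + [2] * 20
classes, weights = weights_from_y_sketch(y)
print(classes, weights, sum(weights))
```

Because the classes being weighted and the classes used for normalisation are always the same set here, the weights sum to the number of classes regardless of what y contains.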
