Easy one first: there is an unused class_weight parameter in the fit method signature, while the class_weight that is actually used flows in through the constructor.
Just to prove it:
from sklearn.linear_model import SGDClassifier
from sklearn.datasets import make_classification
from sklearn.utils import compute_class_weight
import numpy as np
X, y = make_classification(n_features=5, weights=[0.7, 0.3],
                           n_clusters_per_class=1, random_state=415)
# Baseline
clf = SGDClassifier()
clf.fit(X, y)
print clf.coef_
# [[ 2.13434174 1.8734288 2.12685039 5.08116123 2.89369872]]
# With unused fit class_weight parameter
clf = SGDClassifier()
clf.fit(X, y, class_weight='auto')
print clf.coef_
# [[ 2.13434174 1.8734288 2.12685039 5.08116123 2.89369872]]
Now weighting the samples in different (equivalent) ways:
# With auto-weights
clf = SGDClassifier(class_weight='auto')
clf.fit(X, y)
print clf.coef_
# [[ 10.10838607 -4.29529238 13.14026606 -5.99728163 7.65887541]]
weights = compute_class_weight('auto', clf.classes_, y)
weights = dict(zip(clf.classes_, weights))
mapper = np.vectorize(lambda c: weights[c])
weights = mapper(y)
# With manual auto-weights
clf = SGDClassifier()
clf.fit(X, y, sample_weight=weights)
print clf.coef_
# [[ 10.10838607 -4.29529238 13.14026606 -5.99728163 7.65887541]]
# With manual auto-weights & unused fit class_weight parameter
clf = SGDClassifier()
clf.fit(X, y, sample_weight=weights, class_weight='auto')
print clf.coef_
# [[ 10.10838607 -4.29529238 13.14026606 -5.99728163 7.65887541]]
All fine so far, but if you set both class_weight in the constructor and sample_weight in fit, the resulting weights appear to be multiplicative.
# With manual auto-weights, squared
clf = SGDClassifier()
clf.fit(X, y, sample_weight=weights**2)
print clf.coef_
# [[ 3.22495438 14.11510502 0.58504094 6.38631993 9.55338404]]
# With auto-weights and manual auto-weights -- multiplicative
clf = SGDClassifier(class_weight='auto')
clf.fit(X, y, sample_weight=weights)
print clf.coef_
# [[ 3.22495438 14.11510502 0.58504094 6.38631993 9.55338404]]
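The matching coefficients are what you would expect if the per-class and per-sample weights were simply multiplied together for each sample. A minimal pure-Python sketch of that arithmetic follows; effective_weights and the weight values are hypothetical names and numbers chosen for illustration, and this mimics the observed behaviour rather than scikit-learn's actual internals.

```python
# Illustration only: reproduces the observed arithmetic, not scikit-learn's code.
def effective_weights(y, class_weight, sample_weight):
    """Combine per-class and per-sample weights by multiplication (assumed rule)."""
    return [class_weight[label] * w for label, w in zip(y, sample_weight)]

y = [0, 0, 0, 1]
class_weight = {0: 0.5, 1: 2.0}  # hypothetical per-class weights

# The same class weights expanded to one weight per sample,
# like the np.vectorize mapper in the snippet above.
per_sample = [class_weight[label] for label in y]

# class_weight in the constructor plus the same weights as sample_weight ...
combined = effective_weights(y, class_weight, per_sample)

# ... gives the same per-sample weights as squaring sample_weight alone.
squared = [w ** 2 for w in per_sample]

print(combined == squared)
```

Under that assumption, passing class_weight to the constructor and the equivalent sample_weight to fit weights each sample twice, which would explain why the last two runs agree.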
Whether this is desirable or not is one thing, but it does not appear to be documented anywhere, i.e. neither class_weight nor sample_weight refers to the other in its docstring. I feel like perhaps a warning or error should be raised, or at least the interaction should be mentioned in the docstrings.