Warm start bug when fitting a LogisticRegression model on binary outcomes with multi_class='multinomial'. #10836

@rwolst

Description

There is a bug when fitting a LogisticRegression model on binary outcomes with multi_class='multinomial' and warm_start=True. It is related to #9889: on binary outcomes the estimator stores a 1D (single-row) coef_ even when multi_class='multinomial', rather than the 2D (two-row) coefficient matrix the multinomial solver works with internally.

Steps/Code to Reproduce

from sklearn.linear_model import LogisticRegression
import sklearn.metrics
import numpy as np

# Set up a logistic regression object
lr = LogisticRegression(C=1000000, multi_class='multinomial',
                        solver='sag', tol=0.0001, warm_start=True,
                        verbose=0)

# Set independent variable values
Z = np.array([
    [ 0.        ,  0.        ],
    [ 1.33448632,  0.        ],
    [ 1.48790105, -0.33289528],
    [-0.47953866, -0.61499779],
    [ 1.55548163,  1.14414766],
    [-0.31476657, -1.29024053],
    [-1.40220786, -0.26316645],
    [ 2.227822  , -0.75403668],
    [-0.78170885, -1.66963585],
    [ 2.24057471, -0.74555021],
    [-1.74809665,  2.25340192],
    [-1.74958841,  2.2566389 ],
    [ 2.25984734, -1.75106702],
    [ 0.50598996, -0.77338402],
    [ 1.21968303,  0.57530831],
    [ 1.65370219, -0.36647173],
    [ 0.66569897,  1.77740068],
    [-0.37088553, -0.92379819],
    [-1.17757946, -0.25393047],
    [-1.624227  ,  0.71525192]])

# Set dependent variable values
Y = np.array([1, 0, 0, 1, 0, 0, 0, 0,
              0, 0, 1, 1, 1, 0, 0, 1,
              0, 0, 1, 1], dtype=np.int32)

# First fit model normally
lr.fit(Z, Y)

p = lr.predict_proba(Z)
print(sklearn.metrics.log_loss(Y, p)) # ...

print(lr.intercept_)
print(lr.coef_)

# Now refit the model; warm_start reuses the previous solution
lr.fit(Z, Y)

p = lr.predict_proba(Z)
print(sklearn.metrics.log_loss(Y, p)) # ...

print(lr.intercept_)
print(lr.coef_)

Expected Results

The predictions should be the same, since the model already converged the first time it was fit.
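
For example, one could check this programmatically (a minimal sketch reusing lr, Z and Y from the reproduction script above; the tolerance is an assumption):

coef_first = lr.coef_.copy()
lr.fit(Z, Y)  # warm-started refit
# With the bug present, this assertion is expected to fail.
assert np.allclose(coef_first, lr.coef_, atol=1e-3)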

Actual Results

The predictions are different. In fact, the more times you re-run the fit, the worse they get; that degradation is the only reason I was able to catch the bug. It is caused by this line: https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/linear_model/logistic.py#L678

 w0[:, :coef.shape[1]] = coef

As coef has shape (1, n_features) but w0 has shape (2, n_features), the assignment broadcasts the single coef row into both rows of w0. Starting both classes from identical weights creates a degenerate (singular) starting point for training, which results in worse performance. Note that had it not done exactly this, i.e. had w0 simply been initialised with random values, this bug would have been very hard to catch, because the model would still converge each time, just not as fast as one would hope when warm starting.
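
A minimal NumPy sketch of the broadcast (the shapes here are illustrative, not the actual solver state):

import numpy as np

coef = np.array([[2.0, -1.0, 0.5]])  # binary case: shape (1, n_features)
w0 = np.zeros((2, 3))                # multinomial path: shape (2, n_features)

w0[:, :coef.shape[1]] = coef         # broadcasts the single row into BOTH rows
print(w0)
# [[ 2.  -1.   0.5]
#  [ 2.  -1.   0.5]]
# Both classes now start from identical weights, a degenerate warm start.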

Further Information

I believe the fix is very easy: just swap the previous line to

 if n_classes == 1:
     w0[0, :coef.shape[1]] = -coef  # Be careful to get these the right way around
     w0[1, :coef.shape[1]] = coef
 else:
     w0[:, :coef.shape[1]] = coef
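
As a sanity check on the sign convention (my own sketch, assuming that in the binary multinomial case only the class-1 weight row is stored): the symmetric start [-coef, coef] reproduces exactly the probabilities implied by the stored coefficients, since softmax([-z, z]) assigns class 1 the probability sigmoid(2*z):

import numpy as np

rng = np.random.RandomState(0)
X = rng.randn(5, 3)
coef = rng.randn(3)                       # the single stored weight row

W = np.vstack([-coef, coef])              # proposed symmetric warm start
logits = X @ W.T                          # shape (5, 2)
p_softmax = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

# Equivalent binary form: sigmoid(2 * (x . coef))
p_sigmoid = 1.0 / (1.0 + np.exp(-2.0 * (X @ coef)))
assert np.allclose(p_softmax[:, 1], p_sigmoid)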

Versions

Linux-4.13.0-37-generic-x86_64-with-Ubuntu-16.04-xenial
Python 3.5.2 (default, Nov 23 2017, 16:37:01)
NumPy 1.14.2
SciPy 1.0.0
Scikit-Learn 0.20.dev0 (built from latest master)
