
Division by zero using SGDClassifier with alpha=0 #5329


Closed
afshinrahimi opened this issue Sep 30, 2015 · 7 comments

@afshinrahimi

When using SGDClassifier, setting alpha=0 raises division-by-zero warnings and an underflow/overflow exception. This happens regardless of what the penalty term is set to.

If penalty is 'none', I think SGDClassifier should accept alpha=0.

The problem is that when I set the penalty to 'none' and try different values of alpha, I get different results (with shuffle=False), which I don't expect because there is no regularization any more. So I think something other than the regularization term is using the alpha parameter.

Reproduction:
The problem is reproducible by setting alpha=0 in one of the SGDClassifiers from the 20 newsgroups example:

SGDClassifier(alpha=0, average=False, class_weight=None, epsilon=0.1,
       eta0=0.0, fit_intercept=True, l1_ratio=0.15,
       learning_rate='optimal', loss='hinge', n_iter=50, n_jobs=1,
       penalty='none', power_t=0.5, random_state=None, shuffle=False,
       verbose=0, warm_start=False)

Errors:

/usr/local/lib/python2.7/dist-packages/sklearn/linear_model/stochastic_gradient.py:292: RuntimeWarning: divide by zero encountered in double_scalars
  est.power_t, est.t_, intercept_decay)
/usr/local/lib/python2.7/dist-packages/sklearn/linear_model/stochastic_gradient.py:292: RuntimeWarning: invalid value encountered in double_scalars
  est.power_t, est.t_, intercept_decay)

  File "sklearn/linear_model/sgd_fast.pyx", line 404, in sklearn.linear_model.sgd_fast.plain_sgd (sklearn/linear_model/sgd_fast.c:4873)
  File "sklearn/linear_model/sgd_fast.pyx", line 697, in sklearn.linear_model.sgd_fast._plain_sgd (sklearn/linear_model/sgd_fast.c:7523)
ValueError: Floating-point under-/overflow occurred at epoch #1. Scaling input data with StandardScaler or MinMaxScaler might help.
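
The chain from the divide-by-zero warning to the overflow at epoch 1 can be illustrated with a minimal SGD-style update (a sketch, not sklearn's internals; all names here are illustrative). With alpha=0 the "optimal" step size becomes infinite, and the very first weight update produces non-finite values:

```python
import numpy as np

# Minimal SGD-style update mimicking what happens inside plain_sgd
# when alpha=0 makes the step size eta ~ 1/alpha blow up to infinity.
# w and grad are illustrative placeholders, not sklearn internals.
w = np.zeros(3)
grad = np.array([0.5, -1.0, 0.2])
eta = np.inf  # the step-size schedule yields inf when alpha = 0

w -= eta * grad              # first update already produces inf/-inf
print(np.isfinite(w).all())  # False: under-/overflow at epoch 1
```

This matches the reported behavior: a RuntimeWarning at the division, followed by the floating-point overflow error in the first epoch.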
@agramfort
Member

agramfort commented Sep 30, 2015 via email

@afshinrahimi
Author

@agramfort even with penalty='none', the exception is thrown if alpha=0. So I think alpha is being used somewhere other than the regularisation term.

@TomDLT
Member

TomDLT commented Sep 30, 2015

alpha is currently used to compute the step size eta when learning_rate="optimal":
eta is chosen proportional to 1/alpha, as described in the doc.

However, the SGDClassifier docstring does not mention that alpha is used when learning_rate="optimal".

We should probably:

  • add it to the docstring (in alpha and learning_rate)
  • raise an error if alpha=0, since the case with no penalty is already possible with penalty='none'.
    (or maybe just when learning_rate='optimal')
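
The second bullet could be sketched roughly as follows (a hypothetical validation helper; the function name and error message are illustrative, not the actual fix):

```python
def check_sgd_params(alpha, learning_rate):
    # Hypothetical validation sketch: reject alpha <= 0 whenever the
    # step-size schedule would divide by alpha.
    if learning_rate == "optimal" and alpha <= 0:
        raise ValueError(
            "alpha must be > 0 when learning_rate='optimal', "
            "because alpha is used to compute the step size."
        )

check_sgd_params(alpha=0.0001, learning_rate="optimal")  # OK
check_sgd_params(alpha=0, learning_rate="constant")      # OK: schedule ignores alpha
# check_sgd_params(alpha=0, learning_rate="optimal")     # would raise ValueError
```

Restricting the check to learning_rate='optimal' (the parenthetical option above) keeps alpha=0 usable with schedules that never divide by it.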

@agramfort
Member

agramfort commented Sep 30, 2015 via email

@rrohan
Contributor

rrohan commented Sep 30, 2015

@agramfort @TomDLT I was looking for an easy enhancement to start contributing to scikit-learn. Is this something I can pick up?

@TomDLT
Member

TomDLT commented Oct 1, 2015

@rrohan Excellent, go ahead!

@rrohan
Contributor

rrohan commented Oct 1, 2015

@TomDLT Thank you. I have submitted a PR #5335 which fixes this.

agramfort added a commit that referenced this issue Oct 4, 2015
[MRG+1] fixes #5329. Division by zero using SGDClassifier