DEP parameter penalty in LogisticRegression and LogisticRegressionCV #32659
Conversation
ogrisel
left a comment
Thanks for getting this in. I just have a concern with the warning strategy for l1_ratios in LogisticRegressionCV (see below).
I am rather indifferent to the possibilities, just say which one:
In any case,
doc/whats_new/upcoming_changes/sklearn.linear_model/32659.api.rst
glemaitre
left a comment
LGTM. Only 3 tiny changes.
There are a bunch of conflicts here now @lorentzenchr
Let's first merge #32073 and then resolve merge conflicts here. In case I don't respond in time for your release schedule, please take over.
Force-pushed from b19fa52 to b3b50fa
@glemaitre @ogrisel It would be nice to get actual review approvals.
@lesteve Thanks for the finish.
lesteve
left a comment
I pushed a few tweaks and I am posting this comment before I run out of steam.
One thing I noticed (sure, it's a bit of an edge case): it's not super clear in this PR who wins between penalty and l1_ratio. I think it's penalty, even though it is deprecated, which is roughly the same behaviour as in 1.7, but there is nothing about this in the output:
In [1]: from sklearn.datasets import make_classification
...: from sklearn.linear_model import LogisticRegression
...:
...: X, y = make_classification()
...: lr = LogisticRegression(penalty='l1', l1_ratio=0, solver='saga')
...: lr.fit(X, y)
...:
/home/lesteve/dev/scikit-learn/sklearn/linear_model/_logistic.py:1132: FutureWarning: 'penalty' was deprecated in version 1.8 and will be removed in 1.10. To avoid this warning, leave 'penalty' set to its default value and use 'l1_ratio' or 'C' instead. Use l1_ratio=0 instead of penalty='l2', l1_ratio=1 instead of penalty='l1', and C=np.inf instead of penalty=None.
warnings.warn(
On scikit-learn 1.7 there is a warning that l1_ratio gets ignored ...
In [1]: from sklearn.datasets import make_classification
...: from sklearn.linear_model import LogisticRegression
...:
...: X, y = make_classification()
...: lr = LogisticRegression(penalty='l1', l1_ratio=0, solver='saga')
...: lr.fit(X, y)
...:
/home/lesteve/micromamba/lib/python3.12/site-packages/sklearn/linear_model/_logistic.py:1221: UserWarning: l1_ratio parameter is only used when penalty is 'elasticnet'. Got (penalty=l1)
warnings.warn(
/home/lesteve/micromamba/lib/python3.12/site-packages/sklearn/linear_model/_sag.py:348: ConvergenceWarning: The max_iter was reached which means the coef_ did not converge
warnings.warn(
Out[1]: LogisticRegression(l1_ratio=0, penalty='l1', solver='saga')
  # Check that an informative error message is raised when penalty="elasticnet"
  # but l1_ratio is not specified.
- model = LR(penalty="elasticnet", solver="saga")
+ model = LR(penalty="elasticnet", **{arg: None}, solver="saga")
This is a small change of behaviour, but I guess it's fine since it is a bit more permissive than before and this is a bit of an edge case ...
With this PR, LogisticRegression(penalty='elasticnet').fit(X, y) is fine (equivalent to penalty='l2'), because l1_ratio=0. by default. Before it was raising an error because l1_ratio=None.
Similar thing for LogisticRegressionCV
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With this PR, LogisticRegression(penalty='elasticnet').fit(X, y) is fine (equivalent to penalty='l2'), because l1_ratio=0. by default. Before it was raising an error because l1_ratio=None.
I think it's acceptable as well. It will be there only until 1.10.
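A minimal sketch of the behaviour change described above; solver='saga' is added here only so the snippet also runs on released versions, which restrict the elastic-net penalty to that solver:

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(random_state=0)
# With this PR, l1_ratio defaults to 0, so penalty='elasticnet' without an
# explicit l1_ratio fits an L2-penalized model (plus the FutureWarning about
# the deprecated `penalty`) instead of raising because l1_ratio=None, as before.
LogisticRegression(penalty="elasticnet", solver="saga").fit(X, y)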
  with an L1 penalty, see:
  :ref:`sphx_glr_auto_examples_linear_model_plot_logistic_path.py`.

  l1_ratio : float, default=0.0
@lorentzenchr just curious, can you explain the reordering? Is it to be in the same order as the parameters of the constructor?
This makes the PR harder to review and I feel this is not really necessary but whatever at this point, I will just push on and try to ignore the pain 😅 ...
The parameters for the penalty should be at the start. Once penalty is gone, those are C and l1_ratio. Similar to class ElasticNet.
  l1_ratios_ : ndarray of shape (n_l1_ratios)
-     Array of l1_ratios used for cross-validation. If no l1_ratio is used
+     Array of l1_ratios used for cross-validation. If l1_ratios=None is used
      (i.e. penalty is not 'elasticnet'), this is set to ``[None]``
I am not sure "penalty is not 'elasticnet'" is fully accurate here, but I don't have a good suggestion
This will have to change in 1.10, and in the meantime it is correct, so I wouldn't bother too much.
jeremiedbb
left a comment
A few small comments
  if penalty != "elasticnet" and (
      self.l1_ratio is not None and 0 < self.l1_ratio < 1
  ):
      warnings.warn(
          "l1_ratio parameter is only used when penalty is "
          "'elasticnet'. Got "
-         "(penalty={})".format(self.penalty)
+         "(penalty={})".format(penalty)
I would add an error for the two special incompatible cases penalty='l1' + l1_ratio=0 and penalty='l2' + l1_ratio=1.
@lesteve for the case you mention in #32659 (review), I think we should detect it and raise an error here
@lesteve for the case you mention in #32659 (review), I think we should detect it and raise an error here
To be sure we are talking about the same thing: you would like the following snippet to raise an error in 1.8, even though it's only a UserWarning in 1.7? [1]
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
X, y = make_classification()
lr = LogisticRegression(penalty='l1', l1_ratio=0, solver='saga')
lr.fit(X, y)

[1] The warning says that l1_ratio gets ignored: UserWarning: l1_ratio parameter is only used when penalty is 'elasticnet'. Got (penalty=l1)
Yeah actually an error is not good because it would break backward compat. I added a warning instead in 49ed659
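For illustration only, a rough sketch of what such a warning could look like; this is not the code from 49ed659, and the helper name, message wording, and warning class are assumptions:

import warnings

def _warn_inconsistent_penalty_l1_ratio(penalty, l1_ratio):
    # Hypothetical helper: the deprecated `penalty` and the new `l1_ratio`
    # disagree, so warn (rather than raise) to preserve backward compatibility.
    if (penalty == "l1" and l1_ratio == 0) or (penalty == "l2" and l1_ratio == 1):
        warnings.warn(
            f"penalty={penalty!r} is inconsistent with l1_ratio={l1_ratio}; "
            "'penalty' takes precedence while it is deprecated.",
            FutureWarning,
        )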
jeremiedbb
left a comment
LGTM
OK let's merge this one so we can start the rc release process, thanks @lorentzenchr and everyone else for the reviews!
Reference Issues/PRs
Partially solves #28711.
This is the no-brainer part of #32042 (comment).
What does this implement/fix? Explain your changes.
This PR

In LogisticRegression:
- deprecates the parameter penalty
- changes the default of l1_ratio from None to 0
- l1_ratio=None is deprecated and forbidden as of 1.10

In LogisticRegressionCV:
- deprecates the parameter penalty
- changes the default of l1_ratios from None to "warn" (= deprecation of None)
- the default of l1_ratios will change to (0,) in 1.10
- l1_ratios=None is deprecated and forbidden as of 1.10
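A minimal usage sketch of the migration, following the guidance in the FutureWarning quoted earlier in this thread (exact warning text may differ):

import numpy as np
from sklearn.linear_model import LogisticRegression

# Replacements suggested by the deprecation warning:
#   penalty='l2'  ->  l1_ratio=0 (the new default)
#   penalty='l1'  ->  l1_ratio=1
#   penalty=None  ->  C=np.inf
deprecated_l1 = LogisticRegression(penalty="l1", solver="saga")  # warns from 1.8 on
new_l1 = LogisticRegression(l1_ratio=1, solver="saga")           # no deprecation warning
unpenalized = LogisticRegression(C=np.inf)                       # instead of penalty=None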
Any other comments?