DEP deprecate LogisticRegression parameters penalty and C, introduce new regularization parameter alpha #32042
Conversation
This PR is intentionally not yet 100% implemented (class weights, LogisticRegressionCV, fixing a lot of tests elsewhere, etc.). Let's first see how the discussion goes.
I'm +1 on using […]. That said, I haven't fully considered the engineering cost of completing this 😅. At first glance, we would also need to update the […]. A few examples would also need updating (e.g., […]).
I have several worries about this change: […]

I seriously doubt that the benefits in clarity and API consistency outweigh those costs, even if integrated over the expected lifetime of the scikit-learn project. The third point could be addressed by defining an […].
🤔 I opened #28711 in March 2024, over a year ago, and @scikit-learn/core-devs were pinged. Yet it did not attract any critical or worried voices. So thanks @ogrisel for raising them now.
Good point that I had not considered so far. My conclusion is different, though: it gives the authors the opportunity to publish a new edition. Considering all the big API changes that we had (metadata routing, pandas in/out, array API), this change of the penalty parameter of LogisticRegression seems very minor. This is also a point that could be raised against any deprecation, and fortunately it does not stop us. As an analogy, old C++ material from before C++11 is also very outdated. You should not learn from it; better learn modern C++, so at least C++17.
Isn't that the case with any deprecation? It is just that LogisticRegression might be among the most used classes we have. We could use an extended deprecation period, e.g. 4 releases instead of 2.
Have a look at the table in #28711 (comment). LogisticRegression really stands out as the only estimator not having alpha as its penalty parameter. As a maintainer, this makes my brain itch every time I see it, and as the current main developer for linear models, I see it very often. The literature also prefers a penalty parameter over the inverse penalty C, e.g. the legendary Elements of Statistical Learning and the original publication for penalized (ridge) logistic regression, Cessie (1992) https://doi.org/10.2307/2347628. I guess the C variant mainly stems from the SVM literature, and SVMs are very much outdated nowadays.
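For concreteness, the two parameterizations can be related with a small sketch. Assuming the current solver minimizes `C * sum(loss_i) + ||w||_2^2 / 2` and the proposed form is `sum(loss_i) / n + alpha/2 * ||w||_2^2`, dividing the former by `C * n` suggests `alpha = 1 / (n * C)`. This is a hypothetical conversion for illustration only; the exact mapping used in the PR may differ:

```python
def c_to_alpha(C: float, n_samples: int) -> float:
    """Hypothetical conversion from the current inverse penalty C to a
    penalty strength alpha, assuming the objectives

        C * sum(loss_i) + 1/2 * ||w||_2^2          (current)
        1/n * sum(loss_i) + alpha/2 * ||w||_2^2    (proposed)

    share the same minimizer: dividing the first by C * n yields
    alpha = 1 / (n * C).
    """
    return 1.0 / (n_samples * C)


# The default C=1.0 on a dataset with 100 samples would correspond to:
print(c_to_alpha(1.0, 100))  # 0.01
```

Note how the conversion depends on `n_samples`: unlike `C`, the proposed `alpha` scales the penalty relative to the *average* loss, which is exactly what makes it comparable across estimators like `ElasticNet`.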
The deprecation of […]

Regarding […]
There are 2 changes in the PR that should be discussed separately imo:

1. the deprecation of `penalty` and `C` in favor of the new parameter `alpha`;
2. the change of the default of `l1_ratio` from `None` to `0`.
Reference Issues/PRs
Closes #28711.
Related to #11865.
What does this implement/fix? Explain your changes.
This PR

- deprecates the `LogisticRegression` parameters `penalty` and `C` and introduces `alpha`,
- changes the default of `l1_ratio` from `None` to `0`,
- deprecates the `LogisticRegressionCV` parameters `penalty` and `Cs` and introduces `alphas`,
- changes the default of `l1_ratios` from `None` to `(0,)`,
- deprecates the attributes `C_` and `Cs_` and introduces `alpha_` and `alphas_`.

The way to specify penalties is then 100% aligned with `ElasticNet(alpha=.., l1_ratio=..)` and with other GLMs like `PoissonRegressor(alpha=..)` (without L1, all use `1/(2n) * sum(loss) + alpha/2 * ||w||_2^2`).

Any other comments?
This will be a highly controversial issue, therefore a lot of fun ahead 🎉
The main reason for this change is that the current API is objectively bad design. Currently, I need to specify 3 parameters (`penalty`, `C`, `l1_ratio`) for just 2 effective parameters (the L1 and L2 penalization). On top, it warns a lot when mixing those, e.g. `penalty="l2"` and `l1_ratio=0`, but why on earth...