DOC Release highlights for 1.8 #32809
print(f"Fitting ElasticNetCV took {toc - tic:.3} seconds.")

# %%
# API changes in logistic regression
Thanks for writing this! I am a bit undecided about mentioning the LogisticRegression{,CV} deprecations in the release highlights.
Part of me thinks that deprecations are not so "crucial" and don't really belong in the release highlights. Part of me thinks that the LogisticRegression{,CV} changes may catch a few users and it would be nice to have a good place where they are summarized. Right now, my feeling is that they are a bit scattered in the changelog and in the docstrings.
I was even thinking about linking the issue where deprecation or not of C is discussed 😅
I'd keep it in the highlights. If 1.8 was the end of the story it might not be worth it, but more changes are happening and the way to handle them is a bit more unusual than a run-of-the-mill deprecation (temporary argument).
OK, I am going to start pinging people who I feel are way better placed than me to draft some of the entries left as TODO. Please feel free to push into this PR branch directly 🙏! It doesn't have to be perfect: once there is an initial draft for the section, we can always improve the release highlights together.
ogrisel
left a comment
First pass of contributions.
virchan
left a comment
I don't have the permission to push into the PR branch directly, so I will leave it as a suggestion here.
Co-authored-by: Olivier Grisel <[email protected]> Co-authored-by: Virgil Chan <[email protected]>
Co-authored-by: Christian Lorentzen <[email protected]>
# in particular `how to install a free-threaded CPython <https://py-free-threading.github.io/installing_cpython/>`_
# and `Ecosystem compatibility tracking <https://py-free-threading.github.io/tracking/>`_.
#
# The long term goal of free-threaded Python is to more efficiently leverage
What is the goal of this paragraph? My impression is that it is to tell people that right now they need to reconfigure joblib, and to make them aware that free-threading support is still in an early phase of development (in scikit-learn). If so, should we remove the explanation of how free-threaded Python promises to remove the need for process-based workers, and directly say something like: "You need to call joblib.parallel_config(backend="threading") to change the default backend to "threading" and use n_jobs>1 to take advantage of free-threading for parallel computation. Note: free-threaded Python and support for it are brand new, which means that there are open issues to fix before making this the default."
Olivier wrote this, and I guess the goal is to manage expectations. One tricky thing in this section is to hit the right balance between "this is cool, you should definitely try this!" because we would love user feedback and "actually unless you turn a few knobs here and there, you will likely not notice any difference".
Basically, if you switch to free-threading expecting your code to be faster, in most cases you will be disappointed; it may actually go a bit slower (e.g. JIT and free-threaded don't play nice with each other or something like this).
For completeness: here is a user report from July 2024 where "just switching to free-threaded" helped #29587, but I guess it's a bit special.
Switching the default joblib backend to use threads instead of processes may make things faster. So if a user wants to do this and report issues, it would be more than welcome!
In the future joblib may use threading as the default backend (only for free-threaded I guess), but this is longer-term and we would need to test it a bit more. I did test it a bit in the past but there were issues. Some of them were likely due to Python 3.13 and have gone away.
If we want people to try this (with the "this is experimental!" caveat) I think we should start the paragraph with what they have to do.
From what I understand:
- use Python 3.14
- n_jobs>1
- switch the joblib backend to threading
Is that right?
Maybe you don't have to switch the joblib backend and still get some improvements in some areas - but is it worth trying to explain this finer point to users vs telling them explicitly "do X, Y and Z". I think I am in the camp of "give clear instructions"
Thanks for your comments, they make a lot of sense. I'll try to have a closer look at improving the free-threaded section!
> use Python 3.14

This is not enough. You need to use a free-threaded build of CPython 3.14, not the usual one, as explained in the linked doc.
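One way a user could check which kind of interpreter they are running is sketched below, using only the standard library. The helper names `is_free_threaded_build` and `gil_is_enabled` are hypothetical and introduced here for illustration; `sys._is_gil_enabled()` is only available on CPython 3.13 and later, so the sketch falls back to assuming the GIL is on for older versions.

```python
import sys
import sysconfig


def is_free_threaded_build() -> bool:
    """Return True if this interpreter was compiled as a free-threaded build."""
    # Py_GIL_DISABLED is 1 on free-threaded builds, and 0 or None otherwise.
    return bool(sysconfig.get_config_var("Py_GIL_DISABLED"))


def gil_is_enabled() -> bool:
    """Return True if the GIL is active at runtime."""
    # A free-threaded build can still re-enable the GIL at runtime,
    # so the build flag alone is not the full picture.
    check = getattr(sys, "_is_gil_enabled", None)
    return True if check is None else bool(check())


print(f"Free-threaded build: {is_free_threaded_build()}")
print(f"GIL enabled at runtime: {gil_is_enabled()}")
```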
Co-authored-by: Olivier Grisel <[email protected]>
For the record, I granted permissions to edit the colab notebook to a few reviewers who asked for it over on discord.
If instead I get our vote on deprecating
Co-authored-by: Christian Lorentzen <[email protected]>
Rendered doc for last chance review before merge 😉! (or even after merge, you can always backport it to release branch).
lorentzenchr
left a comment
To be able to merge.
betatim
left a comment
I approve of this message
Thanks everyone for your contributions, the end result feels very nice!
Co-authored-by: Christian Lorentzen <[email protected]> Co-authored-by: Olivier Grisel <[email protected]> Co-authored-by: Virgil Chan <[email protected]> Co-authored-by: Omar Salman <[email protected]> Co-authored-by: Tim Head <[email protected]> Co-authored-by: Jérémie du Boisberranger <[email protected]>
Topics I have thought of by looking at the changelog, in no particular order:
- CalibratedClassifierCV #31068

Other things that I am not sure are noteworthy enough for the release highlights: TSNE with PCA initialization, MDS with different metrics, QuadraticDiscriminantAnalysis improvements.