RFC should the scikit-learn metrics return a Python scalar or a NumPy scalar? #27339
As hinted by @glemaitre, the accepted NEP 51 proposes to change the repr of NumPy scalars: https://numpy.org/neps/nep-0051-scalar-representation.html It is being implemented for NumPy 2.0, scheduled for next year, and is causing many of our doctests to fail. To address NEP 51, we could go several ways:
[solution-a] Do nothing to our scoring metrics (continue returning NumPy scalars).
[solution-b] Change our scoring metrics to always return a Python scalar. We could decide to make our scalar metric functions always call float(...) on their result. This means that the precision level of the computation would be lost in the returned value. It also means that eager/blocking execution semantics are forced when calling a metric function with lazy Array API compatible inputs. On the other hand, it would always move the resulting scalar value back to the CPU without the user having to do a library-specific operation, and it would make our doctest suite pass unchanged on both NumPy 1.x and NumPy 2.x. It is also less verbose for scikit-learn users calling our metric functions.
[solution-c] Add a new flag
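To illustrate what the options hinge on, here is a minimal sketch using plain NumPy (not actual scikit-learn code) of what NEP 51 changes and what always calling float(...) would do:

```python
import numpy as np

# A metric computed in float32 typically yields a NumPy scalar.
score = np.float32(0.85)

# Under NEP 51 (NumPy >= 2.0) repr(score) is "np.float32(0.85)",
# while NumPy 1.x prints "0.85" -- which is what breaks doctests.

# Solution (b): always wrap the result in float() before returning it.
wrapped = float(score)

# The result is now a plain Python float (float64 precision),
# so the fact that the computation ran in float32 is lost.
assert type(wrapped) is float
```

Note that `type(...) is float` is the relevant check here: `np.float64` subclasses Python `float`, so an `isinstance` check would not distinguish the two.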
@ogrisel it seems some of your comment is misformatted or missing words? Could you take a look?
@betatim I edited my comment.
Thanks a lot! I think my least favourite option is (c). It feels like we are delegating the complexity to our users, who probably have even less clue about "the right thing to do" than we have. I like (b) and think that it isn't a big deal that something like the precision level of the computation is lost. I also wonder if we need to specify the type of the return value of our metric functions.
I also like (b), for the same reasons as @betatim mentions.
There seems to be some amount of consensus and no new comments for a while. Should we try to wrap this up? @glemaitre what do you think of the options Olivier listed? Do you also like option (b)?
Thanks @betatim for keeping track of this issue. Option (b) looks like the right trade-off right now.
/take |
@ckosten this is not a good issue to start with. You can look for the "good first issue" and "help wanted" labels to find one.
It was assigned as a beginner problem in a scikit-learn workshop recently...
Shall we close this issue and open a matching meta-issue to track the remaining work? It might be redundant with the array API meta-issue at #26024, which is already well under way, since most scalar-returning metric functions will likely need such a treatment to have their tests pass on PyTorch with CUDA. Maybe we can leave it open for now and close it once all the metric functions referenced in #26024 have been addressed.
To be clear, you mean for them to return a scalar? Then yeah, I'm happy to have this closed.
While working on the representation changes imposed by NEP 51, I found out that we recently made accuracy_score return a Python scalar while, up to now, other metrics return a NumPy scalar. This change was made due to the array API work:
scikit-learn/sklearn/utils/_array_api.py
Lines 448 to 454 in b0da1b7
I assume that we are getting to an intersection where we should make the output of our metrics consistent but also foresee potential requirements: as the comment indicates, calling float() will be a sync point, which might not be the best strategy for lazy computation. This RFC is a placeholder to discuss what strategy we should implement.
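The conversion pattern discussed here can be sketched as follows. The helper name `as_python_float` is hypothetical (the actual helper lives in sklearn/utils/_array_api.py, referenced above); the sketch only uses NumPy:

```python
import numpy as np

def as_python_float(value):
    # Hypothetical sketch of the conversion discussed above: calling
    # float() on a NumPy scalar or 0-d array forces evaluation (a sync
    # point for lazy or GPU Array API inputs) and moves the value back
    # to the host as a plain Python float.
    return float(value)

# A toy "metric": the mean of a float32 array.
mean_score = np.mean(np.asarray([0.0, 1.0, 1.0], dtype=np.float32))
result = as_python_float(mean_score)
assert type(result) is float
```

With eager NumPy inputs the call is cheap; the cost of forcing a sync only matters for lazy or device-resident Array API inputs.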