make_scorer needs_threshold makes wrong assumptions #2588
Comments
I think threshold was intended to mean "aggregate classification metric" …
Well, here's what the documentation about needs_threshold says: […]

So it does mention continuous outputs. I actually think …
I think this is an issue we should fix before the next release.
@mblondel I agree that we should fix this before the release. Do you have a suggestion for fixing the first problem (without relying on the class structure ;)? The basic issue is identical to #1404, right? I agree that the documentation is confusing with your application in mind (which I hadn't, and which @larsmans might not have had when he rewrote it). So we should either have a …

Is it ok to just remove the check for the second problem?
Exactly. A few people opposed having decision_function in all regressors. I think we could just do a try / except.
I think so (the check is already done by roc_auc_score anyway).
Ok, so you propose to use

    try:
        y_pred = clf.decision_function(X)
        [...]
    except (NotImplementedError, AttributeError):
        try:
            y_pred = clf.predict_proba(X)
            [...]
        except (NotImplementedError, AttributeError):
            y_pred = clf.predict(X)

This might give some weird results when people pass something that doesn't have a decision function but is not a regressor, for example k-means, or if there is a classifier that doesn't have a predict_proba either.
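For illustration, a minimal runnable sketch of that fallback; the helper name continuous_scores and the toy data are my own additions, not something from the thread:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier


def continuous_scores(clf, X):
    """Proposed fallback: decision_function, then predict_proba, then predict."""
    try:
        return clf.decision_function(X)
    except (NotImplementedError, AttributeError):
        try:
            # keep only the score of the positive class
            return clf.predict_proba(X)[:, 1]
        except (NotImplementedError, AttributeError):
            return clf.predict(X)


rng = np.random.RandomState(0)
X = rng.randn(50, 3)
y = (X[:, 0] > 0).astype(int)

# LogisticRegression exposes decision_function; KNeighborsClassifier does not,
# so the second call falls back to predict_proba.
print(continuous_scores(LogisticRegression().fit(X, y), X)[:3])
print(continuous_scores(KNeighborsClassifier().fit(X, y), X)[:3])
```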
I think (correct me if I'm wrong) most of our classifiers (if not all) have at least decision_function or predict_proba. For clustering, we can't be blamed if people do things that do not make sense: "garbage in, garbage out".
Another idea that comes to mind is to have a utility function or class for turning any regressor into a binary classifier.
The created classifier can be passed without ambiguity to OneVsRestClassifier or GridSearchCV.
I think this solution is less convenient to use. Also, unless we dynamically create classes, when doing parameter tuning we will need to use parameters like …
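For concreteness, a rough sketch of what such a wrapper could look like; the class name, the regressor/threshold parameter names, and the 0.5 threshold are assumptions for the example, not anything specified in the thread:

```python
import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin, clone


class RegressorAsBinaryClassifier(BaseEstimator, ClassifierMixin):
    """Hypothetical wrapper: expose a regressor's continuous output as a
    decision_function and threshold it to get binary predictions."""

    def __init__(self, regressor, threshold=0.5):
        self.regressor = regressor
        self.threshold = threshold

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.regressor_ = clone(self.regressor).fit(X, y)
        return self

    def decision_function(self, X):
        # The regressor's prediction plays the role of a continuous score.
        return self.regressor_.predict(X)

    def predict(self, X):
        indices = (self.decision_function(X) >= self.threshold).astype(int)
        return self.classes_[indices]
```

With such a wrapper, tuning the underlying regressor in a grid search requires nested parameter names (e.g. regressor__alpha for a wrapped Ridge), which is the inconvenience mentioned above.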
I think all classifiers do have a decision_function or predict_proba.

First I thought we wanted a robust way to find out if something is a regressor. But then I noticed that is not what your solution implements. You are dividing "classifiers and regressors with decision_function or predict_proba" from the rest. Maybe it would then be easier to just add decision_function to regressors.

Somehow I feel this blurs the line between doing classification and regression, and in general I feel it is easier if one is clear about what one is doing. Can you maybe explain the motivation again? You have three relevance levels that you want to regress? It would be nice to see the exact example, as I don't entirely understand the use case. If we are building a ranking API, that would be good to know ;)
As an alternative solution to try / except, I suggested a helper utility / class to turn any regressor into a binary classifier. We could implement this in two ways: by a meta-estimator or by dynamically building the class. In the former case, one would have to use …
In retrospect, I often feel that we should have had …
This would be my preferred solution too, if people who opposed could reconsider.
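A sketch of what the "add decision_function to RegressorMixin" option amounts to (this is not scikit-learn's actual code, just an illustration): regressors would expose their continuous predictions under the method name that the threshold scorer already looks for.

```python
from sklearn.base import RegressorMixin
from sklearn.linear_model import Ridge


class RegressorMixinWithDecisionFunction(RegressorMixin):
    """Sketch only: every regressor gains a decision_function that simply
    returns its continuous predictions."""

    def decision_function(self, X):
        return self.predict(X)


class RankingRidge(RegressorMixinWithDecisionFunction, Ridge):
    """Hypothetical Ridge variant usable with threshold-based scorers."""
```

A scorer could then call decision_function uniformly, whether the estimator is a classifier or a regressor.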
Regressors are a perfectly valid way to do ranking (they are called pointwise methods in the learning-to-rank literature). If you have only two relevance levels (0 and 1), binary classifiers can also be used. So I would like to be able to use either regressors or binary classifiers. Since we need continuous scores, we need to call decision_function (or predict, in the regressor case). (As a side note, in my experience so far, pointwise methods are just as good as pairwise or listwise methods if you optimize hyper-parameters.) Likewise, as I argued in #1404, regressors are a perfectly valid way to do binary classification (the squared loss has good properties).
We can already do ranking with our current API. What our API doesn't allow yet is to specify groups (query ids in the information retrieval context). In my research, I do ranking with a single group (the entire training set). Thus, our current API is fine.
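A small illustration of this pointwise setup (the synthetic data and the choice of Ridge are mine): fit a regressor on 0/1 relevance labels and evaluate the ranking it induces with roc_auc_score, using the continuous predictions as scores.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import roc_auc_score

rng = np.random.RandomState(0)
X = rng.randn(200, 5)
relevance = (X[:, 0] + 0.5 * rng.randn(200) > 0).astype(int)  # two relevance levels

reg = Ridge(alpha=1.0).fit(X, relevance)
scores = reg.predict(X)  # continuous scores that induce the ranking
print(roc_auc_score(relevance, scores))
```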
Thank you for your explanations. When thinking about the problem, I also thought that …

Do you think it makes sense in your setting to differentiate between regressors and classifiers? If we say squared loss is a valid loss for classification (I would claim it is a bad old habit), then the only difference between regression and classification is the semantics of predict, and the tie with the loss function becomes a bit odd. As we won't rename …
I guess it's not necessary if you write your own estimator, but I'd like to reuse existing regressors and classifiers in scikit-learn.
Sorry for insisting, but can we agree on a solution? This is blocking me. Can you vote between adding decision_function to RegressorMixin and doing a try/except here? So far we have +2 for the former. CC @ogrisel @larsmans @jnothman @GaelVaroquaux
OMG this has been a year. That is horrible. |
I found two issues with make_scorer's needs_threshold option.

First, it doesn't work if the base estimator is a regressor. Using a regressor is a perfectly valid use case, e.g., if you want to use a regressor but still want to optimize your hyper-parameters against AUC (pointwise ranking with two relevance levels). It does work with Ridge, but this is because decision_function and predict are aliases of each other. See also the discussion in issue #1404.

Second, _ThresholdScorer checks that the number of unique values in y_true is 2, but this need not be the case. For example, I can use a regressor and optimize my hyper-parameters against NDCG with more than 2 relevance levels.
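For context, a sketch of the kind of usage this report has in mind (the toy data and parameter grid are my own, and the module path and scorer behaviour are those of scikit-learn around the time of the report): tuning a regressor's hyper-parameters against AUC via a threshold-based scorer. As noted above, this happens to work with Ridge only because its decision_function is an alias of predict.

```python
import numpy as np
from sklearn.grid_search import GridSearchCV  # sklearn.model_selection in later versions
from sklearn.linear_model import Ridge
from sklearn.metrics import make_scorer, roc_auc_score

rng = np.random.RandomState(0)
X = rng.randn(100, 5)
y = (X[:, 0] > 0).astype(int)  # two relevance levels

# needs_threshold=True: the scorer asks the estimator for continuous scores.
auc = make_scorer(roc_auc_score, needs_threshold=True)

search = GridSearchCV(Ridge(), {"alpha": [0.1, 1.0, 10.0]}, scoring=auc)
search.fit(X, y)
print(search.best_params_)
```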