Description
I have a multi-class problem and tried to calculate the ROC AUC score with metrics.roc_auc_score(). This function supports multi-class targets, but it needs probability estimates, so the classifier must provide a predict_proba() method. svm.LinearSVC(), for example, does not have one, so I have to use svm.SVC() instead, which is very slow on large datasets.
Here is an example of what I am trying to do:
from sklearn import datasets
from sklearn.svm import SVC
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
# Get the data
iris = datasets.load_iris()
X, y = iris.data, iris.target
# Create the model
clf = SVC(kernel='linear', probability=True)
# Split the data in train and test
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, random_state=0)
# Train the model
clf.fit(X_train, y_train)
# Predict the test data
predicted = clf.predict(X_test)
predicted_proba = clf.predict_proba(X_test)
roc_auc = roc_auc_score(y_test, predicted_proba, multi_class='ovr')
If the classifier is changed to svm.LinearSVC(), the call to predict_proba() raises an AttributeError because LinearSVC does not implement that method. It would be useful to support multi-class problems without probability estimates, since svm.LinearSVC() is faster than svm.SVC().
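For reference, here is a minimal sketch of a possible workaround, not a replacement for the requested feature. It assumes that one-vs-rest AUC is what is wanted and that there are at least three classes, so that decision_function() returns one column per class. The idea is to binarize the labels and pass the raw decision scores to roc_auc_score() through its multilabel-indicator path, which does not require the scores to be probabilities. The max_iter and random_state values are my own choices, not part of the original example.

```python
from sklearn import datasets
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import label_binarize
from sklearn.svm import LinearSVC

# Same data and split as in the example above
iris = datasets.load_iris()
X, y = iris.data, iris.target
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0
)

# max_iter is raised only to avoid convergence warnings (assumed value)
clf = LinearSVC(max_iter=10000, random_state=0)
clf.fit(X_train, y_train)

# Shape (n_samples, n_classes): one signed margin per class, not probabilities
scores = clf.decision_function(X_test)

# One indicator column per class, in the same order as clf.classes_
y_test_bin = label_binarize(y_test, classes=clf.classes_)

# Each column is scored as a binary "this class vs. the rest" problem,
# then the per-class AUCs are averaged with equal weight
roc_auc = roc_auc_score(y_test_bin, scores, average="macro")
print(roc_auc)
```

This should correspond to a macro-averaged one-vs-rest AUC computed on the decision scores. Another option, if actual probabilities are needed, might be to wrap the model in sklearn.calibration.CalibratedClassifierCV, which exposes predict_proba() at the cost of extra fitting time.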