Deprecated negative valued scorers #6028

Closed
wants to merge 1 commit into from
14 changes: 7 additions & 7 deletions doc/modules/model_evaluation.rst
@@ -51,7 +51,7 @@ For the most common use cases, you can designate a scorer object with the
All scorer objects follow the convention that higher return values are better
than lower return values. Thus the returns from mean_absolute_error
and mean_squared_error, which measure the distance between the model
and the data, are negated.
and the data, are negated.


======================== ======================================= ==================================
@@ -65,7 +65,7 @@ Scoring Function Comment
'f1_macro' :func:`metrics.f1_score` macro-averaged
'f1_weighted' :func:`metrics.f1_score` weighted average
'f1_samples' :func:`metrics.f1_score` by multilabel sample
'log_loss' :func:`metrics.log_loss` requires ``predict_proba`` support
'neg_log_loss' :func:`metrics.log_loss` requires ``predict_proba`` support. Negative value returned so greater=better
'precision' etc. :func:`metrics.precision_score` suffixes apply as with 'f1'
'recall' etc. :func:`metrics.recall_score` suffixes apply as with 'f1'
'roc_auc' :func:`metrics.roc_auc_score`
@@ -74,9 +74,9 @@ Scoring Function Comment
'adjusted_rand_score' :func:`metrics.adjusted_rand_score`

**Regression**
'mean_absolute_error' :func:`metrics.mean_absolute_error`
'mean_squared_error' :func:`metrics.mean_squared_error`
'median_absolute_error' :func:`metrics.median_absolute_error`
'neg_mean_absolute_error' :func:`metrics.mean_absolute_error` Negative MAE returned so greater=better

Review comment (Member):
I'm not sure if this explanation is the best possible but I don't have a better one ;)

'neg_mean_squared_error' :func:`metrics.mean_squared_error` Negative MSE returned so greater=better
'neg_median_absolute_error' :func:`metrics.median_absolute_error` Negative MedAE returned so greater=better
'r2' :func:`metrics.r2_score`
======================== ======================================= ==================================
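
As a rough illustration of the sign convention in the table above, the sketch below scores a model with `scoring="neg_mean_squared_error"` and flips the sign to recover the ordinary MSE. The dataset, the `Ridge` estimator and the variable names are assumptions chosen for the example and are not part of this PR.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

# Synthetic regression data, used only for illustration.
X, y = make_regression(n_samples=200, n_features=5, noise=10.0,
                       random_state=0)

# The scorer returns the negated MSE, so every fold score is <= 0
# and "greater is better" holds for model selection.
scores = cross_val_score(Ridge(alpha=1.0), X, y, cv=5,
                         scoring="neg_mean_squared_error")

# Negate again to report the usual, non-negative MSE per fold.
mse_per_fold = -scores
print("mean MSE over folds:", mse_per_fold.mean())
```

The negation only matters for reporting: grid search and cross-validation helpers maximise the score, so they can consume the negative values directly.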

@@ -92,7 +92,7 @@ Usage examples:
>>> model = svm.SVC()
>>> cross_val_score(model, X, y, scoring='wrong_choice')
Traceback (most recent call last):
ValueError: 'wrong_choice' is not a valid scoring value. Valid options are ['accuracy', 'adjusted_rand_score', 'average_precision', 'f1', 'f1_macro', 'f1_micro', 'f1_samples', 'f1_weighted', 'log_loss', 'mean_absolute_error', 'mean_squared_error', 'median_absolute_error', 'precision', 'precision_macro', 'precision_micro', 'precision_samples', 'precision_weighted', 'r2', 'recall', 'recall_macro', 'recall_micro', 'recall_samples', 'recall_weighted', 'roc_auc']
ValueError: 'wrong_choice' is not a valid scoring value. Valid options are ['accuracy', 'adjusted_rand_score', 'average_precision', 'f1', 'f1_macro', 'f1_micro', 'f1_samples', 'f1_weighted', 'neg_log_loss', 'neg_mean_absolute_error', 'neg_mean_squared_error', 'neg_median_absolute_error', 'precision', 'precision_macro', 'precision_micro', 'precision_samples', 'precision_weighted', 'r2', 'recall', 'recall_macro', 'recall_micro', 'recall_samples', 'recall_weighted', 'roc_auc']

.. note::

@@ -246,7 +246,7 @@ Some also work in the multilabel case:
fbeta_score
hamming_loss
jaccard_similarity_score
log_loss
neg_log_loss
precision_recall_fscore_support
precision_score
recall_score
2 changes: 1 addition & 1 deletion examples/model_selection/plot_underfitting_overfitting.py
@@ -52,7 +52,7 @@

# Evaluate the models using crossvalidation
scores = cross_val_score(pipeline, X[:, np.newaxis], y,
scoring="mean_squared_error", cv=10)
scoring="neg_mean_squared_error", cv=10)

X_test = np.linspace(0, 1, 100)
plt.plot(X_test, pipeline.predict(X_test[:, np.newaxis]), label="Model")
6 changes: 3 additions & 3 deletions examples/plot_kernel_ridge_regression.py
@@ -153,17 +153,17 @@
kr = KernelRidge(kernel='rbf', alpha=0.1, gamma=0.1)
train_sizes, train_scores_svr, test_scores_svr = \
learning_curve(svr, X[:100], y[:100], train_sizes=np.linspace(0.1, 1, 10),
scoring="mean_squared_error", cv=10)
scoring="neg_mean_squared_error", cv=10)
train_sizes_abs, train_scores_kr, test_scores_kr = \
learning_curve(kr, X[:100], y[:100], train_sizes=np.linspace(0.1, 1, 10),
scoring="mean_squared_error", cv=10)
scoring="neg_mean_squared_error", cv=10)

plt.plot(train_sizes, test_scores_svr.mean(1), 'o-', color="r",
label="SVR")
plt.plot(train_sizes, test_scores_kr.mean(1), 'o-', color="g",
label="KRR")
plt.xlabel("Train size")
plt.ylabel("Mean Squared Error")
plt.ylabel("Negative Mean Squared Error")
plt.title('Learning curves')
plt.legend(loc="best")
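
An alternative to relabelling the axis, sketched here under assumptions, is to negate the averaged test scores before plotting so that the curve shows the ordinary MSE. The synthetic data and the name `test_mse` are illustrative only; the plotting calls are omitted.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.kernel_ridge import KernelRidge
from sklearn.model_selection import learning_curve

# Synthetic data standing in for the example's X[:100], y[:100].
X, y = make_regression(n_samples=100, n_features=3, noise=5.0,
                       random_state=0)

kr = KernelRidge(kernel="rbf", alpha=0.1, gamma=0.1)
train_sizes, train_scores, test_scores = learning_curve(
    kr, X, y, train_sizes=np.linspace(0.1, 1, 10),
    scoring="neg_mean_squared_error", cv=10)

# test_scores holds negated MSE per (train size, fold); flip the sign
# after averaging over folds to obtain a non-negative MSE curve.
test_mse = -test_scores.mean(axis=1)
```

Plotting `test_mse` against `train_sizes` would then justify a plain "Mean Squared Error" axis label.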

4 changes: 2 additions & 2 deletions sklearn/linear_model/tests/test_ridge.py
@@ -339,7 +339,7 @@ def _test_ridge_loo(filter_):

# check that we get same best alpha with custom loss_func
f = ignore_warnings
scoring = make_scorer(mean_squared_error, greater_is_better=False)
scoring = make_scorer(mean_squared_error, greater_is_better=False)
ridge_gcv2 = RidgeCV(fit_intercept=False, scoring=scoring)
f(ridge_gcv2.fit)(filter_(X_diabetes), y_diabetes)
assert_equal(ridge_gcv2.alpha_, alpha_)
@@ -352,7 +352,7 @@ def _test_ridge_loo(filter_):
assert_equal(ridge_gcv3.alpha_, alpha_)

# check that we get same best alpha with a scorer
scorer = get_scorer('mean_squared_error')
scorer = get_scorer('neg_mean_squared_error')
ridge_gcv4 = RidgeCV(fit_intercept=False, scoring=scorer)
ridge_gcv4.fit(filter_(X_diabetes), y_diabetes)
assert_equal(ridge_gcv4.alpha_, alpha_)
22 changes: 18 additions & 4 deletions sklearn/metrics/scorer.py
@@ -22,6 +22,7 @@
from functools import partial

import numpy as np
from ..utils import deprecated

from . import (r2_score, median_absolute_error, mean_absolute_error,
mean_squared_error, accuracy_score, f1_score,
@@ -314,12 +315,15 @@ def make_scorer(score_func, greater_is_better=True, needs_proba=False,

# Standard regression scores
r2_scorer = make_scorer(r2_score)
mean_squared_error_scorer = make_scorer(mean_squared_error,
neg_mean_squared_error_scorer = make_scorer(mean_squared_error,
greater_is_better=False)
mean_absolute_error_scorer = make_scorer(mean_absolute_error,
mean_squared_error_scorer = deprecated("Function returns negative MSE so that greater=better. This function is deprecated in version 0.18 and will be removed in version 0.20. Use neg_mean_squared_error instead")(neg_mean_squared_error_scorer)
neg_mean_absolute_error_scorer = make_scorer(mean_absolute_error,
greater_is_better=False)
median_absolute_error_scorer = make_scorer(median_absolute_error,
mean_absolute_error_scorer = deprecated("Function returns negative Mean Absolute Error so that greater=better. This function is deprecated in version 0.18 and will be removed in version 0.20. Use neg_mean_absolute_error instead")(neg_mean_absolute_error_scorer)
neg_median_absolute_error_scorer = make_scorer(median_absolute_error,
greater_is_better=False)
median_absolute_error_scorer = deprecated("Function returns negative Median Absolute Error so that greater=better. This function is deprecated in version 0.18 and will be removed in version 0.20. Use neg_median_absolute_error instead")(neg_median_absolute_error_scorer)

# Standard Classification Scores
accuracy_scorer = make_scorer(accuracy_score)
@@ -334,18 +338,23 @@ def make_scorer(score_func, greater_is_better=True, needs_proba=False,
recall_scorer = make_scorer(recall_score)

# Score function for probabilistic classification
log_loss_scorer = make_scorer(log_loss, greater_is_better=False,
neg_log_loss_scorer = make_scorer(log_loss, greater_is_better=False,
needs_proba=True)
log_loss_scorer = deprecated("Function returns negative Log Loss so that greater=better. This function is deprecated in version 0.18 and will be removed in version 0.20. Use neg_log_loss instead")(neg_log_loss_scorer)

# Clustering scores
adjusted_rand_scorer = make_scorer(adjusted_rand_score)

SCORERS = dict(r2=r2_scorer,
neg_median_absolute_error=neg_median_absolute_error_scorer,
neg_mean_absolute_error=neg_mean_absolute_error_scorer,
neg_mean_squared_error=neg_mean_squared_error_scorer,
median_absolute_error=median_absolute_error_scorer,
mean_absolute_error=mean_absolute_error_scorer,
mean_squared_error=mean_squared_error_scorer,
accuracy=accuracy_scorer, roc_auc=roc_auc_scorer,
average_precision=average_precision_scorer,
neg_log_loss=neg_log_loss_scorer,
log_loss=log_loss_scorer,
adjusted_rand_score=adjusted_rand_scorer)
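
For comparison with the `deprecated(...)` wrappers in this diff, here is a minimal, standard-library-only sketch of a different approach: keep a single scorer per metric and warn at lookup time when an old name is requested. This is not what the PR implements. `DEPRECATED_ALIASES`, the toy `get_scorer` and the simplified `(y_true, y_pred)` scorer signature are all hypothetical.

```python
import warnings


def neg_mean_squared_error_scorer(y_true, y_pred):
    """Toy scorer (hypothetical signature): returns the negated MSE."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return -sum(squared_errors) / len(squared_errors)


# Registry of current scorer names.
SCORERS = {"neg_mean_squared_error": neg_mean_squared_error_scorer}

# Hypothetical map from deprecated names to their replacements.
DEPRECATED_ALIASES = {"mean_squared_error": "neg_mean_squared_error"}


def get_scorer(name):
    """Look up a scorer by name, warning if the name is deprecated."""
    if name in DEPRECATED_ALIASES:
        new_name = DEPRECATED_ALIASES[name]
        warnings.warn(
            "Scoring name %r is deprecated; use %r instead."
            % (name, new_name),
            DeprecationWarning, stacklevel=2)
        name = new_name
    return SCORERS[name]
```

With this layout the old and new names resolve to the same object, so results are identical and only the lookup of the old name produces a `DeprecationWarning`.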

@@ -356,3 +365,8 @@ def make_scorer(score_func, greater_is_better=True, needs_proba=False,
qualified_name = '{0}_{1}'.format(name, average)
SCORERS[qualified_name] = make_scorer(partial(metric, pos_label=None,
average=average))





15 changes: 14 additions & 1 deletion sklearn/metrics/tests/test_score_objects.py
@@ -9,6 +9,7 @@
from sklearn.utils.testing import assert_true
from sklearn.utils.testing import ignore_warnings
from sklearn.utils.testing import assert_not_equal
from sklearn.utils.testing import assert_warns

from sklearn.base import BaseEstimator
from sklearn.metrics import (f1_score, r2_score, roc_auc_score, fbeta_score,
@@ -32,7 +33,8 @@
from sklearn.multiclass import OneVsRestClassifier


REGRESSION_SCORERS = ['r2', 'mean_absolute_error', 'mean_squared_error',
REGRESSION_SCORERS = ['r2', 'neg_mean_absolute_error', 'neg_mean_squared_error',
'neg_median_absolute_error', 'mean_absolute_error', 'mean_squared_error',
'median_absolute_error']

CLF_SCORERS = ['accuracy', 'f1', 'f1_weighted', 'f1_macro', 'f1_micro',
@@ -355,3 +357,14 @@ def test_scorer_sample_weight():
assert_true("sample_weight" in str(e),
"scorer {0} raises unhelpful exception when called "
"with sample weights: {1}".format(name, str(e)))

def test_scorer_neg():
    # Test that the neg_* scorers for mean_squared_error,
    # mean_absolute_error, median_absolute_error and log_loss return the
    # same result as the original scorers, and that the original scorers
    # emit a deprecation warning.
    X, y = make_classification(random_state=0)
    _, y_ml = make_multilabel_classification(n_samples=X.shape[0],
                                             random_state=0)
    split = train_test_split(X, y, y_ml, random_state=0)
    X_train, X_test, y_train, y_test, y_ml_train, y_ml_test = split

2 changes: 1 addition & 1 deletion sklearn/model_selection/tests/test_validation.py
@@ -334,7 +334,7 @@ def test_cross_val_score_with_score_func_regression():
assert_array_almost_equal(r2_scores, [0.94, 0.97, 0.97, 0.99, 0.92], 2)

# Mean squared error; this is a loss function, so "scores" are negative
mse_scores = cross_val_score(reg, X, y, cv=5, scoring="mean_squared_error")
mse_scores = cross_val_score(reg, X, y, cv=5, scoring="neg_mean_squared_error")
expected_mse = np.array([-763.07, -553.16, -274.38, -273.26, -1681.99])
assert_array_almost_equal(mse_scores, expected_mse, 2)

2 changes: 1 addition & 1 deletion sklearn/tests/test_cross_validation.py
@@ -883,7 +883,7 @@ def test_cross_val_score_with_score_func_regression():

# Mean squared error; this is a loss function, so "scores" are negative
mse_scores = cval.cross_val_score(reg, X, y, cv=5,
scoring="mean_squared_error")
scoring="neg_mean_squared_error")
expected_mse = np.array([-763.07, -553.16, -274.38, -273.26, -1681.99])
assert_array_almost_equal(mse_scores, expected_mse, 2)
