Thanks to visit codestin.com
Credit goes to github.com

Skip to content

cross_val_score handling of multiple metrics #11006

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
minggli opened this issue Apr 21, 2018 · 2 comments · Fixed by #11008
Closed

cross_val_score handling of multiple metrics #11006

minggli opened this issue Apr 21, 2018 · 2 comments · Fixed by #11008

Comments

@minggli
Copy link
Contributor

minggli commented Apr 21, 2018

Description

model_selection.cross_val_score explicitly blocks multiple scores despite calling cross_validate underneath the hood. Essentially, we have two identical functions but have manually blocked cross_val_score from accepting multiple scores.

def cross_val_score(estimator, X, y=None, groups=None, scoring=None, cv=None,
                    n_jobs=1, verbose=0, fit_params=None,
                    pre_dispatch='2*n_jobs'):

   # To ensure multimetric format is not supported
    scorer = check_scoring(estimator, scoring=scoring)

    cv_results = cross_validate(estimator=estimator, X=X, y=y, groups=groups,
                                scoring={'score': scorer}, cv=cv,
                                return_train_score=False,
                                n_jobs=n_jobs, verbose=verbose,
                                fit_params=fit_params,
                                pre_dispatch=pre_dispatch)
    return cv_results['test_score']

It creates confusion and duplication of APIs. Open to discussion but can we perhaps consider:

  1. re-evaluate previous discussion in [MRG + 2] ENH Allow cross_val_score, GridSearchCV et al. to evaluate on multiple metrics #7388 and make cross_val_score accept both single and multiple scores and callable, None?
  2. Or alternatively make error message more explicit when passing multiple scores to cross_val_score.

Steps/Code to Reproduce

from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import load_breast_cancer

X, y = load_breast_cancer(return_X_y=True)
lr = LogisticRegression()

cross_val_score(lr, X, y, scoring=['accuracy', 'f1'])

Expected Results

>>> cross_val_score(lr, X, y, scoring=['accuracy', 'f1'])
{'fit_time': array([0.00580502, 0.00351   , 0.00399804]), 'score_time': array([0.00100303, 0.00081301, 0.00084615]), 'test_accuracy': array([0.93684211, 0.96842105, 0.94179894]), 'test_f1': array([0.95081967, 0.97520661, 0.9527897 ])}

Actual Results

>>> cross_val_score(lr, X, y, scoring=['accuracy', 'f1'])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/mingli/GitHub/personal/scikit-learn/sklearn/model_selection/_validation.py", line 349, in cross_val_score
    scorer = check_scoring(estimator, scoring=scoring)
  File "/Users/mingli/GitHub/personal/scikit-learn/sklearn/metrics/scorer.py", line 305, in check_scoring
    " None. %r was passed" % scoring)
ValueError: scoring value should either be a callable, string or None. ['accuracy', 'f1'] was passed

Versions

Darwin-17.4.0-x86_64-i386-64bit
Python 3.6.4 (default, Mar 22 2018, 13:54:22)
[GCC 4.2.1 Compatible Apple LLVM 9.0.0 (clang-900.0.39.2)]
NumPy 1.14.2
SciPy 1.0.1
Scikit-Learn 0.20.dev0

@minggli
Copy link
Contributor Author

minggli commented Apr 21, 2018

happy to work on it too.

@jnothman
Copy link
Member

jnothman commented Apr 21, 2018 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants