Closed
Description
Description
model_selection.cross_val_score explicitly blocks multiple scores despite calling cross_validate underneath the hood. Essentially, we have two identical functions but have manually blocked cross_val_score from accepting multiple scores.
def cross_val_score(estimator, X, y=None, groups=None, scoring=None, cv=None,
n_jobs=1, verbose=0, fit_params=None,
pre_dispatch='2*n_jobs'):
# To ensure multimetric format is not supported
scorer = check_scoring(estimator, scoring=scoring)
cv_results = cross_validate(estimator=estimator, X=X, y=y, groups=groups,
scoring={'score': scorer}, cv=cv,
return_train_score=False,
n_jobs=n_jobs, verbose=verbose,
fit_params=fit_params,
pre_dispatch=pre_dispatch)
return cv_results['test_score']
It creates confusion and duplication of APIs. Open to discussion but can we perhaps consider:
- re-evaluate previous discussion in [MRG + 2] ENH Allow
cross_val_score
,GridSearchCV
et al. to evaluate on multiple metrics #7388 and make cross_val_score accept both single and multiple scores and callable, None? - Or alternatively make error message more explicit when passing multiple scores to cross_val_score.
Steps/Code to Reproduce
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import load_breast_cancer
X, y = load_breast_cancer(return_X_y=True)
lr = LogisticRegression()
cross_val_score(lr, X, y, scoring=['accuracy', 'f1'])
Expected Results
>>> cross_val_score(lr, X, y, scoring=['accuracy', 'f1'])
{'fit_time': array([0.00580502, 0.00351 , 0.00399804]), 'score_time': array([0.00100303, 0.00081301, 0.00084615]), 'test_accuracy': array([0.93684211, 0.96842105, 0.94179894]), 'test_f1': array([0.95081967, 0.97520661, 0.9527897 ])}
Actual Results
>>> cross_val_score(lr, X, y, scoring=['accuracy', 'f1'])
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Users/mingli/GitHub/personal/scikit-learn/sklearn/model_selection/_validation.py", line 349, in cross_val_score
scorer = check_scoring(estimator, scoring=scoring)
File "/Users/mingli/GitHub/personal/scikit-learn/sklearn/metrics/scorer.py", line 305, in check_scoring
" None. %r was passed" % scoring)
ValueError: scoring value should either be a callable, string or None. ['accuracy', 'f1'] was passed
Versions
Darwin-17.4.0-x86_64-i386-64bit
Python 3.6.4 (default, Mar 22 2018, 13:54:22)
[GCC 4.2.1 Compatible Apple LLVM 9.0.0 (clang-900.0.39.2)]
NumPy 1.14.2
SciPy 1.0.1
Scikit-Learn 0.20.dev0
Metadata
Metadata
Assignees
Labels
No labels