[MRG+1] Reduce warnings in the model_selection tests #5703
47b3e8a
to
31a366a
Compare
Can you quickly summarize why the errors were raised and why your change fixes them?
Those were the only two warnings I observed, with the first one repeated twice or thrice I think...
31a366a
to
f0bce2f
Compare
I am wondering if this should be extended to the cross_validation tests too? In #5568 I blindly suppressed the warnings raised in the tests for the old c_v / g_s / l_c, reasoning that they will be taken care of by the model selection tests... ;)
Arghhh, the ConvergenceWarning is not fully removed... sorry... on it...
f0bce2f
to
a7c65ae
Compare
@amueller Fixed! This can be reviewed and merged!
➜ tests git:(test_val_warnings) ✗ nosetests -v -s .
test_search.test_parameter_grid ... ok
test_search.test_grid_search ... [Parallel(n_jobs=1)]: Done 9 out of 9 | elapsed: 0.0s finished
ok
test_search.test_grid_search_score_method ... ok
test_search.test_grid_search_labels ... ok
test_search.test_trivial_grid_scores ... ok
test_search.test_no_refit ... ok
test_search.test_grid_search_error ... ok
test_search.test_grid_search_iid ... ok
test_search.test_grid_search_no_score ... ok
test_search.test_pandas_input ... ok
test_search.test_refit ... ok
test_search.test_grid_search_one_grid_point ... ok
test_search.test_grid_search_bad_param_grid ... ok
test_search.test_grid_search_sparse ... ok
test_search.test_grid_search_sparse_scoring ... ok
test_search.test_grid_search_precomputed_kernel ... ok
test_search.test_grid_search_precomputed_kernel_error_nonsquare ... ok
test_search.test_grid_search_precomputed_kernel_error_kernel_function ... ok
test_search.test_gridsearch_nd ... ok
test_search.test_X_as_list ... ok
test_search.test_y_as_list ... ok
test_search.test_unsupervised_grid_search ... ok
test_search.test_gridsearch_no_predict ... ok
test_search.test_param_sampler ... ok
test_search.test_randomized_search_grid_scores ... ok
test_search.test_grid_search_score_consistency ... ok
test_search.test_pickle ... ok
test_search.test_grid_search_with_multioutput_data ... ok
test_search.test_predict_proba_disabled ... ok
test_search.test_grid_search_allows_nans ... ok
test_search.test_grid_search_failing_classifier ... ok
test_search.test_grid_search_failing_classifier_raise ... ok
test_search.test_parameters_sampler_replacement ... ok
test_split.test_kfold_valueerrors ... ok
test_split.test_kfold_indices ... ok
test_split.test_kfold_no_shuffle ... ok
test_split.test_stratified_kfold_no_shuffle ... ok
test_split.test_stratified_kfold_ratios ... ok
test_split.test_cross_validator_with_default_indices ... ok
test_split.train_test_split_pandas ... ok
test_split.test_kfold_balance ... ok
test_split.test_stratifiedkfold_balance ... ok
test_split.test_shuffle_kfold ... ok
test_split.test_shuffle_kfold_stratifiedkfold_reproducibility ... ok
test_split.test_shuffle_stratifiedkfold ... ok
test_split.test_kfold_can_detect_dependent_samples_on_digits ... ok
test_split.test_shuffle_split ... ok
test_split.test_stratified_shuffle_split_init ... ok
test_split.test_stratified_shuffle_split_iter ... ok
test_split.test_stratified_shuffle_split_even ... ok
test_split.test_predefinedsplit_with_kfold_split ... ok
test_split.test_label_shuffle_split ... ok
test_split.test_leave_label_out_changing_labels ... ok
test_split.test_train_test_split_errors ... ok
test_split.test_train_test_split ... ok
test_split.train_test_split_mock_pandas ... ok
test_split.test_shufflesplit_errors ... ok
test_split.test_shufflesplit_reproducible ... ok
test_split.test_safe_split_with_precomputed_kernel ... ok
test_split.test_train_test_split_allow_nans ... ok
test_split.test_check_cv ... ok
test_split.test_cv_iterable_wrapper ... ok
test_split.test_label_kfold ... ok
test_split.test_nested_cv ... ok
test_split.test_build_repr ... ok
test_validation.test_cross_val_score ... ok
test_validation.test_cross_val_score_predict_labels ... ok
test_validation.test_cross_val_score_pandas ... ok
test_validation.test_cross_val_score_mask ... ok
test_validation.test_cross_val_score_precomputed ... ok
test_validation.test_cross_val_score_fit_params ... ok
test_validation.test_cross_val_score_score_func ... ok
test_validation.test_cross_val_score_errors ... ok
test_validation.test_cross_val_score_with_score_func_classification ... ok
test_validation.test_cross_val_score_with_score_func_regression ... ok
test_validation.test_permutation_score ... ok
test_validation.test_permutation_test_score_allow_nans ... ok
test_validation.test_cross_val_score_allow_nans ... ok
test_validation.test_cross_val_score_multilabel ... ok
test_validation.test_cross_val_predict ... ok
test_validation.test_cross_val_predict_input_types ... ok
test_validation.test_cross_val_predict_pandas ... ok
test_validation.test_cross_val_score_sparse_fit_params ... ok
test_validation.test_learning_curve ... ok
test_validation.test_learning_curve_unsupervised ... ok
test_validation.test_learning_curve_verbose ... [Parallel(n_jobs=1)]: Done 15 out of 15 | elapsed: 0.0s finished
ok
test_validation.test_learning_curve_incremental_learning_not_possible ... ok
test_validation.test_learning_curve_incremental_learning ... ok
test_validation.test_learning_curve_incremental_learning_unsupervised ... ok
test_validation.test_learning_curve_batch_and_incremental_learning_are_equal ... ok
test_validation.test_learning_curve_n_sample_range_out_of_bounds ... ok
test_validation.test_learning_curve_remove_duplicate_sample_sizes ... ok
test_validation.test_learning_curve_with_boolean_indices ... ok
test_validation.test_validation_curve ... ok
test_validation.test_check_is_permutation ... ok
test_validation.test_cross_val_predict_sparse_prediction ... ok
----------------------------------------------------------------------
Ran 96 tests in 12.914s
OK
da8afc8
to
0315318
Compare
@amueller could you look at this one too if you are online?
@@ -126,6 +126,7 @@ def _is_training_data(self, X):
X = np.ones((10, 2))
X_sparse = coo_matrix(X)
y = np.arange(10) // 2
y2 = np.array([1, 1, 1, 2, 2, 2, 3, 3, 3, 3]) // 2
I guess you don't need the // 2
LGTM apart from nitpick
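To illustrate the nitpick above, here is a small sketch (not part of the PR) of what `// 2` does to the labels in `y2`. It uses plain Python lists and `collections.Counter` so that it runs without dependencies; NumPy's `//` performs the same floor division element-wise. The names `raw_labels` and `halved_labels` are made up for this illustration.

```python
from collections import Counter

# Labels from the diff, before any division.
raw_labels = [1, 1, 1, 2, 2, 2, 3, 3, 3, 3]

# Floor-divide every label by 2, mirroring `np.array(...) // 2`.
halved_labels = [label // 2 for label in raw_labels]

# Count how many samples each class has in both variants.
raw_counts = Counter(raw_labels)
halved_counts = Counter(halved_labels)

print(sorted(raw_counts.items()))     # [(1, 3), (2, 3), (3, 4)]
print(sorted(halved_counts.items()))  # [(0, 3), (1, 7)]
```

Both variants leave at least 3 samples in every class; dropping `// 2` simply keeps three classes instead of two.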
0315318
to
3d8276e
Compare
Thanks for the review! Have addressed your comments.
@@ -216,6 +216,7 @@ def test_kfold_valueerrors():
# though all the classes are not necessarily represented at on each
# side of the split at each split
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
Not related to this PR, but what is the rationale behind not raising an error in this extreme case, where an empty test fold is created, i.e. when every class has fewer samples than the number of folds?
It is highly likely that this will raise a meaningless error at a later stage. For example:
dtc = DecisionTreeClassifier()
X2 = np.array([[1, 2], [3, 4], [5, 6], [7, 8], [9, 10]])
y = np.array([3, 3, -1, -1, 2])
cross_val_score(dtc, X2, y)
ValueError: Found array with 0 sample(s) (shape=(0, 2)) while a minimum of 1 is required.
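As a rough sketch of the kind of early check the question above hints at, the snippet below counts the samples per class and reports the classes that have fewer samples than the number of folds. `classes_below_fold_count` is a hypothetical helper written for this illustration, not a scikit-learn function, and `n_folds=3` assumes the 3-fold default mentioned elsewhere in this PR.

```python
from collections import Counter


def classes_below_fold_count(labels, n_folds):
    """Return {class: count} for classes with fewer than n_folds samples.

    Hypothetical helper for illustration; not part of scikit-learn.
    """
    counts = Counter(labels)
    return {cls: n for cls, n in counts.items() if n < n_folds}


# Labels from the comment above, checked against 3 folds.
y = [3, 3, -1, -1, 2]
too_small = classes_below_fold_count(y, n_folds=3)
print(too_small)  # {3: 2, -1: 2, 2: 1}
```

Here every class is reported, which matches the extreme case described in the comment.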
Ah! Thanks for the catch!
- Use data that will converge for the multioutput case
- Use at least 3 samples per class to conform to 3-fold cv
- Add the elided ignore warnings line
- Use the iris dataset to prevent non-convergence of sag solver
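For readers unfamiliar with the "ignore warnings" pattern referred to above, here is a minimal standard-library sketch of how `warnings.catch_warnings()` combined with `warnings.simplefilter("ignore")` silences warnings inside a block. `noisy_split` is a hypothetical stand-in for a call that warns, and `record=True` is added only so the effect can be inspected; the test in the diff uses the plain `catch_warnings()` form.

```python
import warnings


def noisy_split():
    """Hypothetical stand-in for a call that emits a warning."""
    warnings.warn("least populated class has too few members", UserWarning)
    return "done"


# With the "ignore" filter, the warning is dropped inside the block.
with warnings.catch_warnings(record=True) as ignored:
    warnings.simplefilter("ignore")
    result = noisy_split()

# With the "always" filter, the same call is recorded instead.
with warnings.catch_warnings(record=True) as recorded:
    warnings.simplefilter("always")
    noisy_split()

print(result)         # done
print(len(ignored))   # 0
print(len(recorded))  # 1
```

The previous warning filters are restored when each `with` block exits, so the suppression does not leak into other tests.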
3d8276e
to
e0df7c9
Compare
Thanks!
Thanks for the reviews and merge :D
Fix #5669
- In test_cross_val_predict_input_types, use the iris dataset to prevent non-convergence... (This won't slow down the tests by any significant amount)
- cross_val* which uses 3 fold cv...
@amueller