[MRG+1]: TEST runtime down to 4:30 min on an old laptop #5711

giorgiop · 2015-11-04T08:56:45Z

The runtime comparison was done on a laptop with the following configuration:

Mac OS X
4 GB 1600 MHz DDR3 
1.8 GHz Intel Core i5

On master, the tests run in:

time nosetests sklearn/
real    5m28.285s
user    5m59.565s
sys     0m8.353s

giorgiop · 2015-11-04T09:11:19Z

nosetests --with-timer sklearn/gaussian_process/tests/test_gpc.py

sklearn.gaussian_process.tests.test_gpc.test_custom_optimizer: 21.0710s
sklearn.gaussian_process.tests.test_gpc.test_random_starts: 3.2065s
sklearn.gaussian_process.tests.test_gpc.test_multi_class_n_jobs: 1.0145s
sklearn.gaussian_process.tests.test_gpc.test_multi_class: 0.3340s
sklearn.gaussian_process.tests.test_gpc.test_lml_gradient: 0.1230s
sklearn.gaussian_process.tests.test_gpc.test_converged_to_local_maximum: 0.1130s
sklearn.gaussian_process.tests.test_gpc.test_lml_precomputed: 0.1064s
sklearn.gaussian_process.tests.test_gpc.test_lml_improving: 0.1058s
sklearn.gaussian_process.tests.test_gpc.test_predict_consistent: 0.1001s

sklearn.gaussian_process.tests.test_gpc.test_custom_optimizer: 1.3062s
sklearn.gaussian_process.tests.test_gpc.test_multi_class_n_jobs: 0.9798s
sklearn.gaussian_process.tests.test_gpc.test_random_starts: 0.7233s
sklearn.gaussian_process.tests.test_gpc.test_multi_class: 0.3823s
sklearn.gaussian_process.tests.test_gpc.test_lml_gradient: 0.1617s
sklearn.gaussian_process.tests.test_gpc.test_converged_to_local_maximum: 0.1316s
sklearn.gaussian_process.tests.test_gpc.test_lml_precomputed: 0.1232s
sklearn.gaussian_process.tests.test_gpc.test_lml_improving: 0.1021s
sklearn.gaussian_process.tests.test_gpc.test_predict_consistent: 0.1003s

giorgiop · 2015-11-04T09:28:25Z

nosetests --with-timer sklearn/gaussian_process/tests/test_gpr.py

sklearn.gaussian_process.tests.test_gpr.test_custom_optimizer: 6.8643s
sklearn.gaussian_process.tests.test_gpr.test_random_starts: 3.8899s
sklearn.gaussian_process.tests.test_gpr.test_sample_statistics: 2.5284s
sklearn.gaussian_process.tests.test_gpr.test_duplicate_input: 0.3336s
sklearn.gaussian_process.tests.test_gpr.test_y_multioutput: 0.2750s
sklearn.gaussian_process.tests.test_gpr.test_y_normalization: 0.2549s
sklearn.gaussian_process.tests.test_gpr.test_lml_gradient: 0.2105s
sklearn.gaussian_process.tests.test_gpr.test_predict_cov_vs_std: 0.1856s
sklearn.gaussian_process.tests.test_gpr.test_lml_improving: 0.1843s
sklearn.gaussian_process.tests.test_gpr.test_gpr_interpolation: 0.1789s
sklearn.gaussian_process.tests.test_gpr.test_converged_to_local_maximum: 0.1761s
sklearn.gaussian_process.tests.test_gpr.test_lml_precomputed: 0.1745s
sklearn.gaussian_process.tests.test_gpr.test_solution_inside_bounds: 0.1688s
sklearn.gaussian_process.tests.test_gpr.test_anisotropic_kernel: 0.0456s
sklearn.gaussian_process.tests.test_gpr.test_prior: 0.0041s
sklearn.gaussian_process.tests.test_gpr.test_no_optimizer: 0.0028s

sklearn.gaussian_process.tests.test_gpr.test_random_starts: 0.9299s
sklearn.gaussian_process.tests.test_gpr.test_sample_statistics: 0.8917s
sklearn.gaussian_process.tests.test_gpr.test_custom_optimizer: 0.3682s
sklearn.gaussian_process.tests.test_gpr.test_duplicate_input: 0.3316s
sklearn.gaussian_process.tests.test_gpr.test_y_multioutput: 0.2671s
sklearn.gaussian_process.tests.test_gpr.test_y_normalization: 0.2543s
sklearn.gaussian_process.tests.test_gpr.test_lml_gradient: 0.2073s
sklearn.gaussian_process.tests.test_gpr.test_gpr_interpolation: 0.1887s
sklearn.gaussian_process.tests.test_gpr.test_predict_cov_vs_std: 0.1811s
sklearn.gaussian_process.tests.test_gpr.test_converged_to_local_maximum: 0.1808s
sklearn.gaussian_process.tests.test_gpr.test_solution_inside_bounds: 0.1771s
sklearn.gaussian_process.tests.test_gpr.test_lml_improving: 0.1747s
sklearn.gaussian_process.tests.test_gpr.test_lml_precomputed: 0.1692s
sklearn.gaussian_process.tests.test_gpr.test_anisotropic_kernel: 0.0472s
sklearn.gaussian_process.tests.test_gpr.test_prior: 0.0041s
sklearn.gaussian_process.tests.test_gpr.test_no_optimizer: 0.0019s

giorgiop · 2015-11-04T09:45:06Z

@jmetzen any idea how to make this line faster? It's the bottleneck in there.

glouppe · 2015-11-04T09:55:17Z

@giorgiop Maybe reduce the size of X and Y to (5, 2) and (6, 2) (lines 26 and 27)? This should reduce the duration of all tests in test_kernels.py by 75%.

giorgiop · 2015-11-04T10:01:04Z

nosetests --with-timer sklearn/gaussian_process/tests/test_kernels.py

sklearn.gaussian_process.tests.test_kernels.test_kernel_gradient: 5.6498s
sklearn.gaussian_process.tests.test_kernels.test_kernel_clone: 0.0295s
sklearn.gaussian_process.tests.test_kernels.test_kernel_diag: 0.0246s
sklearn.gaussian_process.tests.test_kernels.test_set_get_params: 0.0231s
sklearn.gaussian_process.tests.test_kernels.test_kernel_versus_pairwise: 0.0229s
sklearn.gaussian_process.tests.test_kernels.test_kernel_theta: 0.0175s
sklearn.gaussian_process.tests.test_kernels.test_auto_vs_cross: 0.0133s
sklearn.gaussian_process.tests.test_kernels.test_matern_kernel: 0.0064s
sklearn.gaussian_process.tests.test_kernels.test_kernel_stationary: 0.0051s
sklearn.gaussian_process.tests.test_kernels.test_kernel_operator_commutative: 0.0023s
sklearn.gaussian_process.tests.test_kernels.test_kernel_anisotropic: 0.0015s

sklearn.gaussian_process.tests.test_kernels.test_kernel_gradient: 1.3098s
sklearn.gaussian_process.tests.test_kernels.test_kernel_clone: 0.0303s
sklearn.gaussian_process.tests.test_kernels.test_set_get_params: 0.0266s
sklearn.gaussian_process.tests.test_kernels.test_kernel_versus_pairwise: 0.0202s
sklearn.gaussian_process.tests.test_kernels.test_kernel_diag: 0.0138s
sklearn.gaussian_process.tests.test_kernels.test_auto_vs_cross: 0.0138s
sklearn.gaussian_process.tests.test_kernels.test_kernel_theta: 0.0137s
sklearn.gaussian_process.tests.test_kernels.test_kernel_stationary: 0.0061s
sklearn.gaussian_process.tests.test_kernels.test_matern_kernel: 0.0051s
sklearn.gaussian_process.tests.test_kernels.test_kernel_anisotropic: 0.0013s
sklearn.gaussian_process.tests.test_kernels.test_kernel_operator_commutative: 0.0011s

Thanks for the hint @glouppe

giorgiop · 2015-11-04T10:27:56Z

Regarding test_spectral_embedding, the offending commit seems to be #5443

Ping @AlexandreAbraham :)

giorgiop · 2015-11-04T11:57:22Z

nosetests --with-timer --timer-top-n 5 sklearn/utils/tests/test_extmath.py

sklearn.utils.tests.test_extmath.test_randomized_svd_power_iteration_normalizer: 1.3771s
sklearn.utils.tests.test_extmath.test_incremental_variance_numerical_stability: 0.5242s
sklearn.utils.tests.test_extmath.test_logsumexp: 0.3015s
sklearn.utils.tests.test_extmath.test_randomized_svd_low_rank: 0.1158s
sklearn.utils.tests.test_extmath.test_randomized_svd_infinite_rank: 0.0401s

sklearn.utils.tests.test_extmath.test_incremental_variance_numerical_stability: 0.5455s
sklearn.utils.tests.test_extmath.test_randomized_svd_power_iteration_normalizer: 0.4139s
sklearn.utils.tests.test_extmath.test_logsumexp: 0.3238s
sklearn.utils.tests.test_extmath.test_randomized_svd_low_rank: 0.1095s
sklearn.utils.tests.test_extmath.test_randomized_svd_infinite_rank: 0.0376s

AlexandreAbraham · 2015-11-04T12:10:30Z

The new _graph_connected_component is expected to be slower because it saves memory but here it is really too slow. I'll try to find a good compromise.

giorgiop · 2015-11-04T12:42:43Z

nosetests --with-timer --timer-top-n 5 sklearn/linear_model/tests/test_coordinate_descent.py

sklearn.linear_model.tests.test_coordinate_descent.test_multitask_enet_and_lasso_cv: 1.1202s
sklearn.linear_model.tests.test_coordinate_descent.test_uniform_targets: 0.1222s
sklearn.linear_model.tests.test_coordinate_descent.test_enet_path: 0.1062s
sklearn.linear_model.tests.test_coordinate_descent.test_lasso_cv: 0.0928s
sklearn.linear_model.tests.test_coordinate_descent.test_sparse_input_dtype_enet_and_lassocv: 0.0738s

sklearn.linear_model.tests.test_coordinate_descent.test_multitask_enet_and_lasso_cv: 0.5759s
sklearn.linear_model.tests.test_coordinate_descent.test_uniform_targets: 0.1230s
sklearn.linear_model.tests.test_coordinate_descent.test_lasso_cv: 0.1167s
sklearn.linear_model.tests.test_coordinate_descent.test_enet_path: 0.1013s
sklearn.linear_model.tests.test_coordinate_descent.test_sparse_input_dtype_enet_and_lassocv: 0.0726s

giorgiop · 2015-11-04T12:57:08Z

time nosetests sklearn/
real    4m45.825s
user    5m18.897s
sys     0m7.819s

Fixing test_spectral_embedding will do the job.

test_spectral_embedding.test_spectral_embedding_callable_affinity: 12.6687s
test_spectral_embedding.test_spectral_embedding_precomputed_affinity: 12.4119s
test_spectral_embedding.test_pipeline_spectral_clustering: 6.3183s
sklearn.neighbors.tests.test_neighbors.test_kneighbors_parallel: 5.4751s
sklearn.datasets.tests.test_lfw.test_load_fake_lfw_pairs: 3.3964s
sklearn.datasets.tests.test_lfw.test_load_fake_lfw_people: 2.9395s
test_split.test_nested_cv: 2.9142s
sklearn.ensemble.tests.test_iforest.test_iforest_sparse: 2.8101s
sklearn.ensemble.tests.test_bagging.test_oob_score_removed_on_warm_start: 2.3969s
sklearn.tests.test_cross_validation.test_kfold_can_detect_dependent_samples_on_digits: 2.3388s

If we want to push this for the 0.17 release, we could just revert the offending commit and open a new PR for that? @amueller

giorgiop · 2015-11-04T14:40:41Z

nosetests --with-timer --timer-top-n 5 sklearn/model_selection/tests/test_split.py

test_split.test_nested_cv: 3.0061s
test_split.test_kfold_can_detect_dependent_samples_on_digits: 2.2134s
test_split.test_stratified_shuffle_split_even: 0.1780s
test_split.test_shuffle_kfold_stratifiedkfold_reproducibility: 0.0294s
test_split.test_cross_validator_with_default_indices: 0.0258s

test_split.test_nested_cv: 1.9301s
test_split.test_kfold_can_detect_dependent_samples_on_digits: 0.6001s
test_split.test_stratified_shuffle_split_even: 0.1821s
test_split.test_label_kfold: 0.0249s
test_split.test_stratified_shuffle_split_iter: 0.0224s

giorgiop · 2015-11-04T15:04:17Z

I have reduced the amount of data used by test_kfold_can_detect_dependent_samples_on_digits. The semantics of the test should be unchanged. The test refers to #2372, so maybe @ogrisel can confirm whether my version of the test is all right.

amueller · 2015-11-04T15:09:04Z

Thank you so much for investigating, @giorgiop, this is super helpful.
Which offending commit do you mean? #5443? That's not in 0.17.X, is it?

giorgiop · 2015-11-04T15:12:26Z

That's the right commit, and indeed it is not in 0.17.X. But we believe that commit is the cause of the slowdown of the spectral_embedding tests in master.

amueller · 2015-11-04T15:25:38Z

Ok. I'm focussing on 0.17.X for the moment :)

giorgiop · 2015-11-04T15:35:30Z

OK sure. On 0.17.X, on the same machine:

time nosetests --with-timer --timer-top-n 20 sklearn/

sklearn.neighbors.tests.test_neighbors.test_kneighbors_parallel: 25.4760s
sklearn.decomposition.tests.test_dict_learning.test_dict_learning_reconstruction_parallel: 11.2362s
sklearn.decomposition.tests.test_dict_learning.test_dict_learning_lassocd_readonly_data: 9.6868s
sklearn.ensemble.tests.test_bagging.test_base_estimator: 7.6753s
sklearn.ensemble.tests.test_bagging.test_parallel_classification: 4.5408s
sklearn.metrics.tests.test_pairwise.test_pairwise_parallel(<function pairwise_kernels at 0x108cd06a8>, <function callable_rbf_kernel at 0x10c67d488>, {'gamma': 0.1}): 4.1631s
sklearn.decomposition.tests.test_sparse_pca.test_fit_transform_parallel: 3.9137s
sklearn.datasets.tests.test_lfw.test_load_fake_lfw_pairs: 3.6907s
sklearn.metrics.tests.test_pairwise.test_pairwise_parallel(<function pairwise_distances at 0x108cd0598>, 'euclidean', {}): 3.2684s
sklearn.tests.test_pipeline.test_feature_union_parallel: 3.1745s
sklearn.metrics.tests.test_pairwise.test_pairwise_parallel(<function pairwise_kernels at 0x108cd06a8>, 'polynomial', {'degree': 1}): 3.1430s
sklearn.datasets.tests.test_lfw.test_load_fake_lfw_people: 3.0926s
sklearn.ensemble.tests.test_bagging.test_classification: 2.5076s
sklearn.metrics.tests.test_pairwise.test_pairwise_parallel(<function pairwise_distances at 0x108cd0598>, <function wminkowski at 0x1086ea840>, {'p': 1, 'w': array([ 1.,  2.,  3.,  4.])}): 2.4811s
sklearn.ensemble.tests.test_bagging.test_parallel_regression: 2.2660s
sklearn.ensemble.tests.test_bagging.test_oob_score_removed_on_warm_start: 2.2611s
sklearn.ensemble.tests.test_bagging.test_sparse_classification: 2.2533s
sklearn.tests.test_cross_validation.test_kfold_can_detect_dependent_samples_on_digits: 2.2246s
test_t_sne.test_preserve_trustworthiness_approximately: 2.0327s
sklearn.decomposition.tests.test_online_lda.test_lda_partial_fit_multi_jobs: 1.9221s
----------------------------------------------------------------------
Ran 6364 tests in 303.294s

OK (SKIP=14)

real    5m9.007s
user    4m9.069s
sys     0m5.084s

giorgiop · 2015-11-04T15:40:35Z

The list is quite different from the one for master.
My bad, I did not realize that we wanted to work mostly on 0.17.X.

jmetzen · 2015-11-05T08:12:54Z

> @jmetzen any idea how to make this line faster? It's the bottleneck in there.

Yes, my bad, this can be sped up easily. The two nested for-loops are not required. You can change the test as shown here: https://gist.github.com/jmetzen/a7d5afc15a882e4ce443

_approx_fprime needs to be imported from sklearn.gaussian_process.kernels, and you can remove the line "from scipy.optimize import approx_fprime".
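
The gist is not reproduced in this thread. As a rough sketch of the general idea only (not the gist's code and not sklearn's `_approx_fprime`), a forward-difference gradient can be taken on the whole kernel matrix at once, with one function call per hyperparameter instead of one numerical gradient per matrix entry. The names `rbf_kernel` and `matrix_fprime` are hypothetical and use NumPy only:

```python
import numpy as np


def rbf_kernel(theta, X):
    """RBF kernel matrix; theta[0] is the log of the length scale."""
    length_scale = np.exp(theta[0])
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    return np.exp(-0.5 * sq_dists / length_scale ** 2)


def matrix_fprime(theta, f, epsilon=1e-8):
    """Forward-difference gradient of a matrix-valued function f.

    f is evaluated once per hyperparameter (plus once at theta), and the
    difference is taken for all matrix entries at the same time.
    """
    theta = np.asarray(theta, dtype=float)
    f0 = f(theta)
    grad = np.empty(f0.shape + (theta.size,))
    for j in range(theta.size):
        step = np.zeros_like(theta)
        step[j] = epsilon
        grad[..., j] = (f(theta + step) - f0) / epsilon
    return grad


rng = np.random.RandomState(0)
X = rng.normal(size=(5, 2))
theta = np.array([0.3])

K_gradient_approx = matrix_fprime(theta, lambda t: rbf_kernel(t, X))

# Analytic gradient with respect to the log length scale, for comparison:
# dK/dtheta = K * sq_dists / length_scale ** 2
sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
K_gradient_exact = rbf_kernel(theta, X) * sq_dists / np.exp(theta[0]) ** 2
```

With nested loops over the entries, the kernel would be re-evaluated for every (i, j) pair; here the number of kernel evaluations depends only on the number of hyperparameters.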

giorgiop · 2015-11-05T08:45:00Z

@jmetzen thanks! Here is the improvement, measured with nosetests --with-timer --timer-top-n 3 sklearn/gaussian_process/tests/test_kernels.py:

sklearn.gaussian_process.tests.test_kernels.test_kernel_gradient: 1.3574s
sklearn.gaussian_process.tests.test_kernels.test_kernel_clone: 0.0320s
sklearn.gaussian_process.tests.test_kernels.test_set_get_params: 0.0253s

sklearn.gaussian_process.tests.test_kernels.test_kernel_gradient: 0.1073s
sklearn.gaussian_process.tests.test_kernels.test_kernel_clone: 0.0321s
sklearn.gaussian_process.tests.test_kernels.test_set_get_params: 0.0254s

giorgiop · 2015-11-05T14:16:29Z

So we have reduced the real runtime by 65s (~20%). I am compiling with MKL though. time nosetests sklearn/

# master
real    5m44.856s
user    6m14.534s
sys     0m10.325s

# this branch
real    4m37.495s
user    5m11.101s
sys     0m7.780s

The remaining big slowdown is addressed by #5713. @amueller can you cherry-pick from here what is needed for 0.17.X?
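
For reference, the saving can be recomputed from the two "real" figures quoted in this comment. This is only an arithmetic check; the helper `to_seconds` is a hypothetical name. It gives about 67s, slightly more than the rounded 65s, and a relative saving close to 20%:

```python
def to_seconds(stamp):
    """Convert a `time`-style stamp such as '5m44.856s' to seconds."""
    minutes, seconds = stamp.rstrip("s").split("m")
    return int(minutes) * 60 + float(seconds)


master = to_seconds("5m44.856s")   # "real" time on master
branch = to_seconds("4m37.495s")   # "real" time on this branch

saved = master - branch
percent = 100 * saved / master
print("saved %.1f s (%.1f%%)" % (saved, percent))  # saved 67.4 s (19.5%)
```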

TomDLT · 2015-11-05T20:56:45Z

> So we have reduced the real runtime by 65s (~20%)

great !

giorgiop · 2015-11-12T09:46:09Z

For some reason, test_randomized_svd_power_iteration_normalizer is much slower (relative to the other tests) on Travis than on my machine, but this is true only for the instance with python=3.5.

Now that we have nose-timer installed and running in all the CIs, I am also surprised to see that the slowest tests on Travis and on AppVeyor are not the same at all.

giorgiop · 2015-11-13T10:30:11Z

I have tried on another machine, in a virtual env with the same configuration as Travis with python=3.5, and test_randomized_svd_power_iteration_normalizer is never as slow as it is on CI. It does not even make it into the top 10.

amueller · 2015-11-14T17:51:10Z

Sorry @giorgiop, I was unclear. I meant that I am working on 0.17 and have not had time to review this PR.

Re differences in runtime: that is probably due to different BLAS implementations? We should probably try to have a good BLAS everywhere.
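
One way to compare environments, sketched here under the assumption that the slow test is dominated by dense linear algebra, is to print the libraries NumPy was built against and time a small SVD on each machine. The helper `time_svd` is a hypothetical name, and the matrix size is arbitrary:

```python
import time

import numpy as np


def time_svd(n=300, repeats=3, seed=0):
    """Return the best wall-clock time (in seconds) of an n x n SVD."""
    rng = np.random.RandomState(seed)
    A = rng.normal(size=(n, n))
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        np.linalg.svd(A, full_matrices=False)
        best = min(best, time.perf_counter() - start)
    return best


# Lists the BLAS/LAPACK libraries NumPy was built against.
np.show_config()

elapsed = time_svd()
print("best SVD time: %.4f s" % elapsed)
```

Running the same snippet locally and on CI would show whether the two environments link against different BLAS libraries and whether the raw SVD timings differ accordingly.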

giorgiop · 2015-11-18T15:49:00Z

@arthurmensch may have a clue about this :)

arthurmensch · 2015-11-19T08:54:56Z

Conda 2.4 with Python 3.5 uses OpenBLAS by default; could this be related?

giorgiop · 2015-12-11T01:02:31Z

This should be ready for review/merge, except for the issue with slow linear algebra due to the interaction between MKL, conda and Python 3.5, which may be addressed later on.

ogrisel · 2016-02-01T15:32:22Z

sklearn/gaussian_process/tests/test_gpc.py

        gpc = GaussianProcessClassifier(kernel=kernel).fit(X, y)

        lml, lml_gradient = \
            gpc.log_marginal_likelihood(gpc.kernel_.theta, True)

-        assert_true(np.all((np.abs(lml_gradient) < 1e-4)
-                           | (gpc.kernel_.theta == gpc.kernel_.bounds[:, 0])
-                           | (gpc.kernel_.theta == gpc.kernel_.bounds[:, 1])))


Just a note for later: let's ignore these kinds of unimportant PEP8 violations. We could give a list of flake8 warnings to ignore in a conf file in the repo.

ogrisel · 2016-02-01T15:39:04Z

Aside from minor comments, +1 on my side. This PR could be squashed and merged without waiting for the fix for test_randomized_svd_power_iteration_normalizer.

AlexandreAbraham · 2016-02-01T15:43:51Z

I just realized that #5713, which fixes a big part of this problem, has not been merged. I think that we should not let it die.

giorgiop · 2016-02-01T20:54:53Z

Thanks guys. I may be able to work on this next week!

> I just realized that #5713, which fixes a big part of this problem, has not been merged. I think that we should not let it die.

+1 !

ogrisel · 2016-02-03T10:46:38Z

I have addressed my own comments in #6270. Let's follow-up there.

giorgiop · 2016-02-21T22:29:13Z

Thanks @ogrisel

giorgiop force-pushed the faster-test branch from 3457ba5 to 04fb77b Compare November 4, 2015 08:57

giorgiop force-pushed the faster-test branch from 04fb77b to 32fe19e Compare November 4, 2015 09:12

giorgiop force-pushed the faster-test branch from 24866a9 to c8e050f Compare November 4, 2015 11:57

giorgiop changed the title ~~[WIP]: TEST runtime down to 4:30 min on an old laptop~~ [MRG]: TEST runtime down to 4:30 min on an old laptop Nov 9, 2015

giorgiop added 5 commits November 12, 2015 09:44

test_gpc

604daca

test_gpr

e6a9e24

test_kernels

12c3855

test_extmath

368943c

test_coordinate_descent

34ced09

giorgiop added 3 commits November 12, 2015 09:44

split long test in test_neighbors

7c31d55

test_split

9ed589e

test_kernels.test_kernel_gradient

210b8f7

giorgiop force-pushed the faster-test branch from db66c09 to 210b8f7 Compare November 12, 2015 08:44

amueller added the Waiting for Reviewer label Dec 10, 2015

ogrisel mentioned this pull request Feb 1, 2016

Tests run much slower after sprint #5639

Closed

ogrisel reviewed Feb 1, 2016

ogrisel changed the title ~~[MRG]: TEST runtime down to 4:30 min on an old laptop~~ [MRG+1]: TEST runtime down to 4:30 min on an old laptop Feb 1, 2016

ogrisel mentioned this pull request Feb 3, 2016

[MRG] Reduce test running time #6270

Merged

ogrisel closed this Feb 3, 2016
