[MRG] FIX performance issue in _graph_connected_component #6268

ogrisel
Member

@ogrisel ogrisel commented Feb 2, 2016

This is a reworked version of the fix in #5713 with a new test.

@ogrisel ogrisel added the Bug label Feb 2, 2016
@ogrisel
Member Author

ogrisel commented Feb 2, 2016

test_spectral_embedding_callable_affinity now runs in 800ms instead of 9s+ in master.

@ogrisel ogrisel force-pushed the fix-spectral-embedding-stopping-condition branch from fc3211b to 2530aef Compare February 2, 2016 12:53
@ogrisel
Member Author

ogrisel commented Feb 2, 2016

Note: 0.17 did not have such a strong perf issue, so no need to backport.

@ogrisel ogrisel force-pushed the fix-spectral-embedding-stopping-condition branch from 2530aef to bd9ca75 Compare February 2, 2016 15:15
@ogrisel ogrisel changed the title [MRG] FIX performance issue in _graph_connected_component [WIP] FIX performance issue in _graph_connected_component Feb 2, 2016
@ogrisel
Member Author

ogrisel commented Feb 2, 2016

Travis has revealed that this code can be much worse than before with recent python / numpy / scipy:

test_spectral_embedding.test_spectral_embedding_precomputed_affinity: 51.6601s
test_spectral_embedding.test_spectral_embedding_callable_affinity: 51.0443s

although I don't reproduce this behavior on my local workstation which is very similar... More work needed.

@AlexandreAbraham maybe we should revert to the code of 0.17?

@ogrisel ogrisel force-pushed the fix-spectral-embedding-stopping-condition branch from 2f7a881 to bc46718 Compare February 3, 2016 08:23
@ogrisel
Member Author

ogrisel commented Feb 3, 2016

I changed the scipy version and it's still very slow on travis:

test_spectral_embedding.test_spectral_embedding_callable_affinity: 33.1952s
test_spectral_embedding.test_spectral_embedding_precomputed_affinity: 31.9874s

instead of 0.8s on my box with the same scipy version.

@GaelVaroquaux
Member

GaelVaroquaux commented Feb 3, 2016 via email

@ogrisel
Member Author

ogrisel commented Feb 3, 2016

> Lack of good linear algebra library?

The slow configuration is the one using MKL :) Let me try to replicate locally to see if MKL is the culprit. This is quite unlikely though, as this is mostly about walking a graph with np.logical_or and a for loop...
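For readers unfamiliar with that kind of graph walk, here is a minimal sketch of the idea. It is an illustration only, not the code from this PR: the function name `connected_component_sketch` and the assumption of a dense, symmetric adjacency array are mine.

```python
import numpy as np


def connected_component_sketch(graph, node_id):
    """Return a boolean mask of the nodes reachable from node_id.

    Hypothetical helper for illustration. Assumes `graph` is a dense,
    symmetric (n_nodes, n_nodes) adjacency array where a non-zero entry
    means the two nodes share an edge.
    """
    n_nodes = graph.shape[0]
    connected = np.zeros(n_nodes, dtype=bool)
    to_explore = np.zeros(n_nodes, dtype=bool)
    to_explore[node_id] = True
    for _ in range(n_nodes):
        last_num_connected = connected.sum()
        # Mark the current frontier as part of the component.
        np.logical_or(connected, to_explore, out=connected)
        if last_num_connected >= connected.sum():
            break  # no new node was reached, so the walk has converged
        indices = np.where(to_explore)[0]
        to_explore.fill(False)
        # Build the next frontier from the neighbors of the current one.
        for i in indices:
            neighbors = graph[i] != 0
            np.logical_or(to_explore, neighbors, out=to_explore)
    return connected
```

The early `break` is the stopping condition: once a pass adds no new node, further passes cannot add any either, so the loop does not need to run for all `n_nodes` iterations.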

@ogrisel
Member Author

ogrisel commented Feb 3, 2016

Those tests run in 0.4ms with MKL on my box (same python / numpy / scipy versions from anaconda as on travis). This is really weird.

@jakevdp
Member

jakevdp commented Feb 3, 2016

Scipy might be the culprit – I seem to remember some recent changes to how scipy.sparse does row access.

@ogrisel
Member Author

ogrisel commented Feb 3, 2016

I have the same version of scipy on my local machine (0.16.1) and I don't have the problem.

@ogrisel ogrisel force-pushed the fix-spectral-embedding-stopping-condition branch from bc46718 to 19024a8 Compare February 6, 2016 18:14
@ogrisel
Member Author

ogrisel commented Feb 6, 2016

Rebased on top of current master to see if the number of threads of MKL or OpenBLAS is related to this issue.

@ogrisel
Member Author

ogrisel commented Feb 6, 2016

Ok the number of threads of MKL / OpenBLAS was the source of the problem. This is much faster now. Merging.
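This thread does not show how the thread count is limited on master. As a hedged illustration, one common way to check whether BLAS threading is behind a slowdown like this is to cap the thread pools through environment variables before numpy or scipy is imported. The helper name `limit_blas_threads` is hypothetical; the variable names are the ones conventionally read by OpenMP, MKL and OpenBLAS.

```python
import os


def limit_blas_threads(n_threads=1):
    """Cap the BLAS / OpenMP thread pools via environment variables.

    Hypothetical helper for illustration. Assumption: the variables are
    read when the native libraries are loaded, so this must be called
    before numpy or scipy is first imported in the process.
    """
    names = ("OMP_NUM_THREADS", "MKL_NUM_THREADS", "OPENBLAS_NUM_THREADS")
    for name in names:
        os.environ[name] = str(n_threads)
    return {name: os.environ[name] for name in names}


limits = limit_blas_threads(1)

# Import numpy only after the limits are in place.
import numpy as np
```

Running the slow tests once with and once without the cap on the same machine is a quick way to confirm or rule out oversubscribed threads as the cause.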

ogrisel added a commit that referenced this pull request Feb 6, 2016
…-condition

[WIP] FIX performance issue in _graph_connected_component
@ogrisel ogrisel merged commit b0d071b into scikit-learn:master Feb 6, 2016
@ogrisel ogrisel changed the title [WIP] FIX performance issue in _graph_connected_component [MRG] FIX performance issue in _graph_connected_component Feb 6, 2016