Fix for spectral clustering error when using 'amg' solver #13707

Merged
merged 27 commits into from
Aug 29, 2019

whitews
Contributor

@whitews whitews commented Apr 24, 2019

Reference Issues/PRs

Fixes #13393. See also PR #12316

What does this implement/fix? Explain your changes.

Fixes LinAlgError when using spectral clustering with the amg solver

Any other comments?

This PR is derived from the previous PR #12316 submitted by Andrew Knyazev (lobpcg). In that PR, Andrew fixed issue #13393 and also added a new label assignment option 'clusterQR'. It was requested that the PR be split to separate the fix and the new label assignment functionality. This PR contains Andrew's fix for the AMG bug.

@whitews whitews changed the title from "Spec clust amg fix" to "Fix for spectral clustering error when using 'amg' solver" Apr 24, 2019
Member

@jnothman jnothman left a comment

Thanks for this @whitews

I'm not sure what chance it has to get into 0.21, but just in case:
Please add an entry to the change log at doc/whats_new/v0.21.rst. Like the other entries there, please reference this pull request with :issue: and credit yourself (and other contributors if applicable) with :user:

@whitews
Contributor Author

whitews commented Apr 24, 2019

Updated the change log. Let me know if anything else is needed.

@whitews
Contributor Author

whitews commented Apr 25, 2019

Don't understand the codecov/patch failure, all local tests are passing for me. What does this failure mean?

@jnothman
Member

jnothman commented Apr 25, 2019 via email

@whitews
Contributor Author

whitews commented Apr 25, 2019

@jnothman This seems pretty clean now, everything is passing.

@lobpcg
Contributor

lobpcg commented Apr 26, 2019

I think the mathematically proper fix is to change, in sklearn/manifold/spectral_embedding_.py, the present code

        laplacian = _set_diag(laplacian, 1 + 1e-5, norm_laplacian)

        # noinspection PyUnboundLocalVariable
        ml = smoothed_aggregation_solver(check_array(laplacian, 'csr'))

into

        laplacian = _set_diag(laplacian, 1, norm_laplacian)

        # noinspection PyUnboundLocalVariable
        ml = smoothed_aggregation_solver(check_array(laplacian + 1e-5 * sparse.eye(laplacian.shape[0]), 'csr'))

so that the LOBPCG solver is still called on the unchanged Laplacian, and only the AMG preconditioner is fed the shifted Laplacian. I have updated my issue #13393 to highlight this.

I am unsure how memory allocation would work in my suggestion above. Maybe, to save memory, one could do something like:

        laplacian = _set_diag(laplacian, 1, norm_laplacian)
        laplacian = laplacian + 1e-5 * sparse.eye(laplacian.shape[0])
        # noinspection PyUnboundLocalVariable
        ml = smoothed_aggregation_solver(check_array(laplacian, 'csr'))
        laplacian = laplacian - 1e-5 * sparse.eye(laplacian.shape[0])
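
The suggestion above can be sketched with SciPy alone. This is a minimal illustration under stated assumptions, not the code merged in this PR: a sparse LU factorization (`splu`) stands in for pyamg's `smoothed_aggregation_solver`, and the path-graph Laplacian, the block size of 3, and all variable names are made up for the example. The point it shows is that LOBPCG iterates on the unshifted, singular Laplacian while only the preconditioner is built from the shifted, positive definite matrix.

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import LinearOperator, lobpcg, splu

# Assumption: a path graph on n nodes as a toy example. Its Laplacian is
# symmetric positive semidefinite and singular (constant vector -> eigenvalue 0).
n = 50
off = -np.ones(n - 1)
diag = np.full(n, 2.0)
diag[0] = diag[-1] = 1.0
laplacian = sparse.diags([off, diag, off], [-1, 0, 1], format="csr")

# Build the preconditioner from the *shifted* Laplacian only.
# Assumption: sparse LU is a stand-in for the AMG preconditioner here.
shifted = (laplacian + 1e-5 * sparse.eye(n)).tocsc()
lu = splu(shifted)
M = LinearOperator((n, n), matvec=lu.solve, matmat=lu.solve, dtype=np.float64)

# LOBPCG is called on the unchanged Laplacian; M only accelerates convergence.
rng = np.random.default_rng(0)
X = rng.standard_normal((n, 3))
eigvals, eigvecs = lobpcg(laplacian, X, M=M, largest=False, tol=1e-6, maxiter=100)
```

Because the eigenproblem itself is left untouched, the computed eigenvalues belong to the original Laplacian; the shift affects only how the preconditioner is constructed.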

centers = np.eye(n_clusters, n_features)
S, true_labels = make_blobs(n_samples=n_samples, centers=centers,
                            cluster_std=1., random_state=42)

Contributor

Maybe check norm_laplacian=False and norm_laplacian=True separately. The latter is the default and the only option currently checked.
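
One way to cover both settings is to loop over them. The sketch below is an assumption-laden illustration, not the test added in this PR: it uses `scipy.sparse.csgraph.laplacian`, whose `normed` flag plays the role of `norm_laplacian`, on a small random affinity matrix invented for the example. For each setting it checks that the Laplacian is singular and that the shifted matrix is positive definite.

```python
import numpy as np
from scipy.sparse import csgraph

# Assumption: a small dense random affinity matrix as toy input.
rng = np.random.default_rng(42)
affinity = rng.random((8, 8))
affinity = (affinity + affinity.T) / 2  # symmetric edge weights
np.fill_diagonal(affinity, 0.0)         # no self-loops

smallest = {}
for normed in (False, True):
    lap = csgraph.laplacian(affinity, normed=normed)
    shifted = lap + 1e-5 * np.eye(lap.shape[0])
    # Cholesky succeeds only for a positive definite matrix.
    np.linalg.cholesky(shifted)
    # Record the smallest eigenvalue before and after the shift.
    smallest[normed] = (np.linalg.eigvalsh(lap)[0],
                        np.linalg.eigvalsh(shifted)[0])
```

In both cases the unshifted Laplacian has a zero eigenvalue, so the shift is needed regardless of normalization.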

@jnothman
Member

@whitews are you continuing with this?

@whitews
Contributor Author

whitews commented May 23, 2019

@jnothman Yes, I've updated the PR. Have the tests changed? Getting a 404 response for a deb package in the np_atlas Azure build.

@jnothman
Member

jnothman commented May 23, 2019 via email

@jnothman
Member

Yes, but I've been low on time to review this and other pull requests. I hope one of us can get to it soon.

@lobpcg
Contributor

lobpcg commented Jul 23, 2019

@jnothman is there a plan to finish this one?

@jnothman jnothman closed this Jul 25, 2019
@jnothman jnothman reopened this Jul 25, 2019
Member

@jnothman jnothman left a comment

Thanks @whitews.

Btw, I've confirmed that the test fails on master, which is good.

@jnothman
Member

jnothman commented Aug 2, 2019

Please merge the master from upstream to avoid the Circle CI failure.

@whitews
Contributor Author

whitews commented Aug 2, 2019

@jnothman I think this is ready. Are there any outstanding requests?

Member

@jnothman jnothman left a comment

Otherwise LGTM. Awaiting another review. Thanks @whitews and @lobpcg.

@lobpcg
Contributor

lobpcg commented Aug 13, 2019

The documentation at https://scikit-learn.org/stable/modules/generated/sklearn.cluster.spectral_clustering.html says that using the amg eigensolver may lead to instabilities. I think this note can now be removed.

Member

@ogrisel ogrisel left a comment

LGTM, I pushed some cosmetic commits, will merge when CI is green.

@ogrisel ogrisel merged commit 372092c into scikit-learn:master Aug 29, 2019
@jnothman
Member

jnothman commented Aug 30, 2019 via email

Successfully merging this pull request may close these issues.

AMG spectral clustering fails just after a few iterations of LOBPCG with " leading minor of the array is not positive definite"
5 participants