PPCA algorithm returns inconsistent noise variance depending on truncated vs full solution #8541

Closed
@polmorenoc

Description


In the full solution (PCA._fit_full), the returned value is:
self.noise_variance_ = explained_variance_[n_components:].mean()

whereas in the truncated case (PCA._fit_truncated) you do:
self.noise_variance_ = (total_var.sum() - self.explained_variance_.sum())

So in the second case, the total unexplained variance is returned rather than the average. This is a problem not only because of the inconsistency, but also because other code in sklearn.decomposition.PCA, e.g. the computation of the precision matrix, assumes that self.noise_variance_ is the average noise variance, as per Tipping and Bishop (http://www.miketipping.com/papers/met-mppca.pdf). That assumption does not hold in the truncated case. Let me know if I'm missing something.
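To make the discrepancy concrete, here is a minimal plain-Python sketch of the two formulas quoted above, applied to a toy eigenvalue spectrum. The function names are hypothetical and are not part of scikit-learn. The example assumes a diagonal covariance (so per-feature variances equal the eigenvalues) and n_samples >= n_features. The last function shows one possible way to turn the truncated value into an average, not necessarily the fix the library would adopt.

```python
# Illustrative only: plain-Python stand-ins for the two formulas quoted above.
# All function names here are hypothetical, not scikit-learn API.

def noise_variance_full(explained_variance, n_components):
    """Mean of the discarded eigenvalues (the full-solution formula)."""
    discarded = explained_variance[n_components:]
    return sum(discarded) / len(discarded)


def noise_variance_truncated(total_var, explained_variance_kept):
    """Total unexplained variance (the truncated-solution formula as quoted)."""
    return sum(total_var) - sum(explained_variance_kept)


def noise_variance_truncated_averaged(total_var, explained_variance_kept):
    """One possible correction: divide by the number of discarded components.

    Assumes n_samples >= n_features, so the number of discarded
    components is n_features - n_components.
    """
    n_discarded = len(total_var) - len(explained_variance_kept)
    return noise_variance_truncated(total_var, explained_variance_kept) / n_discarded


# Toy spectrum of a 6-feature covariance matrix, sorted in decreasing order.
eigenvalues = [5.0, 3.0, 1.0, 0.5, 0.25, 0.25]
n_components = 2

# The per-feature variances sum to the trace of the covariance matrix,
# which equals the sum of all eigenvalues. A diagonal covariance is assumed
# here, so the per-feature variances are simply the eigenvalues themselves.
total_var = eigenvalues

full = noise_variance_full(eigenvalues, n_components)
truncated = noise_variance_truncated(total_var, eigenvalues[:n_components])
averaged = noise_variance_truncated_averaged(total_var, eigenvalues[:n_components])

print(full, truncated, averaged)
```

With four discarded components, this prints `0.5 2.0 0.5`: the truncated formula is larger than the full-solution value by a factor equal to the number of discarded components, and dividing by that count brings the two into agreement.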
