PPCA algorithm returns inconsistent noise variance depending on truncated vs full solution #8541

In the full solution (pca._fit_full) the returned value is:
self.noise_variance_ = explained_variance_[n_components:].mean()

whereas in the truncated case (pca._fit_truncated) the code does:
self.noise_variance_ = (total_var.sum() - self.explained_variance_.sum())

So in the second case the total unexplained variance is returned, not the average. This is problematic not only because of the inconsistency, but also because some code in sklearn.decomposition.PCA, e.g. the computation of the precision matrix, assumes that self.noise_variance_ is the average noise variance, as in Tipping and Bishop (http://www.miketipping.com/papers/met-mppca.pdf). That assumption does not hold in the truncated case. Let me know if I'm missing something.
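
To make the discrepancy concrete, here is a minimal NumPy sketch. It is not the sklearn code itself: it assumes synthetic data and takes the eigenvalues of the sample covariance directly, and all variable names are illustrative. It only shows that the two expressions quoted above differ by a factor equal to the number of discarded components.

```python
import numpy as np

# Synthetic data (an assumption for illustration): two strong directions
# plus four directions of roughly unit-variance noise.
rng = np.random.default_rng(0)
n_samples, n_features, n_components = 200, 6, 2
scales = np.array([5.0, 3.0, 1.0, 1.0, 1.0, 1.0])
X = rng.normal(size=(n_samples, n_features)) * scales
X -= X.mean(axis=0)

# Eigenvalues of the sample covariance, sorted in descending order.
eigvals = np.linalg.eigvalsh(np.cov(X, rowvar=False))[::-1]
explained = eigvals[:n_components]

# Expression used in the full solution: mean of the discarded eigenvalues.
noise_mean = eigvals[n_components:].mean()

# Expression used in the truncated solution: total unexplained variance.
noise_total = eigvals.sum() - explained.sum()

n_discarded = n_features - n_components
print("mean of discarded eigenvalues:", noise_mean)
print("total unexplained variance:  ", noise_total)
print("ratio:", noise_total / noise_mean, "discarded components:", n_discarded)
```

Under these assumptions, dividing the truncated-case value by the number of discarded components (here n_features - n_components) recovers the average used in the full solution.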
