Problems in sklearn.decomposition.PCA with the n_components='mle' option #4441

We have found several problems in the implementation of the method that automatically selects the number of components in PCA:
1. The algorithm never tests the full rank. This is most probably because the loops over the rank always end at rank-1 (`for i in range(rank)`).
2. If two eigenvalues are equal, there is a log(0) issue.
3. Zero eigenvalues are not handled explicitly.

Possible solutions:
- For (1): Check the loop ranges so that the full rank is included.
- For (3): Detect eigenvalues below the numerical noise level beforehand and exclude them from the rank scan.
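
A minimal sketch of how the ideas for (1) and (3) could be combined, not the actual scikit-learn code. The helper name `candidate_ranks`, the `rtol` parameter, and the noise threshold (machine epsilon scaled by the spectrum length and the largest eigenvalue) are all assumptions made for illustration.

```python
import numpy as np


def candidate_ranks(spectrum, rtol=None):
    """Hypothetical helper: ranks worth scanning for a descending spectrum."""
    spectrum = np.asarray(spectrum, dtype=float)
    if rtol is None:
        # Assumed noise floor: machine epsilon scaled by the spectrum length.
        rtol = spectrum.size * np.finfo(spectrum.dtype).eps
    # Eigenvalues at or below this threshold are treated as numerical noise.
    threshold = rtol * spectrum.max()
    effective_rank = int(np.sum(spectrum > threshold))
    # The +1 makes the range inclusive, so the largest usable rank is tested.
    return list(range(1, effective_rank + 1))


spectrum = np.array([4.0, 2.0, 1.0, 1e-17, 0.0])
print(candidate_ranks(spectrum))  # [1, 2, 3]
```

Here the two trailing eigenvalues fall below the assumed threshold and are excluded, while rank 3, the largest meaningful rank, is still part of the scan.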

I have no idea for (2). We hit the problem here with very small eigenvalues (within numerical noise) that were exactly identical. I never managed to create a synthetic dataset that reproduces the problem, since even with symmetric datasets there is always a small difference (on the order of numerical precision) between theoretically identical eigenvalues.
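
Since a dataset with exactly tied eigenvalues is hard to build, one way to reproduce (2) might be to skip the dataset and hand a tied spectrum directly to the term that fails. The sketch below assumes the criterion contains a sum of logs of eigenvalue differences, which is what the log(0) symptom suggests. `log_gap_sum` is a hypothetical stand-in for illustration, not the scikit-learn function.

```python
import math


def log_gap_sum(spectrum):
    """Illustrative term: sum of log(l_i - l_j) over all pairs i < j."""
    total = 0.0
    for i in range(len(spectrum)):
        for j in range(i + 1, len(spectrum)):
            # With an exact tie the difference is 0.0 and math.log raises.
            total += math.log(spectrum[i] - spectrum[j])
    return total


# Two identical eigenvalues within numerical noise, as described above.
tied = [3.0, 1e-16, 1e-16]
try:
    log_gap_sum(tied)
    failed = False
except ValueError:
    failed = True
print(failed)  # True
```

Note that `numpy.log(0.0)` would return `-inf` with a warning instead of raising, so the visible symptom depends on which `log` is used.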

