Closed
Description
We have found several problems in the implementation of the method to automatically tune the number of components of the PCA algorithms:
- The algorithm never tests full rank: this is most probably because the loops over the rank always end at rank-1 (`for i in range(rank)`).
- If two eigenvalues are equal, there is a log(0) issue.
- Zero eigenvalues are not treated explicitly.
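To illustrate the second point, here is a minimal sketch, assuming the criterion contains a pairwise term of the form log((s_i - s_j) * (1/s_j - 1/s_i)) summed over eigenvalue pairs. The function `pairwise_log_term` and the example spectra are hypothetical stand-ins, not the library's actual code.

```python
import numpy as np


def pairwise_log_term(spectrum, rank):
    """Sum log((s_i - s_j) * (1/s_j - 1/s_i)) for i < rank and j > i.

    Hypothetical stand-in for the pairwise term of a rank-selection
    criterion; the real implementation may differ.
    """
    spectrum = np.asarray(spectrum, dtype=float)
    total = 0.0
    # Silence the divide-by-zero warning so the -inf result is visible.
    with np.errstate(divide="ignore"):
        for i in range(rank):
            for j in range(i + 1, len(spectrum)):
                gap = (spectrum[i] - spectrum[j]) * (
                    1.0 / spectrum[j] - 1.0 / spectrum[i]
                )
                total += np.log(gap)
    return total


distinct = [4.0, 2.0, 1.0]  # all eigenvalues differ
tied = [4.0, 2.0, 2.0]      # the last two eigenvalues are exactly equal

print(pairwise_log_term(distinct, 2))  # finite
print(pairwise_log_term(tied, 2))      # -inf, caused by log(0)
```

With exactly tied eigenvalues the product inside the logarithm is 0, so the sum collapses to `-inf` and the comparison between candidate ranks is no longer meaningful.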
Possible solutions:
- For (1): Check the loop ranges
- For (3): Detect in advance the eigenvalues that fall below the numerical noise and exclude them from the rank scan
I have no idea for (2). We encountered the problem here with very small eigenvalues (within numerical noise) that were exactly identical. I never managed to create a synthetic dataset that reproduces the problem, since even with symmetric datasets there is always a small difference (on the order of numerical precision) between theoretically identical eigenvalues.
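The proposals for (1) and (3) could be sketched roughly as follows. This is only an illustration under stated assumptions: `candidate_ranks` is a hypothetical helper, and the noise floor (number of eigenvalues times machine epsilon times the largest eigenvalue) is one common heuristic, not a threshold taken from the existing code.

```python
import numpy as np


def candidate_ranks(spectrum, rel_tol=None):
    """Return the ranks worth scanning, skipping noise-level eigenvalues.

    Hypothetical helper: the default tolerance is an assumption, chosen
    to mimic the usual numerical-rank heuristic.
    """
    spectrum = np.asarray(spectrum, dtype=float)
    if rel_tol is None:
        rel_tol = len(spectrum) * np.finfo(spectrum.dtype).eps
    noise_floor = rel_tol * spectrum.max()
    # Count the eigenvalues that stand clearly above the numerical noise.
    n_significant = int(np.sum(spectrum > noise_floor))
    # The upper bound is n_significant + 1 so the last rank is not skipped.
    return list(range(1, n_significant + 1))


print(candidate_ranks([5.0, 3.0, 1.0]))                # [1, 2, 3]
print(candidate_ranks([5.0, 3.0, 1e-17, 1e-17, 0.0]))  # [1, 2]
```

Filtering this way would also keep exact ties among noise-level eigenvalues out of the scan, which is the situation described above, but it does nothing for ties among significant eigenvalues. Whether the criterion itself is well defined at full rank is a separate question that this sketch does not address.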