Description
In _pca, when running _infer_dimension for the 'mle' solver, there is an off-by-one problem which has previously been discussed here, here, and here.
In _infer_dimension we iterate over range(len(spectrum)) and call the iteration variable rank. This is misleading because the name suggests the rank of the matrix, where rank=0 would mean a matrix that has no variance.
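To make the ambiguity concrete, here is a small sketch. It is not the scikit-learn code; the spectrum values and the helper describe are made up for illustration. It only shows that a loop over range(len(spectrum)) starts at 0, and that the same number means different things depending on whether it is read as an index or as a count of components.

```python
import numpy as np

# Hypothetical eigenvalue spectrum, sorted in descending order (made-up values).
spectrum = np.array([4.0, 2.0, 1.0, 0.5])


def describe(spectrum, rank):
    """Illustrative helper (not part of scikit-learn).

    Returns the eigenvalue found when `rank` is read as an index, and the
    number of eigenvalues kept when `rank` is read as a count.
    """
    value_at_index = float(spectrum[rank])    # `rank` used as a position
    n_components_kept = int(spectrum[:rank].size)  # `rank` used as a count
    return value_at_index, n_components_kept


# Same loop shape as described above: the variable named `rank` starts at 0.
for rank in range(len(spectrum)):
    print(rank, describe(spectrum, rank))
```

For rank=0 the index reading picks the largest eigenvalue, while the count reading keeps no components at all, which is the mismatch the naming hides.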
Even in _assess_dimension the variable rank is sometimes used as an index and sometimes as a value in the computation. Maybe someone can help me understand how _assess_dimension is meant to work. Is it supposed to return the likelihood of the indexed eigenvalue, or of the rank itself (the number of components)?
In the code coverage you can see that the rank == n_features condition is never executed.
In principle we should probably catch the rank-0 case earlier, because it's easy to detect. I'll try to come up with a test. I'd be interested in working on a PR for this.
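As a rough idea of what such an early check could look like, here is a minimal sketch. The helper name is_rank_zero and the tolerance tol are my own assumptions, not existing scikit-learn API, and the right threshold would need discussion.

```python
import numpy as np


def is_rank_zero(spectrum, tol=1e-12):
    """Hypothetical helper: True when no eigenvalue exceeds `tol`.

    `tol` is an assumed absolute tolerance chosen for illustration only.
    """
    spectrum = np.asarray(spectrum, dtype=float)
    return bool(np.all(np.abs(spectrum) <= tol))


# A matrix with no variance has an all-zero spectrum.
print(is_rank_zero([0.0, 0.0, 0.0]))
# Any eigenvalue above the tolerance means the rank is at least 1.
print(is_rank_zero([3.0, 0.0, 0.0]))
```

A test could then assert that the 'mle' path handles the all-zero spectrum explicitly instead of relying on the loop to deal with it.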
Ping @adrinjalali