Off-By-One Error in _pca with 'mle' #16546

Closed
@lschwetlick

Description

In _pca, when running _infer_dimension for n_components='mle', there is an off-by-one problem that has previously been discussed here, here, and here.

In _infer_dimension we iterate over range(len(spectrum)) and we call our iteration variable rank. This is misleading because the name rank suggests the rank of the matrix, where rank=0 would mean a matrix that has no variance.

Even within _assess_dimension, the variable rank is sometimes used as an index and sometimes as a value in the computation. Maybe someone can help me understand how _assess_dimension is meant to work. Is it supposed to return the likelihood of the indexed eigenvalue or of the rank as a number?

In the code coverage you can see that the rank == n_features condition is never executed.
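
To make the index-versus-value ambiguity concrete, here is a minimal sketch. It is not the scikit-learn code: the spectrum values are made up and `candidate_ranks` is a hypothetical helper that only mirrors the range(len(spectrum)) loop described above.

```python
import numpy as np


def candidate_ranks(spectrum):
    """Values the loop variable takes when iterating range(len(spectrum)).

    Hypothetical helper for illustration only, not part of scikit-learn.
    """
    return list(range(len(spectrum)))


# Made-up eigenvalues in descending order (assumption for the example).
spectrum = np.array([3.0, 1.5, 0.2, 0.1])
n_features = len(spectrum)
ranks = candidate_ranks(spectrum)

# Read as a matrix rank, the loop includes 0 (no variance at all)
# but never reaches n_features (full rank).
includes_zero = 0 in ranks
reaches_full_rank = n_features in ranks

# Read as an index, rank=0 selects the largest eigenvalue;
# read as a count, rank=0 selects no eigenvalues.
as_index = spectrum[0]
as_count = spectrum[:0]
```

With this loop, a check such as rank == n_features can never be true, which would be consistent with the coverage report mentioned above.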

In principle we should probably catch the case of a rank-0 matrix beforehand, because it's easy to detect. I'll try to come up with a test. I'd be interested in working on a PR for this.
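
One possible way to detect the rank-0 case early is sketched below. The helper name `is_rank_zero` and the tolerance `tol` are assumptions made for this example, not existing scikit-learn API; the check simply tests whether the centered data has any non-negligible singular value.

```python
import numpy as np


def is_rank_zero(X, tol=1e-12):
    """Return True if the centered data has no variance in any direction.

    Hypothetical helper; `tol` is an arbitrary threshold chosen for the sketch.
    """
    X = np.asarray(X, dtype=float)
    X_centered = X - X.mean(axis=0)
    # Singular values of the centered data; all (near) zero means rank 0.
    singular_values = np.linalg.svd(X_centered, compute_uv=False)
    return bool(np.all(singular_values <= tol))


constant_data = np.ones((5, 3))                 # every row identical
varying_data = np.arange(15.0).reshape(5, 3)    # rows differ

constant_is_rank_zero = is_rank_zero(constant_data)
varying_is_rank_zero = is_rank_zero(varying_data)
```

A guard like this could run before the dimension search, so that the loop only has to consider ranks of at least 1. Whether that is the right place for the check is left open here.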

Ping @adrinjalali
