Support n_components being a float for TruncatedSVD #10988

jamesqo · 2018-04-16T21:36:21Z

For sklearn.decomposition.PCA you can pass a float between 0 and 1 as n_components to specify the minimum explained variance ratio that the components selected by fit() should retain. Would it be possible to also have this behavior for TruncatedSVD?
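For context, a minimal sketch of the PCA behavior being referred to. The synthetic data and the 0.90 threshold are assumptions chosen for illustration; they are not from the original report.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic data (assumption): 200 samples, 10 features with decaying scale.
rng = np.random.RandomState(0)
X = rng.normal(size=(200, 10)) * np.linspace(5.0, 0.5, 10)

# A float in (0, 1) asks PCA to keep enough components to explain
# at least that fraction of the total variance.
pca = PCA(n_components=0.90, svd_solver="full")
pca.fit(X)

kept = pca.n_components_                          # number of components PCA chose
explained = pca.explained_variance_ratio_.sum()   # fraction of variance they explain
print(kept, explained)
```

After fitting, `n_components_` holds the number of components that were actually kept, which is the value the request would like TruncatedSVD to be able to determine in the same way.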


rth · 2018-04-18T08:02:30Z

There was related discussion about this in #7973. Generally I think it would be good to have better consistency between PCA and TruncatedSVD. The implementation should take care to avoid issues raised in #10034 that were fixed in #10042 ...

Micky774 · 2022-02-02T04:32:48Z

So I'm a bit confused on this one: isn't the point of TruncatedSVD that it only computes up to k components of the SVD? To calculate the distribution of explained variance (to determine how many components to keep), we'd first have to compute a full SVD, calculate the explained variances, find the cutoff, and retain the selected components, right? That's at least how PCA does it. So, without truncating a full SVD, I don't know how we'd be able to decide a priori the right number of components.
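The steps described above can be sketched in plain NumPy. This is an illustration under stated assumptions, not scikit-learn's actual code: `n_components_for_variance` is a hypothetical helper, the data is synthetic, and the input is centered as PCA does (TruncatedSVD itself does not center).

```python
import numpy as np

def n_components_for_variance(X, threshold):
    """Hypothetical helper: smallest k whose cumulative explained
    variance ratio exceeds `threshold`, found via a full SVD."""
    Xc = X - X.mean(axis=0)                      # center, as PCA does
    s = np.linalg.svd(Xc, compute_uv=False)      # all singular values (full SVD)
    explained_variance = s**2 / (X.shape[0] - 1)
    ratio = explained_variance / explained_variance.sum()
    cumulative = np.cumsum(ratio)
    # Index of the first cumulative ratio strictly greater than the threshold.
    k = int(np.searchsorted(cumulative, threshold, side="right")) + 1
    return min(k, len(s)), cumulative

# Synthetic data (assumption): 6 features with decaying scale.
rng = np.random.RandomState(0)
X = rng.normal(size=(100, 6)) * np.array([4.0, 3.0, 2.0, 1.0, 0.5, 0.1])
k, cumulative = n_components_for_variance(X, 0.95)
print(k, cumulative)
```

The cutoff can only be located because every singular value is available; a solver that stops after k components never sees the total variance contributed by the rest of the spectrum.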

rth · 2022-02-02T08:44:39Z

Yes, it looks like in PCA a float n_components only works if svd_solver="full" (i.e. a full SVD is computed), while TruncatedSVD doesn't have an algorithm="full" option. So if we wanted to implement this we would need to add it. I'm not sure overall it would be worth it.
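A small sketch of the solver restriction, assuming a reasonably recent scikit-learn; `fit_outcome` is a hypothetical helper and the random data is only for illustration. With a truncated solver such as "arpack", a float n_components is rejected with a ValueError.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic data (assumption): 50 samples, 8 features.
rng = np.random.RandomState(0)
X = rng.normal(size=(50, 8))

def fit_outcome(solver):
    """Hypothetical helper: report whether PCA accepts a float
    n_components with the given svd_solver."""
    try:
        PCA(n_components=0.9, svd_solver=solver).fit(X)
        return "ok"
    except ValueError:
        return "ValueError"

full_result = fit_outcome("full")      # full SVD: float threshold is supported
arpack_result = fit_outcome("arpack")  # truncated solver: float is rejected
print(full_result, arpack_result)
```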

Since it's an issue from 2018 and there wasn't much interest in it from users since, closing as won't fix. Thanks for investigating @Micky774 !

zachmayer mentioned this issue Jul 15, 2020

TruncatedSVD option to reduce k instead of raising "n_components must be < n_features;" #17916

Open

cmarmo added the module:decomposition and Needs Decision labels Jan 15, 2022

rth closed this as completed Feb 2, 2022
