
Support n_components being a float for TruncatedSVD #10988

Closed
jamesqo opened this issue Apr 16, 2018 · 3 comments

Comments


jamesqo commented Apr 16, 2018

For sklearn.decomposition.PCA you can pass a float for n_components to specify the minimum explained variance ratio that should be retained by fit(). Would it be possible to also have this behavior for TruncatedSVD?
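[Editor's note: a minimal sketch of the PCA behavior being referenced; the data and the 0.8 threshold are made up for illustration.]

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.RandomState(0)
X = rng.standard_normal((100, 20))

# A float in (0, 1) asks PCA to keep the smallest number of components
# whose cumulative explained variance ratio reaches that threshold.
pca = PCA(n_components=0.8, svd_solver="full").fit(X)

print(pca.n_components_)                    # the integer k PCA chose
print(pca.explained_variance_ratio_.sum())  # >= 0.8 by construction
```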


rth (Member) commented Apr 18, 2018

There was related discussion about this in #7973. Generally I think it would be good to have better consistency between PCA and TruncatedSVD. The implementation should take care to avoid the issues raised in #10034 that were fixed in #10042 ...

Micky774 (Contributor) commented Feb 2, 2022

So I'm a bit confused on this one -- isn't the point of TruncatedSVD that it only computes up to k components of the SVD? To calculate the distribution of explained variance (to determine how many components to keep), we'd first have to compute a full SVD decomposition, calculate the explained variances, find the cutoff, and retain the selected components, right? That's at least how PCA does it. So, without truncating a full SVD, I don't know how we'd be able to decide a priori the right number of components.
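[Editor's note: the cutoff step described above can be sketched as follows; `choose_n_components` is a made-up name for illustration, not a scikit-learn API, and this is only an approximation of what PCA does internally.]

```python
import numpy as np

# Given per-component explained variance ratios from a *full*
# decomposition, keep the smallest k whose cumulative sum reaches
# the requested threshold.
def choose_n_components(explained_variance_ratio, threshold):
    cumulative = np.cumsum(explained_variance_ratio)
    # side="right" so a component whose cumulative sum hits the
    # threshold exactly is still counted.
    return int(np.searchsorted(cumulative, threshold, side="right")) + 1

ratios = np.array([0.5, 0.3, 0.15, 0.05])
# Two components cover 0.8 < 0.9; three cover 0.95 >= 0.9, so k = 3.
print(choose_n_components(ratios, 0.9))  # 3
```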

rth (Member) commented Feb 2, 2022

Yes, it looks like in PCA a float n_components only works if svd_solver="full" (i.e. a full SVD is computed), while TruncatedSVD doesn't have an algorithm="full" option. So if we wanted to implement this we would need to add it. I'm not sure overall it would be worth it.
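[Editor's note: a small sketch of the limitation described above, with made-up data; TruncatedSVD only accepts its two iterative solvers, so the explained variance ratios it reports never cover the full spectrum.]

```python
import numpy as np
from sklearn.decomposition import TruncatedSVD

rng = np.random.RandomState(0)
X = rng.standard_normal((100, 20))

# TruncatedSVD only offers algorithm="arpack" or "randomized"; it
# computes exactly n_components singular vectors and nothing more.
svd = TruncatedSVD(n_components=5, algorithm="randomized",
                   random_state=0).fit(X)

# Only 5 ratios are available, so there is no way to tell a priori
# how many components a variance threshold like 0.8 would require.
print(svd.explained_variance_ratio_.shape)
```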

Since this issue dates from 2018 and hasn't drawn much user interest since then, closing as won't fix. Thanks for investigating @Micky774 !

rth closed this as completed Feb 2, 2022