FIX: Repair PCA array API support tag for svd_solver "full", "covariance_eigh", or "auto" #32198
OmarManzoor
left a comment
Thanks for the PR @icfaust
If the covariance_eigh solver supports the array API, you can simply update the existing array_api docs where PCA is already mentioned and add covariance_eigh as an additional supported value.
OmarManzoor
left a comment
LGTM. Thanks @icfaust
self.svd_solver in ["full", "randomized"]
and self.power_iteration_normalizer == "QR"
solver = getattr(self, "_fit_svd_solver", self.svd_solver)
tags.array_api_support = solver not in ["arpack", "randomized"] or (
Thanks for fixing this!
I would keep a whitelist here because, for now, array API support is only available in a few cases. A whitelist also makes it easier to check that the documentation and the code match.
If the documentation is correct, I would expect something like this?
tags.array_api_support = solver in ["full", "covariance_eigh"] or (solver == "randomized" and self.power_iteration_normalizer == "QR")
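For comparison, here is a minimal standalone sketch of the two predicates under discussion, written as plain functions rather than as the estimator's tag method. The function names are hypothetical. The tail of the blacklist condition is truncated in the diff above, so its second clause below is an assumption. The solver and normalizer values listed are the ones I understand PCA to accept.

```python
# Hypothetical helpers for illustration only; not part of scikit-learn.
SOLVERS = ["auto", "full", "covariance_eigh", "arpack", "randomized"]
NORMALIZERS = ["auto", "QR", "LU", "none"]


def whitelist_support(solver, power_iteration_normalizer):
    # Whitelist form, as suggested in the review comment.
    return solver in ["full", "covariance_eigh"] or (
        solver == "randomized" and power_iteration_normalizer == "QR"
    )


def blacklist_support(solver, power_iteration_normalizer):
    # Blacklist form from the diff. The clause after "or (" is cut off
    # in the diff, so this second clause is an assumption.
    return solver not in ["arpack", "randomized"] or (
        solver == "randomized" and power_iteration_normalizer == "QR"
    )


# Collect every (solver, normalizer) pair where the two forms disagree.
differences = [
    (solver, normalizer)
    for solver in SOLVERS
    for normalizer in NORMALIZERS
    if whitelist_support(solver, normalizer)
    != blacklist_support(solver, normalizer)
]
print(differences)
```

Under these assumptions, the two forms disagree only when the unresolved value "auto" is passed in: the blacklist reports support, while the whitelist does not.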
Merging this one, thanks!
Follow-on to #31784 (which is a follow-up to #30777). The conditions laid out in the documentation for PCA (https://scikit-learn.org/dev/modules/array_api.html#estimators) state:
The documentation for PCA states for power_iteration_normalizer (https://scikit-learn.org/stable/modules/generated/sklearn.decomposition.PCA.html) that:
This is leading to array API support failures in our repository (uxlfoundation/scikit-learn-intelex#2578) as we prep for your 1.8 release.
The logic should swap from a whitelist to a blacklist to speed up and simplify it, so that power_iteration_normalizer is only checked when either 1) svd_solver is "randomized" or 2) _fit_svd_solver is set to "randomized". This corrects the logic to probe for _fit_svd_solver if available (to handle the svd_solver="auto" condition) and to fall back to checking svd_solver if the PCA estimator has not been fit.
I'm not sure how to update the documentation to reflect this, as the 'covariance_eigh' solver also supports array API but is not reflected in the documentation.
@OmarManzoor @lesteve
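The fallback described above prefers _fit_svd_solver once it exists and otherwise uses svd_solver. It can be sketched with a small stand-in class. FakePCA, its fit signature, and effective_solver are hypothetical and only mirror the attribute names mentioned in this PR; the real estimator resolves "auto" with its own heuristics.

```python
class FakePCA:
    """Hypothetical stand-in that mirrors only the attribute names in this PR."""

    def __init__(self, svd_solver="auto"):
        self.svd_solver = svd_solver

    def fit(self, resolved_solver):
        # Assumption: fitting records the solver that was actually used.
        self._fit_svd_solver = resolved_solver
        return self

    def effective_solver(self):
        # Prefer the fitted solver; fall back to the constructor argument
        # when the estimator has not been fit yet.
        return getattr(self, "_fit_svd_solver", self.svd_solver)


est = FakePCA(svd_solver="auto")
before = est.effective_solver()  # not fit yet, so the constructor value is used
est.fit("randomized")
after = est.effective_solver()  # fit, so the recorded solver is used
print(before, after)
```

In this sketch, the value checked before fitting is whatever the user passed in, including "auto", so any tag logic built on it has to decide how to treat that unresolved case.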