[MRG+1] Added _fit_svd_solver variable to PCA #11225
_fit_method is set as soon as the fit(X) method is called. Its value depends on the value passed for svd_solver and on the input array X.
sklearn/decomposition/pca.py
Outdated
@@ -388,7 +388,7 @@ def _fit(self, X):
         else:
             n_components = self.n_components

-        # Handle svd_solver
+        # Handle svd_solver and _fit_method
         svd_solver = self.svd_solver
We can replace the variable svd_solver with the new self._fit_method.
Everywhere? IMO svd_solver is more concise/readable.
svd_solver is just a local variable in the _fit() method. If we replace it with an instance attribute (say self._fit_svd_solver for now), it will serve both purposes: doing svd_solver's current task and exposing the solver used.
Moreover, NearestNeighbors also handles algorithm='auto' in this manner.
Yes, but you would have to do 4-5 replacements of svd_solver with self._fit_svd_solver to save one assignment, self._fit_svd_solver = svd_solver, at the end (current solution). In any case that's not too important; either way would be fine, up to you.
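The two options discussed in this thread can be sketched side by side. This is only an illustration: ToySolverChooser, _pick, and the size threshold are hypothetical stand-ins, not scikit-learn's actual code or heuristic. Only the names svd_solver and _fit_svd_solver come from the discussion.

```python
class ToySolverChooser:
    """Hypothetical stand-in for an estimator that accepts svd_solver='auto'."""

    def __init__(self, svd_solver="auto"):
        # The user-supplied parameter; neither variant below modifies it.
        self.svd_solver = svd_solver

    @staticmethod
    def _pick(shape):
        # Assumed, simplified rule for illustration only.
        return "full" if max(shape) <= 500 else "randomized"

    def fit_local_then_assign(self, shape):
        # Variant A: short local name, exposed with one assignment at the end.
        svd_solver = self.svd_solver
        if svd_solver == "auto":
            svd_solver = self._pick(shape)
        if svd_solver == "full":
            self.path_ = "exact decomposition"
        else:
            self.path_ = "approximate decomposition"
        self._fit_svd_solver = svd_solver
        return self

    def fit_attribute_throughout(self, shape):
        # Variant B: the private attribute replaces the local variable everywhere.
        self._fit_svd_solver = self.svd_solver
        if self._fit_svd_solver == "auto":
            self._fit_svd_solver = self._pick(shape)
        if self._fit_svd_solver == "full":
            self.path_ = "exact decomposition"
        else:
            self.path_ = "approximate decomposition"
        return self


a = ToySolverChooser().fit_local_then_assign((100, 10))
b = ToySolverChooser().fit_attribute_throughout((100, 10))
```

Both variants end with the same state: the constructor parameter still holds 'auto', and the private attribute records the solver that was actually chosen. The difference is purely one of verbosity inside the method body.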
Generally we need tests for a PR related to source code, but since |
I'm not sure that _fit_method is a great name... I'd rather have "solver" in there.
@jnothman Will it be okay if we overwrite value of |
No, that won't work, because when running

pca = PCA(svd_solver='auto')
pca.fit(X1)
pca.fit(X2)

you wouldn't want the solver in the second fit to depend on the choice in the first one.
+1 maybe
Yes, though a test checking that two repeated fits (see above) behave consistently for the solver wouldn't hurt, unless there is already a test like that.
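A repeated-fit check of the kind suggested here might look like the sketch below. ToyPCA and its size threshold are hypothetical and only mimic the pattern under discussion. With scikit-learn itself, the same assertions would be written against PCA and the private _fit_svd_solver attribute named in this thread.

```python
class ToyPCA:
    """Hypothetical estimator that resolves svd_solver='auto' on every fit."""

    def __init__(self, svd_solver="auto"):
        self.svd_solver = svd_solver

    def fit(self, X):
        n_samples, n_features = len(X), len(X[0])
        # Always start from the constructor parameter, never from the
        # result of a previous fit.
        solver = self.svd_solver
        if solver == "auto":
            # Assumed threshold for illustration only.
            solver = "full" if max(n_samples, n_features) <= 500 else "randomized"
        self._fit_svd_solver = solver
        return self


def check_repeated_fits_are_independent():
    small = [[0.0] * 3 for _ in range(10)]
    large = [[0.0] * 3 for _ in range(1000)]
    pca = ToyPCA(svd_solver="auto")

    pca.fit(small)
    assert pca._fit_svd_solver == "full"

    # The second fit must choose its solver from the new data alone.
    pca.fit(large)
    assert pca._fit_svd_solver == "randomized"

    # The constructor parameter is left untouched throughout.
    assert pca.svd_solver == "auto"


check_repeated_fits_are_independent()
```

Had fit overwritten self.svd_solver with the resolved value, the second fit would have reused 'full' and the second assertion would fail.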
I'm OK with it and actually don't care much about the name (since it's private). Maybe we can also consider |
Replacing svd_solver with self._fit_svd_solver lets us keep track of the solver actually used, even if svd_solver is passed as 'auto'.
LGTM, thanks @njkevlani
sklearn/decomposition/pca.py
Outdated
@@ -388,26 +388,26 @@ def _fit(self, X):
         else:
             n_components = self.n_components

         # Handle svd_solver
Not a serious problem, but I think you might leave the comment as it is. ("Handle _fit_svd_solver" seems awkward to me.)
sklearn/decomposition/pca.py
Outdated
-        # Handle svd_solver
-        svd_solver = self.svd_solver
-        if svd_solver == 'auto':
+        # Handle _svd_solver
_svd_solver -> svd_solver :)
@qinhanmin2014 Fixed typo.
@qinhanmin2014 @rth Would you like me to change anything further?
LGTM, merging. Thank you @njkevlani !
(I would have preferred to keep the local svd_solver variable, as it is less verbose, but since there is no consensus on that, this is fine.)
Fixes #11223 Expose solver used with PCA(svd_solver='auto')