ENH Exposes latent mean and variance for GPCs #22227
Conversation
return_std_of_f flag for GPCs
Depending on the API decision, I would be happy to update this with the latest changes.
noashin left a comment:
What is lacking is an explanation of the meaning of the std of f. It should be emphasized that this is not the variance of \pi (the link function applied to the latent function) and that it does not translate directly to confidence intervals on the predicted probabilities. Maybe this can be added to the general documentation of GPs in scikit-learn.
As the API is now more similar to the one of GaussianProcessRegressor, without further explanation it implies that the quantity returned is equivalent to the one returned by the return_std flag in GaussianProcessRegressor (even though the name of the flag is different).
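[Editor's note: to make the distinction concrete, here is a minimal sketch, not from the PR, of why the std of the latent f does not carry over to \pi = sigmoid(f). The latent mean/std values are made up for illustration.]

import numpy as np
from scipy.special import expit  # the logistic sigmoid

# Hypothetical latent mean/std at one test point, e.g. as produced by a
# Laplace approximation (illustrative values only).
f_mean, f_std = 0.8, 1.5

rng = np.random.default_rng(0)
f_samples = rng.normal(f_mean, f_std, size=100_000)
pi_samples = expit(f_samples)

# E[sigmoid(f)] is pulled toward 0.5 relative to sigmoid(f_mean), and the
# spread of pi is bounded in [0, 1] -- it is not a simple rescaling of f_std.
print(pi_samples.mean(), expit(f_mean), pi_samples.std())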
    X = self._validate_data(X, ensure_2d=True, dtype="numeric", reset=False)
else:
    X = self._validate_data(X, ensure_2d=False, dtype=None, reset=False)
I would put the check of whether std_f can be returned here, after the kernel tests. Then the check can be similar to the one done around line 773:
if self.n_classes_ > 2:
    if return_std_of_f:
        raise ValueError(
            "Returning the standard deviation of the "
            "latent function f is only supported for GPCs "
            "that use the Laplace Approximation."
        )
    else:
        return self.base_estimator_.predict_proba(X)
else:
    return self.base_estimator_.predict_proba(
        X, return_std_of_f=return_std_of_f
    )
Agreed! Will implement these changes.
sklearn/gaussian_process/_gpc.py (outdated diff):

        return self

-    def predict(self, X):
+    def predict(self, X, return_std_of_f=False):
I think that std_of_f should be available only via predict_proba.
As predict does not provide a probabilistic estimate, returning the std of the nuisance function is somewhat out of scope.
Hi Noa,
First of all, thanks for the review!
I added the flag here as well to match the API for GPR. In the predict method for Gaussian Process Regression, you can also find a return_std. If we remove it from this predict, wouldn't that make the APIs different?
I'm happy to remove it either way, but I'm curious about what you think regarding whether the APIs for GPR and GPC should match.
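[Editor's note: for reference, the existing GPR API being mirrored looks like this; return_std is a real GaussianProcessRegressor.predict keyword, and the toy data here is made up.]

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

X_train = np.array([[1.0], [3.0], [5.0]])
y_train = np.array([0.5, 1.0, 2.0])

gpr = GaussianProcessRegressor().fit(X_train, y_train)
# predict exposes the predictive std directly via a keyword flag:
y_mean, y_std = gpr.predict(np.array([[2.0]]), return_std=True)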
Maybe @adrinjalali could share his opinion on whether to keep the interfaces as similar as possible.
For what it's worth, someone mentioned the consistency with GPR in a comment on the original issue: #22226 (comment)
I agree that in this case having it only for predict_proba makes more sense.
I'll include those changes then!
Dear @noashin, again, many thanks for the review. I have added a paragraph to the documentation explaining what the keyword does, and I am hopefully clear and specific about the fact that we are returning the std of the latent function f. Let me know if I should address anything else!
adrinjalali left a comment:
I wonder if the argument could also be simply called return_std
It might be misleading! People might assume that it refers to the uncertainty over the class probabilities.
I agree. I think that return_std would be misleading.
ping @glemaitre maybe.
I'm happy to update it, but it seems we're waiting for @glemaitre to give input on what we should call it.
Ping @antoinebaker and @snath-xoc, I think you can give nicer feedback here than I can.
Thanks for the PR @miguelgondu! If I understood the original issue correctly, I see two motivations behind returning the std of f. But if you had these applications in mind, I don't think the current approach is enough. I don't have a strong opinion here, but maybe a separate method (name and signature TBD) would be better than returning it through predict_proba:

laplace_approximation(X, return_std=False, return_cov=False)
# returns f_mean, f_std or even f_cov if possible
Thanks for the PR @miguelgondu! I agree with @antoinebaker that moving this into a dedicated function may be better than doing it within predict_proba, so we don't confuse users with y_mean and f_mean (although I see that it is well-documented). Being able to quantify the uncertainty in the latent f space could be interesting but is case-specific; with the proposed "laplace_approximation" method we could also provide the option of computing cov/std-based uncertainty estimates.
Hi @antoinebaker and @snath-xoc, thanks for the feedback! Agreed, the best path forward would be to implement another method. I'll try to do it soon and then ask for a re-review.
Dear @noashin, @adrinjalali, @antoinebaker and @snath-xoc, I have addressed the suggested changes. The latent mean and variance of f are now exposed through a dedicated method, latent_mean_and_variance. For now, I only test that the returned values have the right shape, that exceptions get raised when calling the method with more than two classes, and that the method also works on string kernels. Let me know if I should include any more tests; the validity should already be covered by previous tests on the GPC. I'm also happy to clean up the git history by rebasing and squashing irrelevant commits.
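[Editor's note: a minimal sketch of the kind of shape test described above, assuming the new latent_mean_and_variance method returns one latent mean and variance per sample; the PR's actual test code and data may differ.]

import numpy as np
from sklearn.gaussian_process import GaussianProcessClassifier

X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0, 0, 1, 1])

gpc = GaussianProcessClassifier().fit(X, y)
# For binary classification, the latent f has one value per sample.
latent_mean, latent_var = gpc.latent_mean_and_variance(X)
assert latent_mean.shape == (X.shape[0],)
assert latent_var.shape == (X.shape[0],)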
antoinebaker left a comment:
Thanks @miguelgondu for the PR; latent_mean_and_variance is a very good name, I think.
A few formatting nitpicks, otherwise LGTM!
Thanks for the feedback, @antoinebaker! I have addressed it.
Hi @miguelgondu, thank you for the changes. I thought I would just add some nitpicks to ensure consistency in the naming of latent_mean and latent_var throughout the code (unless you have a good reason for calling them f_mean and var_f_star?).
Hi @snath-xoc, thanks for the feedback! I had left the previous names because of the notation in the book we cite. That said, I agree with the consistency changes you proposed; they're now implemented. Happy to address any other feedback, let me know!
LGTM now, once we get a second opinion!
LGTM too!
@noashin and @adrinjalali, sorry for the second ping. Let me know if I should provide any changes!
adrinjalali left a comment:
Thank you very much. This seems quite straightforward now, and with the two reviews we have, I'm confident with the contribution.
Reference Issues/PRs
Fixes #22226.
What does this implement/fix? Explain your changes.
Adds a return_std_of_f flag for binary Gaussian Process Classifiers. This value was already being computed in some of the methods, so this PR just exposes that variable and adds checks in the API to ensure this is only possible for binary (and not multiclass) classification using the Laplace approximation.
Any other comments?