Conversation

@miguelgondu
Contributor

@miguelgondu miguelgondu commented Jan 16, 2022

Reference Issues/PRs

Fixes #22226.

What does this implement/fix? Explain your changes.

Adds a return_std_of_f flag for binary Gaussian Process Classifiers. This value was already being computed in some of the methods, so this PR just exposes that variable and adds checks in the API to ensure this is only possible for binary (and not multiclass) classification using the Laplace approximation.

Any other comments?

@miguelgondu miguelgondu changed the title [WIP] Adds a return std flag for GPCs [WIP] Adds a return_std_of_f flag for GPCs Jan 16, 2022
@miguelgondu miguelgondu changed the title [WIP] Adds a return_std_of_f flag for GPCs [MRG] Adds a return_std_of_f flag for GPCs Jan 16, 2022
@miguelgondu miguelgondu changed the title [MRG] Adds a return_std_of_f flag for GPCs [MRG] [ENH] Adds a return_std_of_f flag for GPCs Jan 18, 2022
@miguelgondu
Contributor Author

Depending on the API decision, I would be happy to update this with the latest changes in main.

Contributor

@noashin noashin left a comment


What is lacking is an explanation of the meaning of the std of f. It should be emphasized that this is not the variance of \pi (the link function applied to the latent function) and that it does not translate directly into confidence intervals. Maybe this can be added to the general documentation of GPs in scikit-learn.
As the API is now more similar to that of GaussianProcessRegressor, without further explanation it implies that the quantity returned is equivalent to the one returned by the return_std flag in GaussianProcessRegressor (even though the name of the flag is different).
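The distinction can be seen numerically. A minimal sketch (not scikit-learn code; all numbers here are made up) showing that the std of the latent f does not translate directly into an uncertainty for the class probability pi = sigmoid(f):

```python
# Illustration of the reviewer's point: the std of the latent f is NOT the
# std of pi = sigmoid(f), and mapping f_mean +/- f_std through the link does
# not give a confidence interval for the class probability.
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


rng = np.random.default_rng(0)
f_mean, f_std = 1.0, 2.0  # hypothetical latent posterior moments

# Monte Carlo: sample the latent f and push it through the logistic link.
f_samples = rng.normal(f_mean, f_std, size=100_000)
pi_samples = sigmoid(f_samples)

# The link of the mean is not the mean of the link (Jensen's inequality)...
link_of_mean = sigmoid(f_mean)
mean_of_link = pi_samples.mean()

# ...and the spread of pi is bounded and shaped by the link, so it is not
# a simple transform of f_std.
spread_of_pi = pi_samples.std()
```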

    X = self._validate_data(X, ensure_2d=True, dtype="numeric", reset=False)
else:
    X = self._validate_data(X, ensure_2d=False, dtype=None, reset=False)

Contributor


I would put the check of whether std_f can be returned here, after the kernel tests. Then the check can be similar to the one done around line 773:

if self.n_classes_ > 2:
    if return_std_of_f:
        raise ValueError(
            "Returning the standard deviation of the "
            "latent function f is only supported for GPCs "
            "that use the Laplace Approximation."
        )
    else:
        return self.base_estimator_.predict_proba(X)
else:
    return self.base_estimator_.predict_proba(
        X, return_std_of_f=return_std_of_f
    )

Contributor Author


Agreed! Will implement these changes.

return self

- def predict(self, X):
+ def predict(self, X, return_std_of_f=False):
Contributor


I think that std_of_f should be available only via predict_proba.
As predict does not provide a probabilistic estimate, returning the std of the nuisance function is out of scope in a way.

Contributor Author


Hi Noa,

First of all, thanks for the review!

I added the flag here as well to match the API for GPR. In the predict method for Gaussian Process Regression, you can also find a return_std. If we remove it from this predict, wouldn't that make the APIs different?
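For context, a minimal sketch of the GPR behavior being referred to. The training data here is made up, but predict(X, return_std=True) is the existing GaussianProcessRegressor API whose symmetry is under discussion:

```python
# GaussianProcessRegressor.predict already accepts return_std, returning the
# std of the Gaussian predictive distribution alongside the mean.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

# Toy 1-D regression data (made up for illustration).
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.sin(X).ravel()

gpr = GaussianProcessRegressor(random_state=0).fit(X, y)

# With return_std=True, predict returns (mean, std), one value per sample.
y_mean, y_std = gpr.predict(X, return_std=True)
```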

I'm happy to remove it either way, but I'm curious about what you think regarding whether the APIs for GPR and GPC should match.

Contributor


Maybe @adrinjalali could share his opinion on whether to keep the interfaces as similar as possible.

Contributor Author


If it's worth anything, someone mentioned the consistency with GPR in a comment to the original issue: #22226 (comment)

Member


I agree that in this case having it only for predict_proba makes more sense.

Contributor Author


I'll include those changes then!

@github-actions

github-actions bot commented Sep 26, 2023

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: 9403c59.

@miguelgondu
Contributor Author

Dear @noashin,

Again, many thanks for the review.

I have added a paragraph to the documentation explaining what the keyword would do, and I am hopefully clear and specific on the fact that we are returning $\sqrt{\text{Var}[f_*]}$ and not $\sqrt{\text{Var}[\pi(f)]}$. I have also moved the changelog bit to 1.4.

Let me know if I should address anything else!

Member

@adrinjalali adrinjalali left a comment


I wonder if the argument could also be simply called return_std

@miguelgondu
Contributor Author

I wonder if the argument could also be simply called return_std

It might be misleading! People might assume that it refers to the uncertainty over the class probabilities $\pi(f)$ instead of the latent variable $f$. I'm happy to change it either way. What does @noashin think?

@noashin
Contributor

noashin commented Sep 28, 2023

I wonder if the argument could also be simply called return_std

It might be misleading! People might assume that it refers to the uncertainty over the class probabilities π(f) instead of the latent variable f. I'm happy to change it either way. What does @noashin think?

I agree. I think that return_std will be misleading.

@adrinjalali
Member

ping @glemaitre maybe.

@miguelgondu
Contributor Author

I'm happy to update it, but it seems we're waiting for @glemaitre to give input on whether we should call it return_std or return_std_of_f.

@adrinjalali
Member

Ping @antoinebaker and @snath-xoc, I think you can give better feedback here than I can.

@antoinebaker
Contributor

Thanks for the PR @miguelgondu !

If I understood the original issue correctly, I see two motivations behind returning f_std:

  1. plot the posterior latent f "in logit space" along with its confidence region, something like fill_between(x, f_mean - f_std, f_mean + f_std)
  2. sample the posterior latent f (from the Laplace approximation, which is Gaussian with mean f_mean and covariance f_cov), and generate corresponding samples y_prob = logistic(f) to plot, use in ancestral sampling, or compute bootstrap estimates (like confidence intervals or std for y_prob)

But if you had these applications in mind, I don't think the current approach is enough (because f_mean is missing)?

I don't have a strong opinion here, but maybe a separate method (name and signature TBD):

laplace_approximation(X, return_std=False, return_cov=False)
# returns f_mean, f_std or even f_cov if possible

would be better than returning f_std inside predict_proba. I feel returning y_prob with f_std would mix apples and oranges and could lead to misuse (e.g. thinking it's y_mean, y_std like in the GPR case, even if the docstring warns against it).
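The two motivations above can be sketched with plain NumPy (all posterior moments here are hypothetical placeholders, not output of any scikit-learn call):

```python
# Sketch of both motivations: given Laplace posterior moments f_mean and
# f_std of the latent f at a few test points, build a confidence band in
# logit space and push latent samples through the logistic link to get
# Monte Carlo uncertainty estimates for y_prob.
import numpy as np

rng = np.random.default_rng(0)

f_mean = np.array([-1.5, 0.0, 2.0])  # hypothetical latent posterior means
f_std = np.array([0.5, 1.0, 0.8])    # hypothetical latent posterior stds

# Motivation 1: a confidence band in logit space (e.g. for fill_between).
band_low, band_high = f_mean - f_std, f_mean + f_std

# Motivation 2: ancestral sampling, f ~ N(f_mean, f_std^2), y_prob = logistic(f).
f_samples = rng.normal(f_mean, f_std, size=(10_000, 3))
y_prob_samples = 1.0 / (1.0 + np.exp(-f_samples))

# Bootstrap-style summaries for y_prob at each test point.
y_prob_std = y_prob_samples.std(axis=0)
ci_low, ci_high = np.percentile(y_prob_samples, [2.5, 97.5], axis=0)
```

Note that both uses need f_mean as well as f_std, which is the gap pointed out above.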

@snath-xoc
Contributor

snath-xoc commented Feb 21, 2025

Thanks for the PR @miguelgondu. I agree with @antoinebaker that moving this into a dedicated function may be better than putting it inside predict_proba, so we don't confuse users with y_mean and f_mean (although I see that it is well documented). Being able to quantify the uncertainty in the latent f space could be interesting but case-specific; with the proposed "laplace_approximation" method we could also provide the option of computing cov/std-based uncertainty estimates.

@miguelgondu
Contributor Author

Hi @antoinebaker and @snath-xoc,

Thanks for the feedback!

Agreed, the best path forward would be to implement another method. I'll try to do it soon, and ask for a re-review.

@miguelgondu miguelgondu force-pushed the gpc_return_std_flag branch from 4657afa to 7111d99 Compare April 22, 2025 15:35
@miguelgondu miguelgondu changed the title [MRG] [ENH] Adds a return_std_of_f flag for GPCs [MRG] [ENH] Exposes latent mean and variance for GPCs Apr 22, 2025
@miguelgondu miguelgondu force-pushed the gpc_return_std_flag branch from ad3e6aa to e8c8de2 Compare April 22, 2025 15:47
@miguelgondu
Contributor Author

miguelgondu commented Apr 22, 2025

Dear @noashin @adrinjalali @antoinebaker and @snath-xoc,

I have addressed the suggested changes. Now the GaussianProcessClassifier has a latent_mean_and_variance method that gives access to the values. I'm happy to change the name if need be. The method is now used internally by predict_proba as well.

For now, I only test whether the returned values have the right shape, that exceptions get raised when calling the method with more than two classes, and that the method also works on string kernels. Let me know if I should include any more tests. The validity should already be covered by previous tests on the GPC.

I'm also happy to clean up the git history by rebasing and squashing irrelevant commits.

Contributor

@antoinebaker antoinebaker left a comment


Thanks @miguelgondu for the PR, latent_mean_and_variance is a very good name I think.

A few formatting nitpicks, otherwise LGTM!

@miguelgondu miguelgondu force-pushed the gpc_return_std_flag branch from 49762dd to dec800a Compare April 24, 2025 14:17
@miguelgondu
Contributor Author

Thanks for the feedback, @antoinebaker ! I have addressed it.

Contributor

@snath-xoc snath-xoc left a comment


Hi @miguelgondu, thank you for the changes. I thought I would just add some nitpicks to ensure consistency in the naming of latent_mean and latent_var throughout the code (unless you have a good reason for calling them f_mean and var_f_star?).

@miguelgondu
Contributor Author

Hi @snath-xoc ,

Thanks for the feedback! I had left the previous names because of the notation in the book we cite. That being said, I agree with the consistency changes you proposed. They're now implemented!

Happy to address any other feedback. Let me know!

@snath-xoc
Contributor

LGTM once we get a second opinion now!

@antoinebaker
Contributor

LGTM too!

@miguelgondu miguelgondu force-pushed the gpc_return_std_flag branch from c51825e to 9403c59 Compare May 5, 2025 12:46
@miguelgondu
Contributor Author

@noashin and @adrinjalali . Sorry for the second ping. Let me know if I should provide any changes!

@adrinjalali adrinjalali changed the title [MRG] [ENH] Exposes latent mean and variance for GPCs ENH Exposes latent mean and variance for GPCs May 6, 2025
Member

@adrinjalali adrinjalali left a comment


Thank you very much. This seems quite straightforward now, and with the two reviews we have, I'm confident with the contribution.

@adrinjalali adrinjalali merged commit b55aba5 into scikit-learn:main May 6, 2025
37 checks passed
Successfully merging this pull request may close these issues.

Add a return_std_of_f kwarg to GPC's predict and predict_proba, just like the one GPR has

8 participants