Description
Describe the bug
In lines 425-426 of sklearn/gaussian_process/_gpr.py (inside the predict method), y_var is modified in place:
# Compute variance of predictive distribution
# Use einsum to avoid explicitly forming the large matrix
# V^T @ V just to extract its diagonal afterward.
y_var = self.kernel_.diag(X)
y_var -= np.einsum("ij,ji->i", V.T, V)
If a kernel (e.g. a custom kernel) is used whose diag(X) returns a view into X (here X[:, 0] == var == diag(X)), this in-place subtraction modifies the caller's original X array.
So after the call, X[:, 0] holds y_var - np.einsum("ij,ji->i", V.T, V) instead of its original values.
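To make the mechanism concrete, here is a minimal NumPy-only sketch (not the scikit-learn code itself, just an illustration of how an in-place subtraction on a view writes through to the caller's array):

import numpy as np

X = np.arange(6.0).reshape(3, 2)   # stands in for the prediction inputs
y_var = X[:, 0]                    # a view, like the custom diag(X) below returns
y_var -= np.ones(3)                # in-place op also overwrites X[:, 0]
print(X[:, 0])                     # [-1.  1.  3.] -- the caller's X has changed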
A simple but dirty workaround when building a custom kernel is to always make and return a copy: var = np.copy(X[:, 0]).
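In the custom kernel that workaround looks like this (a sketch; it simply returns a copied column instead of a view):

def diag(self, X):
    # return a copy so that in-place ops by the caller cannot touch X
    return np.copy(X[:, 0])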
But because this error is hard to find and debug, the documentation does not state that diag should return a copy, and users expect that a method call will not modify its input, I suggest the following fix:
FIX
Just replace the in-place modification: y_var -= np.einsum("ij,ji->i", V.T, V)
with the creation of a new array: y_var = y_var - np.einsum("ij,ji->i", V.T, V)
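For context, the relevant block in predict would then read as follows (a sketch against scikit-learn 1.1.1; V and y_var are the names used in _gpr.py):

# Compute variance of predictive distribution
y_var = self.kernel_.diag(X)
# out-of-place subtraction: never writes into whatever diag(X) returned
y_var = y_var - np.einsum("ij,ji->i", V.T, V)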
Steps/Code to Reproduce
import numpy as np

from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Hyperparameter, Kernel

min_x = 0
max_x = 50
std = 0.2
stop_time = 50
nr_plot_points = 20
number_of_train_points = 5


class MinT(Kernel):
    def __init__(self, sigma_0=1.0, sigma_0_bounds=(0.01, 10)):
        self.sigma_0 = sigma_0
        self.sigma_0_bounds = sigma_0_bounds

    @property
    def hyperparameter_sigma_0(self):
        return Hyperparameter("sigma_0", "numeric", self.sigma_0_bounds)

    def __call__(self, X, Y=None, eval_gradient=False):
        """Return the kernel k(X, Y) and optionally its gradient.

        Parameters
        ----------
        X : ndarray of shape (n_samples_X, n_features)
            Left argument of the returned kernel k(X, Y)

        Y : ndarray of shape (n_samples_Y, n_features), default=None
            Right argument of the returned kernel k(X, Y). If None, k(X, X)
            is evaluated instead.

        eval_gradient : bool, default=False
            Determines whether the gradient with respect to the log of
            the kernel hyperparameter is computed.
            Only supported when Y is None.

        Returns
        -------
        K : ndarray of shape (n_samples_X, n_samples_Y)
            Kernel k(X, Y)

        K_gradient : ndarray of shape (n_samples_X, n_samples_X, n_dims), optional
            The gradient of the kernel k(X, X) with respect to the log of the
            hyperparameter of the kernel. Only returned when `eval_gradient`
            is True.
        """
        X = np.atleast_2d(X)
        ones_x = np.ones_like(X)
        if Y is None:
            Kc = X * ones_x.T
            Kr = ones_x * X.T
            Kcr = np.concatenate((Kc[..., None], Kr[..., None]), axis=-1)
            K = np.min(Kcr, axis=-1)
        else:
            if eval_gradient:
                raise ValueError("Gradient can only be evaluated when Y is None.")
            ones_y = np.ones_like(Y)
            Kc = X * ones_y.T
            Kr = ones_x * Y.T
            Kcr = np.concatenate((Kc[..., None], Kr[..., None]), axis=-1)
            K = np.min(Kcr, axis=-1)

        if eval_gradient:
            if not self.hyperparameter_sigma_0.fixed:
                K_gradient = np.empty((K.shape[0], K.shape[1], 1))
                K_gradient[..., 0] = self.sigma_0
                return K, K_gradient
            else:
                return K, np.empty((X.shape[0], X.shape[0], 0))
        else:
            return K

    def diag(self, X):
        """Returns the diagonal of the kernel k(X, X).

        The result of this method is identical to np.diag(self(X)); however,
        it can be evaluated more efficiently since only the diagonal is
        evaluated.

        Parameters
        ----------
        X : ndarray of shape (n_samples_X, n_features)
            Left argument of the returned kernel k(X, Y).

        Returns
        -------
        K_diag : ndarray of shape (n_samples_X,)
            Diagonal of kernel k(X, X).
        """
        # Returns a view into X; np.copy(X[:, 0]) would avoid the bug.
        return X[:, 0]

    def is_stationary(self):
        """Returns whether the kernel is stationary."""
        return False

    def __repr__(self):
        return "{0}(sigma_0={1:.3g})".format(self.__class__.__name__, self.sigma_0)


rng = np.random.RandomState(None)
t_train = np.linspace(0, stop_time, num=number_of_train_points)
y_train = rng.normal(0, t_train * std)

gpr_min_t = GaussianProcessRegressor(kernel=MinT(), random_state=None, n_restarts_optimizer=5)
gpr_min_t.fit(t_train.reshape(-1, 1), y_train)

x = np.linspace(min_x, max_x, nr_plot_points)
X = x.reshape(-1, 1)
X_copy = np.copy(X)

y_mean, y_std = gpr_min_t.predict(X, return_std=True)

assert np.all(X_copy == X)
Expected Results
The assertion should hold: assert np.all(X_copy == X)
Actual Results
The assertion fails because X is changed in place by the scikit-learn code; after predict, X[:, 0] is equal to:
y_var - np.einsum("ij,ji->i", V.T, V)
Versions
System:
python: 3.8.10 (default, Jun 22 2022, 20:18:18) [GCC 9.4.0]
executable: .venv/bin/python
machine: Linux-5.15.0-46-generic-x86_64-with-glibc2.29
Python dependencies:
sklearn: 1.1.1
pip: 22.2.1
setuptools: 63.2.0
numpy: 1.23.1
scipy: 1.8.0
Cython: None
pandas: None
matplotlib: 3.5.2
joblib: 1.1.0
threadpoolctl: 3.1.0
Built with OpenMP: True
threadpoolctl info:
user_api: openmp
internal_api: openmp
prefix: libgomp
filepath: .venv/lib/python3.8/site-packages/scikit_learn.libs/libgomp-a34b3233.so.1.0.0
version: None
num_threads: 16
user_api: blas
internal_api: openblas
prefix: libopenblas
filepath: .venv/lib/python3.8/site-packages/numpy.libs/libopenblas64_p-r0-742d56dc.3.20.so
version: 0.3.20
threading_layer: pthreads
architecture: Zen
num_threads: 16
user_api: blas
internal_api: openblas
prefix: libopenblas
filepath: .venv/lib/python3.8/site-packages/scipy.libs/libopenblasp-r0-8b9e111f.3.17.so
version: 0.3.17
threading_layer: pthreads
architecture: Zen
num_threads: 16