
Training hyperparameters for multidimensional Gaussian process regression

Here is a simple working example where I use Gaussian process regression (GPR) in Python's scikit-learn with 2-dimensional inputs (i.e., a grid over x1 and x2) and 1-dimensional outputs (y).

import numpy as np
from matplotlib import pyplot as plt 
from sklearn.gaussian_process import GaussianProcessRegressor 
from sklearn.gaussian_process.kernels import RBF, ConstantKernel as C
from mpl_toolkits.mplot3d import Axes3D

# Example independent variables (observations): a grid over (x1, x2)
X = np.array([[0., 0.], [1., 0.], [2., 0.], [3., 0.], [4., 0.],
              [5., 0.], [6., 0.], [7., 0.], [8., 0.], [9., 0.], [10., 0.],
              [11., 0.], [12., 0.], [13., 0.], [14., 0.],
              [0., 1.], [1., 1.], [2., 1.], [3., 1.], [4., 1.],
              [5., 1.], [6., 1.], [7., 1.], [8., 1.], [9., 1.], [10., 1.],
              [11., 1.], [12., 1.], [13., 1.], [14., 1.],
              [0., 2.], [1., 2.], [2., 2.], [3., 2.], [4., 2.],
              [5., 2.], [6., 2.], [7., 2.], [8., 2.], [9., 2.], [10., 2.],
              [11., 2.], [12., 2.], [13., 2.], [14., 2.]])

# Example dependent variable (observations) - noiseless case
y = np.array([4.0, 3.98, 4.01, 3.95, 3.9, 3.84, 3.8,
              3.73, 2.7, 1.64, 0.62, 0.59, 0.3,
              0.1, 0.1,
              4.4, 3.9, 4.05, 3.9, 3.5, 3.4, 3.3,
              3.23, 2.6, 1.6, 0.6, 0.5, 0.32,
              0.05, 0.02,
              4.0, 3.86, 3.88, 3.76, 3.6, 3.4, 3.2,
              3.13, 2.5, 1.6, 0.55, 0.51, 0.23,
              0.11, 0.01])

# Prediction grid over (x1, x2)
x1 = np.linspace(0, 14, 20)
x2 = np.linspace(0, 5, 100)

# Build all (x1, x2) pairs; vectorized equivalent of two nested loops
X1, X2 = np.meshgrid(x1, x2, indexing='ij')
inputs_x_array = np.column_stack([X1.ravel(), X2.ravel()])

# Instantiate a Gaussian process model with a single (isotropic) length scale
kernel = C(1.0, (1e-3, 1e3)) * RBF(10.0, (1e-2, 1e2))
gp = GaussianProcessRegressor(kernel=kernel, n_restarts_optimizer=20)

gp.fit(X, y.reshape(-1, 1))  # removing reshape results in a different error

y_pred, sigma = gp.predict(inputs_x_array, return_std=True)
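
For reference, the fitted hyperparameters can be inspected after training:

print(gp.kernel_)  # optimized kernel; with the RBF above, one length scale shared by x1 and x2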

It works, but when defining the kernel, how can I set different hyperparameters (e.g. different length scales) for the different inputs (i.e. x1 and x2)? In the example above, the standard kernel used is a radial basis function (RBF), which appears to have a single length scale despite there being two input dimensions. How could this kernel (or a custom kernel, e.g. a hyperbolic tangent) be trained to account for different hyperparameters across the input dimensions?

asked Feb 09 '19 by Mathews24

People also ask

What are the Hyperparameters of Gaussian process?

The hyperparameters in a Gaussian process regression (GPR) model with a specified kernel are often estimated from the data via the maximum marginal likelihood. Due to the non-convexity of the marginal likelihood with respect to the hyperparameters, the optimization may not converge to the global maximum.
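
In scikit-learn, this is the motivation for the n_restarts_optimizer argument of GaussianProcessRegressor: restarting the optimizer from several random initializations reduces the risk of ending in a poor local maximum. A minimal sketch (the dataset here is made up purely for illustration):

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

# Toy data, purely for illustration
rng = np.random.RandomState(0)
X = rng.uniform(0, 10, size=(30, 2))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1]

# Each restart re-optimizes the marginal likelihood from a random start
gp = GaussianProcessRegressor(kernel=RBF(1.0), n_restarts_optimizer=10)
gp.fit(X, y)

print(gp.log_marginal_likelihood_value_)  # best value found across restarts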

Is Kriging the same as Gaussian process regression?

In statistics, originally in geostatistics, kriging or Kriging, also known as Gaussian process regression, is a method of interpolation based on a Gaussian process governed by prior covariances. Under suitable assumptions on the prior, kriging gives the best linear unbiased prediction (BLUP) at unsampled locations.

What is GPR regression?

Gaussian process regression (GPR) is a nonparametric, Bayesian approach to regression that is making waves in the area of machine learning. GPR has several benefits, working well on small datasets and having the ability to provide uncertainty measurements on the predictions.

Is GPR machine learning?

Gaussian Process Regression (GPR) is a remarkably powerful class of machine learning algorithms that, in contrast to many of today's state-of-the-art machine learning models, relies on few parameters to make predictions.


1 Answer

You'll need anisotropic kernels, which only a few kernels in sklearn support for the moment. RBF is one such example: you can pass a list as the length_scale parameter. For example, RBF(length_scale=[1, 10], length_scale_bounds=(1e-5, 1e5)) is perfectly valid, where the length scale 1 applies to x1 and 10 applies to x2.
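
Applied to the question's setup, a minimal sketch could look like the following (the toy data stands in for the question's X and y, and the initial length scales [1, 10] are arbitrary starting values that the optimizer tunes during fitting):

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel as C

# Toy 2-D data standing in for the question's X and y
rng = np.random.RandomState(0)
X = rng.uniform([0., 0.], [14., 2.], size=(45, 2))
y = np.sin(X[:, 0] / 3.0) + 0.1 * X[:, 1]

# Anisotropic RBF: one length scale per input dimension
kernel = C(1.0, (1e-3, 1e3)) * RBF(length_scale=[1.0, 10.0],
                                   length_scale_bounds=(1e-5, 1e5))
gp = GaussianProcessRegressor(kernel=kernel, n_restarts_optimizer=20)
gp.fit(X, y)

print(gp.kernel_)  # reports a separate fitted length scale for x1 and x2

After fitting, gp.kernel_ shows the two optimized length scales, which will typically differ when the function varies at different rates along x1 and x2.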

Most kernels in sklearn, however, are isotropic, and the anisotropic case is currently not supported for them. If you want more freedom, I suggest you take a look at other packages (like GPy), or you can always try to implement your own anisotropic kernel.
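
As an example of the GPy route, a minimal sketch (assuming GPy is installed; ARD=True gives the RBF kernel one length scale per input dimension):

import numpy as np
import GPy

# Toy data, purely for illustration
rng = np.random.RandomState(0)
X = rng.uniform(0, 14, size=(45, 2))
y = np.sin(X[:, 0] / 3.0).reshape(-1, 1)  # GPy expects 2-D targets

kernel = GPy.kern.RBF(input_dim=2, ARD=True)
model = GPy.models.GPRegression(X, y, kernel)
model.optimize()

print(kernel.lengthscale)  # one optimized length scale per input dimension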

answered Sep 18 '22 by Riley