FVGP Single Task Notebook

In this notebook we will go through a few features of fvGP. We will be primarily concerned with regression over a single-dimensional input space and a single task. See multiple_task_test_notebook.ipynb for a single-dimensional input, multiple-task example. The extension to multiple input dimensions is straightforward.

Import fvgp and relevant libraries

import fvgp
from fvgp import gp
import numpy as np
import matplotlib.pyplot as plt

Defining some input data and testing points

def function(x):
    return np.sin(1.1 * x) + np.cos(0.5 * x)

x_data = np.linspace(-2 * np.pi, 10 * np.pi, 50).reshape(-1, 1)
y_data = function(x_data)
x_pred = np.linspace(-2 * np.pi, 10 * np.pi, 1000)
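A quick sanity check of the shapes produced above (the training inputs are a column vector, the prediction points a flat array we reshape later):

print(x_data.shape, y_data.shape, x_pred.shape)   # (50, 1) (50, 1) (1000,)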

Setting up the fvgp single task object

NOTE: The input data must have shape (N x input_space_dim). The output can be either an N array or an N x 1 array, where N is the number of data points. See help(gp.GP) for more information.

obj = gp.GP(1, x_data, y_data, init_hyperparameters=np.array([10, 10]), use_inv=False)
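The first argument is the input-space dimensionality, so the multi-dimensional extension mentioned above only changes the shapes involved. A minimal sketch, assuming the default anisotropic kernel whose hyperparameters are one signal variance followed by one length scale per input dimension:

# Sketch only: a 2d-input, single-task GP. The hyperparameter layout
# (signal variance + one length scale per dimension) assumes the
# default anisotropic kernel.
x_data_2d = np.random.uniform(-1, 1, size=(50, 2))            # (N x 2) inputs
y_data_2d = np.sin(x_data_2d[:, 0]) + np.cos(x_data_2d[:, 1])
obj_2d = gp.GP(2, x_data_2d, y_data_2d,
               init_hyperparameters=np.array([10., 10., 10.]))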

Training our Gaussian process regression on the given data

hyper_param_bounds = np.array([[0.001, 5.], [0.001, 100]])
# this call blocks the main thread, even with method="hgdl"; other method options are "global" and "local"
obj.train(hyper_param_bounds, method="hgdl")
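For reference, the same training call with one of the other optimizers named in the comment above:

# Alternative: a purely local optimizer (typically faster, but may settle in a local optimum)
obj.train(hyper_param_bounds, method="local")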

Looking at the posterior mean at the test points

post_mean = obj.posterior_mean(x_pred.reshape(-1, 1))["f(x)"]
post_var = obj.posterior_covariance(x_pred.reshape(-1, 1))["v(x)"]
plt.plot(x_pred, post_mean, label='gp interpolation')
plt.scatter(x_data, y_data, label='data')
plt.plot(x_pred, function(x_pred), label='ground truth')
plt.fill_between(x_pred, post_mean + 3.0 * np.sqrt(post_var), post_mean - 3.0 * np.sqrt(post_var), color='grey', alpha=0.5)
plt.legend()
[Figure: GP posterior mean with training data, ground truth, and 3-sigma confidence band]

Training Asynchronously

obj = gp.GP(1, x_data, y_data, init_hyperparameters=np.array([10, 10]),
            variances=np.zeros(y_data.reshape(-1, 1).shape))
hyper_param_bounds = np.array([[0.0001, 100], [0.0001, 100]])
async_obj = obj.train_async(hyper_param_bounds)

Updating asynchronously

This sets the hyperparameters to the best values found so far by the running optimization.

obj.update_hyperparameters(async_obj)
array([10, 10])

Killing training

obj.kill_training(async_obj)
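Putting the three calls together, a minimal sketch of the full asynchronous workflow; the number of iterations and the sleep duration are arbitrary placeholders:

import time

async_obj = obj.train_async(hyper_param_bounds)    # start background training
for _ in range(5):
    time.sleep(2)                                  # let the optimizer work
    hps = obj.update_hyperparameters(async_obj)    # adopt the current best values
    print("current hyperparameters:", hps)
obj.kill_training(async_obj)                       # stop the background run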

Looking at the posterior mean at the test points

post_mean = obj.posterior_mean(x_pred.reshape(-1, 1))["f(x)"]
post_var = obj.posterior_covariance(x_pred.reshape(-1, 1))["v(x)"]  # recompute for the current model
plt.plot(x_pred, post_mean, label='interpolation')
plt.scatter(x_data, y_data, label='data')
plt.plot(x_pred, function(x_pred), label='ground truth')
plt.fill_between(x_pred, post_mean + 3.0 * np.sqrt(post_var), post_mean - 3.0 * np.sqrt(post_var), color='grey', alpha=0.5)
plt.legend()
[Figure: GP posterior mean with training data, ground truth, and 3-sigma confidence band]

Custom Kernels

def kernel_l1(x1, x2, hp, obj):
    # standard anisotropic exponential kernel using an l1 distance in the input space
    d1 = np.abs(np.subtract.outer(x1[:, 0], x2[:, 0]))
    return hp[0] * np.exp(-d1 / hp[1])

obj = gp.GP(1, x_data, y_data,
            init_hyperparameters=np.array([10, 10]),
            variances=np.zeros(y_data.shape),
            gp_kernel_function=kernel_l1)
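Any function with this signature can serve as the kernel, as long as it returns a valid covariance matrix. As another sketch (untested here), a periodic kernel with three hyperparameters; using it would require a third entry in hyper_param_bounds:

def kernel_periodic(x1, x2, hp, obj):
    # Sketch of a 1d periodic kernel: hp[0] = signal variance,
    # hp[1] = length scale, hp[2] = period.
    d = np.abs(np.subtract.outer(x1[:, 0], x2[:, 0]))
    return hp[0] * np.exp(-2.0 * np.sin(np.pi * d / hp[2])**2 / hp[1]**2)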

Training our Gaussian process regression on the given data

hyper_param_bounds = np.array([[0.0001, 1000], [0.0001, 1000]])
obj.train(hyper_param_bounds)

Looking at the posterior mean at the test points

post_mean = obj.posterior_mean(x_pred.reshape(-1, 1))["f(x)"]
post_var = obj.posterior_covariance(x_pred.reshape(-1, 1))["v(x)"]  # recompute for the current model
plt.plot(x_pred, post_mean, label='interpolation')
plt.scatter(x_data, y_data, label='data')
plt.plot(x_pred, function(x_pred), label='ground truth')
plt.fill_between(x_pred, post_mean + 3.0 * np.sqrt(post_var), post_mean - 3.0 * np.sqrt(post_var), color='grey', alpha=0.5)
plt.legend()
[Figure: GP posterior mean with the custom l1 kernel, training data, ground truth, and 3-sigma confidence band]

Prior Mean Functions

NOTE: The prior mean function must return a 1d vector, e.g., (100,)

def example_mean(x, hyperparameters, gp_obj):
    return np.ones(len(x))

obj = gp.GP(1, x_data, y_data, init_hyperparameters=np.array([10, 10]),
            variances=np.zeros(y_data.shape),
            gp_mean_function=example_mean)
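Any function with this signature and a 1d return value works. For example, a sketch of a prior mean pinned to the average of the training values (y_data is captured from the surrounding scope here):

def data_mean(x, hyperparameters, gp_obj):
    # constant prior mean at the mean of the observed data;
    # note the required 1d return shape (len(x),)
    return np.full(len(x), np.mean(y_data))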

Training our Gaussian process regression on the given data

hyper_param_bounds = np.array([[0.0001, 1000], [0.0001, 1000]])
obj.train(hyper_param_bounds)

Looking at the posterior mean at the test points

post_mean = obj.posterior_mean(x_pred.reshape(-1, 1))["f(x)"]
post_var = obj.posterior_covariance(x_pred.reshape(-1, 1))["v(x)"]  # recompute for the current model
plt.plot(x_pred, post_mean, label='interpolation')
plt.scatter(x_data, y_data, label='data')
plt.plot(x_pred, function(x_pred), label='ground truth')
plt.fill_between(x_pred, post_mean + 3.0 * np.sqrt(post_var), post_mean - 3.0 * np.sqrt(post_var), color='grey', alpha=0.5)
plt.legend()
[Figure: GP posterior mean with the custom prior mean function, training data, ground truth, and 3-sigma confidence band]