Kernels

Kernels Reference

GpABC functions and types for working with kernels.

Index

Types and Functions

AbstractGPKernel

Abstract kernel type. User-defined kernels should derive from it.

Implementations have to provide methods for get_hyperparameters_size and covariance. Methods for covariance_training, covariance_diagonal and covariance_grad are optional.

source
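As a sketch of this contract, consider a hypothetical white-noise kernel that implements only the two required methods. This is illustrative, not part of GpABC: the type name `MyWhiteNoiseKernel` and its single hyperparameter are assumptions, and the example assumes GpABC is installed.

```julia
using GpABC

# Hypothetical kernel type — a sketch only, not part of GpABC
struct MyWhiteNoiseKernel <: AbstractGPKernel end

# One hyperparameter: the signal standard deviation σ_f
function GpABC.get_hyperparameters_size(ker::MyWhiteNoiseKernel,
        training_data::AbstractArray{Float64, 2})
    return 1
end

# White-noise covariance: σ_f² where the points coincide, 0 elsewhere
function GpABC.covariance(ker::MyWhiteNoiseKernel, log_theta::AbstractArray{Float64, 1},
        x::AbstractArray{Float64, 2}, z::AbstractArray{Float64, 2})
    sigma_f_sq = exp(2 * log_theta[1])
    return [xi == zi ? sigma_f_sq : 0.0 for xi in eachrow(x), zi in eachrow(z)]
end
```

Since `get_hyperparameters_size` and `covariance` are defined as methods of the GpABC functions (note the `GpABC.` qualifier), the rest of the package dispatches to them automatically.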
covariance(ker::AbstractGPKernel, log_theta::AbstractArray{Float64, 1},
    x::AbstractArray{Float64, 2}, z::AbstractArray{Float64, 2})

Return the covariance matrix. Should be overridden by kernel implementations.

Arguments

  • ker: The kernel object. Implementations must override with their own subtype.
  • log_theta: natural logarithm of hyperparameters.
  • x, z: Input data, reshaped into 2-d arrays. x must have dimensions $n \times d$; z must have dimensions $m \times d$.

Return

The covariance matrix, of size $n \times m$.

source
covariance_diagonal(ker::AbstractGPKernel, log_theta::AbstractArray{Float64, 1},
    x::AbstractArray{Float64, 2})

This is a speedup version of covariance, which is invoked if the caller is not interested in the entire covariance matrix, but only needs the variance, i.e. the diagonal of the covariance matrix.

The default method just returns diag(covariance(...)), with x === z. Kernel implementations can optionally override it for better performance, by not computing the off-diagonal elements of the covariance matrix.

See covariance for description of arguments.

Return

The 1-d array of variances, of size size(x, 1).

source
covariance_grad(ker::AbstractGPKernel, log_theta::AbstractArray{Float64, 1},
    x::AbstractArray{Float64, 2}, R::AbstractArray{Float64, 2})

Return the gradient of the covariance function with respect to the logarithms of hyperparameters, based on the provided direction matrix.

This function can be optionally overridden by kernel implementations. If the gradient function is not provided, gp_train will fall back to the NelderMead algorithm by default.

Arguments

  • ker: The kernel object. Implementations must override with their own subtype.
  • log_theta: natural logarithm of hyperparameters
  • x: Training data, reshaped into a 2-d array. x must have dimensions $n \times d$.
  • R: the direction matrix, $n \times n$
\[R = \frac{1}{\sigma_n^2}(\alpha * \alpha^T - K^{-1}); \alpha = K^{-1}y\]

Return

A vector of size length(log_theta), whose $j$'th element is equal to

\[tr(R \frac{\partial K}{\partial \eta_j})\]
source
covariance_training(ker::AbstractGPKernel, log_theta::AbstractArray{Float64, 1},
    training_x::AbstractArray{Float64, 2})

This is a speedup version of covariance, which is only called during the training sequence. Intermediate matrices computed in this function for particular hyperparameters can be cached and reused subsequently, either in this function or in covariance_grad.

The default method just delegates to covariance with x === z. Kernel implementations can optionally override it for better performance.

See covariance for description of arguments and return values.

source
get_hyperparameters_size(kernel::AbstractGPKernel, training_data::AbstractArray{Float64, 2})

Return the number of hyperparameters used by this kernel on this training data set. Should be overridden by kernel implementations.

source
MaternArdKernel <: AbstractGPKernel

Matérn kernel with a distinct length scale for each dimension, $l_k$. The parameter $\nu$ (nu) is passed in the constructor. Currently, only values of $\nu=1$, $\nu=3$ and $\nu=5$ are supported.

\[\begin{aligned} K_{\nu=1}(r) &= \sigma_f^2e^{-\sqrt{r}}\\ K_{\nu=3}(r) &= \sigma_f^2(1 + \sqrt{3r})e^{-\sqrt{3r}}\\ K_{\nu=5}(r) &= \sigma_f^2(1 + \sqrt{5r} + \frac{5}{3}r)e^{-\sqrt{5r}}\\ r_{ij} &= \sum_{k=1}^d\frac{(x_{ik}-z_{jk})^2}{l_k^2} \end{aligned}\]

$r_{ij}$ are computed by scaled_squared_distance

Hyperparameters

The length of hyperparameters array for this kernel depends on the dimensionality of the data. Assuming each data point is a vector in a $d$-dimensional space, this kernel needs $d+1$ hyperparameters, in the following order:

  1. $\sigma_f$: the signal standard deviation
  2. $l_1, \ldots, l_d$: the length scales for each dimension
source
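As an illustration, the $\nu=3$ case above can be written out directly in plain Julia. This is a standalone sketch of the displayed formula, not GpABC's implementation; `matern3_cov` and all variable names are illustrative.

```julia
# Standalone sketch of the ν = 3 Matérn ARD formula; not GpABC's implementation.
function matern3_cov(sigma_f::Float64, ell::Vector{Float64},
                     x::Matrix{Float64}, z::Matrix{Float64})
    K = zeros(size(x, 1), size(z, 1))
    for i in axes(x, 1), j in axes(z, 1)
        # r_ij = Σ_k (x_ik - z_jk)² / l_k²  — the scaled squared distance
        r = sum(((x[i, k] - z[j, k]) / ell[k])^2 for k in axes(x, 2))
        K[i, j] = sigma_f^2 * (1 + sqrt(3r)) * exp(-sqrt(3r))
    end
    return K
end
```

At zero distance ($r = 0$) the expression reduces to $\sigma_f^2$, and it decays monotonically as $r$ grows.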
MaternIsoKernel <: AbstractGPKernel

Matérn kernel with a uniform length scale across all dimensions, $l$. The parameter $\nu$ (nu) is passed in the constructor. Currently, only values of $\nu=1$, $\nu=3$ and $\nu=5$ are supported.

\[\begin{aligned} K_{\nu=1}(r) &= \sigma_f^2e^{-\sqrt{r}}\\ K_{\nu=3}(r) &= \sigma_f^2(1 + \sqrt{3r})e^{-\sqrt{3r}}\\ K_{\nu=5}(r) &= \sigma_f^2(1 + \sqrt{5r} + \frac{5}{3}r)e^{-\sqrt{5r}}\\ r_{ij} &= \sum_{k=1}^d\frac{(x_{ik}-z_{jk})^2}{l^2} \end{aligned}\]

$r_{ij}$ are computed by scaled_squared_distance

Hyperparameters

Hyperparameters vector for this kernel must contain two elements, in the following order:

  1. $\sigma_f$: the signal standard deviation
  2. $l$: the length scale
source
SquaredExponentialArdKernel <: AbstractGPKernel

Squared exponential kernel with a distinct length scale for each dimension, $l_k$.

\[\begin{aligned} K(r) & = \sigma_f^2 e^{-r/2} \\ r_{ij} & = \sum_{k=1}^d\frac{(x_{ik}-z_{jk})^2}{l_k^2} \end{aligned}\]

$r_{ij}$ are computed by scaled_squared_distance

Hyperparameters

The length of hyperparameters array for this kernel depends on the dimensionality of the data. Assuming each data point is a vector in a $d$-dimensional space, this kernel needs $d+1$ hyperparameters, in the following order:

  1. $\sigma_f$: the signal standard deviation
  2. $l_1, \ldots, l_d$: the length scales for each dimension
source
SquaredExponentialIsoKernel <: AbstractGPKernel

Squared exponential kernel with uniform length scale across all dimensions, $l$.

\[\begin{aligned} K(r) & = \sigma_f^2 e^{-r/2} \\ r_{ij} & = \sum_{k=1}^d\frac{(x_{ik}-z_{jk})^2}{l^2} \end{aligned}\]

$r_{ij}$ are computed by scaled_squared_distance

Hyperparameters

Hyperparameters vector for this kernel must contain two elements, in the following order:

  1. $\sigma_f$: the signal standard deviation
  2. $l$: the length scale
source
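To illustrate, the formula above can be evaluated directly in plain Julia. This is a standalone sketch, not GpABC's implementation; `se_iso_cov` is an illustrative name.

```julia
# Standalone sketch of the isotropic squared exponential formula.
function se_iso_cov(sigma_f::Float64, ell::Float64,
                    x::Matrix{Float64}, z::Matrix{Float64})
    # K_ij = σ_f² exp(-r_ij / 2), with r_ij = ‖x_i - z_j‖² / l²
    return [sigma_f^2 * exp(-sum((xi .- zi) .^ 2) / (2 * ell^2))
            for xi in eachrow(x), zi in eachrow(z)]
end
```

Note the familiar $e^{-d^2/(2l^2)}$ form: the $r/2$ in the exponent folds the factor of 2 into the scaled squared distance.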
ExponentialArdKernel

Alias for MaternArdKernel(1)

source
ExponentialIsoKernel

Alias for MaternIsoKernel(1)

source
scaled_squared_distance(log_ell::AbstractArray{Float64, 1},
    x::AbstractArray{Float64, 2}, z::AbstractArray{Float64, 2})

Compute the scaled squared distance between x and z:

\[r_{ij} = \sum_{k=1}^d\frac{(x_{ik}-z_{jk})^2}{l_k^2}\]

The gradient of this function with respect to length scale hyperparameter(s) is returned by scaled_squared_distance_grad.

Arguments

  • x, z: Input data, reshaped into 2-d arrays. x must have dimensions $n \times d$; z must have dimensions $m \times d$.
  • log_ell: logarithm of length scale(s). Can either be an array of size one (isotropic), or an array of size d (ARD)

Return

An $n \times m$ matrix of scaled squared distances

source
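A standalone plain-Julia sketch of this computation, covering both the isotropic and the ARD case (`ssd` is an illustrative name, not GpABC's exported function):

```julia
# Standalone sketch of the scaled squared distance r_ij = Σ_k (x_ik - z_jk)² / l_k².
function ssd(log_ell::Vector{Float64}, x::Matrix{Float64}, z::Matrix{Float64})
    d = size(x, 2)
    # One shared length scale (isotropic) or one per dimension (ARD)
    ell = length(log_ell) == 1 ? fill(exp(log_ell[1]), d) : exp.(log_ell)
    return [sum(((x[i, k] - z[j, k]) / ell[k])^2 for k in 1:d)
            for i in axes(x, 1), j in axes(z, 1)]
end
```

For example, with unit length scales the scaled squared distance between $(0,0)$ and $(3,4)$ is simply the squared Euclidean distance, $25$.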
scaled_squared_distance_grad(log_ell::AbstractArray{Float64, 1},
    x::AbstractArray{Float64, 2}, z::AbstractArray{Float64, 2}, R::AbstractArray{Float64, 2})

Return the gradient of the scaled_squared_distance function with respect to the logarithms of length scales, based on the provided direction matrix.

Arguments

  • x, z: Input data, reshaped into 2-d arrays. x must have dimensions $n \times d$; z must have dimensions $m \times d$.
  • log_ell: logarithm of length scale(s). Can either be an array of size one (isotropic), or an array of size d (ARD)
  • R: the direction matrix, $n \times m$. This can be used to compute the gradient of a function that depends on scaled_squared_distance via the chain rule.

Return

A vector of size length(log_ell), whose $k$'th element is equal to

\[\text{tr}\left(R^T \frac{\partial r}{\partial \eta_k}\right), \quad \eta_k = \log(l_k)\]
source
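The role of the direction matrix can be sketched in plain Julia and verified against a central finite difference. All names here are illustrative, not GpABC's internals, and the sketch assumes the ARD case, with one log length scale per dimension. Writing $\eta_k = \log(l_k)$, we have $r_{ij}(\eta) = \sum_k (x_{ik}-z_{jk})^2 e^{-2\eta_k}$, so $\partial r_{ij}/\partial \eta_k = -2 (x_{ik}-z_{jk})^2 e^{-2\eta_k}$:

```julia
# r_ij(η) = Σ_k (x_ik - z_jk)² e^{-2η_k}, where η_k = log(l_k)
ssd(log_ell, x, z) = [sum((x[i, k] - z[j, k])^2 * exp(-2 * log_ell[k])
                          for k in axes(x, 2))
                      for i in axes(x, 1), j in axes(z, 1)]

# k'th gradient component: Σ_ij R_ij ∂r_ij/∂η_k,
# with ∂r_ij/∂η_k = -2 (x_ik - z_jk)² e^{-2η_k}
ssd_grad(log_ell, x, z, R) =
    [sum(R[i, j] * (-2) * (x[i, k] - z[j, k])^2 * exp(-2 * log_ell[k])
         for i in axes(x, 1), j in axes(z, 1))
     for k in eachindex(log_ell)]
```

Each component of the result equals the derivative of $\sum_{ij} R_{ij} r_{ij}$ with respect to the corresponding $\eta_k$, which is easy to check numerically by perturbing one $\eta_k$ at a time.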