ForK Library

buildkriging.f90 File Reference

Subroutine to Build Kriging Surface based on Function Values. More...

Go to the source code of this file.


subroutine buildkriging (ndim, ntot, X, Y, stot, H, beta, V, hyper, hyperflag, optflagi, covarflagi)
 This subroutine creates a Kriging surface based on the supplied training data X and Y, with the parameters H, hyperflag, optflagi, covarflagi using to specify details of the construction process. The creation of the Kriging model is governed by the following steps:

Detailed Description

Subroutine to Build Kriging Surface based on Function Values.

Definition in file buildkriging.f90.

Function Documentation

subroutine buildkriging ( integer,intent(in)  ndim,
integer,intent(in)  ntot,
real(8),dimension(ndim,ntot),intent(in)  X,
real(8),dimension(ntot),intent(in)  Y,
integer,intent(in)  stot,
real(8),dimension(stot,ntot),intent(in)  H,
real(8),dimension(stot),intent(out)  Beta,
real(8),dimension(ntot),intent(out)  V,
real(8),dimension(ndim+2),intent(out)  hyper,
integer,intent(in)  hyperflag,
integer,intent(in)  optflagi,
integer,intent(in)  covarflagi 

This subroutine creates a Kriging surface based on the supplied training data X and Y, with the parameters H, hyperflag, optflagi, covarflagi using to specify details of the construction process. The creation of the Kriging model is governed by the following steps:

  1. The fitting of model parameters (hyperparameters) based on maximization of the likelihood equation
  2. Creation of the Covariance Matrix
  3. Determination of the Optimal regression parameters
  4. Creation of the processed data V by inverting the covariance matrix and multiplying the difference between the training data and regression value by the inverse of the covariance matrix

The Kriging model is built on the assumption that the data Y obey a Gaussian process with an assumed form for the mean function and the covariance between data points:

\[ Y = N(m(\vec{x}), K(\vec{x},\vec{x}))\]

where \(m(\vec{x})\) is the mean function and \(K(\vec{x},\vec{x})\) represents the covariance between function values. For this work, a regression mean function is assumed. Using this form, the mean function has the following form:

\[ m(x) = h^{T}(\vec{x}) \beta \]

where \(h^{T}(\vec{x})\) represents a column vector containing the basis functions of the basis evaluated at the points \(\vec{x}\). The regression parameters \(\beta\) are treated as part of the Kriging model and are determined while constructioning the model. Using this form of the mean function yields a Universal Kriging model. The case of a \(p=0\) regression (where the vector \(h(\vec{x})\) reduces to unity) is referred to as ordinary Kriging and is also covered by this functional form. The assumption of a vague prior on the regression parameters gives the following closed form for the parameters:

\[ \beta=(H K^{-1} H^{T})^{-1} H^T K^{-1} Y = A^{-1} H^T K^{-1} Y \]

where \(K\) is the covariance matrix between the training data. For a Kriging model, the covariance between function values is assumed to be only a function of the distance between points. The multi-dimension covariance function is constructed using a tensor product of one dimension functions. The multi-dimension covariance is calculated in subroutine covarfunc. The elements in ths covariance matrix are given as:

\[ K_{i,j} = cov(y_{i},y_{j}) = \sigma^{2} k(\vec{X}_{i},\vec{X}_{j}; \theta) + \sigma^{2}_{n} \delta_{i,j} \]

The parameters \(\sigma\) and \(\theta\) (and in some cases \(\sigma_{n}\)) are denoted as hyperparameters and are determined maximizing the likelihood equation for the Kriging model. This likelihood gives the probability that a Gaussian process with specified hyperparamters describes the training data X and Y. By picking the hyperparameters that maximize this probability, a Kriging model that best describes the data can be constructed. The hyperparameters can be determined in two different ways in this code. The first way is based on the Mean square error estimate and is computed in likelihood_mle. For this method, the noise level \(\sigma_{n}\) is prescribed to ensure proper conditioning of the covariance matrix and the optimal covariance magnitude \(\sigma\) is determined in closed form. Hence, only the length scales \(\theta\) are determined through numerical optimization. This optimization is performed using a simplex search or pattern search. The second way of determining hyperparameters is based on the likelihood equation for a gaussian process with a vague prior on the regression parameters. This likelihood is computed in likelihood. Using this equation, optimization is used to determine all of the parameters, including the covariance magnitude \(\sigma\) and noise \(\sigma_{n}\). This way of determining hyperparameters should be used when the noise level of the function needs to be fitted.
With the regression and covariance parameters determined, the final processed data can be constructed using the inverse of the covariance matrix. To make predictions from the Kriging model, the following vector is required:

\[ V = K^{-1} (Y - H^{T} \beta) \]

where \( K \) is the covariance matrix, the product \(H^{T} \beta\) represents the mean function evaluated at the training points and \(Y\) represents the function values at the training points. Using this processed data, the regression parameters and covariance parameters, predictions can be made based on the Kriging model using:

Brian Lockwood
Department of Mechanical Engineering
University of Wyoming
May 1, 2012
in)ndim The dimension of the problem
in)ntot : The number of Training points
in)X : The location of the training points (size=[ndimxntot])
in)Y : Funcation values for the training points (size=[ntot])
in)stot : Number of Terms in the regression
in)H: The collocation matrix for the regression (size=[stotxntot])
out) beta: Regression coefficients based on the optimal estimate for the Kriging model (size=[stot])
out)hyper: Hyperparameters for the Kriging Model (size=[ndim+2])

  • hyper(1:ndim) correspond to the length scale used in each dimension ( \(\theta\) in the covariance functions)
  • hyper(ndim+1) is the magnitude of the covariance matrix
  • hyper(ndim+2) is the fitted noise for the function evaluations
out)V: Processed Training Data (size=[ntot])
In order to make predictions from the Kriging Model, the inverse of the covariance matrix onto the residual of the training data from the regression is all that is needed. To avoid re-computing and inverting the covariance matrix, the product is stored explicitly
in)hyperflagi: Flag to govern how the hyperparameters are selected

  • hyperflag=0 uses the likelihood formula to determine the length scale only
    The noise in the Function is hard-coded as a small value simply to ensure the covariance matrix is properly conditioned
    The magnitude for the covariance function is determined explicitly based on optimality for the likelihood function (Solve for magnitude when derivative of likelihood is set to zero)
  • hyperflag=1 uses the likelihood formula based on all hyperparameters with no parameters determined by explicit relation
    All hyperparameters including noise and magnitude are determined via the optimization. Required when the amount of noise in the function evaluations is to be fitted. However, the dimension of the optimization problem is now higher so this will be slower.
in)optflagi: Flag to govern the optimization algorithm used to determine the hyperparameters

  • optflag=0 Simplex Search (Nelder-Mead Method) to determine the hyperparameters
    Local optimization technique using simplex elements to determine search direction and line search to determine step sizes
    Fastest method but inherently sequential and may stall on local optimum
  • optflag=1 Pattern Search used to determine hyperparameters
    Global optimization method with fixed search directions (forward and backward in each direction)
    Each iteration requires \(2*ndim\) likelihood evaluations; however, these may be performed in parallel using openmp (Each thread performs own likelihood evaluation so OMP_STACK_SIZE must be set large enough)
in)covarflagi: Flag to govern which covariance function is used

  • covarflag==0 Uses Matern function with \(\nu=1/2\)
  • covarflag==1 Uses Matern function with \(\nu=3/2\)
  • covarflag==2 Uses Matern function with \(\nu=5/2\)

The parameter \(\nu\) governs the smoothness and differentiability of the covariance function. For function only Kriging, all three options are available.
Good rule of thumb is to use the least smooth ( \(\nu=1/2\)) unless there is a reason to assume more smoothness in the data
For small numbers of training points, higher \(\nu\) can improve accuracy

Return values:
betaRegression coefficients based on the optimal estimate for the Kriging model (size=[stot])
hyperHyperparameters for the Kriging Model (size=[ndim+2])
VProcessed Training Data (size=[ntot])

Definition at line 88 of file buildkriging.f90.

References covars::covarflag, choleskymod::invertsymmetric(), choleskymod::matrixmulttrans(), choleskymod::matvec(), choleskymod::matvectrans(), opt::optflag, choleskymod::symmatrixmulttrans(), choleskymod::symmatvec(), and choleskymod::symmetricsolve().

Referenced by krigingwrapper().

 All Classes Namespaces Files Functions Variables