What is Empirical Bayesian Kriging?

Introduction

Empirical Bayesian kriging (EBK) is a geostatistical interpolation method that automates the most difficult aspects of building a valid kriging model. Other kriging methods in Geostatistical Analyst require you to manually adjust parameters to obtain accurate results, but EBK automatically calculates these parameters through a process of subsetting and simulations.

Empirical Bayesian kriging also differs from other kriging methods by accounting for the error introduced by estimating the underlying semivariogram. Other kriging methods calculate the semivariogram from known data locations and use this single semivariogram to make predictions at unknown locations; this process implicitly assumes that the estimated semivariogram is the true semivariogram for the interpolation region. By not taking the uncertainty of semivariogram estimation into account, other kriging methods underestimate the standard errors of prediction.

Empirical Bayesian kriging is offered in the Geostatistical Wizard and as a geoprocessing tool.

Advantages and disadvantages

Advantages

  • Requires minimal interactive modeling
  • Standard errors of prediction are more accurate than those of other kriging methods
  • Allows accurate predictions of moderately nonstationary data
  • More accurate than other kriging methods for small datasets

Disadvantages

  • Processing time rapidly increases as the number of input points, the subset size, or the overlap factor increases. Applying a transformation also increases processing time. These parameters are described below.
  • Processing is slower than other kriging methods, especially when outputting to raster.
  • Cokriging and anisotropy are unavailable.
  • A small number of parameters in the semivariogram model limits the ability to customize. Other kriging methods provide many choices for the semivariogram model.
  • The Log Empirical transformation is particularly sensitive to outliers. If you use this transformation with data that contains outliers, you might receive predictions that are orders of magnitude larger or smaller than the values of your input points. This parameter is described in the "Transformations" section below.

Semivariogram estimation

Unlike other kriging methods (which use weighted least squares), EBK estimates the semivariogram parameters using restricted maximum likelihood (REML). Because REML is computationally limited for large datasets, the input data is first divided into overlapping subsets of a specified size (100 points per subset by default). In each subset, semivariograms are estimated in the following way:

  1. A semivariogram is estimated from the data in the subset.
  2. Using this semivariogram as a model, new data is unconditionally simulated at each of the input locations in the subset.
  3. A new semivariogram is estimated from the simulated data.
  4. Steps 2 and 3 are repeated a specified number of times. In each repetition, the semivariogram estimated in step 1 is used to simulate a new set of data at the input locations, and the simulated data is used to estimate a new semivariogram.
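
The loop in steps 1 through 4 can be sketched in Python as follows. This is a minimal illustration, not the Geostatistical Analyst implementation: to stay short, it swaps REML and the EBK semivariogram model for a toy pure-nugget model whose only parameter is the sample variance. The names estimate_nugget, simulate_values, and semivariogram_distribution are hypothetical.

```python
import random
import statistics


def estimate_nugget(values):
    """Stand-in estimator (NOT REML): a pure-nugget semivariogram,
    gamma(h) = variance for every h > 0."""
    return statistics.variance(values)


def simulate_values(nugget, n_points, rng):
    """Unconditional simulation under the pure-nugget model.

    With no spatial correlation, the simulated values are independent
    Gaussian draws, so the point coordinates are not needed here.
    """
    sd = nugget ** 0.5
    return [rng.gauss(0.0, sd) for _ in range(n_points)]


def semivariogram_distribution(values, n_simulations=100, seed=0):
    rng = random.Random(seed)
    # Step 1: estimate a semivariogram from the data in the subset.
    base = estimate_nugget(values)
    estimates = []
    # Step 4: repeat steps 2 and 3, always simulating from the step 1 model.
    for _ in range(n_simulations):
        # Step 2: simulate new data at each input location.
        simulated = simulate_values(base, len(values), rng)
        # Step 3: estimate a new semivariogram from the simulated data.
        estimates.append(estimate_nugget(simulated))
    return base, estimates
```

The spread of the returned estimates around the step 1 value is what the semivariogram distribution captures. A real model would also use the point coordinates and a correlated simulation.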

This process creates a large number of semivariograms for each subset. When they are plotted together, the result is a distribution of semivariograms shaded by density (the darker the blue color, the more semivariograms pass through that region). In addition, the median of the distribution is drawn as a solid red line, and the 25th and 75th percentiles are drawn as red dashed lines, as shown below.

Simulated semivariograms
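
As a rough sketch of how the median and percentile lines could be computed, the following Python function takes simulated semivariograms evaluated at a shared set of distances and returns the pointwise 25th percentile, median, and 75th percentile. The function name and the input layout are assumptions for illustration, and the quantile interpolation rule is the default of Python's statistics.quantiles, which may differ from the rule the software uses.

```python
import statistics


def quantile_curves(curves):
    """Return pointwise (q25, median, q75) lists for equally sampled curves.

    curves: list of sequences; curves[i][k] is the i-th simulated
    semivariogram evaluated at the k-th distance (hypothetical layout).
    """
    q25, median, q75 = [], [], []
    # zip(*curves) groups the values of all curves at the same distance.
    for values_at_distance in zip(*curves):
        low, mid, high = statistics.quantiles(values_at_distance, n=4)
        q25.append(low)
        median.append(mid)
        q75.append(high)
    return q25, median, q75
```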

The number of simulated semivariograms per subset defaults to 100, and each of these semivariograms is an estimate of the true semivariogram for the subset.

For each location, the prediction is generated using a unique semivariogram distribution that is calculated using a weighted sum of the distributions from the surrounding subsets; subsets close to the prediction location are given higher weights than subsets that are farther away.
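
The text above does not give the weighting formula, so the following Python sketch only illustrates the idea of distance-based weights. It assumes an inverse-distance rule measured to each subset's center; the function name subset_weights, the power argument, and the use of subset centers are all hypothetical.

```python
import math


def subset_weights(location, subset_centers, power=2.0):
    """Normalized weights that shrink with distance to each subset center.

    Assumption: an inverse-distance rule, chosen only for illustration.
    The small constant avoids division by zero when the prediction
    location coincides with a subset center.
    """
    raw = []
    for cx, cy in subset_centers:
        distance = math.hypot(location[0] - cx, location[1] - cy)
        raw.append(1.0 / (distance ** power + 1e-12))
    total = sum(raw)
    return [w / total for w in raw]
```

The weights sum to 1, so they can be used to mix the semivariogram distributions of the surrounding subsets, with nearby subsets contributing the most.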

Kriging model

Empirical Bayesian kriging differs from other kriging methods in Geostatistical Analyst by using an intrinsic random function of order 0 (IRF-0) as the kriging model.

Other kriging models assume that the process follows an overall mean (or specified trend) with individual variations around this mean. Large deviations are pulled back toward the mean, so values never deviate too far. However, EBK does not assume a tendency toward an overall mean, so large deviations are just as likely to get larger as they are to get smaller.

Semivariogram model

For a given distance h, empirical Bayesian kriging uses a semivariogram model with the following form:

γ(h) = Nugget + b·|h|^α

The nugget and b (slope) must be positive, and α (power) must be between 0.25 and 1.75. Under these restrictions, the parameters are estimated using REML. This semivariogram model does not have a range or sill parameter because the function has no upper bound. Because many semivariograms are estimated at each location, EBK makes it possible to analyze the empirical distribution of the parameter estimates. Clicking the Nugget, Slope, or Power tab displays the distributions of the associated parameters. The graphic below shows the distributions of the semivariogram parameters for the simulated semivariograms shown in the previous graphic:

Distributions of nugget, slope, and power
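
The power model and its parameter restrictions can be written as a small Python function. This is a sketch for illustration only: the function name is hypothetical, and treating the bounds 0.25 and 1.75 as inclusive is an assumption, since the text says only "between".

```python
def power_semivariogram(h, nugget, slope, power):
    """Evaluate gamma(h) = nugget + slope * |h|**power for a distance h > 0."""
    if nugget <= 0 or slope <= 0:
        raise ValueError("nugget and slope must be positive")
    # Assumption: the stated bounds are treated as inclusive.
    if not 0.25 <= power <= 1.75:
        raise ValueError("power must be between 0.25 and 1.75")
    return nugget + slope * abs(h) ** power
```

Because the function keeps growing with distance, there is no sill for it to level off at, which is why the model has no range or sill parameter.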

Clicking a different location on the preview surface displays the semivariogram distribution and the distributions of the semivariogram parameters for the new location. If the distributions do not change significantly across the data domain, this suggests that the data is globally stationary. The distributions should change smoothly across the data domain; if you see large changes in the distributions over small distances, increasing the value for Overlap Factor can smooth the transitions.

Note:

As described in the "Transformations" section below, applying a transformation changes the kriging model from IRF-0 to simple kriging.

Transformations

Empirical Bayesian kriging offers the multiplicative skewing normal score transformation with a choice of two base distributions: Empirical and Log Empirical. The Log Empirical transformation requires all data values to be positive, and it guarantees that all predictions are positive. This is appropriate for data such as rainfall that cannot be negative.

Transformation options

If a transformation is applied, a simple kriging model is used instead of IRF-0, and the semivariograms are fitted with an exponential semivariogram model. Because of these changes, the parameter distributions change to Nugget, Partial Sill, and Range. An additional Transformation tab appears that displays the distribution of the fitted transformations (one for each simulation). As with the Semivariograms tab, the transformation distribution is colored by density, and quantile lines are provided.

Distributions of nugget, partial sill, range, and transformation
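
For comparison with the power model, an exponential semivariogram with Nugget, Partial Sill, and Range parameters can be sketched in Python as below. The factor of 3 in the exponent is an assumption (the practical-range convention, under which the curve reaches about 95 percent of the partial sill at the range), and the function name is hypothetical.

```python
import math


def exponential_semivariogram(h, nugget, partial_sill, range_):
    """Evaluate an exponential semivariogram for a distance h > 0.

    Assumption: practical-range convention with a factor of 3.
    """
    if range_ <= 0:
        raise ValueError("range must be positive")
    return nugget + partial_sill * (1.0 - math.exp(-3.0 * abs(h) / range_))
```

Unlike the power model, this function levels off at nugget + partial_sill as the distance grows, which is why the parameter tabs change when a transformation is applied.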

New parameters for empirical Bayesian kriging

Empirical Bayesian kriging employs three parameters that do not appear in other kriging methods: Subset Size, Overlap Factor, and Number of Simulations.


11/2/2012