STT 997 — Lecture 01

Monday, January 12, 2026 (4:11 PM)

Background / Motivation

  • Book on Geostatistics (he wrote)

Spatial Data Setup

  • Spatial data in $\mathbb{R}^d$ (d-dimensional)
  • Locations:
    \(s_1, s_2, \ldots, s_n\)
  • Observations: \(Y(s)\)

Example

  • Temperature
  • Input: $s_i$ (longitude, latitude)
  • Output: $Y(s_i)$, the temperature observed at $s_i$

  • More generally: \(s = (v_1, \ldots, v_d) \;\longrightarrow\; Y\)

Other Examples

  • Recommendation engines (movies)
  • $a \in \{0,1\}^N$ (N movies)

Prediction Perspective

  • BLUE (Best Linear Unbiased Estimator); the prediction analogue is the BLUP (Best Linear Unbiased Predictor)
  • $Y(s)$, or denote data as $(s, Y)$

  • Think of $Y$ as a stochastic process in $d$-dimensional space
  • Similar to regression: \(Y = f(s) + \varepsilon\) (our focus)

Objective

  • Prediction
  • Given data, predict $Y$ at a new location

  • Prediction criterion

Gaussian Process (GP)

  • $\{Y(s),\; s \in D \subset \mathbb{R}^d\}$

  • Time series: requires ordering
  • Random field: boundary between “field” and “process”

  • Indexed by: \(s = (m,n)\) on a continuous region

Finite-Dimensional Distributions

  • For any finite set $s_1,\ldots,s_n$: \((Y(s_1),\ldots,Y(s_n))^T \sim \text{MVN}(\mu, V)\)
  • Hence we can always work with finite-dimensional distributions

Mean and Covariance

  • Mean function: \(\mu(s) = \mathbb{E}[Y(s)]\)
  • Covariance function: \(k(s,s') = \text{Cov}(Y(s),Y(s'))\)
  • In general $s' \neq s$; when $s' = s$, \(k(s,s) = \text{Var}(Y(s))\)

  • Given the mean and covariance functions, we can write down the MVN for any finite set of locations

  • Covariance matrix: \(V[i,j] = k(s_i, s_j)\)
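
As an illustration of the construction above, here is a minimal sketch that fills \(V[i,j] = k(s_i, s_j)\) and draws one sample from the resulting MVN. The squared-exponential kernel, the zero mean, the length scale, and the names `sq_exp_kernel` and `cov_matrix` are assumptions made for the example, not choices from the lecture.

```python
import numpy as np

def sq_exp_kernel(s, s_prime, length_scale=1.0):
    """Assumed kernel: k(s, s') = exp(-||s - s'||^2 / (2 * length_scale^2))."""
    diff = np.asarray(s, dtype=float) - np.asarray(s_prime, dtype=float)
    return float(np.exp(-np.dot(diff, diff) / (2.0 * length_scale**2)))

def cov_matrix(locations, kernel):
    """Fill V[i, j] = k(s_i, s_j) for every pair of locations."""
    n = len(locations)
    V = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            V[i, j] = kernel(locations[i], locations[j])
    return V

rng = np.random.default_rng(0)
locations = rng.uniform(0.0, 10.0, size=(5, 2))  # n = 5 locations in R^2
mu = np.zeros(len(locations))                    # assumed zero mean function
V = cov_matrix(locations, sq_exp_kernel)

# One draw of (Y(s_1), ..., Y(s_n))^T ~ MVN(mu, V)
y = rng.multivariate_normal(mu, V)
print(V.round(3))
print(y.round(3))
```

The diagonal of `V` holds the variances \(k(s_i, s_i)\), which equal 1 for this particular kernel.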

Stationarity (2nd Order)

  • Assumption: \(k(s',s) = k(s-s') = \rho(s-s')\)
  • Then: \(V[i,j] = k(s_i - s_j)\)
  • $k$ must be a stationary covariance function (CF)

Properties of Stationary Covariance Functions

  1. $k(0) > 0$
  2. $k(-s) = k(s)$
  • For $k(\cdot)$ to be a valid CF, it must satisfy:
    • For any $s_1,\ldots,s_n$, \(\left(k(s_i - s_j)\right)_{i,j=1}^n \succeq 0\) (positive semi-definite)
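
The three properties above can be checked numerically for a candidate function. The sketch below does this for a squared-exponential function of the lag \(h = s - s'\); the kernel, the random locations, and the tolerance `1e-10` are assumptions for the example. A check at one set of locations can only refute validity, not prove it.

```python
import numpy as np

def k_stationary(h, length_scale=1.0):
    """Assumed stationary covariance, written as a function of the lag h = s - s'."""
    h = np.atleast_1d(np.asarray(h, dtype=float))
    return float(np.exp(-np.dot(h, h) / (2.0 * length_scale**2)))

rng = np.random.default_rng(1)
s = rng.uniform(0.0, 5.0, size=(8, 2))  # 8 arbitrary locations in R^2
n = len(s)

# Matrix with entries k(s_i - s_j)
V = np.array([[k_stationary(s[i] - s[j]) for j in range(n)] for i in range(n)])

# Property 1: k(0) > 0
positive_at_zero = k_stationary(np.zeros(2)) > 0

# Property 2: k(-h) = k(h)
h = np.array([1.0, -2.0])
symmetric = np.isclose(k_stationary(-h), k_stationary(h))

# Positive semi-definiteness: all eigenvalues of the symmetric matrix are >= 0,
# up to a small tolerance for floating-point error.
eigenvalues = np.linalg.eigvalsh(V)
is_psd = bool(eigenvalues.min() >= -1e-10)

print(positive_at_zero, symmetric, is_psd)
```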

Bochner’s Theorem

  • $k$ is a stationary covariance function iff \(k(s) = \int_{\mathbb{R}^d} e^{i s^T \omega} \, dF(\omega)\) where $F$ is a finite measure

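  • Up to the constant $k(0) = F(\mathbb{R}^d)$, $k$ is the characteristic function of a probability measure
  • A stationary covariance function is also called translation invariant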
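
To make the characteristic-function reading concrete, here is a small Monte Carlo sketch in \(d = 1\). It assumes \(F\) is the standard normal distribution (an illustrative choice, not from the lecture), for which the integral is known in closed form: \(k(s) = e^{-s^2/2}\). Averaging \(\cos(s\omega)\) over draws \(\omega \sim F\) should then approximate that value.

```python
import numpy as np

rng = np.random.default_rng(2)
omega = rng.standard_normal(200_000)  # draws from the assumed F = N(0, 1)

def k_monte_carlo(s):
    # Real part of the sample mean of exp(i * s * omega); the imaginary part
    # has expectation zero because F is symmetric about 0.
    return float(np.mean(np.cos(s * omega)))

def k_exact(s):
    # Characteristic function of N(0, 1)
    return float(np.exp(-0.5 * s**2))

lags = np.array([0.0, 0.5, 1.0, 2.0])
approx = np.array([k_monte_carlo(s) for s in lags])
exact = np.array([k_exact(s) for s in lags])
max_error = float(np.max(np.abs(approx - exact)))

print(approx.round(3))
print(exact.round(3))
```

With this many draws, the two rows should agree to roughly two decimal places.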

Best Linear Unbiased Predictor (BLUP)

(Green prediction, simple notation)

  • Given $Y_1,\ldots,Y_n$, predict: \(\hat{Y}_0 = a + \sum_{i=1}^n b_i Y_i\)

  • Choose $a,b$ to minimize: \(\mathbb{E}\left[(Y_0 - a - b^T Y)^2\right]\)

Bias–Variance Decomposition

$$ \mathbb{E}\left[(Y_0 - a - b^T Y)^2\right] = \text{Var}(Y_0 - a - b^T Y) + \left(\mathbb{E}[Y_0 - a - b^T Y]\right)^2 $$

  • Since: \(\mathbb{E}[\xi^2] = \text{Var}(\xi) + (\mathbb{E}[\xi])^2\)

  • Unbiasedness condition: \(\mathbb{E}(Y_0 - a - b^T Y) = 0 \Rightarrow a = \mathbb{E}(Y_0) - b^T \mathbb{E}(Y)\)

Variance Minimization

  • $a$ is a constant, so it does not affect the variance and can be dropped

  • Minimize: \(\text{Var}(Y_0 - b^T Y)\)

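  • Expanding: \(\text{Var}(Y_0 - b^T Y) = \text{Var}(Y_0) - 2 b^T k + b^T V b\); the first term does not depend on $b$, so the objective is \(Q(b) = b^T V b - 2 b^T k\) where: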
    • $V = \text{Var}(Y)$
    • $k = \text{Cov}(Y, Y_0)$
  • Gradient: \(\frac{\partial Q(b)}{\partial b} = 2 V b - 2 k = 0\)

  • Solution: \(V b = k \quad\Rightarrow\quad b = V^{-1} k\)
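
The solution \(V b = k\) can be sketched numerically as follows. The squared-exponential kernel, the three observed locations, and the prediction location are assumptions chosen for the example. The code solves the linear system directly instead of forming \(V^{-1}\), then checks that the gradient vanishes and that a perturbed $b$ gives a larger \(Q\).

```python
import numpy as np

def sq_exp(h, length_scale=1.0):
    """Assumed stationary kernel as a function of the lag h."""
    h = np.atleast_1d(np.asarray(h, dtype=float))
    return float(np.exp(-np.dot(h, h) / (2.0 * length_scale**2)))

s_obs = np.array([[0.0], [1.0], [2.5]])  # observed locations (d = 1)
s_new = np.array([1.5])                  # location of Y_0

V = np.array([[sq_exp(si - sj) for sj in s_obs] for si in s_obs])  # Var(Y)
k = np.array([sq_exp(si - s_new) for si in s_obs])                 # Cov(Y, Y_0)

# Solve V b = k (numerically preferable to computing V^{-1} explicitly)
b = np.linalg.solve(V, k)

def Q(b_vec):
    return float(b_vec @ V @ b_vec - 2.0 * b_vec @ k)

gradient = 2.0 * V @ b - 2.0 * k          # should be ~0 at the minimizer
perturbed = b + np.array([0.1, -0.2, 0.05])

print(b.round(4))
print(Q(b) < Q(perturbed))
```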

Final Predictor

\[
\begin{aligned}
\hat{Y}_0 &= \mathbb{E}(Y_0) - b^T \mathbb{E}(Y) + b^T Y \\
&= \mathbb{E}(Y_0) + b^T (Y - \mathbb{E}(Y)) \\
&= \mathbb{E}(Y_0) + (V^{-1}k)^T (Y - \mathbb{E}(Y))
\end{aligned}
\]

  • Expectation: \(\mathbb{E}(\hat{Y}_0) = \mathbb{E}(Y_0)\)
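
A minimal sketch of the final predictor, assuming the mean function and covariance function are known. The function name `blup`, the constant mean 0.5, the squared-exponential kernel, and the data values are all hypothetical and serve only as an example.

```python
import numpy as np

def sq_exp(h, length_scale=1.0):
    """Assumed stationary kernel as a function of the lag h."""
    h = np.atleast_1d(np.asarray(h, dtype=float))
    return float(np.exp(-np.dot(h, h) / (2.0 * length_scale**2)))

def blup(s_obs, y_obs, s_new, mean_fn, kernel):
    """Return E(Y_0) + (V^{-1} k)^T (Y - E(Y)) for a single new location."""
    V = np.array([[kernel(si - sj) for sj in s_obs] for si in s_obs])
    k = np.array([kernel(si - s_new) for si in s_obs])
    b = np.linalg.solve(V, k)                      # b = V^{-1} k
    mu_obs = np.array([mean_fn(si) for si in s_obs])
    return float(mean_fn(s_new) + b @ (y_obs - mu_obs))

def constant_mean(s):
    return 0.5  # hypothetical known mean

s_obs = np.array([[0.0], [1.0], [2.5]])
y_obs = np.array([1.2, 0.7, -0.3])

pred_mid = blup(s_obs, y_obs, np.array([1.5]), constant_mean, sq_exp)
pred_at_obs = blup(s_obs, y_obs, s_obs[1], constant_mean, sq_exp)
pred_far = blup(s_obs, y_obs, np.array([100.0]), constant_mean, sq_exp)

print(pred_mid, pred_at_obs, pred_far)
```

Two sanity checks follow from the formula: at an observed location $k$ is a column of $V$, so the predictor reproduces the observed value; far from all data $k \approx 0$, so the predictor falls back to the mean.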

Comments