STT 997 — Lecture 01

Monday, January 12, 2026 (4:11 PM)

Background / Motivation

Book on Geostatistics (he wrote)

Spatial Data Setup

Spatial data in $\mathbb{R}^d$ (d-dimensional)
Locations:
$s_1, s_2, \ldots, s_n$
Observations: $Y(s)$

Example

Temperature
Input: $s_i$ (longitude, latitude)
Output: $Y$
More generally: $s = (v_1, \ldots, v_d) \;\longrightarrow\; Y$

Other Examples

Recommendation engines (movies)
$a \in \{0,1\}^N$ ($N$ movies)

Prediction Perspective

BLUE (Best Linear Unbiased Estimator) — prediction
$Y(s)$, or denote data as $(s, Y)$
Think of $Y$ as a stochastic process in $d$-dimensional space
Similar to regression: $Y = f(s) + \varepsilon$ (our focus)

Objective

Prediction
Given data, predict $Y$ at a new location
Prediction criterion

Gaussian Process (GP)

$\{Y(s),\; s \in D \subset \mathbb{R}^d\}$
Time series: requires ordering
Random field: boundary between “field” and “process”
Indexed by: $s = (m,n)$ on a continuous region

Finite-Dimensional Distributions

For any finite set $s_1,\ldots,s_n$: $(Y(s_1),\ldots,Y(s_n))^T \sim \text{MVN}(\mu, V)$
Hence, we can work with any finite dimension

Mean and Covariance

Mean function: $\mu(s) = \mathbb{E}[Y(s)]$
Covariance function: $k(s,s') = \text{Cov}(Y(s),Y(s'))$
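$s' \neq s$ in general; $s' = s$ gives the variance $\text{Var}(Y(s))$
Given the mean and covariance functions, we can write down the MVN for any finite set of locations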
Covariance matrix: $V[i,j] = k(s_i, s_j)$
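
A minimal sketch of how the mean and covariance functions determine the MVN at a finite set of locations. The squared-exponential kernel, the zero mean, and the four locations are assumptions made for illustration; the lecture does not fix a particular kernel. The helper names (`sq_exp_kernel`, `cov_matrix`, `cholesky`, `sample_mvn`) are hypothetical, and the code uses only the standard library so the linear algebra stays visible.

```python
import math
import random

def sq_exp_kernel(s, t, length_scale=1.0):
    """Assumed kernel: squared-exponential covariance between points in R^d."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(s, t))
    return math.exp(-0.5 * sq_dist / length_scale ** 2)

def cov_matrix(locations, kernel):
    """V[i][j] = k(s_i, s_j) for every pair of locations."""
    return [[kernel(si, sj) for sj in locations] for si in locations]

def cholesky(V, jitter=1e-10):
    """Lower-triangular L with L L^T = V + jitter * I."""
    n = len(V)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            acc = V[i][j] - sum(L[i][m] * L[j][m] for m in range(j))
            if i == j:
                L[i][j] = math.sqrt(acc + jitter)
            else:
                L[i][j] = acc / L[j][j]
    return L

def sample_mvn(mu, V, rng):
    """One draw from MVN(mu, V) via y = mu + L z with z ~ N(0, I)."""
    L = cholesky(V)
    z = [rng.gauss(0.0, 1.0) for _ in mu]
    return [m + sum(L[i][j] * z[j] for j in range(i + 1))
            for i, m in enumerate(mu)]

# Hypothetical locations in R^2 and an assumed constant zero mean.
locations = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (2.0, 2.0)]
mu = [0.0] * len(locations)
V = cov_matrix(locations, sq_exp_kernel)
y = sample_mvn(mu, V, random.Random(0))  # one realization of (Y(s_1), ..., Y(s_n))
```

In practice a numerical library would handle the factorization; the hand-written Cholesky here is only meant to show that sampling needs nothing beyond $\mu$ and $V$.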

Stationarity (2nd Order)

Assumption: $k(s,s') = k(s-s') = \rho(s-s')$ (the covariance depends only on the lag $s-s'$)
Then: $V[i,j] = k(s_i - s_j)$
$k$ must be a stationary covariance function (CF)
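
A small sketch of what stationarity buys: since $V[i,j]$ depends only on $s_i - s_j$, shifting every location by the same vector leaves $V$ unchanged. The squared-exponential form of `k_stat`, the locations, and the shift are assumptions chosen for illustration.

```python
import math

def k_stat(h, length_scale=1.0):
    """Assumed stationary covariance, a function of the lag h = s - s' only."""
    return math.exp(-0.5 * sum(x * x for x in h) / length_scale ** 2)

def stationary_cov_matrix(locations):
    """V[i][j] = k(s_i - s_j)."""
    return [[k_stat(tuple(a - b for a, b in zip(si, sj))) for sj in locations]
            for si in locations]

# Hypothetical locations in R^2 and a common shift applied to all of them.
locations = [(0.0, 0.0), (1.0, 0.5), (2.0, -1.0)]
shift = (10.0, -3.0)
shifted = [tuple(a + c for a, c in zip(s, shift)) for s in locations]

V = stationary_cov_matrix(locations)
V_shifted = stationary_cov_matrix(shifted)

# Largest entrywise difference between the two matrices (zero up to rounding).
max_diff = max(abs(V[i][j] - V_shifted[i][j])
               for i in range(3) for j in range(3))
```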

Properties of Stationary Covariance Functions

$k(0) > 0$
$k(-s) = k(s)$

For $k(\cdot)$ to be a valid CF, it must satisfy:
- For any $s_1,\ldots,s_n$, $\left(k(s_i - s_j)\right)_{i,j=1}^n \succeq 0$ (positive semi-definite)
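
The two properties $k(0) > 0$ and $k(-s) = k(s)$ are not enough on their own. As a sketch, the code below compares a Gaussian-shaped $k$ with a boxcar $k(h) = 1$ for $|h| \le 1$ and $0$ otherwise (both are assumptions chosen for illustration, in $d = 1$). Both satisfy the two properties, but for the chosen points and weights the boxcar gives a negative quadratic form $a^T V a$, so its matrix is not positive semi-definite.

```python
import math

def gram(k, points):
    """Matrix (k(s_i - s_j)) for scalar locations."""
    return [[k(si - sj) for sj in points] for si in points]

def quad_form(a, V):
    """a^T V a; must be >= 0 for every a if V is positive semi-definite."""
    n = len(a)
    return sum(a[i] * V[i][j] * a[j] for i in range(n) for j in range(n))

def gaussian_k(h):
    """Assumed valid example: k(h) = exp(-h^2 / 2)."""
    return math.exp(-0.5 * h * h)

def boxcar_k(h):
    """Assumed invalid example: symmetric, k(0) = 1 > 0, yet not a valid CF."""
    return 1.0 if abs(h) <= 1.0 else 0.0

points = [0.0, 0.75, 1.5]   # hypothetical 1-d locations
a = [1.0, -1.0, 1.0]        # hypothetical weight vector

q_gauss = quad_form(a, gram(gaussian_k, points))  # nonnegative
q_box = quad_form(a, gram(boxcar_k, points))      # negative: PSD fails
```

A single negative quadratic form is enough to rule a function out; confirming validity requires the condition for every $n$, every set of locations, and every $a$, which is why a characterization such as Bochner's theorem is useful.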

Bochner’s Theorem

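A continuous function $k$ is a stationary covariance function iff $k(s) = \int_{\mathbb{R}^d} e^{i s^T \omega} \, dF(\omega)$ where $F$ is a finite nonnegative measure
When $F$ is a probability measure (so $k(0) = 1$), $k$ is the characteristic function of $F$
Stationary covariance functions are also called translation invariant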
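
A Monte Carlo sketch of the spectral representation in $d = 1$, assuming $F$ is the standard normal distribution (an assumption for illustration). Then $k(s) = \mathbb{E}[e^{i s \omega}] = \mathbb{E}[\cos(s\omega)]$, since the imaginary part vanishes by symmetry, and the known closed form is $k(s) = e^{-s^2/2}$. The function name `mc_covariance` and the sample size are hypothetical choices.

```python
import math
import random

def mc_covariance(s, n_draws=100_000, seed=0):
    """Estimate k(s) = E[cos(s * omega)] with omega ~ N(0, 1)."""
    rng = random.Random(seed)
    total = sum(math.cos(s * rng.gauss(0.0, 1.0)) for _ in range(n_draws))
    return total / n_draws

lags = [0.0, 0.5, 1.0, 2.0]
estimates = [mc_covariance(s) for s in lags]
exact = [math.exp(-0.5 * s * s) for s in lags]  # characteristic function of N(0, 1)

# Largest gap between the Monte Carlo estimate and the closed form.
max_error = max(abs(e - x) for e, x in zip(estimates, exact))
```

Changing the spectral distribution $F$ changes the covariance function, which is one way to construct valid stationary kernels.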

Best Linear Unbiased Predictor (BLUP)

(Green prediction, simple notation)

Given $Y_1,\ldots,Y_n$, predict: $\hat{Y}_0 = a + \sum_{i=1}^n b_i Y_i$
Choose $a,b$ to minimize: $\mathbb{E}\left[(Y_0 - a - b^T Y)^2\right]$

Bias–Variance Decomposition

$$ \mathbb{E}\left[(Y_0 - a - b^T Y)^2\right] = \text{Var}(Y_0 - a - b^T Y) + \left(\mathbb{E}[Y_0 - a - b^T Y]\right)^2 $$
Since: $\mathbb{E}[\xi^2] = \text{Var}(\xi) + (\mathbb{E}[\xi])^2$
Unbiasedness condition: $\mathbb{E}(Y_0 - a - b^T Y) = 0 \Rightarrow a = \mathbb{E}(Y_0) - b^T \mathbb{E}(Y)$
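
A quick numerical check of the identity $\mathbb{E}[\xi^2] = \text{Var}(\xi) + (\mathbb{E}[\xi])^2$ used above, with sample moments in place of expectations. The distribution of `xi` is an arbitrary assumption; the identity holds for any sample when the variance divides by $n$.

```python
import random
import statistics

rng = random.Random(1)
xi = [rng.gauss(0.7, 2.0) for _ in range(10_000)]  # hypothetical prediction errors

second_moment = statistics.fmean([x * x for x in xi])  # sample version of E[xi^2]
variance = statistics.pvariance(xi)                    # divides by n, not n - 1
mean = statistics.fmean(xi)

# The gap between the two sides is zero up to floating-point rounding.
gap = abs(second_moment - (variance + mean ** 2))
```

With a nonzero mean (bias) the squared-bias term is strictly positive, which is why the unbiasedness condition removes it before the variance is minimized.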

Variance Minimization

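$a$ is a constant, so it does not affect the variance and can be dropped
Minimize: $\text{Var}(Y_0 - b^T Y) = \text{Var}(Y_0) - 2 b^T k + b^T V b$
$\text{Var}(Y_0)$ does not depend on $b$, so the objective is $Q(b) = b^T V b - 2 b^T k$ where: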
- $V = \text{Var}(Y)$
- $k = \text{Cov}(Y, Y_0)$
Gradient: $\frac{\partial Q(b)}{\partial b} = 2 V b - 2 k = 0$
Solution: $V b = k \quad\Rightarrow\quad b = V^{-1} k$
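
A sketch of the normal equations $V b = k$ on a two-observation example. The numbers in `V` and `k` are hypothetical, and `solve` is a plain Gaussian elimination written out so the example needs no external library. The code checks that the gradient $2Vb - 2k$ vanishes at the solution.

```python
def solve(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [r] for row, r in zip(A, rhs)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def Q(b, V, k):
    """Objective Q(b) = b^T V b - 2 b^T k."""
    n = len(b)
    quad = sum(b[i] * V[i][j] * b[j] for i in range(n) for j in range(n))
    return quad - 2.0 * sum(bi * ki for bi, ki in zip(b, k))

V = [[1.0, 0.5], [0.5, 1.0]]  # hypothetical Var(Y)
k = [0.6, 0.3]                # hypothetical Cov(Y, Y_0)

b = solve(V, k)               # weights b = V^{-1} k
gradient = [2.0 * sum(V[i][j] * b[j] for j in range(2)) - 2.0 * k[i]
            for i in range(2)]
```

Because $V$ is positive definite here, $Q$ is strictly convex and the stationary point is the unique minimizer; any other $b$ gives a larger $Q$.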

Final Predictor

$$ \begin{aligned} \hat{Y}_0 &= \mathbb{E}(Y_0) - b^T \mathbb{E}(Y) + b^T Y \\ &= \mathbb{E}(Y_0) + b^T (Y - \mathbb{E}(Y)) \\ &= \mathbb{E}(Y_0) + (V^{-1}k)^T (Y - \mathbb{E}(Y)) \end{aligned} $$

Expectation: $\mathbb{E}(\hat{Y}_0) = \mathbb{E}(Y_0)$
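
Putting the pieces together, a minimal sketch of the predictor $\hat{Y}_0 = \mathbb{E}(Y_0) + (V^{-1}k)^T (Y - \mathbb{E}(Y))$. It assumes the mean function is known and constant, and uses a squared-exponential kernel; the temperature-like values, the locations, and the names `blup`, `mean_fn`, and `sq_exp_kernel` are all hypothetical.

```python
import math

def sq_exp_kernel(s, t, length_scale=1.0):
    """Assumed kernel: squared-exponential covariance."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(s, t))
    return math.exp(-0.5 * sq_dist / length_scale ** 2)

def solve(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [r] for row, r in zip(A, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def mean_fn(s):
    """Assumed known, constant mean E[Y(s)]."""
    return 11.0

def blup(s0, locations, y, mean_fn, kernel):
    """Predict Y(s0) as E(Y_0) + b^T (Y - E(Y)) with b = V^{-1} k."""
    V = [[kernel(si, sj) for sj in locations] for si in locations]
    k = [kernel(si, s0) for si in locations]
    b = solve(V, k)
    residuals = [yi - mean_fn(si) for yi, si in zip(y, locations)]
    return mean_fn(s0) + sum(bi * ri for bi, ri in zip(b, residuals))

# Hypothetical observations at four locations.
locations = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
y = [10.0, 12.0, 11.0, 13.5]

y_hat_center = blup((0.5, 0.5), locations, y, mean_fn, sq_exp_kernel)
y_hat_observed = blup((1.0, 0.0), locations, y, mean_fn, sq_exp_kernel)
y_hat_far = blup((50.0, 50.0), locations, y, mean_fn, sq_exp_kernel)
```

Two sanity checks follow from the formula: at an observed location $k$ is a column of $V$, so the predictor reproduces the observation; far from all data $k \approx 0$, so the predictor falls back to the mean.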