STT 997 — Lecture 01
Monday, January 12, 2026 (4:11 PM)
Background / Motivation
- Book on Geostatistics (he wrote)
Spatial Data Setup
- Spatial data in $\mathbb{R}^d$ (d-dimensional)
- Locations: \(s_1, s_2, \ldots, s_n\)
- Observations: \(Y(s)\)
Example
- Temperature
- Input: $s_i$ (longitude, latitude)
- Output: $Y$
- More generally: \(s = (v_1, \ldots, v_d) \;\longrightarrow\; Y\)
Other Examples
- Recommendation engines (movies)
- $a \in \{0,1\}^N$ ($N$ movies)
Prediction Perspective
- BLUE (Best Linear Unbiased Estimator) — prediction
- Write the observation as $Y(s)$, or denote the data as $(s, Y)$
- Think of $Y$ as a stochastic process in $d$-dimensional space
- Similar to regression: \(Y = f(s) + \varepsilon\) (our focus)
Objective
- Prediction
- Given data, predict $Y$ at a new location
- Prediction criterion
Gaussian Process (GP)
- $\{Y(s),\; s \in D \subset \mathbb{R}^d\}$
- Time series: requires ordering
- Random field: boundary between “field” and “process”
- Indexed by: \(s = (m,n)\) on a continuous region
Finite-Dimensional Distributions
- For any finite set $s_1,\ldots,s_n$: \((Y(s_1),\ldots,Y(s_n))^T \sim \text{MVN}(\mu, V)\)
- Hence, we can work with any finite dimension
Mean and Covariance
- Mean function: \(\mu(s) = \mathbb{E}[Y(s)]\)
- Covariance function: \(k(s,s') = \text{Cov}(Y(s),Y(s'))\)
- $s' \neq s$
- Given the mean and covariance functions, we can write down the MVN distribution
- Covariance matrix: \(V[i,j] = k(s_i, s_j)\)
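The construction above (evaluate $\mu$ and $k$ at the locations, then write the MVN) can be sketched numerically. This is a minimal sketch, not something from the lecture: the squared-exponential covariance, the zero mean function, the random locations, and the helper names `sq_exp_cov` and `cov_matrix` are all assumptions chosen for illustration.

```python
import numpy as np

def sq_exp_cov(s, s_prime, length_scale=1.0):
    """Assumed covariance: k(s, s') = exp(-||s - s'||^2 / (2 * length_scale^2))."""
    diff = np.asarray(s, dtype=float) - np.asarray(s_prime, dtype=float)
    return float(np.exp(-np.dot(diff, diff) / (2.0 * length_scale**2)))

def cov_matrix(locations, k):
    """Build V with V[i, j] = k(s_i, s_j)."""
    n = len(locations)
    V = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            V[i, j] = k(locations[i], locations[j])
    return V

rng = np.random.default_rng(0)
locations = rng.uniform(0.0, 1.0, size=(5, 2))  # n = 5 hypothetical locations in R^2
mu = np.zeros(len(locations))                   # assumed mean function mu(s) = 0
V = cov_matrix(locations, sq_exp_cov)

# One draw of (Y(s_1), ..., Y(s_n))^T from MVN(mu, V)
y = rng.multivariate_normal(mu, V)
```

The double loop mirrors the definition $V[i,j] = k(s_i, s_j)$ directly; a vectorized version would be faster but less literal.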
Stationarity (2nd Order)
- Assumption: \(k(s',s) = k(s-s') = \rho(s-s')\)
- Then: \(V[i,j] = k(s_i - s_j)\)
- $k$ must be a stationary covariance function (CF)
Properties of Stationary Covariance Functions
- $k(0) > 0$
- $k(-s) = k(s)$
- For $k(\cdot)$ to be a valid CF, it must satisfy:
- For any $s_1,\ldots,s_n$, \(\left(k(s_i - s_j)\right)_{i,j=1}^n \succeq 0\) (positive semi-definite)
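The positive semi-definiteness condition can be checked numerically on a small set of locations by looking at the smallest eigenvalue of the matrix $\left(k(s_i - s_j)\right)$. The sketch below is illustrative only: the exponential covariance and the indicator-type function are examples chosen here, not taken from the lecture. The indicator function satisfies $k(0) > 0$ and $k(-s) = k(s)$ yet fails the condition, which shows that the first two properties alone are not sufficient.

```python
import numpy as np

def stationary_cov_matrix(locations, k):
    """V[i, j] = k(s_i - s_j) for a function k of the lag s_i - s_j."""
    diffs = locations[:, None, :] - locations[None, :, :]  # shape (n, n, d)
    return k(diffs)

def exp_cov(h):
    """Exponential covariance k(h) = exp(-||h||), a valid stationary CF."""
    return np.exp(-np.linalg.norm(h, axis=-1))

def indicator_fn(h):
    """k(h) = 1 if ||h|| <= 1 else 0: symmetric with k(0) > 0, but not a valid CF."""
    return (np.linalg.norm(h, axis=-1) <= 1.0).astype(float)

def min_eigenvalue(V):
    """Smallest eigenvalue of a symmetric matrix."""
    return float(np.linalg.eigvalsh(V).min())

# Three equally spaced locations on the line (d = 1)
locations = np.array([[0.0], [1.0], [2.0]])
min_eig_exp = min_eigenvalue(stationary_cov_matrix(locations, exp_cov))
min_eig_ind = min_eigenvalue(stationary_cov_matrix(locations, indicator_fn))

print("exponential:", min_eig_exp)  # positive
print("indicator:  ", min_eig_ind)  # negative, so the matrix is not PSD
```

A nonnegative smallest eigenvalue on one set of locations does not prove validity, since the condition must hold for every finite set; a single negative eigenvalue is enough to rule a function out.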
Bochner’s Theorem
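- $k$ is a continuous stationary covariance function iff \(k(s) = \int_{\mathbb{R}^d} e^{i s^T \omega} \, dF(\omega)\), where $F$ is a finite nonnegative measure
- This is a characteristic function: $k(s)/k(0)$ is the characteristic function of the probability measure $F / F(\mathbb{R}^d)$
- Stationary covariance functions are also called translation invariant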
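One way to see the characteristic-function reading of Bochner's theorem is a Monte Carlo check. It assumes a specific example that is not from the lecture: take $F$ to be the standard normal distribution on $\mathbb{R}^d$, whose characteristic function is $k(s) = e^{-\|s\|^2/2}$. Averaging $\cos(s^T \omega)$ over draws $\omega \sim F$ should then approximate $k(s)$. The names `k_monte_carlo` and `k_exact` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 2
omega = rng.standard_normal(size=(200_000, d))  # draws from the assumed F = N(0, I_d)

def k_monte_carlo(s):
    """Estimate the integral of exp(i s^T w) dF(w) by a sample average.

    Only the cosine (real) part is averaged: F is symmetric about zero,
    so the sine (imaginary) part integrates to zero.
    """
    return float(np.mean(np.cos(omega @ s)))

def k_exact(s):
    """Characteristic function of N(0, I_d): exp(-||s||^2 / 2)."""
    return float(np.exp(-0.5 * np.dot(s, s)))

s = np.array([0.7, -0.3])
approx, exact = k_monte_carlo(s), k_exact(s)
print(approx, exact)  # the two values should agree up to Monte Carlo error
```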
Best Linear Unbiased Predictor (BLUP)
(Green prediction, simple notation)
- Given $Y_1,\ldots,Y_n$, predict: \(\hat{Y}_0 = a + \sum_{i=1}^n b_i Y_i\)
- Choose $a,b$ to minimize: \(\mathbb{E}\left[(Y_0 - a - b^T Y)^2\right]\)
Bias–Variance Decomposition
$$ \mathbb{E}\left[(Y_0 - a - b^T Y)^2\right] = \text{Var}(Y_0 - a - b^T Y) + \left(\mathbb{E}[Y_0 - a - b^T Y]\right)^2 $$
- Since: \(\mathbb{E}[\xi^2] = \text{Var}(\xi) + (\mathbb{E}[\xi])^2\)
- Unbiasedness condition: \(\mathbb{E}(Y_0 - a - b^T Y) = 0 \Rightarrow a = \mathbb{E}(Y_0) - b^T \mathbb{E}(Y)\)
Variance Minimization
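- $a$ is a constant, so it does not affect the variance and can be dropped
- Minimize: \(\text{Var}(Y_0 - b^T Y) = \text{Var}(Y_0) - 2 b^T k + b^T V b\)
- $\text{Var}(Y_0)$ does not depend on $b$, so the objective is: \(Q(b) = b^T V b - 2 b^T k\), where:
- $V = \text{Var}(Y)$
- $k = \text{Cov}(Y, Y_0)$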
- Gradient: \(\frac{\partial Q(b)}{\partial b} = 2 V b - 2 k = 0\)
- Solution: \(V b = k \quad\Rightarrow\quad b = V^{-1} k\)
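The stationarity condition $Vb = k$ can be sanity-checked numerically. The sketch below uses a randomly generated positive definite $V$ and a random vector $k$, both hypothetical stand-ins rather than quantities from the lecture. It confirms that the gradient vanishes at $b = V^{-1}k$ and that random perturbations of $b$ only increase $Q$.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))
V = A @ A.T + 4.0 * np.eye(4)  # hypothetical positive definite Var(Y)
k = rng.standard_normal(4)     # hypothetical Cov(Y, Y_0)

def Q(b):
    """Objective Q(b) = b^T V b - 2 b^T k."""
    return float(b @ V @ b - 2.0 * b @ k)

# Solve V b = k directly instead of forming V^{-1} explicitly
b_star = np.linalg.solve(V, k)

# Gradient 2 V b - 2 k evaluated at the solution (should be numerically zero)
gradient = 2.0 * V @ b_star - 2.0 * k

# Q at randomly perturbed coefficient vectors; each should exceed Q(b_star)
perturbed = [Q(b_star + 0.1 * rng.standard_normal(4)) for _ in range(100)]

print(np.abs(gradient).max(), Q(b_star), min(perturbed))
```

Using `np.linalg.solve` rather than an explicit inverse is a common numerical choice; for a covariance matrix, a Cholesky factorization would be another reasonable option.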
Final Predictor
\[\hat{Y}_0 = \mathbb{E}(Y_0) - b^T \mathbb{E}(Y) + b^T Y\]
\[= \mathbb{E}(Y_0) + b^T (Y - \mathbb{E}(Y))\]
\[= \mathbb{E}(Y_0) + (V^{-1}k)^T (Y - \mathbb{E}(Y))\]
- Expectation: \(\mathbb{E}(\hat{Y}_0) = \mathbb{E}(Y_0)\)
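The final predictor can be put together in a few lines. This is a sketch under several assumptions not made in the lecture: a known constant mean, a stationary squared-exponential covariance with an arbitrary length scale, and made-up observations on the corners of the unit square. The function names `sq_exp` and `blup` are hypothetical.

```python
import numpy as np

def sq_exp(h, length_scale=0.5):
    """Assumed stationary covariance k(h) = exp(-||h||^2 / (2 * length_scale^2))."""
    return np.exp(-np.sum(h**2, axis=-1) / (2.0 * length_scale**2))

def blup(s_obs, y_obs, s_new, k, mean=0.0):
    """Y0_hat = E(Y_0) + (V^{-1} k)^T (Y - E(Y)), with a known constant mean."""
    V = k(s_obs[:, None, :] - s_obs[None, :, :])  # V[i, j] = k(s_i - s_j)
    k0 = k(s_obs - s_new)                         # Cov(Y, Y_0), one entry per s_i
    b = np.linalg.solve(V, k0)                    # b solves V b = k
    return float(mean + b @ (y_obs - mean))

# Made-up data: four observations on the corners of the unit square
s_obs = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
y_obs = np.array([1.2, 0.4, -0.3, 0.9])

y_hat_center = blup(s_obs, y_obs, np.array([0.5, 0.5]), sq_exp)
y_hat_at_obs = blup(s_obs, y_obs, s_obs[0], sq_exp)

print(y_hat_center)  # prediction at the center of the square
print(y_hat_at_obs)  # prediction at an observed location
```

With no measurement-error term in the model, predicting at an observed location returns the observed value: there $k$ is a column of $V$, so $b$ is the corresponding unit vector.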
Comments