Convergence in Distribution

\(X_n \xrightarrow[n\to\infty]{d} X\)

Definition

A sequence $X_n$ converges in distribution to $X$ if \(F_{X_n}(x) \to F_X(x) \quad \text{for all continuity points } x \text{ of } F_X,\) where $F_{X_n}$ and $F_X$ are the distribution functions of $X_n$ and $X$.

Equivalently (and more robustly):

For every bounded, continuous function $f$, \(\mathbb{E}[f(X_n)] \to \mathbb{E}[f(X)].\)

This formulation is often the most useful in proofs.
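
As a small numerical illustration of the distribution-function definition, the sketch below uses an example that is not in the text above: for i.i.d. Uniform(0,1) variables $U_1,\dots,U_n$, the rescaled gap $Y_n = n(1-\max_i U_i)$ has the exact distribution function $1-(1-x/n)^n$ on $[0,n]$, which converges to the Exp(1) distribution function $1-e^{-x}$. The function names and the evaluation grid are illustrative choices.

```python
import math

def cdf_scaled_gap(n: int, x: float) -> float:
    """Exact CDF at x of Y_n = n * (1 - max(U_1..U_n)), U_i i.i.d. Uniform(0,1)."""
    if x <= 0:
        return 0.0
    if x >= n:
        return 1.0
    # P(Y_n <= x) = P(max >= 1 - x/n) = 1 - (1 - x/n)^n
    return 1.0 - (1.0 - x / n) ** n

def cdf_exp1(x: float) -> float:
    """CDF of the Exp(1) limit."""
    return 1.0 - math.exp(-x) if x > 0 else 0.0

# Illustrative grid on [0, 10]; every point is a continuity point of the limit CDF.
GRID = [i / 10 for i in range(101)]

def max_gap(n: int) -> float:
    """Largest |F_{Y_n}(x) - F_Y(x)| over the grid."""
    return max(abs(cdf_scaled_gap(n, x) - cdf_exp1(x)) for x in GRID)

for n in (1, 10, 100, 1000, 10000):
    print(n, max_gap(n))
```

The printed gaps should shrink as $n$ grows, which is exactly the statement $F_{Y_n}(x) \to F_Y(x)$ checked on a finite grid.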


Interpretation

Convergence in distribution means:

The distributions of $X_n$ converge to the distribution of $X$, even if the random variables themselves do not converge on a common probability space.

Key points:

  • No pathwise or sample-wise meaning is required.
  • $X_n$ and $X$ do not need to be defined on the same probability space.
  • Only the laws matter.

Heuristically:

  • a.s. convergence = paths converge
  • in probability = values are usually close
  • in distribution = histograms converge

How Convergence in Distribution Typically Arises

1. Central Limit Theorem (CLT)

The most common source.

If $X_1,X_2,\dots$ are i.i.d. with $\mathbb{E}[X_1]=\mu$ and $\mathrm{Var}(X_1)=\sigma^2<\infty$, then \(\sqrt{n}(\bar X_n - \mu) \xrightarrow{d} N(0,\sigma^2).\)

Mental trigger: normalized sums $\Rightarrow$ CLT $\Rightarrow$ convergence in distribution.
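
The CLT statement can be checked by simulation. The sketch below assumes Exp(1) summands (so $\mu = 1$ and $\sigma^2 = 1$), a fixed seed, and hypothetical helper names; it compares the empirical distribution function of $\sqrt{n}(\bar X_n - \mu)$ with the $N(0,1)$ distribution function at a few points.

```python
import math
import random

def normal_cdf(x: float) -> float:
    """CDF of the standard normal law."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def simulate_scaled_means(n: int, reps: int, rng: random.Random) -> list:
    """Draw `reps` copies of sqrt(n) * (mean of n Exp(1) draws - 1)."""
    out = []
    for _ in range(reps):
        mean = sum(rng.expovariate(1.0) for _ in range(n)) / n
        out.append(math.sqrt(n) * (mean - 1.0))
    return out

def empirical_cdf(sample: list, x: float) -> float:
    """Fraction of sample values that are <= x."""
    return sum(v <= x for v in sample) / len(sample)

rng = random.Random(0)  # fixed seed so the run is reproducible
sample = simulate_scaled_means(n=200, reps=4000, rng=rng)
for x in (-1.0, 0.0, 1.0):
    print(x, empirical_cdf(sample, x), normal_cdf(x))
```

The two columns should agree up to Monte Carlo noise and the finite-$n$ skewness of the exponential distribution.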


2. Slutsky’s Theorem

If

  • $X_n \xrightarrow{d} X$, and
  • $Y_n \xrightarrow{p} c$ (constant),

then \(X_n + Y_n \xrightarrow{d} X + c, \qquad X_n Y_n \xrightarrow{d} cX.\)

This is a primary tool for manipulating limits in distribution.
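
A standard use of Slutsky's theorem is the studentized mean: write $\sqrt{n}(\bar X_n-\mu)/S_n$ as the product of $\sqrt{n}(\bar X_n-\mu)/\sigma \xrightarrow{d} N(0,1)$ and $\sigma/S_n \xrightarrow{p} 1$. The simulation below is a sketch under assumed Uniform(0,1) data ($\mu = 1/2$); the function name, seed, and sample sizes are illustrative.

```python
import math
import random

def studentized_mean(n: int, rng: random.Random) -> float:
    """sqrt(n) * (sample mean - 0.5) / sample standard deviation, Uniform(0,1) data."""
    draws = [rng.random() for _ in range(n)]
    mean = sum(draws) / n
    var = sum((d - mean) ** 2 for d in draws) / (n - 1)  # unbiased sample variance
    return math.sqrt(n) * (mean - 0.5) / math.sqrt(var)

rng = random.Random(1)  # fixed seed for reproducibility
t_stats = [studentized_mean(100, rng) for _ in range(3000)]

# If the N(0,1) limit is a good approximation, about 95% of values fall in [-1.96, 1.96].
coverage = sum(abs(t) <= 1.96 for t in t_stats) / len(t_stats)
print(coverage)
```

The unknown $\sigma$ has been replaced by an estimate, yet the limit is unchanged; that replacement is what Slutsky's theorem licenses.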


3. Continuous Mapping Theorem (CMT)

If \(X_n \xrightarrow{d} X\) and $f$ is continuous (more generally, continuous outside a set $D$ with $P(X \in D) = 0$), then \(f(X_n) \xrightarrow{d} f(X).\)

This is how nonlinear transformations of CLT limits are justified.
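
For a concrete instance, take $Z_n = S_n/\sqrt{n}$ with $S_n$ a sum of $n$ fair $\pm 1$ steps, so $Z_n \xrightarrow{d} N(0,1)$ by the CLT, and apply $f(z) = z^2$; the limit of $Z_n^2$ is then chi-square with one degree of freedom. This example is an assumption of the sketch, not taken from the text, and the function names are hypothetical. The distribution of $Z_n^2$ is computed exactly from binomial counts, so no simulation is needed.

```python
import math

def cdf_squared_walk(n: int, x: float) -> float:
    """Exact P(Z_n^2 <= x), with Z_n = S_n / sqrt(n) and S_n a sum of n fair +/-1 steps."""
    if x < 0:
        return 0.0
    # S_n = 2K - n with K ~ Binomial(n, 1/2); keep the k with (2k - n)^2 <= n * x.
    favourable = sum(
        math.comb(n, k) for k in range(n + 1) if (2 * k - n) ** 2 <= n * x
    )
    return favourable / 2 ** n

def cdf_chi2_1(x: float) -> float:
    """CDF of the chi-square law with one degree of freedom (the law of N(0,1)^2)."""
    return math.erf(math.sqrt(x / 2.0)) if x > 0 else 0.0

for n in (100, 1000, 10000):
    print(n, cdf_squared_walk(n, 1.0), cdf_chi2_1(1.0))
```

Because $Z_n$ lives on a lattice, the agreement improves only at rate roughly $1/\sqrt{n}$, but the convergence at each continuity point is visible.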


4. Delta Method

If \(\sqrt{n}(X_n - \theta) \xrightarrow{d} N(0,\sigma^2)\) and $g$ is differentiable at $\theta$, then \(\sqrt{n}(g(X_n) - g(\theta)) \xrightarrow{d} N\!\left(0, (g'(\theta))^2\sigma^2\right).\)

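This follows from a first-order Taylor expansion of $g$ around $\theta$ combined with Slutsky’s theorem. If $g'(\theta) = 0$, the limit is degenerate at $0$ and a higher-order expansion is needed.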
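
The variance formula $(g'(\theta))^2\sigma^2$ can be checked numerically. The sketch below assumes $X_n$ is the mean of $n$ Exp(1) draws ($\theta = 1$, $\sigma^2 = 1$) and $g(x) = x^2$, so the predicted limiting variance is $(2\cdot 1)^2 \cdot 1 = 4$. The names, seed, and sample sizes are illustrative.

```python
import math
import random
import statistics

def delta_statistic(n: int, rng: random.Random) -> float:
    """sqrt(n) * (g(mean) - g(1)) with g(x) = x**2 and mean of n Exp(1) draws."""
    mean = sum(rng.expovariate(1.0) for _ in range(n)) / n
    return math.sqrt(n) * (mean ** 2 - 1.0)

rng = random.Random(3)  # fixed seed for reproducibility
values = [delta_statistic(200, rng) for _ in range(3000)]

predicted_var = (2.0 * 1.0) ** 2 * 1.0  # (g'(theta))^2 * sigma^2 with g'(1) = 2
sample_var = statistics.variance(values)
print(sample_var, predicted_var)
```

The sample variance should sit near 4, slightly above it at finite $n$ because of the higher-order terms the Taylor expansion discards.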


Additional Characterizations of Convergence in Distribution

Characteristic Functions (Lévy’s Continuity Theorem)

Let \(\varphi_{X_n}(t) = \mathbb{E}[e^{itX_n}], \qquad \varphi_X(t) = \mathbb{E}[e^{itX}].\)

Then \(X_n \xrightarrow{d} X \quad \Longleftrightarrow \quad \varphi_{X_n}(t) \to \varphi_X(t) \quad \text{for all } t \in \mathbb{R}.\)

This result is known as Lévy’s Continuity Theorem.

Characteristic functions are often used to:

  • prove the Central Limit Theorem,
  • analyze sums of independent random variables,
  • establish Gaussian or stable limits without explicit densities.

Mental trigger: independence + sums + scaling ⇒ characteristic functions.
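
To see the characteristic-function criterion at work, the sketch below evaluates the exact characteristic function of a standardized Binomial$(n,p)$ sum and compares it with $e^{-t^2/2}$, the characteristic function of $N(0,1)$. The choice $p = 0.3$ and the function names are assumptions made for illustration.

```python
import cmath
import math

def cf_standardized_bernoulli_sum(n: int, p: float, t: float) -> complex:
    """Characteristic function at t of (S_n - n p) / sqrt(n p (1 - p)), S_n ~ Binomial(n, p)."""
    s = math.sqrt(n * p * (1.0 - p))
    # One centered, scaled Bernoulli term takes (1 - p)/s w.p. p and -p/s w.p. 1 - p.
    one_term = p * cmath.exp(1j * t * (1.0 - p) / s) + (1.0 - p) * cmath.exp(-1j * t * p / s)
    # Independence turns the sum into a product of n identical factors.
    return one_term ** n

def cf_standard_normal(t: float) -> float:
    """Characteristic function of N(0,1)."""
    return math.exp(-t * t / 2.0)

def cf_gap(n: int, p: float, t: float) -> float:
    """Distance between the two characteristic functions at t."""
    return abs(cf_standardized_bernoulli_sum(n, p, t) - cf_standard_normal(t))

for n in (100, 10000, 100000):
    print(n, cf_gap(n, 0.3, 1.0))
```

This is the computation at the heart of the usual proof of the CLT: independence factorizes the characteristic function, and a Taylor expansion of a single factor gives the Gaussian limit.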


Discrete Special Case

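If $X_n$ and $X$ are discrete random variables supported on a common countable set with no accumulation points (for example, the integers), then \(X_n \xrightarrow{d} X \quad \Longleftrightarrow \quad P(X_n = x) \to P(X = x) \quad \text{for all } x.\) Without the condition on the support the equivalence can fail: $X_n = 1/n$ converges in distribution to $0$, yet $P(X_n = 0) = 0$ for every $n$.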

This criterion is only valid in the purely discrete setting.

It does not apply to continuous random variables, since in that case \(P(X = x) = 0 \quad \text{for all } x.\)

Thus, pointwise convergence of probabilities is a special case, not a general definition of convergence in distribution.
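
A classical integer-valued example of the pointwise criterion is the Poisson approximation: Binomial$(n, \lambda/n)$ converges in distribution to Poisson$(\lambda)$. The sketch below checks the convergence of the probability mass functions directly; $\lambda = 2$, the cutoff `k_max`, and the function names are illustrative assumptions.

```python
import math

def binom_pmf(n: int, p: float, k: int) -> float:
    """P(X = k) for X ~ Binomial(n, p)."""
    return math.comb(n, k) * p ** k * (1.0 - p) ** (n - k)

def poisson_pmf(lam: float, k: int) -> float:
    """P(X = k) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def max_pmf_gap(n: int, lam: float, k_max: int = 10) -> float:
    """Largest |P(X_n = k) - P(X = k)| over k = 0..k_max, with X_n ~ Binomial(n, lam/n)."""
    return max(
        abs(binom_pmf(n, lam / n, k) - poisson_pmf(lam, k)) for k in range(k_max + 1)
    )

for n in (10, 100, 1000, 10000):
    print(n, max_pmf_gap(n, 2.0))
```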

Important Caution

Convergence in distribution is not defined by pointwise probabilities in general.

Valid general characterizations are:

  • convergence of distribution functions at continuity points,
  • convergence of expectations against bounded continuous test functions,
  • convergence of characteristic functions.

Pointwise convergence of probabilities applies only in the discrete case.

Relationship to Other Modes of Convergence

  • Strength hierarchy: \(X_n \xrightarrow{\text{a.s.}} X \;\Rightarrow\; X_n \xrightarrow{p} X \;\Rightarrow\; X_n \xrightarrow{d} X.\)

  • Convergence in distribution does not imply convergence in probability.
  • If the limit $X$ is a constant, then \(X_n \xrightarrow{d} c \;\Rightarrow\; X_n \xrightarrow{p} c.\)
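
A simple counterexample shows why convergence in distribution does not give convergence in probability: let $X \sim N(0,1)$ and set $X_n = -X$ for every $n$. By symmetry each $X_n$ has the same law as $X$, so $X_n \xrightarrow{d} X$ trivially, yet $|X_n - X| = 2|X|$ never shrinks. The sketch below illustrates this by simulation; the seed, sample size, and names are arbitrary choices.

```python
import random

def ecdf(sample: list, x: float) -> float:
    """Fraction of sample values that are <= x."""
    return sum(v <= x for v in sample) / len(sample)

rng = random.Random(2)  # fixed seed for reproducibility
xs = [rng.gauss(0.0, 1.0) for _ in range(5000)]  # draws of X
xn = [-x for x in xs]                            # the matching draws of X_n = -X

# Same law: the empirical distribution functions nearly coincide.
print(ecdf(xs, 0.5), ecdf(xn, 0.5))

# Not close in probability: |X_n - X| = 2|X| exceeds 1 with large probability.
far_apart = sum(abs(a - b) > 1.0 for a, b in zip(xn, xs)) / len(xs)
print(far_apart)
```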

Expectations and Convergence in Distribution

Convergence in distribution alone does not imply \(\mathbb{E}[X_n] \to \mathbb{E}[X].\)

To pass expectations through the limit, one needs:

  • uniform integrability, or
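  • additional moment control, for example $\sup_n \mathbb{E}|X_n|^{1+\delta} < \infty$ for some $\delta > 0$ (which implies uniform integrability).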

This is a common source of mistakes.
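
The standard counterexample takes $X_n = n$ with probability $1/n$ and $X_n = 0$ otherwise. Then $X_n \xrightarrow{d} 0$, but $\mathbb{E}[X_n] = 1$ for every $n$, so the expectations do not converge to $\mathbb{E}[0] = 0$. The sketch below verifies both facts with exact rational arithmetic; the function names are hypothetical.

```python
from fractions import Fraction

def mean_xn(n: int) -> Fraction:
    """E[X_n] for X_n = n with probability 1/n and 0 otherwise."""
    return n * Fraction(1, n)

def cdf_xn(n: int, x: float) -> Fraction:
    """P(X_n <= x) for the same two-point law."""
    if x < 0:
        return Fraction(0)
    if x < n:
        return 1 - Fraction(1, n)  # only the atom at 0 lies at or below x
    return Fraction(1)

# x = 0.5 is a continuity point of the limit law (a point mass at 0).
for n in (10, 1000, 100000):
    print(n, float(cdf_xn(n, 0.5)), mean_xn(n))
```

The sequence fails to be uniformly integrable because all of its mean is carried by an atom that escapes to infinity.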


Key Facts to Internalize

  • Convergence in distribution is the weakest standard mode of convergence.
  • It depends only on the laws of $X_n$ and $X$, not on the probability spaces on which they are defined.
  • It is the natural language of asymptotic theory (CLT, invariance principles).
  • It is stable under continuous transformations and Slutsky-type operations.

One-line Summary

Convergence in distribution means the laws of $X_n$ converge to the law of $X$, and it most commonly arises through the Central Limit Theorem, Slutsky’s theorem, and continuous mappings.
