Lecture 32 — Scheffé’s Theorem, Total Variation, and Portmanteau Properties
1. Scheffé’s Theorem
Let $f_n$ be the pdf of $X_n$ and let $f$ be the pdf of $X$.
Assume: $f_n(x)\to f(x)$ for every $x\in\mathbb R$, i.e. the densities converge pointwise.
(Sketched on page 1.)
Goal
Show that \(\sup_{B\in\mathcal B(\mathbb R)} \vert P(X_n\in B) - P(X\in B) \vert \;\longrightarrow\; 0,\) i.e. convergence in total variation.
Key inequality
For any Borel set $B$, \(\vert P(X_n\in B) - P(X\in B)\vert = \vert \int_B f_n(x)\,dx - \int_B f(x)\,dx\vert \le \int_B \vert f_n(x) - f(x)\vert \,dx.\)
Thus, \(\sup_B \vert P(X_n\in B) - P(X\in B)\vert \le \int_{\mathbb R} \vert f_n(x)-f(x)\vert \,dx. \tag{1}\)
Hence it suffices to show: \(\int_{\mathbb R}\vert f_n-f\vert \to 0.\)
2. Positivity decomposition
Using \(a = a^+ - a^-, \qquad \vert a\vert = a^+ + a^-,\) write: \(\int_{\mathbb R}(f - f_n) = \int (f - f_n)^+ - \int (f - f_n)^-.\)
But \(\int (f - f_n) = 0 \quad\text{since both integrate to 1}.\)
Thus \(\int (f - f_n)^+ = \int (f - f_n)^-,\) and because $\vert a\vert = a^+ + a^-$, \(\int \vert f - f_n\vert = 2\int (f - f_n)^+. \tag{2}\)
Dominating function
On page 1 your notes check the DCT bound:
- If $f_n(x) \ge f(x)$ then $(f(x)-f_n(x))^+ = 0 \le f(x)$.
- If $f_n(x) < f(x)$ then $0 < f(x)-f_n(x)\le f(x)$, because $f_n(x)\ge 0$.
So: \((f(x)-f_n(x))^+ \le f(x).\)
Since $f_n\to f$ pointwise, we have $(f-f_n)^+\to 0$ pointwise, and the dominating function $f$ is integrable.
DCT gives:
\[\int (f - f_n)^+ \to 0.\]
Plugging into (2):
\[\int \vert f_n-f\vert \to 0.\]
Then (1) gives total variation convergence.
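As a numerical sanity check of the argument, here is a minimal sketch under an assumed example family that is not in the notes: $f_n$ is the $N(0, 1+1/n)$ density and $f$ is the $N(0,1)$ density, so $f_n\to f$ pointwise. The helper names and the integration grid are illustrative choices. The script approximates $\int\vert f_n-f\vert$ and $2\int (f-f_n)^+$ by Riemann sums, so the two columns should agree (identity (2)) and shrink as $n$ grows.

```python
import math

def normal_pdf(x, var):
    """Density of N(0, var) at x."""
    return math.exp(-x * x / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def l1_and_positive_part(n, grid, dx):
    """Riemann sums for int |f_n - f| and int (f - f_n)^+.

    Assumed example: f is the N(0, 1) density, f_n the N(0, 1 + 1/n) density.
    """
    l1 = 0.0
    pos = 0.0
    for x in grid:
        diff = normal_pdf(x, 1.0) - normal_pdf(x, 1.0 + 1.0 / n)
        l1 += abs(diff) * dx
        pos += max(diff, 0.0) * dx
    return l1, pos

dx = 0.001
grid = [-12.0 + dx * i for i in range(24001)]  # covers [-12, 12]; tails are negligible
results = {n: l1_and_positive_part(n, grid, dx) for n in (1, 10, 100, 1000)}

for n, (l1, pos) in results.items():
    print(f"n={n:5d}  int|f_n-f| ~ {l1:.6f}   2*int(f-f_n)^+ ~ {2.0 * pos:.6f}")
```

The grid truncation is harmless here only because Gaussian tails beyond $\pm 12$ carry essentially no mass; a heavier-tailed family would need a wider grid.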
3. Consequence for Distribution Functions
In our setting, $P(X=x)=0$ for all $x\in\mathbb R$.
Thus the CDFs are continuous.
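If $\vert M_n - M\vert _{TV}\to 0$, where $M_n(B)=P(X_n\in B)$ and $M(B)=P(X\in B)$, then taking $B=(-\infty,x]$ in the supremum gives:
\[\sup_{x\in\mathbb R}\vert F_{X_n}(x) - F_X(x)\vert \le \vert M_n - M\vert _{TV} \to 0,\]
that is, $F_{X_n}(x) = P(X_n \le x) \to P(X\le x) = F_X(x)$ uniformly in $x$.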
This comment appears in your notes as:
“Convergence should be uniformly.”
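The comparison between the uniform CDF distance and the total variation distance can be checked numerically. The sketch below assumes, purely for illustration, $X_n\sim N(0,1+1/n)$ and $X\sim N(0,1)$; the function names and the grid are hypothetical. It uses the standard fact that $\sup_B\vert M_n(B)-M(B)\vert$ equals half of $\int\vert f_n-f\vert$ when both measures have densities.

```python
import math

def normal_cdf(x, var):
    """CDF of N(0, var) at x."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0 * var)))

def normal_pdf(x, var):
    """Density of N(0, var) at x."""
    return math.exp(-x * x / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def kolmogorov_and_tv(n, grid, dx):
    """Grid approximations of sup_x |F_n(x) - F(x)| and of the TV distance."""
    var_n = 1.0 + 1.0 / n  # assumed example: X_n ~ N(0, 1 + 1/n), X ~ N(0, 1)
    ks = max(abs(normal_cdf(x, var_n) - normal_cdf(x, 1.0)) for x in grid)
    # TV distance = half the L1 distance between the two densities.
    tv = 0.5 * dx * sum(abs(normal_pdf(x, var_n) - normal_pdf(x, 1.0)) for x in grid)
    return ks, tv

dx = 0.001
grid = [-12.0 + dx * i for i in range(24001)]
distances = {n: kolmogorov_and_tv(n, grid, dx) for n in (1, 10, 100)}

for n, (ks, tv) in distances.items():
    print(f"n={n:4d}  sup_x|F_n-F| ~ {ks:.6f}  <=  TV ~ {tv:.6f}")
```

Half-lines $(-\infty,x]$ are only one family of Borel sets, which is why the first column never exceeds the second.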
4. Total Variation Norm
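For probability measures $M_n(B)=P(X_n\in B)$ and $M(B)=P(X\in B)$, the total variation distance is:
\[\vert M_n - M\vert _{TV} = \sup_{B\in \mathcal B(\mathbb R)} \vert M_n(B) - M(B)\vert .\]
Convergence $\vert M_n - M\vert _{TV}\to 0$ is stronger than convergence in distribution. For example, the point masses at $1/n$ converge in distribution to the point mass at $0$, yet their total variation distance from it equals $1$ for every $n$.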
5. Example — The Empirical Median of Uniform[0,1]
(Derived on pages 1–2; reference to Durrett 3.2.6.)
Let $\{U_k\}$ be i.i.d. uniform on $[0,1]$.
Let $U_{(1)}\le U_{(2)}\le \dots \le U_{(2n+1)}$ be the order statistics.
The sample median is
\(V_{n+1} = U_{(n+1)}.\)
Density (page 2 diagram)
\(f_{V_{n+1}}(x) = \frac{(2n+1)!}{n!\,1!\,n!}\, x^n (1-x)^n, \qquad 0<x<1.\)
This comes from the multinomial probability that exactly $n$ observations lie left of $x$, one lies in $[x, x+dx]$, and $n$ lie right of $x$.
Mean
\(E[V_{n+1}] = \frac12.\)
Variance
Using the Beta integral identity (page 2): \(\int_0^1 x^k(1-x)^l\,dx = \frac{k!\,l!}{(k+l+1)!},\) one obtains \(E[V_{n+1}^2] = \frac{n+2}{2(2n+3)},\) and therefore \(\operatorname{Var}(V_{n+1}) = \frac{n+2}{2(2n+3)} - \frac14 = \frac{1}{4(2n+3)} \sim \frac{1}{8n}.\)
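The moment computation can be verified exactly with rational arithmetic. The sketch below is only an illustration (the function names are hypothetical): it applies the Beta integral identity to the density constant $(2n+1)!/(n!\,n!)$ and compares the resulting variance with the closed form $1/(4(2n+3))$, which is the variance of a $\mathrm{Beta}(n+1,n+1)$ law.

```python
from fractions import Fraction
from math import factorial

def beta_integral(k, l):
    """Exact value of int_0^1 x^k (1-x)^l dx = k! l! / (k+l+1)!."""
    return Fraction(factorial(k) * factorial(l), factorial(k + l + 1))

def median_moments(n):
    """Exact total mass, mean and variance of the median of 2n+1 uniforms."""
    c = Fraction(factorial(2 * n + 1), factorial(n) ** 2)  # density constant
    total = c * beta_integral(n, n)       # should equal 1
    mean = c * beta_integral(n + 1, n)    # should equal 1/2
    second = c * beta_integral(n + 2, n)  # E[V^2]
    return total, mean, second - mean ** 2

for n in (1, 5, 50):
    total, mean, var = median_moments(n)
    print(f"n={n:3d}  mass={total}  mean={mean}  var={var}  8n*var={float(8 * n * var):.4f}")
```

The last column approaches 1 as $n$ grows, consistent with the asymptotic $\operatorname{Var}(V_{n+1})\sim 1/(8n)$.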
Normalization
Define \(Y_n = 2\sqrt{2n}\left(V_{n+1} - \frac12\right).\)
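Then (page 2), by the change of variables $x = \tfrac12 + \tfrac{y}{2\sqrt{2n}}$, for $\vert y\vert < \sqrt{2n}$: \(f_{Y_n}(y) = c_n\left(1 - \frac{y^2}{2n}\right)^n \quad\to\quad \frac{1}{\sqrt{2\pi}}\,e^{-y^2/2},\) where \(c_n = \frac{(2n+1)!}{n!\,n!}\cdot\frac{4^{-n}}{2\sqrt{2n}} \to \frac{1}{\sqrt{2\pi}}\) by Stirling's formula.
The densities converge pointwise, so Scheffé's theorem gives convergence in total variation, and in particular \(Y_n \Rightarrow N(0,1).\)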
This is a CLT for the sample median under uniform sampling.
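Tracking the normalizing constant explicitly, the change of variables $x=\tfrac12+\tfrac{y}{2\sqrt{2n}}$ gives $f_{Y_n}(y)=c_n\,(1-y^2/(2n))^n$ on $\vert y\vert<\sqrt{2n}$ with $c_n=\frac{(2n+1)!}{n!\,n!}\cdot\frac{4^{-n}}{2\sqrt{2n}}$. The sketch below (illustrative function names, log-space arithmetic to avoid overflow) checks numerically that $c_n$ approaches $1/\sqrt{2\pi}$ and that $f_{Y_n}$ approaches the standard normal density at a few points.

```python
import math

def log_c(n):
    """log of c_n = (2n+1)! / (n!)^2 * 4^(-n) / (2 sqrt(2n))."""
    return (math.lgamma(2 * n + 2) - 2.0 * math.lgamma(n + 1)
            - n * math.log(4.0) - math.log(2.0 * math.sqrt(2.0 * n)))

def density_y(y, n):
    """Density of Y_n = 2 sqrt(2n) (V_{n+1} - 1/2) at y; zero outside |y| < sqrt(2n)."""
    if y * y >= 2.0 * n:
        return 0.0
    return math.exp(log_c(n) + n * math.log1p(-y * y / (2.0 * n)))

def std_normal_pdf(y):
    return math.exp(-0.5 * y * y) / math.sqrt(2.0 * math.pi)

target = 1.0 / math.sqrt(2.0 * math.pi)
for n in (10, 100, 10_000):
    c_n = math.exp(log_c(n))
    gap = max(abs(density_y(y, n) - std_normal_pdf(y)) for y in (0.0, 0.5, 1.0, 2.0))
    print(f"n={n:6d}  c_n={c_n:.6f}  (target {target:.6f})  max density gap={gap:.2e}")
```

Since the densities converge pointwise, Scheffé's theorem upgrades the conclusion from weak convergence to convergence in total variation.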
6. Portmanteau Theorem (Durrett Thm 3.2.5)
Your notes (page 2–3) list the four equivalent conditions for weak convergence.
Let $X_n$ and $X$ be random variables.
The following are equivalent:
- (1) CDF convergence: $X_n \Rightarrow X$, i.e. $F_{X_n}(x)\to F_X(x)$ at every continuity point $x$ of $F_X$.
- (2) Open sets: \(\liminf_{n\to\infty} P(X_n\in O) \ge P(X\in O),\qquad \forall\ O\ \text{open}.\)
- (3) Closed sets: \(\limsup_{n\to\infty} P(X_n\in C) \le P(X\in C),\qquad \forall\ C\ \text{closed}.\)
- (4) Boundary condition: for all Borel sets $A$ with boundary $\partial A$ satisfying \(P(X \in \partial A)=0,\) we have \(P(X_n\in A)\to P(X\in A).\)
Durrett Probability 3.2.5 - Portmanteau Theorem
The following are equivalent: \(\begin{aligned} &\quad(i)\quad X_n\Rightarrow X_\infty, \\ &\quad(ii)\quad \text{ for all open sets } G, \liminf_{n\to\infty} P(X_n\in G)\ge P(X_\infty\in G), \\ &\quad(iii)\quad \text{ for all closed sets } F, \limsup_{n\to\infty} P(X_n\in F)\le P(X_\infty\in F), \\ &\quad(iv)\quad \text{ for all Borel sets }A\text{ with } P(X_\infty\in \partial A)=0, \lim_{n\to\infty} P(X_n\in A)= P(X_\infty\in A). \end{aligned}\)
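A small deterministic example shows why the open-set and closed-set conditions are inequalities and why the boundary condition is needed. The sketch assumes $X_n = 1/n$ and $X = 0$ (constants, so $X_n\Rightarrow X$); the sets and helper names are illustrative and not taken from the notes.

```python
def point_mass_prob(point, indicator):
    """P(X in A) when X is the constant `point` and A is described by `indicator`."""
    return 1.0 if indicator(point) else 0.0

def in_open(x):    # O = (0, 1), boundary {0, 1}, which carries all the mass of X = 0
    return 0.0 < x < 1.0

def in_closed(x):  # C = {0}
    return x == 0.0

def in_good(x):    # A = (-1, 1), boundary {-1, 1}, and P(X in boundary) = 0
    return -1.0 < x < 1.0

ns = range(2, 201)
p_open = [point_mass_prob(1.0 / n, in_open) for n in ns]
p_closed = [point_mass_prob(1.0 / n, in_closed) for n in ns]
p_good = [point_mass_prob(1.0 / n, in_good) for n in ns]

limit_open = point_mass_prob(0.0, in_open)      # P(X in O) = 0
limit_closed = point_mass_prob(0.0, in_closed)  # P(X in C) = 1
limit_good = point_mass_prob(0.0, in_good)      # P(X in A) = 1

print("open set:   P(X_n in O) =", min(p_open), ">= P(X in O) =", limit_open)
print("closed set: P(X_n in C) =", max(p_closed), "<= P(X in C) =", limit_closed)
print("null boundary: P(X_n in A) =", p_good[-1], "-> P(X in A) =", limit_good)
```

Both inequalities are strict here, and $P(X_n\in O)$ does not converge to $P(X\in O)$ precisely because $P(X\in\partial O)=1$.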
Sketch: (4) ⇒ (1)
(Page 3 diagram.)
Take the interval $A=(-\infty,x]$.
Its boundary is $\{x\}$.
If $P(X=x)=0$, i.e. $x$ is a continuity point of $F_X$, then (4) gives:
\[P(X_n\le x) \to P(X\le x),\]
which is exactly the definition of convergence in distribution.
Sketch: (1) ⇒ (2)
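Suppose $X_n\to X$ almost surely (the stronger assumption used in notes for intuition; Skorokhod's representation theorem supplies copies of $X_n$ and $X$ with the same distributions and this property). If $X(\omega)\in O$ and $O$ is open, then $X_n(\omega)\in O$ for all large $n$, so \(\liminf_n \mathbb{1}_{\{X_n\in O\}} \ge \mathbb{1}_{\{X\in O\}}\) except on a null set.
Then by Fatou’s lemma:
\[\liminf_n P(X_n\in O) = \liminf_n E[\mathbb{1}_{\{X_n \in O\}}] \ge E[\liminf_n \mathbb{1}_{\{X_n\in O\}}] \ge P(X\in O).\]
(2) ⇒ (3)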
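Apply (2) to the open set $O = C^c$.
Then take complements, using $P(X_n\in C) = 1 - P(X_n\in O)$:
\[\limsup_n P(X_n\in C) = 1 - \liminf_n P(X_n\in O) \le 1 - P(X\in O) = P(X\in C).\]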
(3) ⇒ (4)
For any Borel set $A$:
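\[A^\circ \subseteq A \subseteq \overline A.\]
Since \(\partial A = \overline A \setminus A^\circ,\) the condition $P(X\in\partial A)=0$ ensures:
\[P(X\in \overline A) = P(X\in A) = P(X\in A^\circ).\]
Apply (2) to the interior and (3) to the closure (recall that (2) follows from (3) by taking complements):
\[P(X\in A) = P(X\in A^\circ) \le \liminf_n P(X_n\in A^\circ) \le \liminf_n P(X_n\in A) \le \limsup_n P(X_n\in A) \le \limsup_n P(X_n\in \overline A) \le P(X\in \overline A) = P(X\in A).\]
Hence $P(X_n\in A)\to P(X\in A)$.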
Cheat-Sheet Summary — Lecture 32
- Scheffé’s theorem: pointwise convergence of pdfs ⇒ convergence in total variation. Uses positivity decomposition and DCT.
- Total variation convergence implies uniform CDF convergence.
- Empirical median CLT for uniform samples: \(2\sqrt{2n}\left(V_{n+1}-1/2\right) \Rightarrow N(0,1).\)
- Portmanteau Theorem gives 4 equivalent formulations of weak convergence:
  - CDF convergence.
  - Open-set inequality.
  - Closed-set inequality.
  - Convergence on sets with zero boundary mass.