2023-Q3 – Heavy-Tailed Symmetric Variables, Truncation, and Limit Laws
2023 Probability Prelim Exam (PDF)
Problem Statement
Let $X, X_1, X_2, \dots$ be a sequence of i.i.d. random variables such that
- $X$ is symmetric, meaning $X \overset{d}{=} -X$,
- The tail satisfies
\(P(|X|>x)= \begin{cases} 1, & 0\le x < e,\\[4pt] \dfrac{1}{x^2\ln x}, & x \ge e. \end{cases}\)
Define the truncated variables \(Y_{n,m} = X_m \cdot \mathbf{1}\{|X_m|\le \sqrt{n}\}, \qquad m=1,\dots,n.\)
Solve the following:
(a) Compute $E(X^2)$.
(b)
(i) Prove
\(\sum_{m=1}^n P(Y_{n,m}\neq X_m)\;\xrightarrow[n\to\infty]{}\;0.\)
(ii) Show
\(E(Y_{n,m}^2) \sim 2\ln(\ln n)
\qquad\text{as } n\to\infty.\)
(c) Prove the following two sequences converge in distribution and identify the limit:
(i) \(\frac{\sum_{m=1}^n Y_{n,m}}{\sqrt{2n\ln(\ln n)}}\)
(ii) \(\frac{\sum_{m=1}^n X_m}{\sqrt{2n\ln(\ln n)}}\)
Solution
(a) Divergence of the second moment
Claim.
$E(X^2)=\infty$.
Proof.
Since $X^2=|X|^2$ and $|X|\ge 0$, the tail formula for moments of a nonnegative random variable gives
\(E(X^2)=\int_0^\infty 2x\,P(|X|>x)\,dx.\)
For $x\ge e$ we use the given tail: \(P(|X|>x)=\frac{1}{x^2\ln x}.\)
The contribution from $[e,\infty)$ is therefore \(\int_e^\infty 2x \cdot \frac{1}{x^2\ln x}\,dx = 2\int_e^\infty \frac{dx}{x\ln x}.\)
Since $\frac{d}{dx}\ln(\ln x)=\frac{1}{x\ln x}$, we have
\(\int_e^T \frac{dx}{x\ln x}=\ln(\ln T)\xrightarrow[T\to\infty]{}\infty,\)
so the integral diverges.
Therefore $E(X^2)=\infty$.
Conclusion.
$X$ has infinite variance; the truncation at $\sqrt{n}$ in later parts is essential.
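As a quick numerical sanity check of the divergence, the sketch below integrates $2x\,P(|X|>x)$ up to a finite cutoff $T$ and compares the result with the closed form $e^2+2\ln(\ln T)$. The function names, the cutoff values, and the midpoint rule in the variable $u=\ln x$ are illustrative choices, not part of the original solution.

```python
import math

def tail(x):
    """P(|X| > x) as given in the problem statement."""
    return 1.0 if x < math.e else 1.0 / (x * x * math.log(x))

def partial_second_moment(T, steps=100_000):
    """Midpoint-rule value of the integral of 2x P(|X|>x) over [0, T], for T >= e.

    On [e, T] we substitute x = exp(u), so dx = x du and u runs over [1, ln T].
    """
    head = math.e ** 2                 # integral of 2x * 1 over [0, e]
    a, b = 1.0, math.log(T)
    h = (b - a) / steps
    body = 0.0
    for k in range(steps):
        x = math.exp(a + (k + 0.5) * h)
        body += 2.0 * x * tail(x) * x * h
    return head + body

def closed_form(T):
    """e^2 + 2 ln(ln T), the exact value of the same integral."""
    return math.e ** 2 + 2.0 * math.log(math.log(T))

for T in (1e2, 1e4, 1e8, 1e16):
    print(f"T = {T:.0e}: numeric = {partial_second_moment(T):.6f}, "
          f"closed form = {closed_form(T):.6f}")
```

The values keep increasing with $T$, but only at the very slow rate $\ln(\ln T)$.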
(b)(i) Probability that truncation changes $X_m$
Claim.
\(\sum_{m=1}^n P(Y_{n,m}\neq X_m) = \sum_{m=1}^n P(|X|>\sqrt{n}) \longrightarrow 0.\)
Proof.
The event $Y_{n,m}\neq X_m$ occurs exactly when
\(|X_m|>\sqrt{n}.\)
Since the $X_m$ are identically distributed, \(\sum_{m=1}^n P(Y_{n,m}\neq X_m) = n\,P(|X|>\sqrt{n}).\)
For $\sqrt{n}\ge e$, \(P(|X|>\sqrt{n}) = \frac{1}{n\,\ln(\sqrt{n})} = \frac{2}{n\ln n}.\)
Hence \(n\cdot \frac{2}{n\ln n} = \frac{2}{\ln n} \to 0.\)
Conclusion.
By the union bound, $P(Y_{n,m}\neq X_m \text{ for some } m\le n)\le 2/\ln n\to 0$: with probability tending to one, truncation changes none of the $n$ variables.
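The identity $n\,P(|X|>\sqrt{n})=2/\ln n$ can be tabulated directly from the tail function. This is a minimal sketch; `tail` and `expected_truncations` are names introduced here for illustration.

```python
import math

def tail(x):
    """P(|X| > x) as given in the problem statement."""
    return 1.0 if x < math.e else 1.0 / (x * x * math.log(x))

def expected_truncations(n):
    """Sum over m of P(Y_{n,m} != X_m), i.e. n * P(|X| > sqrt(n))."""
    return n * tail(math.sqrt(n))

for n in (10**2, 10**4, 10**8, 10**16):
    print(f"n = {n:.0e}: n*P(|X|>sqrt(n)) = {expected_truncations(n):.6f}, "
          f"2/ln n = {2.0 / math.log(n):.6f}")
```

The two columns agree, and both decay only logarithmically in $n$.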
(b)(ii) Asymptotic variance of the truncated variable
Claim.
\(E(Y_{n,m}^2) \sim 2\ln(\ln n).\)
Proof.
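Because $Y_{n,m}^2 = X_m^2\,\mathbf{1}\{|X_m|\le\sqrt{n}\}$, the tail formula applied on $[0,\sqrt{n}]$ gives \(E(Y_{n,m}^2) = \int_0^{\sqrt{n}} 2x\,P(x<|X|\le\sqrt{n})\,dx = \int_0^{\sqrt{n}} 2x\,P(|X|>x)\,dx \;-\; n\,P(|X|>\sqrt{n}).\)
By (b)(i), the subtracted term equals $2/\ln n$ and tends to $0$, so only the integral matters.
Split the integral at $e$:
- The part $\int_0^e 2x\cdot 1\,dx = e^2$ is a constant and irrelevant asymptotically.
- For $x\ge e$, \(2x\,P(|X|>x) = 2x\cdot \frac{1}{x^2\ln x} = \frac{2}{x\ln x}.\)
Thus for the dominant term: \(\int_e^{\sqrt{n}} \frac{2}{x\ln x}\,dx = 2\ln(\ln\sqrt{n}) - 2\ln(\ln e) = 2\ln\!\big(\tfrac12\ln n\big) = 2\ln(\ln n) - 2\ln 2.\)
Altogether, \(E(Y_{n,m}^2) = 2\ln(\ln n) + e^2 - 2\ln 2 - \frac{2}{\ln n},\) and dividing by $2\ln(\ln n)\to\infty$ shows that the ratio tends to $1$.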
Conclusion.
\(E(Y_{n,m}^2) \sim 2\ln(\ln n).\)
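To see how slowly this asymptotic equivalence sets in, the sketch below evaluates the ratio $E(Y_{n,m}^2)/(2\ln(\ln n))$. It assumes the exact expression $E(Y_{n,m}^2)=e^2+2\ln(\tfrac12\ln n)-2/\ln n$, which is the tail integral up to $\sqrt{n}$ minus the boundary term $n\,P(|X|>\sqrt{n})$. The functions take $L=\ln n$ as input (an illustrative choice) so that astronomically large $n$ can be probed without overflow.

```python
import math

def truncated_second_moment(log_n):
    """E(Y_{n,m}^2) as a function of L = ln n, valid for L >= 2.

    Uses E(Y^2) = e^2 + 2 ln(L / 2) - 2 / L.
    """
    return math.e ** 2 + 2.0 * math.log(log_n / 2.0) - 2.0 / log_n

def ratio_to_leading_term(log_n):
    """E(Y_{n,m}^2) divided by its claimed leading term 2 ln(ln n)."""
    return truncated_second_moment(log_n) / (2.0 * math.log(log_n))

for log_n in (math.log(1e6), 1e2, 1e6, 1e50, 1e300):
    print(f"ln n = {log_n:.3e}: ratio = {ratio_to_leading_term(log_n):.4f}")
```

The ratio does approach $1$, but the constant $e^2-2\ln 2$ is not negligible against $2\ln(\ln n)$ for any $n$ that could be simulated in practice.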
(c)(i) Limit distribution of the truncated sum
Claim.
\(\frac{\sum_{m=1}^n Y_{n,m}}{\sqrt{2n\ln(\ln n)}} \;\xrightarrow{d}\; N(0,1).\)
Proof.
For each fixed $n$, the variables $Y_{n,1},\dots,Y_{n,n}$ are i.i.d., so $\{Y_{n,m}\}$ is a triangular array whose rows consist of independent variables.
We check the hypotheses of the Lindeberg–Feller CLT.
- Mean zero: symmetry gives $E(Y_{n,m})=0$.
- Variance: \(s_n^2 = \sum_{m=1}^n E(Y_{n,m}^2) \sim n\cdot 2\ln(\ln n).\)
- Lindeberg condition: For any $\varepsilon>0$, consider \(L_n(\varepsilon) = \frac{1}{s_n^2}\sum_{m=1}^n E\!\left[ Y_{n,m}^2\;\mathbf{1}\{|Y_{n,m}|>\varepsilon s_n\}\right].\) But $|Y_{n,m}|\le \sqrt{n}$, while $s_n\asymp \sqrt{n\ln\ln n}$. For any fixed $\varepsilon$, for large $n$, \(|Y_{n,m}| \le \sqrt{n} < \varepsilon s_n,\) so the indicator is zero eventually. Hence $L_n(\varepsilon)=0$ for large $n$.
Thus Lindeberg–Feller applies: \(\frac{\sum_{m=1}^n Y_{n,m}}{s_n} \;\xrightarrow{d}\; N(0,1).\)
Since
\(s_n = \sqrt{n\,E(Y_{n,m}^2)}\sim \sqrt{2n\ln(\ln n)},\)
the ratio $s_n/\sqrt{2n\ln(\ln n)}$ tends to $1$, and Slutsky's theorem gives
\(\frac{\sum_{m=1}^n Y_{n,m}}{\sqrt{2n\ln(\ln n)}}\xrightarrow{d}N(0,1).\)
Conclusion.
The truncated sum, normalized by $\sqrt{2n\ln(\ln n)}$, converges in distribution to the standard normal.
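The Lindeberg indicator vanishes as soon as $\sqrt{n}\le\varepsilon s_n$, which with $s_n^2=n\,E(Y_{n,m}^2)$ is the same as $E(Y_{n,m}^2)\ge 1/\varepsilon^2$. The sketch below solves for the smallest $u=\ln(\ln n)$ at which this happens, assuming the exact expression $E(Y_{n,m}^2)=e^2+2(u-\ln 2)-2e^{-u}$. The function names, the bisection, and its upper bracket `hi` are illustrative assumptions.

```python
import math

def second_moment_from_loglog(u):
    """E(Y_{n,m}^2) written in terms of u = ln(ln n), for u >= ln 2."""
    return math.e ** 2 + 2.0 * (u - math.log(2.0)) - 2.0 * math.exp(-u)

def loglog_threshold(eps, hi=1e6):
    """Smallest u = ln(ln n) with E(Y^2) >= 1/eps^2, found by bisection.

    Assumes the target 1/eps^2 is reached below the bracket `hi`.
    """
    target = 1.0 / eps ** 2
    lo = math.log(2.0)                 # corresponds to n = e^2
    if second_moment_from_loglog(lo) >= target:
        return lo
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if second_moment_from_loglog(mid) >= target:
            hi = mid
        else:
            lo = mid
    return hi

for eps in (0.3, 0.1, 0.05):
    u = loglog_threshold(eps)
    print(f"eps = {eps}: indicator vanishes once ln(ln n) >= {u:.3f}")
```

For $\varepsilon=0.1$ the threshold is near $\ln(\ln n)\approx 47$, so "for large $n$" here means $n$ of order $\exp(\exp(47))$: the argument is purely asymptotic.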
(c)(ii) Convergence of the full sum
Claim.
\(\frac{\sum_{m=1}^n X_m}{\sqrt{2n\ln(\ln n)}} \;\xrightarrow{d}\; N(0,1).\)
Proof.
Let
\(D_n = \sum_{m=1}^n (X_m - Y_{n,m}).\)
The sum $D_n$ is nonzero only if $X_m\neq Y_{n,m}$ for some $m\le n$, so by the union bound and part (b)(i),
\(P(D_n\neq 0)\le \sum_{m=1}^n P(|X_m|>\sqrt{n}) = \frac{2}{\ln n}\to 0.\)
Hence $D_n \xrightarrow{P} 0$. Since $D_n=0$ with probability tending to one, the same holds for $D_n$ divided by any positive normalizing sequence.
We already know from (c)(i) that
\(\frac{\sum_{m=1}^n Y_{n,m}}{\sqrt{2n\ln(\ln n)}}\xrightarrow{d}N(0,1).\)
Apply Slutsky's theorem: \(\frac{\sum_{m=1}^n X_m}{\sqrt{2n\ln(\ln n)}} = \frac{\sum_{m=1}^n Y_{n,m}}{\sqrt{2n\ln(\ln n)}} + \frac{D_n}{\sqrt{2n\ln(\ln n)}} \;\xrightarrow{d}\; N(0,1).\)
Conclusion.
The full untruncated sum has the same limit distribution as the truncated one.
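The probability that at least one of the $n$ variables is truncated can also be computed exactly by independence and compared with the union bound $2/\ln n$ used above. This is a sketch with hypothetical function names; it assumes $n\ge 8$ so that $\sqrt{n}\ge e$ and the tail formula applies.

```python
import math

def prob_some_truncation(n):
    """Exact P(X_m != Y_{n,m} for some m <= n) = 1 - (1 - p)^n, p = 2/(n ln n).

    expm1/log1p keep the result accurate when p is tiny.
    """
    p = 2.0 / (n * math.log(n))
    return -math.expm1(n * math.log1p(-p))

def union_bound(n):
    """Upper bound n * P(|X| > sqrt(n)) = 2 / ln n."""
    return 2.0 / math.log(n)

for n in (10, 10**3, 10**6, 10**12):
    print(f"n = {n:.0e}: exact = {prob_some_truncation(n):.6f}, "
          f"bound = {union_bound(n):.6f}")
```

Both quantities tend to $0$, which is all the proof needs, but they do so only at rate $1/\ln n$.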
Key Takeaways
- Infinite-variance variables can still produce CLT-type results if one truncates at a growing level; the variance behaves like \(\operatorname{Var}(Y_{n,m})\sim 2\ln(\ln n).\)
- Tail integration technique: For heavy tails, \(E(X^2)=2\int_0^\infty x\,P(|X|>x)\,dx\) is often easier than integrating the density.
- When Lindeberg holds automatically: If $|Y_{n,m}|\le a_n$ and $a_n/s_n\to 0$, then the Lindeberg indicator is eventually always zero.
- Truncation + union bound + Slutsky:
  - Truncate heavy tails to obtain a workable variance,
  - Show truncation errors vanish in probability,
  - Apply Slutsky to transfer the limit distribution from the truncated sum to the full sum.
- Heavy-tail behavior: The integral $\int dx /(x\ln x)$ diverges like $\ln(\ln x)$, which is why $\ln(\ln n)$ naturally appears.