2023-Q3 – Heavy-Tailed Symmetric Variables, Truncation, and Limit Laws

2023 Probability Prelim Exam (PDF)

Problem Statement

Let $X, X_1, X_2, \dots$ be a sequence of i.i.d. random variables such that

  • $X$ is symmetric, meaning $X \overset{d}{=} -X$,
  • The tail satisfies
    \(P(|X|>x)= \begin{cases} 1, & 0\le x < e,\\[4pt] \dfrac{1}{x^2\ln x}, & x \ge e. \end{cases}\)

Define the truncated variables \(Y_{n,m} = X_m \cdot \mathbf{1}\{|X_m|\le \sqrt{n}\}, \qquad m=1,\dots,n.\)

Solve the following:

(a) Compute $E(X^2)$.

(b)
(i) Prove
\(\sum_{m=1}^n P(Y_{n,m}\neq X_m)\;\xrightarrow[n\to\infty]{}\;0.\)

(ii) Show
\(E(Y_{n,m}^2) \sim 2\ln(\ln n) \qquad\text{as } n\to\infty.\)

(c) Prove the following two sequences converge in distribution and identify the limit:

(i) \(\frac{\sum_{m=1}^n Y_{n,m}}{\sqrt{2n\ln(\ln n)}}\)

(ii) \(\frac{\sum_{m=1}^n X_m}{\sqrt{2n\ln(\ln n)}}\)


Solution

(a) Divergence of the second moment


Claim.

$E(X^2)=\infty$.

Proof.

Apply the tail-integral formula for the second moment to the nonnegative variable $|X|$:
\(E(X^2)=E(|X|^2)=\int_0^\infty 2x\,P(|X|>x)\,dx.\)

For $x\ge e$ we use the given tail: \(P(|X|>x)=\frac{1}{x^2\ln x}.\)

Restricting the integral to $x\ge e$ gives the lower bound \(E(X^2)\;\ge\;\int_e^\infty 2x \cdot \frac{1}{x^2\ln x}\,dx = 2\int_e^\infty \frac{dx}{x\ln x}.\)

Since \(\int_e^T \frac{dx}{x\ln x} = \ln(\ln T) \to \infty\) as $T\to\infty$, the integral on the right diverges.
Therefore $E(X^2)=\infty$.

Conclusion.

$X$ has infinite variance; the truncation at $\sqrt{n}$ in later parts is essential.
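
As a numerical sanity check on the divergence, the sketch below integrates $2x\,P(|X|>x)$ up to a finite cutoff $T$ and compares the result with the closed form $e^2 + 2\ln(\ln T)$. The function names `tail`, `partial_second_moment`, and `closed_form` are illustrative choices rather than part of the exam solution, and the midpoint rule is only one convenient quadrature.

```python
import math

def tail(x):
    """P(|X| > x) for the distribution in the problem statement."""
    return 1.0 if x < math.e else 1.0 / (x * x * math.log(x))

def partial_second_moment(T, steps=200_000):
    """Midpoint-rule estimate of the integral of 2x * P(|X| > x) over [0, T], T >= e.

    The piece over [0, e] equals e^2 exactly. The piece over [e, T] is
    integrated in the variable t = ln x so the grid stays well conditioned.
    """
    if T < math.e:
        raise ValueError("T must be at least e")
    h = (math.log(T) - 1.0) / steps
    total = 0.0
    for k in range(steps):
        t = 1.0 + (k + 0.5) * h
        x = math.exp(t)
        # dx = x dt, so 2x * tail(x) dx becomes 2 x^2 * tail(x) dt
        total += 2.0 * x * x * tail(x) * h
    return math.e ** 2 + total

def closed_form(T):
    """e^2 + 2 ln(ln T), the exact value of the same integral for T >= e."""
    return math.e ** 2 + 2.0 * math.log(math.log(T))
```

Comparing `partial_second_moment(T)` with `closed_form(T)` for increasing `T` should show close agreement, with both values growing without bound but only at the rate of an iterated logarithm.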


(b)(i) Probability that truncation changes $X_m$

Claim.

\(\sum_{m=1}^n P(Y_{n,m}\neq X_m) = \sum_{m=1}^n P(|X|>\sqrt{n}) \longrightarrow 0.\)

Proof.

The event $Y_{n,m}\neq X_m$ occurs exactly when
\(|X_m|>\sqrt{n}.\)

Since the $X_m$ are identically distributed, \(\sum_{m=1}^n P(Y_{n,m}\neq X_m) = n\,P(|X|>\sqrt{n}).\)

For $\sqrt{n}\ge e$, \(P(|X|>\sqrt{n}) = \frac{1}{n\,\ln(\sqrt{n})} = \frac{2}{n\ln n}.\)

Hence \(n\cdot \frac{2}{n\ln n} = \frac{2}{\ln n} \to 0.\)

Conclusion.

The probability that truncation changes any coordinate goes to zero.
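
The rate $2/\ln n$ is slow, which a few evaluations make concrete. The following is a minimal sketch; `tail` and `expected_truncations` are hypothetical helper names introduced here for illustration.

```python
import math

def tail(x):
    """P(|X| > x) for the distribution in the problem statement."""
    return 1.0 if x < math.e else 1.0 / (x * x * math.log(x))

def expected_truncations(n):
    """Sum over m of P(Y_{n,m} != X_m), which equals n * P(|X| > sqrt(n))."""
    return n * tail(math.sqrt(n))

# Compare the direct evaluation with the closed form 2 / ln n.
for n in (10**2, 10**4, 10**8, 10**16):
    print(n, expected_truncations(n), 2.0 / math.log(n))
```

The two printed columns should agree up to rounding, and the values decrease only logarithmically in $n$.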


(b)(ii) Asymptotic variance of the truncated variable

Claim.

\(E(Y_{n,m}^2) \sim 2\ln(\ln n).\)

Proof.

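We have $Y_{n,m}=X_m$ on $\{|X_m|\le\sqrt{n}\}$ and $Y_{n,m}=0$ otherwise. For any $a>0$, \(\int_0^{a} 2x\,P(|X|>x)\,dx = E\big(\min(|X|,a)^2\big) = E\big(X^2\,\mathbf{1}\{|X|\le a\}\big) + a^2\,P(|X|>a),\) so with $a=\sqrt{n}$, \(E(Y_{n,m}^2) = \int_0^{\sqrt{n}} 2x\,P(|X|>x)\,dx \;-\; n\,P(|X|>\sqrt{n}).\) By (b)(i), $n\,P(|X|>\sqrt{n}) = 2/\ln n \to 0$, so the boundary term is negligible and it remains to evaluate the integral.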

Split the integral at $e$:

  1. The part $\int_0^e 2x\cdot 1\,dx = e^2$ is a constant and does not affect the asymptotics.

  2. For $x\ge e$, \(2x\,P(|X|>x) = 2x\cdot \frac{1}{x^2\ln x} = \frac{2}{x\ln x}.\)

Thus the dominant term is \(\int_e^{\sqrt{n}} \frac{2}{x\ln x}\,dx = 2\ln(\ln(\sqrt{n})) - 2\ln(\ln e) = 2\ln\!\big(\tfrac12\ln n\big) = 2\ln(\ln n) - 2\ln 2.\)

The remaining contributions are $O(1)$, while $2\ln(\ln n)\to\infty$, so the ratio of $E(Y_{n,m}^2)$ to $2\ln(\ln n)$ tends to $1$.

Conclusion.

\(E(Y_{n,m}^2) \sim 2\ln(\ln n).\)
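
The asymptotic equivalence hides sizeable lower-order terms. The sketch below evaluates the exact truncated second moment, including the constant $e^2 - 2\ln 2$ and the boundary term $-n\,P(|X|>\sqrt{n}) = -2/\ln n$ that arises because $Y_{n,m}$ vanishes above the truncation level, and compares it with the leading term. The function names are illustrative assumptions, not notation from the exam.

```python
import math

def truncated_second_moment(n):
    """Exact E[X^2 ; |X| <= sqrt(n)] for n >= e^2.

    Uses E[X^2 ; |X| <= a] = integral_0^a 2x P(|X| > x) dx - a^2 P(|X| > a)
    with a = sqrt(n), which evaluates to
    e^2 + 2 ln(ln n) - 2 ln 2 - 2 / ln n.
    """
    if n < math.e ** 2:
        raise ValueError("formula assumes sqrt(n) >= e")
    log_n = math.log(n)
    return math.e ** 2 + 2.0 * math.log(log_n) - 2.0 * math.log(2.0) - 2.0 / log_n

def ratio_to_leading_term(n):
    """Ratio of the exact second moment to its asymptotic equivalent 2 ln(ln n)."""
    return truncated_second_moment(n) / (2.0 * math.log(math.log(n)))
```

Evaluating `ratio_to_leading_term` at values such as `1e3`, `1e6`, and `1e100` shows the ratio decreasing toward $1$, but very slowly, since the error is of order $1/\ln(\ln n)$.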


(c)(i) Limit distribution of the truncated sum

Claim.

\(\frac{\sum_{m=1}^n Y_{n,m}}{\sqrt{2n\ln(\ln n)}} \;\xrightarrow{d}\; N(0,1).\)

Proof.

For each fixed $n$, the variables $Y_{n,1},\dots,Y_{n,n}$ are i.i.d., so $(Y_{n,m})_{1\le m\le n}$ is a triangular array with independent rows.

We verify the hypotheses of the Lindeberg–Feller CLT.

  • Mean zero: $Y_{n,m}$ is bounded and inherits the symmetry of $X_m$, so $E(Y_{n,m})=0$.

  • Variance: \(s_n^2 = \sum_{m=1}^n E(Y_{n,m}^2) \sim n\cdot 2\ln(\ln n).\)

  • Lindeberg condition:
    For any $\varepsilon>0$, consider \(L_n(\varepsilon) = \frac{1}{s_n^2}\sum_{m=1}^n E\!\left[ Y_{n,m}^2\;\mathbf{1}\{|Y_{n,m}|>\varepsilon s_n\}\right].\)
    Since $|Y_{n,m}|\le \sqrt{n}$ and $s_n\sim \sqrt{2n\ln(\ln n)}$, we have $\sqrt{n}/s_n \sim 1/\sqrt{2\ln(\ln n)}\to 0$.
    Hence for any fixed $\varepsilon>0$ and all sufficiently large $n$, \(|Y_{n,m}| \le \sqrt{n} < \varepsilon s_n,\) so every indicator vanishes and $L_n(\varepsilon)=0$.

Thus Lindeberg–Feller applies: \(\frac{\sum_{m=1}^n Y_{n,m}}{s_n} \;\xrightarrow{d}\; N(0,1).\)

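Since
\(s_n = \sqrt{n\,E(Y_{n,m}^2)}\sim \sqrt{2n\ln(\ln n)},\) the ratio $s_n/\sqrt{2n\ln(\ln n)}$ tends to $1$, and Slutsky's theorem lets us replace $s_n$ by $\sqrt{2n\ln(\ln n)}$: \(\frac{\sum_{m=1}^n Y_{n,m}}{\sqrt{2n\ln(\ln n)}}\xrightarrow{d}N(0,1).\)
The factor $n$ under the root cannot be dropped: the sum has variance $n\,E(Y_{n,m}^2)$, so dividing by $\sqrt{2\ln(\ln n)}$ alone would leave a variance of order $n$.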

Conclusion.

The truncated normalized sum converges to standard normal.
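
To see how large $n$ must be before the Lindeberg indicators vanish for a given $\varepsilon$, one can evaluate $\sqrt{n}/s_n$ directly. This sketch assumes the exact second moment $E(Y_{n,m}^2) = e^2 + 2\ln(\ln n) - 2\ln 2 - 2/\ln n$, obtained from the tail integral with its boundary term; the helper names are hypothetical.

```python
import math

def truncated_second_moment(n):
    """Exact E(Y_{n,m}^2) = e^2 + 2 ln(ln n) - 2 ln 2 - 2 / ln n, valid for n >= e^2."""
    log_n = math.log(n)
    return math.e ** 2 + 2.0 * math.log(log_n) - 2.0 * math.log(2.0) - 2.0 / log_n

def bound_over_scale(n):
    """sqrt(n) / s_n with s_n^2 = n * E(Y_{n,m}^2), which equals 1 / sqrt(E(Y_{n,m}^2))."""
    return 1.0 / math.sqrt(truncated_second_moment(n))

def lindeberg_term_vanishes(n, eps):
    """True when sqrt(n) < eps * s_n, so every Lindeberg indicator is zero."""
    return bound_over_scale(n) < eps
```

For instance, `lindeberg_term_vanishes(10**6, 0.5)` holds, while for `eps = 0.1` the condition requires $E(Y_{n,m}^2) > 100$, which is reached only for astronomically large $n$. The argument is asymptotic, and the sketch is meant only to illustrate that.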


(c)(ii) Convergence of the full sum

Claim.

\(\frac{\sum_{m=1}^n X_m}{\sqrt{2n\ln(\ln n)}} \;\xrightarrow{d}\; N(0,1).\)

Proof.

Let
\(D_n = \sum_{m=1}^n (X_m - Y_{n,m}).\)

If $D_n\neq 0$, then $X_m\neq Y_{n,m}$ for at least one $m\le n$. By the union bound and part (b)(i),
\(P(D_n\neq 0) \le \sum_{m=1}^n P(X_m\neq Y_{n,m}) = n\,P(|X|>\sqrt{n}) = \frac{2}{\ln n} \longrightarrow 0.\)

Hence $D_n/b_n \xrightarrow{P} 0$ for any normalizing sequence $b_n>0$, because $P(|D_n/b_n|>\delta)\le P(D_n\neq 0)$ for every $\delta>0$. In particular this holds for $b_n=\sqrt{2n\ln(\ln n)}$.

From (c)(i),
\(\frac{\sum_{m=1}^n Y_{n,m}}{\sqrt{2n\ln(\ln n)}}\xrightarrow{d}N(0,1).\)

Apply Slutsky’s theorem: \(\frac{\sum_{m=1}^n X_m}{\sqrt{2n\ln(\ln n)}} = \frac{\sum_{m=1}^n Y_{n,m}}{\sqrt{2n\ln(\ln n)}} + \frac{D_n}{\sqrt{2n\ln(\ln n)}} \;\xrightarrow{d}\; N(0,1).\)

Conclusion.

The full untruncated sum has the same limit distribution as the truncated one.
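
The claim that truncation rarely alters the sum can be explored by simulation. The sketch below draws $X$ by inverse-transform sampling and estimates how often at least one coordinate exceeds $\sqrt{n}$. Everything here is an illustrative assumption: the helper names, the bisection solver, and the seeds are not part of the exam solution.

```python
import math
import random

INV_E2 = math.exp(-2.0)  # P(|X| > e) = 1 / e^2

def tail(x):
    """P(|X| > x) for the distribution in the problem statement."""
    return 1.0 if x < math.e else 1.0 / (x * x * math.log(x))

def abs_x_from_uniform(u):
    """Map u in (0, 1] to |X| so that P(|X| > x) = P(u < tail(x)).

    For u >= 1/e^2 the value is the atom at e. Otherwise solve
    x^2 ln x = 1/u; with t = ln x this reads 2t + ln t = -ln u,
    which is increasing in t and has its root in [1, -ln(u)/2].
    """
    if u >= INV_E2:
        return math.e
    target = -math.log(u)
    lo, hi = 1.0, target / 2.0
    for _ in range(60):  # plain bisection on t
        mid = 0.5 * (lo + hi)
        if 2.0 * mid + math.log(mid) < target:
            lo = mid
        else:
            hi = mid
    return math.exp(0.5 * (lo + hi))

def sample_x(rng):
    """Draw one symmetric X: inverse-transform magnitude times a fair random sign."""
    magnitude = abs_x_from_uniform(1.0 - rng.random())  # 1 - U lies in (0, 1]
    return magnitude if rng.random() < 0.5 else -magnitude

def fraction_of_runs_changed(n, reps, seed=0):
    """Fraction of replications in which some |X_m| exceeds sqrt(n)."""
    rng = random.Random(seed)
    level = math.sqrt(n)
    changed = 0
    for _ in range(reps):
        if any(abs(sample_x(rng)) > level for _ in range(n)):
            changed += 1
    return changed / reps

def exact_probability_some_truncated(n):
    """1 - (1 - p)^n with p = P(|X| > sqrt(n)); an upper bound on P(D_n != 0)."""
    p = tail(math.sqrt(n))
    return 1.0 - (1.0 - p) ** n
```

A call such as `fraction_of_runs_changed(100, 1000)` can be compared with `exact_probability_some_truncated(100)` and with the union bound $2/\ln n$. For moderate $n$ these probabilities are far from zero, which reflects the logarithmic rate in (b)(i) rather than any failure of the argument.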


Key Takeaways

  • Infinite variance variables can still produce CLT-type results if one truncates at a growing level; the variance behaves like
    \(\operatorname{Var}(Y_{n,m})\sim 2\ln(\ln n).\)

  • Tail integration technique: For heavy tails, \(E(X^2)=2\int x\,P(|X|>x)\,dx\) is often easier than integrating the density.

  • When Lindeberg holds automatically:
    If $|Y_{n,m}|\le a_n$ and $a_n/s_n\to 0$, then the Lindeberg indicator is eventually always zero.

  • Truncation + union bound + Slutsky:
    • Truncate heavy tails to obtain a workable variance,
    • Show truncation errors vanish in probability,
    • Apply Slutsky to transfer the limit distribution from the truncated sum to the full sum.
  • Heavy-tail behavior:
    The integral $\int dx /(x\ln x)$ diverges like $\ln(\ln x)$, which is why $\ln(\ln n)$ naturally appears.
