High Dimensional Probability — Lecture 01
Monday, January 12, 2026
Course structure
- Textbook: High Dimensional Probability
- Chapter 1: Review of Probability
- Chapter 2: Preparation of material from one dimension ↓
- Chapter 3: Random vectors
Concentration Inequalities
Hoeffding Inequality
Let {X_k}_{k=1}^N be i.i.d. symmetric Bernoulli:
- P(X_k = ±1) = 1/2
Let a_1, …, a_N ∈ ℝ.
\[\mathbb{P}\left( \sum_{k=1}^N a_k X_k > t \right) \le \exp\left( - \frac{t^2}{\|a\|_2^2} \right)\]where \(\|a\|_2 = \left( \sum_{k=1}^N a_k^2 \right)^{1/2}\) (distance from vector 0)
This form will become a big deal.
Example: Z ~ N(0,1)
\[\mathbb{P}(Z > t) = \frac{e^{-t^2/2}}{\sqrt{2\pi}}, \quad t > 0\]Sub-Gaussian — tail smaller than Gaussian tail
Let \(S_N = \sum_{k=1}^N a_k X_k\)
Then \(\mathrm{SD}(S_N) = \|a\|_2\)
\[\mathrm{Var}(S_N) = \sum_{k=1}^N a_k^2\] \[\mathbb{P}\left( \sum_{k=1}^N a_k X_k > \|a\|_2 t \right) \le e^{-t^2/2}\]This is close to standard normal by CLT.
Thus: \(\mathbb{P}(Z > t) \le e^{-t^2/2}\)
Why Hoeffding if we have CLT?
The difference between the two is $ 1/\sqrt{N} $, which is huge.
\[\mathbb{P}\left( \sum_{k=1}^N a_k X_k > \|a\|_2 t \right) \le e^{-t^2/2}\]versus
\[\mathbb{P}(Z > t) \le e^{-t^2/2}\]Exponential trick
\[\mathbb{P}\left( \sum_{k=1}^N a_k X_k > t \right) = \mathbb{P}\left( e^{\lambda \sum a_k X_k} > e^{\lambda t} \right)\]Use exponentials to make probability small.
Markov Inequality
For any Y ≥ 0: \(\mathbb{P}(Y > t) \le \frac{\mathbb{E}[Y]}{t}, \quad t > 0\)
Apply to $ Y = e^{\lambda \sum a_k X_k} $:
\[\mathbb{P}\left( \sum a_k X_k > t \right) \le e^{-\lambda t} \mathbb{E}\left[ e^{\lambda \sum a_k X_k} \right]\] \[= e^{-\lambda t} \prod_{k=1}^N \mathbb{E}\left[ e^{\lambda a_k X_k} \right]\]Moment generating bound
\[\mathbb{E}[e^{\lambda X}] = \frac{e^\lambda + e^{-\lambda}}{2} \le e^{\lambda^2/2}, \quad \lambda > 0\](Taylor expansion trick)
\[e^x = \sum_{k=0}^\infty \frac{x^k}{k!}\]Odd terms cancel: \(\frac{e^x + e^{-x}}{2} = \sum_{k=0}^\infty \frac{x^{2k}}{(2k)!} \le \sum_{k=0}^\infty \frac{x^{2k}}{2^k k!} = e^{x^2/2}\)
Thus: \(\mathbb{E}[e^{\lambda a_k X_k}] \le e^{\lambda^2 a_k^2 / 2}\)
Combine and minimize
\[\mathbb{P}\left( \sum a_k X_k > t \right) \le e^{-\lambda t} e^{\lambda^2 \|a\|_2^2 / 2}\]Minimize over λ:
\[\frac{d}{d\lambda} \left( -\lambda t + \frac{\lambda^2}{2} \|a\|_2^2 \right) = -t + \lambda \|a\|_2^2 = 0\] \[\lambda = \frac{t}{\|a\|_2^2}\]Plug back in: $$ \exp\left( - \frac{t^2}{|a|_2^2}
- \frac{t^2}{2|a|_2^2} \right) = e^{-t^2/|a|_2^2} $$
Chernoff Inequality
Let {X_k}_{k=1}^N be i.i.d. with $ X_k \sim \text{Bern}(p_k) $.
\[S_N = \sum_{k=1}^N X_k\] \[\mu = \mathbb{E}[S_N] = \sum_{k=1}^N p_k\]For $ t > \mu $:
\[\mathbb{P}(S_N > t) \le e^{-\mu} \left( \frac{e\mu}{t} \right)^t\]Proof sketch
\[\mathbb{P}\left( \sum X_k > t \right) = \mathbb{P}\left( e^{\lambda \sum X_k} > e^{\lambda t} \right)\] \[\le e^{-\lambda t} \prod_{k=1}^N \mathbb{E}[e^{\lambda X_k}]\] \[\mathbb{E}[e^{\lambda X_k}] = e^\lambda p_k + (1 - p_k) = (e^\lambda - 1)p_k + 1 \le \exp\left( (e^\lambda - 1)p_k \right)\]Thus: \(\le e^{-\lambda t + (e^\lambda - 1)\mu}\)
Minimize to get: \(\lambda = \log(t/\mu)\)
Two-sided bound
-
If $ t \ge \mu $: \(\mathbb{P}(S_N > t) \le e^{-\mu} \left( \frac{e\mu}{t} \right)^t\)
-
If $ t < \mu $: \(\mathbb{P}(S_N < t) \le e^{-\mu} \left( \frac{e\mu}{t} \right)^t\)
Example: Poisson
Let $ X \sim \text{Poisson}(\lambda) $.
Represent as: \(X = \sum_{k=1}^N X_k, \quad X_k \sim \text{Bernoulli}(p_k = \lambda/N)\)
Then $ \mu = \lambda $.
\[\mathbb{P}(X \ge t) \le e^{-\lambda} \left( \frac{e\lambda}{t} \right)^t, \quad t > \lambda\] \[\mathbb{P}(X \le t), \quad t < \lambda\]Compare to sub-Gaussian: \(e^{-t^2/\lambda} \quad \text{vs} \quad \exp\left( -t + t\log(t/\lambda) \right)\)
- For $ t \lesssim \lambda $: behaves Gaussian
- For $ t \gg \lambda $: behaves exponential (not sub-Gaussian)
If $ \lambda \to \infty $: \(\frac{X - \lambda}{\sqrt{\lambda}} \Rightarrow N(0,1)\)
Chernoff Setup (normalized)
\[\mathbb{P}(|S_N - \mu| \ge \delta \mu) \le 2 e^{-C \mu \delta^2}\]where $ 0 < \delta < 1 $, and $ C $ is an absolute constant.
Assume $ \mu = 1 $:
\[\{ |S_N - 1| \ge \delta \} = \{ S_N \ge 1 + \delta \} \cup \{ S_N \le 1 - \delta \}\] \[\le \exp\{ -\delta - (1-\delta)\log(1-\delta) \} + \exp\{ \delta - (1+\delta)\log(1+\delta) \}\]Using Taylor series: \(\log(1 - \delta) = -\sum_{n=1}^\infty \frac{\delta^n}{n}\)
For Poisson: \(\mathbb{P}(|X - \lambda| > t) \le 2 e^{-C t^2/\lambda}, \quad t < \lambda\)
Conclusion: This is Gaussian-like behavior, but only up to $ \lambda $.
Comments