STT 996 – High Dimensional Probability

Lecture 07 (Reconstructed + Expanded)


§4 Subgaussian Random Vectors

Definition (Subgaussian Vector)

A random vector $X \in \mathbb{R}^n$ is subgaussian if for all $x \in S^{n-1}$, the 1-dimensional projection \(\langle X, x \rangle\) is a subgaussian random variable.


Norm Definition

The subgaussian norm of a vector is: \(\|X\|_{\psi_2} := \sup_{x \in S^{n-1}} \|\langle X, x \rangle\|_{\psi_2}\)


Recall (From Chapter 2)

Let ${X_k}_{k=1}^n$ be independent, centered subgaussian random variables: \(\mathbb{E}[X_k] = 0\)

Then: \(\left\|\sum_{k=1}^n a_k X_k \right\|_{\psi_2}^2 \le C \sum_{k=1}^n a_k^2 \|X_k\|_{\psi_2}^2\)


Result (Key Inequality)

Let $X = (X_1, \dots, X_n)$ with independent, centered subgaussian entries.

Then: \(\|X\|_{\psi_2} \le C \cdot \max_{1 \le k \le n} \|X_k\|_{\psi_2}\)


Proof Sketch (Reconstructed)

For $x \in S^{n-1}$: \(\langle X, x \rangle = \sum_{k=1}^n x_k X_k\)

Then: \(\|\langle X, x \rangle\|_{\psi_2}^2 \le C \sum_{k=1}^n x_k^2 \|X_k\|_{\psi_2}^2\)

Since: \(\sum_{k=1}^n x_k^2 = 1\)

We get: \(\|\langle X, x \rangle\|_{\psi_2}^2 \le C \max_k \|X_k\|_{\psi_2}^2\)

This bound does not depend on $x$, so taking supremum gives result.


Example: Standard Gaussian

Let: \(g_n \sim \mathcal{N}(0, I_n)\)

Then: \(g_n = (Z_1, \dots, Z_n), \quad Z_k \overset{iid}{\sim} \mathcal{N}(0,1)\)

Thus: \(\|g_n\|_{\psi_2} \le C \max_k \|Z_k\|_{\psi_2} \le C\)

(Each $Z_k$ is subgaussian with constant norm.)


General Gaussian Case

Let: \(X \sim \mathcal{N}(0, \Sigma)\)

Then: \(X = \Sigma^{1/2} g_n\)


Estimation

\[\|X\|_{\psi_2} = \sup_{x \in S^{n-1}} \|\langle \Sigma^{1/2} g_n, x \rangle\|_{\psi_2}\]

Rewrite: \(\langle \Sigma^{1/2} g_n, x \rangle = \langle g_n, \Sigma^{1/2} x \rangle\)

Thus: \(\|\langle X, x \rangle\|_{\psi_2} \le C \|\Sigma^{1/2} x\|_2\)

Take supremum: \(\|X\|_{\psi_2} \le C \sup_{x \in S^{n-1}} \|\Sigma^{1/2} x\|_2\)

But: \(\sup_{x \in S^{n-1}} \|\Sigma^{1/2} x\|_2 = \sqrt{\lambda_{\max}(\Sigma)}\)

So: \(\boxed{ \|X\|_{\psi_2} \le C \sqrt{\lambda_{\max}(\Sigma)} }\)


Example: Symmetric Bernoulli

Let: \(X_k \in \{-1, +1\}, \quad P(X_k = \pm 1) = \frac{1}{2}\)

Then $X = (X_1, \dots, X_n)$ is subgaussian.

Reason:

  • Each $X_k$ is subgaussian
  • Independence applies previous theorem

Uniform Distribution on the Sphere

Let: \(X \sim \text{Unif}(S^{n-1})\)

Goal: bound \(P(|\langle X, v \rangle| > t)\)


Key Trick: Gaussian Representation

Let: \(g \sim \mathcal{N}(0, I_n)\)

Then: \(X = \frac{g}{\|g\|_2}\)


Rotational Invariance Trick

Take orthogonal matrix $O$ such that: \(Ov = e_1\)

Then: \(\langle X, v \rangle = \langle OX, Ov \rangle = \langle OX, e_1 \rangle\)

But: \(OX \overset{d}{=} X\)

So: \(\langle X, v \rangle \overset{d}{=} X_1\)


Rewrite Event

\[\langle X, v \rangle = \frac{g_1}{\|g\|_2}\]

Thus: \(P(\langle X, v \rangle > t) = P\left(g_1 > t \|g\|_2 \right)\)


Split Norm

Let: \(\|g\|_2^2 = g_1^2 + \sum_{k=2}^n g_k^2\)

Then: \(g_1 > t \|g\|_2 \Rightarrow g_1^2 > t^2 (g_1^2 + \sum_{k=2}^n g_k^2)\)

Rearrange: \(g_1^2 (1 - t^2) > t^2 \sum_{k=2}^n g_k^2\)


Key Reduction

Let: \(\tilde{g} = (g_2, \dots, g_n)\)

Then: \(g_1^2 > \frac{t^2}{1 - t^2} \|\tilde{g}\|_2^2\)


Independence Trick

$g_1$ is independent of $\tilde{g}$.

So: \(P(\langle X, v \rangle > t) = \mathbb{E}\left[ P\left(g_1 > \frac{t}{\sqrt{1-t^2}} \|\tilde{g}\|_2 \mid \tilde{g} \right) \right]\)


Subgaussian Bound

Using Gaussian tail: \(P(g_1 > s) \le e^{-s^2/2}\)

So: \(\le \mathbb{E}\left[ \exp\left(-\frac{t^2}{2(1-t^2)} \|\tilde{g}\|_2^2 \right) \right]\)


MGF of Chi-Square

\[\|\tilde{g}\|_2^2 \sim \chi^2(n-1)\]

MGF: \(\mathbb{E}[e^{-s \|\tilde{g}\|_2^2}] = (1 + 2s)^{-(n-1)/2}\)

Plug in: \(s = \frac{t^2}{2(1-t^2)}\)

Gives: \(= (1 - t^2)^{(n-1)/2}\)


Final Bound

Using: \(1 - x \le e^{-x}\)

We get: \(P(|\langle X, v \rangle| > t) \le 2 e^{-c n t^2}\)


Key Conclusion

Uniform sphere is subgaussian with variance proxy $\sim 1/n$: \(\|\langle X, v \rangle\|_{\psi_2} \le \frac{C}{\sqrt{n}}\)


Kernel Trick (Preview)

Kernel Definition

\(K: \mathcal{X} \times \mathcal{X} \to \mathbb{R}\)

Examples:

  • Gaussian kernel: \(K(u,v) = e^{-\|u - v\|^2}\)

  • Polynomial kernel: \(K(u,v) = (\langle u, v \rangle + c)^k\)


Kernel Trick Idea

Find feature map: \(\Phi: \mathcal{X} \to \mathcal{H}\)

such that: \(K(u,v) = \langle \Phi(u), \Phi(v) \rangle_{\mathcal{H}}\)


Validity Condition

Kernel is valid iff: \([K(u_i, u_j)]_{i,j=1}^n \succeq 0\)


RKHS Note

We embed into a Hilbert space:

  • Reproducing Kernel Hilbert Space (RKHS)
  • Avoids explicit feature construction

Grothendieck Identity (Very Important)

Let: \(g \sim \mathcal{N}(0, I_n)\)

Then: \(\mathbb{E}\left[\operatorname{sign}(\langle g,u\rangle)\operatorname{sign}(\langle g,v\rangle)\right] = \frac{2}{\pi} \arcsin(\langle u,v \rangle)\)


Why This Works

Use rotational invariance:

WLOG: \(u = (1,0,\dots), \quad v = (v_1, v_2, 0,\dots)\)

Then: \(\langle g,u \rangle = g_1\)

\[\langle g,v \rangle = v_1 g_1 + v_2 g_2\]

This reduces to a 2D Gaussian problem.


Key Insight

The expectation depends only on: \(\langle u,v \rangle = \cos(\theta)\)

So geometry fully determines behavior.


Big Picture Takeaways

  • Subgaussian vectors are controlled via 1D projections
  • Independence → clean norm bounds
  • Gaussian + orthogonal invariance = powerful reduction tool
  • Uniform sphere ≈ Gaussian normalized
  • Kernel trick = geometry in disguise
  • Grothendieck identity connects probability ↔ geometry

Comments