## Notes of Michel Ledoux’s course

Talagrand’s inequality and hypercontractivity

1. Talagrand’s inequality

Theorem 1 (Talagrand) Let ${f:\{\pm 1\}^N\rightarrow{\mathbb R}}$. Let ${\mu}$ be the uniform measure on ${\{\pm 1\}^N}$.

$\displaystyle \begin{array}{rcl} Var_{\mu}(f)\leq C\,\sum_{i=1}^{N}\frac{|D_i f|_2^2}{1+\log(\frac{|D_i f|_2}{|D_i f|_1})}. \end{array}$

This improves on the KKL theorem. Indeed, it implies

Theorem 2 (Kahn, Kalai, Linial) Let ${f:\{\pm 1\}^N\rightarrow\{\pm 1\}}$. Then there exists ${i\in[N]}$ such that

$\displaystyle \begin{array}{rcl} Inf_i (f)\geq\frac{1}{8C}\mu(A)(1-\mu(A))\frac{\log N}{N}. \end{array}$

Indeed, for any ${p}$,

$\displaystyle \begin{array}{rcl} Inf_i (f)=\frac{1}{2}|D_i f|_p^p . \end{array}$

Today, I will prove Talagrand’s inequality in a slightly extended context.

2. Markov chains

A Markov kernel on a finite set ${X}$ is a function ${K:X\times X\rightarrow{\mathbb R}_+}$ such that ${\sum_{y\in X}K(x,y)=1}$ for all ${x}$. Say ${K}$ is reversible with respect to invariant probability measure ${\mu}$ if

$\displaystyle \begin{array}{rcl} \sum_{x\in X}K(x,y)\mu(x)=\mu(y), \end{array}$

and

$\displaystyle \begin{array}{rcl} K(x,y)\mu(x)=K(y,x)\mu(y). \end{array}$

The corresponding Dirichlet form is ${\mathcal{E}(f,g)=\frac{1}{2}\sum_{x,y\in X}(f(x)-f(y))(g(x)-g(y))}$.

The spectral gap is the best constant ${\lambda}$ such that ${\lambda Var_{\mu}(f)\leq \mathcal{E}(f,f)}$.

The logarithmic Sobolev constant is the best constant ${\rho}$ in

$\displaystyle \begin{array}{rcl} \rho Ent_{\mu}(f^2)\leq 2 \mathcal{E}(f,f). \end{array}$

Here,

$\displaystyle \begin{array}{rcl} Ent_{\mu}(f^2)=\int f\log f \,d\mu -(\int f\,d\mu) \log(\int f\,d\mu). \end{array}$

It is stronger than spectral gap, ${\rho\leq\lambda}$ (with our choice of normalization).

Spectral gap is equivalent to exponential decay under heat flow.

Logarithmic Sobolev inequality is equivalent to hypercontractivity of heat flow,

$\displaystyle \begin{array}{rcl} \|P_t f\|_2 \leq \|f\|_p \end{array}$

with ${p=1+e^{-2\rho t}<2}$.

On a product space ${X_1 \times X_2}$, there is a product kernel, and

$\displaystyle \begin{array}{rcl} \lambda =\min(\lambda_1 ,\lambda_2),\quad \rho=\min(\rho_1 ,\rho_2). \end{array}$

Example 1 Let ${K(x,y)=\mu(y)}$.

In other words, ${KF=\mathop{\mathbb E}_{\mu}(f)}$, ${\mathcal{E}(f,f)=Var_{\mu}(f)}$, ${\lambda=1}$.

Example 2 Product of ${N}$ copies of previous example.

Then

$\displaystyle \begin{array}{rcl} \mathcal{E}(f,f)=\sum_{i=1}^{N}Var_{\mu_i}(f)=\sum_{i=1}^{N}|\int f d\mu_i -f|_2^2. \end{array}$

For this example, Talagrand’s inequality states that

$\displaystyle \begin{array}{rcl} Var_{\bigotimes \mu_i}(f)\leq\frac{4}{\rho}\sum_{i=1}^{N}\frac{Var_{\mu_i}(f)}{1+\log(\frac{Var_{\mu_i}(f)}{\|\int f d\mu_i -f\|_1})}. \end{array}$

On the biased cube, i.e. ${X_i =\{\pm 1\}}$, ${\mu=p\delta_1 +q\delta_{-1}}$, ${\lambda=1}$, ${\rho=\frac{2(p-q)}{\log p -\log q}}$ (Bonami-Beckner inequality).

2.1. Proof of Talagrand’s inequality for products

Now I give the proof.

Assume ${\mathop{\mathbb E}_{\mu}(f)=0}$. Write

$\displaystyle \begin{array}{rcl} Var_{\mu}(f)&=&-\int_{0}^{\infty}\frac{d}{dt}(\int(P_t f)^2 \,d\mu)\,dt\\ &=& 2\int_{0}^{\infty}E(P_t f ,P_t f)\,dt\\ &=&2\sum_{i=1}^{N}\int_{0}^{\infty}(\int(L_i P_t f)^2 \,d\mu)\,dt. \end{array}$

We treat separately small and large values of ${t}$. Write

$\displaystyle \begin{array}{rcl} \|f\|^2 =\|f\|^2 -\|P_T f\|^2 +\|P_T f\|^2 . \end{array}$

Since

$\displaystyle \begin{array}{rcl} \|P_T f\|^2 \leq e^{-2T}\|f\|^2 , \end{array}$

for ${T\geq \frac{1}{2}}$,

$\displaystyle \begin{array}{rcl} Var_{\mu}(f)\leq 4\sum_{i=1}^{N}\int_{0}^{T}(\int(L_i P_t f)^2 \,d\mu)\,dt. \end{array}$

Commutation: ${P_t}$ commutes with the ${L_i}$‘s, so

$\displaystyle \begin{array}{rcl} \| L_i P_t f \|_2^2 &=& \| P_t L_i f \|_2^2 \\ &\leq& \| L_i f \|_p^2 \\ &\leq& \| L_i f \|_1^{2\theta}\| L_i f \|_2^{2(1-\theta)} \\ \end{array}$

by hypercontractivity and Hölder, ${\theta=\theta(t)}$ given by

$\displaystyle \begin{array}{rcl} \frac{1}{p}=\frac{\theta}{1}+\frac{1-\theta}{2}. \end{array}$

Finally,

$\displaystyle \begin{array}{rcl} \|f\|_2^2 \leq 4\sum_{i=1}^{N}\| L_i f \|_2^2 \int_{0}^{T}(\frac{\| L_i f \|_1}{\| L_i f \|_2})^{2\theta(t)}\,dt, \end{array}$

gives Talagrand’s inequality.

The main ingredients of the proof were

1. Expression of energy as a sum of ${N}$ terms.
2. Commutation.
3. Global hypercontractivity.

This carries over to certain non product situations.

2.2. The symmetric group

Let ${X=\mathfrak{S}_n}$ be the symmetric group, equipped with the kernel corresponding to the random walk associated to the generating set consisting of all transpositions, i.e.

$\displaystyle \begin{array}{rcl} Kf(x)=\sum_{\tau \,\mathrm{transposition}}\frac{2}{n(n-1)}f(\tau x), \end{array}$

$\displaystyle \begin{array}{rcl} \mathcal{E}(f,f)=\frac{1}{n(n-1)}\sum_{\tau \,\mathrm{transposition}}\|D_{\tau}f\|_2^2 . \end{array}$

Commutation holds: ${KD_{\tau}=D_{\tau}K}$.

The spectral gap ${\lambda=\frac{2}{n-1}}$ is due to Diaconis-Shahshahani in 1981.

Hypercontractivity ${\rho\sim\frac{1}{n\log n}}$ was proven by Diaconis and Saloff-Coste (1996) and Lee-Yau (1998).

The above proof automatically gives the following Talagrand inequality.

$\displaystyle \begin{array}{rcl} Var_{\mu}(f)\leq C\frac{\log n}{n}\sum_{\tau}\frac{\|D_{\tau}f\|_2}{1+\log(\frac{\|D_{\tau}f\|_2}{\|D_{\tau}f\|_1})}. \end{array}$

In terms of influences of Boolean functions, there exists a transposition ${\tau}$ such that

$\displaystyle \begin{array}{rcl} Inf_{\tau}(A)\geq\frac{1}{Cn}\mu(A)(1-\mu(A)). \end{array}$

It is not that exciting since it follows from spectral gap. For Talagrand’s inequality to improve on spectral gap, ${\rho}$ must be substantially larger than ${\lambda}$, i.e.

$\displaystyle \begin{array}{rcl} \rho\log\frac{1}{\rho}\gg \lambda. \end{array}$

3. Continuous setting

3.1. Gaussian case

From now on, ${\mu}$ is the Gaussian measure on ${{\mathbb R}^N}$. Then Talagrand’s inequality reads

$\displaystyle \begin{array}{rcl} Var_{\mu}(f)\leq C\sum_{i=1}^{N}\frac{\|\partial_{i}f\|_2}{1+\log(\frac{\|\partial_{i}f\|_2}{\|\partial_{i}f\|_1})}. \end{array}$

The proof goes along similar lines. ${P_t}$ is replaced by the Ornstein-Uhlenbeck semigroup

$\displaystyle \begin{array}{rcl} P_t f(x)=\int_{{\mathbb R}^N}f(e^{-t}x+(1-e^{-2t})^{1/2}y)\,d\mu(y). \end{array}$

It commutes with partial derivatives. Hypercontractivity holds with ${\rho=1}$.

3.2. Extensions

On ${{\mathbb R}^N ={\mathbb R}^{n_1}\times\cdots\times{\mathbb R}^{n_{\ell}}}$, use measure ${e^{-V_i}\,dx}$. Assume a lower bound on the Hessian ${V''\geq-K}$ (the correct formulation involves the Ricci curvature of the semigroup). Then commutation holds up to a factor ${e^{Kt}}$:

$\displaystyle \begin{array}{rcl} |\nabla P_t f|\leq e^{Kt}P_t(|\nabla f|). \end{array}$

And one gets a Talagrand inequality.

3.3. The round sphere

Let ${D_{ij}f=x_i \partial_j -x_j \partial_i}$. Write

$\displaystyle \begin{array}{rcl} \mathcal{E}(f,f)=\sum_{i,i}\int(D_{ij}f)^2 \,d\mu. \end{array}$

Commutation ${\Delta D_{ij}=D_{ij}\Delta}$ holds. ${\lambda=\rho=n-1}$. This gives a Talagrand inequality

$\displaystyle \begin{array}{rcl} Var_{\mu}(f)\leq C\sum_{i,\,j=1}^{N}\frac{\|D_{ij}f\|_2}{1+\log(\frac{\|D_{ij}f\|_2}{\|D_{ij}f\|_1})}. \end{array}$

4. ${L_1}$ Talagrand inequality

4.1. ${L_1}$ influences

Apply Talagrand’s inequality to characteristic functions of sets. The ${L_1}$-influence makes sense (Keller, Mossel, Sen 2010).

In the Gaussian case, it has the following geometric meaning. Given ${A\in {\mathbb R}^N}$, let

$\displaystyle \begin{array}{rcl} A_i^x =\{y\in{\mathbb R}\,;\,(x_1 ,\ldots,x_{i-1},y,x_{i+1},\ldots,x_n)\in A\}. \end{array}$

Then define

$\displaystyle \begin{array}{rcl} \mu_i^+ (A_i^x)=\liminf_{r\rightarrow 0}\frac{1}{r}(\mu(A_i^x +[-r,r])-\mu(A_i^x)). \end{array}$

Then

$\displaystyle \begin{array}{rcl} Inf_i (A):=\|\partial_{i}(1_A )\|_1 =\int\mu_i^+ (A_i^x)\,d\mu(x). \end{array}$

Theorem 3 (Keller, Mossel, Sen 2010) There exists ${i\in[n]}$ such that

$\displaystyle \begin{array}{rcl} Inf_i (A)\geq c\,\mu(A)(1-\mu(A))\frac{\sqrt{\log N}}{N}. \end{array}$

This is optimal.

4.2. Proof of the ${L_1}$-influence bound

I show that a slight modification of above proof gives Keller, Mossel and Sen’s result.

Change ${t}$ into ${2t}$:

$\displaystyle \begin{array}{rcl} Var_{\mu}(f)=4\sum_{i=1}^{N}\int_{0}^{\infty}(\int(\partial_i P_{2t}f)^2 \,d\mu)\,dt. \end{array}$

Use spectral gap,

$\displaystyle \begin{array}{rcl} \|\partial_i P_{2t}f\|_2^2 &\leq& e^{-2t} \|P_t (\partial_i P_{t}f)\|_2^2 \\ &\leq& e^{-2t} \|\partial_i P_{t}f\|_p^2 , \end{array}$

by hypercontractivity.

Trick: If ${\|f\|_{\infty}\leq 1}$, ${P_t f}$ is Lipschitz,

$\displaystyle \begin{array}{rcl} \|\partial_i P_{t}f\|_{\infty}\leq\frac{1}{\sqrt{t}}. \end{array}$

This easily follows from the integral representation of the Ornstein-Uhlenbeck semigroup. Estimate

$\displaystyle \begin{array}{rcl} \|P_t (\partial_i P_{t}f)\|_p \leq \end{array}$

Get

$\displaystyle \begin{array}{rcl} \|\partial_i P_{2t}f\|_2^2 \leq e^{-2t}(\frac{1}{\sqrt{t}})^{\frac{2(p-1)}{p}}\|\partial_i P_{t}f\|_{1}^{2/p}. \end{array}$

Talgrand’s inequality for ${L^1}$ norms reads

$\displaystyle \begin{array}{rcl} Var_{\mu}(f)\leq C\,\sum_{i=1}^{N}\frac{\|\partial_{i}f\|_1 (1+\|\partial_{i}f\|)}{\sqrt{1 +\log^+ (\frac{1}{\|\partial_{i}f\|_1}}}. \end{array}$

From there, the application to influences is immediate.

4.3. Connections with isoperimetry

On ${{\mathbb R}^N ={\mathbb R}^{n_1}\times\cdots\times{\mathbb R}^{n_{\ell}}}$, use Gaussian measure again. The ${L_1}$ Talagrand inequality reads

$\displaystyle \begin{array}{rcl} Var_{\mu}(f)\leq C\,\sum_{i=1}^{N}\frac{\|\nabla_{i}f\|_1 (1+\|\nabla_{i}f\|)}{\sqrt{1 +\log^+ (\frac{1}{\|\nabla_{i}f\|_1})}}. \end{array}$

In terms of influences, there exists ${i\in[\ell]}$ such that

$\displaystyle \begin{array}{rcl} \|\nabla_i (1_A)\|_1 \geq \frac{c}{\ell}\,\mu(A)(1-\mu(A))\sqrt{\log(\frac{\ell}{C\,\mu(A)(1-\mu(A))})}. \end{array}$

In case one only one factor, this says that

$\displaystyle \begin{array}{rcl} \|\nabla(1_A)\|_1 \geq c\,\mu(A)(1-\mu(A))\sqrt{\log(\frac{1}{C\,\mu(A)(1-\mu(A))})}. \end{array}$

We recognize a quantitative form of the Gaussian isoperimetric inequality

$\displaystyle \begin{array}{rcl} \mu^+ (A)\geq \phi\circ\Phi^{-1}(\mu(A)). \end{array}$

Note that

$\displaystyle \begin{array}{rcl} \phi\circ\Phi^{-1}(t)\sim t\sqrt{2\log\frac{1}{t}} \end{array}$

as ${t}$ tends to ${0}$. So we merely lost a constant factor.