Notes of Michel Ledoux’s course

Talagrand’s inequality and hypercontractivity

1. Talagrand’s inequality

Theorem 1 (Talagrand) Let {f:\{\pm 1\}^N\rightarrow{\mathbb R}}. Let {\mu} be the uniform measure on {\{\pm 1\}^N}.

\displaystyle  \begin{array}{rcl}  Var_{\mu}(f)\leq C\,\sum_{i=1}^{N}\frac{|D_i f|_2^2}{1+\log(\frac{|D_i f|_2}{|D_i f|_1})}. \end{array}

This improves on the KKL theorem. Indeed, it implies

Theorem 2 (Kahn, Kalai, Linial) Let {f:\{\pm 1\}^N\rightarrow\{\pm 1\}}. Then there exists {i\in[N]} such that

\displaystyle  \begin{array}{rcl}  Inf_i (f)\geq\frac{1}{8C}\mu(A)(1-\mu(A))\frac{\log N}{N}. \end{array}

Indeed, for any {p},

\displaystyle  \begin{array}{rcl}  Inf_i (f)=\frac{1}{2}|D_i f|_p^p . \end{array}

Today, I will prove Talagrand’s inequality in a slightly extended context.

2. Markov chains

A Markov kernel on a finite set {X} is a function {K:X\times X\rightarrow{\mathbb R}_+} such that {\sum_{y\in X}K(x,y)=1} for all {x}. Say {K} is reversible with respect to invariant probability measure {\mu} if

\displaystyle  \begin{array}{rcl}  \sum_{x\in X}K(x,y)\mu(x)=\mu(y), \end{array}

and

\displaystyle  \begin{array}{rcl}  K(x,y)\mu(x)=K(y,x)\mu(y). \end{array}

The corresponding Dirichlet form is {\mathcal{E}(f,g)=\frac{1}{2}\sum_{x,y\in X}(f(x)-f(y))(g(x)-g(y))}.

The spectral gap is the best constant {\lambda} such that {\lambda Var_{\mu}(f)\leq \mathcal{E}(f,f)}.

The logarithmic Sobolev constant is the best constant {\rho} in

\displaystyle  \begin{array}{rcl}  \rho Ent_{\mu}(f^2)\leq 2 \mathcal{E}(f,f). \end{array}

Here,

\displaystyle  \begin{array}{rcl}  Ent_{\mu}(f^2)=\int f\log f \,d\mu -(\int f\,d\mu) \log(\int f\,d\mu). \end{array}

It is stronger than spectral gap, {\rho\leq\lambda} (with our choice of normalization).

Spectral gap is equivalent to exponential decay under heat flow.

Logarithmic Sobolev inequality is equivalent to hypercontractivity of heat flow,

\displaystyle  \begin{array}{rcl}  \|P_t f\|_2 \leq \|f\|_p \end{array}

with {p=1+e^{-2\rho t}<2}.

On a product space {X_1 \times X_2}, there is a product kernel, and

\displaystyle  \begin{array}{rcl}  \lambda =\min(\lambda_1 ,\lambda_2),\quad \rho=\min(\rho_1 ,\rho_2). \end{array}

Example 1 Let {K(x,y)=\mu(y)}.

In other words, {KF=\mathop{\mathbb E}_{\mu}(f)}, {\mathcal{E}(f,f)=Var_{\mu}(f)}, {\lambda=1}.

Example 2 Product of {N} copies of previous example.

Then

\displaystyle  \begin{array}{rcl}  \mathcal{E}(f,f)=\sum_{i=1}^{N}Var_{\mu_i}(f)=\sum_{i=1}^{N}|\int f d\mu_i -f|_2^2. \end{array}

For this example, Talagrand’s inequality states that

\displaystyle  \begin{array}{rcl}  Var_{\bigotimes \mu_i}(f)\leq\frac{4}{\rho}\sum_{i=1}^{N}\frac{Var_{\mu_i}(f)}{1+\log(\frac{Var_{\mu_i}(f)}{\|\int f d\mu_i -f\|_1})}. \end{array}

On the biased cube, i.e. {X_i =\{\pm 1\}}, {\mu=p\delta_1 +q\delta_{-1}}, {\lambda=1}, {\rho=\frac{2(p-q)}{\log p -\log q}} (Bonami-Beckner inequality).

2.1. Proof of Talagrand’s inequality for products

Now I give the proof.

Assume {\mathop{\mathbb E}_{\mu}(f)=0}. Write

\displaystyle  \begin{array}{rcl}  Var_{\mu}(f)&=&-\int_{0}^{\infty}\frac{d}{dt}(\int(P_t f)^2 \,d\mu)\,dt\\ &=& 2\int_{0}^{\infty}E(P_t f ,P_t f)\,dt\\ &=&2\sum_{i=1}^{N}\int_{0}^{\infty}(\int(L_i P_t f)^2 \,d\mu)\,dt. \end{array}

We treat separately small and large values of {t}. Write

\displaystyle  \begin{array}{rcl}  \|f\|^2 =\|f\|^2 -\|P_T f\|^2 +\|P_T f\|^2 . \end{array}

Since

\displaystyle  \begin{array}{rcl}  \|P_T f\|^2 \leq e^{-2T}\|f\|^2 , \end{array}

for {T\geq \frac{1}{2}},

\displaystyle  \begin{array}{rcl}  Var_{\mu}(f)\leq 4\sum_{i=1}^{N}\int_{0}^{T}(\int(L_i P_t f)^2 \,d\mu)\,dt. \end{array}

Commutation: {P_t} commutes with the {L_i}‘s, so

\displaystyle  \begin{array}{rcl}  \| L_i P_t f \|_2^2 &=& \| P_t L_i f \|_2^2 \\ &\leq& \| L_i f \|_p^2 \\ &\leq& \| L_i f \|_1^{2\theta}\| L_i f \|_2^{2(1-\theta)} \\ \end{array}

by hypercontractivity and Hölder, {\theta=\theta(t)} given by

\displaystyle  \begin{array}{rcl}  \frac{1}{p}=\frac{\theta}{1}+\frac{1-\theta}{2}. \end{array}

Finally,

\displaystyle  \begin{array}{rcl}  \|f\|_2^2 \leq 4\sum_{i=1}^{N}\| L_i f \|_2^2 \int_{0}^{T}(\frac{\| L_i f \|_1}{\| L_i f \|_2})^{2\theta(t)}\,dt, \end{array}

gives Talagrand’s inequality.

The main ingredients of the proof were

  1. Expression of energy as a sum of {N} terms.
  2. Commutation.
  3. Global hypercontractivity.

This carries over to certain non product situations.

2.2. The symmetric group

Let {X=\mathfrak{S}_n} be the symmetric group, equipped with the kernel corresponding to the random walk associated to the generating set consisting of all transpositions, i.e.

\displaystyle  \begin{array}{rcl}  Kf(x)=\sum_{\tau \,\mathrm{transposition}}\frac{2}{n(n-1)}f(\tau x), \end{array}

\displaystyle  \begin{array}{rcl}  \mathcal{E}(f,f)=\frac{1}{n(n-1)}\sum_{\tau \,\mathrm{transposition}}\|D_{\tau}f\|_2^2 . \end{array}

Commutation holds: {KD_{\tau}=D_{\tau}K}.

The spectral gap {\lambda=\frac{2}{n-1}} is due to Diaconis-Shahshahani in 1981.

Hypercontractivity {\rho\sim\frac{1}{n\log n}} was proven by Diaconis and Saloff-Coste (1996) and Lee-Yau (1998).

The above proof automatically gives the following Talagrand inequality.

\displaystyle  \begin{array}{rcl}  Var_{\mu}(f)\leq C\frac{\log n}{n}\sum_{\tau}\frac{\|D_{\tau}f\|_2}{1+\log(\frac{\|D_{\tau}f\|_2}{\|D_{\tau}f\|_1})}. \end{array}

In terms of influences of Boolean functions, there exists a transposition {\tau} such that

\displaystyle  \begin{array}{rcl}  Inf_{\tau}(A)\geq\frac{1}{Cn}\mu(A)(1-\mu(A)). \end{array}

It is not that exciting since it follows from spectral gap. For Talagrand’s inequality to improve on spectral gap, {\rho} must be substantially larger than {\lambda}, i.e.

\displaystyle  \begin{array}{rcl}  \rho\log\frac{1}{\rho}\gg \lambda. \end{array}

3. Continuous setting

3.1. Gaussian case

From now on, {\mu} is the Gaussian measure on {{\mathbb R}^N}. Then Talagrand’s inequality reads

\displaystyle  \begin{array}{rcl}  Var_{\mu}(f)\leq C\sum_{i=1}^{N}\frac{\|\partial_{i}f\|_2}{1+\log(\frac{\|\partial_{i}f\|_2}{\|\partial_{i}f\|_1})}. \end{array}

The proof goes along similar lines. {P_t} is replaced by the Ornstein-Uhlenbeck semigroup

\displaystyle  \begin{array}{rcl}  P_t f(x)=\int_{{\mathbb R}^N}f(e^{-t}x+(1-e^{-2t})^{1/2}y)\,d\mu(y). \end{array}

It commutes with partial derivatives. Hypercontractivity holds with {\rho=1}.

3.2. Extensions

On {{\mathbb R}^N ={\mathbb R}^{n_1}\times\cdots\times{\mathbb R}^{n_{\ell}}}, use measure {e^{-V_i}\,dx}. Assume a lower bound on the Hessian {V''\geq-K} (the correct formulation involves the Ricci curvature of the semigroup). Then commutation holds up to a factor {e^{Kt}}:

\displaystyle  \begin{array}{rcl}  |\nabla P_t f|\leq e^{Kt}P_t(|\nabla f|). \end{array}

And one gets a Talagrand inequality.

3.3. The round sphere

Let {D_{ij}f=x_i \partial_j -x_j \partial_i}. Write

\displaystyle  \begin{array}{rcl}  \mathcal{E}(f,f)=\sum_{i,i}\int(D_{ij}f)^2 \,d\mu. \end{array}

Commutation {\Delta D_{ij}=D_{ij}\Delta} holds. {\lambda=\rho=n-1}. This gives a Talagrand inequality

\displaystyle  \begin{array}{rcl}  Var_{\mu}(f)\leq C\sum_{i,\,j=1}^{N}\frac{\|D_{ij}f\|_2}{1+\log(\frac{\|D_{ij}f\|_2}{\|D_{ij}f\|_1})}. \end{array}

4. {L_1} Talagrand inequality

4.1. {L_1} influences

Apply Talagrand’s inequality to characteristic functions of sets. The {L_1}-influence makes sense (Keller, Mossel, Sen 2010).

In the Gaussian case, it has the following geometric meaning. Given {A\in {\mathbb R}^N}, let

\displaystyle  \begin{array}{rcl}  A_i^x =\{y\in{\mathbb R}\,;\,(x_1 ,\ldots,x_{i-1},y,x_{i+1},\ldots,x_n)\in A\}. \end{array}

Then define

\displaystyle  \begin{array}{rcl}  \mu_i^+ (A_i^x)=\liminf_{r\rightarrow 0}\frac{1}{r}(\mu(A_i^x +[-r,r])-\mu(A_i^x)). \end{array}

Then

\displaystyle  \begin{array}{rcl}  Inf_i (A):=\|\partial_{i}(1_A )\|_1 =\int\mu_i^+ (A_i^x)\,d\mu(x). \end{array}

Theorem 3 (Keller, Mossel, Sen 2010) There exists {i\in[n]} such that

\displaystyle  \begin{array}{rcl}  Inf_i (A)\geq c\,\mu(A)(1-\mu(A))\frac{\sqrt{\log N}}{N}. \end{array}

This is optimal.

4.2. Proof of the {L_1}-influence bound

I show that a slight modification of above proof gives Keller, Mossel and Sen’s result.

Change {t} into {2t}:

\displaystyle  \begin{array}{rcl}  Var_{\mu}(f)=4\sum_{i=1}^{N}\int_{0}^{\infty}(\int(\partial_i P_{2t}f)^2 \,d\mu)\,dt. \end{array}

Use spectral gap,

\displaystyle  \begin{array}{rcl}  \|\partial_i P_{2t}f\|_2^2 &\leq& e^{-2t} \|P_t (\partial_i P_{t}f)\|_2^2 \\ &\leq& e^{-2t} \|\partial_i P_{t}f\|_p^2 , \end{array}

by hypercontractivity.

Trick: If {\|f\|_{\infty}\leq 1}, {P_t f} is Lipschitz,

\displaystyle  \begin{array}{rcl}  \|\partial_i P_{t}f\|_{\infty}\leq\frac{1}{\sqrt{t}}. \end{array}

This easily follows from the integral representation of the Ornstein-Uhlenbeck semigroup. Estimate

\displaystyle  \begin{array}{rcl}  \|P_t (\partial_i P_{t}f)\|_p \leq \end{array}

Get

\displaystyle  \begin{array}{rcl}  \|\partial_i P_{2t}f\|_2^2 \leq e^{-2t}(\frac{1}{\sqrt{t}})^{\frac{2(p-1)}{p}}\|\partial_i P_{t}f\|_{1}^{2/p}. \end{array}

Talgrand’s inequality for {L^1} norms reads

\displaystyle  \begin{array}{rcl}  Var_{\mu}(f)\leq C\,\sum_{i=1}^{N}\frac{\|\partial_{i}f\|_1 (1+\|\partial_{i}f\|)}{\sqrt{1 +\log^+ (\frac{1}{\|\partial_{i}f\|_1}}}. \end{array}

From there, the application to influences is immediate.

4.3. Connections with isoperimetry

On {{\mathbb R}^N ={\mathbb R}^{n_1}\times\cdots\times{\mathbb R}^{n_{\ell}}}, use Gaussian measure again. The {L_1} Talagrand inequality reads

\displaystyle  \begin{array}{rcl}  Var_{\mu}(f)\leq C\,\sum_{i=1}^{N}\frac{\|\nabla_{i}f\|_1 (1+\|\nabla_{i}f\|)}{\sqrt{1 +\log^+ (\frac{1}{\|\nabla_{i}f\|_1})}}. \end{array}

In terms of influences, there exists {i\in[\ell]} such that

\displaystyle  \begin{array}{rcl}  \|\nabla_i (1_A)\|_1 \geq \frac{c}{\ell}\,\mu(A)(1-\mu(A))\sqrt{\log(\frac{\ell}{C\,\mu(A)(1-\mu(A))})}. \end{array}

In case one only one factor, this says that

\displaystyle  \begin{array}{rcl}  \|\nabla(1_A)\|_1 \geq c\,\mu(A)(1-\mu(A))\sqrt{\log(\frac{1}{C\,\mu(A)(1-\mu(A))})}. \end{array}

We recognize a quantitative form of the Gaussian isoperimetric inequality

\displaystyle  \begin{array}{rcl}  \mu^+ (A)\geq \phi\circ\Phi^{-1}(\mu(A)). \end{array}

Note that

\displaystyle  \begin{array}{rcl}  \phi\circ\Phi^{-1}(t)\sim t\sqrt{2\log\frac{1}{t}} \end{array}

as {t} tends to {0}. So we merely lost a constant factor.

Advertisements

About metric2011

metric2011 is a program of Centre Emile Borel, an activity of Institut Henri Poincaré, 11 rue Pierre et Marie Curie, 75005 Paris, France. See http://www.math.ens.fr/metric2011/
This entry was posted in Course. Bookmark the permalink.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s