Notes of Christophe Garban’s talk

 

Influences, noise sensitivity and randomized algorithms

I will survey a few famous theorems involving influences, together with sketches of proofs. I will mention applications to statistical physics and randomized algorithms.

 

1. Influences

 

1.1. Boolean functions

 

A Boolean function is a function {f:\{\pm 1\}^n \rightarrow \{\pm 1\}}. There are {2^{2^n}} of them. That is a lot. The point of the lecture is to give some structural results on this set.

Example 1 Our favourite Boolean functions are

  1. Majority: {f(x)=sign(\sum x_i)}.
  2. Dictator: {f(x)=x_i}.
  3. Fix a subdomain of size {n^2} of the {{\mathbb Z}^2}-grid, and two pieces {A} and {B} of the boundary. Play bond percolation. Define {f=1} iff there is an open path connecting {A} to {B}.

In this last example, say an edge is pivotal if changing that edge changes {f}. The probability of a given edge to be pivotal is conjectured to be {\sim n^{-5/4}}. Known in the triangular lattice.

 

1.2. Influences

 

Definition 1 Given a probability distribution on the hypercube, and {k\in[n]}, the influence of the {k}-th coordinate on a Boolean function {f} is the probability that {k} be pivotal, i.e.

\displaystyle  \begin{array}{rcl}  Inf_k (f)=\mathop{\mathbb P}(f(x)\not=f(xe_k)). \end{array}

Most of the time, the probability distribution on the hypercube will be uniform. When we use instead the {p}-biased Bernoulli distribution {\mathop{\mathbb P}_p =(p\delta_1 +(1-\delta_{-1})^{\otimes n}}, we denoter influence by {I_{k}^{p}(f)}.

Example 2 For the majority function {f_n}

 

Definition 2 The total influence is

\displaystyle  \begin{array}{rcl}  I(f)=\sum_{k=1}^{n}Inf_k (f)=\|Inf(f)\|_1 . \end{array}

 

1.3. Geometric interpretation

 

Given a subset {A} of the hypercube, define the edge-boundary {\partial_{E}(A)} as the set of edges joining {A} to its complement. If we use the uniform distribution on the hypercube, for the characteristic function {f} of {A},

\displaystyle  \begin{array}{rcl}  I(f)=2^{1-n}|\partial_{E}(A)|. \end{array}

 

1.4. An easy estimate

 

Proposition 3 For all {f:\{\pm 1\}^n \rightarrow\{ 0,1 \}}, there exists {k} such that

\displaystyle  \begin{array}{rcl}  Inf_k (f)\geq \frac{1}{n}Var(f). \end{array}

 

Proof: Poincaré inequality states that {Var(f)\leq I(f)}. \Box

This argument applies to all real valued functions.

 

1.5. The KKL Theorem

This is an improvement on the Poincaré inequality, taking into account the fact the function is boolean.

Theorem 4 (Kahn, Kalai, Linial 1988) There is a universal constant {c} such that for every {f:\{\pm 1\}^n \rightarrow\{ 0,1 \}}, there exists {k\in[n]} such that

\displaystyle  \begin{array}{rcl}  Inf_k (f)\geq \frac{c}{n}Var(f). \end{array}

 

Example 3 (Ben Or, Linial) The tribes function. Divide {[n]} in {n/k} blocks of size {k}. Define {f(x)=1} iff at least one block is entirely {1}‘s. The clever choice is {k=\log_2 n -\log_2 \log_2 n}.

 

Then for every {m},

\displaystyle  \begin{array}{rcl}  Inf_m (f)\sim 2^{1-k}\sim\frac{\log n}{n}. \end{array}

Remark 1 Assume that {A} is invariant under a transitive group of permutations of {[n]}. Then

\displaystyle  \begin{array}{rcl}  I(f)\geq c\,Var(f)\log n. \end{array}

This can be viewed as an isoperimetric inequality.

 

1.6. Proof of KKL theorem

 

It uses some discrete harmonic analysis. Use the notation from the courses by Kindler or Raghavendra. Every function (Fourier-Walsh) decomposes as

\displaystyle  \begin{array}{rcl}  f=\sum_{S\subset [n]}\hat{f}_S \chi_S , \end{array}

where

\displaystyle  \begin{array}{rcl}  \chi_S (x)=\prod_{i\in S} x_i . \end{array}

Parseval’s identity is

\displaystyle  \begin{array}{rcl}  \mathop{\mathbb E}(f^2 )=\sum_{S\subset [n]}\hat{f}_S^{2} . \end{array}

Denote by

\displaystyle  \begin{array}{rcl}  \nabla_k f (x)=f(xe_k)-f(x). \end{array}

In general,

\displaystyle  \begin{array}{rcl}  \nabla_{k}f=2\sum_{\{S\subset[n]\,;\,k\in S}\hat{f}_S \chi_S , \end{array}

thus

\displaystyle  \begin{array}{rcl}  \widehat{\nabla_{k}f}=2\hat{f}_S \textrm{ if }k\in S,\quad =0 \textrm{ otherwise}. \end{array}

Thus influences read

\displaystyle  \begin{array}{rcl}  Inf_k (f)&=&\mathop{\mathbb P}(f(x)\not=f(xe_k))\\ &=&\mathop{\mathbb P} (\nabla_{k}f\not=0)\\ &=&\mathop{\mathbb E} |\nabla_{k}f|^2 \\ &=&4\sum_{\{S\subset[n]\,;\,k\in S}\hat{f}_S^2 . \end{array}

Assume (by contradiction) that for all {k}, {Inf_k (f)\leq n^{-3/4}}. Then all {\nabla_k f} have small support. Heisenberg’s uncertainty principle suggests that this implies that spectrum is high. Bonami-Beckner inequality makes this quantitative. On the other hand,

\displaystyle  \begin{array}{rcl}  I(f)&=&\sum_{k}Inf_{k}(f)\\ &=&\sum_{S}|S|\hat{f}(S)^2 . \end{array}

If spectrum is high, This is much larger than {\sum_{S}\hat{f}(S)^2 =Var(f)}.

 

2. Application to statistical physics

 

2.1. Russo’s formula

 

Proposition 5 Let {f:\{\pm 1\}^n\rightarrow\{ 0,1 \}} be monotone. Use the {p}-biased distribution on the hypercube. Then the derivative of {p\mapsto \mathop{\mathbb E}_p (f)} at {p} is the total influence {I^p (f)}.

 

The KKL theorem implies the following lower bound (BKKKL), for all {p}, with a {p}-independant constant {c},

\displaystyle  \begin{array}{rcl}  \exists k\in[n] \textrm{ such that } I_k^p (f)\geq c\, Var_p (f) \frac{\log n}{n}. \end{array}

For symmetric functions,

\displaystyle  \begin{array}{rcl}  I^p (f)\geq c\, Var_p (f) \log n. \end{array}

So the slope is big, this is an instance of a sharp threshold result.

One can view this as a generalization of Kolmogorov’s {0-1}-law.

Theorem 6 Let {A\in\{ 0,1 \}^{{\mathbb Z}}} be a monotone set. Then {\mathop{\mathbb P}_p (A)} is either {0} or {1}.

 

2.2. Bond percolation

 

Kesten’s proof that {p_c =\frac{1}{2}} for bond percolation on {{\mathbb Z}^2}. He knew that {f} is balanced at {p=\frac{1}{2}}, he proved by hand that influences were large, so just above {p=\frac{1}{2}} connections across rectangles of aspect ratio {5} have high probability.

 

2.3. First passage percolation

 

Study balls in random metrics on {{\mathbb Z}^d}. One knows that a limit shape exists. KKL has been used to show that fluctuations around that shape are small.

 

2.4. Dynamical percolation

 

See Gabor Pete’s talk on friday.

 

3. Randomized algorithms

 

3.1. Revealment

 

Given a Boolean function {f}, one wants an algorithm {A} that computes {f} with high probability, using few queries, i.e. minimizing the revealment

\displaystyle  \begin{array}{rcl}  \delta_A :=\max_{k\in[n]}\mathop{\mathbb P}(k \textrm{ is used by }A). \end{array}

Theorem 7 (Schramm, Steif) For every Boolean function {f}, for all {k\in[n]}, for all algorithms {A},

\displaystyle  \begin{array}{rcl}  \sum_{|S|=k}\hat{f}_S^2 \leq\delta_A k\mathop{\mathbb E}(f^2). \end{array}

This means that small revealment gives a strong control on the Fourier coefficients of {f}.

There is an algorithm with revealment {\leq O(n^{-1/4})} for the connection function: launch a sample connection, i.e. travel through the domain querying at each step the colour of the encountered hex. This requires {\leq O(n^{7/4})} queries.

There is a conjecturally better algorithm.

 

3.2. Queried sets

 

Let {J_A} denote the set of variables queried by {A}. It is easy to show that

\displaystyle  \begin{array}{rcl}  \mathop{\mathbb E}|J_A| \geq I(f). \end{array}

In fact, R. O’Donnell and R. Servedio could prove that

\displaystyle  \begin{array}{rcl}  \mathop{\mathbb E}|J_A| \geq I(f)^2 . \end{array}

This is very useful, since it works in the opposite direction as KKL, and gives lower bounds on revealment, since

\displaystyle  \begin{array}{rcl}  \mathop{\mathbb E}|J_A| \leq n\delta_A . \end{array}

For instance, for the connection Boolean function, {|J_A|} is bounded from above by the typical length of an {SLE} curve, i.e. {n^{7/4}}. Compare to the lower bound {n^{6/4}} provided by Peres, Schramm et al.

 

3.3. Witness revealment

 

Theorem 8 (Schramm, Steif) For every Boolean function {f}, if there exists an algorithm {A} with small revealment, then {f} is noise sensitive.

 

Define a witness as a set of coordinates that suffice to prove that {f=1} (e.g. a connection path for the connection Boolean function). The witness revealment of {f} is the infimum over all randomized witnesses {W} which compute {f} of

\displaystyle  \begin{array}{rcl}  \max_{k\in[n]}\mathop{\mathbb P}(k\in W). \end{array}

Question: does small witness revealment imply noise-sensitivity ?

 

About metric2011

metric2011 is a program of Centre Emile Borel, an activity of Institut Henri Poincaré, 11 rue Pierre et Marie Curie, 75005 Paris, France. See http://www.math.ens.fr/metric2011/
This entry was posted in Workshop lecture and tagged . Bookmark the permalink.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s