## Notes of Christophe Garban’s talk

Influences, noise sensitivity and randomized algorithms

I will survey a few famous theorems involving influences, together with sketches of proofs. I will mention applications to statistical physics and randomized algorithms.

1. Influences

1.1. Boolean functions

A Boolean function is a function ${f:\{\pm 1\}^n \rightarrow \{\pm 1\}}$. There are ${2^{2^n}}$ of them. That is a lot. The point of the lecture is to give some structural results on this set.

Example 1 Our favourite Boolean functions are

1. Majority: ${f(x)=\mathrm{sign}(\sum_i x_i)}$ (say with ${n}$ odd, so there are no ties).
2. Dictator: ${f(x)=x_i}$ for some fixed ${i}$.
3. Fix a subdomain of size ${n^2}$ of the ${{\mathbb Z}^2}$-grid, and two pieces ${A}$ and ${B}$ of the boundary. Play bond percolation. Define ${f=1}$ iff there is an open path connecting ${A}$ to ${B}$.

In this last example, say an edge is pivotal if changing that edge changes ${f}$. The probability of a given edge to be pivotal is conjectured to be ${\sim n^{-5/4}}$. Known in the triangular lattice.

1.2. Influences

Definition 1 Given a probability distribution on the hypercube, and ${k\in[n]}$, the influence of the ${k}$-th coordinate on a Boolean function ${f}$ is the probability that ${k}$ is pivotal, i.e.

$\displaystyle \begin{array}{rcl} Inf_k (f)=\mathop{\mathbb P}(f(x)\not=f(xe_k)), \end{array}$

where ${xe_k}$ denotes ${x}$ with its ${k}$-th coordinate flipped.

Most of the time, the probability distribution on the hypercube will be uniform. When we use instead the ${p}$-biased product measure ${\mathop{\mathbb P}_p =(p\delta_1 +(1-p)\delta_{-1})^{\otimes n}}$, we denote the influence by ${I_{k}^{p}(f)}$.
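For small ${n}$, influences can be computed exactly by enumerating the hypercube. A minimal Python sketch (the helper names are mine, not from the talk):

```python
from itertools import product

def influence(f, n, k, p=0.5):
    """Exact influence of coordinate k under the p-biased product
    measure, by enumerating the hypercube (feasible for small n)."""
    total = 0.0
    for x in product([-1, 1], repeat=n):
        y = list(x)
        y[k] = -y[k]                      # flip the k-th coordinate
        if f(x) != f(tuple(y)):
            # probability of x under the p-biased product measure
            w = 1.0
            for xi in x:
                w *= p if xi == 1 else (1 - p)
            total += w
    return total

majority = lambda x: 1 if sum(x) > 0 else -1   # n odd, so no ties
dictator = lambda x: x[0]

n = 5
print([round(influence(majority, n, k), 4) for k in range(n)])
# → [0.375, 0.375, 0.375, 0.375, 0.375]
print([round(influence(dictator, n, k), 4) for k in range(n)])
# → [1.0, 0.0, 0.0, 0.0, 0.0]
```

For the majority on ${5}$ bits, flipping a coordinate is pivotal exactly when the other four votes tie, which has probability ${\binom{4}{2}/2^4 =0.375}$.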

Example 2 For the majority function ${f_n}$ (with ${n}$ odd), all influences are equal by symmetry: a coordinate is pivotal iff the other ${n-1}$ votes tie, so ${Inf_k (f_n)\sim\sqrt{\frac{2}{\pi n}}}$, and the total influence is ${\sim\sqrt{\frac{2n}{\pi}}}$.

Definition 2 The total influence is

$\displaystyle \begin{array}{rcl} I(f)=\sum_{k=1}^{n}Inf_k (f)=\|Inf(f)\|_1 . \end{array}$

1.3. Geometric interpretation

Given a subset ${A}$ of the hypercube, define the edge-boundary ${\partial_{E}(A)}$ as the set of edges joining ${A}$ to its complement. If we use the uniform distribution on the hypercube, for the characteristic function ${f}$ of ${A}$,

$\displaystyle \begin{array}{rcl} I(f)=2^{1-n}|\partial_{E}(A)|. \end{array}$
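This identity is easy to check by brute force for a small ${A}$; a sketch (function names are mine):

```python
from itertools import product

def total_influence(f, n):
    # I(f) = sum over k of P(f(x) != f(x e_k)), uniform measure
    I = 0.0
    for x in product([-1, 1], repeat=n):
        for k in range(n):
            y = list(x); y[k] = -y[k]
            if f(x) != f(tuple(y)):
                I += 2.0 ** (-n)
    return I

def edge_boundary(f, n):
    # count hypercube edges {x, x e_k} with f(x)=1, f(x e_k)=0,
    # i.e. edges joining A = f^{-1}(1) to its complement
    count = 0
    for x in product([-1, 1], repeat=n):
        if f(x) != 1:
            continue
        for k in range(n):
            y = list(x); y[k] = -y[k]
            if f(tuple(y)) != 1:
                count += 1
    return count

n = 4
f = lambda x: 1 if sum(x) > 0 else 0   # indicator of a majority-style event
print(total_influence(f, n), 2.0 ** (1 - n) * edge_boundary(f, n))
# → 1.5 1.5  (both sides of the identity agree)
```

The factor ${2^{1-n}}$ appears because each boundary edge is counted from both endpoints, each carrying mass ${2^{-n}}$.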

1.4. An easy estimate

Proposition 3 For all ${f:\{\pm 1\}^n \rightarrow\{ 0,1 \}}$, there exists ${k}$ such that

$\displaystyle \begin{array}{rcl} Inf_k (f)\geq \frac{1}{n}Var(f). \end{array}$

Proof: Poincaré inequality states that ${Var(f)\leq I(f)}$. $\Box$

This argument applies to all real-valued functions.

1.5. The KKL Theorem

This is an improvement on the Poincaré inequality, taking into account the fact that the function is Boolean.

Theorem 4 (Kahn, Kalai, Linial 1988) There is a universal constant ${c}$ such that for every ${f:\{\pm 1\}^n \rightarrow\{ 0,1 \}}$, there exists ${k\in[n]}$ such that

$\displaystyle \begin{array}{rcl} Inf_k (f)\geq c\, Var(f)\,\frac{\log n}{n}. \end{array}$

Example 3 (Ben Or, Linial) The tribes function. Divide ${[n]}$ into ${n/k}$ blocks of size ${k}$. Define ${f(x)=1}$ iff at least one block consists entirely of ${1}$'s. The clever choice is ${k=\log_2 n -\log_2 \log_2 n}$.

Then for every ${m}$,

$\displaystyle \begin{array}{rcl} Inf_m (f)\sim 2^{1-k}\sim\frac{\log n}{n}. \end{array}$
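A coordinate ${m}$ is pivotal iff its ${k-1}$ tribe-mates are all ${1}$ and no other tribe is complete, which gives the exact formula ${Inf_m (f)=2^{1-k}(1-2^{-k})^{n/k-1}}$. One can check numerically that the choice of ${k}$ above indeed gives influences of order ${\frac{\log n}{n}}$; a sketch (assuming ${k}$ divides ${n}$, and with names of my own):

```python
import math

def tribes_influence(n, k):
    """Exact influence of one coordinate of the tribes function:
    its k-1 tribe-mates are all 1, and no other tribe is all 1."""
    return 2.0 ** (1 - k) * (1 - 2.0 ** (-k)) ** (n // k - 1)

for m in (10, 14, 18):
    n = 2 ** m
    k = round(math.log2(n) - math.log2(math.log2(n)))
    # influence vs. the claimed order log(n)/n
    print(n, tribes_influence(n, k), math.log(n) / n)
```

The two columns stay within a bounded ratio of each other as ${n}$ grows, so the tribes function shows that the ${\log n}$ factor in the KKL theorem cannot be improved.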

Remark 1 Assume that ${A=f^{-1}(1)}$ is invariant under a transitive group of permutations of ${[n]}$. Then

$\displaystyle \begin{array}{rcl} I(f)\geq c\,Var(f)\log n. \end{array}$

This can be viewed as an isoperimetric inequality.

1.6. Proof of KKL theorem

It uses some discrete harmonic analysis; we use the notation from the courses by Kindler and Raghavendra. Every function decomposes into its Fourier-Walsh expansion

$\displaystyle \begin{array}{rcl} f=\sum_{S\subset [n]}\hat{f}_S \chi_S , \end{array}$

where

$\displaystyle \begin{array}{rcl} \chi_S (x)=\prod_{i\in S} x_i . \end{array}$

Parseval’s identity is

$\displaystyle \begin{array}{rcl} \mathop{\mathbb E}(f^2 )=\sum_{S\subset [n]}\hat{f}_S^{2} . \end{array}$

Denote by

$\displaystyle \begin{array}{rcl} \nabla_k f (x)=f(xe_k)-f(x). \end{array}$

In general,

$\displaystyle \begin{array}{rcl} \nabla_{k}f=-2\sum_{S\subset[n]\,;\,k\in S}\hat{f}_S \chi_S , \end{array}$

thus

$\displaystyle \begin{array}{rcl} \widehat{\nabla_{k}f}(S)=-2\hat{f}_S \textrm{ if }k\in S,\quad =0 \textrm{ otherwise}. \end{array}$

$\displaystyle \begin{array}{rcl} Inf_k (f)&=&\mathop{\mathbb P}(f(x)\not=f(xe_k))\\ &=&\mathop{\mathbb P} (\nabla_{k}f\not=0)\\ &=&\mathop{\mathbb E} |\nabla_{k}f|^2 \\ &=&4\sum_{S\subset[n]\,;\,k\in S}\hat{f}_S^2 , \end{array}$

where the third equality holds because ${\nabla_k f}$ takes values in ${\{-1,0,1\}}$.
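These Fourier identities are easy to verify by direct enumeration for small ${n}$. A brute-force sketch (helper names are mine), checking Parseval and the formula ${Inf_k(f)=4\sum_{S\ni k}\hat f_S^2}$ for a ${\{0,1\}}$-valued ${f}$:

```python
import math
from itertools import product, combinations

def fourier_coeffs(f, n):
    """Walsh coefficients \\hat f_S = E[f(x) chi_S(x)], by enumeration."""
    cube = list(product([-1, 1], repeat=n))
    coeffs = {}
    for r in range(n + 1):
        for S in combinations(range(n), r):
            coeffs[S] = sum(f(x) * math.prod(x[i] for i in S)
                            for x in cube) / 2 ** n
    return coeffs

n = 4
f = lambda x: 1 if sum(x) > 0 else 0
c = fourier_coeffs(f, n)

# Parseval: E[f^2] = sum of squared coefficients
lhs = sum(f(x) ** 2 for x in product([-1, 1], repeat=n)) / 2 ** n
print(abs(lhs - sum(v * v for v in c.values())) < 1e-12)   # → True

# Inf_k(f) = 4 * sum over S containing k of \hat f_S^2
k = 0
inf_k = sum(1 for x in product([-1, 1], repeat=n)
            if f(x) != f(tuple(-xi if i == k else xi
                               for i, xi in enumerate(x)))) / 2 ** n
print(abs(inf_k - 4 * sum(v * v for S, v in c.items() if k in S)) < 1e-12)
# → True
```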

Assume (by contradiction) that for all ${k}$, ${Inf_k (f)\leq n^{-3/4}}$. Then all ${\nabla_k f}$ have small support. Heisenberg’s uncertainty principle suggests that this implies that spectrum is high. Bonami-Beckner inequality makes this quantitative. On the other hand,

$\displaystyle \begin{array}{rcl} I(f)&=&\sum_{k}Inf_{k}(f)\\ &=&\sum_{S}|S|\hat{f}_S^2 . \end{array}$

If the spectrum is high, this is much larger than ${\sum_{S\neq\emptyset}\hat{f}_S^2 =Var(f)}$, a contradiction.

2. Application to statistical physics

2.1. Russo’s formula

Proposition 5 Let ${f:\{\pm 1\}^n\rightarrow\{ 0,1 \}}$ be monotone. Use the ${p}$-biased distribution on the hypercube. Then the derivative of ${p\mapsto \mathop{\mathbb E}_p (f)}$ at ${p}$ is the total influence ${I^p (f)}$.
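Russo's formula can be checked numerically for a small monotone ${f}$, comparing a finite-difference derivative of ${p\mapsto\mathop{\mathbb E}_p (f)}$ with the total influence ${I^p (f)}$. A sketch, with names of my own:

```python
import math
from itertools import product

def Ep(f, n, p):
    # E_p[f] under the p-biased product measure
    return sum(f(x) * math.prod(p if xi == 1 else 1 - p for xi in x)
               for x in product([-1, 1], repeat=n))

def total_influence_p(f, n, p):
    # I^p(f) = sum over k of P_p(k is pivotal)
    I = 0.0
    for x in product([-1, 1], repeat=n):
        w = math.prod(p if xi == 1 else 1 - p for xi in x)
        for k in range(n):
            y = list(x); y[k] = -y[k]
            if f(x) != f(tuple(y)):
                I += w
    return I

n, p, h = 5, 0.3, 1e-6
f = lambda x: 1 if sum(x) > 0 else 0   # monotone
slope = (Ep(f, n, p + h) - Ep(f, n, p - h)) / (2 * h)
print(slope, total_influence_p(f, n, p))  # the two values agree
```

Since ${\mathop{\mathbb E}_p (f)}$ is a polynomial in ${p}$, the central difference is accurate and matches ${I^p (f)}$ to high precision.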

The KKL theorem implies the following lower bound (BKKKL): for all ${p}$, with a ${p}$-independent constant ${c}$,

$\displaystyle \begin{array}{rcl} \exists k\in[n] \textrm{ such that } I_k^p (f)\geq c\, Var_p (f) \frac{\log n}{n}. \end{array}$

For symmetric functions,

$\displaystyle \begin{array}{rcl} I^p (f)\geq c\, Var_p (f) \log n. \end{array}$

So the slope is big; this is an instance of a sharp threshold result.

One can view this as a generalization of Kolmogorov’s ${0-1}$-law.

Theorem 6 Let ${A\subset\{ 0,1 \}^{{\mathbb Z}}}$ be a monotone event. Then ${\mathop{\mathbb P}_p (A)}$ is either ${0}$ or ${1}$.

2.2. Bond percolation

Kesten proved that ${p_c =\frac{1}{2}}$ for bond percolation on ${{\mathbb Z}^2}$. He knew that ${f}$ is balanced at ${p=\frac{1}{2}}$, and he proved by hand that influences are large, so that just above ${p=\frac{1}{2}}$ connections across rectangles of aspect ratio ${5}$ have high probability.

2.3. First passage percolation

Study balls in random metrics on ${{\mathbb Z}^d}$. One knows that a limit shape exists. KKL has been used to show that fluctuations around that shape are small.

2.4. Dynamical percolation

See Gabor Pete’s talk on Friday.

3. Randomized algorithms

3.1. Revealment

Given a Boolean function ${f}$, one wants an algorithm ${A}$ that computes ${f}$ with high probability, using few queries, i.e. minimizing the revealment

$\displaystyle \begin{array}{rcl} \delta_A :=\max_{k\in[n]}\mathop{\mathbb P}(k \textrm{ is used by }A). \end{array}$
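The revealment of a concrete algorithm is easy to estimate by simulation. As an illustration (my own, not from the talk), consider the naive algorithm for Majority that queries coordinates in a uniformly random order until the outcome is forced:

```python
import random

def run_majority_algorithm(n, rng):
    """Query coordinates in uniformly random order until the majority
    is forced; return the list of queried coordinates."""
    x = [rng.choice([-1, 1]) for _ in range(n)]
    order = list(range(n)); rng.shuffle(order)
    queried, s = [], 0
    for k in order:
        queried.append(k); s += x[k]
        if abs(s) > n - len(queried):  # remaining votes cannot flip the sign
            break
    return queried

rng = random.Random(0)
n, trials = 21, 20000
hits = [0] * n
for _ in range(trials):
    for k in run_majority_algorithm(n, rng):
        hits[k] += 1
delta = max(hits) / trials   # Monte Carlo estimate of the revealment
print(delta)
```

For Majority the estimate comes out close to ${1}$: almost all coordinates must be read before the sign is forced, so low-revealment algorithms only exist for special functions such as percolation crossings.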

Theorem 7 (Schramm, Steif) For every Boolean function ${f}$, for all ${k\in[n]}$, for all algorithms ${A}$,

$\displaystyle \begin{array}{rcl} \sum_{|S|=k}\hat{f}_S^2 \leq\delta_A k\mathop{\mathbb E}(f^2). \end{array}$

This means that small revealment gives a strong control on the Fourier coefficients of ${f}$.

There is an algorithm with revealment ${\leq O(n^{-1/4})}$ for the connection function: follow an exploration path through the domain, querying at each step the colour of the hexagon encountered. This requires ${\leq O(n^{7/4})}$ queries.

There is a conjecturally better algorithm.

3.2. Queried sets

Let ${J_A}$ denote the set of variables queried by ${A}$. It is easy to show that

$\displaystyle \begin{array}{rcl} \mathop{\mathbb E}|J_A| \geq I(f). \end{array}$

In fact, R. O’Donnell and R. Servedio proved that

$\displaystyle \begin{array}{rcl} \mathop{\mathbb E}|J_A| \geq I(f)^2 . \end{array}$

This is very useful, since it works in the opposite direction as KKL, and gives lower bounds on revealment, since

$\displaystyle \begin{array}{rcl} \mathop{\mathbb E}|J_A| \leq n\delta_A . \end{array}$

For instance, for the connection Boolean function, ${|J_A|}$ is bounded from above by the typical length of an ${SLE}$ curve, i.e. ${n^{7/4}}$. Compare to the lower bound ${n^{6/4}}$ provided by Peres, Schramm et al.

3.3. Witness revealment

Theorem 8 (Schramm, Steif) For every Boolean function ${f}$, if there exists an algorithm ${A}$ with small revealment, then ${f}$ is noise sensitive.

Define a witness as a set of coordinates that suffices to certify that ${f=1}$ (e.g. a connection path for the connection Boolean function). The witness revealment of ${f}$ is the infimum, over all randomized witnesses ${W}$ for ${f}$, of

$\displaystyle \begin{array}{rcl} \max_{k\in[n]}\mathop{\mathbb P}(k\in W). \end{array}$

Question: does small witness revealment imply noise sensitivity?