1. Proof of the KKL Theorem
Harper’s theorem deals with the sum of influences. It is a bit stronger than the direct consequences of Fourier expansion described by Mossel yesterday. The KKL Theorem is even stronger.
1.1. Fourier expression for influences
Notation: $\Omega = \{0,1\}^n$. It is a metric space of diameter $n$ (Hamming metric) and a graph (the $n$-cube). Let $A$ be a subset of $\Omega$. Let $f = 1_A$ be the 0-1-valued characteristic function of $A$. Assume that $\mathbb{P}(A) = |A|/2^n = \alpha$. Recall that $f$
is nicely written as a sum of multilinear polynomials (the Fourier–Walsh expansion),
$$ f = \sum_{S \subseteq [n]} \hat{f}(S)\, W_S, \qquad W_S(x) = \prod_{k \in S} (-1)^{x_k}. $$
In particular, $\hat{f}(\emptyset) = \mathbb{E}[f] = \alpha$ and, by Parseval, $\sum_S \hat{f}(S)^2 = \|f\|_2^2 = \alpha$.
We are aiming at a Fourier expression of influences. Let $\sigma_k$ flip the $k$-th coordinate. Then
$$ I_k(A) = \mathbb{P}(\{x : f(x) \neq f(\sigma_k x)\}) = \frac{|\partial_k A|}{2^{n-1}}, $$
where $\partial A$ is the set of edges joining $A$ to its complement $A^c$, and $\partial_k A \subseteq \partial A$ corresponds to the subset of edges parallel to the $k$-th axis.
Note that $f - f \circ \sigma_k$ takes values $-1$, $0$ or $1$. $\|f - f \circ \sigma_k\|_2^2$ is the probability that $f(x) \neq f(\sigma_k x)$, so
$$ I_k(A) = \|f - f \circ \sigma_k\|_2^2. $$
The Fourier coefficients of $f - f \circ \sigma_k$ are $2\hat{f}(S)$ if $k \in S$ and $0$ otherwise, so by Parseval
$$ I_k(A) = 4 \sum_{S \ni k} \hat{f}(S)^2, \qquad \sum_{k=1}^n I_k(A) = 4 \sum_S |S|\, \hat{f}(S)^2 \geq 4 \sum_{S \neq \emptyset} \hat{f}(S)^2 = 4\,\mathrm{Var}(f) = 4\alpha(1-\alpha). $$
This is sharp for half-cubes $A = \{x : x_1 = 0\}$.
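The Fourier formula for influences can be checked by brute force on a small cube. The following sketch (my own illustration, not from the lecture; all names are mine) compares the combinatorial definition of $I_k$ with $4\sum_{S\ni k}\hat{f}(S)^2$ for a random subset $A$:

```python
# Brute-force check of I_k(A) = 4 * sum over S containing k of fhat(S)^2,
# on a random subset A of {0,1}^n with n small.
from itertools import product
import random

n = 4
points = list(product([0, 1], repeat=n))
random.seed(0)
f = {x: random.randint(0, 1) for x in points}  # f = 1_A for a random A

def walsh(S, x):
    # Walsh character W_S(x) = prod over k in S of (-1)^{x_k}
    return (-1) ** sum(x[k] for k in S)

def fhat(S):
    # Fourier coefficient: expectation of f * W_S
    return sum(f[x] * walsh(S, x) for x in points) / 2 ** n

subsets = [tuple(k for k in range(n) if (b >> k) & 1) for b in range(2 ** n)]

errors = []
for k in range(n):
    # combinatorial influence: probability that flipping bit k changes f
    I_comb = sum(1 for x in points
                 if f[x] != f[x[:k] + (1 - x[k],) + x[k + 1:]]) / 2 ** n
    I_four = 4 * sum(fhat(S) ** 2 for S in subsets if k in S)
    errors.append(abs(I_comb - I_four))
```

The two quantities agree to machine precision for every coordinate.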
Theorem 1 (Consequence of Harper’s theorem) For $\alpha \leq \frac{1}{2}$,
$$ \sum_{k=1}^n I_k(A) \geq 2\,\alpha \log_2 \frac{1}{\alpha}. $$
This is sharp for subcubes. Indeed, every vertex of a subcube of codimension $m$ has $m$ neighbours in the complement. Since $\alpha = 2^{-m}$, i.e. $m = \log_2 \frac{1}{\alpha}$, this explains the log term. There is a simple proof of a closely related bound, using the method of canonical paths, see below.
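The sharpness claim can be seen concretely on a small example; here is a sketch (my own check, under the reconstructed normalization above) verifying that a codimension-$m$ subcube achieves $\sum_k I_k(A) = 2\alpha\log_2(1/\alpha) = 2m\,2^{-m}$:

```python
# A is the subcube fixing the first m coordinates of {0,1}^n to 0.
# The sum of influences should equal 2 * m * 2^{-m}.
from itertools import product

n, m = 5, 3
points = list(product([0, 1], repeat=n))
inA = lambda x: all(x[i] == 0 for i in range(m))  # membership in the subcube

total = 0.0
for k in range(n):
    flip = lambda x: x[:k] + (1 - x[k],) + x[k + 1:]
    # influence of coordinate k: probability the flip crosses the boundary
    total += sum(1 for x in points if inA(x) != inA(flip(x))) / 2 ** n

expected = 2 * m * 2 ** (-m)  # = 2 * alpha * log2(1/alpha) with alpha = 2^-m
```

Only the $m$ fixed coordinates carry influence, each contributing $2^{-(m-1)}$.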
Our main intermediate step towards KKL Theorem is the following result.
Theorem 2 Let $f = 1_A$, $m \geq 1$. Then
$$ \sum_{0 < |S| \leq m} \hat{f}(S)^2 \leq \frac{2^m}{4m} \sum_{k=1}^n I_k(A)^{4/3}. $$
This implies that if $I_k(A) \leq \epsilon$ for all $k$, then
$$ \sum_{0 < |S| \leq m} \hat{f}(S)^2 \leq \frac{2^m}{4m}\, \epsilon^{1/3} \sum_{k=1}^n I_k(A). $$
This consequence is still weaker than Harper’s theorem as far as $\sum_k I_k(A)$ is concerned, but the theorem is stronger. It says that the portion of the $L^2$-norm carried by small sets is negligible.
Recall that our main aim is: if $\max_k I_k(A)$ is small and $f$ is Boolean, then $f$ is concentrated on high frequencies, i.e. large sets $S$. Note that for Boolean functions, $\|f\|_p^p = \mathbb{E}[f] = \alpha$ for all $p$, so $\|f\|_p = \alpha^{1/p}$. If $\alpha$ is small, then for distinct $p$ and $q$, $\|f\|_p$ and $\|f\|_q$ are very different.
Recall Khintchine’s theorem. It says that if a function has only degree one Fourier coefficients, then all the $L^p$-norms of $f$ are comparable.
Theorem 3 If $f = \sum_{k=1}^n a_k W_{\{k\}}$ is of degree one and $0 < p < \infty$, then
$$ A_p\, \|f\|_2 \leq \|f\|_p \leq B_p\, \|f\|_2, $$
where $A_p$, $B_p$ depend on $p$ only.
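For $p = 1$ the comparability can be observed numerically. The sketch below (my own illustration) evaluates random degree-one functions and checks that $\|f\|_1/\|f\|_2$ stays in $[1/\sqrt{2}, 1]$; the lower constant $1/\sqrt{2}$, due to Szarek, is the sharp one for $p = 1$:

```python
# Observe Khintchine's inequality for degree-one functions f = sum_k a_k x_k
# on {-1,1}^n: the ratio ||f||_1 / ||f||_2 lies in [1/sqrt(2), 1].
from itertools import product
import math, random

random.seed(1)
n = 6
points = list(product([-1, 1], repeat=n))  # +-1 coordinates: W_{k}(x) = x_k

ratios = []
for _ in range(50):
    a = [random.gauss(0, 1) for _ in range(n)]          # random coefficients
    vals = [sum(ak * xk for ak, xk in zip(a, x)) for x in points]
    l1 = sum(abs(v) for v in vals) / 2 ** n             # ||f||_1
    l2 = math.sqrt(sum(v * v for v in vals) / 2 ** n)   # ||f||_2
    ratios.append(l1 / l2)
```

The upper bound $\|f\|_1 \leq \|f\|_2$ is just Cauchy–Schwarz; the content of the theorem is the lower bound.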
We want to replace degree one by small degree. Also, we cannot assume a Boolean function is entirely supported on low degrees; we can merely assume that this holds approximately. Fortunately for us, the relevant extension of Khintchine’s theorem exists in the literature. Also, the best constant in Khintchine’s theorem was not given by Khintchine, but only several decades later.
Theorem 4 (Bonami, Gross, Beckner (Nelson,…)) Let $1 \leq p \leq 2$, $0 \leq \delta \leq \sqrt{p-1}$. Let $T_\delta$ be the noise operator defined by $T_\delta f = \sum_S \delta^{|S|} \hat{f}(S)\, W_S$. Then
$$ \|T_\delta f\|_2 \leq \|f\|_p. $$
It is easy to derive the inequality for general $n$ from the 1-dimensional case, by induction, due to the product structure. The 1-dimensional case is a not quite easy exercise in calculus (only 3 parameters involved).
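The inequality can be sanity-checked by brute force; the following sketch (my own check, using the statement as reconstructed above with $\delta = 1/\sqrt{2}$, $p = 3/2$) evaluates both sides on random real-valued functions:

```python
# Check ||T_delta f||_2 <= ||f||_p, p = 1 + delta^2, on random functions.
from itertools import product
import math, random

random.seed(2)
n = 4
delta = 1 / math.sqrt(2)
p = 1 + delta ** 2  # = 3/2
points = list(product([0, 1], repeat=n))
subsets = [tuple(k for k in range(n) if (b >> k) & 1) for b in range(2 ** n)]

def walsh(S, x):
    return (-1) ** sum(x[k] for k in S)

violations = 0
for _ in range(20):
    f = {x: random.gauss(0, 1) for x in points}
    fhat = {S: sum(f[x] * walsh(S, x) for x in points) / 2 ** n
            for S in subsets}
    # noise operator: damp level-|S| coefficients by delta^|S|
    Tf = {x: sum(delta ** len(S) * fhat[S] * walsh(S, x) for S in subsets)
          for x in points}
    lhs = math.sqrt(sum(v * v for v in Tf.values()) / 2 ** n)
    rhs = (sum(abs(v) ** p for v in f.values()) / 2 ** n) ** (1 / p)
    if lhs > rhs + 1e-9:
        violations += 1
```

No violations occur, in line with the theorem (which of course is a statement for all $f$, not just random ones).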
Aline Bonami was the first to prove this, although Ed Nelson considered similar issues motivated by mathematical physics earlier. Gross observed the connection with log-Sobolev inequalities.
1.3. Back to Harper’s Theorem
Now I state Harper’s theorem in its full force.
Theorem 5 (Harper) Use lexicographic ordering on $\{0,1\}^n$. Then initial segments of this ordering minimize $|\partial A|$ among sets of the same size.
Here is a proof of the consequence alluded to above. Given $x, y \in \{0,1\}^n$, the canonical path from $x$ to $y$ consists in flipping bits when necessary, from left to right. Consider all canonical paths from $A$ to $A^c$. There are $|A|\,|A^c|$ such paths, and every such path contains at least one edge of $\partial A$, traversed from the $A$-side to the $A^c$-side. Conversely, given an edge which amounts to flipping the $k$-th coordinate, it belongs to at most $2^{k-1} \cdot 2^{n-k} = 2^{n-1}$ such canonical paths. This implies that
$$ |A|\,|A^c| \leq 2^{n-1}\, |\partial A|, \quad \text{i.e.} \quad \sum_{k=1}^n I_k(A) \geq 4\,\alpha(1-\alpha). $$
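The key counting step — each directed edge lies on exactly $2^{n-1}$ canonical paths — can be verified exhaustively for small $n$; a sketch (my own check):

```python
# Count, for every directed edge of the n-cube, how many canonical paths
# (flip differing bits left to right) traverse it. Claim: exactly 2^{n-1}.
from itertools import product

n = 4
points = list(product([0, 1], repeat=n))

def path(x, y):
    # directed edges on the canonical path from x to y
    edges, z = [], list(x)
    for k in range(n):
        if z[k] != y[k]:
            u = tuple(z)
            z[k] = y[k]
            edges.append((u, tuple(z)))
    return edges

count = {}
for x in points:
    for y in points:
        for e in path(x, y):
            count[e] = count.get(e, 0) + 1

counts = set(count.values())  # should be the single value 2^{n-1}
```

For $n = 4$ every directed edge is used by exactly $8 = 2^{n-1}$ paths: the coordinates to the left of the flipped bit are free in $x$, those to the right are free in $y$.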
1.4. Proof of Theorem 2
Now I prove the Theorem above. Take $g_k = f - f \circ \sigma_k$ and $\delta = \frac{1}{\sqrt{2}}$, so that $p = 1 + \delta^2 = \frac{3}{2}$. Then
$$ \|T_\delta g_k\|_2^2 \leq \|g_k\|_{3/2}^2 = I_k(A)^{4/3}, $$
since $g_k$ takes values in $\{-1, 0, 1\}$ and $\mathbb{E}|g_k|^{3/2} = I_k(A)$.
Spell it out:
$$ \sum_S 2^{-|S|}\, \hat{g}_k(S)^2 = 4 \sum_{S \ni k} 2^{-|S|}\, \hat{f}(S)^2 \leq I_k(A)^{4/3}, $$
and summing over $k$,
$$ 4 \sum_S |S|\, 2^{-|S|}\, \hat{f}(S)^2 \leq \sum_{k=1}^n I_k(A)^{4/3}. $$
When one sticks to $S$ such that $0 < |S| \leq m$, $|S|\, 2^{-|S|} \geq m\, 2^{-m}$. Since this gives
$$ 4m\, 2^{-m} \sum_{0 < |S| \leq m} \hat{f}(S)^2 \leq \sum_{k=1}^n I_k(A)^{4/3}, $$
we get Theorem 2.
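The conclusion of the argument, in the form I reconstructed Theorem 2 above, can be tested on random Boolean functions; a sketch (my own check):

```python
# Test: sum_{0<|S|<=m} fhat(S)^2 <= (2^m / 4m) * sum_k I_k(A)^{4/3}
# for random subsets A of {0,1}^n.
from itertools import product
import random

random.seed(3)
n, m = 4, 2
points = list(product([0, 1], repeat=n))
subsets = [tuple(k for k in range(n) if (b >> k) & 1) for b in range(2 ** n)]

def walsh(S, x):
    return (-1) ** sum(x[k] for k in S)

violations = 0
for _ in range(20):
    f = {x: random.randint(0, 1) for x in points}
    fhat = {S: sum(f[x] * walsh(S, x) for x in points) / 2 ** n
            for S in subsets}
    low = sum(fhat[S] ** 2 for S in subsets if 0 < len(S) <= m)
    infl = [sum(1 for x in points
                if f[x] != f[x[:k] + (1 - x[k],) + x[k + 1:]]) / 2 ** n
            for k in range(n)]
    rhs = 2 ** m / (4 * m) * sum(I ** (4 / 3) for I in infl)
    if low > rhs + 1e-9:
        violations += 1
```

No violations show up, consistent with the hypercontractive derivation.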
1.5. How KKL Theorem applies to collective coin flipping
Theorem 6 (KKL) Let $A \subseteq \{0,1\}^n$. Assume $\mathbb{P}(A) = \alpha$. Then
$$ \max_k I_k(A) \geq c\, \alpha(1-\alpha)\, \frac{\log n}{n}, $$
where $c$ is a universal constant.
By applying it iteratively, one gets the following Corollary, which applies to collective coin flipping.
Corollary 7 (KKL) Let $A \subseteq \{0,1\}^n$. Assume $\mathbb{P}(A) \geq \frac{1}{2}$. For all $\epsilon > 0$, there exists a constant $c(\epsilon)$ such that there exists $S \subseteq [n]$ with $|S| \leq c(\epsilon)\, \frac{n}{\log n}$ such that setting the coordinates in $S$ suitably raises the probability of $A$ above $1 - \epsilon$.
Proof: Project to some coordinate of maximal influence, i.e. replace $A$ by $A \cup \sigma_k(A)$. Then $\mathbb{P}(A)$ increases by $\frac{1}{2} I_k(A) \geq \frac{c \log n}{2n}$, and one iterates.
1.6. Proof of KKL Theorem
Suppose that for all $k$, $I_k(A) \leq n^{-3/4}$ (otherwise the theorem holds trivially). We will show that the sum $\sum_k I_k(A) \geq c\, \alpha(1-\alpha) \log n$, which gives the theorem upon dividing by $n$.
Recall that $g_k = f - f \circ \sigma_k$ satisfies $\|g_k\|_{3/2}^2 = I_k(A)^{4/3} \leq n^{-1/4}\, I_k(A)$. Apply hypercontractivity: by Theorem 2, with $m = \frac{1}{10} \log_2 n$,
$$ \sum_{0 < |S| \leq m} \hat{f}(S)^2 \leq \frac{2^m}{4m}\, n^{-1/4} \sum_k I_k(A) = \frac{n^{-3/20}}{4m} \sum_k I_k(A). $$
On the other hand, $\sum_k I_k(A) = 4 \sum_S |S|\, \hat{f}(S)^2$. If we had $\sum_k I_k(A) \leq m\, \mathrm{Var}(f)$, the previous bound would give, for $n$ large,
$$ \sum_{0 < |S| \leq m} \hat{f}(S)^2 \leq \frac{1}{2}\, \mathrm{Var}(f). $$
In other words, all except half of the Fourier weight of $f$ is above level $m$. This implies that $\sum_k I_k(A) \geq 4m \cdot \frac{1}{2}\mathrm{Var}(f) = 2m\, \mathrm{Var}(f)$, a contradiction. Hence $\sum_k I_k(A) \geq m\, \mathrm{Var}(f) \geq c\, \alpha(1-\alpha) \log n$.
KKL is sharp up to the universal constant. Regarding constants, the gap between the tribes example and the KKL estimate is not that large: only a constant factor. Note that we have applied hypercontractivity blindly to a Boolean function, without taking into account the fact that it is Boolean. There may be some room for improvement, but I am unable to make it.
2. Application to first passage percolation
On the grid $\mathbb{Z}^2$, change the length of edges: every edge is assigned a length $1$ or $2$, each with probability $\frac{1}{2}$, independently. What is the distance $D_n$ from $(0,0)$ to $(n,0)$?
It is at most $2n$, so we need not care about edges outside a square of size $O(n)$. Let $N = O(n^2)$ be the number of relevant edges, and let $x \in \{0,1\}^N$ record the random choices. Then the distance is a function $D_n$ of $x \in \{0,1\}^N$.
One can show that $\mathbb{E}[D_n] \sim \mu n$, where not much is known about the constant $\mu$. What is the variance of $D_n$?
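The model is easy to simulate; the following sketch (my own illustration, with a box size chosen for convenience) samples $D_n$ by running Dijkstra on a finite box with random edge lengths:

```python
# Monte Carlo for first passage percolation on a box of the grid:
# edge lengths 1 or 2 with probability 1/2, shortest path (0,0) -> (n,0).
import heapq
import random

random.seed(4)
n = 10
lo, hi = -n, 2 * n  # finite box; D_n <= 2n, so this loses nothing essential

def neighbours(v):
    x, y = v
    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        u = (x + dx, y + dy)
        if lo <= u[0] <= hi and lo <= u[1] <= hi:
            yield u

def sample_distance():
    length = {}  # random edge lengths, drawn lazily and cached per sample
    def edge(u, v):
        e = (min(u, v), max(u, v))
        if e not in length:
            length[e] = random.choice((1, 2))
        return length[e]
    dist, heap = {(0, 0): 0}, [(0, (0, 0))]
    while heap:  # standard Dijkstra
        d, v = heapq.heappop(heap)
        if v == (n, 0):
            return d
        if d > dist.get(v, float("inf")):
            continue
        for u in neighbours(v):
            nd = d + edge(v, u)
            if nd < dist.get(u, float("inf")):
                dist[u] = nd
                heapq.heappush(heap, (nd, u))

samples = [sample_distance() for _ in range(20)]
```

Every sample lies between $n$ (at least $n$ edges, each of length $\geq 1$) and $2n$ (the straight path costs at most $2n$).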
Theorem 8 (Kesten) $\mathrm{Var}(D_n) = O(n)$.
Later on, Talagrand gave Gaussian tail estimates on $D_n - \mathbb{E}[D_n]$ at scale $\sqrt{n}$.
I will show next time that Kesten’s estimate follows easily from the Poincaré inequality $\mathrm{Var}(f) \leq \frac{1}{4} \sum_k \|f - f \circ \sigma_k\|_2^2$ for Boolean functions.
Using hypercontractivity, I will show that $\mathrm{Var}(D_n) = O(n/\log n)$, and also how Gaussian tail estimates follow.