Notes of Guy Kindler’s lecture nr 1

Dictatorship testing and hardness of approximation

We shall explain a few sharp hardness of approximation results. This involves several components: PCP, Parallel Repetition, discrete analysis. Then all these components must be merged together, and it is usually delicate.

Several points of views are needed. For instance, we shall frequently shift from the probabilistic checking of proofs to instances of combinatorial optimization problems.

1. Linear equations

In 1996, Johan Håstad gave a sharp hardness of approximation result for E${3}$LIN${(2)}$. Specificly, he proved that ${\forall\epsilon}$, ${\delta>0}$, ${(1-\epsilon,\frac{1}{2}+\delta)}$-GAP E${3}$LIN${(2)}$ is NP-hard.

1.1. E${3}$LIN${(2)}$

An instance ${I}$ of E${3}$LIN${(2)}$ is a (weighted) set of linear equations over Boolean variables ${x_j \in\{\pm 1\}}$, each of the form ${x_{i_1}x_{i_2}x_{i_3}=b}$, ${b\in\{\pm 1\}}$.

Say that

• ${I}$ is a YES instance if there exists an assignment of variables that satisfies a fraction ${\geq 1+\epsilon}$ of equations.
• ${I}$ is a NO instance if no assignment of variables satisfies a fraction ${>\frac{1}{2}+\delta}$ of equations.

The ${(1-\epsilon,\frac{1}{2}+\delta)}$-GAP version of E${3}$LIN${(2)}$ consists in deciding wether a given instance, which is known to be YES or NO, is YES or is NO.

1.2. Reduction

Reduction between from GAP BLAH’ to GAP BLAH means a polytime algorithm which computes from an instance ${I'}$ of problem BLAH’ an instance ${I}$ of problem BLAH, in such a way that if ${I'}$ is YES, so is ${I}$, and if ${I'}$ is NO, so is ${I}$.

Usual decision problems are special cases of gap problems: ${(1,1)}$-GAP problems. The PCP theorem states that a certain gap problem is NP-hard, i.e. admits reductions to arbitrary NP decision problems.

1.3. Testing

Let us switch point of view, and use the testing interpretation. The reduction ${R}$ generates a distribution over equations. Pick an equation at random and evaluate it. An assignment is viewed as an attempt to prove that ${I}$ is YES. If ${I'}$ is YES, then there exists an assignment ${a}$ such that ${\mathop{\mathbb P}_{\mathrm{equations}}(\textrm{equation is satisfied})\geq 1-\epsilon}$. This is called the completeness test. If ${I'}$ is NO, then for every assignment ${a}$, ${\mathop{\mathbb P}_{\mathrm{equations}}(\textrm{equation is satisfied})\leq \frac{1}{2}+\delta}$. This is called the soundness test.

In this language, we see Håstad’s theorem as the conclusion of a long story. Every mathematical proof can be written formally (Frege) and checked in polynomial time (Gödel). Cook-Levin: we can even verify proofs by local tests. Once the proof written according to suitable rules, it is sufficient to check consistency of each ${3}$-uple of bits. PCP says a bounded number of queries suffice to assert correctness of the proof up to a small probability of error. We are dealing with the ultimate result: 3 bits are sufficient, and with the optimal probability gap of ${1/2}$.

1.4. Boolean functions on ${\{\pm 1\}^k}$

Example 1 Majority is ${sign(\sum x_i)}$. ${i}$-th dictatorship is ${x_i}$. Products of dictatorships constitute the Walsh basis of characters ${w_{S}(x)=\prod_{i\in S}x_i}$. A ${j}$-junta is a Boolean function that depends on at most ${j}$ variables.

Note that characters are multiplicative: ${w_{S}(xy)=w_{S}(x)w_{S}(y)}$.

1.5. Testing characters

A ${j}$-bit test detecting a class ${\mathcal{C}}$ of Boolean functions consists of a probability distribution on ${(\{\pm 1\}^k)^{j}}$ and a decision ${g:\{\pm 1\}^j \rightarrow\{\pm 1\}}$. We accept if ${g(f(x^1) ,\ldots,f(x^j))=1}$.

Its completeness is ${\geq c}$ if ${\mathop{\mathbb P}_{x^1 ,\ldots,x^j}(\textrm{accept})\geq c}$ is ${f\in\mathcal{C}}$. Its soundness is ${\leq s}$ if for all ${f}$ far enough from ${\mathcal{C}}$ (in terms of Hamming distance), ${\mathop{\mathbb P}(\textrm{accept})\leq s}$. (Far enough will be specified in examples).

The character test (aka linearity test) uses multiplicativity. It is a ${3}$-bit test detecting the class of characters. The distribution is obtained as follows: pick ${x}$ and ${y}$ uniformly at random in ${\{\pm 1\}^k}$, and let ${z=xy}$. The decision is ${g(a,b,c)=abc}$. In other words, ${f}$ is accepted whatever the input iff ${f}$ is multiplicative, i.e. iff ${f}$ is a character.

The test is complete by construction. Let us show that for functions whose Hamming distance to characters is ${\geq \frac{1}{2}-\delta}$, soundness is ${\leq \frac{1}{2}+\delta}$. Bellare, Goldreich and Sudan, [BGS], had a cumbersome proof, limitied to small ${\delta}$. Håstad, [H], gave the nice and general proof which follows. Note this is a sort of zero-knowledge proof, since when ${\mathop{\mathbb P}(\textrm{accept})> s}$, one does not know which character is close to ${f}$.

Proof: View ${f}$ as a real valued function and write it as a linear combination of characters (Fourier-Walsh transform),

$\displaystyle \begin{array}{rcl} f=\sum_{S\subset[k]}\hat{f}_{S}w_{S}. \end{array}$

This can be proven as follows. Given real valued function ${f}$, ${g}$ on ${\{\pm 1\}^k}$, define inner product

$\displaystyle \begin{array}{rcl} \langle f,g \rangle=\mathop{\mathbb E}_{x}(f(x)g(x)). \end{array}$

Since coordinates are independant and have expectation ${0}$,

$\displaystyle \begin{array}{rcl} \langle w_S ,w_T \rangle=\mathop{\mathbb E}_x (w_{S\Delta T})=0 \end{array}$

unless ${S\Delta T=\emptyset}$, i.e. ${S=T}$. So characters form an orthonormal basis. Furthermore, ${\hat{f}_{S}=\langle f,w_S \rangle}$. If ${f}$ is Boolean,

$\displaystyle \begin{array}{rcl} 1=\mathop{\mathbb E}_x (f(x)^2)=|f|^2 =\sum_{S\subset [k]}\hat{f}_{S}^{2}. \end{array}$

The probability ${P}$ that the test accepts is given by

$\displaystyle \begin{array}{rcl} 2P-1&=&\mathop{\mathbb E}_{x,\,y}(f(x)f(y)f(xy))\\ &=&\sum_{S,\,T,\,U}\hat{f}_{S}\hat{f}_{T}\hat{f}_{R}\mathop{\mathbb E}_{x,\,y}(w_S (x)w_T (y)w_R (xy))\\ &=&\sum_{S,\,T,\,U}\hat{f}_{S}\hat{f}_{T}\hat{f}_{R}\mathop{\mathbb E}_{x,\,y}(w_S (x)w_T (y)w_R (x)w_R (y))\\ &=&\sum_{S,\,T,\,U}\hat{f}_{S}\hat{f}_{T}\hat{f}_{R}\mathop{\mathbb E}_{x}(w_S (x)w_R (x))\mathop{\mathbb E}_y (w_T (y)w_R (y))\\ &=&\sum_{S\in[k]}\hat{f}_{S}^{3}. \end{array}$

To sum up, if probability of acceptance is ${\geq\frac{1}{2}+\delta}$, then

$\displaystyle \begin{array}{rcl} \sum_{S\in[k]}\hat{f}_{S}^{3}\geq 2\delta. \end{array}$

Since ${\sum_{S}\hat{f}_{S}^{2}=1}$, there must be some ${S\in[k]}$ such that ${\hat{f}_{S}\geq 2\delta}$. The Hamming distance of ${f}$ and ${w_S}$ is

$\displaystyle \begin{array}{rcl} H(f,w_s)&=&\mathop{\mathbb P}_x (f(x)\not=w_S (x))\\ &=&\frac{1}{2}+\frac{1}{2}\mathop{\mathbb E}_x (f(x)\not=w_S (x))\\ &=&\frac{1}{2}+\frac{1}{2}\langle f,w_S \rangle\\ &=&\frac{1}{2}-\frac{1}{2}\hat{f}_{S}\\ &\leq&\frac{1}{2}-\delta. \end{array}$

Observe that, due to concentration, most functions ${f}$ have ${\hat{f}_{S}}$ close to ${0}$, i.e. ${H(f,w_s)}$ close to ${\frac{1}{2}}$. Given a constant ${\delta}$, when ${k}$ is large, the proportion of ${f}$ such that ${H(f,w_s)\leq\frac{1}{2}-\delta}$ decays exponentially. So we have really tested something. $\Box$

1.6. From testing characters to testing dictatorships

Assume we are allowed any bounded number of queries. Iterate the character test. If ${f}$ passes all tests, then ${f}$ is very close to some character. How can one distinguish ${f}$ from a character ? Non dictatorship characters are much more noise sensitive. The probability that ${f}$ changes if one coordinate is changed is twice larger for non-dictatorship than for dictatorships. Do we really need the preliminary character test ? Maybe not.

Here is a first attempt for a dictatorship test. Let ${\epsilon}$ be a noise parameter. Pick ${x}$ and ${y}$ uniformly at random, independantly. Let ${z\sim N_{\epsilon}(k)}$ be an ${\epsilon}$-noice on ${k}$ variables, i.e. each coordinate ${z_i}$ takes independently value ${-1}$ with probability ${\epsilon}$. Accept iff ${f(x)f(y)f(xyz)=1}$. Dictatorships pass this test with probability ${1-\epsilon}$ (completeness is ${\geq 1-\epsilon}$). Unfortunately, the constant function ${1}$ passes the test with probability ${1}$, so soundness is ${1}$ for functions up to distance ${1/2}$ of dictatorships.

Observe that dictatorships satisfy ${f(-x)=-f(x)}$ whereas ${1}$ does not. So here is Håstad’s dictatorship test. Pick ${x}$ and ${y}$ uniformly at random, independantly. Let ${z\sim N_{\epsilon}(k)}$, and ${u}$ uniform in ${\{\pm 1\}}$. Accept iff ${uf(x)f(y)f(uxyz)=1}$. Again, completeness is ${=1-\epsilon}$. For soundness, we shall show today that if ${\mathop{\mathbb P}(\textrm{accept})>1-\epsilon-\alpha}$, then there exists ${i\in[k]}$ such that ${H(f,x_i)<\alpha}$.

Proof: If ${P}$ is the probability of acceptance,

$\displaystyle \begin{array}{rcl} 2P-1&=&\mathop{\mathbb E}_{x,y,z,u}(uf(x)f(y)f(uxyz))\\ &=&\sum_{S,\,T,\,R}\hat{f}_{S}\hat{f}_{T}\hat{f}_{R}\mathop{\mathbb E}_{x,y,z,u}(u\,w_{S}(x)w_{T}(y)w_{R}(uxyz))\\ &=&\sum_{S,\,T,\,R}\hat{f}_{S}\hat{f}_{T}\hat{f}_{R}\mathop{\mathbb E}_{x,y,z,u}(u^{|R|+1}w_{S}(x)w_{R}(x)w_{T}(y)w_{R}(y)w_{R}(z))\\ &=&\sum_{S,\,T,\,R}\hat{f}_{S}\hat{f}_{T}\hat{f}_{R}\mathop{\mathbb E}_{u}(u^{|R|+1})\mathop{\mathbb E}_{x}(w_{S}(x)w_{R}(x))\mathop{\mathbb E}_{y}(w_{T}(y)w_{R}(y))\mathop{\mathbb E}_{z}(w_{R}(z))\\ &=&\sum_{S\subset [k],\,|S|\,\mathrm{odd}}\mathop{\mathbb E}_{z}(w_{S}(z))\hat{f}_{S}^{3}\\ &=&\sum_{S\subset [k],\,|S|\,\mathrm{odd}}(1-2\epsilon)^{|S|}\hat{f}_{S}^{3}. \end{array}$

$\Box$

References

[BGS] Bellare, Mihir; Goldreich, Oded; Sudan, Madhu; Free bits, PCPs, and nonapproximability–towards tight results. SIAM J. Comput. 27 (1998), no. 3, 804–915.

[H] Håstad, Johan; Some optimal inapproximability results. J. ACM 48 (2001), no. 4, 798–859.