## Notes of Robi Krauthgamer’s lecture nr 3

Approximating Sparsest-Cut

1. Problem definition

The input of the Sparsest-Cut problem is a finite graph ${G=(V,E)}$, ${|V|=n}$, together with a set of ${k}$ demand pairs of vertices ${(s_1,t_1), \ldots, (s_k,t_k)}$. The goal is to find a subset ${S\subseteq V}$, dividing ${V}$ into two sets ${S}$ and ${\bar{S}}$, such that as few edges as possible are cut while as many demand pairs as possible are separated. Given such a set ${S}$ we define the sparsity of the cut ${(S,\bar{S})}$ as the ratio between cost and effectiveness:

Definition 1

$\displaystyle \begin{array}{rcl} \mbox{sparsity}(S)& = &\frac{\#\mbox{edges cut by } (S,\bar{S})}{\#\mbox{demand pairs cut by } (S,\bar{S})}\\ &=& \frac{\sum_{uv\in E} | 1_S(u)-1_S(v)|}{\sum_{i=1}^k |1_S(s_i)-1_S(t_i)|}. \end{array}$

Note that ${| 1_S(u)-1_S(v)|=1}$ when exactly one of ${u,v}$ is in ${S}$, and ${0}$ otherwise. We could put weights on demands and edges, but here we stick to unit costs.

Example 1 Assume that we have ${\binom{n}{2}}$ demand pairs, i.e. every pair of vertices is a demand pair. This case is called uniform-demand Sparsest-Cut. (If ${k\neq \binom{n}{2}}$, we speak of general-demand Sparsest-Cut.) We have

$\displaystyle sparsity(S)=\frac{|E(S,\bar{S})|}{|S||\bar{S}|}.$

Here ${E(S,\bar{S})}$ denotes the set of edges of ${G}$ with exactly one endpoint in ${S}$. We may assume ${|S|\leq \frac{n}{2}}$ (otherwise interchange ${S}$ and ${\bar{S}}$), so that ${\frac{n}{2}\leq |\bar{S}| \leq n}$. Thus after scaling the problem is equivalent, up to a factor of 2, to minimizing ${\frac{|E(S,\bar{S})|}{|S|}}$. The quantity ${\min_S \frac{|E(S,\bar{S})|}{|S|}}$ is called the edge expansion of ${G}$.
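For concreteness, the uniform-demand sparsity of a single cut can be computed directly from the definition. A minimal sketch (the small example graph is illustrative, not from the lecture):

```python
def sparsity(edges, n, S):
    """Uniform-demand sparsity |E(S, S-bar)| / (|S| * |S-bar|) of the cut (S, V \\ S)."""
    S = set(S)
    cut_edges = sum(1 for u, v in edges if (u in S) != (v in S))
    return cut_edges / (len(S) * (n - len(S)))

# 4-cycle 0-1-2-3-0; cutting off {0, 1} cuts 2 of the 4 edges
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(sparsity(edges, 4, {0, 1}))  # 2 / (2 * 2) = 0.5
```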

2. Computational problem

Given graph ${G}$, pairs ${(s_1,t_1), \ldots, (s_k,t_k)}$, compute

$\displaystyle sparsity(G)=\min_S sparsity(S).$

This problem is NP-hard, but there is a polynomial-time ${O(\log k)}$-approximation based on a linear programming relaxation. Today we show an ${O(\log n)}$-approximation. Let us first write an integer program.

2.1. Integer program

The idea is to map every vertex to 0 or 1. Then edges between a 0-vertex and a 1-vertex have cost 1, and all other edges have cost 0.

Definition 2 (Cut metric ${(V,d)}$) A (semi-)metric ${d}$ is a cut metric if there exists a subset ${S\subseteq V}$ such that ${d(u,v)=|1_S(u)-1_S(v)|}$ for all ${u,v\in V}$.

Now the integer program is

$\displaystyle IP=\min\frac{\sum_{uv\in E} d(u,v)}{\sum_i d(s_i,t_i)} \mbox{ such that } d \mbox{ is a cut metric}.$

This is exactly Sparsest-Cut.
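Since the integer program ranges over all cut metrics, i.e. over all subsets ${S}$, the exact optimum can be found by enumeration on tiny instances. A sketch for illustration only (exponential time, of course, since the problem is NP-hard):

```python
from itertools import combinations

def sparsest_cut_exact(vertices, edges, demands):
    """Exact sparsest cut by enumerating all subsets S (exponential time)."""
    best_val, best_S = float("inf"), None
    vs = list(vertices)
    for r in range(1, len(vs)):
        for S in combinations(vs, r):
            S = set(S)
            cut = sum(1 for u, v in edges if (u in S) != (v in S))
            sep = sum(1 for s, t in demands if (s in S) != (t in S))
            if sep > 0 and cut / sep < best_val:
                best_val, best_S = cut / sep, S
    return best_val, best_S

# path 0-1-2 with one demand pair (0, 2): any single cut edge separates it
print(sparsest_cut_exact(range(3), [(0, 1), (1, 2)], [(0, 2)]))
```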

2.2. LP-relaxation

We relax the requirement that ${d}$ be a cut metric by allowing it to be any metric:

$\displaystyle \min \frac{\sum_{uv\in E} d(u,v)}{\sum_i d(s_i,t_i)}$

such that ${d(x,z)\leq d(x,y)+d(y,z) \;\forall x,y,z\in V}$ and ${d(x,y)\geq 0 \;\forall x,y\in V}$. The objective is invariant under scaling the metric ${d}$, so we may assume that ${\sum_i d(s_i,t_i)=1}$, and we get the linear programming relaxation

$\displaystyle \begin{array}{rcl} LP=\min \sum_{uv\in E} d(u,v)\mbox{ s.t. } && \sum_i d(s_i,t_i)=1\\ &&d(x,z)\leq d(x,y)+d(y,z)\; \forall x,y,z\in V\\ &&d(x,y)\geq 0 \;\forall x,y\in V. \end{array}$
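This LP has one variable per pair of vertices and one triangle inequality per triple, so it can be solved directly with an off-the-shelf solver. A sketch using `scipy.optimize.linprog` (the function name and the small test instance are my own, not from the lecture):

```python
from itertools import combinations, permutations
from scipy.optimize import linprog

def sparsest_cut_lp(n, edges, demands):
    """LP relaxation: min sum_{uv in E} d(u,v) over metrics d with sum_i d(s_i,t_i) = 1."""
    pairs = list(combinations(range(n), 2))
    idx = {p: i for i, p in enumerate(pairs)}
    key = lambda u, v: idx[(min(u, v), max(u, v))]

    # objective: sum of d(u,v) over graph edges
    c = [0.0] * len(pairs)
    for u, v in edges:
        c[key(u, v)] += 1.0

    # triangle inequalities d(x,z) - d(x,y) - d(y,z) <= 0 for all triples
    A_ub, b_ub = [], []
    for x, y, z in permutations(range(n), 3):
        if x < z:  # d is symmetric, avoid duplicate constraints
            row = [0.0] * len(pairs)
            row[key(x, z)] += 1.0
            row[key(x, y)] -= 1.0
            row[key(y, z)] -= 1.0
            A_ub.append(row)
            b_ub.append(0.0)

    # normalization: sum of d over demand pairs equals 1
    A_eq = [[0.0] * len(pairs)]
    for s, t in demands:
        A_eq[0][key(s, t)] += 1.0

    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0], bounds=(0, None))
    return res.fun

# path 0-1-2 with demand (0, 2): the LP value coincides with the true sparsity here
print(sparsest_cut_lp(3, [(0, 1), (1, 2)], [(0, 2)]))
```

On this instance the triangle inequality ${d(0,2)\leq d(0,1)+d(1,2)}$ together with ${d(0,2)=1}$ forces the objective to be at least 1, so LP = IP; in general LP can be strictly smaller.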

Since the relaxation allows solutions which are not cut metrics, we have ${LP \leq IP = sparsity(G)}$. The idea now is to solve the LP relaxation in polynomial time, and then round the solution to obtain some cut ${S}$ for the original problem. We allow the rounding to lose a factor of at most ${O(\log n)}$.

Theorem 3 (Linial-London-Rabinovich, Aumann-Rabani 1995) This LP can be rounded with factor ${O(\log k)}$ thus giving a ${O(\log k)}$-approximation for Sparsest-Cut.

The idea here is to look at ${V}$ as a metric space and embed it into ${l_1^m}$ for some finite dimension ${m}$; every ${l_1}$-metric is a nonnegative combination of cut metrics. We use Bourgain’s theorem, noting that ${l_2}$ embeds isometrically into ${l_1}$, so the theorem also gives embeddings into ${l_1}$.

Theorem 4 (Bourgain 1985) Every metric space on ${n}$ points can be embedded into ${l_2}$ with distortion ${O(\log n)}$.

Proof of Theorem 3: Solve the LP. Then apply Bourgain’s embedding, and get a non-contractive map ${f:V\rightarrow l_1^m}$. Now

$\displaystyle \sum_i ||f(s_i)-f(t_i)||_1 \geq 1$

We can scale ${f}$ by a constant ${\leq 1}$ to make this an equality, thus

$\displaystyle \sum_i ||f(s_i)-f(t_i)||_1 = 1.$

Also

$\displaystyle \sum_{uv \in E} ||f(u)-f(v)||_1 \leq O(\log n) \cdot LP.$

Lemma 5 Every finite ${l_1}$-metric ${\tilde{d}}$ can be represented as a positive combination of cut metrics: there exist constants ${\alpha_j > 0}$ and sets ${A_j\subseteq V}$ such that for all ${u,v\in V}$

$\displaystyle \tilde{d}(u,v)=\sum_j\alpha_j |1_{A_j}(u)-1_{A_j}(v)|.$

Moreover, given the ${l_1}$-embedding ${f}$, this representation can be computed in polynomial time.

Proof: We work coordinate by coordinate. We are given ${f:V\rightarrow l_1^m}$. We replace each coordinate of ${f}$, denoted ${f_t:V\rightarrow \mathbb{R}}$, by a map ${g_t=g:V\rightarrow \mathbb{R}^{n-1}}$ as follows. Suppose ${V=\{v_1,v_2,\ldots, v_n\}}$ with ${f_t(v_1)\leq \ldots \leq f_t(v_n)}$. Define

$\displaystyle \begin{array}{rcl} g(v_1)&=&(0,\ldots,0)\\ g(v_2)&=&(f_t(v_2)-f_t(v_1),0,\ldots,0)\\ g(v_3)&=&(f_t(v_2)-f_t(v_1),f_t(v_3)-f_t(v_2),0,\ldots,0)\\ &\vdots&\\ g(v_n)&=&(f_t(v_2)-f_t(v_1),f_t(v_3)-f_t(v_2),\ldots, f_t(v_n)-f_t(v_{n-1})) \end{array}$

After replacing each coordinate of ${f}$ as above, we have a new map ${\tilde{f}:V\rightarrow l_1^{m(n-1)}}$ s.t.

$\displaystyle ||\tilde{f}(u)-\tilde{f}(v)||_1=||f(u)-f(v)||_1$

for all ${u,v\in V}$. Every coordinate of ${\tilde{f}}$ takes at most 2 distinct values, namely ${0}$ and some ${f_t(v_j)-f_t(v_{j-1})}$ (coordinates taking only 1 value can be discarded). That is, for each coordinate ${t}$,

$\displaystyle \#\{\tilde{f}_t(v_1),\ldots, \tilde{f}_t(v_n)\}\leq 2.$

This means that coordinate ${t}$ is just some ${\alpha_t>0}$ times the cut metric of a cut ${(A_t,\bar{A}_t)}$:

$\displaystyle |\tilde{f}_t(u)-\tilde{f}_t(v)|=\alpha_t |1_{A_t}(u)-1_{A_t}(v)|$

for all ${u,v\in V}$. $\Box$

We use the lemma to turn ${\tilde{f}}$ into an equivalent representation ${(\alpha_j,A_j)}$. Now one of these ${A_j}$’s gives us a “good” cut.
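The coordinate-by-coordinate construction in the proof of Lemma 5 is easy to make concrete: a single ${l_1}$-coordinate is a line metric, and the differences between consecutive points are the weights of threshold cuts. A sketch (names and the toy points are my own):

```python
def line_to_cuts(f):
    """Decompose a one-coordinate (line) metric into weighted cut metrics, as in Lemma 5.

    f: dict vertex -> real coordinate. Returns a list of (alpha_j, A_j) such that
    |f[u] - f[v]| = sum_j alpha_j * |1_{A_j}(u) - 1_{A_j}(v)| for all u, v."""
    order = sorted(f, key=f.get)  # v_1, ..., v_n with f(v_1) <= ... <= f(v_n)
    cuts = []
    for j in range(1, len(order)):
        alpha = f[order[j]] - f[order[j - 1]]
        if alpha > 0:  # discard coordinates taking only one value
            cuts.append((alpha, set(order[j:])))  # A_j = vertices above the threshold
    return cuts

f = {"a": 0.0, "b": 1.0, "c": 3.0}
cuts = line_to_cuts(f)
d = lambda u, v: sum(a * abs((u in A) - (v in A)) for a, A in cuts)
print(d("a", "c"))  # 3.0, matching |f[a] - f[c]|
```

For a full ${l_1^m}$ embedding one simply applies this to each of the ${m}$ coordinates and concatenates the resulting ${(\alpha_j, A_j)}$ lists.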

Claim 1 There exists an index ${j^*}$ such that ${A_{j^*}}$ is a good cut, meaning that

$\displaystyle LHS:= \frac{\sum_{uv\in E}\tilde{d}(u,v)}{\sum_i\tilde{d}(s_i,t_i)}\geq \frac{\sum_{uv\in E}|1_{A_{j^*}}(u)-1_{A_{j^*}}(v)|}{\sum_i|1_{A_{j^*}}(s_i)-1_{A_{j^*}}(t_i)|}=: RHS$

Here ${LHS}$ is our original objective, and on the right we take the best cut among the ${A_j}$’s. The idea of the proof is that if ${c_1,\ldots,c_n,d_1,\ldots,d_n>0}$, then ${\frac{c_1+c_2+\cdots +c_n}{d_1+ d_2+ \cdots + d_n} \geq \min_j \frac{c_j}{d_j}}$. This can be proven by contradiction: assume the contrary for every ${j}$, multiply the ${j}$-th inequality by ${d_j}$, and sum up to get the contradiction ${c_1+\cdots + c_n < c_1+\cdots + c_n}$.
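The averaging fact used here (a ratio of sums is at least the minimum of the termwise ratios) is easy to sanity-check numerically; the numbers below are illustrative, not from the lecture:

```python
c = [3.0, 1.0, 4.0]
d = [2.0, 1.0, 1.0]

ratio_of_sums = sum(c) / sum(d)                       # (3 + 1 + 4) / (2 + 1 + 1) = 2.0
min_ratio = min(cj / dj for cj, dj in zip(c, d))      # min(1.5, 1.0, 4.0) = 1.0

# the combined ratio is at least the smallest termwise ratio
assert ratio_of_sums >= min_ratio
print(ratio_of_sums, min_ratio)
```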

Proof of the Claim: Assume the contrary, i.e. ${LHS < RHS}$ for every choice of ${j^*}$. Thus

$\displaystyle \frac{\sum_{uv\in E}\sum_{j}\alpha_j|1_{A_{j}}(u)-1_{A_{j}}(v)|}{\sum_i\sum_j \alpha_j|1_{A_{j}}(s_i)-1_{A_{j}}(t_i)|}=LHS < \frac{\sum_{uv\in E}|1_{A_{j^*}}(u)-1_{A_{j^*}}(v)|}{\sum_i|1_{A_{j^*}}(s_i)-1_{A_{j^*}}(t_i)|}$

We multiply this inequality for every ${j^*}$ by ${\alpha_{j^*}}$ times the denominator of the right-hand side, and sum over ${j^*}$:

$\displaystyle \sum_i\sum_{j^*}\alpha_{j^*}|1_{A_{j^*}}(s_i)-1_{A_{j^*}}(t_i)|\frac{\sum_{uv\in E}\sum_{j}\alpha_j|1_{A_{j}}(u)-1_{A_{j}}(v)|}{\sum_i\sum_j\alpha_j|1_{A_{j}}(s_i)-1_{A_{j}}(t_i)|}< \sum_{uv\in E}\sum_{j^*}\alpha_{j^*}|1_{A_{j^*}}(u)-1_{A_{j^*}}(v)|$

We get

$\displaystyle \sum_{uv\in E}\sum_{j}\alpha_j|1_{A_{j}}(u)-1_{A_{j}}(v)|< \sum_{uv\in E}\sum_{j^*}\alpha_{j^*}|1_{A_{j^*}}(u)-1_{A_{j^*}}(v)|$

which is a contradiction. $\Box$

By the claim, one of the sets ${A_{j^*}}$ is a good cut, and

$\displaystyle sparsity(A_{j^*}) = RHS \leq LHS \leq O(\log n)\cdot LP \leq O(\log n)\cdot sparsity(G).$

$\Box$

The above analysis of the LP relaxation can be improved to get an ${O(\log k)}$-approximation, but with an LP relaxation one cannot do any better than that.

Theorem 6 For all ${n}$ there exists an ${n}$ point metric whose embedding into ${l_1}$ requires distortion ${\geq c\log n}$, where ${c>0}$ is a universal constant.

Proof: Let ${G}$ be an ${n}$-vertex bounded-degree edge-expander, which means that for all ${S}$ with ${|S|\leq \frac{n}{2}}$ we have ${|E(S,\bar{S})|\geq \Omega(|S|)}$. Let ${d_G}$ be the shortest-path metric of ${G}$. We show that embedding ${d_G}$ into ${l_1}$ requires distortion ${\geq c\log n}$. For suitable ${\eta>0}$, ${\eta d_G}$ is feasible for the LP (with uniform demands). Since ${G}$ has bounded degree, ${|E|=O(n)}$ and most pairs of vertices are at distance ${\Omega(\log n)}$. Thus

$\displaystyle LP\leq\frac {\sum_{uv\in E}\eta d_G(u,v)}{\sum_i\eta d_G(s_i,t_i)} \leq \frac{|E|\cdot 1}{\binom{n}{2}\cdot\Omega(\log n)} = O\Big(\frac{1}{n\log n}\Big).$

The integer solution gives

$\displaystyle sparsity(G)=\min_S \frac{|E(S,\bar{S})|}{|S||\bar{S}|} \geq \frac{\Omega(|S|)}{|S|n}=\Omega(\frac{1}{n})$

Now suppose ${d_G}$ embeds into ${l_1}$ with distortion ${D}$. By the rounding argument above, there would exist ${S}$ with ${sparsity(S)\leq D\cdot LP \leq D\cdot O(\frac{1}{n\log n})}$. Combined with ${sparsity(G)=\Omega(\frac{1}{n})}$, this forces ${D\geq \Omega(\log n)}$. $\Box$

With the SDP approach the following results have been obtained: ${O(\sqrt{\log n})}$ for uniform demands (Arora-Rao-Vazirani), and ${\tilde{O}(\sqrt{\log k})}$ for general demands (Arora-Lee-Naor). The integrality gap of the Sparsest-Cut problem is interesting also because of its direct relation to embeddings.

References

S. Arora, J. Lee, and A. Naor. Euclidean distortion and the sparsest cut. In Proc. 37th Symp. on Theory of Computation (STOC), pages 553-562, 2005.

S. Arora, S. Rao, and U. Vazirani. Expander Flows, Geometric Embeddings and Graph Partitioning. ACM Symposium on Theory of Computing, 2004.

J. Bourgain. On Lipschitz embeddings of finite metric spaces in Hilbert space. Israel J. of Math. 1985.

F. T. Leighton and S. Rao. An approximate max-flow min-cut theorem for uniform multicommodity flow problems with applications to approximation algorithms. In Proc. 29th Symp. on Foundations of Computer Science (FOCS), pages 422-431, 1988.

N. Linial, E. London, and Y. Rabinovich. The geometry of graphs and some of its algorithmic applications. Combinatorica, 15(2):215-245, 1995.