## Notes of Claire Mathieu’s lecture nr 3

Clustering and coloring

On Monday, we saw LP algorithms for cut problems (multicut…). Today, we move to SDP-based algorithms.

Let me first complete the analysis of the Multicut algorithm I described on Monday.

Recall Multicut is the search for a cut which separates given pairs ${\{s_i ,t_i\}}$ and minimizes cost ${\sum_e c_e x_e}$.

Method: Use LP solution as a metric on ${G}$. For ${i=1,\ldots,k}$, remove a ball ${B(s_i ,r_i)}$ (for a well-chosen ${r_i <1/2}$).

Analysis: 1. Feasibility follows from ${r_i <1/2}$. 2. Cost. Choose ${r_i}$ to minimize ${c(r)/V(r)}$ where ${c(r)}$ is the cost of edges leaving ${B(r)}$ and

$\displaystyle \begin{array}{rcl} V(r)=V_0 +\sum_{e\,\mathrm{ inside }\,B(s_i ,r)}c_e x_e +\sum_{e=uv\,\mathrm{ crossing }\,B(s_i ,r)}c_e (r-d(s_i ,u)), \end{array}$

where, for a crossing edge ${e=uv}$, ${u}$ denotes the endpoint inside the ball: a crossing edge contributes only the portion of its length that lies inside ${B(s_i ,r)}$.

Key observation: ${V'(r)\geq c(r)}$. In fact, equality holds except at the finitely many radii where the ball boundary crosses a vertex (known as coarea formula/inequality in Riemannian geometry). Thus

$\displaystyle \begin{array}{rcl} \frac{c(r)}{V(r)}\leq\frac{d}{dr}\log V(r). \end{array}$

Apply mean value theorem: there exists ${r_i \in(0,1/2)}$ such that

$\displaystyle \begin{array}{rcl} \frac{c(r_i)}{V(r_i)}\leq \frac{\log(V(1/2))-\log(V(0))}{\frac{1}{2}-0}\leq 2\log(1+\frac{OPT(LP)}{V_0}). \end{array}$

Summing up,

$\displaystyle \begin{array}{rcl} cost(OUTPUT)\leq 2\log(1+\frac{OPT(LP)}{V_0})(kV_0 +OPT(LP)). \end{array}$

Choosing ${V_0 =OPT(LP)/k}$ yields a ${4\log(k+1)}$-approximation.
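The region-growing step can be sketched in code. Below is a minimal sketch, assuming the LP solution is given as edge lengths ${x_e}$ with costs ${c_e}$; the names (`region_growing_radius`, the toy 4-cycle instance) are made up for illustration. It scans radii just below each distance breakpoint, where the ratio ${c(r)/V(r)}$ is locally smallest:

```python
import heapq
import math

def dijkstra(edges, s):
    """Shortest-path distances from s, using the LP values x_e as edge lengths."""
    adj = {}
    for u, v, x, c in edges:
        adj.setdefault(u, []).append((v, x))
        adj.setdefault(v, []).append((u, x))
    dist = {s: 0.0}
    pq = [(0.0, s)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist[u]:
            continue
        for v, x in adj.get(u, []):
            if d + x < dist.get(v, math.inf):
                dist[v] = d + x
                heapq.heappush(pq, (d + x, v))
    return dist

def region_growing_radius(edges, s, V0):
    """Return (r, c(r)/V(r)) minimized over radii just below each breakpoint.
    Between breakpoints c(r) is constant and V(r) increases, so the ratio
    is smallest at the right end of each piece."""
    dist = dijkstra(edges, s)
    eps = 1e-9
    breakpoints = sorted({d for d in dist.values() if 0 < d <= 0.5} | {0.5})
    best_r, best_ratio = None, math.inf
    for bp in breakpoints:
        r = bp - eps
        cut, vol = 0.0, V0
        for u, v, x, c in edges:
            du, dv = sorted((dist.get(u, math.inf), dist.get(v, math.inf)))
            if dv <= r:           # edge entirely inside the ball
                vol += c * x
            elif du <= r:         # edge crosses the boundary of the ball
                vol += c * (r - du)
                cut += c
        if cut / vol < best_ratio:
            best_r, best_ratio = r, cut / vol
    return best_r, best_ratio

# toy instance: 4-cycle s-a-t-b, every edge with LP length 1/4 and cost 1
edges = [("s", "a", 0.25, 1.0), ("a", "t", 0.25, 1.0),
         ("s", "b", 0.25, 1.0), ("b", "t", 0.25, 1.0)]
LP = sum(c * x for _, _, x, c in edges)              # OPT(LP) = 1.0
r, ratio = region_growing_radius(edges, "s", V0=LP)  # k = 1, so V0 = OPT(LP)
```

On this toy instance the returned ratio indeed satisfies the guarantee ${c(r)/V(r)\leq 2\log(1+OPT(LP)/V_0)}$.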

Note: the ${\log}$ factor has since been improved to ${\sqrt{\log}}$. The algorithm and analysis above are due to N. Garg, V. Vazirani and M. Yannakakis [GVY].

1. Correlation clustering

Input: a graph, and for every edge ${e=uv}$, two weights ${w_{e}^{+}}$ (a measure of similarity) and ${w_{e}^{-}}$ (a measure of dissimilarity). Goal: partition the vertices into parts to maximize

$\displaystyle \begin{array}{rcl} \sum_{u,\,v\,\mathrm{in\, same\, part}}w_{uv}^{+}+\sum_{u,\,v\,\mathrm{in\, different\, parts}}w_{uv}^{-}. \end{array}$

No restriction on the number of parts.

Claim: There exists a ${1/2}$-approximation.

Indeed, a random partition into two parts gives

$\displaystyle \begin{array}{rcl} \mathop{\mathbb E}(OUTPUT)=\sum_{e}\frac{w_{e}^{+}+w_{e}^{-}}{2}\geq\frac{1}{2}\sum_{e}\max\{w_{e}^{+},w_{e}^{-}\}\geq\frac{1}{2}OPT. \end{array}$
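This expectation computation can be checked numerically. A small sketch on a made-up toy instance (the weights and vertex names are chosen here for illustration only):

```python
import random

# toy instance: weights (w+, w-) per pair, made up for illustration
edges = {("a", "b"): (3.0, 1.0), ("b", "c"): (0.5, 2.0), ("a", "c"): (1.0, 1.0)}
verts = {"a", "b", "c"}

def value(part):
    """Clustering objective: w+ on within-part pairs, w- on split pairs."""
    return sum(wp if part[u] == part[v] else wm
               for (u, v), (wp, wm) in edges.items())

# exact expectation of a uniform random 2-partition, and the (1/2) OPT bound
expected = sum((wp + wm) / 2 for wp, wm in edges.values())
bound = 0.5 * sum(max(wp, wm) for wp, wm in edges.values())

# Monte Carlo confirmation of the expectation
random.seed(0)
N = 20000
est = sum(value({v: random.randrange(2) for v in verts}) for _ in range(N)) / N
```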

This is useless as a guide, since it does not use the input at all. How to improve? Iterate? Not a bad idea. But we do something different: we use an SDP.

Theorem 1 There is an SDP-based ${3/4}$-approximation.

Proof:

SDP formulation: As variables, take vectors ${v_i}$ indexed by vertices, with coordinates indexed by parts. Define

$\displaystyle \begin{array}{rcl} cost=\sum_{ij}w_{ij}^{+}v_i \cdot v_j +w_{ij}^{-}(1-v_i \cdot v_j) \end{array}$

and add constraints ${|v_i|^2 =1}$ and ${v_i \cdot v_j \geq 0}$. Integer solutions (each ${v_i}$ the standard basis vector of the part containing ${i}$, so that ${v_i \cdot v_j \in\{0,1\}}$) exactly correspond to clusterings, with value equal to the clustering objective. So the SDP is a relaxation of our clustering problem.

Rounding: As for MAX CUT, let us first try choosing a random hyperplane. This cannot yield better than a ${1/2}$-approximation. Indeed, if the input wants to be split into ${n}$ parts, the best the SDP can do is to provide vectors which are nearly orthonormal, and a single hyperplane merely splits this cloud into two halves.

The trick is to take two random hyperplanes through the origin. They partition the sphere into 4 regions. Then

$\displaystyle \begin{array}{rcl} \mathop{\mathbb E}(OUTPUT)=\sum_{ij}w_{ij}^{+}\mathop{\mathbb P}(i,\, j\textrm{ in same part}) +w_{ij}^{-}\mathop{\mathbb P}(i,\, j\textrm{ in different parts}). \end{array}$

We compute

$\displaystyle \begin{array}{rcl} \mathop{\mathbb P}(i,\, j\textrm{ in same part})=\mathop{\mathbb P}(\textrm{neither hyperplane separates }i\textrm{ and }j) =(1-\frac{\theta_{ij}}{\pi})^2 , \end{array}$

where ${\theta_{ij}}$ is the angle between ${v_i}$ and ${v_j}$. Therefore

$\displaystyle \begin{array}{rcl} \mathop{\mathbb E}(OUTPUT)=\sum_{ij}w_{ij}^{+}(1-\frac{\theta_{ij}}{\pi})^2 +w_{ij}^{-}(1-(1-\frac{\theta_{ij}}{\pi})^2). \end{array}$

Whereas the SDP value is

$\displaystyle \begin{array}{rcl} Cost(SDP)=\sum_{ij}w_{ij}^{+}\cos(\theta_{ij})+w_{ij}^{-}(1-\cos(\theta_{ij})). \end{array}$

Fact: For all ${\theta\in[0,\pi/2]}$,

$\displaystyle \begin{array}{rcl} (1-\theta/\pi)^2 \geq \frac{3}{4}\cos\theta,\quad 1-(1-\theta/\pi)^2 \geq \frac{3}{4}(1-\cos\theta). \end{array}$

This yields the approximation factor of ${3/4}$. $\Box$
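The Fact is elementary calculus; here is a numerical check over a fine grid, in the cross-multiplied form stated above (the grid size and tolerance are arbitrary choices):

```python
import math

def fact_holds(theta):
    """Check both inequalities of the Fact at a given angle."""
    p_same = (1 - theta / math.pi) ** 2   # P(i, j in same part), two hyperplanes
    return (p_same >= 0.75 * math.cos(theta) - 1e-12 and
            1 - p_same >= 0.75 * (1 - math.cos(theta)) - 1e-12)

grid = [(math.pi / 2) * k / 10000 for k in range(10001)]   # theta in [0, pi/2]
```

The second inequality is tight at ${\theta=\pi/2}$, which is why the constant ${3/4}$ cannot be improved by this rounding.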

Remark: if one used ${c}$ hyperplanes, the exponent ${2}$ would be replaced by ${c}$. It turns out that ${c=2}$ gives the best numerical value.

2. Coloring a ${3}$-colorable graph

Input: a ${3}$-colorable graph. Goal: color it, using as few colors as possible.

Claim: if ${2}$ colors suffice, it is trivial to find a ${2}$-coloring (breadth-first search ${2}$-colors each connected component of a bipartite graph).

Claim: every graph of maximum degree ${\leq\Delta}$ can be ${(\Delta+1)}$-colored. Indeed, proceed greedily: color the vertices one at a time; when a vertex is to be colored, at most ${\Delta}$ colors are blocked by its already-colored neighbors, so one of the ${\Delta+1}$ colors remains free.
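The greedy argument translates directly into code (a minimal sketch; the name `greedy_coloring` and the 5-cycle test graph are chosen here for illustration):

```python
def greedy_coloring(adj):
    """Color vertices one at a time. When vertex u is colored, at most
    deg(u) <= Delta colors are blocked by already-colored neighbors,
    so some color in {0, ..., Delta} is free."""
    color = {}
    for u in adj:
        used = {color[v] for v in adj[u] if v in color}
        c = 0
        while c in used:
            c += 1
        color[u] = c
    return color

# a 5-cycle has maximum degree 2, so at most 3 colors (0, 1, 2) are used
cycle = {i: [(i - 1) % 5, (i + 1) % 5] for i in range(5)}
col = greedy_coloring(cycle)
```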

Claim: There is an algorithm to color a graph with ${O(\sqrt{n})}$ colors. Indeed,

1. Reduce the maximum degree to ${\sqrt{n}}$ by coloring some vertices. While there exists a vertex ${v}$ with degree ${>\sqrt{n}}$, take ${v}$ and its neighbors and find a ${3}$-coloring of them (easy: since the graph is ${3}$-colorable, the neighbors of ${v}$ alone form a ${2}$-colorable graph). Remove these vertices; the ${3}$ colors used are fresh and never reused. Each step removes more than ${\sqrt{n}}$ vertices, so the process stops after fewer than ${\sqrt{n}}$ steps, having used at most ${3\sqrt{n}}$ colors.

2. Use previous algorithm for remaining graph.
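The two steps above can be sketched as follows. This is a toy sketch assuming the input is ${3}$-colorable (so that every neighborhood is bipartite); the function names and the ${K_{3,3,3}}$ test instance are made up for illustration:

```python
import math
from collections import deque

def two_color(vertices, adj):
    """BFS bipartition of the subgraph induced by `vertices` (assumed
    bipartite, as any neighborhood in a 3-colorable graph is)."""
    color = {}
    for s in vertices:
        if s in color:
            continue
        color[s] = 0
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v in vertices and v not in color:
                    color[v] = 1 - color[u]
                    queue.append(v)
    return color

def sqrt_n_coloring(adj):
    n = len(adj)
    d = math.isqrt(n)
    alive = set(adj)
    coloring, fresh = {}, 0
    # Step 1: while some vertex has alive-degree > sqrt(n),
    # 3-color it together with its neighborhood, using fresh colors.
    while True:
        high = [u for u in alive
                if sum(1 for v in adj[u] if v in alive) > d]
        if not high:
            break
        v = high[0]
        nbrs = {u for u in adj[v] if u in alive}
        for u, c in two_color(nbrs, adj).items():
            coloring[u] = fresh + c
        coloring[v] = fresh + 2
        fresh += 3
        alive -= nbrs | {v}
    # Step 2: greedy coloring of the remaining graph (max degree <= sqrt(n)).
    for u in alive:
        used = {coloring.get(v) for v in adj[u]}
        c = fresh
        while c in used:
            c += 1
        coloring[u] = c
    return coloring

# toy instance: complete tripartite graph K_{3,3,3} (3-colorable, n = 9)
adj = {u: [v for v in range(9) if u // 3 != v // 3] for u in range(9)}
col = sqrt_n_coloring(adj)
```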

To do better than ${\sqrt{n}}$, we use an SDP.

Theorem 2 There is an SDP-based algorithm that colors every ${3}$-colorable graph with ${O(n^{0.387\ldots})}$ colors.

Proof: Start with ${2}$-colorability. Want to map vertices to ${{\mathbb R}}$. Asking that ${v_i \cdot v_j =-1}$ if ${ij\in E}$ and ${|v_i|^2 =1}$ is feasible in ${{\mathbb R}}$ iff graph is ${2}$-colorable.

For ${3}$-colorability, want to map vertices to ${{\mathbb R}^2}$. Look for a set of SDP constraints that is feasible (in ${{\mathbb R}^2}$) iff the graph is ${3}$-colorable. Answer: require that ${v_i \cdot v_j =-1/2}$ if ${ij\in E}$ and ${|v_i|^2 =1}$.

So here is our SDP: Minimize objective function ${0}$ (i.e. merely solve the feasibility problem) under constraints ${v_i \cdot v_j =-1/2}$ if ${ij\in E}$ and ${|v_i|^2 =1}$.

Rounding: Pick independently ${t}$ random hyperplanes through the origin. They cut the sphere into (at most) ${2^t}$ regions. For each region ${R}$, let ${S_R=\{i\,;\,v_i \in R,\,\forall \,j\sim i,\;v_j \notin R\}}$. Use one fresh color for all of ${S_R}$; do this for all ${R}$ (this uses at most ${2^t}$ colors per iteration). Remove all the ${S_R}$'s. Repeat.

Analysis: If ${ij\in E}$, then ${v_i \cdot v_j =-1/2}$, i.e. ${\theta_{ij}=2\pi/3}$, so

$\displaystyle \begin{array}{rcl} \mathop{\mathbb P}(i,\, j\textrm{ in same region})=(1-\frac{\theta_{ij}}{\pi})^{t}=3^{-t}. \end{array}$

Fix ${i}$,

$\displaystyle \begin{array}{rcl} \mathop{\mathbb P}(i\textrm{ is colored in one iteration}) &=&\mathop{\mathbb P}(\forall j\sim i,\, j\textrm{ is in a different region from }i)\\ &\geq& 1-3^{-t}degree(i)\geq \frac{1}{2} \end{array}$

if ${t=\log_3 (2n)}$. So with this choice of ${t}$, in expectation half of the vertices get colored at each iteration, and the total number of colors needed is ${\leq O(\log n)\cdot 2^{\log_3 (2n)}}$. Since ${2^{\log_3 (2n)}=(2n)^{\log_3 2}}$ and ${\log_3 2=0.6309\ldots}$, this gives ${O(n^{0.63\ldots}\log n)}$ colors, which is worse than ${\sqrt{n}}$.
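The separation probability can be checked by simulation: restricted to the plane spanned by ${v_i}$ and ${v_j}$ (at angle ${2\pi/3}$, since ${v_i \cdot v_j =-1/2}$), a random hyperplane through the origin behaves like a uniformly random line. A toy Monte Carlo check (sample sizes and tolerances are arbitrary choices):

```python
import math
import random

random.seed(1)
theta = 2 * math.pi / 3                      # angle with cos(theta) = -1/2
v1 = (1.0, 0.0)
v2 = (math.cos(theta), math.sin(theta))

def same_region(t):
    """Do t random hyperplanes all leave v1 and v2 on the same side?"""
    for _ in range(t):
        g = (random.gauss(0, 1), random.gauss(0, 1))  # random normal vector
        if (g[0] * v1[0] + g[1] * v1[1] > 0) != (g[0] * v2[0] + g[1] * v2[1] > 0):
            return False
    return True

t, N = 2, 200000
freq = sum(same_region(t) for _ in range(N)) / N      # should be near 3^(-t)
```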

To improve this, combine with the previous idea.

1. Reduce the maximum degree to ${d=n^{1/\log_3 6}=n^{0.613\ldots}}$ by repeatedly taking a vertex of degree ${>d}$ and its neighbors and ${3}$-coloring them.

2. On remaining graph, solve SDP relaxation and proceed to rounding.

The analysis is modified as follows.

$\displaystyle \begin{array}{rcl} \mathop{\mathbb E}(\textrm{number of colors used}) =3.\frac{n}{d}+(\log n).2^t \end{array}$

where ${t}$ is such that ${1-3^{-t}d=1/2}$, i.e. ${t=\log_3 (2d)}$. Then ${n/d}$ and ${2^t =(2d)^{\log_3 2}}$ are both ${O(n^{0.387\ldots})}$, giving ${O(n^{0.387\ldots}\log n)}$ colors. $\Box$
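The balancing of the two terms is a two-line computation: ignoring logs and constants, ${n/d=d^{\log_3 2}}$ forces ${d=n^{1/(1+\log_3 2)}=n^{1/\log_3 6}}$. A quick numerical check:

```python
import math

# balance n/d (step 1) against 2^t = (2d)^(log_3 2) (step 2):
# with d = n^a, this needs 1 - a = a * log_3(2), i.e. a = 1 / (1 + log_3 2)
a = 1 / (1 + math.log(2, 3))
colors_exponent = 1 - a          # both terms then scale as n^0.387...
```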

3. MAX SAT

Since there is time left, let us study one more example.

Input: a SAT formula in CNF form. Goal: find an assignment of variables satisfying the largest possible number of clauses.

Trivial algorithm: random assignment. A clause with ${k}$ literals is falsified by exactly one of the ${2^k}$ assignments of its variables, so it is satisfied with probability ${1-2^{-k}\geq 1/2}$, and

$\displaystyle \begin{array}{rcl} \mathop{\mathbb E}(\textrm{number of satisfied clauses})\geq \frac{1}{2}\textrm{number of clauses} \geq \frac{1}{2}OPT. \end{array}$
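The per-clause probability ${1-2^{-k}}$ can be confirmed by brute force on a tiny formula. A small sketch; the clause encoding (signed integers) and the toy formula are made up for illustration:

```python
from itertools import product

# clauses as tuples of signed literals: +j means x_j, -j means (not x_j)
clauses = [(1,), (-1, 2), (1, -2, 3)]   # a made-up toy formula
nvars = 3

def satisfied(assign, clause):
    return any(assign[abs(lit)] == (lit > 0) for lit in clause)

# exact expected number of satisfied clauses under a uniform random
# assignment, by brute force over all 2^n assignments
total = 0
for bits in product([False, True], repeat=nvars):
    assign = {j + 1: bits[j] for j in range(nvars)}
    total += sum(satisfied(assign, c) for c in clauses)
expectation = total / 2 ** nvars
```

By linearity of expectation, the result equals ${\sum_{C}(1-2^{-|C|})}$.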

LP-based algorithm. Take one variable ${y_j \in[0,1]}$ per Boolean variable and one variable ${z_C \in[0,1]}$ per clause, with the constraint that ${z_C}$ is at most the sum of the ${y_j}$ (resp. ${1-y_j}$) over the variables occurring in ${C}$ positively (resp. negatively); maximize ${\sum_C z_C}$. Solve the LP, then set each variable to true independently with probability ${y_j}$.

The analysis shows that this is a ${(1-e^{-1})}$-approximation.

How can we do better? Here is a ${3/4}$-approximation:

1. Try the trivial random assignment.

2. Try the LP-based algorithm.

3. Output the better of the two.

This works because the two algorithms are poor on disjoint sets of inputs. The trivial algorithm is poor on short clauses (e.g. ${1}$-variable clauses, satisfied only with probability ${1/2}$), exactly where the LP-based algorithm does well; conversely, the LP rounding is weakest on long clauses, where the random assignment is nearly perfect. Per clause of length ${k}$, the average of the two guarantees is ${\frac{1}{2}[(1-2^{-k})+(1-(1-1/k)^k)]\geq 3/4}$.
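The per-clause averaging argument can be checked numerically, using the standard guarantees ${1-2^{-k}}$ (random assignment) and ${1-(1-1/k)^k}$ (LP rounding) for a clause with ${k}$ literals:

```python
def combined(k):
    """Average of the two per-clause guarantees for a clause of length k."""
    rand = 1 - 2.0 ** (-k)           # random assignment
    lp = 1 - (1 - 1.0 / k) ** k      # LP randomized rounding
    return (rand + lp) / 2
```

The average is exactly ${3/4}$ at ${k=1}$ and ${k=2}$, and strictly larger for longer clauses.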

References

[GVY] Garg, Naveen; Vazirani, Vijay V.; Yannakakis, Mihalis; Approximate max-flow min-(multi)cut theorems and their applications. SIAM J. Comput. 25 (1996), no. 2, 235–251.