Notes on Claire Mathieu’s lecture no. 3

 

Clustering and coloring

On Monday, we saw LP-based algorithms for cut problems (multicut…). Today, we move to SDP-based algorithms.

Let me first complete the analysis of the Multicut algorithm I described on Monday.

Recall that Multicut asks for a cut which separates each given pair {\{s_i ,t_i\}} while minimizing the cost {\sum_e c_e x_e}.

Method: Use the LP solution as a metric on {G}. For {i=1,\ldots,k}, remove a ball {B(s_i ,r_i)} (for a well-chosen {r_i <1/2}).

Analysis: 1. Feasibility follows from {r_i <1/2}. 2. Cost. Choose {r_i} to minimize {c(r)/V(r)}, where {c(r)} is the cost of the edges leaving {B(s_i ,r)} and

\displaystyle  \begin{array}{rcl}  V(r)=V_0 +\sum_{e\subseteq B(s_i ,r)}c_e x_e +\sum_{e=uv\,\mathrm{ crossing }\,B(s_i ,r),\;u\in B(s_i ,r)}c_e (r-d(s_i ,u)), \end{array}

so a crossing edge contributes only the portion of its LP length lying inside the ball.

Key observation: {V'(r)\geq c(r)}. In fact, equality holds except at the radii where the ball boundary crosses a vertex (compare the coarea formula/inequality in Riemannian geometry). Thus

\displaystyle  \begin{array}{rcl}  \frac{c(r)}{V(r)}\leq\frac{d}{dr}\log V(r). \end{array}

Apply mean value theorem: there exists {r_i \in(0,1/2)} such that

\displaystyle  \begin{array}{rcl}  \frac{c(r_i)}{V(r_i)}\leq \frac{\log(V(1/2))-\log(V(0))}{\frac{1}{2}-0}\leq 2\log(1+\frac{OPT(LP)}{V_0}). \end{array}

Summing up,

\displaystyle  \begin{array}{rcl}  cost(OUTPUT)\leq 2\log(1+\frac{OPT(LP)}{V_0})(kV_0 +OPT(LP)). \end{array}

Choosing {V_0 =OPT(LP)/k} yields a {4\log(k+1)}-approximation.
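To make the choice of {r_i} concrete, here is a small sketch of the region-growing step. All the encodings and names here are illustrative scaffolding, not part of the lecture: it scans the breakpoint radii below {1/2} (one per vertex entering the ball) and returns the one minimizing {c(r)/V(r)}, with a crossing edge contributing only the portion of its LP length inside the ball.

```python
import heapq

def dijkstra(adj, src):
    # adj: {u: [(v, length), ...]}; shortest-path distances from src
    dist = {src: 0.0}
    pq = [(0.0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float("inf")):
            continue
        for v, l in adj[u]:
            if d + l < dist.get(v, float("inf")):
                dist[v] = d + l
                heapq.heappush(pq, (d + l, v))
    return dist

def grow_region(edges, src, V0):
    # edges: {(u, v): (lp_length, cost)}.  An edge's position is
    # approximated by the shortest-path distance of its nearer endpoint,
    # in the metric defined by the LP lengths.
    adj = {}
    for (u, v), (l, _) in edges.items():
        adj.setdefault(u, []).append((v, l))
        adj.setdefault(v, []).append((u, l))
    dist = dijkstra(adj, src)
    best_r, best_ratio = None, float("inf")
    for r in sorted(d for d in set(dist.values()) if d < 0.5):
        cut = vol = 0.0
        for (u, v), (l, ce) in edges.items():
            du = min(dist.get(u, float("inf")), dist.get(v, float("inf")))
            dv = max(dist.get(u, float("inf")), dist.get(v, float("inf")))
            if dv <= r:              # edge lies inside the ball
                vol += ce * l
            elif du <= r:            # edge crosses the boundary
                cut += ce
                vol += ce * (r - du)  # only the portion inside counts
        ratio = cut / (V0 + vol)
        if ratio < best_ratio:
            best_r, best_ratio = r, ratio
    return best_r, best_ratio
```

On the toy path {s - a - t} with LP lengths {0.3} and unit costs, the best radius is {0.3}.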

Note: this result is due to N. Garg, V. Vazirani and M. Yannakakis [GVY]; the {\log} factor has since been improved to {\sqrt{\log}}.

 

1. Correlation clustering

 

Input: a graph, and for every edge {e=uv}, two weights {w_{e}^{+}} (a measure of similarity) and {w_{e}^{-}} (a measure of dissimilarity). Goal: partition the vertices into parts to maximize

\displaystyle  \begin{array}{rcl}  \sum_{u,\,v\,\mathrm{in\, same\, part}}w_{uv}^{+}+\sum_{u,\,v\,\mathrm{in\, different\, parts}}w_{uv}^{-}. \end{array}

No restriction on the number of parts.

Claim: There exists a {1/2}-approximation.

Indeed, a random partition into two parts gives

\displaystyle  \begin{array}{rcl}  \mathop{\mathbb E}(OUTPUT)=\sum_{e}\frac{w_{e}^{+}+w_{e}^{-}}{2}\geq\frac{1}{2}\sum_{e}\max\{w_{e}^{+},w_{e}^{-}\}\geq\frac{1}{2}OPT. \end{array}

This is useless, since it does not use the input. How can we improve? Iterate? Not a bad idea, but we do something different: we use an SDP.
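As a sanity check on the claim, one can simulate the random two-part baseline on a toy instance (the weights below are hypothetical):

```python
import random

def random_partition_value(edges):
    # edges: {(u, v): (w_plus, w_minus)}; throw each vertex into one of
    # two parts uniformly at random and total the agreements.
    side = {}
    total = 0.0
    for (u, v), (wp, wm) in edges.items():
        for x in (u, v):
            if x not in side:
                side[x] = random.randrange(2)
        total += wp if side[u] == side[v] else wm
    return total

# Expectation is the sum over edges of (w+ + w-)/2, here (3+1)/2 + (0+2)/2 = 3.
random.seed(0)
edges = {(0, 1): (3.0, 1.0), (1, 2): (0.0, 2.0)}
avg = sum(random_partition_value(edges) for _ in range(20000)) / 20000
```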

Theorem 1 There is an SDP-based {3/4}-approximation.

 

Proof:

SDP formulation: As variables, take vectors {v_i} indexed by vertices, with coordinates indexed by parts. Define

\displaystyle  \begin{array}{rcl}  cost=\sum_{ij}\left(w_{ij}^{+}\,v_i \cdot v_j +w_{ij}^{-}(1-v_i \cdot v_j)\right) \end{array}

and add the constraints {|v_i|^2 =1} and {v_i \cdot v_j \geq 0}. In an integer solution, each {v_i} is the standard basis vector of the part containing {i}, so both constraints hold, and integer solutions exactly correspond to clusterings with the given objective value. So the SDP is a relaxation of our clustering problem.

Rounding: As for MAX CUT, let us first try choosing a single random hyperplane. This cannot yield better than a {1/2}-approximation: if the input wants to be split into {n} parts, the best the SDP can do is to provide nearly orthonormal vectors, and one hyperplane merely splits the cloud into two halves.

The trick is to take two random hyperplanes through the origin. They partition the sphere into 4 regions. Then

\displaystyle  \begin{array}{rcl}  \mathop{\mathbb E}(OUTPUT)=\sum_{ij}w_{ij}^{+}\mathop{\mathbb P}(i,\, j\textrm{ in same part}) +w_{ij}^{-}\mathop{\mathbb P}(i,\, j\textrm{ in different parts}). \end{array}

We compute

\displaystyle  \begin{array}{rcl}  \mathop{\mathbb P}(i,\, j\textrm{ in same part})=\mathop{\mathbb P}(\textrm{neither hyperplane separates }i\textrm{ and }j) =(1-\frac{\theta_{ij}}{\pi})^2 , \end{array}

where {\theta_{ij}} is the angle between {v_i} and {v_j}. Therefore

\displaystyle  \begin{array}{rcl}  \mathop{\mathbb E}(OUTPUT)=\sum_{ij}w_{ij}^{+}(1-\frac{\theta_{ij}}{\pi})^2 +w_{ij}^{-}(1-(1-\frac{\theta_{ij}}{\pi})^2). \end{array}

Whereas the SDP value is

\displaystyle  \begin{array}{rcl}  Cost(SDP)=\sum_{ij}w_{ij}^{+}\cos(\theta_{ij})+w_{ij}^{-}(1-\cos(\theta_{ij})). \end{array}

Fact: For all {\theta\in[0,\pi/2]} (the relevant range, since inner products are nonnegative),

\displaystyle  \begin{array}{rcl}  (1-\theta/\pi)^2\geq \frac{3}{4}\cos\theta,\quad 1-(1-\theta/\pi)^2\geq \frac{3}{4}(1-\cos\theta). \end{array}
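The Fact can be verified numerically on {[0,\pi/2]}, the relevant range when the inner products are nonnegative; here is a quick grid check, a sanity check rather than a proof:

```python
import math

# Check (1 - t/pi)^2 >= (3/4)cos t and 1 - (1 - t/pi)^2 >= (3/4)(1 - cos t)
# on a fine grid of angles t in [0, pi/2], tracking the worst margin.
margin_same = margin_diff = float("inf")
for k in range(10001):
    theta = (math.pi / 2) * k / 10000
    p_same = (1 - theta / math.pi) ** 2
    margin_same = min(margin_same, p_same - 0.75 * math.cos(theta))
    margin_diff = min(margin_diff, (1 - p_same) - 0.75 * (1 - math.cos(theta)))
```

Both margins stay nonnegative; the second inequality is tight at {\theta=0} and {\theta=\pi/2}.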

Applying these two bounds term by term gives {\mathop{\mathbb E}(OUTPUT)\geq\frac{3}{4}Cost(SDP)\geq\frac{3}{4}OPT}, i.e. the approximation factor of {3/4}. \Box

Remark: if one used {c} hyperplanes, the exponent {2} would be replaced by {c}. It turns out that {c=2} gives the best numerical value.
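The rounding step itself is easy to sketch. The unit vectors would come from an SDP solver, which is omitted here, so the vectors below are hypothetical:

```python
import random

def round_two_hyperplanes(vectors, rng):
    # Two random hyperplanes through the origin (Gaussian normal vectors)
    # cut the unit sphere into 4 regions; each region becomes one cluster.
    dim = len(next(iter(vectors.values())))
    normals = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(2)]
    clusters = {}
    for i, v in vectors.items():
        label = tuple(sum(a * b for a, b in zip(n, v)) >= 0 for n in normals)
        clusters.setdefault(label, []).append(i)
    return list(clusters.values())

# Toy instance: items 0 and 1 share a vector, item 2 points the other way,
# so 0 and 1 always land together and 2 almost surely lands elsewhere.
vecs = {0: [1.0, 0.0], 1: [1.0, 0.0], 2: [-1.0, 0.0]}
parts = round_two_hyperplanes(vecs, random.Random(1))
```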

 

2. Coloring a {3}-colorable graph

 

Input: a {3}-colorable graph. Goal: color it, using as few colors as possible.

Claim: if {2} colors suffice, it is easy to find a {2}-coloring (by breadth-first search).
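The {2}-coloring claim is just breadth-first search; a minimal sketch (graph encoding hypothetical):

```python
from collections import deque

def two_color(adj):
    # adj: {u: [neighbors]}; returns {u: 0/1}, or None if not 2-colorable
    color = {}
    for s in adj:
        if s in color:
            continue
        color[s] = 0
        q = deque([s])
        while q:
            u = q.popleft()
            for v in adj[u]:
                if v not in color:
                    color[v] = 1 - color[u]
                    q.append(v)
                elif color[v] == color[u]:
                    return None   # odd cycle found
    return color

# A 4-cycle is 2-colorable; a triangle is not.
c4 = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
k3 = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
```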

Claim: every graph of maximum degree {\leq\Delta} can be {(\Delta+1)}-colored. Indeed, proceed greedily: scan the vertices in any order and give each one the smallest color not already used by a neighbor; at most {\Delta} colors are ever blocked.
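The greedy argument, as a sketch: since at most {\Delta} colors can be blocked at any vertex, one of the colors {0,\ldots,\Delta} is always available.

```python
def greedy_color(adj):
    # adj: {u: [neighbors]}; give each vertex the smallest free color
    color = {}
    for u in adj:
        used = {color[v] for v in adj[u] if v in color}
        c = 0
        while c in used:
            c += 1
        color[u] = c
    return color

# On a 5-cycle (max degree 2) greedy uses at most 3 colors.
c5 = {i: [(i - 1) % 5, (i + 1) % 5] for i in range(5)}
col = greedy_color(c5)
```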

Claim: There is an algorithm to color a graph with {O(\sqrt{n})} colors. Indeed,

1. Reduce the maximum degree to {\sqrt{n}} by coloring some vertices: while there exists a vertex {v} with degree {>\sqrt{n}}, take {v} and its neighbors and find a {3}-coloring of them (easy, since the graph is {3}-colorable, so the neighbors alone form a {2}-colorable graph). Remove these vertices and retire the {3} colors used. Each step removes more than {\sqrt{n}} vertices, so the process stops after at most {\sqrt{n}} steps, having used at most {3\sqrt{n}} colors.

2. Use previous algorithm for remaining graph.
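Putting the two steps together, here is a sketch of the {O(\sqrt{n})}-color algorithm, assuming the input graph really is {3}-colorable (so every neighborhood induces a bipartite graph); all encodings and names are illustrative:

```python
import math
from collections import deque

def two_color(adj, sub):
    # BFS 2-coloring of the subgraph induced by `sub`;
    # assumes that subgraph is bipartite.
    color = {}
    for s in sub:
        if s in color:
            continue
        color[s] = 0
        q = deque([s])
        while q:
            u = q.popleft()
            for v in adj[u]:
                if v in sub and v not in color:
                    color[v] = 1 - color[u]
                    q.append(v)
    return color

def sqrt_coloring(graph):
    # graph: {u: iterable of neighbors}; assumed 3-colorable.
    adj = {u: set(vs) for u, vs in graph.items()}
    threshold = math.isqrt(len(adj))
    color, next_color = {}, 0
    # Phase 1: while a high-degree vertex remains, 3-color it and its
    # (bipartite) neighborhood with fresh colors, then delete all of them.
    while True:
        high = next((u for u in adj if len(adj[u]) > threshold), None)
        if high is None:
            break
        nbrs = set(adj[high])
        for v, c in two_color(adj, nbrs).items():
            color[v] = next_color + c
        color[high] = next_color + 2
        next_color += 3
        removed = nbrs | {high}
        for u in removed:
            for v in adj[u]:
                if v not in removed:
                    adj[v].discard(u)
        for u in removed:
            del adj[u]
    # Phase 2: the rest has max degree <= sqrt(n); color it greedily
    # with fresh colors.
    for u in adj:
        used = {color[v] for v in adj[u] if v in color}
        c = next_color
        while c in used:
            c += 1
        color[u] = c
    return color

# Example: the complete tripartite graph K_{3,3,3} is 3-colorable.
k333 = {u: [v for v in range(9) if v // 3 != u // 3] for u in range(9)}
col = sqrt_coloring(k333)
```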

To do better than {\sqrt{n}}, we use an SDP.

Theorem 2 There is an SDP-based algorithm that colors a {3}-colorable graph with {O(n^{.387...})} colors.

 

Proof: Start with {2}-colorability. We want to map vertices to {{\mathbb R}}. Asking that {v_i \cdot v_j =-1} if {ij\in E} and {|v_i|^2 =1} is feasible in {{\mathbb R}} iff the graph is {2}-colorable (the two values {\pm 1} are the two colors).

For {3}-colorability, want to map vertices to {{\mathbb R}^2}. Look for a set of SDP constraints that is feasible (in {{\mathbb R}^2}) iff the graph is {3}-colorable. Answer: require that {v_i \cdot v_j =-1/2} if {ij\in E} and {|v_i|^2 =1}.

So here is our SDP: Minimize objective function {0} (i.e. merely solve the feasibility problem) under constraints {v_i \cdot v_j =-1/2} if {ij\in E} and {|v_i|^2 =1}.

Rounding: Pick {t} independent random hyperplanes through the origin; they cut the sphere into {2^t} regions. For each region {R}, let {S_R=\{i\,;\,v_i \in R\textrm{ and }\forall \,j\sim i,\;v_j \notin R\}}. Use one fresh color for all of {S_R}, and do this for every {R} (this uses at most {2^t} colors). Remove all the {S_R}‘s. Repeat.

Analysis: If {ij\in E}, then {v_i \cdot v_j =-1/2} means {\theta_{ij}=2\pi /3}, hence

\displaystyle  \begin{array}{rcl}  \mathop{\mathbb P}(i,\, j\textrm{ in same region})=(1-\frac{\theta_{ij}}{\pi})^{t}=3^{-t}. \end{array}

Fix {i},

\displaystyle  \begin{array}{rcl}  \mathop{\mathbb P}(i\textrm{ is colored in one iteration}) &=&\mathop{\mathbb P}(\forall j\sim i,\, j\textrm{ is in a different region from }i)\\ &\geq& 1-3^{-t}degree(i)\geq \frac{1}{2} \end{array}

if {t=\log_3 (2n)}. So with this choice of {t}, at least half of the vertices get colored (in expectation) at each iteration, so {O(\log n)} iterations suffice, and the total number of colors needed is { \leq O(\log n)\,2^{\log_3 (2n)}=O(n^{\log_3 2}\log n)}. This is roughly {n^{0.63}}, which is worse than {\sqrt{n}}.
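The rounding loop can be sketched as follows. The vectors would come from the SDP solver; here we feed in the ideal vectors of a triangle (three unit vectors at {120^{\circ}}), and all names are illustrative:

```python
import math
import random

def hyperplane_color(adj, vectors, t, rng):
    # Repeatedly draw t random hyperplanes (Gaussian normals); in each of
    # the 2^t sign-pattern regions, give one fresh color to every
    # still-uncolored vertex with no uncolored neighbor in its region.
    dim = len(next(iter(vectors.values())))
    color, next_color = {}, 0
    while len(color) < len(adj):
        normals = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(t)]
        region = {u: tuple(sum(a * b for a, b in zip(n, vectors[u])) >= 0
                           for n in normals)
                  for u in adj if u not in color}
        fresh = {}
        for u, r in region.items():
            if all(region.get(v) != r for v in adj[u] if v not in color):
                if r not in fresh:
                    fresh[r] = next_color + len(fresh)
                color[u] = fresh[r]
        next_color += len(fresh)
    return color

# A triangle with its ideal SDP vectors: pairwise inner products -1/2.
tri = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
vecs = {u: [math.cos(2 * math.pi * u / 3), math.sin(2 * math.pi * u / 3)]
        for u in range(3)}
col = hyperplane_color(tri, vecs, 2, random.Random(7))
```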

To improve this, combine with the previous idea.

1. Reduce the maximum degree to {d=n^{\log_6 3}\approx n^{0.613}} by repeatedly taking a vertex of degree {>d} together with its neighbors and {3}-coloring them with fresh colors.

2. On remaining graph, solve SDP relaxation and proceed to rounding.

The analysis is modified as follows.

\displaystyle  \begin{array}{rcl}  \mathop{\mathbb E}(\textrm{number of colors used}) =3.\frac{n}{d}+(\log n).2^t \end{array}

where {t} is such that {3^{-t}d=1/2}, i.e. {t=\log_3 (2d)}. This choice of {d} balances the two terms, giving {O(n^{1-\log_6 3})=O(n^{\log_6 2})=O(n^{.387...})} colors. \Box

 

3. MAX SAT

Since there is time left, let us study one more example.

Input: a SAT formula in conjunctive normal form. Goal: find an assignment of the variables satisfying the largest possible number of clauses.

Trivial algorithm: a uniformly random assignment. A clause with {k} literals fails only when all {k} literals are false, which happens with probability {2^{-k}\leq 1/2}; so each clause is satisfied with probability at least {1/2}, and

\displaystyle  \begin{array}{rcl}  \mathop{\mathbb E}(\textrm{number of satisfied clauses})\geq \frac{1}{2}\textrm{number of clauses} \geq \frac{1}{2}OPT. \end{array}
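A tiny exact computation illustrating the {1/2} guarantee (the clause set below is hypothetical): enumerate all assignments and average the number of satisfied clauses.

```python
from itertools import product

def satisfied(clauses, assign):
    # A literal (v, True) reads x_v; (v, False) reads its negation.
    return sum(any(assign[v] == positive for v, positive in clause)
               for clause in clauses)

# Average over all assignments of two variables; clauses are
# x0, (not x0 or x1), (x0 or not x1).
clauses = [[(0, True)], [(0, False), (1, True)], [(0, True), (1, False)]]
counts = [satisfied(clauses, dict(enumerate(bits)))
          for bits in product([False, True], repeat=2)]
avg = sum(counts) / len(counts)
```

The per-clause satisfaction probabilities are {1/2}, {3/4}, {3/4}, so the average is {2}, at least half the number of clauses.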

LP-based algorithm. Introduce a number {y_j \in[0,1]} for each variable and a number {z_C \in[0,1]} for each clause {C}. Maximize {\sum_C z_C} subject to, for each clause {C}, {z_C \leq \sum y_j +\sum (1-y_j)}, where the first sum is over variables occurring positively in {C} and the second over variables occurring negatively. Solve the LP, then set each variable to true independently with probability {y_j}.

A clause with {k} literals is then satisfied with probability at least {(1-(1-1/k)^k)z_C \geq(1-e^{-1})z_C}, so the analysis concludes to a {(1-e^{-1})}-approximation.

How can we do better ? Here is a {3/4}-approximation:

1. Try the trivial random assignment.

2. Try the LP-based algorithm.

3. Output the better of the two.

This works because the two algorithms are bad on disjoint sets of inputs: the trivial algorithm satisfies a clause with {k} literals with probability {1-2^{-k}}, so it is poor only on short clauses, which is exactly where the LP-based algorithm (guarantee {1-(1-1/k)^k}) does well. Averaging the two guarantees gives at least {3/4} for every {k}.
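The averaging argument can be checked numerically: for each clause length {k}, average the two per-clause guarantees.

```python
# Guarantee per clause of length k: random assignment 1 - 2^{-k},
# LP rounding at least 1 - (1 - 1/k)^k (times the LP value of the clause).
# Their average is at least 3/4 for every k, with equality at k = 1, 2.
averages = []
for k in range(1, 51):
    p_random = 1 - 2.0 ** (-k)
    p_lp = 1 - (1 - 1.0 / k) ** k
    averages.append((p_random + p_lp) / 2)
```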

 

References

 

[GVY] Garg, Naveen; Vazirani, Vijay V.; Yannakakis, Mihalis; Approximate max-flow min-(multi)cut theorems and their applications. SIAM J. Comput. 25 (1996), no. 2, 235–251.

 

About metric2011

metric2011 is a program of Centre Emile Borel, an activity of Institut Henri Poincaré, 11 rue Pierre et Marie Curie, 75005 Paris, France. See http://www.math.ens.fr/metric2011/
