Clustering and coloring
On Monday, we saw LP algorithms for cut problems (multicut…). Today, we move to SDP-based algorithms.
Let me first complete the analysis of the Multicut algorithm I described on Monday.
Recall Multicut is the search, given a graph $G=(V,E)$ with edge costs $c_e$ and $k$ pairs $(s_i,t_i)$, for a cut (a set of edges $F$) which separates the given pairs and minimizes the cost $c(F)=\sum_{e\in F}c_e$.
Method: Use the LP solution $(d_e)$ as a metric $d$ on $V$. For each $i$ in turn, remove a ball $B(s_i,r)=\{v: d(s_i,v)\le r\}$, i.e. cut the edges leaving it (for a well-chosen $r<1/2$).
Analysis: 1. Feasibility follows from $d(s_i,t_i)\ge 1$: since $r<1/2$, no ball can contain both endpoints of a pair, and every pair gets separated. 2. Cost. Choose $r$ to minimize the ratio $C_i(r)/V_i(r)$, where $C_i(r)$ is the cost of the edges leaving $B(s_i,r)$ and $V_i(r)=F^*/k+\sum_{e\subset B(s_i,r)} c_e d_e$ is the volume of the ball ($F^*$ denotes the LP value; edges crossing the boundary contribute the fraction of their length inside the ball).
Key observation: $C_i(r)\le V_i'(r)$ for almost every $r$. In fact, equality holds unless the sphere of radius $r$ crosses vertices (known as coarea formula/inequality in Riemannian geometry). Thus
$\int_0^{1/2} \frac{C_i(r)}{V_i(r)}\,dr \le \int_0^{1/2} \frac{V_i'(r)}{V_i(r)}\,dr = \ln\frac{V_i(1/2)}{V_i(0)} \le \ln(k+1).$
Apply the mean value theorem: there exists $r\in(0,1/2)$ such that
$C_i(r) \le 2\ln(k+1)\,V_i(r).$
Choosing such an $r$ for each ball and summing (the balls being disjoint, $\sum_i V_i \le 2F^*$) yields a $4\ln(k+1)=O(\log k)$-approximation.
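The ball-cutting step can be sketched in a few lines of Python. Everything below is a hypothetical stand-in: the instance and its LP lengths are supplied by hand (a real implementation would solve the LP), and the radius is chosen by a crude scan over a few values below $1/2$ rather than by the mean-value argument.

```python
import heapq

# Hypothetical toy instance: unit-cost edges carrying hand-made LP lengths
# (a feasible fractional multicut: both s_i--t_i distances equal 1).
edges = {  # (u, v): (cost c_e, LP length d_e)
    ("s1", "a"): (1.0, 0.5), ("a", "t1"): (1.0, 0.5),
    ("s2", "a"): (1.0, 0.5), ("a", "t2"): (1.0, 0.5),
}
pairs = [("s1", "t1"), ("s2", "t2")]

def dists(src, removed):
    # Dijkstra with respect to the LP lengths d_e, ignoring removed edges
    dist, pq = {src: 0.0}, [(0.0, src)]
    while pq:
        du, u = heapq.heappop(pq)
        if du > dist.get(u, float("inf")):
            continue
        for (x, y), (c, d) in edges.items():
            if (x, y) in removed or u not in (x, y):
                continue
            v = y if u == x else x
            if du + d < dist.get(v, float("inf")):
                dist[v] = du + d
                heapq.heappush(pq, (du + d, v))
    return dist

removed = set()
for s, t in pairs:
    dist = dists(s, removed)
    if t not in dist:  # already separated by earlier cuts
        continue
    # any radius r < 1/2 is feasible; scan a few radii for the cheapest cut
    best = None
    for r in (0.1, 0.2, 0.3, 0.4, 0.45):
        ball = {v for v, dv in dist.items() if dv <= r}
        cut = {e for e in edges
               if (e[0] in ball) != (e[1] in ball) and e not in removed}
        cost = sum(edges[e][0] for e in cut)
        if best is None or cost < best[0]:
            best = (cost, cut)
    removed |= best[1]

print(sorted(removed))
```

On this instance the algorithm cuts the two edges incident to $s_1$ and $s_2$, which indeed separates both pairs.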
Note: this $O(\log k)$ bound is due to N. Garg, V. Vazirani and M. Yannakakis [GVY]; the constant in front of $\log k$ has since been improved.
1. Correlation clustering
Input: a graph, and for every edge $xy$, two weights $w^+_{xy}\ge 0$ (a measure of similarity) and $w^-_{xy}\ge 0$ (a measure of dissimilarity). Goal: partition the vertices into parts so as to maximize
$\sum_{xy:\ x,y \text{ in the same part}} w^+_{xy} \;+ \sum_{xy:\ x,y \text{ in different parts}} w^-_{xy}.$
No restriction on the number of parts.
Claim: There exists a $1/2$-approximation.
Indeed, in a uniformly random partition into two parts, each edge is internal with probability $1/2$, so the expected value is $\frac12\sum_{xy}(w^+_{xy}+w^-_{xy}) \ge \mathrm{OPT}/2$.
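As a sanity check of the $1/2$ bound, one can average over all $2^n$ two-part partitions of a small made-up instance (the vertex names and weights below are hypothetical):

```python
from itertools import product

# hypothetical instance: edge -> (w_plus, w_minus)
edges = {(0, 1): (3.0, 1.0), (1, 2): (0.5, 2.0), (0, 2): (1.0, 1.0)}
n = 3

def value(part):
    # w+ counts on internal edges, w- on crossing edges
    return sum(wp if part[u] == part[v] else wm
               for (u, v), (wp, wm) in edges.items())

# exact expectation over all 2^n random two-part partitions
expectation = sum(value(p) for p in product([0, 1], repeat=n)) / 2 ** n
half_total = sum(wp + wm for wp, wm in edges.values()) / 2
assert abs(expectation - half_total) < 1e-9  # equals (1/2) * sum(w+ + w-)
print(expectation)
```

Each edge contributes exactly $(w^+_{xy}+w^-_{xy})/2$ to the expectation, which is what the assertion checks.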
This is useless, since it does not even look at the input. How to improve? Iterate? Not a bad idea. But we do differently: we use an SDP.
Theorem 1 There is an SDP-based $3/4$-approximation.
SDP formulation: As variables, take unit vectors $u_x$, indexed by the vertices $x$, with coordinates indexed by the parts. Define the objective
$\sum_{xy} w^+_{xy}\, u_x\cdot u_y + w^-_{xy}\,(1 - u_x\cdot u_y),$
and add the constraints $u_x\cdot u_x = 1$ and $u_x\cdot u_y \ge 0$. Integer solutions, where each $u_x$ is the standard basis vector of the part containing $x$, satisfy $u_x\cdot u_y = 1$ when $x,y$ lie in the same part and $0$ otherwise; so they exactly correspond to clusterings, with matching cost, and the SDP is a relaxation of our clustering problem.
Rounding: As for MAX CUT, let us first try choosing a single random hyperplane. This cannot yield better than a $1/2$-approximation. Indeed, if the input wants to be split into many parts, the best the SDP can do is to provide vectors which are nearly orthonormal, and a single hyperplane will merely split this cloud into two halves.
The trick is to take two random hyperplanes through the origin. They partition the sphere into 4 regions; declare two vertices to be in the same part iff their vectors fall in the same region. Then
$\Pr[u_x, u_y \text{ fall in the same region}] = \left(1-\frac{\theta_{xy}}{\pi}\right)^2,$
where $\theta_{xy}$ is the angle between $u_x$ and $u_y$. Therefore the expected value of the rounded solution is
$\sum_{xy} w^+_{xy}\left(1-\frac{\theta_{xy}}{\pi}\right)^2 + w^-_{xy}\left(1-\left(1-\frac{\theta_{xy}}{\pi}\right)^2\right),$
whereas the SDP value is
$\sum_{xy} w^+_{xy}\cos\theta_{xy} + w^-_{xy}\,(1-\cos\theta_{xy}).$
Fact: For all $\theta \in [0,\pi/2]$ (the range guaranteed by the constraint $u_x\cdot u_y \ge 0$),
$\left(1-\frac{\theta}{\pi}\right)^2 \ge \frac34\cos\theta \quad\text{and}\quad 1-\left(1-\frac{\theta}{\pi}\right)^2 \ge \frac34\,(1-\cos\theta).$
Comparing the two sums term by term, this yields the approximation factor of $3/4$.
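The Fact is elementary calculus; here is a numerical check on a fine grid (a sanity check rather than a proof):

```python
import math

# check both inequalities of the Fact on a fine grid of [0, pi/2]
steps = 100000
for i in range(steps + 1):
    theta = (math.pi / 2) * i / steps
    p_same = (1 - theta / math.pi) ** 2  # both endpoints in the same region
    assert p_same >= 0.75 * math.cos(theta) - 1e-12
    assert 1 - p_same >= 0.75 * (1 - math.cos(theta)) - 1e-12
print("both inequalities hold on the grid")
```

The second inequality is tight at $\theta=\pi/2$, which is where the factor $3/4$ comes from.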
Remark: if one used $k$ hyperplanes, the exponent $2$ would be replaced by $k$. It turns out that $k=2$ gives the best numerical value.
2. Coloring a 3-colorable graph
Input: a 3-colorable graph. Goal: color it, using as few colors as possible.
Claim: if 2 colors are possible, it is trivial to find a 2-coloring (the graph is bipartite: 2-color each connected component by breadth-first search).
Claim: every graph of maximum degree $\Delta$ can be $(\Delta+1)$-colored. Indeed, proceed greedily. Pick a vertex, give it a color not used by its already-colored neighbors (one of the $\Delta+1$ colors is always free). Jump to a neighbor,…
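The greedy claim in code, tried on a hypothetical example (a 5-cycle, $\Delta=2$):

```python
# Greedy (Delta+1)-coloring: every vertex has at most Delta neighbors,
# so among Delta+1 colors one is always free when its turn comes.
def greedy_color(adj):
    delta = max(len(nbrs) for nbrs in adj.values())
    color = {}
    for v in adj:  # any vertex order works
        used = {color[u] for u in adj[v] if u in color}
        color[v] = min(c for c in range(delta + 1) if c not in used)
    return color

# hypothetical example: a 5-cycle (Delta = 2, so at most 3 colors)
adj = {i: [(i - 1) % 5, (i + 1) % 5] for i in range(5)}
col = greedy_color(adj)
assert all(col[u] != col[v] for v in adj for u in adj[v])  # proper
assert max(col.values()) <= 2  # at most Delta + 1 = 3 colors (0, 1, 2)
print(col)
```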
Claim: There is an algorithm to color a 3-colorable graph with $O(\sqrt n)$ colors. Indeed,
1. Reduce the maximum degree to $\sqrt n$ by coloring some vertices. While there exists a vertex $v$ with degree $\ge \sqrt n$, take $v$ and its first $\sqrt n$ neighbors, and find a 3-colouring of this set (easy, since the neighbors alone form a 2-colorable graph; keep a third color for $v$). Remove these vertices and retire the used colors. The process stops after at most $\sqrt n$ steps, having used at most $3\sqrt n$ colors.
2. Use the previous greedy algorithm on the remaining graph, whose maximum degree is now below $\sqrt n$; this costs at most $\sqrt n + 1$ further colors.
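The two phases can be sketched as follows. The function name `sqrt_coloring` and the complete-tripartite test instance are hypothetical choices, and the BFS 2-coloring of neighborhoods relies on the input really being 3-colorable:

```python
import math
from collections import deque

def bfs_two_color(verts, adj):
    # 2-color the subgraph induced by `verts` (assumed bipartite)
    col = {}
    for s in sorted(verts):
        if s in col:
            continue
        col[s] = 0
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w in verts and w not in col:
                    col[w] = 1 - col[u]
                    queue.append(w)
    return col

def sqrt_coloring(adj):
    # Phase 1: while some vertex has degree >= sqrt(n), 3-color it together
    # with its (bipartite) neighborhood using fresh colors, then delete them.
    # Phase 2: greedily color the low-degree remainder with a fresh palette.
    n = len(adj)
    thresh = max(1, math.isqrt(n))
    live, color, palette = set(adj), {}, 0
    while True:
        v = next((u for u in sorted(live)
                  if sum(1 for w in adj[u] if w in live) >= thresh), None)
        if v is None:
            break
        nbrs = {u for u in adj[v] if u in live}
        for u, c in bfs_two_color(nbrs, adj).items():
            color[u] = palette + c
        color[v] = palette + 2
        live -= nbrs | {v}
        palette += 3
    for v in sorted(live):  # remaining degrees < thresh: greedy suffices
        used = {color[u] for u in adj[v] if u in color}
        color[v] = min(c for c in range(palette, palette + thresh + 1)
                       if c not in used)
    return color

# hypothetical example: complete tripartite graph on 9 vertices (3-colorable)
adj = {v: [u for u in range(9) if u // 3 != v // 3] for v in range(9)}
col = sqrt_coloring(adj)
assert all(col[u] != col[v] for v in adj for u in adj[v])  # proper coloring
print(len(set(col.values())))
```

On this 9-vertex instance one phase-1 step colors a vertex and its six neighbors with 3 colors, and phase 2 finishes with one more, 4 colors in total.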
To do better than $O(\sqrt n)$, we use an SDP.
Theorem 2 There is an SDP-based algorithm which colors any 3-colorable graph with $\tilde O(n^{0.387})$ colors.
Proof: Start with 2-colorability. We want to map the vertices to $\{\pm 1\}\subset \mathbf R$. Asking that $u_x u_y = -1$ if $xy\in E$ and $u_x^2 = 1$ is feasible in $\mathbf R$ iff the graph is 2-colorable.
For 3-colorability, we want to map the vertices to unit vectors in the plane. Look for a set of SDP constraints that is feasible (in $\mathbf R^2$) iff the graph is 3-colorable. Answer: require that $u_x\cdot u_y \le -1/2$ if $xy\in E$ and $u_x\cdot u_x = 1$; the intended solution sends the three color classes to three unit vectors at mutual angles $2\pi/3$.
So here is our SDP: minimize the objective function $0$ (i.e. merely solve the feasibility problem, now over vectors in $\mathbf R^n$) under the constraints $u_x\cdot u_y \le -1/2$ if $xy\in E$ and $u_x\cdot u_x = 1$.
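A quick check that the intended integer solution satisfies these constraints, on a triangle (every pair of the three vectors is an "edge"):

```python
import math

# the three unit vectors at mutual angle 2*pi/3: the intended solution of
# the coloring SDP; check the constraints on a triangle
vecs = [(math.cos(2 * math.pi * j / 3), math.sin(2 * math.pi * j / 3))
        for j in range(3)]
for i in range(3):
    x = vecs[i]
    assert abs(x[0] ** 2 + x[1] ** 2 - 1) < 1e-9  # unit norm: u.u = 1
    for j in range(i + 1, 3):
        y = vecs[j]
        dot = x[0] * y[0] + x[1] * y[1]
        assert dot <= -0.5 + 1e-9  # edge constraint: u.v <= -1/2
print("constraints hold for K3")
```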
Rounding: Pick $t$ independently chosen random hyperplanes through the origin; they cut the space into (at most) $2^t$ regions. For each region $R$, let $S_R = \{x : u_x \in R \text{ and no neighbor of } x \text{ has its vector in } R\}$. Each $S_R$ is an independent set: use one color for all of $S_R$, and do this for all $R$ (it uses at most $2^t$ colors). Remove all the $S_R$'s. Repeat with fresh colors.
Analysis: If $xy\in E$, the constraint $u_x\cdot u_y \le -1/2$ forces $\theta_{xy} \ge 2\pi/3$, hence
$\Pr[u_x, u_y \text{ fall in the same region}] = \left(1-\frac{\theta_{xy}}{\pi}\right)^t \le 3^{-t} \le \frac{1}{2\Delta}$
if $t = \log_3(2\Delta)$. So with this choice of $t$, the expected number of neighbors landing in the region of a given vertex is at most $1/2$, hence in expectation at least half of the vertices get colored at each iteration; $O(\log n)$ iterations suffice, and the total number of colors needed is $O(2^t \log n) = \tilde O(\Delta^{\log_3 2}) = \tilde O(\Delta^{0.631})$. Since $\Delta$ can be as large as $n$, this gives $\tilde O(n^{0.631})$ colors, which is worse than $O(\sqrt n)$.
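Here is a sketch of the hyperplane rounding loop. Instead of solving the SDP, it feeds in the exact vector 3-coloring of a hypothetical complete tripartite instance (the three planar unit vectors at $120°$ from above), so only the rounding is exercised:

```python
import math, random

random.seed(0)

# Hypothetical instance: a complete tripartite graph, whose exact vector
# 3-coloring places the three parts at mutual angle 2*pi/3 in the plane.
n, t = 30, 4  # t = number of random hyperplanes per round
part = {v: v % 3 for v in range(n)}
adj = {v: [u for u in range(n) if part[u] != part[v]] for v in range(n)}
vec = {v: (math.cos(2 * math.pi * part[v] / 3),
           math.sin(2 * math.pi * part[v] / 3)) for v in range(n)}

color, palette, live = {}, 0, set(adj)
while live:
    # a random hyperplane through the origin = a random Gaussian normal
    normals = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(t)]
    region = {v: tuple(g[0] * vec[v][0] + g[1] * vec[v][1] > 0
                       for g in normals) for v in live}
    by_region = {}
    for v in live:
        by_region.setdefault(region[v], []).append(v)
    for r, verts in by_region.items():
        # S_R: vertices of the region with no neighbor in the same region
        s_r = [v for v in verts if all(region.get(u) != r for u in adj[v])]
        if s_r:
            for v in s_r:
                color[v] = palette  # one fresh color per region per round
            palette += 1
            live -= set(s_r)

assert all(color[u] != color[v] for v in adj for u in adj[v])  # proper
print(len(set(color.values())))
```

Since adjacent vectors are at angle $2\pi/3$, each hyperplane separates them with probability $2/3$, and a few rounds suffice to color everything.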
To improve this, combine with the previous idea.
1. Reduce the maximum degree to $\Delta$ by repeatedly taking a vertex of degree $\ge \Delta$ and its neighbors and 3-coloring them with fresh colors; this uses at most $3n/\Delta$ colors.
2. On the remaining graph, solve the SDP relaxation and proceed to the rounding.
The analysis is modified as follows: the total number of colors is now
$O(n/\Delta) + \tilde O(\Delta^{\log_3 2}),$
where $\Delta$ is chosen such that $n/\Delta = \Delta^{\log_3 2}$, i.e. $\Delta = n^{1/(1+\log_3 2)} \approx n^{0.613}$, giving a number of colors $\tilde O(n^{\log_3 2/(1+\log_3 2)}) = \tilde O(n^{0.387})$.
3. MAX SAT
Since there is time left, let us study one more example.
Input: a SAT formula in CNF form. Goal: find an assignment of variables satisfying the largest possible number of clauses.
Trivial algorithm: random assignment. A clause with $k$ literals is satisfied with probability $1 - 2^{-k} \ge 1/2$, so the expected number of satisfied clauses is at least half the total number of clauses, hence at least $\mathrm{OPT}/2$.
LP-based algorithm. Take one number $y_x \in [0,1]$ for each variable $x$. The cost of a clause is the minimum of $1$ and the sum of $y_x$ (or $1-y_x$) over all variables occurring in it positively (resp. negatively). The total cost, to be maximized, is the sum of the costs of the clauses (a linear program, after introducing one variable per clause for the truncated sum). Round by setting each $x$ to true independently with probability $y_x$.
Analysis: a clause with $k$ literals and LP cost $z$ is satisfied with probability at least $\left(1-\left(1-\frac1k\right)^k\right) z \ge \left(1-\frac1e\right) z$, which concludes to a $(1-1/e)$-approximation.
How can we do better? Here is a $3/4$-approximation:
1. Try the trivial assignment.
2. Try the LP-based algorithm.
3. Output the better of the two.
This works since the two algorithms work badly on disjoint sets of inputs. The trivial algorithm is poor when there are $1$-variable clauses, exactly where the LP-based algorithm does well. Quantitatively, for a clause with $k$ literals the average of the two per-clause guarantees is $\frac12\left(1-2^{-k}\right)+\frac12\left(1-\left(1-\frac1k\right)^k\right) \ge \frac34$, so the better of the two assignments satisfies at least $3/4$ of the LP optimum.
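The per-clause computation behind the $3/4$ bound can be checked numerically (the factor $1-(1-1/k)^k$ is the LP guarantee from the analysis above):

```python
# Per-clause guarantees for a clause with k literals:
#   random assignment: satisfied with probability 1 - 2^{-k}
#   LP rounding:       satisfied with probability >= (1 - (1 - 1/k)^k) z
# The better of the two beats their average, which is >= 3/4 for every k.
worst = 1.0
for k in range(1, 1000):
    rand = 1 - 2.0 ** (-k)
    lp = 1 - (1 - 1.0 / k) ** k
    avg = (rand + lp) / 2
    assert avg >= 0.75 - 1e-12
    worst = min(worst, avg)
print(worst)
```

The minimum $3/4$ is attained at $k=1$ and $k=2$; for larger $k$ the average tends to $\frac12(1 + 1 - 1/e) \approx 0.816$.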