Notes of Mathieu’s lecture nr 2

Scribe: Arnaud de Mesmay

This lecture is about Multiway-Cut and Multi-Cut, which are good illustrations of the techniques used to design approximation algorithms using LP relaxations.

1. Min-Cut

The Min-Cut problem is the following problem : the input is a graph and two vertices {s} and {t} and the goal is to find the cheapest cut separating {s} and {t}. As we will see in the next section, it is the Multiway-Cut problem for k=2.

Min-Cut is solvable in polynomial time using classic algorithms (flows..) but we will see here a way to solve it using linear programming.

The first step is to express the problem as an Integer Program (IP)

minimize {\sum_{e\in E}c_ex_e}

\displaystyle  \begin{array}{rcl}  \text{subject to} \begin{cases} \sum_{ e \in p} x_e \geq 1 & \forall \text{ path }p\text{ from }s\text{ to }t \\ x_e\in \{ 0,1 \} & \forall e\in E \end{cases} \end{array}

There are two claims to prove in order to get a polynomial time algorithm.

Claim: There is a separation oracle for the LP relaxation of the above IP (we need it as the number of constraints is exponential)

Proof: The separation algorithm is just Dijkstra’s algorithm to compute a shortest path from {s} to {t} in the graph where edge {e} has length {x_e}. \Box

Claim: the LP relaxation has an integral optimal solution.

Proof: Let {(x_e)} denote the optimal solution of the LP relaxation. Consider distances from {s} to graph vertices in the shortest path metric (using the {x_e} as edge lengths). Pick a random {r} uniformly in the open interval {(0,1)}. Partition the graph into two sets, {L=\{ u: d(s,u)<r\}}, and {R=V-R}. That defines a random integer cut (convex combination of integer cuts). The expected value of this cut:

\displaystyle \sum_{uv\in E}c_e\Pr (e\text{ is cut })=\sum_{uv\in E} c_e |d(s,v)-d(s,u)|\leq \sum_{uv\in E} c_e x_e=\text{ OPT(LP)}.

Hence the optimal solution is a convex combination of integral solutions, thus these integral solutions are optimal too, which concludes the proof. \Box

2. Multiway cut

Input: graph with edge costs {c_e\geq 0}, and {k} terminals, vertices {s_1,s_2,\ldots s_k}. Output: cheapest set of edges disconnecting the {s_i}‘s from one another. For {k \geq 3}, this problem is NP-hard.

Theorem 1 There is a 3/2 approximation.

Proof: The idea is still the same, we use a LP relaxation and then round it.

Equivalently, the output is a partition of {V} into {k} parts, where {s_i} belongs to part {i}, and the cost is the cost of edges between different parts. An IP for the Multiway-Cut problem is :

minimize { \sum_{uv\in E}\sum_{1\leq i\leq k} c_{uv}|x_v^{i}-x_u^{i}|}

\displaystyle  \begin{array}{rcl}  \text{subject to} \begin{cases} \sum_{1\leq i\leq k} x_u^{i}=1 		& \forall u\in V\\ x_{s_i}^i=1				&\forall i=1,2,\ldots ,k\\ x_u^{i}\in \{ 0,1\}			&\forall u,i \end{cases} \end{array}

The idea behind this IP is that {x_u^i=1} means that the vertex {u} is in the partition which contains {s_i}. The constraints force these {x_u^i} to define a partition of {V}, and then the cost of the multiway cut is easily seen to equal the expression we try to minimize.

Observing that {\sum_i |x_u^i-x_v^i|=||x_u-x_v||_1} where {x_u=(x_u^1..x_u^k)} and that {x_{s_i}=(0..1..0)} where the 1 is in the ith position, an equivalent representation of the LP relaxation is :

minimize {\sum_{uv\in E} c_{uv}||x_v-x_u||_1}

\displaystyle  \begin{array}{rcl}  \textrm{subject to} \begin{cases} x_u\in \Delta_k		& \forall u\in V\\ x_{s_i}=(0,0,...,1,0,...,0)				&\forall i=1,2,\ldots ,k\\ \end{cases} \end{array}

where {\Delta_k=\{ x\in R_+^k:\sum_i x_i=1 \}} and the 1 in {x_{s_i}} is in position {i}.

For example, for {k=3}, this LP gives an embedding of {G} in the equilateral triangle with {s_1,s_2,s_3} at the three corners.

Rounding: Similarly as for Min-Cut, the idea to round it is to pick a random {r} and partition with the balls around {s_i} of radius {r}.

The rounding algorithm is the following :

  • Pick {0<r<1} uniformly at random.
  • Pick permutation {\pi} of {[k]} uniformly at random.
  • For {i=1..k-1}; put in region {(s_{\pi_i})} the vertices within distance {r} of {s_{\pi_i}} that have not yet been put elsewhere. Put everyone else in region {(s_{\pi_k})}. The gives a partition, i.e. a multicut.

Analysis: The expected cost of the output cut is

\displaystyle \sum_e c_e \Pr(\text{e is cut})

Up to relabeling, let {s_1} be the one among {s_2..s_k} that minimizes {min(d(s_1,u),d(s_1,v)}.

Then {\Pr(\text{e is cut})=\sum_i{\Pr(\text{e is cut during iteration i of the rounding loop})}}.

But for {j \neq 1}, {s_j} precedes {s_1} in permutation {\pi} with probability {\frac{1}{2}}, and comes after {s_1} with probability {\frac{1}{2}} (because {\pi} is random). Also note that if {s_j} comes after {s_1} in {\pi}, then {s_j}‘s iteration cannot be responsible of separating {u} from {v} (because of the “definition” of {s_1}).

We also have

\displaystyle  \begin{array}{rcl}  d(s_j,u) &=& || (0..1..0) - (x_u^1..x_u^k)||_1 \\ &=& \sum_{i\neq j} x_u^i + (1- x_u^j) \\ &=& 2(1-x_u^j) \end{array}

where the last inequality follows from the constraints in the LP.

Then

\displaystyle  \begin{array}{rcl}  \Pr((u,v) \text{ is cut}) & = & \sum_j \Pr( (u,v) \text{is cut when defining region }(s_j)) \\ &=& 2 |x_u^1-x_v^1| + \sum_{j \neq 1} \Pr((u,v) \text{is cut by } s_j \mid s_j \text{ is before } s_1) \Pr(s_j \text{ before } s_1) \\ & & + \Pr((u,v) \text{is cut by } s_j \mid s_j\text{ is after }s_1) \Pr(s_j \text{ after } s_1) \end{array}

because the probability of cutting for {s_j=s_1} is the probability that {r} falls between {2 (1-x_u^j)} and {2 (1- x_v^j)}. Now recall that {(u,v)} is never cut by {s_j} when {s_j} comes after {s_1} and note, as for {s_j=s_1} that the probability of cutting {(u,v)} at step {j} is bounded by {2 |x_u^j - x_v^j|}. this gives

\displaystyle  \begin{array}{rcl}  \Pr(e \text{ is cut}) &\leq &2|x_u^1-x_v^1| + \frac{1}{2} \sum_{j \neq 1} 2 |x_u^j-x_v^j| \\ &=& |x_u^1-x_v^1| + ||x_u-x_v||_1 \end{array}

Last claim : {|x_u^1-x_v^1| \leq \frac{1}{2} ||x_u - x_v||_1}

Proof: Left as an Exercise. \Box

Now {\Pr(e \text{ is cut}) \leq \frac{3}{2} ||x_u-x_v||_1}, thus the expected value of the output is at most {\frac{3}{2}} of the value of the LP. \Box

3. Multicut

Input: graph with edge costs {c_e\geq 0}, and {k} terminal pairs, pairs of vertices {(s_1,t_1),(s_2,t_2),\ldots ,(s_k,t_k)}. Output: cheapest set of edges disconnecting every {s_i} from {t_i}.

This problem is NP-hard for small {k} ({k=2} ? {k=3} ?)

Theorem 2 There is an {4\ln (k+1)} approximation.

Proof: As before, it is all about finding the right LP and rounding it.

(LP relaxation) We use the following one :

minimize {\sum_{e\in E}c_ex_e}

\displaystyle  \begin{array}{rcl}  \text{subject to} \begin{cases} \sum_{ e \in p} x_e \geq 1 & \forall \text{ i } \forall \text{ path }p\text{ from }s_i \text{ to }t_i  \\ 0 \leq x_e \leq 1 & \forall e\in E \end{cases} \end{array}

(Rounding : ) For {i} going from 1 to k: if {s_i} is not yet separated from {t_i}, then take a ball {B} of radius {r} around {s_i}, where {r<1/2} is chosen to minimize {c(B)/V(B)}, with

\displaystyle c(B)=\sum \{ c(uv), u\in B, v\notin B \}, \text{ and }

\displaystyle V(B) = V_0 + \sum_{e \in \mathcal{B}(s,r)} c_e x_e + \sum_{e \text{ crossing the ball}} c_e x_e \times (\text{ fraction of edge inside}).

(it is different for each {s_i} !). Add {B} as a new part of the partition of {V} that we are constructing, and remove {B} from {G}.

(Analysis:) Feasibility: since {r<1/2}, by the LP constraints on {(s_i,t_i)} we can be sure that no couple {(s_i,t_i)} will be in a ball corresponding to {s_j} for {j\neq i}, so the output is a feasible multicut.

Cost: note that {dV(r)\geq c(r)dr}, so {V'(r)\geq c(r)}, and so {V'(r)/V(r)\geq c(r)/V(r)}. Thus the ball chosen has {c(r)/V(r)\leq f'(r)}, where we define {f(r)=\ln V(r)}. It only remains to use the Mean Value Theorem for {f}.

We have {f(0)=\ln V_0} and {f(1/2)= \ln V(\frac{1}{2}) \leq \ln (V_0 + OPT)}, hence the Mean Value Theorem gives

\displaystyle  \exists r<\frac{1}{2} \text{ s.t. } f'(r) \leq \frac{\ln (1+OPT/V_0)}{\frac{1}{2}-0}

Thus each ball {B_i} added to the partition during iteration {i} increases the cost of the multicut by {c(B_i)\leq V(B_i) 2\ln (1+OPT/V_0)}. Altogether, the final multicut has cost at most

\displaystyle \sum_i V(B_i) 2\ln (1+OPT/V_0)=2(kV_0+OPT)\ln (1+OPT/V_0).

Setting {V_0=OPT/k} proves the Theorem. \Box

Advertisements

About metric2011

metric2011 is a program of Centre Emile Borel, an activity of Institut Henri Poincaré, 11 rue Pierre et Marie Curie, 75005 Paris, France. See http://www.math.ens.fr/metric2011/
This entry was posted in Course and tagged . Bookmark the permalink.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s