## Notes of Mathieu’s lecture nr 2

Scribe: Arnaud de Mesmay

This lecture is about Multiway-Cut and Multi-Cut, which are good illustrations of the techniques used to design approximation algorithms using LP relaxations.

1. Min-Cut

The Min-Cut problem is the following problem : the input is a graph and two vertices ${s}$ and ${t}$ and the goal is to find the cheapest cut separating ${s}$ and ${t}$. As we will see in the next section, it is the Multiway-Cut problem for k=2.

Min-Cut is solvable in polynomial time using classic algorithms (flows..) but we will see here a way to solve it using linear programming.

The first step is to express the problem as an Integer Program (IP)

minimize ${\sum_{e\in E}c_ex_e}$

$\displaystyle \begin{array}{rcl} \text{subject to} \begin{cases} \sum_{ e \in p} x_e \geq 1 & \forall \text{ path }p\text{ from }s\text{ to }t \\ x_e\in \{ 0,1 \} & \forall e\in E \end{cases} \end{array}$

There are two claims to prove in order to get a polynomial time algorithm.

Claim: There is a separation oracle for the LP relaxation of the above IP (we need it as the number of constraints is exponential)

Proof: The separation algorithm is just Dijkstra’s algorithm to compute a shortest path from ${s}$ to ${t}$ in the graph where edge ${e}$ has length ${x_e}$. $\Box$

Claim: the LP relaxation has an integral optimal solution.

Proof: Let ${(x_e)}$ denote the optimal solution of the LP relaxation. Consider distances from ${s}$ to graph vertices in the shortest path metric (using the ${x_e}$ as edge lengths). Pick a random ${r}$ uniformly in the open interval ${(0,1)}$. Partition the graph into two sets, ${L=\{ u: d(s,u), and ${R=V-R}$. That defines a random integer cut (convex combination of integer cuts). The expected value of this cut:

$\displaystyle \sum_{uv\in E}c_e\Pr (e\text{ is cut })=\sum_{uv\in E} c_e |d(s,v)-d(s,u)|\leq \sum_{uv\in E} c_e x_e=\text{ OPT(LP)}.$

Hence the optimal solution is a convex combination of integral solutions, thus these integral solutions are optimal too, which concludes the proof. $\Box$

2. Multiway cut

Input: graph with edge costs ${c_e\geq 0}$, and ${k}$ terminals, vertices ${s_1,s_2,\ldots s_k}$. Output: cheapest set of edges disconnecting the ${s_i}$‘s from one another. For ${k \geq 3}$, this problem is NP-hard.

Theorem 1 There is a 3/2 approximation.

Proof: The idea is still the same, we use a LP relaxation and then round it.

Equivalently, the output is a partition of ${V}$ into ${k}$ parts, where ${s_i}$ belongs to part ${i}$, and the cost is the cost of edges between different parts. An IP for the Multiway-Cut problem is :

minimize ${ \sum_{uv\in E}\sum_{1\leq i\leq k} c_{uv}|x_v^{i}-x_u^{i}|}$

$\displaystyle \begin{array}{rcl} \text{subject to} \begin{cases} \sum_{1\leq i\leq k} x_u^{i}=1 & \forall u\in V\\ x_{s_i}^i=1 &\forall i=1,2,\ldots ,k\\ x_u^{i}\in \{ 0,1\} &\forall u,i \end{cases} \end{array}$

The idea behind this IP is that ${x_u^i=1}$ means that the vertex ${u}$ is in the partition which contains ${s_i}$. The constraints force these ${x_u^i}$ to define a partition of ${V}$, and then the cost of the multiway cut is easily seen to equal the expression we try to minimize.

Observing that ${\sum_i |x_u^i-x_v^i|=||x_u-x_v||_1}$ where ${x_u=(x_u^1..x_u^k)}$ and that ${x_{s_i}=(0..1..0)}$ where the 1 is in the ith position, an equivalent representation of the LP relaxation is :

minimize ${\sum_{uv\in E} c_{uv}||x_v-x_u||_1}$

$\displaystyle \begin{array}{rcl} \textrm{subject to} \begin{cases} x_u\in \Delta_k & \forall u\in V\\ x_{s_i}=(0,0,...,1,0,...,0) &\forall i=1,2,\ldots ,k\\ \end{cases} \end{array}$

where ${\Delta_k=\{ x\in R_+^k:\sum_i x_i=1 \}}$ and the 1 in ${x_{s_i}}$ is in position ${i}$.

For example, for ${k=3}$, this LP gives an embedding of ${G}$ in the equilateral triangle with ${s_1,s_2,s_3}$ at the three corners.

Rounding: Similarly as for Min-Cut, the idea to round it is to pick a random ${r}$ and partition with the balls around ${s_i}$ of radius ${r}$.

The rounding algorithm is the following :

• Pick ${0 uniformly at random.
• Pick permutation ${\pi}$ of ${[k]}$ uniformly at random.
• For ${i=1..k-1}$; put in region ${(s_{\pi_i})}$ the vertices within distance ${r}$ of ${s_{\pi_i}}$ that have not yet been put elsewhere. Put everyone else in region ${(s_{\pi_k})}$. The gives a partition, i.e. a multicut.

Analysis: The expected cost of the output cut is

$\displaystyle \sum_e c_e \Pr(\text{e is cut})$

Up to relabeling, let ${s_1}$ be the one among ${s_2..s_k}$ that minimizes ${min(d(s_1,u),d(s_1,v)}$.

Then ${\Pr(\text{e is cut})=\sum_i{\Pr(\text{e is cut during iteration i of the rounding loop})}}$.

But for ${j \neq 1}$, ${s_j}$ precedes ${s_1}$ in permutation ${\pi}$ with probability ${\frac{1}{2}}$, and comes after ${s_1}$ with probability ${\frac{1}{2}}$ (because ${\pi}$ is random). Also note that if ${s_j}$ comes after ${s_1}$ in ${\pi}$, then ${s_j}$‘s iteration cannot be responsible of separating ${u}$ from ${v}$ (because of the “definition” of ${s_1}$).

We also have

$\displaystyle \begin{array}{rcl} d(s_j,u) &=& || (0..1..0) - (x_u^1..x_u^k)||_1 \\ &=& \sum_{i\neq j} x_u^i + (1- x_u^j) \\ &=& 2(1-x_u^j) \end{array}$

where the last inequality follows from the constraints in the LP.

Then

$\displaystyle \begin{array}{rcl} \Pr((u,v) \text{ is cut}) & = & \sum_j \Pr( (u,v) \text{is cut when defining region }(s_j)) \\ &=& 2 |x_u^1-x_v^1| + \sum_{j \neq 1} \Pr((u,v) \text{is cut by } s_j \mid s_j \text{ is before } s_1) \Pr(s_j \text{ before } s_1) \\ & & + \Pr((u,v) \text{is cut by } s_j \mid s_j\text{ is after }s_1) \Pr(s_j \text{ after } s_1) \end{array}$

because the probability of cutting for ${s_j=s_1}$ is the probability that ${r}$ falls between ${2 (1-x_u^j)}$ and ${2 (1- x_v^j)}$. Now recall that ${(u,v)}$ is never cut by ${s_j}$ when ${s_j}$ comes after ${s_1}$ and note, as for ${s_j=s_1}$ that the probability of cutting ${(u,v)}$ at step ${j}$ is bounded by ${2 |x_u^j - x_v^j|}$. this gives

$\displaystyle \begin{array}{rcl} \Pr(e \text{ is cut}) &\leq &2|x_u^1-x_v^1| + \frac{1}{2} \sum_{j \neq 1} 2 |x_u^j-x_v^j| \\ &=& |x_u^1-x_v^1| + ||x_u-x_v||_1 \end{array}$

Last claim : ${|x_u^1-x_v^1| \leq \frac{1}{2} ||x_u - x_v||_1}$

Proof: Left as an Exercise. $\Box$

Now ${\Pr(e \text{ is cut}) \leq \frac{3}{2} ||x_u-x_v||_1}$, thus the expected value of the output is at most ${\frac{3}{2}}$ of the value of the LP. $\Box$

3. Multicut

Input: graph with edge costs ${c_e\geq 0}$, and ${k}$ terminal pairs, pairs of vertices ${(s_1,t_1),(s_2,t_2),\ldots ,(s_k,t_k)}$. Output: cheapest set of edges disconnecting every ${s_i}$ from ${t_i}$.

This problem is NP-hard for small ${k}$ (${k=2}$ ? ${k=3}$ ?)

Theorem 2 There is an ${4\ln (k+1)}$ approximation.

Proof: As before, it is all about finding the right LP and rounding it.

(LP relaxation) We use the following one :

minimize ${\sum_{e\in E}c_ex_e}$

$\displaystyle \begin{array}{rcl} \text{subject to} \begin{cases} \sum_{ e \in p} x_e \geq 1 & \forall \text{ i } \forall \text{ path }p\text{ from }s_i \text{ to }t_i \\ 0 \leq x_e \leq 1 & \forall e\in E \end{cases} \end{array}$

(Rounding : ) For ${i}$ going from 1 to k: if ${s_i}$ is not yet separated from ${t_i}$, then take a ball ${B}$ of radius ${r}$ around ${s_i}$, where ${r<1/2}$ is chosen to minimize ${c(B)/V(B)}$, with

$\displaystyle c(B)=\sum \{ c(uv), u\in B, v\notin B \}, \text{ and }$

$\displaystyle V(B) = V_0 + \sum_{e \in \mathcal{B}(s,r)} c_e x_e + \sum_{e \text{ crossing the ball}} c_e x_e \times (\text{ fraction of edge inside}).$

(it is different for each ${s_i}$ !). Add ${B}$ as a new part of the partition of ${V}$ that we are constructing, and remove ${B}$ from ${G}$.

(Analysis:) Feasibility: since ${r<1/2}$, by the LP constraints on ${(s_i,t_i)}$ we can be sure that no couple ${(s_i,t_i)}$ will be in a ball corresponding to ${s_j}$ for ${j\neq i}$, so the output is a feasible multicut.

Cost: note that ${dV(r)\geq c(r)dr}$, so ${V'(r)\geq c(r)}$, and so ${V'(r)/V(r)\geq c(r)/V(r)}$. Thus the ball chosen has ${c(r)/V(r)\leq f'(r)}$, where we define ${f(r)=\ln V(r)}$. It only remains to use the Mean Value Theorem for ${f}$.

We have ${f(0)=\ln V_0}$ and ${f(1/2)= \ln V(\frac{1}{2}) \leq \ln (V_0 + OPT)}$, hence the Mean Value Theorem gives

$\displaystyle \exists r<\frac{1}{2} \text{ s.t. } f'(r) \leq \frac{\ln (1+OPT/V_0)}{\frac{1}{2}-0}$

Thus each ball ${B_i}$ added to the partition during iteration ${i}$ increases the cost of the multicut by ${c(B_i)\leq V(B_i) 2\ln (1+OPT/V_0)}$. Altogether, the final multicut has cost at most

$\displaystyle \sum_i V(B_i) 2\ln (1+OPT/V_0)=2(kV_0+OPT)\ln (1+OPT/V_0).$

Setting ${V_0=OPT/k}$ proves the Theorem. $\Box$