Notes from Assaf Naor’s talk, January 17th

 

Approximate kernel clustering

This talk covers two papers with Subhash Khot, [KN1] and [KN2], plus a recent update. The work was motivated by practitioners, Song, Smola, Gretton, and Borgwardt, who came up with nice experiments and raised a beautiful question.

 

1. Statistical motivation

 

Let {X_1 ,\ldots,X_n} be random variables and let {A=(a_{ij})} be their covariance matrix, called the kernel. We want to cluster {A}, i.e. find a partition {S_1 ,\ldots,S_k} of {[n]} and form the much smaller matrix

\displaystyle  \begin{array}{rcl}  C_{ij}=\sum_{p\in S_i ,\,q\in S_j} a_{pq}. \end{array}

Given a small {k\times k} positive semidefinite matrix {B}, called the hypothesis matrix, find a clustering of {A} such that {trace(BC)} is maximal. Notation: {Clust(A|B)=\max\{trace(BC)\,;\,\textrm{all partitions}\}}.
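To make the objective concrete, here is a minimal brute-force sketch, not from the papers (the name clust_brute_force is mine); the enumeration over all {k^n} labelings is exponential, so this only illustrates the definition on tiny instances.

```python
import itertools
import numpy as np

def clust_brute_force(A, B):
    """Maximize trace(B C) over all assignments of [n] into k clusters."""
    n, k = A.shape[0], B.shape[0]
    best = -np.inf
    for labels in itertools.product(range(k), repeat=n):
        P = np.zeros((n, k))
        P[np.arange(n), list(labels)] = 1   # P[p, i] = 1 iff p lies in S_i
        C = P.T @ A @ P                     # C_ij = sum_{p in S_i, q in S_j} a_pq
        best = max(best, np.trace(B @ C))
    return best
```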

This contains many problems, related for instance to locating low-dimensional manifolds in large data sets.

Question: is there a polytime algorithm to compute (or approximate) {Clust(A|B)}?

Theorem 1 Exact computation is NP-hard, but there exists a polytime algorithm which computes {Clust(A|B)} up to the factor {1+\frac{3\pi}{2}}.

 

2. Towards optimal results?

 

Say {A} is centered if {\sum_{ij}a_{ij}=0}. When {A} is centered, we believe that our results are optimal.

Example 1 {B=} identity matrix. Then the goal is to find a partition which maximizes the sum of correlations within each piece.

 

Example 2 {B=\begin{pmatrix} 1&-1 \\ -1&1 \end{pmatrix}}. Then the goal is the positive semidefinite Grothendieck problem.

There is a natural SDP relaxation, which consists in replacing the signs {\epsilon_i =\pm 1} by inner products between unit vectors. Grothendieck showed that the integrality gap of this relaxation is bounded, and {\geq \pi/2}. Later on, Rietz showed that it is equal to {\pi/2}. For general {n\times n} matrices {A}, the best {K} is {\Theta(\log n)} (upper bound by Nemirovski et al., lower bound by Alon and the Makarychev brothers).
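For illustration, here is a hedged sketch, not from the talk, of this relaxation together with standard random-hyperplane rounding for the positive semidefinite Grothendieck problem; it assumes the cvxpy package for solving the SDP, and the function name is mine.

```python
import cvxpy as cp
import numpy as np

def psd_grothendieck(A, rounds=100, seed=0):
    """SDP relaxation of max_{eps=+-1} sum a_ij eps_i eps_j, plus rounding."""
    n = A.shape[0]
    X = cp.Variable((n, n), PSD=True)              # X_ij = x_i . x_j, |x_i| = 1
    prob = cp.Problem(cp.Maximize(cp.trace(A @ X)), [cp.diag(X) == 1])
    sdp_value = prob.solve()
    # Gram factorization X = V^T V: column i of V is the unit vector x_i.
    w, U = np.linalg.eigh(X.value)
    V = (U * np.sqrt(np.clip(w, 0, None))).T
    # Random-hyperplane rounding: eps_i = sign(<g, x_i>) for Gaussian g.
    rng = np.random.default_rng(seed)
    best = -np.inf
    for _ in range(rounds):
        eps = np.sign(V.T @ rng.standard_normal(n))
        best = max(best, eps @ A @ eps)
    return sdp_value, best
```

For positive semidefinite {A}, Rietz’s theorem says the expected rounded value is at least {\frac{2}{\pi}} times the SDP value, and the factor {\frac{\pi}{2}} cannot be improved.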

Example 3 {A=} Laplacian of a graph, i.e. {a_{ii}=d_{i}}, {a_{ij}=-1} if {ij\in E}, {=0} otherwise.

{trace(BC)=4E(S,S^c)} for the partition {\{S,S^c\}}, so {Clust(A|\begin{pmatrix} 1&-1 \\ -1&1 \end{pmatrix})=4\,}MAX CUT. So CLUST inherits the inapproximability threshold of MAX CUT.
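A small self-contained check of this identity, not from the talk: on the 4-cycle, whose maximum cut has 4 edges, the brute-force value of {Clust(A|B)} is {16=4\times 4}.

```python
import itertools
import numpy as np

edges = [(0, 1), (1, 2), (2, 3), (3, 0)]           # the 4-cycle
n = 4
A = np.zeros((n, n))
for i, j in edges:                                  # Laplacian: a_ii = d_i,
    A[i, j] = A[j, i] = -1                          # a_ij = -1 on edges
    A[i, i] += 1
    A[j, j] += 1

B = np.array([[1.0, -1.0], [-1.0, 1.0]])
best = -np.inf
for labels in itertools.product((0, 1), repeat=n):
    P = np.eye(2)[list(labels)]                     # n x 2 indicator matrix
    best = max(best, np.trace(B @ P.T @ A @ P))

print(best)                                         # 16.0 = 4 * MAXCUT
```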

Theorem 2 Assume {A} is centered and {B} is either centered and spherical (i.e. with {1}‘s on the diagonal) or the identity matrix. Then there exists a polytime algorithm that computes {Clust(A|B)} up to the factor

  • If {k=2}, {\frac{\pi}{2}}, sharp assuming UGC.
  • If {k=3}, {\frac{16\pi}{27}}, sharp assuming UGC.
  • If {k\geq 4}, {\frac{8\pi}{9}(1-\frac{1}{k})}.

The last value may be tight as well. It depends on a geometric conjecture which I discuss now.

 

3. A geometric puzzle

 

Puzzle: Partition the plane into 3 measurable sets {A,B,C}. Define the Gaussian moment of a set {A} as the vector {x_A =\int_A x\,d\gamma(x)}, i.e. the vector whose components are the integrals of the coordinate functions over {A} with respect to the Gaussian measure {\gamma}. For which partition of the plane is {|x_A|^2 +|x_B|^2 +|x_C|^2} maximal?

Theorem 3 The maximum is attained at partitions into 3 congruent {120^\circ} sectors.
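A Monte Carlo sanity check of this theorem, not from the talk: for the three {120^\circ} sectors the sum equals {\frac{9}{8\pi}\approx 0.358} (each moment has squared length {\frac{3}{8\pi}}), and the empirical moments reproduce this.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((10**6, 2))                 # standard Gaussian samples
theta = np.mod(np.arctan2(X[:, 1], X[:, 0]), 2 * np.pi)
sector = (theta // (2 * np.pi / 3)).astype(int)     # sector labels 0, 1, 2

total = 0.0
for s in range(3):
    moment = X[sector == s].sum(axis=0) / len(X)    # x_A = integral of x over A
    total += moment @ moment                        # |x_A|^2

print(total, 9 / (8 * np.pi))                       # both are about 0.358
```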

 

Next puzzle: partition {{\mathbb R}^k} into {k+1} measurable sets and ask the same question. We do not know the answer, but we formulate the Propeller Conjecture: only 3 pieces are non-empty, and they are products of {120^\circ} sectors in a {2}-plane with the complementary subspace.

Theorem 4 In a partition of {{\mathbb R}^{k-1}} which maximizes the sum of the squares of the Gaussian moments, the non-empty pieces are products of vector subspaces with pieces of a simplicial, conical partition in lower dimension.

If we could prove that the simplices are regular, the Propeller Conjecture would follow.

A similar phenomenon (only a constant number of pieces are necessary) already showed up in work of Keith Ball.

 

4. Back to the algorithm for {Clust(A|B)}

 

Given a positive semidefinite matrix {B}, view it as a Gram matrix, {b_{ij}=v_i \cdot v_j}. Let {R(B)} be the radius of the smallest Euclidean ball containing all the {v_i}‘s. Let {C(B)} be the maximum of {\sum_{ij}b_{ij}z_{i}\cdot z_{j}} over all vectors {z_i} arising as the Gaussian moments of the pieces of a partition of Euclidean space.

Theorem 5 There is a polytime algorithm which, given {A}, computes {Clust(A|B)} up to the factor {\frac{R(B)^2}{C(B)}}. Assuming the UGC, no polytime algorithm can do better.
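A sanity check, not from the talk: for {B=} the {3\times 3} identity matrix, the Gram vectors are the orthonormal basis {e_1 ,e_2 ,e_3}, and the smallest ball containing them is centered at their centroid, so {R(B)^2 =|e_1 -\frac{1}{3}(e_1 +e_2 +e_3)|^2 =\frac{2}{3}}. By Theorem 3, the three {120^\circ} sectors give {C(B)=\frac{9}{8\pi}}, whence

\displaystyle  \begin{array}{rcl}  \frac{R(B)^2}{C(B)}=\frac{2}{3}\cdot\frac{8\pi}{9}=\frac{16\pi}{27}, \end{array}

recovering the {k=3} factor of Theorem 2.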

 

For {B=B(c)=\begin{pmatrix} 1&0&0 \\ 0&1&0 \\ 0&0&c \end{pmatrix}}, the UGC threshold is

\displaystyle  \begin{array}{rcl}  \frac{4\pi c (1+c)^2}{(1+2c)^3}&\quad&\textrm{if }c\geq\frac{1}{2},\\ \frac{\pi (1+c)^2}{2+4c}&\quad&\textrm{if }c\leq\frac{1}{2}. \end{array}
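A quick consistency check, not from the talk: at {c=\frac{1}{2}} both expressions equal {\frac{9\pi}{16}}, and at {c=1}, where {B(1)} is the identity, the first expression gives {\frac{4\pi\cdot 2^2}{3^3}=\frac{16\pi}{27}}, matching the {k=3} factor of Theorem 2.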

 

4.1. Connection with Grothendieck’s constant

 

Grothendieck’s inequality is

\displaystyle  \begin{array}{rcl}  \max_{x_i \in S^{k-1}}\sum_{ij}a_{ij}x_i \cdot x_j \leq K\,\max_{\epsilon_i =\pm 1}\sum_{ij}a_{ij}\epsilon_i \epsilon_j . \end{array}

Given {k} vectors {v_j} in {{\mathbb R}^k}, there is a generalization

\displaystyle  \begin{array}{rcl}  \max_{x_i \in S^{k-1}}\sum_{ij}a_{ij}x_i \cdot x_j \leq K\,\max_{u_i \in \{v_1 ,\ldots,v_k \}}\sum_{ij}a_{ij}u_i \cdot u_j . \end{array}

Theorem 6 The best {K} in the generalized Grothendieck inequality above is {1/C(B)} where {B=(v_i \cdot v_j)_{ij}}.

 

Proof: Project… \Box
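An illustration, not from the talk: for {B=\begin{pmatrix} 1&-1 \\ -1&1 \end{pmatrix}}, take {v_1 =-v_2} of unit length; the right-hand maximum then ranges over signs, and the inequality reduces to the positive semidefinite Grothendieck inequality of Example 2. For the partition of {{\mathbb R}^m} into the half-spaces {\{x\cdot u>0\}} and {\{x\cdot u<0\}}, the Gaussian moments are {z_1 =-z_2 =\frac{u}{\sqrt{2\pi}}}, so {\sum_{ij}b_{ij}z_i \cdot z_j =|z_1 -z_2 |^2 =\frac{2}{\pi}}, and one checks that half-spaces are optimal. Hence {C(B)=\frac{2}{\pi}} and {K=\frac{1}{C(B)}=\frac{\pi}{2}}, in agreement with Rietz’s theorem.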

 

4.2. Reduction from UGC to CLUST

 

Clustering amounts to maximizing over maps {[n]\rightarrow[k]}. The reduction follows the dictatorship-test paradigm: assign to every function {f:[k]^n \rightarrow [k]} a number {OBJ(f)}, and show that every coordinate function satisfies {OBJ(x_i)\geq a}, whereas if {f} has no influential coordinate, then {OBJ(f)\leq b+o(1)}. This implies that the UGC inapproximability threshold is {a/b}.

 

References

 

[KN1] Khot, Subhash; Naor, Assaf: Approximate kernel clustering. Mathematika 55 (2009), no. 1-2, 129–165.

[KN2] Khot, Subhash; Naor, Assaf: Sharp kernel clustering algorithms and their associated Grothendieck inequalities. Proceedings of SODA 2010.

 
