**Approximate kernel clustering**

Two papers with Subhash Khot, [KN1], [KN2], plus a recent update. Motivated by practitioners (Song, Smola, Gretton, Borgwardt), who came up with nice experiments and raised a beautiful question.

**1. Statistical motivation**

Let $X_1,\dots,X_n$ be random variables with covariance matrix $A=(a_{ij})$, called the *kernel*. We want to cluster them, i.e. find a partition $S_1,\dots,S_k$ of $\{1,\dots,n\}$ with $k\ll n$, and form the much smaller $k\times k$ matrix $C=(c_{pq})$, where $c_{pq}=\sum_{i\in S_p,\,j\in S_q}a_{ij}$.

Given a small, $k\times k$, positive semidefinite matrix $B=(b_{pq})$, called the *hypothesis matrix*, find a clustering of $A$ such that $\sum_{p,q}b_{pq}c_{pq}$ is maximal. Notation: $\mathrm{Clust}(A|B)=\max_{S_1,\dots,S_k}\sum_{p,q=1}^{k}b_{pq}\sum_{i\in S_p,\,j\in S_q}a_{ij}$.

This contains many problems, related for instance to locating low dimensional manifolds in large data spaces.
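To make the objective concrete, $\mathrm{Clust}(A|B)$ can be evaluated by exhaustive search on tiny instances: a clustering is just an assignment $\sigma$ of indices to clusters, with value $\sum_{i,j}a_{ij}b_{\sigma(i)\sigma(j)}$. A minimal brute-force sketch (the matrices below are illustrative, and the search is exponential in $n$):

```python
import itertools
import numpy as np

def clust_brute_force(A, B):
    """Exact Clust(A|B) for tiny n: maximize sum_{i,j} a_ij * b_{sigma(i),sigma(j)}
    over all assignments sigma: {1..n} -> {1..k}. Exponential time, illustration only."""
    n, k = A.shape[0], B.shape[0]
    best = -np.inf
    for sigma in itertools.product(range(k), repeat=n):
        s = np.array(sigma)
        val = (A * B[np.ix_(s, s)]).sum()  # sum_{i,j} a_ij * b_{sigma(i) sigma(j)}
        best = max(best, val)
    return best

# Example: A = all-ones (perfectly correlated variables), B = Id_2.
# Putting everything in one cluster is then optimal.
A = np.ones((3, 3))
print(clust_brute_force(A, np.eye(2)))  # 9.0
```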

Question: is there a polytime algorithm to compute (or approximate) $\mathrm{Clust}(A|B)$?

**Theorem 1.** Exact computation of $\mathrm{Clust}(A|B)$ is NP-hard, but there exists a polytime algorithm which computes it up to a constant factor.

**2. Towards optimal results?**

Say a matrix is centered if the sum of all its entries vanishes, $\sum_{i,j}a_{ij}=0$. In case $A$ is centered, we believe that our results are optimal.

**Example 1.** $B=\mathrm{Id}_k$, the identity matrix. Then the goal is to find a partition which maximizes the sum of correlations within each piece.

**Example 2.** $k=2$ and $B=\begin{pmatrix}1&-1\\-1&1\end{pmatrix}$. Then the goal is the positive semidefinite Grothendieck problem: maximize $\sum_{i,j}a_{ij}\varepsilon_i\varepsilon_j$ over signs $\varepsilon_i=\pm1$.

There is a natural semidefinite relaxation, which consists in replacing the signs $\varepsilon_i\varepsilon_j$ by inner products $\langle x_i,x_j\rangle$ between unit vectors. Grothendieck showed that the integrality gap of this SDP relaxation is bounded. Later on, Rietz showed that it is equal to $\pi/2$. For general (not necessarily positive semidefinite) matrices $A$, the best factor is of order $\log n$ (Nemirovski et al. for the upper bound, Alon and the Makarychev brothers for the lower bound).
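The classical rounding of such a vector solution is by a random Gaussian hyperplane: sample $g$ and set $\varepsilon_i=\mathrm{sign}\langle x_i,g\rangle$, so that $\mathbb{E}[\varepsilon_i\varepsilon_j]=\frac{2}{\pi}\arcsin\langle x_i,x_j\rangle$ (Grothendieck's identity). A Monte Carlo sketch of that identity; the unit vectors below are arbitrary stand-ins for an SDP solution:

```python
import numpy as np

rng = np.random.default_rng(0)

# Three arbitrary unit vectors in R^2, standing in for an SDP solution x_1, x_2, x_3.
X = np.array([[1.0, 0.0],
              [0.6, 0.8],
              [0.0, 1.0]])

# Gaussian hyperplane rounding: eps_i = sign(<x_i, g>) for g ~ N(0, Id).
N = 200_000
G = rng.standard_normal((N, 2))
signs = np.sign(G @ X.T)          # N x 3 matrix of rounded signs
emp = (signs.T @ signs) / N       # empirical E[eps_i * eps_j]

# Grothendieck's identity: E[eps_i eps_j] = (2/pi) * arcsin(<x_i, x_j>).
exact = (2 / np.pi) * np.arcsin(np.clip(X @ X.T, -1, 1))
print(np.abs(emp - exact).max())  # small Monte Carlo error
```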

**Example 3.** $A$ is the Laplacian of a graph $G$, i.e. $a_{ii}=\deg(i)$, $a_{ij}=-1$ if $ij\in E$, $a_{ij}=0$ otherwise.

With $B$ as in Example 2, $\varepsilon^{\top}A\varepsilon=\sum_{uv\in E}(\varepsilon_u-\varepsilon_v)^2=4\,\mathrm{cut}(\varepsilon)$, so $\mathrm{Clust}(A|B)=4\,\mathrm{MAXCUT}(G)$. Hence there is some inapproximability threshold for CLUST.
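This identity is easy to check numerically: for a 2-labeling encoded by signs, $\varepsilon^{\top}L\varepsilon$ counts each cut edge four times. A sketch on the triangle $K_3$ (whose max cut is 2), with brute force over all labelings:

```python
import itertools
import numpy as np

# Laplacian of the triangle K3.
n = 3
edges = [(0, 1), (1, 2), (0, 2)]
L = np.zeros((n, n))
for u, v in edges:
    L[u, u] += 1; L[v, v] += 1
    L[u, v] -= 1; L[v, u] -= 1

B = np.array([[1.0, -1.0], [-1.0, 1.0]])

# Brute-force Clust(L|B): max over 2-labelings s of sum_{i,j} L_ij * b_{s_i s_j},
# which is exactly max over signs eps of eps^T L eps.
best = max((L * B[np.ix_(s, s)]).sum()
           for s in itertools.product(range(2), repeat=n))

maxcut = max(sum(1 for u, v in edges if s[u] != s[v])
             for s in itertools.product(range(2), repeat=n))

print(best, 4 * maxcut)  # both equal 8
```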

**Theorem 2.** Assume $A$ is centered, and $B$ is either centered and spherical (i.e. $1$'s on the diagonal) or the identity matrix. Then there exists a polytime algorithm that computes $\mathrm{Clust}(A|B)$ up to the following factor:

- If $k=2$: $\pi/2$, sharp assuming UGC.
- If $k=3$: $16\pi/27$, sharp assuming UGC.
- If $k\geq 4$: $\frac{8\pi}{9}\left(1-\frac{1}{k}\right)$.

The last value may be tight as well. It depends on a geometric conjecture which I discuss now.

**3. A geometric puzzle**

Puzzle: Partition the plane into 3 measurable sets $A_1,A_2,A_3$. Define the *Gaussian moment* of a set $A$ as the vector $z(A)=\int_A x\,d\gamma_2(x)$, whose components are the integrals of the coordinate functions with respect to the Gaussian measure $\gamma_2$. For which partition of the plane is $\|z(A_1)\|^2+\|z(A_2)\|^2+\|z(A_3)\|^2$ maximal?

**Theorem 3.** The maximum is attained at partitions into 3 equal ($120°$) sectors.
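One can check this numerically: the three $120°$ sectors give $\sum\|z(A_i)\|^2 = 9/(8\pi)\approx 0.358$, beating for instance the two half-planes, which give $1/\pi\approx 0.318$. A Monte Carlo sketch, using that the Gaussian moment of a set is just $\mathbb{E}[x\,\mathbf{1}_A(x)]$ for a standard Gaussian $x$:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 400_000
pts = rng.standard_normal((N, 2))

def moment_sum(labels, k):
    """Sum over the pieces of |E[x * 1_{A_i}]|^2, estimated from Gaussian samples."""
    total = 0.0
    for i in range(k):
        z = pts[labels == i].sum(axis=0) / N   # Gaussian moment of piece i
        total += z @ z
    return total

# Three 120-degree sectors (the optimal "propeller" shape).
angles = np.arctan2(pts[:, 1], pts[:, 0])
sectors = ((angles + np.pi) // (2 * np.pi / 3)).astype(int) % 3
print(moment_sum(sectors, 3), 9 / (8 * np.pi))   # ~0.358 in both cases

# Two half-planes, for comparison.
halves = (pts[:, 0] > 0).astype(int)
print(moment_sum(halves, 2), 1 / np.pi)          # ~0.318 in both cases
```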

Next puzzle: Partition $\mathbb{R}^n$ into $n+1$ measurable sets. Similar question. We do not know the answer, but we formulate the Propeller Conjecture: only 3 pieces are needed (the others may be taken empty), and they are products of $2$-dimensional $120°$ sectors with complementary subspaces.

**Theorem 4.** In a partition of $\mathbb{R}^n$ which maximises the sum of the squares of the Gaussian moments, the non-empty pieces are products of vector subspaces with the pieces of a simplicial, conical partition in lower dimension.

If we could prove that the simplices are regular, the Propeller Conjecture would follow.

A similar phenomenon (only a constant number of pieces are necessary) already showed up in work of Keith Ball.

**4. Back to the algorithm for CLUST**

Given a positive semidefinite matrix $B$, view it as a Gram matrix: $b_{pq}=\langle v_p,v_q\rangle$. Let $R(B)$ be the radius of the smallest Euclidean ball containing all the $v_p$'s. Let $C(B)$ be the maximum of $\sum_{p,q}b_{pq}\langle z_p,z_q\rangle$ over all vectors $z_1,\dots,z_k$ arising as Gaussian moments of a partition of Euclidean space into $k$ pieces.
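For instance, for $B=\mathrm{Id}_k$ the Gram vectors are the standard basis $e_1,\dots,e_k$; by symmetry the smallest enclosing ball is centered at their centroid, giving $R(B)^2=1-\frac1k$. A small check (the centroid center is only valid under such symmetry; a general $B$ would need a genuine minimum-enclosing-ball computation):

```python
import numpy as np

def radius_sq_centroid(B):
    """R(B)^2 when the smallest enclosing ball of the Gram vectors of B is
    centered at their centroid (true for B = Id_k by symmetry, NOT in general)."""
    V = np.linalg.cholesky(B)   # rows of V are Gram vectors: B = V V^T
    c = V.mean(axis=0)          # centroid of the Gram vectors
    return max((v - c) @ (v - c) for v in V)

for k in (2, 3, 5):
    print(k, radius_sq_centroid(np.eye(k)), 1 - 1 / k)  # the two values agree
```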

**Theorem 5.** There is a polytime algorithm which, given $A$ and $B$, computes $\mathrm{Clust}(A|B)$ up to the factor $R(B)^2/C(B)$. Assuming UGC, no polytime algorithm can do better.

For $B=\mathrm{Id}_k$, $k\geq 3$, the UGC threshold is $\frac{8\pi}{9}\left(1-\frac{1}{k}\right)$.

**4.1. Connection with Grothendieck's constant**

Grothendieck's inequality is
$$\sup_{\|x_i\|=\|y_j\|=1}\sum_{i,j}a_{ij}\langle x_i,y_j\rangle\;\leq\;K_G\,\sup_{\varepsilon_i,\delta_j=\pm1}\sum_{i,j}a_{ij}\varepsilon_i\delta_j,$$
where the vectors $x_i,y_j$ range over the unit sphere of a Hilbert space.

Given vectors $v_1,\dots,v_k$ in $\mathbb{R}^k$, there is a generalization (for positive semidefinite $A$):
$$\sup_{\|x_i\|=1}\sum_{i,j}a_{ij}\langle x_i,x_j\rangle\;\leq\;K\,\sup_{\sigma:\{1,\dots,n\}\to\{1,\dots,k\}}\sum_{i,j}a_{ij}\langle v_{\sigma(i)},v_{\sigma(j)}\rangle.$$

**Theorem 6.** The best $K$ in the generalized Grothendieck inequality above is $R(B)^2/C(B)$, where $B=(\langle v_p,v_q\rangle)$ is the Gram matrix of the $v_p$'s.

*Proof:* Project…

**4.2. Reduction from UGC to CLUST**

Clustering is like maximizing over maps $\sigma:\{1,\dots,n\}\to\{1,\dots,k\}$. The reduction amounts to assigning to all functions $f$ a number (a clustering value). Assume that all coordinate functions (dictators) have value $R(B)^2$. If $f$ has no influential coordinate, then it essentially induces a partition of Gaussian space, so its value is at most $C(B)$. This implies that the UGC threshold is $R(B)^2/C(B)$.

**References**

[KN1] Khot, Subhash; Naor, Assaf. *Approximate kernel clustering.* Mathematika **55** (2009), no. 1-2, 129–165.

[KN2] Khot, Subhash; Naor, Assaf. *Sharp kernel clustering algorithms and their associated Grothendieck inequalities.* SODA 2010.