1. Moments, sums of squares and optimization
1.1. Summary of previous courses
Recall that in a polynomial optimization problem I call primal problem the minimization of over the space of probability measures on , and dual problem the maximization of real numbers such that on .
Real agebraic geometry provides us with efficient characterizations of
- finite Borel measures supported on ;
- polynomials positive on ;
when is basic semialgebraic.
Kindler: can one give dual proofs of these characterizations ? Answer: not that I know. The characterization of measures is functional analysis (spectral theory of selfadjoint operators). The characterization of polynomials positive on is purely algebraic.
Relaxation of the primal problem. A nonnegative sequence is the sequence of moments of a finite measure supported on iff the associated symmetric moment matrix is positive semidefinite. Therefore the global optimization problem is equivalent to the following SDP: if ,
minimize (over ) ,
subject to constraints
, , positive semidefinite, .
The relaxation corresponds to sticking to one value of .
Relaxation of the dual problem.
maximize (over and polynomials ) ,
subject to constraints
where are sums of squares.
Note that there is a correspondance between polynomials of degree which are sums of squares and positive semidefinite matrices of size , so this is indeed an SDP.
The relaxation corresponds to sticking to polynomials such that degree, degree.
Theorem 1 (Lasserre) Assume that . If has non empty interior, .
1.2. Detecting global optimality and extracting solutions
In the primal setting, the ranks of the semidefinite matrices tell us wether we can stop or go to higher degree polynomials (i.e. larger matrices).
Theorem 2 (Lasserre) Let be an optimal solution of and let . If
- One may extract global minimizers
Advertisment for GloptiPoly 3, a software package devoting to optimization of polynomials, which detects global optimality and extracts solutions, http://www/laas.fr/~henrion/software
It does well to up to 10 variables, degree 8.
This is the case when . Write
The binary constraints yield equations on moments. This simplifies the moment matrix: only rows and columns corresponding to square free monomials need to be taken into account. So stabilizes after (often, much earlier).
1.4. Zeros of polynomials
Usually, to compute the zero set of an ideal of polynomials, people compute a Gröbner basis of the ideal, check that it is -dimensional. If so, then there is a procedure to compute complex zeroes. Check which are reals. This may represent a huge loss of energy. If complex zero set is higher dimensional, hopeless. Whereas
Example 1 Solve in .
Compute the moment matrix. The equation makes the matrix be equal to the moment matrix of the Dirac mesure at , showing that the only solution is .
1.5. Putinar versus Karush-Kuhn-Tucker
Theorem 3 (Karush, Kuhn, Tucker) For to be a local minimum, a necessary condition is that there exist nonnegative scalar multipliers such that
The condition is also necessary in the convex case.
Putinar’s theorem tells us that is a sum of squares, so on all of . Evaluation at gives , thus all , . Let . Write Differentiating gives
and , . So the Putinar representation implies the Karush-Kuhn-Tucker necessary condition. Constraints which are not active at give rise to vanishing KKT scalar multipliers. Putinar replaces them by globally determined sum of squares polynomials, whose effect is not felt locally at but globally.
1.6. Example: convex envelope of a function
2. How to handle sparsity
2.1. No free lunch rule
The -th SDP relaxation has -variables and matrix size . SO very soon, SDP solvers are overflowed.
One should exploit symmetry. It seems that the algebraic method provides equivariant SDP’s of much smaller size than
One should exploit sparsity. In general, each of the constraints involves a small number of variables, and the objective function is a sum of polynomials, each depending on only variables. Kojima has developed a systematic procedure to detext these sparsity patterns (look for cliques in a chordal extension of a graph associated to the problem). The resulting -th SDP relaxation has at most variables and matrix size .
Theorem 4 (Lasserre) Let denote the sets of variables involved in the constraint and in . Assume that for every ,
Assume that on . Then there exist sums of squares such that
where depends on the same set of variables as , and is a sum of polynomials in sets ‘s of variables.
Kojima’s software gives good results with and up to .