Sorting as a linear program

https://cs.stackexchange.com/questions/4805

16-10-2019
|

Question

A surprising number of problems have fairly natural reductions to linear programming (LP). See Chapter 7 of [1] for examples such as network flows, bipartite matching, zero-sum games, shortest paths, a form of linear regression, and even circuit evaluation!

Since circuit evaluation reduces to linear programming, any problem in $P$ must have a linear programming formulation. Therefore, we have a "new" sorting algorithm, via reduction to a linear program. So, my questions are

What is the linear program that will sort an array of $n$ real numbers?
What is the running time of the reduce-to-LP-and-solve sorting algorithm?

Algorithms by S. Dasgupta, C. Papadimitriou and U. Vazirani (2006)

Solution

The following answer is basically equivalent to the one you already know, but may seem a bit less "magical". On the other hand, it's more technical, but I believe the general technique "write your problem as an optimization on permutation matrices and invoke Birkhoff-von Neumann" is a great one to know.

For a permutation $\sigma$ of $\{1, \ldots, n\}$ define the permutation matrix $P_\sigma$ as the 0-1 matrix such that $P_{ij} = 1$ if $j = \sigma(i)$ and $P_{ij}=0$ otherwise. This is simply the matrix which permutes the coordinates of a vector $x$ according to $\sigma$: if $y = P_\sigma x$ then $y_i = x_{\sigma(i)}$. I'll denote $y=P_{\sigma}x$ as $\sigma(x)$ from now on.

One more definition: a non-negative $n\times n$ matrix $M$ is doubly-stochastic if each one of its rows and each one of its columns sums to 1.

And one fact which is very important in combinatorial optimization - the Birkhoff-von Neumann theorem:

A matrix $M$ is doubly stochastic if and only if it is a convex combination of permutation matrices, i.e. if and only if there exist permutations $\sigma_1, \ldots, \sigma_k$ and positive reals $\alpha_1 , \ldots, \alpha_k$ such that $M = \sum_{i = 1}^k{\alpha_i P_{\sigma_i}}$ and $\sum{\alpha_i} = 1$.

Notice that a doubly stochastic matrix is defined by the inequalities

$$\forall i: \sum_{j = 1}^n{M_{ij}} = 1$$ $$\forall j: \sum_{i = 1}^n{M_{ij}} = 1$$ $$\forall i, j: M_{ij} \geq 0$$

All these inequalities taken together determine a polytope $P$, and the Birkhoff-von Neumann theorem states that the extremal points (vertices) of this polytope are all permutation matrices. From basic linear programming, we know this implies that any linear program which has the above inequalities as its constraints (and no other constraints) will have a permutation matrix as an optimal solution.

So given an input $a = (a_1, \ldots, a_n)$ to be sorted, we just need to come up with a linear objective $f_a(M)$ for which:

$f_a(P_\tau) < f_a(P_\sigma)$ if $\sigma(a)$ is sorted but $\tau(a)$ is not.

Then formulate a linear program with objective to maximize $f_a(M)$ and the inequalities above as constraints, and you are guaranteed that an optimal solution is the permutation matrix $P_\sigma$ for $\sigma$ such that $\sigma(a)$ is sorted. Of course, it's easy to "read off" $\sigma$ from $P_\sigma$.

One choice for $f_a(M)$ is $v^T M a$ where $v = (1, \ldots, n)$. Verify that

this is linear in $M$;
for $P_\sigma$, $f_a(P_\sigma) = \sum_{i = 1}^n{i a_{\sigma(i)}}$;
the above is maximized at the $\sigma$ for which $\sigma(a)$ is sorted (by contradiction: otherwise you could switch two coordinates of $\sigma(a)$ and achieve a higher value).

And voila, you have a linear program for sorting. Seems silly for sorting, but this is in fact a powerful method in optimization.

Licensed under: CC-BY-SA with attribution

Not affiliated with cs.stackexchange