Need algorithm for optimization issue of distribution points among the set of attractor points

https://stackoverflow.com/questions/23205092

07-07-2023
|

题

We have the following optimization problem.

There are k (typically, k = 3-10) points on the plane, called attractors: A1, .., Ak. And there is the set of points of size N (N is big, usually 10K-100K+) on the same plane.

We need to tie N1 points to the 1st attractor, N2 - to the 2nd, ..., Nk to the k-th. All points should be tied: N1 + ... + Nk = N.

The target function is like "points are tied to the nearest attractor and attractor has tied nearest points to itself":

Let Pij be j-th point tied to i-th attractor (i=1..k; j=1..Nk).

Let Lk be sum of distances from k-th attractor to points that are tied to it:
Lk = dist(Pk1, A1) + dist(Pk2, A2) + ... + dist(PkN1, A1).
Let f be total sum of distances: f = L1 + .. + Lk.

We need to minimize f.

Can someone provide some advice on how to implement that?
Or maybe there exist some known algorithm to do that?

UPD: This problem may be reduced to assignment problem and solved by hungarian algorithm. And it's special case of minimum cost flow problem.

解决方案

This is a linear program that can be solved by general LP solvers.

It can also be modelled more specifically as a min-cost max-flow problem: Put the attractors on the left side of a bipartite graph and the points on the right side.

Connect every attractor i on the left side to the source via an edge of capacity N_i and cost 0.
Connect every point i on right side to the sink via an edge of capacity 1 and cost 0.
Add edges from every attractor to every point with capacity 1 and the cost being their distance.

The minimum-cost max-flow in this graph is the solution to your problem. You can solve this for example using the method of successive shortest paths with Dijkstra.

Unfortunately I don't know theoretical bounds of this approach for the very specific properties of the graph (bipartite and total capacity n on the left side). I think the worst case for successive shortest paths is (polylog)quadratic, but in practice it should be much faster. Maybe somebody with more experience with linear programming and network optimization can say more about what the best algorithm would be to solve the arising cost-flow problem.

许可以下： CC-BY-SA 和归因

不隶属于 StackOverflow