Domanda

Suppose there are 25 points in a line segment, and these points may be unevenly distributed (spatially) as the following figure shows: enter image description here

My question is how we can select 10 points among these 25 points so that these 10 points can be as spatially evenly distributed as possible. In the idea situation, the selected points should be something like this: enter image description here

EDIT: It is true that this question can become more elegant if I can tell the criterion that justify the "even distribution". What I know is my expection for the selected points: if I divide the line segment into 10 equal line segments. I expect there should be one point on each small line segment. Of course it may happen that in some small line segments we cannot find representative points. In that case I will resort to its neighboring small line segment that has representative point. In the next step I will further divide the selected neighboring segment into two parts: if each part has representative points, then the empty representative point problem will be solved. If we cannot find representative point in one of the small line segments, we can further divide it into smaller parts. Or we can resort to the next neighboring line segment.

EDIT: Using dynamic programming, a possible solution is implemented as follows:

#include <iostream>
#include <vector>
using namespace std;


struct  Note
{
    int previous_node;
    double cost;

};
typedef struct Note Note;

int main()
{

    double dis[25] = 
    {0.0344460805029088, 0.118997681558377, 0.162611735194631,
    0.186872604554379, 0.223811939491137, 0.276025076998578,
    0.317099480060861, 0.340385726666133, 0.381558457093008,
    0.438744359656398, 0.445586200710900, 0.489764395788231,
    0.498364051982143, 0.585267750979777, 0.646313010111265,
    0.655098003973841, 0.679702676853675, 0.694828622975817,
    0.709364830858073, 0.754686681982361, 0.765516788149002,
    0.795199901137063, 0.823457828327293, 0.950222048838355, 0.959743958516081};

    Note solutions[25];
    for(int i=0; i<25; i++)
    {
        solutions[i].cost = 1000000;
    }
    solutions[0].cost = 0;
    solutions[0].previous_node = 0;



    for(int i=0; i<25; i++)
    {
        for(int j= i-1; j>=0; j--)
        {
            double tempcost = solutions[j].cost + std::abs(dis[i]-dis[j]-0.1);
            if (tempcost<solutions[i].cost)
            {
                solutions[i].previous_node = j;
                solutions[i].cost = tempcost;
            }

        }
    }
    vector<int> selected_points_index;
    int i= 24;
    selected_points_index.push_back(i);
    while (solutions[i].previous_node != 0)
    {
        i = solutions[i].previous_node;
        selected_points_index.push_back(i);

    }
    selected_points_index.push_back(0);

    std::reverse(selected_points_index.begin(),selected_points_index.end());

    for(int i=0; i<selected_points_index.size(); i++)
        cout<<selected_points_index[i]<<endl;





    return 0;
}

The result are shown in the following figure, where the selected points are denoted as green:

enter image description here

È stato utile?

Soluzione 2

Let {x[i]} be your set of ordered points. I guess what you need to do is to find the subset of 10 points {y[i]} that minimizes \sum{|y[i]-y[i-1]-0.1|} with y[-1] = 0.

Now, if you see the configuration as a strongly connected directed graph, where each node is one of the 25 doubles and the cost for every edge is |y[i]-y[i-1]-0.1|, you should be able to solve the problem in O(n^2 +nlogn) time with the Dijkstra's algorithm.

Another idea, that will probably lead to a better result, is using dynamic programming : if the element x[i] is part of our soltion, the total minimum is the sum of the minimum to get to the x[i] point plus the minimum to get the final point, so you could write a minimum solution for each point, starting from the smallest one, and using for the next one the minimum between his predecessors.

Note that you'll probably have to do some additional work to pick, from the solutions set, the subset of those with 10 points.

EDIT

I've written this in c#:

  for (int i = 0; i < 25; i++)
  {
    for (int j = i-1; j > 0; j--)
    {
      double tmpcost = solution[j].cost + Math.Abs(arr[i] - arr[j] - 0.1);
      if (tmpcost < solution[i].cost)
      {
         solution[i].previousNode = j;
         solution[i].cost = tmpcost;
      }
    }
  }

I've not done a lot of testing, and there may be some problem if the "holes" in the 25 elements are quite wide, leading to solutions that are shorter than 10 elements ... but it's just to give you some ideas to work on :)

Altri suggerimenti

Until a good, and probably O(n^2) solution comes along, use this approximation:

Divide the range into 10 equal-sized bins. Choose the point in each bin closest to the centre of each bin. Job done.

If you find that any of the bins is empty choose a smaller number of bins and try again.

Without information about the scientific model that you are trying to implement it is difficult (a) to suggest a more appropriate algorithm and/or (b) to justify the computational effort of a more complicated algorithm.

You can find approximate solution with Adaptive Non-maximal Suppression (ANMS) algorithm provided the points are weighted. The algorithm selects n best points while keeping them spatially well distributed (most spread across the space).

I guess you can assign point weights based on your distribution criterion - e.g. a distance from uniform lattice of your choice. I think the lattice should have n-1 bins for optimal result.

You can look up following papers discussing the 2D case (the algorithm can be easily realized in 1D):

  1. Turk, Steffen Gauglitz Luca Foschini Matthew, and Tobias Höllerer. "EFFICIENTLY SELECTING SPATIALLY DISTRIBUTED KEYPOINTS FOR VISUAL TRACKING."

  2. Brown, Matthew, Richard Szeliski, and Simon Winder. "Multi-image matching using multi-scale oriented patches." Computer Vision and Pattern Recognition, 2005. CVPR 2005. IEEE Computer Society Conference on. Vol. 1. IEEE, 2005.

The second paper is less related to your problem but it describes basic ANMS algorithm. The first papers provides faster solution. I guess both will do in 1D for a moderate amount of points (~10K).

Autorizzato sotto: CC-BY-SA insieme a attribuzione
Non affiliato a StackOverflow
scroll top