Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Selecting evenly distributed points algorithm

Suppose there are 25 points in a line segment, and these points may be unevenly distributed (spatially) as the following figure shows: enter image description here

My question is how we can select 10 points among these 25 points so that these 10 points can be as spatially evenly distributed as possible. In the idea situation, the selected points should be something like this: enter image description here

EDIT: It is true that this question can become more elegant if I can tell the criterion that justify the "even distribution". What I know is my expection for the selected points: if I divide the line segment into 10 equal line segments. I expect there should be one point on each small line segment. Of course it may happen that in some small line segments we cannot find representative points. In that case I will resort to its neighboring small line segment that has representative point. In the next step I will further divide the selected neighboring segment into two parts: if each part has representative points, then the empty representative point problem will be solved. If we cannot find representative point in one of the small line segments, we can further divide it into smaller parts. Or we can resort to the next neighboring line segment.

EDIT: Using dynamic programming, a possible solution is implemented as follows:

#include <iostream>
#include <vector>
using namespace std;


struct  Note
{
    int previous_node;
    double cost;

};
typedef struct Note Note;

int main()
{

    double dis[25] = 
    {0.0344460805029088, 0.118997681558377, 0.162611735194631,
    0.186872604554379, 0.223811939491137, 0.276025076998578,
    0.317099480060861, 0.340385726666133, 0.381558457093008,
    0.438744359656398, 0.445586200710900, 0.489764395788231,
    0.498364051982143, 0.585267750979777, 0.646313010111265,
    0.655098003973841, 0.679702676853675, 0.694828622975817,
    0.709364830858073, 0.754686681982361, 0.765516788149002,
    0.795199901137063, 0.823457828327293, 0.950222048838355, 0.959743958516081};

    Note solutions[25];
    for(int i=0; i<25; i++)
    {
        solutions[i].cost = 1000000;
    }
    solutions[0].cost = 0;
    solutions[0].previous_node = 0;



    for(int i=0; i<25; i++)
    {
        for(int j= i-1; j>=0; j--)
        {
            double tempcost = solutions[j].cost + std::abs(dis[i]-dis[j]-0.1);
            if (tempcost<solutions[i].cost)
            {
                solutions[i].previous_node = j;
                solutions[i].cost = tempcost;
            }

        }
    }
    vector<int> selected_points_index;
    int i= 24;
    selected_points_index.push_back(i);
    while (solutions[i].previous_node != 0)
    {
        i = solutions[i].previous_node;
        selected_points_index.push_back(i);

    }
    selected_points_index.push_back(0);

    std::reverse(selected_points_index.begin(),selected_points_index.end());

    for(int i=0; i<selected_points_index.size(); i++)
        cout<<selected_points_index[i]<<endl;





    return 0;
}

The result are shown in the following figure, where the selected points are denoted as green:

enter image description here

like image 764
feelfree Avatar asked Aug 14 '13 08:08

feelfree


2 Answers

Until a good, and probably O(n^2) solution comes along, use this approximation:

Divide the range into 10 equal-sized bins. Choose the point in each bin closest to the centre of each bin. Job done.

If you find that any of the bins is empty choose a smaller number of bins and try again.

Without information about the scientific model that you are trying to implement it is difficult (a) to suggest a more appropriate algorithm and/or (b) to justify the computational effort of a more complicated algorithm.

like image 135
High Performance Mark Avatar answered Sep 28 '22 12:09

High Performance Mark


Let {x[i]} be your set of ordered points. I guess what you need to do is to find the subset of 10 points {y[i]} that minimizes \sum{|y[i]-y[i-1]-0.1|} with y[-1] = 0.

Now, if you see the configuration as a strongly connected directed graph, where each node is one of the 25 doubles and the cost for every edge is |y[i]-y[i-1]-0.1|, you should be able to solve the problem in O(n^2 +nlogn) time with the Dijkstra's algorithm.

Another idea, that will probably lead to a better result, is using dynamic programming : if the element x[i] is part of our soltion, the total minimum is the sum of the minimum to get to the x[i] point plus the minimum to get the final point, so you could write a minimum solution for each point, starting from the smallest one, and using for the next one the minimum between his predecessors.

Note that you'll probably have to do some additional work to pick, from the solutions set, the subset of those with 10 points.

EDIT

I've written this in c#:

  for (int i = 0; i < 25; i++)
  {
    for (int j = i-1; j > 0; j--)
    {
      double tmpcost = solution[j].cost + Math.Abs(arr[i] - arr[j] - 0.1);
      if (tmpcost < solution[i].cost)
      {
         solution[i].previousNode = j;
         solution[i].cost = tmpcost;
      }
    }
  }

I've not done a lot of testing, and there may be some problem if the "holes" in the 25 elements are quite wide, leading to solutions that are shorter than 10 elements ... but it's just to give you some ideas to work on :)

like image 24
Save Avatar answered Sep 28 '22 14:09

Save