Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to partition 2D-points into intervals (using only vertical lines)?

Tags:

So I have a 2D scatter filled with points (x,y). I want to draw k vertical lines (x_1 = a, x_2 = b, ..., x_k = k), so as to partition the points into k groups.

The optimal solution would minimize the average variance of each group's y_value.

What is the appropriate algorithm? It sounded like k-means but I have the constraint that the lines must be vertical.

like image 452
Geo Tsaousoglou Avatar asked May 22 '19 08:05

Geo Tsaousoglou


1 Answers

Here is an idea based on dynamic programming.
With the following notations: (x_1, y_1), ..., (x_n, y_n) the points, with x_1 <= x_2 <= ... <= x_n to cut in K groups.
Var(i, j) the variance of the y's: y_i, ..., y_j.
F_K((x_1,y_1), ..., (x_n, y_n)) = F_k(1,n) the value of the best solution for the problem.

Then we have the following:
F_k(i,j) = min for l in i...j-k+1 of (Var(i,l) + F_(k-1)(l+1, j) and
F_1(i,j) = Var(i,j).

The property above simply means that the best way to split your points in 'k' groups is to select the leftmost cut (the choice of l), and the best choice of k-1 cuts for the remaining points.

From there you can go for a dynamic program. You'll need a 3D array A of dimensions n*n*K to store the value of F_k(i,j) for all i,j,k. The program would look like:

function get_value(P: points, A: 3D array, i, j, k){
  if A[i][j][k] is defined{
    result = A[i][j][k]
  } else if k == 1 {
    A[i][j][k] = get_var(P, i, j)
    result = A[i][j][k] 
  } else {
    result = +INF
    for l in i ... j-k+1 {
      tmp = get_value(P, A, i, l, 1) + get_value(P, A, l+1, j, k-1)
      if tmp < result {
        result = tmp
      }
    }
  }
  return result
}

NB: I was a bit quick about the range to iterate on for l, that might be something to look into.

like image 54
m.raynal Avatar answered Oct 04 '22 23:10

m.raynal