Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Algorithm for fitting points to a grid

Tags:

algorithm

I have a list of points in 2D space that form an (imperfect) grid:

x     x        x         x

 x    x       x           x

                 x
x      x                x

 x     x      x           x

What's the best way to fit these to a rigid grid (i.e. create a two-dimendional array and work out where each point fits in that array)?

There are no holes in the grid, but I don't know in advance what its dimensions are.

EDIT: The grid is not necessarily regular (not even spacing between rows/cols)

like image 836
user1958735 Avatar asked Jan 08 '13 16:01

user1958735


4 Answers

A little bit of an image processing approach: If you think of what you have as a binary image where the X is 1 and the rest is 0, you can sum up rows and columns, and use a peak finding algorithm to identify peaks which would correspond to x and y lines of the grid:

Your points as a binary image:

enter image description here

Sums of row/columns

enter image description here

Now apply some smoothing technique to the signal (e.g. lowess):

enter image description here

I'm sure you get the idea :-)

Good luck

like image 104
Omar Wagih Avatar answered Nov 13 '22 21:11

Omar Wagih


The best I could come up with is a brute-force solution that calculates the grid dimensions that minimize the error in the square of the Euclidean distance between the point and its nearest grid intersection.

This assumes that the number of points p is exactly equal to the number of columns times the number of rows, and that each grid intersection has exactly one point on it. It also assumes that the minimum x/y value for any point is zero. If the minimum is greater than zero, just subtract the minimum x value from each point's x coordinate and the minimum y value from each point's y coordinate.

The idea is to create all of the possible grid dimensions given the number of points. In the example above with 16 points, we would make grids with dimensions 1x16, 2x8, 4x4, 8x2 and 16x1. For each of these grids we calculate where the grid intersections would lie by dividing the maximum width of the points by the number of columns minus 1, and the maximum height of the points by the number of rows minus 1. Then we fit each point to its closest grid intersection and find the error (square of the distance) between the point and the intersection. (Note that this only works if each point is closer to its intended grid intersection than to any other intersection.)

After summing the errors for each grid configuration individually (e.g. getting one error value for the 1x16 configuration, another for the 2x8 configuration and so on), we select the configuration with the lowest error.

Initialization:

  P is the set of points such that P[i][0] is the x-coordinate and
                                   P[i][1] is the y-coordinate
  Let p = |P| or the number of points in P
  Let max_x = the maximum x-coordinate in P
  Let max_y = the maximum y-coordinate in P
     (minimum values are assumed to be zero)

  Initialize min_error_dist = +infinity
  Initialize min_error_cols = -1

Algorithm:

  for (col_count = 1; col_count <= n; col_count++) {
     // only compute for integer # of rows and cols
     if ((p % col_count) == 0) {   
        row_count = n/col_count;

        // Compute the width of the columns and height of the rows
        // If the number of columns is 1, let the column width be max_x
        // (and similarly for rows)
        if (col_count > 1) col_width = max_x/(col_count-1);
        else col_width=max_x;
        if (row_count > 1) row_height = max_y/(row_count-1);
        else row_height=max_y;

        // reset the error for the new configuration
        error_dist = 0.0;
        for (i = 0; i < n; i++) {
           // For the current point, normalize the x- and y-coordinates
           // so that it's in the range 0..(col_count-1)
           //                       and 0..(row_count-1)
           normalized_x = P[i][0]/col_width;
           normalized_y = P[i][1]/row_height;

           // Error is the sum of the squares of the distances between 
           // the current point and the nearest grid point 
           // (in both the x and y direction)
           error_dist += (normalized_x - round(normalized_x))^2 +
                         (normalized_y - round(normalized_y))^2;
        }

        if (error_dist < min_error_dist) {
           min_error_dist = error_dist;
           min_error_cols = col_count;
        }
     }
  }
  return min_error_cols;

Once you've got the number of columns (and thus the number of rows) you can recompute the normalized values for each point and round them to get the grid intersection they belong to.

like image 34
beaker Avatar answered Nov 13 '22 21:11

beaker


In the end I used this algorithm, inspired by beaker's:

  1. Calculate all the possible dimensions of the grid, given the total number of points
  2. For each possible dimension, fit the points to that dimension and calculate the variance in alignment:
    • Order the points by x-value
    • Group the points into columns: the first r points form the first column, where r is the number of rows
    • Within each column, order the points by y-value to determine which row they're in
    • For each row/column, calcuate the range of y-values/x-values
    • The variance in alignment is the maximum range found
  3. Choose the dimension with the least variance in alignment
like image 2
user1958735 Avatar answered Nov 13 '22 21:11

user1958735


I wrote this algorithm that accounts for missing coordinates as well as coordinates with errors.

Python Code

# Input [x, y] coordinates of a 'sparse' grid with errors
xys = [[103,101],
       [198,103],
       [300, 99],
       [ 97,205],
       [304,202],
       [102,295],
       [200,303],
       [104,405],
       [205,394],
       [298,401]]

def row_col_avgs(num_list, ratio):
    # Finds the average of each row and column. Coordinates are
    # assigned to a row and column by specifying an error ratio.
    last_num = 0
    sum_nums = 0
    count_nums = 0
    avgs = []
    num_list.sort()
    for num in num_list:
        if num > (1 + ratio) * last_num and count_nums != 0:
            avgs.append(int(round(sum_nums/count_nums,0)))
            sum_nums = num
            count_nums = 1
        else:
            sum_nums = sum_nums + num
            count_nums = count_nums + 1
        last_num = num
    avgs.append(int(round(sum_nums/count_nums,0)))
    return avgs

# Split coordinates into two lists of x's and y's
xs, ys = map(list, zip(*xys))

# Find averages of each row and column within a specified error.
x_avgs = row_col_avgs(xs, 0.1)
y_avgs = row_col_avgs(ys, 0.1)

# Return Completed Averaged Grid
avg_grid = []
for y_avg in y_avgs:
    avg_row = []
    for x_avg in x_avgs:
        avg_row.append([int(x_avg), int(y_avg)])
    avg_grid.append(avg_row)

print(avg_grid)

Code Output

[[[102, 101], [201, 101], [301, 101]], 
 [[102, 204], [201, 204], [301, 204]], 
 [[102, 299], [201, 299], [301, 299]], 
 [[102, 400], [201, 400], [301, 400]]]

I am also looking for another solution using linear algebra. See my question here.

like image 1
Jakub Avatar answered Nov 13 '22 21:11

Jakub