
Computing average grid size

I am trying to compute the average cell size for the following set of points, as seen in the picture (image: grid). The picture was generated using gnuplot:

gnuplot> plot "debug.dat" using 1:2

The points are almost aligned on a rectangular grid, but not quite. There seems to be a bias (jitter?) of roughly 10-15% along either X or Y. How would one efficiently compute a partition into tiles so that there is virtually only one point per tile? The tile size would be expressed as (tilex, tiley). I say "virtually" because the 10-15% jitter may have moved a point into an adjacent tile.

Just for reference, I have manually sorted (hopefully correct) and extracted the first 10 points:

 -133920,33480
 -132480,33476
 -131044,33472
 -129602,33467
 -128162,33463
 -139679,34576
 -138239,34572
 -136799,34568
 -135359,34564
 -133925,34562

Just for clarification, a valid tile as per the above description would be (1435,1060), but I am really looking for a quick automated way.

malat asked Dec 01 '14 12:12


1 Answer

Let's do this for X coordinate only:

1) sort the X coordinates

2) look at the deltas between consecutive sorted X coordinates. These deltas fall into two categories: either they correspond to the space between two columns, or to the space between crosses within the same column. Your goal is to find a threshold that separates the long spaces from the short ones. One way to do this (I think) is to pick the threshold that splits the deltas into two groups whose means are the furthest apart

3) once you have the threshold, separate the points into columns. A column starts and ends at a delta larger than the threshold you found previously

4) calculate average position of each detected column

5) take the deltas between the average positions of subsequent columns. The problem is that a stray point may break your columns and produce an odd delta, so use the median of these deltas rather than the mean to get the strays out.

6) You should have a robust estimate of your gridX
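The steps above can be sketched in Python roughly as follows. This is only an illustration under some assumptions: the function names `two_group_threshold` and `grid_pitch` are made up here, the threshold is placed halfway between the two groups of deltas, and the input is assumed to contain both short (within-column) and long (between-column) deltas. If every column holds exactly one point, the split in step 2 has nothing meaningful to separate and this sketch will give a wrong result.

```python
from statistics import mean, median


def two_group_threshold(deltas):
    """Step 2: cut the sorted deltas into a 'small' and a 'large' group so
    that the two group means are as far apart as possible, then return a
    value halfway between the groups (the halfway point is an assumption)."""
    d = sorted(deltas)
    best_gap, best_cut = -1.0, 1
    for k in range(1, len(d)):          # d[:k] = small group, d[k:] = large group
        gap = mean(d[k:]) - mean(d[:k])
        if gap > best_gap:
            best_gap, best_cut = gap, k
    return (d[best_cut - 1] + d[best_cut]) / 2


def grid_pitch(coords):
    """Estimate the grid spacing along one axis from jittered coordinates.
    Needs at least three points and both kinds of deltas (see note above)."""
    xs = sorted(coords)                                   # step 1
    deltas = [b - a for a, b in zip(xs, xs[1:])]
    threshold = two_group_threshold(deltas)               # step 2

    # Steps 3 and 4: start a new column at every large delta,
    # then average the points inside each column.
    columns = [[xs[0]]]
    for x, delta in zip(xs[1:], deltas):
        if delta > threshold:
            columns.append([x])
        else:
            columns[-1].append(x)
    centres = [mean(col) for col in columns]

    # Step 5: the median of centre-to-centre distances ignores a few strays.
    return median(b - a for a, b in zip(centres, centres[1:]))


xs = [-133920, -132480, -131044, -129602, -128162,
      -139679, -138239, -136799, -135359, -133925]
print(grid_pitch(xs))
```

Calling the same function on the Y coordinates would give the second component of (tilex, tiley), provided the full data set has several points per row.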

Example, using your data, looking at axis X:

-133920 -132480 -131044 -129602 -128162 -139679 -138239 -136799 -135359 -133925

Sorted + deltas:

5 1434 1436 1440 1440 1440 1440 1440 1442

Here you can see a very obvious gap between the small delta (5) and the large ones (1434 and up). Any threshold inside that gap works, so 1434 is the smallest delta that counts as a space between columns here.
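To make the gap concrete, here is a small sketch that recomputes the sorted deltas from the ten X values and looks for the widest jump between consecutive sorted deltas. Cutting at the widest jump is a simplification assumed for this illustration, not the "means furthest apart" criterion from step 2; on this data both would put the cut in the same place.

```python
xs = sorted([-133920, -132480, -131044, -129602, -128162,
             -139679, -138239, -136799, -135359, -133925])

# Deltas between neighbouring sorted coordinates, themselves sorted.
deltas = sorted(b - a for a, b in zip(xs, xs[1:]))

# Jump from each sorted delta to the next; the widest jump marks the gap
# between "within a column" and "between columns".
jumps = [b - a for a, b in zip(deltas, deltas[1:])]
i = jumps.index(max(jumps))

# Put the threshold in the middle of that gap (an arbitrary choice).
threshold = (deltas[i] + deltas[i + 1]) / 2

print(deltas)
print(threshold)
```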

Split the points into columns:

-139679|-138239|-136799|-135359|-133925 -133920|-132480|-131044|-129602|-128162
       1440   1440    1440    1434      5    1440    1436    1442    1440

Almost every point is alone in its column; the only exception is the pair -133925 and -133920, which share one.

The average grid line positions are:

-139679 -138239 -136799 -135359 -133922.5 -132480 -131044 -129602 -128162

Sorted deltas:

1436.0 1436.5 1440.0 1440.0 1440.0 1440.0 1442.0 1442.5

Median:

1440

Which is the correct answer for your SMALL data set, IMHO.

Roman Zenka answered Oct 11 '22 20:10