Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Group coordinates by proximity to each other

I'm building a REST API so the answer can't include google maps or javascript stuff. In our app, we have a table containing posts that looks like that :

  ID |   latitude   | longitude   | other_sutff
   1 | 50.4371243   |  5.9681102  |    ...
   2 | 50.3305477   |  6.9420498  |    ...
   3 | -33.4510148  | 149.5519662 |    ...

We have a view with a map that shows all the posts around the world. Hopefully, we will have a lot of posts and it will be ridiculous to show thousands and thousands of markers in the map. So we want to group them by proximity so we can have something like 2-3 markers by continent.

To be clear, we need this : enter image description here Image from https://github.com/googlemaps/js-marker-clusterer

I've done some research and found that k-means seems to be part of the solution. As I am really really bad at Math, I tried a couple of php libraries like this one : https://github.com/bdelespierre/php-kmeans that seems to do a decent job. However, there is a drawback : I have to parse all the table each time the map is loaded. Performance-wise, it's awful.

So I would like to know if someone already got through this problematic or if there is a better solution.

like image 377
AdrienXL Avatar asked Jul 30 '15 11:07

AdrienXL


1 Answers

I kept searching and I've found an alternative to KMeans : GEOHASH

Wikipedia will explain better than me what it is : Wiki geohash

But to summarize, The world map is divided in a grid of 32 cells and to each one is given an alpha-numeric character. Each cell is also divided into 32 cells and so on for 12 levels. So if I do a GROUP BY on the first letter of hash I will get my clusters for the lowest zoom level, if I want more precision, I just need to group by the first N letters of my hash.

So, what I've done is only added one field to my table and generate the hash corresponding to my coordinates:

ID |   latitude   | longitude   | geohash      | other_sutff
 1 | 50.4371243   |  5.9681102  | csyqm73ymkh2 |     ...
 2 | 50.3305477   |  6.9420498  | p24k1mmh98eu |     ...
 3 | -33.4510148  | 149.5519662 | 8x2s9674nd57 |     ...

Now, if I want to get my clusters, I just have to do a simple query :

SELECT count(*) as nb_markers FROM mtable GROUP BY SUBSTRING(geohash,1,2); 

In the substring, 2 is level of precision and must be between 1 and 12

PS : Lib I used to generate my hash

like image 148
AdrienXL Avatar answered Sep 18 '22 06:09

AdrienXL