Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Search 10 nearest Locations in Datastore

Tags:

I have a lot of Entities containing geoPoints stored in Google's Datastore. Now I need to get the 10 nearest locations based on a Location sent to a Google Cloud Function.
I saw, that there is a distance() function in Google's App Engine, but nothing comparable in Google Cloud Functions, not even the possible to calculate anything in the Database.

Is it possible to get the 10 nearest Locations from Datastore only using Google Cloud Functions or do I need to use a different Database for that ?

Best Regards,
Pascal

like image 920
pree Avatar asked Jul 28 '17 16:07

pree


2 Answers

We run a geospatial-heavy service on AppEngine.

Our solution is to store the locations on Memcache and doing the calculations directly instead of relying on the database.

This obviously depends on the amount of locations, but if you are clever about how you store the locations, you can search very quickly.

R-Trees are a very good example: https://en.wikipedia.org/wiki/R-tree

like image 141
Zebs Avatar answered Sep 21 '22 11:09

Zebs


I had a similar need, and I solved it with a grid-based clustering scheme.

Essentially I created a Computed String Property that was the string concatenation of the latitude & longitude with the decimals chopped off.

If an entity had obj.latitude = 37.123456 & obj.longitude = 45.234567 then obj.grid_id="37:45"

When performing a search, I determine the grid of the search latitude & longitude as well as the 8 other surrounding grids and queried for all entities that resided in those 9 grids.

#  for search latitude = 37.456 & longitude = 45.67

query = SomeModel.query(SomeModel.grid_id.IN([
'36:44', '36:45', '36:46',
'37:44', '37:45', '37:46',
'38:44', '38:45', '38:46',
]))

You would then find the 10 nearest in code.

Depending on your needs you may want to make grids id include decimal positions (obj.grid_id="37.1:45.2") or make them less precise(obj.grid_id="30:40")

This may or may not work for you depending on the distribution of you data points, in which case Zebs suggestion to use R-Tree is more robust, but this was simple to implement and sufficed my needs.

like image 45
Alex Avatar answered Sep 21 '22 11:09

Alex