I have a real estate application and a "house" contains the following information:
house:
- house_id
- address
- city
- state
- zip
- price
- sqft
- bedrooms
- bathrooms
- geo_latitude
- geo_longitude
I need to perform an EXTREMELY fast (low latency) retrieval of all homes within a geo-coordinate box.
Something like the SQL below (if I were to use a database):
SELECT * from houses
WHERE latitude IS BETWEEN xxx AND yyy
AND longitude IS BETWEEN www AND zzz
Question: What would be the quickest way for me to store this information so that I can perform the fastest retrieval of data based on latitude & longitude? (e.g. database, NoSQL, memcache, etc)?
This is a typical query for a Geographical Information System (GIS) application. Many of these are solved by using quad-tree, or similar spatial, indices. The tiling mentioned is how these often end up being implemented.
If an index containing the coordinates could fit into memory and the DBMS had a decent optimiser, then a table scan could provide a Cartesian distance from any point of interest with tolerably low overhead. If this is too slow, then the query could be pre-filtered by comparing each coordinate axis separately before doing the full distance calculation.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With