K-Nearest Neighbor Query in PostGIS

Tags: indexing, postgresql, postgis, nearest-neighbor

I am using the following nearest-neighbor query in PostGIS:

SELECT g1.gid, g2.gid FROM points AS g1, polygons AS g2
WHERE g1.gid <> g2.gid
ORDER BY g1.gid, ST_Distance(g1.the_geom,g2.the_geom)
LIMIT k;

Even though I have created indexes on the_geom as well as on the gid column of both tables, this query takes much longer than other spatial queries involving spatial joins between two tables.

Is there a better way to find the K nearest neighbors? I am using PostGIS.

Another query that takes an unusually long time, despite the indexes on the geometry column, is:

select g1.gid , g2.gid from polygons as g1 , polygons as g2
where st_area(g1.the_geom) > st_area(g2.the_geom) ;

I believe these queries do not benefit from GiST indexes, but why?

Whereas this query:

select a.polyid , sum(length(b.the_geom)) from polygon as a , roads as b
where st_intersects(a.the_geom , b.the_geom)
group by a.polyid;

returns results in a reasonable time, even though it involves the "roads" table, which is much bigger than the polygons or points tables, and also uses more complex spatial operators.

207

asked May 05 '12 11:05

Abhishek Sagar

2 Answers

Since late September 2011, PostGIS has supported indexed nearest-neighbor queries via special operators usable in the ORDER BY clause:

SELECT name, gid
FROM geonames
ORDER BY geom <-> st_setsrid(st_makepoint(-90,40),4326)
LIMIT 10;

...will return the 10 objects whose geom is nearest to -90,40 in a scalable way. A few more details (options and caveats) are in the announcement post, and use of the <-> and <#> operators is also documented in the official PostGIS 2.0 reference. (The main difference between the two is that <-> compares the centroids of the geometries' bounding boxes, while <#> compares the bounding boxes themselves. For points there is no difference; for other shapes, choose whichever is appropriate for your queries.)
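Because these operators work on bounding boxes, their ordering is only approximate for non-point geometries. One common way to handle that is to take a larger candidate set with <-> and then re-sort it by exact distance. The following is a minimal sketch using the geonames table from the example above; the candidate size of 100 is an arbitrary assumption and should be tuned to your data.

```sql
-- Step 1: use the index-assisted <-> operator to fetch a generous
-- candidate set (100 is an assumed value, not a recommendation).
WITH candidates AS (
    SELECT gid, name, geom
    FROM geonames
    ORDER BY geom <-> ST_SetSRID(ST_MakePoint(-90, 40), 4326)
    LIMIT 100
)
-- Step 2: re-rank only those candidates by exact distance.
SELECT gid, name
FROM candidates
ORDER BY ST_Distance(geom, ST_SetSRID(ST_MakePoint(-90, 40), 4326))
LIMIT 10;
```

The search point is written as a constant expression in both places rather than pulled from another table, since that is the form shown in the answer above.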

153

answered Oct 24 '22 03:10

natevw

Just a few thoughts on your problem:

st_distance and st_area cannot use indexes, because neither function can be reduced to a question like "Is a within b?" or "Do a and b overlap?". More concretely: GiST indexes can only operate on the bounding boxes of two objects.

For more information, see the PostGIS manual, which gives an example with st_distance and shows how the query can be rewritten to perform better.

However, this does not solve your k-nearest-neighbor problem. Right now I do not have a good idea for improving the performance of that query. The only option I see is to assume that the k nearest neighbors always lie within x meters. Then you could use an approach similar to the one in the PostGIS manual.
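A minimal sketch of that idea, assuming the points and polygons tables from the question: ST_DWithin restricts the join to pairs within a fixed radius (and can use the GiST index), and a window function keeps the k closest polygons per point. The radius of 1000 (in the units of the_geom's spatial reference system) and k = 5 are placeholder assumptions.

```sql
SELECT point_gid, polygon_gid, dist
FROM (
    SELECT g1.gid AS point_gid,
           g2.gid AS polygon_gid,
           ST_Distance(g1.the_geom, g2.the_geom) AS dist,
           -- rank the candidate polygons of each point by exact distance
           row_number() OVER (
               PARTITION BY g1.gid
               ORDER BY ST_Distance(g1.the_geom, g2.the_geom)
           ) AS rn
    FROM points AS g1
    JOIN polygons AS g2
      -- index-assisted pre-filter: only pairs within the assumed radius
      ON ST_DWithin(g1.the_geom, g2.the_geom, 1000)
) AS ranked
WHERE rn <= 5          -- assumed k
ORDER BY point_gid, dist;
```

Points with fewer than k polygons inside the radius will return fewer than k rows, which is the trade-off of assuming a maximum distance.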

Your second query could be sped up a bit. Currently, you compute the area of each object as often as the table has rows: the strategy is to join the data first and then filter based on that function. You could reduce the number of area computations significantly by precomputing the area:

WITH polygonareas AS (
    SELECT gid, the_geom, st_area(the_geom) AS area
    FROM polygons
)
SELECT g1.gid, g2.gid
FROM polygonareas as g1 , polygonareas as g2 
WHERE g1.area > g2.area;

Your third query can be significantly optimized using bounding boxes: when the bounding boxes of two objects do not overlap, the objects themselves cannot intersect. This lets the query use the spatial index, which yields a huge performance gain.
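As a sketch, the bounding-box test can be written explicitly with the && operator in front of the exact check. The table and column names are taken from the question; ST_Length and the total_length alias are assumptions made for illustration.

```sql
SELECT a.polyid,
       sum(ST_Length(b.the_geom)) AS total_length
FROM polygon AS a
JOIN roads AS b
  -- cheap, index-assisted test: do the bounding boxes overlap?
  ON a.the_geom && b.the_geom
  -- exact test, evaluated only for pairs that pass the filter above
 AND ST_Intersects(a.the_geom, b.the_geom)
GROUP BY a.polyid;
```

Note that ST_Intersects already includes a bounding-box comparison that uses available indexes, so the explicit && mainly documents the intent; that built-in filter is likely why this query was already faster than the other two.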

answered Oct 24 '22 02:10

Thilo
