Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to fix postgresql update with join performance?

I hawe geospatial tables named node and ways.

I want to set end_node_id column of ways table with node table attribute using spatial join. two tables have about 100K data.

update ways
set
    end_node_id = n.node_id
from
    ways w
inner join
    nodes n
on
    st_endpoint(w.shape) = n.shape;

But this query takes so many times. After 15 minutes I stopped the query. Is there any performence query for this operation?

Update Explain:

Update on ways w (cost=0.00..669909619.43 rows=24567397 width=576)  
->  Nested Loop  (cost=0.00..669909619.43 rows=24567397 width=576)
          Join Filter: (st_endpoint(w.shape) = n.shape)
          ->  Seq Scan on ways w (cost=0.00..8960.61 rows=120161 width=564)
          ->  Materialize  (cost=0.00..12200.81 rows=204454 width=52)
                        ->  Seq Scan on nodes n  (cost=0.00..9181.54 rows=204454 width=52)
like image 612
barteloma Avatar asked Jun 19 '26 21:06

barteloma


1 Answers

Don't include ways in the from clause! This doesn't do what you expect. Presumably, you want:

update ways w
    set end_node_id = n.node_id
from nodes n
where st_endpoint(w.shape) = n.shape;

In your formulation, the ways in the update is a different reference from the ways in the from. So, your code is creating Cartesian product -- which no doubt slows down the processing. Note that this is different from the behavior of SQL Server, which has similar syntax.

like image 168
Gordon Linoff Avatar answered Jun 22 '26 12:06

Gordon Linoff