I am trying to match spatially all streets in Open Street Maps (OSM) in Brazil to the 2010 Census Enumeration district shapefile in Postgis.
I imported all files to Postgis with WGS84 projection and included spatial indexes (gist).
OSM data comes from geofabrik planet_osm_line, which I filter to extract only the streets. There are 1.8 million streets.
The Enumeration district shapefile has 317 thousand polygons. The shapefiles are not too precise, so ideally I would add a buffer around them to make sure.
ST_Intersects directly on streets to enumeration districts was taking for ever. So, to reduced the numer of spatial matches, by first joinnig the OSM streets with the municipality polygons, creating the OSM_Streets_by_Mun table. There are 5560 municipalities in Brasil and Enumeration districts are nested in them.
So the final query, bellow, includes numeric matching of state and municipal code, to reduce spatial matches to only within municipalities.
Still, the query is not finishing on a 4 core 128gb of RAM server even after 5 days.
Any ideas on how to improve preformance?
CREATE TABLE OSM_Streets_by_SetorCensitario AS
SELECT OSM_Streets_by_Mun.*, setor_censitarioL.geom
FROM OSM_Streets_by_Mun, setor_censitarioL
WHERE OSM_Streets_by_Mun.cod_UF = substring(setor_censitarioL.cd_geocodi,1,2) AND
OSM_Streets_by_Mun.cod_mun = substring(setor_censitarioL.cd_geocodi,1,7) AND
ST_Intersects(way,ST_Buffer(setor_censitarioL.geom,0.005))=true
Here is the output from EXPLAIN command on the query
"Merge Join (cost=359099.59..765616.74 rows=248521 width=3777)"
" Merge Cond: ((("substring"((setor_censitariol.cd_geocodi)::text, 1, 2)) = (osm_streets_by_mun.cod_uf)::text) AND (("substring"((setor_censitariol.cd_geocodi)::text, 1, 7)) = (osm_streets_by_mun.cod_mun)::text))"
" Join Filter: st_intersects(osm_streets_by_mun.way, st_buffer(setor_censitariol.geom, 0.005::double precision))"
" -> Sort (cost=86016.75..86808.33 rows=316634 width=2145)"
" Sort Key: ("substring"((setor_censitariol.cd_geocodi)::text, 1, 2)), ("substring"((setor_censitariol.cd_geocodi)::text, 1, 7))"
" -> Seq Scan on setor_censitariol (cost=0.00..57088.34 rows=316634 width=2145)"
" -> Sort (cost=273082.84..277547.26 rows=1785768 width=1648)"
" Sort Key: osm_streets_by_mun.cod_uf, osm_streets_by_mun.cod_mun"
" -> Seq Scan on osm_streets_by_mun (cost=0.00..87647.68 rows=1785768 width=1648)"
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With