
r points in polygons

Tags:

r

gis

I have a million points and a large shapefile (8 GB), which is too big to load into memory in R on my system. The shapefile is single-layer, so a given x, y will hit at most one polygon, as long as it's not exactly on a boundary! Each polygon is labelled with a severity, e.g. 1, 2, 3. I'm using R on a 64-bit Ubuntu machine with 12 GB of RAM.

What's the simplest way to "tag" each point with the severity of the polygon it falls in, so that I get a data.frame with an extra column, i.e. x, y, severity?

Sean asked Sep 27 '10 19:09


1 Answer

Just because all you have is a hammer, doesn't mean every problem is a nail.

Load your data into PostGIS, build a spatial index for your polygons, and do a single SQL spatial overlay. Export results back to R.
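As a rough sketch of what that workflow could look like when driven from R: the table names (`pts`, `polys`), the column names (`geom`, `severity`), the index name, the SRID 4326 and the connection details below are all assumptions for illustration, not something taken from the question. It also assumes the polygons have already been loaded into PostGIS outside R (for example with `shp2pgsql` or `ogr2ogr`), and that the DBI and RPostgres packages are installed.

```r
# Sketch only: table names, column names and the SRID are assumptions.
# Assumes the polygons already sit in a PostGIS table `polys` with a
# geometry column `geom` and an attribute column `severity`.

# Spatial (GiST) index on the polygon geometry, so the join below does
# not have to test every point against every polygon.
index_sql <- "CREATE INDEX IF NOT EXISTS polys_geom_idx ON polys USING GIST (geom);"

# Point-in-polygon overlay. The LEFT JOIN keeps points that fall outside
# every polygon; they come back with severity = NA.
# The SRID (4326 here) must match the SRID of polys.geom.
overlay_sql <- "
  SELECT p.x, p.y, g.severity
  FROM pts AS p
  LEFT JOIN polys AS g
    ON ST_Contains(g.geom, ST_SetSRID(ST_MakePoint(p.x, p.y), 4326));
"

tag_points <- function(con, points) {
  # points: data.frame with numeric columns x and y
  DBI::dbWriteTable(con, "pts", points, overwrite = TRUE)
  DBI::dbExecute(con, index_sql)
  DBI::dbGetQuery(con, overlay_sql)   # data.frame: x, y, severity
}

# Usage (not run; the database name is hypothetical):
# con    <- DBI::dbConnect(RPostgres::Postgres(), dbname = "gis")
# tagged <- tag_points(con, my_points)
# DBI::dbDisconnect(con)
```

Note that `ST_Contains` is false for a point lying exactly on a polygon boundary, so such points would come back with a missing severity in this sketch.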

By the way, saying the shapefile is 8 GB is not a very useful piece of information. A shapefile is made up of at least three files: the .shp, which holds the geometry; the .dbf, which holds the attribute table; and the .shx, which is an index into the .shp. If it is your .dbf that is 8 GB, then you can easily read the shapes themselves by replacing it with a different .dbf. Even if the .shp is 8 GB, it might contain only three polygons, in which case it might be easy to simplify them. How many polygons have you got, and how big is the .shp part of the shapefile?
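To answer the size question, something like the following could report how the total is split across the component files. This is a minimal sketch: `shapefile_sizes` is a hypothetical helper name, and it assumes the three files share a base path and use lowercase extensions.

```r
# Report the size of each shapefile component in GB.
# `base` is the path without extension, e.g. "data/severity" for
# data/severity.shp, data/severity.dbf and data/severity.shx.
shapefile_sizes <- function(base) {
  exts  <- c("shp", "dbf", "shx")
  paths <- paste0(base, ".", exts)
  data.frame(
    component = exts,
    size_gb   = file.size(paths) / 1024^3  # NA if a file is missing
  )
}

# Usage (path is hypothetical):
# shapefile_sizes("data/severity")
```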

Spacedman answered Sep 30 '22 03:09
