Why use st_intersection rather than st_intersects?

Tags: r, geospatial, topology, sf

st_intersection is very slow compared to st_intersects. So why not use the latter instead of the former? Here's an example with a small toy dataset, but the difference in execution time is huge for my actual set of just 62,020 points intersected with an actual geographic region boundary. I have 24 GB of RAM, and the st_intersects code takes a few seconds whereas the st_intersection code takes more than 15 minutes (possibly much more; I haven't had the patience to wait). Does st_intersection do anything that I am not getting with st_intersects?

The code below handles sfc objects, but I believe it would work equally well for sf objects.

library(sf)
library(dplyr)

# create square
s <- rbind(c(1, 1), c(10, 1), c(10, 10), c(1, 10), c(1, 1)) %>% list %>% st_polygon %>% st_sfc
# create random points
p <- runif(50, 0, 11) %>% cbind(runif(50, 0, 11)) %>% st_multipoint %>% st_sfc %>% st_cast("POINT")

# intersect points and square with st_intersection
st_intersection(p, s)

# intersect points and square with st_intersects (courtesy of https://stackoverflow.com/a/49304723/7114709)
p[st_intersects(p, s) %>% lengths > 0,]


asked Jun 18 '20 04:06

syre

1 Answer

The answer is that in general the two methods do different things, though in your particular case (finding the intersection of a collection of points and a polygon), st_intersects can be used to efficiently do the same job.

We can show the difference with a simple example modified from your own. We start with a square:

library(sf)
library(dplyr)

# create square
s <- rbind(c(1, 1), c(10, 1), c(10, 10), c(1, 10), c(1, 1)) %>% 
  list %>% 
  st_polygon %>% 
  st_sfc

plot(s)

[Figure: plot of the square s]

Now we will create a rectangle and draw it on the same plot with a dotted outline:

# create rectangle
r <- rbind(c(-1, 2), c(11, 2), c(11, 4), c(-1, 4), c(-1, 2)) %>% 
  list %>% 
  st_polygon %>% 
  st_sfc

plot(r, add= TRUE, lty = 2)

[Figure: the square s with the rectangle r overlaid as a dotted outline]

Now we find the intersection of the two polygons and plot it in red:

# intersect square and rectangle with st_intersection
i <- st_intersection(s, r)

plot(i, add = TRUE, lty = 2, col = "red")

[Figure: the same plot with the intersection i filled in red]

When we examine the object i, we will see it is a new polygon:

i
#> Geometry set for 1 feature 
#> geometry type:  POLYGON
#> dimension:      XY
#> bbox:           xmin: 1 ymin: 2 xmax: 10 ymax: 4
#> epsg (SRID):    NA
#> proj4string:    NA
#> POLYGON ((10 4, 10 2, 1 2, 1 4, 10 4))

By contrast, st_intersects returns no geometry at all. It only tells us whether r and s intersect (by default as a sparse list of indices, or as a logical matrix with sparse = FALSE). If we try to use this to subset r to find the intersection, we don't get the intersected shape; we just get our original rectangle back:

r[which(unlist(st_intersects(s, r)) == 1)]
#> Geometry set for 1 feature 
#> geometry type:  POLYGON
#> dimension:      XY
#> bbox:           xmin: -1 ymin: 2 xmax: 11 ymax: 4
#> epsg (SRID):    NA
#> proj4string:    NA
#> POLYGON ((-1 2, 11 2, 11 4, -1 4, -1 2))
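
To make the return value of st_intersects concrete, here is a minimal sketch that rebuilds s and r as above (without the pipe) and looks at both output forms. The variable names hits and dense are only illustrative.

```r
library(sf)

# same square and rectangle as in the answer, built without the pipe
s <- st_sfc(st_polygon(list(rbind(c(1, 1), c(10, 1), c(10, 10), c(1, 10), c(1, 1)))))
r <- st_sfc(st_polygon(list(rbind(c(-1, 2), c(11, 2), c(11, 4), c(-1, 4), c(-1, 2)))))

# default: a sparse list with one element per geometry in s,
# each holding the indices of the geometries in r that it intersects
hits <- st_intersects(s, r)
lengths(hits) > 0

# dense form: a logical matrix with length(s) rows and length(r) columns
dense <- st_intersects(s, r, sparse = FALSE)
dense
```

Either form is a predicate result only; neither contains any new geometry.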

Your situation is different, because you are trying to find the subset of points that intersect a polygon. In this case, the intersection of a group of points with a polygon is the same as the subset of points for which st_intersects is true.
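
As a rough check of that equivalence, the sketch below runs both approaches on random points like those in the question and compares the resulting coordinates. The seed and the helper sort_xy are assumptions added for illustration. The coordinates are sorted before comparing so the check does not depend on the order in which either function returns features.

```r
library(sf)

set.seed(1)  # arbitrary seed, only to make the sketch repeatable

s <- st_sfc(st_polygon(list(rbind(c(1, 1), c(10, 1), c(10, 10), c(1, 10), c(1, 1)))))
p <- st_cast(st_sfc(st_multipoint(cbind(runif(50, 0, 11), runif(50, 0, 11)))), "POINT")

by_intersection <- st_intersection(p, s)
by_intersects   <- p[lengths(st_intersects(p, s)) > 0]

# hypothetical helper: coordinate matrix sorted by x, then y
sort_xy <- function(g) {
  xy <- st_coordinates(g)[, c("X", "Y"), drop = FALSE]
  unname(xy[order(xy[, 1], xy[, 2]), , drop = FALSE])
}

same_points <- length(by_intersection) == length(by_intersects) &&
  isTRUE(all.equal(sort_xy(by_intersection), sort_xy(by_intersects)))
same_points
```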

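So it is great that you have found a valid way of getting a quicker intersection. Just be aware that this shortcut is only equivalent when the geometries being filtered are points: lines and polygons are clipped by st_intersection, but an st_intersects subset returns them whole.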
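
The same caveat applies to lines. In this small sketch, the line l is an invented example (not from the question) that crosses the square from left to right. Subsetting with st_intersects keeps the full line, while st_intersection keeps only the part inside the square.

```r
library(sf)

s <- st_sfc(st_polygon(list(rbind(c(1, 1), c(10, 1), c(10, 10), c(1, 10), c(1, 1)))))

# illustrative horizontal line from x = 0 to x = 11, crossing the square at y = 5
l <- st_sfc(st_linestring(rbind(c(0, 5), c(11, 5))))

whole   <- l[lengths(st_intersects(l, s)) > 0]  # the original line, unchanged
clipped <- st_intersection(l, s)                # only the segment inside s

len_whole   <- as.numeric(st_length(whole))
len_clipped <- as.numeric(st_length(clipped))
c(whole = len_whole, clipped = len_clipped)
```

The two lengths differ, so the two approaches are not interchangeable for anything other than points.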

answered Sep 30 '22 20:09

Allan Cameron
