spatial clustering in R (simple example)

Tags: r, geospatial, spatial, hierarchical-clustering

I have this simple data.frame:

lat <- c(1, 2, 3, 10, 11, 12, 20, 21, 22, 23)
lon <- c(5, 6, 7, 30, 31, 32, 50, 51, 52, 53)
data <- data.frame(lat, lon)

The idea is to find spatial clusters based on the distance between points.

First, I plot the map (lon, lat):

plot(data$lon,data$lat)

[Image: scatter plot of the points, lon against lat]

So clearly I have three clusters, based on the distance between the points.

To find them, I tried this code in R:

d <- as.matrix(dist(cbind(data$lon, data$lat)))  # create distance matrix
d <- ifelse(d < 5, d, 0)                         # keep only distances < 5 (all others become 0)
d <- as.dist(d)
hc <- hclust(d)                                  # hierarchical clustering
plot(hc)
data$clust <- cutree(hc, k = 3)                  # cut the dendrogram to generate 3 clusters

This gives:

[Image: dendrogram produced by plot(hc)]

Now I try to plot the same points, colored by cluster:

plot(data$lon, data$lat, col = c("red", "blue", "green")[data$clust], pch = 19)

Here are the results:

[Image: the resulting plot]

This is not what I'm looking for.

Actually, I want to get something like this plot:

[Image: the desired plot]

Thank you for your help.

asked Feb 23 '15 11:02 by Math

2 Answers

What about something like this:

lat <- c(1, 2, 3, 10, 11, 12, 20, 21, 22, 23)
lon <- c(5, 6, 7, 30, 31, 32, 50, 51, 52, 53)

km <- kmeans(cbind(lat, lon), centers = 3)
plot(lon, lat, col = km$cluster, pch = 20)

[Image: scatter plot with points colored by k-means cluster]
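Because k-means starts from randomly chosen centers, repeated runs of the snippet above can return different label numbers and, occasionally, a poorer split. Here is a minimal sketch of a more repeatable variant. The fixed seed (42) and nstart = 25 are arbitrary assumptions for illustration and are not part of the original answer.

```r
# Same coordinates as in the question
lat <- c(1, 2, 3, 10, 11, 12, 20, 21, 22, 23)
lon <- c(5, 6, 7, 30, 31, 32, 50, 51, 52, 53)

set.seed(42)  # assumption: any fixed seed works; it only makes the run repeatable

# nstart = 25 tries 25 random initialisations and keeps the best fit
km <- kmeans(cbind(lat, lon), centers = 3, nstart = 25)

# Cluster labels are arbitrary, so inspect group sizes rather than label values
cluster_sizes <- sort(as.integer(table(km$cluster)))
print(cluster_sizes)

plot(lon, lat, col = km$cluster, pch = 20)
```

Note that k-means still needs the number of clusters up front (centers = 3), and that it treats lat and lon as planar coordinates.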

answered Sep 27 '22 19:09 by johannes

Here's a different approach. First, it assumes that the coordinates are WGS-84 (longitude/latitude in degrees) rather than UTM (planar). It then assigns all neighbors within a given radius to the same cluster using hierarchical clustering with method = "single", which adopts a 'friends of friends' strategy: a point joins a cluster if it lies within the threshold of any point already in that cluster.

To compute the distance matrix, I'm using the rdist.earth function from the fields package. Its default earth radius in kilometers is 6378.388 (the equatorial radius), which might not be what one is looking for, so I've changed it to 6371 (the mean radius). See this article for more info.

library(fields)

lon <- c(31.621785, 31.641773, 31.617269, 31.583895, 31.603284)
lat <- c(30.901118, 31.245008, 31.163886, 30.25058, 30.262378)
threshold.in.km <- 40
coors <- data.frame(lon, lat)  # longitude first, then latitude

# distance matrix (great-circle distances in km)
dist.in.km.matrix <- rdist.earth(coors, miles = FALSE, R = 6371)

# single-linkage clustering, cut at the distance threshold
fit <- hclust(as.dist(dist.in.km.matrix), method = "single")
clusters <- cutree(fit, h = threshold.in.km)

plot(lon, lat, col = clusters, pch = 20)

This approach could be a good solution if you don't know the number of clusters in advance (which the k-means option requires), and it is somewhat related to the dbscan option with minPts = 1.
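For readers who cannot install fields, the same idea can be sketched in base R with a hand-written haversine formula. This is only an illustration under stated assumptions: haversine_matrix is a hypothetical helper written for this example (it is not part of fields or base R), and its results may differ slightly from rdist.earth because of floating-point details.

```r
# Hypothetical helper: great-circle (haversine) distances in km between all
# pairs of points. lon and lat are in decimal degrees; R is the assumed earth
# radius in km (6371, the mean radius used above).
haversine_matrix <- function(lon, lat, R = 6371) {
  to_rad <- pi / 180
  phi <- lat * to_rad      # latitudes in radians
  lambda <- lon * to_rad   # longitudes in radians
  dphi <- outer(phi, phi, "-")
  dlambda <- outer(lambda, lambda, "-")
  a <- sin(dphi / 2)^2 + outer(cos(phi), cos(phi)) * sin(dlambda / 2)^2
  # pmin guards against rounding pushing sqrt(a) slightly above 1
  2 * R * asin(pmin(sqrt(a), 1))
}

lon <- c(31.621785, 31.641773, 31.617269, 31.583895, 31.603284)
lat <- c(30.901118, 31.245008, 31.163886, 30.25058, 30.262378)

d_km <- haversine_matrix(lon, lat)
fit <- hclust(as.dist(d_km), method = "single")
clusters <- cutree(fit, h = 40)  # same 40 km threshold as above
print(clusters)
```

With these five points, the first three should end up in one cluster and the last two in another, since the two groups are roughly 70 km apart while neighbors within each group are closer than 40 km.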

---EDIT---

With the original data:

library(fields)

lat <- c(1, 2, 3, 10, 11, 12, 20, 21, 22, 23)
lon <- c(5, 6, 7, 30, 31, 32, 50, 51, 52, 53)
data <- data.frame(lon, lat)  # rdist.earth expects longitude first, then latitude

d <- rdist.earth(data, miles = FALSE, R = 6371)  # d <- dist(data) if data is UTM
fit <- hclust(as.dist(d), method = "single")
clusters <- cutree(fit, h = 1000)  # h = 2 if data is UTM
plot(lon, lat, col = clusters, pch = 20)
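The planar (UTM) variant mentioned in the comments above can be spelled out in base R. The following is a sketch that assumes the coordinates are planar units, so that plain Euclidean distance is appropriate. The names pts, d_planar and groups are chosen for this example only.

```r
lat <- c(1, 2, 3, 10, 11, 12, 20, 21, 22, 23)
lon <- c(5, 6, 7, 30, 31, 32, 50, 51, 52, 53)
pts <- data.frame(lon, lat)

# Euclidean distances, left untouched (no thresholding of the matrix)
d_planar <- dist(pts)

# Single linkage: neighbors are sqrt(2) apart, groups are about 20 or more apart,
# so cutting at h = 2 separates the groups
fit <- hclust(d_planar, method = "single")
clusters <- cutree(fit, h = 2)

# Row indices of the points in each cluster
groups <- split(seq_len(nrow(pts)), clusters)
print(groups)

plot(pts$lon, pts$lat, col = c("red", "blue", "green")[clusters], pch = 19)
```

The cut height does the work that k does in cutree(hc, k = 3): any h between the largest within-group gap and the smallest between-group gap yields the same three groups.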

answered Sep 27 '22 19:09 by Omri374
