
generating random x and y coordinates with a minimum distance

Is there a way in R to generate random coordinates with a minimum distance between them?

E.g. what I'd like to avoid:

x <- c(0,3.9,4.1,8)
y <- c(1,4.1,3.9,7)
plot(x~y)

4 Answers


This is a classical problem from stochastic geometry. Completely random points in space, where the numbers of points falling in disjoint regions are independent of each other, correspond to a homogeneous Poisson point process (in this case in R^2, but it could be almost any space).

An important feature is that the total number of points has to be random before you can have independence of the counts of points in disjoint regions.

For the Poisson process points can be arbitrarily close together. If you define a process by sampling the Poisson process until you don't have any points that are too close together you have the so-called Gibbs Hardcore process. This has been studied a lot in the literature and there are different ways to simulate it. The R package spatstat has functions to do this. rHardcore is a perfect sampler, but if you want a high intensity of points and a big hard core distance it may not terminate in finite time... The distribution can be obtained as the limit of a Markov chain and rmh.default lets you run a Markov chain with a given Gibbs model as its invariant distribution. This finishes in finite time but only gives a realisation of an approximate distribution.

In rmh.default you can also simulate conditional on a fixed number of points. Note that when you sample in a finite box there is of course an upper limit to how many points you can fit with a given hard core radius, and the closer you are to this limit the more problematic it becomes to sample correctly from the distribution.

Example:

library(spatstat)
beta <- 100; R <- 0.1
win <- square(1) # Unit square for simulation
X1 <- rHardcore(beta, R, W = win) # Exact sampling -- beware it may run forever for some par.!
plot(X1, main = paste("Exact sim. of hardcore model; beta =", beta, "and R =", R))

minnndist(X1) # Observed min. nearest neighbour dist.
#> [1] 0.102402

Approximate simulation

model <- rmhmodel(cif="hardcore", par = list(beta=beta, hc=R), w = win)
X2 <- rmh(model)
#> Checking arguments..determining simulation windows...Starting simulation.
#> Initial state...Ready to simulate. Generating proposal points...Running Metropolis-Hastings.
plot(X2, main = paste("Approx. sim. of hardcore model; beta =", beta, "and R =", R))

minnndist(X2) # Observed min. nearest neighbour dist.
#> [1] 0.1005433

Approximate simulation conditional on number of points

X3 <- rmh(model, control = rmhcontrol(p=1), start = list(n.start = 42))
#> Checking arguments..determining simulation windows...Starting simulation.
#> Initial state...Ready to simulate. Generating proposal points...Running Metropolis-Hastings.
plot(X3, main = paste("Approx. sim. given n =", 42))

minnndist(X3) # Observed min. nearest neighbour dist.
#> [1] 0.1018068
Answered by Ege Rubak


OK, how about this? You just generate random number pairs without restriction and then remove the ones which are too close together. This could be a good starting point:

minimumDistancePairs <- function(x, y, minDistance){
    i <- 1
    repeat{
        tooClose <- sqrt((x-x[i])^2 + (y-y[i])^2) < minDistance # flag every point within minDistance of point i
        tooClose[i] <- FALSE # ignore the point's zero distance to itself
        if(any(tooClose)) { # if too close to any other point
            x <- x[-i] # remove element from x
            y <- y[-i] # and remove element from y
        } else { # otherwise...
            i = i + 1 # repeat the procedure with the next element
        }
        if (i > length(x)) break
    }
    data.frame(x,y)
}

minimumDistancePairs(
    c(0,3.9,4.1,8)
    , c(1,4.1,3.9,7)
    , 1
)

will lead to

x   y
1 0.0 1.0
2 4.1 3.9
3 8.0 7.0

Be aware, though, that these are not random numbers anymore (however you solve the problem), since enforcing a minimum distance changes the distribution.

Answered by Georgery


You can use rejection sampling: https://en.wikipedia.org/wiki/Rejection_sampling

The principle is simple: you resample until your data satisfy the condition. Note that the transcript below resamples while the squared distance is greater than 1, so it actually enforces a maximum distance of 1; for the minimum-distance requirement in the question you would flip the inequality, as sketched after the transcript.

> set.seed(1)
> 
> x <- rnorm(2)
> y <- rnorm(2)
> (x[1]-x[2])^2+(y[1]-y[2])^2
[1] 6.565578
> while((x[1]-x[2])^2+(y[1]-y[2])^2 > 1) {
+   x <- rnorm(2)
+   y <- rnorm(2)
+ }
> (x[1]-x[2])^2+(y[1]-y[2])^2
[1] 0.9733252
> 
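For the minimum-distance case the test is simply flipped: keep resampling while the two points are closer than the threshold. A minimal sketch along those lines (minDist is an illustrative value, not part of the original answer):

minDist <- 1 # hypothetical minimum distance threshold
x <- rnorm(2)
y <- rnorm(2)
while((x[1]-x[2])^2 + (y[1]-y[2])^2 < minDist^2) { # resample while the two points are closer than minDist
  x <- rnorm(2)
  y <- rnorm(2)
}
sqrt((x[1]-x[2])^2 + (y[1]-y[2])^2) # at least minDist once the loop exits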
Answered by abichat


The following is a naive hit-and-miss approach that works well for some choices of the parameters (which were left unspecified in the question). If performance becomes an issue, you could experiment with the package gpuR, which has a GPU-accelerated distance matrix calculation.

rand.separated <- function(n,x0,x1,y0,y1,d,trials = 1000){
  for(i in 1:trials){
    nums <- cbind(runif(n,x0,x1),runif(n,y0,y1))
    if(min(dist(nums)) >= d) return(nums)
  }
  return(NA) #no luck
}

This repeatedly draws samples of size n in [x0,x1]x[y0,y1] and throws a sample away if it doesn't satisfy the minimum-distance requirement d. As a safeguard, trials guards against an infinite loop. If solutions are hard to find or n is large you might need to increase trials.

For example:

> set.seed(2018)
> nums <- rand.separated(25,0,10,0,10,0.2)
> plot(nums)

runs almost instantly and produces a scatter plot of the 25 points, all at least 0.2 apart.

Answered by John Coleman