Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Find min value between 2 sets of vectors (fastest way)

Tags:

r

I have a dataframe that is the start and end times for a series of vectors. So we have a bunch of x vectors and y vectors and I want to compare the minimum distance between the 2 vectors. If two vectors have any overlapping parts then the min distance is 0 (in this application you can't have a negative distance).

Here's what the dataframe looks like (below is an easy way to grab it):

  x.start x.end y.start y.end
1       3     6       7     8
2      10    14      19    22
3      19    25      45    45
4      33    33      66    68
5     100   101      90   101
6     130   150     134   153

So I want to go row by row on the x vectors and for each x vector compare it to all y vectors and find the minimum distance between the two.

Below I accomplish this with a nested for loop but I need to repeat this many times with more vectors so speed counts. This is slow. What is the most efficient way to accomplish this task?

Desired output:

## > out
## [1]  1  2  0 11  0  0

I'd prefer to keep this in base R but if you have a faster way that is OS independent I'm open.

The data:

dat <- data.frame(
    x.start = c(3, 10, 19, 33, 100, 130),
    x.end = c(6, 14, 25, 33, 101, 150), 
    y.start = c(7, 19, 45, 66, 90, 134),
    y.end = c(8, 22, 45, 68, 101, 153)
)

Note that looking at my answer below may provide better understanding of the task. I'll benchmark the results after a few competitors have risen.

Here is the desired output as a dataframe for easier comparison and understanding:

  min_dist x.start x.end y.start y.end
1        1       3     6       7     8
2        2      10    14      19    22
3        0      19    25      45    45
4       11      33    33      66    68
5        0     100   101      90   101
6        0     130   150     134   153

A visualization of the 2 sets of vectors:

enter image description here

So each of the red segments I'd like to know the minimum distance to the closest y vector (blue segments); though I see the x vector 33:33 and the y vector 45:45 don't show up but I think this gives a visual depiction of the problem.

Benchmarking Results: Running Tally

Unit: microseconds
                 expr      min       lq   median       uq        max neval
         GEEKTRADER() 5386.186 5553.659 5603.341 5678.214  68297.171  5000
            TRINKER() 1421.887 1480.198 1496.992 1517.985  63619.596  5000
       RICARDO_OPT1() 4748.483 4892.631 4974.968 5110.952 156400.446  5000
       RICARDO_OPT2() 7387.463 7583.859 7694.418 7845.564  70200.949  5000
          FOTNELTON()  437.576  462.767  473.963  486.091   6109.724  5000
     FOTNELTON_EDIT()  356.871  379.730  390.460  402.122   3576.174  5000
 RICARDO_SIMPLE_ANS()  801.444  842.496  855.091  870.952   3923.715  5000
             ALEXIS()  343.343  385.328  397.923  408.652   4169.093  5000
like image 442
Tyler Rinker Avatar asked Jan 12 '23 01:01

Tyler Rinker


2 Answers

I think the easiest and possibly fastest way to do it is as follows:

apply(dat, 1, function(d) {
  overlap <- (dat$y.end >= d[1] & dat$y.end <= d[2]) |
             (dat$y.start >= d[1] & dat$y.start <= d[2])
  if (any(overlap)) 0
  else min(abs(c(d[1] - dat$y.end[!overlap], dat$y.start[!overlap] - d[2])))
})

EDIT: overlap can be much simpler:

apply(dat, 1, function(d) {
  overlap <- dat$y.end >= d[1] & dat$y.start <= d[2]
  if (any(overlap)) 0
  else min(abs(c(d[1] - dat$y.end[!overlap], dat$y.start[!overlap] - d[2])))
})
like image 62
fotNelton Avatar answered Jan 17 '23 17:01

fotNelton


Not sure if this is fastest. But here is one way to do it.

apply(dat[,1:2], MARGIN=1, FUN=function(x) {
  min(apply(dat[,3:4], MARGIN = 1, FUN = function(y){
    X <- c(t(x))
    Y <- c(t(y))
    #Check if the two line segments overlap else find minimum distance between the 2 edges of each line segments
    if (diff(range(c(X,Y))) <=  diff(X) + diff(Y)){
      return(0)
    } else {
      return(min(abs(outer(Y, X, "-"))))
    }
  }))
})
## [1]  1  2  0 11  0  0
like image 41
CHP Avatar answered Jan 17 '23 16:01

CHP