Compare the values in two monotonically increasing vectors

I have two monotonically increasing vectors, v1 and v2, of unequal lengths. For each value in v1 (e.g., v1[1], v1[2], ...), I want to find the largest value in v2 that is less than or equal to v1[i] and compute the difference between the two.

My current code (see below) works correctly, but does not seem to scale up well. So I am looking for recommendations to improve my approach with the requirement of staying in R, or using a package I can call from R.

Example code:

v1 <- c(3.0, 3.5, 4.0, 4.5, 5.0, 5.5, 6.0, 6.5, 7.0, 7.5, 8.0, 8.5, 9.0, 9.5, 10.0)
v2 <- c(0, 2, 3.2, 4.6, 5.5, 7.1, 9.9, 12, 13)

myFunc <- function(x,v2) x - max(v2[x>=v2])
v3 <- sapply(as.list(v1), FUN = myFunc, v2)

cbind(v1,v3)

        v1  v3
 [1,]  3.0 1.0
 [2,]  3.5 0.3 # 0.3 = 3.5 - 3.2 where 3.5 is from v1[2] and 3.2 is v2[3]
 [3,]  4.0 0.8
 [4,]  4.5 1.3
 [5,]  5.0 0.4
 [6,]  5.5 0.0
 [7,]  6.0 0.5
 [8,]  6.5 1.0
 [9,]  7.0 1.5
[10,]  7.5 0.4
[11,]  8.0 0.9
[12,]  8.5 1.4
[13,]  9.0 1.9
[14,]  9.5 2.4
[15,] 10.0 0.1

Benchmark 1: For small vectors, say roughly 10,000 elements, the code will run in <1 second:

> v1 <- seq(3,5000,.5)
> v2 <- seq(2.2,5200,.52)
> 
> {
+   start <- Sys.time()
+   v3 <- sapply(as.list(v1), FUN = myFunc, v2)
+   Sys.time() - start
+ }
Time difference of 0.8118291 secs

Benchmark 2: For vectors with roughly 100,000 elements the code takes ~60-80 seconds.

> v1 <- seq(3,50000,.5)
> v2 <- seq(2.2,52000,.52)
> 
> {
+   start <- Sys.time()
+   v3 <- sapply(as.list(v1), FUN = myFunc, v2)
+   Sys.time() - start
+ }
Time difference of 1.098762 mins

So to reiterate, I am looking for recommendations to improve my approach with the requirement of staying in R, or using a package I can call from R.

Asked Jun 09 '21 at 12:06 by greengrass62


1 Answer

Use findInterval:

v1 - v2[findInterval(v1,v2)]
#[1] 1.0 0.3 0.8 1.3 0.4 0.0 0.5 1.0 1.5 0.4 0.9 1.4 1.9 2.4 0.1
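
To see why this one-liner reproduces the question's output, it may help to look at the indices findInterval returns. The following is a minimal sketch on the question's example data; the names idx, v3_slow and v3_fast are introduced here for illustration only. For each element of v1, findInterval gives the position of the largest element of v2 that is less than or equal to it, which is the same element that max(v2[x >= v2]) picks out in myFunc. It requires v2 to be sorted in non-decreasing order.

```r
v1 <- c(3.0, 3.5, 4.0, 4.5, 5.0, 5.5, 6.0, 6.5, 7.0, 7.5, 8.0, 8.5, 9.0, 9.5, 10.0)
v2 <- c(0, 2, 3.2, 4.6, 5.5, 7.1, 9.9, 12, 13)

# Position in v2 of the largest element that is <= each element of v1
idx <- findInterval(v1, v2)
idx
#  [1] 2 3 3 3 4 5 5 5 5 6 6 6 6 6 7

# Original approach from the question: one full scan of v2 per element of v1
myFunc <- function(x, v2) x - max(v2[x >= v2])
v3_slow <- sapply(v1, myFunc, v2)

# Vectorised approach: a single lookup followed by one subtraction
v3_fast <- v1 - v2[idx]

all.equal(v3_slow, v3_fast)
# [1] TRUE
```

The sapply version scans all of v2 once for every element of v1, so its cost grows with length(v1) * length(v2), which is consistent with the jump from under a second to about a minute in the two benchmarks. findInterval instead searches the sorted v2, so it should scale far better on the larger vectors.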

Answered Sep 30 '22 at 18:09 by nicola
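
One edge case is worth keeping in mind with the findInterval approach: when an element of v1 is smaller than v2[1], findInterval returns 0 for it, and indexing v2 with 0 silently drops that position, so the lookup ends up shorter than v1. In the question's data this never happens, because v2 starts below v1. The sketch below is one possible way to guard against it, assuming NA is an acceptable result for such elements; the helper name diff_to_floor is hypothetical and not part of the original answer.

```r
# Hypothetical helper: difference between each x[i] and the largest
# element of breaks that is <= x[i]; NA where no such element exists.
# Assumes breaks is sorted in non-decreasing order and contains no NAs.
diff_to_floor <- function(x, breaks) {
  idx <- findInterval(x, breaks)   # 0 where x[i] < breaks[1]
  idx[idx == 0L] <- NA_integer_    # indexing by NA yields NA, keeping the length
  x - breaks[idx]
}

v2 <- c(0, 2, 3.2, 4.6, 5.5, 7.1, 9.9, 12, 13)
res <- diff_to_floor(c(-1, 3.5, 10), v2)

is.na(res[1])   # -1 lies below every element of v2
# [1] TRUE
length(res)     # result stays aligned with the input
# [1] 3
```

For comparison, the question's myFunc would call max on an empty vector in that situation, which returns -Inf with a warning, so the two approaches differ there unless such inputs are handled explicitly.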