
match() values with tolerance

I'm subsetting a dataset before plotting, but because the key is numeric I cannot rely on the strict equality testing of match() or %in% (it misses a few values). I wrote the following alternative, but I imagine this problem is common enough that there is a better built-in alternative somewhere. all.equal doesn't seem to be designed for multiple test values.

select_in <- function(x, ref, tol=1e-10){
  testone <- function(value) abs(x - value) < tol
  as.logical(rowSums(sapply(ref, testone)) )
}

x = c(1.0, 1+1e-13, 1.01, 2, 2+1e-9, 2-1e-11)
x %in% c(1,2,3)
#[1]  TRUE FALSE FALSE  TRUE FALSE FALSE
select_in(x, c(1, 2, 3))
#[1]  TRUE  TRUE FALSE  TRUE FALSE  TRUE
baptiste asked Oct 20 '15 00:10


2 Answers

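This seems to achieve the goal, although rounding is not quite the same as a tolerance (two values closer together than 10^-d can still round to different values if they fall on either side of a rounding boundary):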

fselect_in <- function(x, ref, d = 10){
  round(x, digits=d) %in% round(ref, digits=d)
}

fselect_in(x, c(1,2,3))
# TRUE  TRUE FALSE  TRUE FALSE  TRUE
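
To illustrate the "not quite with a tolerance" caveat, here is a small sketch with a deliberately coarse d = 1. The values a and b are made up for the illustration; fselect_in is repeated from above so the snippet runs on its own.

```r
# Same helper as above, repeated so this snippet is self-contained
fselect_in <- function(x, ref, d = 10) {
  round(x, digits = d) %in% round(ref, digits = d)
}

# Hypothetical values chosen to sit on either side of a rounding boundary
a <- 0.14
b <- 0.16

within_tol <- abs(a - b) < 10^-1        # TRUE: the values differ by only 0.02
rounded_hit <- fselect_in(a, b, d = 1)  # FALSE: they round to 0.1 and 0.2
```

So rounding behaves like a grid of bins rather than a window around each reference value, which usually only matters when values sit very close to a bin edge.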
Frank answered Sep 19 '22 01:09


Not sure how much better it is but all.equal has a tolerance argument that will work:

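tol <- 1e-10  # not defined in the original snippet; value taken from the question's default

# for each element of x, check whether it is all.equal() to any element of y
`%~%` <- function(x, y) sapply(x, function(.x) {
  any(sapply(y, function(.y) isTRUE(all.equal(.x, .y, tolerance = tol))))
})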

x %~% c(1,2,3)
[1]  TRUE  TRUE FALSE  TRUE FALSE  TRUE

I don't like having two apply functions there. I'll try to shorten it.
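
One thing to keep in mind with all.equal: for numeric input its tolerance is by default applied to the relative difference, not the absolute one, so the result can differ from an abs(x - y) <= tol test when the values are far from 1. A minimal sketch, with made-up values a and b; passing scale = 1 is the documented way to ask for an absolute comparison:

```r
# Hypothetical values far from 1, chosen to separate relative and absolute error
a <- 1e6
b <- 1e6 + 1e-3

# Default: relative difference, about 1e-9, which is below the tolerance
rel_ok <- isTRUE(all.equal(a, b, tolerance = 1e-8))

# scale = 1: absolute difference, about 1e-3, which is above the tolerance
abs_ok <- isTRUE(all.equal(a, b, tolerance = 1e-8, scale = 1))

rel_ok  # TRUE
abs_ok  # FALSE
```

For the x in the question the two readings agree, because all the values are close to 1 or 2.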

Update

Here is another way that avoids all.equal and might be faster. It turns out to be much faster than the first solution:

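tol <- 1e-10  # not defined in the original snippet; value taken from the question's default

`%~%` <- function(x, y) {
  out <- logical(length(x))
  # seq_along() is safe when x is empty, unlike 1:length(x)
  for (i in seq_along(x)) out[i] <- any(abs(x[i] - y) <= tol)
  out
}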

x %~% c(1,2,3)
[1]  TRUE  TRUE FALSE  TRUE FALSE  TRUE
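
The same absolute-tolerance test can probably also be written without an explicit loop by building the full matrix of pairwise differences with outer(). This is only a sketch: the name match_tol and its tol argument are mine, not something from base R, and it allocates a length(x) by length(ref) matrix, so it trades memory for brevity.

```r
# Hypothetical helper: TRUE where x[i] is within tol of at least one ref value
match_tol <- function(x, ref, tol = 1e-10) {
  # nothing can match if either side is empty
  if (length(x) == 0L || length(ref) == 0L) return(logical(length(x)))
  # one row per element of x, one column per element of ref
  diffs <- abs(outer(x, ref, "-"))
  rowSums(diffs <= tol) > 0
}

x <- c(1.0, 1+1e-13, 1.01, 2, 2+1e-9, 2-1e-11)
res <- match_tol(x, c(1, 2, 3))
res
#[1]  TRUE  TRUE FALSE  TRUE FALSE  TRUE
```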

Benchmark

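y <- c(1, 2, 3)  # y was not defined in the original snippet; assumed to be the reference values used above
big.x <- rep(x, 1e3)
big.y <- rep(y, 100)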

all.equal(select_in(big.x, big.y), big.x %~% big.y)
[1] TRUE

library(microbenchmark)
microbenchmark(
  baptiste = select_in(big.x, big.y),
  plafort2 = big.x %~% big.y,
  times=50L)
Unit: milliseconds
     expr       min        lq      mean    median       uq      max neval cld
 baptiste 185.86828 199.57517 231.28246 244.81980 261.7451 271.3426    50   b
 plafort2  49.03265  54.30729  84.88076  66.10971 118.3270 123.1074    50  a 
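
Both functions in the benchmark compare every element of x against every reference value. If the reference vector gets long, one possible alternative is to sort it once and only look at the two neighbouring reference values for each x, using findInterval(). I have not benchmarked this, so treat it as a sketch; match_tol_sorted is a made-up name and NA values in x are not handled specially (they come back as NA).

```r
# Hypothetical helper: absolute-tolerance matching via a sorted lookup
match_tol_sorted <- function(x, ref, tol = 1e-10) {
  ref <- sort(unique(ref))   # sort() also drops NA values from ref
  n <- length(ref)
  if (n == 0L) return(logical(length(x)))
  # i is the index of the largest ref value <= x, or 0 if x is below all of them
  i <- findInterval(x, ref)
  lo <- ref[pmax(i, 1L)]       # neighbour at or below x (first ref if none)
  hi <- ref[pmin(i + 1L, n)]   # neighbour above x (last ref if none)
  pmin(abs(x - lo), abs(x - hi)) <= tol
}

x <- c(1.0, 1+1e-13, 1.01, 2, 2+1e-9, 2-1e-11)
res_sorted <- match_tol_sorted(x, c(1, 2, 3))
res_sorted
#[1]  TRUE  TRUE FALSE  TRUE FALSE  TRUE
```

Duplicated reference values, as in big.y, collapse to the unique ones first, so the lookup cost depends on the number of distinct reference values.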
Pierre L answered Sep 20 '22 01:09