match() values with tolerance

Question

I'm subsetting a dataset before plotting. Because the key is numeric, I cannot rely on the strict equality testing of match() or %in%, which misses a few values. I wrote the following alternative, but I imagine the problem is common enough that a better built-in exists somewhere? all.equal doesn't seem to be designed for multiple test values.

select_in <- function(x, ref, tol=1e-10){
  testone <- function(value) abs(x - value) < tol
  as.logical(rowSums(sapply(ref, testone)) )
}

x = c(1.0, 1+1e-13, 1.01, 2, 2+1e-9, 2-1e-11)
x %in% c(1,2,3)
#[1]  TRUE FALSE FALSE  TRUE FALSE FALSE
select_in(x, c(1, 2, 3))
#[1]  TRUE  TRUE FALSE  TRUE FALSE  TRUE
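
The misses come from ordinary floating-point rounding rather than from match() itself. A minimal illustration, using values that are not part of the original question and an arbitrary tolerance of 1e-10:

```r
# Decimal fractions such as 0.1 have no exact binary representation,
# so arithmetic that "should" give 0.3 can land on a neighbouring double.
a <- 0.1 + 0.2

exact   <- a == 0.3               # strict comparison
in_test <- a %in% 0.3             # %in% / match() also compare doubles exactly
approx  <- abs(a - 0.3) < 1e-10   # comparison with an absolute tolerance

print(c(exact = exact, in_test = in_test, approx = approx))
```

Only the tolerance-based comparison treats the two numbers as the same key.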

Frank · Accepted Answer

This seems to achieve the goal (albeit by rounding rather than with a true tolerance):

fselect_in <- function(x, ref, d = 10){
  round(x, digits=d) %in% round(ref, digits=d)
}

fselect_in(x, c(1,2,3))
# TRUE  TRUE FALSE  TRUE FALSE  TRUE
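
One caveat worth spelling out: rounding is not the same thing as a tolerance. Two values that are very close can still fall on opposite sides of a rounding boundary. The sketch below uses hypothetical values picked to show this; it repeats the definition of fselect_in so that it runs on its own.

```r
fselect_in <- function(x, ref, d = 10) {
  round(x, digits = d) %in% round(ref, digits = d)
}

# Hypothetical values chosen to sit on either side of a rounding boundary.
a <- 1.000000000049   # rounds down to 1.0000000000 at 10 digits
b <- 1.000000000051   # rounds up   to 1.0000000001 at 10 digits

within_tol  <- abs(a - b) < 1e-10   # the two values differ by about 2e-12
by_rounding <- fselect_in(a, b)     # compares the rounded values exactly

print(c(within_tol = within_tol, by_rounding = by_rounding))
```

If such near-boundary keys can occur in the data, an explicit abs(x - ref) comparison is probably the safer choice.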

Pierre L · Answer

Not sure how much better it is, but all.equal has a tolerance argument that will work:

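# tol is not an argument of the operator; it must exist in the workspace
# (1e-10 is assumed here to match the default of select_in)
tol <- 1e-10
`%~%` <- function(x, y) sapply(x, function(.x) {
  any(sapply(y, function(.y) isTRUE(all.equal(.x, .y, tolerance = tol))))
})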

x %~% c(1,2,3)
#[1]  TRUE  TRUE FALSE  TRUE FALSE  TRUE

I don't like having two apply functions there. I'll try to shorten it.
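
Note that for numeric input all.equal compares a relative difference by default (the mean absolute difference scaled by the mean absolute target), so its tolerance is not the absolute tolerance used in select_in. A small sketch with made-up values shows the two notions disagreeing:

```r
tol <- 1e-10
a <- 1e6
b <- 1e6 + 1e-5   # absolute difference of about 1e-5, far larger than tol

# Relative difference is roughly 1e-5 / 1e6 = 1e-11, which is below tol.
relative_ok <- isTRUE(all.equal(a, b, tolerance = tol))
# Absolute difference is roughly 1e-5, which is well above tol.
absolute_ok <- abs(a - b) <= tol

print(c(relative_ok = relative_ok, absolute_ok = absolute_ok))
```

For keys close to 1, as in the example data, the two behave almost identically; for large or tiny keys they can differ.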

Update

Here is another approach that avoids all.equal. It turns out to be much faster than the first solution:

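`%~%` <- function(x, y) {
  # relies on tol being defined in the workspace, e.g. tol <- 1e-10
  out <- logical(length(x))
  # seq_along() is safe for zero-length x, unlike 1:length(x)
  for (i in seq_along(x)) out[i] <- any(abs(x[i] - y) <= tol)
  out
}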

x %~% c(1,2,3)
#[1]  TRUE  TRUE FALSE  TRUE FALSE  TRUE
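
The same absolute-tolerance test can also be written without an explicit loop. The following is only a sketch: in_tol is a hypothetical name, and tol is passed as an argument (defaulting to 1e-10, an assumption) instead of being read from the workspace.

```r
# Vectorised membership test with an absolute tolerance.
# outer() builds a length(x) by length(ref) logical matrix, so memory use
# grows with the product of the two lengths.
in_tol <- function(x, ref, tol = 1e-10) {
  close <- outer(x, ref, function(a, b) abs(a - b) <= tol)
  rowSums(close) > 0   # TRUE when a value is near at least one reference
}

x <- c(1.0, 1 + 1e-13, 1.01, 2, 2 + 1e-9, 2 - 1e-11)
res <- in_tol(x, c(1, 2, 3))
print(res)
```

Because it materialises the full comparison matrix, this form is convenient for small inputs but not necessarily faster than the loop for large ones.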

Benchmark

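y <- c(1, 2, 3)  # assumed: y is not defined in the original; these are the reference values used above
big.x <- rep(x, 1e3)
big.y <- rep(y, 100)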

all.equal(select_in(big.x, big.y), big.x %~% big.y)
#[1] TRUE

library(microbenchmark)
microbenchmark(
  baptiste = select_in(big.x, big.y),
  plafort2 = big.x %~% big.y,
  times=50L)
Unit: milliseconds
     expr       min        lq      mean    median       uq      max neval cld
 baptiste 185.86828 199.57517 231.28246 244.81980 261.7451 271.3426    50   b
 plafort2  49.03265  54.30729  84.88076  66.10971 118.3270 123.1074    50  a
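
Both benchmarked functions compare every element of x against every reference value. If that cost matters, one possible alternative (not benchmarked here, and not from the answers above) is to sort the reference values and check only the two nearest neighbours of each x with findInterval. The name near_sorted and the default tol are assumptions made for this sketch, and NA handling is left out.

```r
# Tolerance-based membership using a sorted reference vector.
near_sorted <- function(x, ref, tol = 1e-10) {
  ref <- sort(unique(ref))
  n <- length(ref)
  if (n == 0L) return(logical(length(x)))   # nothing to match against

  i  <- findInterval(x, ref)    # index of the largest ref <= x, 0 if none
  lo <- ref[pmax(i, 1L)]        # nearest reference at or below x (clamped)
  hi <- ref[pmin(i + 1L, n)]    # nearest reference above x (clamped)

  abs(x - lo) <= tol | abs(x - hi) <= tol
}

x <- c(1.0, 1 + 1e-13, 1.01, 2, 2 + 1e-9, 2 - 1e-11)
res <- near_sorted(x, c(3, 1, 2))
print(res)
```

Since the closest reference value is always one of the two neighbours in sorted order, checking those two is enough for an absolute tolerance.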

Tags: floating-accuracy, r, subset

Asked by: baptiste
