I'm subsetting a dataset before plotting, but because the key is numeric I cannot rely on the strict equality testing of match() or %in% (it misses a few values).
I wrote the following alternative, but I imagine this problem is common enough that there is a better built-in alternative somewhere. all.equal doesn't seem to be designed for multiple test values.
select_in <- function(x, ref, tol=1e-10){
  testone <- function(value) abs(x - value) < tol
  as.logical(rowSums(sapply(ref, testone)) )
}
x = c(1.0, 1+1e-13, 1.01, 2, 2+1e-9, 2-1e-11)
x %in% c(1,2,3)
#[1] TRUE FALSE FALSE TRUE FALSE FALSE
select_in(x, c(1, 2, 3))
#[1] TRUE TRUE FALSE TRUE FALSE TRUE
This seems to achieve the goal (albeit not quite with a tolerance):
fselect_in <- function(x, ref, d = 10){
  round(x, digits=d) %in% round(ref, digits=d)
}
fselect_in(x, c(1,2,3))
# TRUE TRUE FALSE TRUE FALSE TRUE
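To illustrate why rounding is "not quite a tolerance": two numbers can be closer together than the intended tolerance and still round to different values when they sit on either side of a rounding boundary. A minimal sketch, with values `a` and `b` that I made up for the purpose:

```r
# Hypothetical values chosen to fall on either side of a 10-digit rounding boundary
a <- 1.00000000004
b <- 1.00000000006

# Absolute-difference test: the two values are within 1e-10 of each other
abs(a - b) < 1e-10

# Rounding test: a rounds down and b rounds up, so they no longer match
round(a, digits = 10) == round(b, digits = 10)
round(a, digits = 10) %in% round(b, digits = 10)
```

So the rounding approach is fine when the reference values are "clean" numbers such as 1, 2, 3, but it is not equivalent to an absolute tolerance in general.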
I'm not sure how much better it is, but all.equal has a tolerance argument that will work (note that it tests the relative difference rather than the absolute one):
`%~%` <- function(x, y, tol = 1e-10) sapply(x, function(.x) {
  any(sapply(y, function(.y) isTRUE(all.equal(.x, .y, tolerance=tol))))
})
x %~% c(1,2,3)
#[1] TRUE TRUE FALSE TRUE FALSE TRUE
I don't like having two apply functions there. I'll try to shorten it.
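One caveat with the all.equal version: for numbers that are not tiny, its tolerance is applied to the relative difference, so it is not interchangeable with the abs(x - value) < tol test from the question. A small sketch with made-up values `a` and `b`:

```r
# Hypothetical values: absolute difference 1e-3, relative difference about 1e-9
a <- 1e6
b <- 1e6 + 1e-3

# Relative test used by all.equal: the pair is treated as equal
isTRUE(all.equal(a, b, tolerance = 1e-8))

# Absolute test with the same number as tolerance: the pair is not equal
abs(a - b) <= 1e-8
```

For keys close to 1, as in the example data, the two tests give the same answers, which is why the outputs agree here.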
Update
Here is another way that avoids all.equal. It turns out to be much faster than the first solution:
`%~%` <- function(x, y, tol = 1e-10) {
  # tol is an absolute tolerance; the default applies when called as x %~% y
  out <- logical(length(x))
  for(i in seq_along(x)) out[i] <- any(abs(x[i] - y) <= tol)
  out
}
x %~% c(1,2,3)
#[1] TRUE TRUE FALSE TRUE FALSE TRUE
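If the explicit loop bothers you, the same absolute-tolerance test can probably be written without it using outer(). This is only a sketch: `near_in` is a name I made up, and it builds a length(x) by length(ref) matrix, so it trades memory for brevity.

```r
# near_in is a hypothetical helper; tol is an absolute tolerance
near_in <- function(x, ref, tol = 1e-10) {
  # outer() returns a length(x)-by-length(ref) matrix of pairwise differences
  d <- abs(outer(x, ref, "-"))
  # an element of x matches if any reference value lies within tol of it
  rowSums(d <= tol) > 0
}

x <- c(1.0, 1+1e-13, 1.01, 2, 2+1e-9, 2-1e-11)
near_in(x, c(1, 2, 3))
#[1]  TRUE  TRUE FALSE  TRUE FALSE  TRUE
```

Because outer() always returns a matrix, this also works when x has length one, where the sapply/rowSums version from the question would fail.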
Benchmark
y <- c(1, 2, 3)  # reference values, as in the examples above
big.x <- rep(x, 1e3)
big.y <- rep(y, 100)
all.equal(select_in(big.x, big.y), big.x %~% big.y)
#[1] TRUE
library(microbenchmark)
microbenchmark(
baptiste = select_in(big.x, big.y),
plafort2 = big.x %~% big.y,
times=50L)
Unit: milliseconds
     expr       min        lq      mean    median       uq      max neval cld
 baptiste 185.86828 199.57517 231.28246 244.81980 261.7451 271.3426    50   b
 plafort2  49.03265  54.30729  84.88076  66.10971 118.3270 123.1074    50   a