I have a vector
set.seed(2)
x <- sample.int(20, 5)
[1] 4 14 11 3 16
Now, for every element i, I want to find
the element with the minimum distance to it, i.e. the one attaining min(abs(x[i] - x[-i])), which here would be
[1] 3 16 14 4 14
the (first) index of the element with the minimum distance, which here would be
[1] 4 5 2 1 2
The point is that the element itself is not considered, only all the other elements, which is why the existing question "R - Fastest way to find nearest value in vector" does not answer this.
If the actual answer is out there, sorry - I didn't find it.
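For reference, the definition above can be sketched by brute force in base R. This is only a minimal sketch, not one of the answers below: the names d and idx are made up for illustration, and it builds the full n-by-n distance matrix, so it is only practical for short vectors.

```r
x <- c(4, 14, 11, 3, 16)

# pairwise absolute distances between all elements
d <- abs(outer(x, x, "-"))
# exclude each element's distance to itself
diag(d) <- Inf

# which.min returns the first index of the minimum in each row
idx <- apply(d, 1, which.min)
idx
## [1] 4 5 2 1 2
x[idx]
## [1]  3 16 14  4 14
```

Because which.min picks the first minimum, ties resolve to the lowest index, which matches the "(first) index" wording of the question.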
1) Rfast Using dista in Rfast we get the indexes of the closest two elements. Take the second closest, as the closest will be the element itself.
library(Rfast)
x <- c(4, 14, 11, 3, 16) # input
x[ dista(x, x, k = 2, index = TRUE)[, 2] ]
## [1] 3 16 14 4 14
2) sqldf Using SQL we can left join DF to itself, excluding rows with the same value, and take the row with the minimum distance.
library(sqldf)
DF <- data.frame(x) # x is from (1)
sqldf("select a.x, b.x nearest, min(abs(a.x - b.x))
from DF a
left join DF b on a.x != b.x
group by a.rowid")[1:2]
giving:
   x nearest
1  4       3
2 14      16
3 11      14
4  3       4
5 16      14
3) zoo Sort the input, take the value corresponding to the least difference on either side of each element, and restore the original order.
library(zoo)
ix <- order(x)
least <- function(x) if (x[2] - x[1] < x[3] - x[2]) x[1] else x[3]
rollapply(c(-Inf, x[ix], Inf), 3, least)[order(ix)]
## [1] 3 16 14 4 14
4) Base R Using ix and least from (3) we can mimic (3) using only base functions as follows.
apply(embed(c(-Inf, x[ix], Inf), 3)[, 3:1], 1, least)[order(ix)]
## [1] 3 16 14 4 14
4a) This slightly shorter variation would also work:
-apply(embed(-c(-Inf, x[ix], Inf), 3), 1, least)[order(ix)]
## [1] 3 16 14 4 14
4b) Simplifying further we have the following base solution where, again, ix is from (3):
xx <- x[ix]
x1 <- c(-Inf, xx[-length(xx)])
x2 <- c(xx[-1], Inf)
ifelse(xx - x1 < x2 - xx, x1, x2)[order(ix)]
## [1] 3 16 14 4 14
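The same sorted-neighbour idea can also be extended to return the indexes asked for in the question. This is a sketch under assumptions: left, right, use_left, nb and nearest_idx are illustrative names, and ties go to the larger neighbour (as in least), which is not necessarily the first index in the original order.

```r
x <- c(4, 14, 11, 3, 16)
ix <- order(x)
xx <- x[ix]

# gaps to the left and right neighbour in sorted order
left  <- c(Inf, diff(xx))
right <- c(diff(xx), Inf)
use_left <- left < right

# position of the nearest neighbour within the sorted vector
j <- seq_along(xx)
nb <- ifelse(use_left, j - 1, j + 1)

# map sorted positions back to original indexes, then restore input order
nearest_idx <- ix[nb][order(ix)]
nearest_idx
## [1] 4 5 2 1 2
```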
The example in the question had no duplicates, but if there were duplicates the problem definition becomes ambiguous. For example, with c(1, 3, 4, 1), the first value, 1, has another value exactly equal to it, so the closest value is 1. Another interpretation is that the closest value not equal to 1 should be returned, which in this case is 3. In the code above, the sqldf solution gives the closest value not equal to the current value, whereas the others give the closest value among the remaining values.
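To make the first interpretation concrete, here is (4b) applied unchanged to c(1, 3, 4, 1). The names y, iy, yy, y1, y2 and res are just renamed copies used for illustration so that x and ix are not overwritten.

```r
y <- c(1, 3, 4, 1)
iy <- order(y)
yy <- y[iy]                      # sorted input: 1 1 3 4
y1 <- c(-Inf, yy[-length(yy)])   # left neighbours
y2 <- c(yy[-1], Inf)             # right neighbours
res <- ifelse(yy - y1 < y2 - yy, y1, y2)[order(iy)]
res
## [1] 1 4 3 1
```

Each 1 is matched with the other 1 at distance zero, which is the "closest among the remaining values" reading.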
If we wanted the closest-not-equal interpretation for the solutions other than sqldf, then we could use rle after ordering to compress the input down to unique values and then use inverse.rle afterwards, as shown in this modified (4b):
x <- c(1, 3, 4, 1)
ix <- order(x)
r <- rle(x[ix])
xx <- r$values
x1 <- c(-Inf, xx[-length(xx)])
x2 <- c(xx[-1], Inf)
r$values <- ifelse(xx - x1 < x2 - xx, x1, x2)
inverse.rle(r)[order(ix)]
## [1] 3 4 3 3