Closest other Value in the same Vector

Question

I have a vector

set.seed(2)
x <- sample.int(20, 5)

[1]  4 14 11  3 16

Now, for every element, I want to find:

the element with the minimum distance (min(abs(x[i]-x[-i])) for element i), which here would be

[1]  3 16 14  4 14

the (first) index of the element with the minimum distance, which here would be

[1] 4 5 2 1 2

The point is that the element itself is not considered, only all the other elements, which is why the question "R - Fastest way to find nearest value in vector" does not answer this.

If the actual answer is out there, sorry - I didn't find it.

G. Grothendieck · Accepted Answer

1) Rfast Using dista in Rfast we get the indexes of the two closest elements. Take the second closest, as the closest will be the element itself (distance 0).

library(Rfast)
x <- c(4, 14, 11, 3, 16) # input

x[ dista(x, x, k = 2, index = TRUE)[, 2] ]
## [1]  3 16 14  4 14
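As a package-free cross-check, and to cover the index that the question also asks for, here is a minimal brute-force sketch in base R. It is not part of the original answer; the names d and idx are illustrative. It builds the full pairwise distance matrix, so it needs O(n^2) memory and is only sensible for short vectors.

```r
x <- c(4, 14, 11, 3, 16)

# Pairwise absolute distances between all elements of x
d <- abs(outer(x, x, "-"))

# Exclude each element's distance to itself
diag(d) <- Inf

# which.min returns the first minimum, so ties resolve to the first index
idx <- apply(d, 1, which.min)
idx
## [1] 4 5 2 1 2

x[idx]
## [1]  3 16 14  4 14
```

Because only the diagonal is masked, a duplicate of an element counts as its nearest neighbor here.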

2) sqldf Using SQL we can left join DF to itself, excluding rows with the same value, and take the row with the minimum distance.

library(sqldf)
DF <- data.frame(x)   # x is from (1)
# SQLite takes the bare column b.x from the row that attains min(...)
sqldf("select a.x, b.x nearest, min(abs(a.x - b.x))
  from DF a
  left join DF b on a.x != b.x
  group by a.rowid")[1:2]

giving:

   x nearest
1  4       3
2 14      16
3 11      14
4  3       4
5 16      14

3) zoo Sort the input, for each element take the neighbor on either side (in sorted order) with the smaller difference, and then restore the original order.

library(zoo)
ix <- order(x)
least <- function(x) if (x[2] - x[1] < x[3] - x[2]) x[1] else x[3]
rollapply(c(-Inf, x[ix], Inf), 3, least)[order(ix)]
## [1]  3 16 14  4 14

4) Base R Using ix and least from (3) we can mimic (3) using only base functions as follows.

apply(embed(c(-Inf, x[ix], Inf),  3)[, 3:1], 1, least)[order(ix)]
## [1]  3 16 14  4 14
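To make the mechanics of (3) and (4) easier to follow, the sketch below prints the intermediate objects. It is an illustration only; the names padded, windows and inv are not from the answer.

```r
x <- c(4, 14, 11, 3, 16)
ix <- order(x)                      # permutation that sorts x

# Sentinels guarantee that the smallest and largest element
# also have a neighbor on both sides
padded <- c(-Inf, x[ix], Inf)

# embed() puts the most recent value in column 1, so reverse the columns
# to get rows of (lower neighbor, element, upper neighbor)
windows <- embed(padded, 3)[, 3:1]
windows[1, ]
## [1] -Inf    3    4

# order(ix) is the inverse permutation of ix: it maps the sorted
# result back to the original positions
inv <- order(ix)
x[ix][inv]
## [1]  4 14 11  3 16
```

Each row of windows is what least receives, which is why it compares x[2] - x[1] with x[3] - x[2].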

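4a) This slightly shorter variation would also work (on an exact tie between the two neighbors it returns the lower one, whereas (3), (4) and (4b) return the upper one):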

-apply(embed(-c(-Inf, x[ix], Inf),  3), 1, least)[order(ix)]
## [1]  3 16 14  4 14

4b) Simplifying further we have the following base solution where, again, ix is from (3):

xx <- x[ix]
x1 <- c(-Inf, xx[-length(xx)])
x2 <- c(xx[-1], Inf)
ifelse(xx - x1 < x2 - xx, x1, x2)[order(ix)]
## [1]  3 16 14  4 14
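The solutions above return the nearest values; the question also asks for the index. One possible adaptation of (4b), sketched here as an assumption rather than as part of the original answer, is to record whether the lower or upper neighbor wins and map that sorted position back through ix. The names n and pos are illustrative.

```r
x <- c(4, 14, 11, 3, 16)
ix <- order(x)
xx <- x[ix]
n <- length(xx)

x1 <- c(-Inf, xx[-n])   # lower neighbor in sorted order
x2 <- c(xx[-1], Inf)    # upper neighbor in sorted order

# Position (in sorted order) of the nearer neighbor: one step down or up
pos <- seq_len(n) + ifelse(xx - x1 < x2 - xx, -1L, 1L)

# ix[pos] converts sorted positions to original indexes;
# order(ix) puts the result back in the original order
ix[pos][order(ix)]
## [1] 4 5 2 1 2
```

This assumes at least two elements. With ties or duplicates the returned index is not guaranteed to be the first matching one.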

Duplicates

The example in the question has no duplicates; with duplicates, the problem definition becomes ambiguous. Take c(1, 3, 4, 1): for the first value, 1, there is another element exactly equal to it, so under one interpretation the closest value is 1. Under another interpretation, the closest value not equal to 1 should be returned, which in this case is 3. Of the solutions above, sqldf gives the closest value not equal to the current value, whereas the others give the closest value among the remaining elements.
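The first interpretation can be checked by running (4b) unchanged on the duplicate example. The sketch below wraps (4b) in a helper purely for illustration; the function name nearest_other and the variable y are hypothetical, not from the answer.

```r
# Hypothetical helper wrapping solution (4b)
nearest_other <- function(v) {
  ix <- order(v)
  xx <- v[ix]
  x1 <- c(-Inf, xx[-length(xx)])  # lower neighbor in sorted order
  x2 <- c(xx[-1], Inf)            # upper neighbor in sorted order
  ifelse(xx - x1 < x2 - xx, x1, x2)[order(ix)]
}

y <- c(1, 3, 4, 1)
res <- nearest_other(y)
res
## [1] 1 4 3 1
```

Both 1s report the other 1 as their nearest value, because the duplicate sits next to them in sorted order at distance 0.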

To get the closest-not-equal interpretation from the solutions other than sqldf, we could use rle after ordering to compress the input down to its unique values and then use inverse.rle afterwards to expand the result, as shown in this modified 4b:

x <- c(1, 3, 4, 1)
ix <- order(x)
r <- rle(x[ix])
xx <- r$values
x1 <- c(-Inf, xx[-length(xx)])
x2 <- c(xx[-1], Inf)
r$values <- ifelse(xx - x1 < x2 - xx, x1, x2)
inverse.rle(r)[order(ix)]
## [1] 3 4 3 3
