How to select among 3 values, the 2 closest to each other in R?

Question

For each ID, I would like to select the two Cq values that are closest to each other. I thought I had figured it out, but my solution depends on row position...

Here is an example of my dataset:

df <- data.frame(ID = c("A","A","A","B","B","B","C","C","C"), 
                 Cq = c(34.32,34.40,34.31,31.49,31.40,31.49,31.22,31.31,31.08))
  ID    Cq
1  A 34.32
2  A 34.40
3  A 34.31
4  B 31.49
5  B 31.40
6  B 31.49
7  C 31.22
8  C 31.31
9  C 31.08

And here is what I tried:

df4 <-df %>% 
  group_by(ID) %>% 
  arrange(Cq) %>% 
  mutate(diffvals= Cq - lag(Cq)) %>%
  filter(row_number() == 1 | row_number() == 2)

#Output
ID       Cq   diffvals
1 A      34.31   NA     
2 A      34.32   0.0100
3 B      31.40   NA     
4 B      31.49   0.0900
5 C      31.08   NA     
6 C      31.22   0.14

And the expected output:

 ID    Cq
1  A 34.32
2  A 34.31
3  B 31.49
4  B 31.49
5  C 31.22
6  C 31.31

I've tried sorting my dataset beforehand, but it doesn't change anything. I also tried using filter(diffvals=wich.min==diffvals), but I don't know how to extract the two closest values.

If you have any ideas, it would help me a lot!

Thanks in advance

ThomasIsCoding · Accepted Answer

Here is a base R approach, where dist is used to compute the distances between all pairs of values within each group, i.e.,

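dfout <- do.call(rbind,
                 lapply(split(df, df$ID),
                        function(v) {
                          # pairwise distances within the group, with the diagonal masked
                          d <- `diag<-`(as.matrix(dist(v$Cq)), NA)
                          # keep each pair only once (upper triangle)
                          d[lower.tri(d)] <- NA
                          # row/column indices of the smallest distance pick the two rows
                          v[which(d == min(d, na.rm = TRUE), arr.ind = TRUE), ]
                        }
                 ))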

which gives

> dfout
    ID    Cq
A.1  A 34.32
A.3  A 34.31
B.4  B 31.49
B.6  B 31.49
C.7  C 31.22
C.8  C 31.31
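
To make the masking step easier to follow, here is a minimal sketch of the same idea applied to a single group. The Cq values are those of ID "A" from the question; the names cq, idx and closest are only illustrative and not part of the answer above.

```r
# Cq values for ID "A", copied from the question's data
cq <- c(34.32, 34.40, 34.31)

# full symmetric matrix of pairwise absolute differences
d <- as.matrix(dist(cq))

# mask the diagonal (each value against itself) and the lower triangle,
# so that every pair of values appears exactly once
diag(d) <- NA
d[lower.tri(d)] <- NA

# position of the smallest remaining distance; its row and column are
# the indices of the two closest values
idx <- which(d == min(d, na.rm = TRUE), arr.ind = TRUE)
closest <- cq[c(idx[1, "row"], idx[1, "col"])]
print(closest)
```

If several pairs share the same minimum distance, which() returns one row per pair; this sketch only keeps the first one.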

Ronak Shah · Answer

Using dplyr, one option is to full_join the data frame with itself on ID, drop the rows where an observation is paired with itself, keep the row with the smallest absolute difference for each ID, and then pivot the result to long format.

library(dplyr)

df_rows <- df %>% mutate(Row = row_number())

df_rows %>%
  full_join(df_rows, by = 'ID') %>%
  # keep each pair of distinct rows once; comparing row numbers rather than
  # Cq values keeps tied values such as the two 31.49 readings of B
  filter(Row.x < Row.y) %>%
  group_by(ID) %>%
  slice(which.min(abs(Cq.x - Cq.y))) %>%
  ungroup() %>%
  tidyr::pivot_longer(cols = starts_with('Cq')) %>%
  select(ID, value)

#  ID    value
#  <fct> <dbl>
#1 A      34.3
#2 A      34.3
#3 B      31.5
#4 B      31.5
#5 C      31.2
#6 C      31.3
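
The sort-and-lag idea from the question can also be made to work without a join, because once a group is sorted its two closest values are always neighbours. The attempt in the question kept the two smallest rows of each group rather than the rows around the smallest gap. A minimal sketch, assuming dplyr is installed; gap and closest_pairs are illustrative names, not from the original posts:

```r
library(dplyr)

# example data from the question
df <- data.frame(ID = c("A","A","A","B","B","B","C","C","C"),
                 Cq = c(34.32,34.40,34.31,31.49,31.40,31.49,31.22,31.31,31.08))

closest_pairs <- df %>%
  group_by(ID) %>%
  arrange(Cq, .by_group = TRUE) %>%
  # gap to the previous (smaller) value; NA for the first row of each group
  mutate(gap = Cq - lag(Cq)) %>%
  # keep the row with the smallest gap and the row just before it
  filter(row_number() %in% (which.min(gap) - 0:1)) %>%
  ungroup() %>%
  select(ID, Cq)

print(closest_pairs)
```

Note that the rows come back sorted by Cq within each ID, so the order differs from the expected output in the question, and ties between gaps are resolved in favour of the first (smallest-valued) pair.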

Tags: r, dplyr

user12933512
