
Compact letter display from logical matrix

Tags:

r

I was wondering if there is a way to turn a logical matrix of comparisons into the letter notation used in multiple comparison tests, as produced by multcomp::cld.

The data I have looks like this:

test_data <- data.frame(mean=c(1.48, 1.59, 1.81,1.94),CI_lower=c(1.29,1.38,1.54, 1.62),CI_upper=c(1.56,1.84, 2.3, 2.59))

  mean CI_lower CI_upper
1 1.48     1.29     1.56
2 1.59     1.38     1.84
3 1.81     1.54     2.30
4 1.94     1.62     2.59

What I am interested in is a notation that says which entries have overlapping CIs to get a final result that looks like this:

final <- data.frame(mean=c(1.48, 1.59, 1.81,1.94),CI_lower=c(1.29, 1.38,1.54, 1.62),CI_upper=c(1.56,1.84, 2.3, 2.59),letters = c("a","ab","ab","b"))

  mean CI_lower CI_upper letters
1 1.48     1.29     1.56       a
2 1.59     1.38     1.84      ab
3 1.81     1.54     2.30      ab
4 1.94     1.62     2.59       b
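
For reference, the logical matrix behind these letters can be sketched in base R. This is only an illustration, under the assumption that two rows count as "the same" when their intervals overlap, i.e. each lower bound is at or below the other row's upper bound. The name `overlap` is made up for this sketch.

```r
test_data <- data.frame(
  mean     = c(1.48, 1.59, 1.81, 1.94),
  CI_lower = c(1.29, 1.38, 1.54, 1.62),
  CI_upper = c(1.56, 1.84, 2.30, 2.59)
)

# below[i, j] is TRUE when the lower bound of row i is <= the upper bound of row j
below <- outer(test_data$CI_lower, test_data$CI_upper, "<=")

# two intervals overlap when the condition holds in both directions
overlap <- below & t(below)
overlap
```

In this matrix only rows 1 and 4 fail to overlap, which is why row 1 and row 4 end up with different letters while rows 2 and 3 share both.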

I made a pitiful attempt that went like this:

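# same[i, j] is TRUE when the lower bound of row i lies below the upper bound of row j
same <- outer(test_data$CI_lower, test_data$CI_upper, "-")
same <- same < 0
# keep each pair once (lower triangle, no diagonal)
same <- lower.tri(same, diag = FALSE) & same

# one row per overlapping pair: (row, col)
same_ind <- which(same, arr.ind = TRUE)

groups <- as.list(as.numeric(rep(NA, nrow(test_data))))

# give every overlapping pair its own index and attach it to both members
for (i in seq_len(nrow(same_ind))) {
  group_pos <- as.numeric(same_ind[i, ])
  for (i2 in group_pos) {
    groups[[i2]] <- c(groups[[i2]], i)
  }
}

# translate the pair indices into letters and collapse them per row
letters_notation <- sapply(groups, function(x) {
  x <- x[!is.na(x)]
  x <- letters[x]
  paste0(x, collapse = "")
})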

which gives this:

  mean CI_lower CI_upper letters
1 1.48     1.29     1.56      ab
2 1.59     1.38     1.84     acd
3 1.81     1.54     2.30     bce
4 1.94     1.62     2.59      de

Any ideas for how to do this?

Jan Stanstrup asked Jan 10 '23 08:01



1 Answer

Here's a possible solution using data.table's very efficient foverlaps function. This is not exactly your desired output (because I don't fully understand it), but you can identify the overlapping rows from it.

library(data.table)
setkey(setDT(test_data), CI_lower, CI_upper)
Overlaps <- foverlaps(test_data, test_data, type = "any", which = TRUE) ## returns overlap indices
test_data[ , overlaps := Overlaps[, paste(letters[yid], collapse = ""), xid]$V1][]
#    mean CI_lower CI_upper overlaps
# 1: 1.48     1.29     1.56      abc <~~ not overlapping with d
# 2: 1.59     1.38     1.84     abcd
# 3: 1.81     1.54     2.30     abcd
# 4: 1.94     1.62     2.59      bcd <~~ not overlapping with a
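
One possible reading of the desired output is that each letter marks a maximal set of rows whose intervals all overlap each other. Under that assumption (it is not confirmed by the question), a base R sketch could look like the following. The helper `max_cliques` is hypothetical and written for this example; it is a plain Bron-Kerbosch search without pivoting, so it is only meant for small matrices.

```r
test_data <- data.frame(
  mean     = c(1.48, 1.59, 1.81, 1.94),
  CI_lower = c(1.29, 1.38, 1.54, 1.62),
  CI_upper = c(1.56, 1.84, 2.30, 2.59)
)

# symmetric logical matrix: TRUE where two intervals overlap
below   <- outer(test_data$CI_lower, test_data$CI_upper, "<=")
overlap <- below & t(below)

# hypothetical helper: maximal cliques of a symmetric logical matrix
max_cliques <- function(adj) {
  diag(adj) <- FALSE
  found <- list()
  expand <- function(R, P, X) {
    if (length(P) == 0 && length(X) == 0) {
      # nothing left to add and nothing excluded: R is maximal
      found[[length(found) + 1]] <<- sort(R)
      return(invisible(NULL))
    }
    for (v in P) {
      nb <- which(adj[v, ])
      expand(c(R, v), intersect(P, nb), intersect(X, nb))
      P <- setdiff(P, v)
      X <- c(X, v)
    }
  }
  expand(integer(0), seq_len(nrow(adj)), integer(0))
  found
}

cliques <- max_cliques(overlap)

# one letter per clique; each row collects the letters of the cliques it is in
test_data$letters <- vapply(seq_len(nrow(overlap)), function(i) {
  member <- vapply(cliques, function(cl) i %in% cl, logical(1))
  paste(letters[which(member)], collapse = "")
}, character(1))

test_data$letters
# "a" "ab" "ab" "b"
```

For this data the two maximal groups are rows 1 to 3 and rows 2 to 4, which reproduces the letters in the question. With more than 26 groups `letters` would run out, so a longer label vector would be needed.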
David Arenburg answered Jan 12 '23 08:01
