Count number of pairs across elements in a list in R?

Tags: r

Similar questions have been asked about counting pairs, but none of them quite covers what I'm trying to do.

What I want is to count the number of pairs across multiple list elements and turn it into a matrix. For example, if I have a list like so:

myList <- list(
  a = c(2,4,6),
  b = c(1,2,3,4),
  c = c(1,2,5,7),
  d = c(1,2,4,5,8)
)

We can see that the pair 1:2 appears 3 times (once each in b, c, and d). The pair 1:3 appears only once, in b. The pair 1:4 appears 2 times (once each in b and d), and so on.

I would like to count the number of times a pair appears and then turn it into a symmetrical matrix. For example, my desired output would look something like the matrix I created manually (where each element of the matrix is the total count for that pair of values):

> myMatrix
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
[1,]    0    3    1    2    2    0    1    1
[2,]    3    0    1    3    2    1    1    1
[3,]    1    1    0    1    0    0    0    0
[4,]    2    3    1    0    1    1    0    1
[5,]    2    2    0    1    0    0    1    1
[6,]    0    1    0    1    0    0    0    0
[7,]    1    1    0    0    1    0    0    0
[8,]    1    1    0    1    1    0    0    0

Any suggestions are greatly appreciated.

Electrino asked Sep 07 '21 21:09


4 Answers

Inspired by @akrun's answer, I think you can use a cross product to get this quickly and simply:

out <- tcrossprod(table(stack(myList)))
diag(out) <- 0

#      values
#values 1 2 3 4 5 6 7 8
#     1 0 3 1 2 2 0 1 1
#     2 3 0 1 3 2 1 1 1
#     3 1 1 0 1 0 0 0 0
#     4 2 3 1 0 1 1 0 1
#     5 2 2 0 1 0 0 1 1
#     6 0 1 0 1 0 0 0 0
#     7 1 1 0 0 1 0 0 0
#     8 1 1 0 1 1 0 0 0
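
For what it's worth, here is why the cross product works, as a minimal sketch rather than part of the original answer. table(stack(myList)) is an incidence matrix with one row per value and one column per list element, so multiplying it by its own transpose gives, at [i, j], the number of elements that contain both i and j. That only holds if each value occurs at most once per list element. The clipping step below is an added guard for repeated values, on the assumption that a repeated value should still count a pair only once.

```r
myList <- list(
  a = c(2, 4, 6),
  b = c(1, 2, 3, 4),
  c = c(1, 2, 5, 7),
  d = c(1, 2, 4, 5, 8)
)

# Incidence matrix: one row per value, one column per list element
inc <- unclass(table(stack(myList)))

# Added guard (assumption): clip counts to 0/1 so that a value repeated
# inside one element does not inflate the pair counts
inc[inc > 1] <- 1

# inc %*% t(inc): entry [i, j] = number of elements containing both i and j
out <- tcrossprod(inc)
diag(out) <- 0  # a value is not paired with itself

out
```

With the example list the guard changes nothing, since no vector contains a repeated value.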

Original answer:

Use combn to get every pair within each list element, along with the reverse of each pair so that the result is symmetric.
Then bind the pairs into a data.frame and tabulate them with table.

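# \(x) is shorthand for function(x) and needs R >= 4.1
# For each list element, build every pair together with its reverse
tab <- lapply(myList, \(x) combn(x, m=2, FUN=\(cm) rbind(cm, rev(cm)), simplify=FALSE))
# Flatten one level, then stack all pairs into a two-column data.frame
tab <- data.frame(do.call(rbind, unlist(tab, recursive=FALSE)))
table(tab)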

#   X2
#X1  1 2 3 4 5 6 7 8
#  1 0 3 1 2 2 0 1 1
#  2 3 0 1 3 2 1 1 1
#  3 1 1 0 1 0 0 0 0
#  4 2 3 1 0 1 1 0 1
#  5 2 2 0 1 0 0 1 1
#  6 0 1 0 1 0 0 0 0
#  7 1 1 0 0 1 0 0 0
#  8 1 1 0 1 1 0 0 0

thelatemail answered Nov 19 '22 16:11


We could loop over the list and, for each element, get the pairwise combinations with combn, stack them into a two-column dataset, convert the 'values' column to a factor with levels 1 to 8, get the frequency count (table), take the cross product (crossprod), and convert the result to logical (> 0). Then Reduce the list of matrices by adding them elementwise, and finally set the diagonal elements to 0. (If needed, set the names attribute of the dimnames to NULL.)

out <- Reduce(`+`, lapply(myList, function(x)
  crossprod(table(transform(
    stack(setNames(combn(x, 2, simplify = FALSE),
                   combn(x, 2, paste, collapse = "_"))),
    values = factor(values, levels = 1:8))[2:1])) > 0))
diag(out) <- 0
names(dimnames(out)) <- NULL

-output

> out
  1 2 3 4 5 6 7 8
1 0 3 1 2 2 0 1 1
2 3 0 1 3 2 1 1 1
3 1 1 0 1 0 0 0 0
4 2 3 1 0 1 1 0 1
5 2 2 0 1 0 0 1 1
6 0 1 0 1 0 0 0 0
7 1 1 0 0 1 0 0 0
8 1 1 0 1 1 0 0 0
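
The same per-element idea can be spelled out more plainly. The following is only a sketch of an alternative, not the code above: it swaps the combn/stack/table steps for a 0/1 indicator vector per list element and takes that vector's outer product. The names lev and out2, and the assumption that the values run from 1 to 8, are illustrative.

```r
myList <- list(
  a = c(2, 4, 6),
  b = c(1, 2, 3, 4),
  c = c(1, 2, 5, 7),
  d = c(1, 2, 4, 5, 8)
)

lev <- 1:8  # assumed range of values (hypothetical name)

out2 <- Reduce(`+`, lapply(myList, function(x) {
  v <- as.integer(lev %in% x)  # 1 where the value is present in this element
  outer(v, v)                  # 1 at [i, j] when both i and j are present
}))
diag(out2) <- 0                 # drop self-pairs
dimnames(out2) <- list(lev, lev)

out2
```

Because %in% only records presence, a value repeated inside one element still contributes a single count per pair.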

akrun answered Nov 19 '22 14:11


I thought of a solution based on @TarJae's answer. It is not an elegant one, but it was a fun challenge!

Libraries

library(tidyverse)

Code

map_df(myList,function(x) as_tibble(t(combn(x,2)))) %>% 
  count(V1,V2) %>% 
  {. -> temp_df} %>% 
  bind_rows(
    temp_df %>% 
      rename(V2 = V1, V1 = V2) 
  ) %>% 
  full_join(
    expand_grid(V1 = 1:8,V2 = 1:8)
  ) %>% 
  replace_na(replace = list(n = 0)) %>% 
  arrange(V2,V1) %>% 
  pivot_wider(names_from = V1,values_from = n) %>% 
  as.matrix()

Output

     V2 1 2 3 4 5 6 7 8
[1,]  1 0 3 1 2 2 0 1 1
[2,]  2 3 0 1 3 2 1 1 1
[3,]  3 1 1 0 1 0 0 0 0
[4,]  4 2 3 1 0 1 1 0 1
[5,]  5 2 2 0 1 0 0 1 1
[6,]  6 0 1 0 1 0 0 0 0
[7,]  7 1 1 0 0 1 0 0 0
[8,]  8 1 1 0 1 1 0 0 0

Vinícius Félix answered Nov 19 '22 16:11


First, convert the possible pairs from each vector in the list to a tibble, then bind them into one tibble and count the combinations.

library(tidyverse)

a <- as_tibble(t(combn(myList[[1]],2)))
b <- as_tibble(t(combn(myList[[2]],2)))
c <- as_tibble(t(combn(myList[[3]],2)))
d <- as_tibble(t(combn(myList[[4]],2)))

bind_rows(a,b,c,d) %>% 
    count(V1, V2)
      V1    V2     n
   <dbl> <dbl> <int>
 1     1     2     3
 2     1     3     1
 3     1     4     2
 4     1     5     2
 5     1     7     1
 6     1     8     1
 7     2     3     1
 8     2     4     3
 9     2     5     2
10     2     6     1
11     2     7     1
12     2     8     1
13     3     4     1
14     4     5     1
15     4     6     1
16     4     8     1
17     5     7     1
18     5     8     1
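
The counts above are in long format and list each pair only once (V1 < V2). To get from there to the symmetric matrix the question asks for, one option is sketched below in base R. It is not part of the tidyverse code above; the names pair_mat, lev, upper, and sym are illustrative, and each vector is sorted first so the smaller value always lands in the first column.

```r
myList <- list(
  a = c(2, 4, 6),
  b = c(1, 2, 3, 4),
  c = c(1, 2, 5, 7),
  d = c(1, 2, 4, 5, 8)
)

# All within-element pairs as a two-column matrix (one row per pair)
pair_mat <- do.call(rbind, lapply(myList, function(x) t(combn(sort(x), 2))))

# Shared factor levels so the cross-tabulation is square
lev <- sort(unique(as.vector(pair_mat)))
upper <- unclass(table(factor(pair_mat[, 1], levels = lev),
                       factor(pair_mat[, 2], levels = lev)))

# Mirror the upper triangle to make the matrix symmetric
sym <- upper + t(upper)
names(dimnames(sym)) <- NULL

sym
```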

TarJae answered Nov 19 '22 14:11