Merging list with common elements

Question

I have a list

[[1]]
[1] 7

[[2]]
[1] 10 11 12 211 446 469

[[3]]
[1] 10 11 12 13

[[4]]
[1] 11 12 13 215

[[5]]
[1] 15 16

[[6]]
[1] 15 17 216 225

I want to merge list slices that have common elements, and index which list slices have been merged. My desired output is below.

$`1`
[1] 7

$`2`, `3`, `4`
[1] 10 11 12 13 211 215 446 469

$`5`,`6`
[1] 15 16 17 216 225

(I've put the original list slice indices as new list names, but any form of output is fine.)

Reproducible data:

mylist <- list(7, c(10, 11, 12, 211, 446, 469), c(10, 11, 12, 13), c(11, 
12, 13, 215), c(15, 16), c(15, 17, 216, 225))

alexis_laz · Accepted Answer

Here is another approach using the "Matrix" and "igraph" packages.

First, we need to extract the information about which elements are connected. Using sparse matrices can potentially save a lot of memory:

library(Matrix)
i = rep(seq_along(mylist), lengths(mylist))
j = factor(unlist(mylist))
tab = sparseMatrix(i = i, j = as.integer(j), x = TRUE, dimnames = list(NULL, levels(j)))
#as.matrix(tab)  ## just to print colnames
#         7    10    11    12    13    15    16    17   211   215   216   225   446   469
#[1,]  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
#[2,] FALSE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE  TRUE  TRUE
#[3,] FALSE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
#[4,] FALSE FALSE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE
#[5,] FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
#[6,] FALSE FALSE FALSE FALSE FALSE  TRUE FALSE  TRUE FALSE FALSE  TRUE  TRUE FALSE FALSE

Next, find which list elements are connected to each other, i.e. share at least one value:

connects = tcrossprod(tab, boolArith = TRUE)
#connects
#6 x 6 sparse Matrix of class "lsCMatrix"
#                
#[1,] | . . . . .
#[2,] . | | | . .
#[3,] . | | | . .
#[4,] . | | | . .
#[5,] . . . . | |
#[6,] . . . . | |
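
To make it clearer what the cross product computes, here is a minimal dense sketch in base R, without the "Matrix" package. The names "counts", "shared" and "dense_connects" are only illustrative and not part of the answer above; for large inputs the sparse version is likely the better choice.

```r
mylist <- list(7, c(10, 11, 12, 211, 446, 469), c(10, 11, 12, 13),
               c(11, 12, 13, 215), c(15, 16), c(15, 17, 216, 225))

# One row per list element, one column per distinct value (0/1 counts here,
# because no value is repeated within a list element)
i <- rep(seq_along(mylist), lengths(mylist))
counts <- unclass(table(i, unlist(mylist)))

# Entry [r, c] is the number of values shared by list elements r and c
shared <- tcrossprod(counts)
shared[2, 3]   # 3  (10, 11 and 12)
shared[1, 2]   # 0  (nothing in common)

# Thresholding gives the same TRUE/FALSE pattern as "connects"
dense_connects <- shared > 0
```

The diagonal of "shared" is simply the length of each list element, which is why every element is connected to itself.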

Then, using graphs, we can group the indices of "mylist":

library(igraph)
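# 'graph_from_adjacency_matrix' does not seem to work with the "connects" object directly,
# so it is coerced below. An alternative to coercing "connects" here would be to build it
# as 'tcrossprod(tab) > 0'.
# (In newer igraph versions, 'components' is the current name for 'clusters'.)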

group = clusters(graph_from_adjacency_matrix(as(connects, "lsCMatrix")))$membership
#group
#[1] 1 2 2 2 3 3
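
If installing "igraph" is not an option, the grouping step can also be sketched in base R. This is not what the answer above does: it is a plain breadth-first labelling of connected components, and "label_components" and "adj" are hypothetical names used only for this illustration.

```r
mylist <- list(7, c(10, 11, 12, 211, 446, 469), c(10, 11, 12, 13),
               c(11, 12, 13, 215), c(15, 16), c(15, 17, 216, 225))

# Logical adjacency matrix: TRUE where two list elements share a value
adj <- vapply(mylist, function(x)
  vapply(mylist, function(y) any(x %in% y), logical(1)),
  logical(length(mylist)))

label_components <- function(adj) {
  n <- nrow(adj)
  group <- integer(n)            # 0 means "not labelled yet"
  current <- 0L
  for (start in seq_len(n)) {
    if (group[start] != 0L) next
    current <- current + 1L
    frontier <- start
    while (length(frontier) > 0L) {
      group[frontier] <- current
      # unlabelled neighbours of anything in the current frontier
      frontier <- which(colSums(adj[frontier, , drop = FALSE]) > 0 & group == 0L)
    }
  }
  group
}

group <- label_components(adj)
group
# [1] 1 2 2 2 3 3
```

This builds a dense n x n matrix, so it is only meant for lists of modest length.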

And, finally, concatenate:

tapply(mylist, group, function(x) sort(unique(unlist(x))))
#$`1`
#[1] 7
#
#$`2`
#[1]  10  11  12  13 211 215 446 469
#
#$`3`
#[1]  15  16  17 216 225

tapply(seq_along(mylist), group, toString)
#        1         2         3 
#      "1" "2, 3, 4"    "5, 6"
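
The two results can be combined so that the merged list carries the original indices as names, which is close to the output asked for in the question. This is a small sketch that hard-codes the membership vector printed above; "merged" is just an illustrative name.

```r
mylist <- list(7, c(10, 11, 12, 211, 446, 469), c(10, 11, 12, 13),
               c(11, 12, 13, 215), c(15, 16), c(15, 17, 216, 225))
group <- c(1, 2, 2, 2, 3, 3)   # membership copied from the output above

# Merge the values within each group
merged <- lapply(split(mylist, group), function(x) sort(unique(unlist(x))))

# Name each merged element after the original indices it came from
names(merged) <- unname(vapply(split(seq_along(mylist), group),
                               toString, character(1)))

names(merged)
# [1] "1"       "2, 3, 4" "5, 6"
merged[["5, 6"]]
# [1]  15  16  17 216 225
```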

Ronak Shah · Answer

I'm not happy with this solution, but I think it gives the answer. There is still scope for improvement:

unique(sapply(lst, function(x) 
       unique(unlist(lst[sapply(lst, function(y) 
                         any(x %in% y))]))))


#[[1]]
#[1] 7

#[[2]]
#[1]  10  11  12 211 446 469  13 215

#[[3]]
#[1]  15  16  17 216 225

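This is basically a double loop that checks whether any value of one list element is present in any other list element. If such an overlap is found, the overlapping elements are merged, keeping only their unique values, and the outer unique() drops the repeated merged results. Note that a single pass only merges elements that overlap directly, so a chain of overlaps (A with B, B with C, but not A with C) would not be fully merged.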

data

lst <- list(7, c(10 ,11 ,12, 211, 446, 469), c(10, 11, 12, 13),c(11 ,12, 13 ,215), 
               c(15, 16), c(15, 17 ,216 ,225))
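
One caveat with the double loop is that a single pass only joins elements that overlap directly. The sketch below uses a made-up "chain" list to show this, and repeats the pass until nothing changes. "merge_once" and "merge_until_stable" are hypothetical helper names, and this is only one possible way to handle chains, not part of the answer above.

```r
# Made-up input: 1 overlaps 2, 2 overlaps 3, but 1 and 3 share nothing
chain <- list(c(1, 2), c(2, 3), c(3, 4))

# One pass of the double loop, with sorted results so they compare cleanly
merge_once <- function(lst) {
  merged <- lapply(lst, function(x) {
    hits <- vapply(lst, function(y) any(x %in% y), logical(1))
    sort(unique(unlist(lst[hits])))
  })
  unique(merged)
}

# Repeat the pass until the list stops changing
merge_until_stable <- function(lst) {
  repeat {
    merged <- merge_once(lst)
    if (identical(merged, lst)) return(merged)
    lst <- merged
  }
}

length(merge_once(chain))   # 3: still three overlapping pieces
merge_until_stable(chain)
# [[1]]
# [1] 1 2 3 4
```

On the data from the question both functions give the same three groups, because every overlap there is direct.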

Tags: merge, list, r
