R: how to find the intersection of a subset of vectors in a list

I have a list of vectors (characters). For example:

my_list <- list(c("a", "b", "c"), 
                c("a", "b", "c", "d"), 
                c("e", "d"))

For the intersection of all three vectors, I could use Reduce(intersect, my_list). But as you can see, no element is common to all three vectors.

What if I want to find the elements that appear in at least a certain number of the vectors in the list? For example, somefunction(my_list, time=2) would give me c("a", "b", "c", "d") because each of those elements appears in two of the vectors.

Thanks.

Yan asked Sep 15 '16 15:09


1 Answer

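We can stack the list into a data.table, group by value, and keep each value that occurs in at least two of the vectors. The comparison is >= rather than == so that it matches the "at least" in the question; for this example both give the same result.

library(data.table)
# stack() yields columns 'values' (the elements) and 'ind' (the index of the source vector)
setDT(stack(setNames(my_list, seq_along(my_list))))[,
           if(uniqueN(ind) >= 2) values, values]$values
#[1] "a" "b" "c" "d"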

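A base R option is to unlist 'my_list', cross-tabulate the elements against the index of the vector each one came from (built with rep and lengths), take the column sums to get the count for each element, and use the names whose count is at least 2. Note that colSums adds up every occurrence, so an element repeated inside a single vector is counted more than once.

# rows: index of the source vector, columns: elements
tblCount <- colSums(table(rep(seq_along(my_list), lengths(my_list)), unlist(my_list)))
names(tblCount)[tblCount >= 2]
#[1] "a" "b" "c" "d"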
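
To get something closer to the somefunction(my_list, time=2) call from the question, the base R idea could be wrapped in a small function. This is only a sketch: the name common_at_least and its min_count argument are made up for illustration, and it assumes an element should be counted once per vector, so each vector is passed through unique() first.

```r
# Hypothetical helper (not part of the original answer): return the elements
# that occur in at least `min_count` of the vectors in the list `x`.
common_at_least <- function(x, min_count = 2) {
  # Deduplicate within each vector so that repeats inside one vector count once
  counts <- table(unlist(lapply(x, unique)))
  # Keep the names whose count reaches the threshold
  names(counts)[counts >= min_count]
}

my_list <- list(c("a", "b", "c"),
                c("a", "b", "c", "d"),
                c("e", "d"))

common_at_least(my_list, 2)
#[1] "a" "b" "c" "d"
common_at_least(my_list, 3)
#character(0)
```

Setting min_count to length(my_list) gives the elements common to every vector, although the result comes back sorted by table() rather than in the order Reduce(intersect, my_list) would return.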

akrun answered Nov 15 '22 05:11