
How to get all possible subsets of a character vector in R?

Tags:

r

subset

Having the following vector:

c("test1","test2","test3")

I am trying to get a list or data frame containing the following entries:

"test1" "test2" "test3"
"test1" "test2" NA
"test1" NA "test3"
"test1"  NA NA
NA  "test2" "test3"
NA  "test2" NA
NA  NA "test3"

The goal is to get all possible non-empty subsets, where order doesn't matter: "test1" "test2" NA is equivalent to "test2" "test1" NA. I very much appreciate any help!

Patrick Balada asked Mar 24 '16 09:03




2 Answers

You can use combn:

# All combinations of size 1, 2 and 3, flattened into a single list
res <- unlist(lapply(1:3, combn, 
                     x = c("test1","test2","test3"), simplify = FALSE), 
              recursive = FALSE)
# Pad each combination with NA to length 3; sapply binds them as columns
res <- sapply(res, `length<-`, 3)
res
#        [,1]    [,2]    [,3]    [,4]    [,5]    [,6]    [,7]   
#[1,] "test1" "test2" "test3" "test1" "test1" "test2" "test1"
#[2,] NA      NA      NA      "test2" "test3" "test3" "test2"
#[3,] NA      NA      NA      NA      NA      NA      "test3"
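
The result above stores each subset in a column and pads with trailing NA. If the layout from the question is needed instead (one subset per row, each element kept in its own column), a minimal base-R sketch could look like the following. The names `keep` and `subsets` are illustrative only, and the row order may differ from the listing in the question.

```r
x <- c("test1", "test2", "test3")

# One row per combination of keep/drop flags, one column per element of x
keep <- as.matrix(expand.grid(rep(list(c(TRUE, FALSE)), length(x))))

# Drop the all-FALSE row (the empty subset), which the question does not list
keep <- keep[rowSums(keep) > 0, , drop = FALSE]

# Keep each element in its own column where flagged, NA elsewhere
subsets <- t(apply(keep, 1, function(k) ifelse(k, x, NA_character_)))
dimnames(subsets) <- NULL
subsets
```

Wrapping the result in `as.data.frame(subsets)` should give a data frame if that is preferred over a matrix.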
Roland answered Oct 05 '22 11:10


There is a package sets with the relevant function.

library(sets)
a <- c("test1","test2","test3")
set_power(a)

{{}, {"test1"}, {"test2"}, {"test3"}, {"test1", "test2"}, {"test1", "test3"}, {"test2", "test3"}, {"test1", "test2", "test3"}}

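This returns the power set, that is, the set of all subsets. Note that it includes the empty set {}, which the list in the question leaves out.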
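
If installing a package is not an option, the same power set can be sketched in base R by treating each integer from 0 to 2^n - 1 as a bit mask over the elements. This is only an illustration, not how sets implements it; `power_set` and `mask` are made-up names.

```r
a <- c("test1", "test2", "test3")
n <- length(a)

# Each integer 0 .. 2^n - 1 encodes one subset:
# bit i (counting from 1) set means a[i] is included
power_set <- lapply(0:(2^n - 1), function(mask) {
  a[(mask %/% 2^(seq_len(n) - 1)) %% 2 == 1]
})

# Drop the empty subset to match the seven entries asked for in the question
non_empty <- Filter(length, power_set)
```

Here `power_set` has 2^3 = 8 elements, the first being `character(0)`, and `non_empty` has the remaining 7.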

gfgm answered Oct 05 '22 13:10