How to get all possible subsets of a character vector in R?

Tags:

r

subset

Given the following vector:

c("test1","test2","test3")

I am trying to get a list or data frame containing the following entries:

"test1" "test2" "test3"
"test1" "test2" NA
"test1" NA      "test3"
"test1" NA      NA
NA      "test2" "test3"
NA      "test2" NA
NA      NA      "test3"

The goal is to get all possible subsets, where order doesn't matter: "test1" "test2" NA is equivalent to "test2" "test1" NA. I very much appreciate any help!

846

asked Mar 24 '16 09:03

Patrick Balada

2 Answers

You can use combn:

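# All combinations of size 1, 2 and 3, flattened into a single list
res <- unlist(lapply(1:3, combn, 
                     x = c("test1","test2","test3"), simplify = FALSE), 
              recursive = FALSE)
# Pad each combination with NA to length 3; sapply binds them as columns
res <- sapply(res, `length<-`, 3)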
#        [,1]    [,2]    [,3]    [,4]    [,5]    [,6]    [,7]   
#[1,] "test1" "test2" "test3" "test1" "test1" "test2" "test1"
#[2,] NA      NA      NA      "test2" "test3" "test3" "test2"
#[3,] NA      NA      NA      NA      NA      NA      "test3"
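
The matrix above stores one subset per column and pads with NA at the end. If the positional layout from the question is wanted instead (one subset per row, each element in its own column), a minimal base R sketch could look like the following. The names `keep` and `subsets` are only illustrative, and this is one possible approach, not part of the answer above.

```r
x <- c("test1", "test2", "test3")

# One TRUE/FALSE column per element: each row is one inclusion pattern.
keep <- as.matrix(expand.grid(rep(list(c(TRUE, FALSE)), length(x))))

# Drop the all-FALSE row (the empty subset), which the question does not list.
keep <- keep[rowSums(keep) > 0, , drop = FALSE]

# Put each element in its own column where it is kept, NA elsewhere.
subsets <- matrix(NA_character_, nrow = nrow(keep), ncol = length(x))
for (j in seq_along(x)) {
  subsets[keep[, j], j] <- x[j]
}
subsets
```

Wrapping the result in `as.data.frame(subsets)` gives a data frame if that is preferred over a matrix.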

101

answered Oct 05 '22 11:10

Roland

The sets package provides a function for this, set_power:

library(sets)
a <- c("test1","test2","test3")
set_power(a)

{{}, {"test1"}, {"test2"}, {"test3"}, {"test1", "test2"}, {"test1", "test3"}, {"test2", "test3"}, {"test1", "test2", "test3"}}

This returns the power set, i.e. the set of all subsets, including the empty set.
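
If installing an extra package is not an option, the same eight subsets can be sketched in base R by treating the integers 0 to 2^n - 1 as bit masks. The name `power_set` is only illustrative, and the result is a plain list of character vectors rather than a sets object.

```r
a <- c("test1", "test2", "test3")
n <- length(a)

# Bit j of i decides whether a[j] belongs to the i-th subset.
power_set <- lapply(0:(2^n - 1), function(i) {
  a[as.integer(intToBits(i))[seq_len(n)] == 1L]
})

length(power_set)  # one entry per subset, including the empty one
```

The first entry is the empty subset (`character(0)`); drop it with `power_set[-1]` to match the seven entries listed in the question.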

answered Oct 05 '22 13:10

gfgm
