Algorithm to calculate power set (all possible subsets) of a set in R

Tags: r, set, powerset

I couldn't find an answer to this anywhere, so here's my solution.

The question is: how can you calculate a power set in R?

It is possible to do this with the "sets" package, using the command 2^as.set(c(1,2,3,4)), which yields the output {{}, {1}, {2}, {3}, {4}, {1, 2}, {1, 3}, {1, 4}, {2, 3}, {2, 4}, {3, 4}, {1, 2, 3}, {1, 2, 4}, {1, 3, 4}, {2, 3, 4}, {1, 2, 3, 4}}. However, this uses a recursive algorithm, which is rather slow.

Here's the algorithm I came up with.

It's non-recursive, so it's much faster than some of the other solutions out there (and ~100x faster on my machine than the algorithm in the "sets" package). The running time is still O(2^n), since a power set of n elements has 2^n subsets to generate.

The conceptual basis for this algorithm is the following:

for each element in the set:
    for each subset constructed so far:
        new subset = (subset + element)

Here's the R code:

EDIT: here's a somewhat faster version of the same concept; my original algorithm is in the third comment to this post. This one is 30% faster on my machine for a set of length 19.

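powerset = function(s){
    len = length(s)
    # preallocate the result list; slot 1 holds the empty subset
    l = vector(mode="list",length=2^len)
    l[[1]] = numeric()
    counter = 1L
    # seq_along() also handles an empty set, where 1L:length(s) would give c(1, 0)
    for(x in seq_along(s)){
        # 1L:counter is evaluated once, before counter is incremented below
        for(subset in 1L:counter){
            counter = counter+1L
            l[[counter]] = c(l[[subset]],s[x])
        }
    }
    return(l)
}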

This version saves time by allocating the list at its final length (2^len) up front and using the "counter" variable to track the position at which to save each new subset. It's also possible to calculate the position analytically, but that was slightly slower.
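For comparison, here is a minimal sketch of what computing the position analytically could look like. This is an assumption about the variant mentioned above, not the original code: the name powerset_analytic is made up for illustration, and it relies on the fact that 2^(x-1) subsets already exist before element x is processed, so the new subset built from slot i lands in slot 2^(x-1) + i.

```r
# Hypothetical variant: slot positions are computed directly instead of
# being tracked with a running counter.
powerset_analytic <- function(s) {
  n <- length(s)
  l <- vector(mode = "list", length = 2^n)
  l[[1]] <- numeric()                 # slot 1 holds the empty subset
  for (x in seq_along(s)) {
    offset <- 2^(x - 1)               # subsets built before element x
    for (i in seq_len(offset)) {
      # extend existing subset i with element x, store it past the offset
      l[[offset + i]] <- c(l[[i]], s[x])
    }
  }
  l
}

res <- powerset_analytic(c(1, 2, 3))
length(res)   # 8 subsets for a set of 3 elements
```

The subsets come out in the same order as with the counter-based version, since both fill the list slot by slot from the subsets built so far.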

225

asked Sep 10 '13 09:09

sssheridan

1 Answer

A subset can be seen as a boolean vector, indicating whether each element is in the subset or not. Those boolean vectors can be seen as numbers written in binary. Enumerating all the subsets of 1:n is therefore equivalent to enumerating the numbers from 0 to 2^n-1.

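f <- function(set) {
  n <- length(set)
  # masks[i] has only bit i set: 1, 2, 4, ..., 2^(n-1)
  masks <- 2^(seq_len(n) - 1)
  # each number u in 0 .. 2^n - 1 encodes one subset in its binary digits
  lapply( seq_len(2^n) - 1, function(u) set[ bitwAnd(u, masks) != 0 ] )
}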
f(LETTERS[1:4])
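To make the encoding concrete, here is a small worked example for a single number, written as a sketch alongside the function above. The variable names (u, selected) are only for illustration. The number 5 is 0101 in binary, so bits 1 and 3 are set and the first and third elements are picked.

```r
set <- LETTERS[1:4]
masks <- 2^(seq_along(set) - 1)      # 1 2 4 8, one bit per element
u <- 5                               # binary 0101
selected <- bitwAnd(u, masks) != 0   # TRUE FALSE TRUE FALSE
set[selected]                        # "A" "C"
```

Running the same test for every u from 0 to 2^4 - 1 gives all 16 subsets, with u = 0 producing the empty subset and u = 15 the full set.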

153

answered Oct 23 '22 06:10

Vincent Zoonekynd
