Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Algorithm to calculate power set (all possible subsets) of a set in R

Tags:

r

set

powerset

I couldn't find an answer to this anywhere, so here's my solution.

The question is: how can you calculate a power set in R?

It is possible to do this with the library "sets", with the command 2^as.set(c(1,2,3,4)), which yields the output {{}, {1}, {2}, {3}, {4}, {1, 2}, {1, 3}, {1, 4}, {2, 3}, {2, 4}, {3, 4}, {1, 2, 3}, {1, 2, 4}, {1, 3, 4}, {2, 3, 4}, {1, 2, 3, 4}}. However, this uses a recursive algorithm, which is rather slow.


Here's the algorithm I came up with.

It's non-recursive, so it's much faster than some of the other solutions out there (and ~100x faster on my machine than the algorithm in the "sets" package). The speed is still O(2^n).

The conceptual basis for this algorithm is the following:

for each element in the set:
    for each subset constructed so far:
        new subset = (subset + element)

Here's the R code:

EDIT: here's a somewhat faster version of the same concept; my original algorithm is in the third comment to this post. This one is 30% faster on my machine for a set of length 19.

powerset = function(s){
    len = length(s)
    l = vector(mode="list",length=2^len) ; l[[1]]=numeric()
    counter = 1L
    for(x in 1L:length(s)){
        for(subset in 1L:counter){
            counter=counter+1L
            l[[counter]] = c(l[[subset]],s[x])
        }
    }
    return(l)
}

This version saves time by initiating the vector with its final length at the start and keeping track with the "counter" variable of the position at which to save new subsets. It's also possible to calculate the position analytically, but that was slightly slower.

like image 225
sssheridan Avatar asked Sep 10 '13 09:09

sssheridan


People also ask

What is power set algorithm?

Power Set: Power set P(S) of a set S is the set of all subsets of S. For example S = {a, b, c} then P(s) = {{}, {a}, {b}, {c}, {a,b}, {a, c}, {b, c}, {a, b, c}}. If S has n elements in it then P(s) will have 2n elements.

How do you find the number of subsets in a power set?

The total number of subsets for a set of 'n' elements is given by 2n. Since the subsets of a set are the elements of a power set, the cardinality of a power set is given by |P(A)| = 2.

How do you create a power set of a set?

To create the Power Set, write down the sequence of binary numbers (using n digits), and then let "1" mean "put the matching member into this subset". Well, they are not in a pretty order, but they are all there.

Is power set and subset are same?

The power set P(A) is the collection of all the subsets of A. Thus, the elements in P(A) are subsets of A. One of these subsets is the set A itself. Hence, A itself appears as an element in ℘(A), and we write A∈℘(A) to describe this membership.


1 Answers

A subset can be seen as a boolean vector, indicating whether an element is in the subset of not. Those boolean vectors can be seen as numbers written in binary. Enumerating all the subsets of 1:n is therefore equivalent to enumerating the numbers from 0 to 2^n-1.

f <- function(set) { 
  n <- length(set)
  masks <- 2^(1:n-1)
  lapply( 1:2^n-1, function(u) set[ bitwAnd(u, masks) != 0 ] )
}
f(LETTERS[1:4])
like image 153
Vincent Zoonekynd Avatar answered Oct 23 '22 06:10

Vincent Zoonekynd