Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Efficient enumeration of ordered subsets in Python

I'm not sure of the appropriate mathematical terminology for the code I'm trying to write. I'd like to generate combinations of unique integers, where "ordered subsets" of each combination are used to exclude certain later combinations.

Hopefully an example will make this clear:

from itertools import chain, combinations
​
mylist = range(4)
max_depth = 3

rev = chain.from_iterable(combinations(mylist, i) for i in xrange(max_depth, 0, -1))
for el in list(rev):
    print el

That code results in output that contains all the subsets I want, but also some extra ones that I do not. I have manually inserted comments to indicate which elements I don't want.

(0, 1, 2)
(0, 1, 3)
(0, 2, 3)
(1, 2, 3)
(0, 1)  # Exclude: (0, 1, _) occurs as part of (0, 1, 2) above
(0, 2)  # Exclude: (0, 2, _) occurs above
(0, 3)  # Keep
(1, 2)  # Exclude: (1, 2, _) occurs above
(1, 3)  # Keep: (_, 1, 3) occurs above, but (1, 3, _) does not
(2, 3)  # Keep
(0,)    # Exclude: (0, _, _) occurs above
(1,)    # Exclude: (1, _, _) occurs above
(2,)    # Exclude: (2, _) occurs above
(3,)    # Keep

Thus, the desired output of my generator or iterator would be:

(0, 1, 2)
(0, 1, 3)
(0, 2, 3)
(1, 2, 3)
(0, 3)
(1, 3)
(2, 3)
(3,)  

I know I could make a list of all the (wanted and unwanted) combinations and then filter out the ones I don't want, but I was wondering if there was a more efficient, generator or iterator based way.

like image 492
Curt F. Avatar asked Jul 07 '15 22:07

Curt F.


2 Answers

You are trying to exclude any combination that is a prefix of a previously-returned combination. Doing so is straightforward.

  • If a tuple t has length max_depth, it can't be a prefix of a previously-returned tuple, since any tuple it's a prefix of would have to be longer.
  • If a tuple t ends with mylist[-1], then it can't be a prefix of a previously-returned tuple, since there are no elements that could legally be added to the end of t to extend it.
  • If a tuple t has length less than max_depth and does not end with mylist[-1], then t is a prefix of the previously-returned tuple t + (mylist[-1],), and t should not be returned.

Thus, the combinations you should generate are exactly the ones of length max_depth and the shorter ones that end with mylist[-1]. The following code does so, in exactly the same order as your original code, and correctly handling cases like maxdepth > len(mylist):

def nonprefix_combinations(iterable, maxlen):
    iterable = list(iterable)
    if not (iterable and maxlen):
        return
    for comb in combinations(iterable, maxlen):
        yield comb
    for length in xrange(maxlen-2, -1, -1):
        for comb in combinations(iterable[:-1], length):
            yield comb + (iterable[-1],)

(I've assumed here that in the case where maxdepth == 0, you still don't want to include the empty tuple in your output, even though for maxdepth == 0, it isn't a prefix of a previously-returned tuple. If you do want the empty tuple in this case, you can change if not (iterable and maxlen) to if not iterable.)

like image 110
user2357112 supports Monica Avatar answered Sep 28 '22 06:09

user2357112 supports Monica


I noticed an interesting pattern in your desired output and I have a generator that produces that. Does this work for all your cases?

from itertools import combinations

def orderedSetCombination(iterable, r):
    # Get the last element of the iterable
    last = (iterable[-1], )
    # yield all the combinations of the iterable without the
    # last element
    for iter in combinations(iterable[:-1], r):
        yield iter
    # while r > 1 reduce r by 1 and yield all the combinations
    while r>1:
        r -= 1
        for iter in combinations(iterable[:-1], r):
            yield iter+last
    # yield the last item
    yield last

iter = [0,1,2,3]

for el in (list(orderedSetCombination(iter, 3))):
    print(el)

Here is my explaination of the logic:

# All combinations that does not include the last element of the iterable
# taking r = max_depth items at a time

(0,1,2) 

# from here on, its the combinations of all the elements except 
# the last element and the last element is added to it.
# so here taking r = r -1 items at a time and adding the last element
# combinations([0,1,2], r=2)

(0,1,3)
(0,2,3)
(1,2,3)

# the only possible value right now at index r = 2 is the last element (3)
# since all possible values of (0,1,_) (0,2,_) (1,2,_) are already listed
# So reduce r by 1 again and continue: combinations([0,1,2], r=1)

(0, 3)
(1, 3)
(2, 3)

# continue until r == 0 and then yield the last element

(3,)
like image 32
ashwinjv Avatar answered Sep 28 '22 05:09

ashwinjv