Most efficient way to generate "ordered subsets" of a sequence

Question

I need to generate all "ordered subsets" (apologies if I'm not using correct mathematical terminology) of a sequence in Python, with omitted elements replaced with None.Given [1, 2], I want [(1, 2), (1, None), (None, 2), (None, None)]. Each "ordered subset" should have the property that at each position, it is either exactly the same element as in the seed sequence, or it is None.

I can fairly easily generate subsets with omitted elements missing with the following:

from itertools import combinations
for length in xrange(len(items), 0, -1):
    for combination in combinations(items, length):
        yield combination

I can't figure out what the most effective way of reconstructing the missing elements, would be though. My first thought is to do something like this:

from itertools import combinations
indexes = range(len(items))
for length in xrange(len(items), 0, -1):
    for combination in combinations(indexes, length):
        yield tuple(items[i] if i in combination else None for i in indexes)

Just wondering if anyone can spot any obvious deficiencies in this, or if there's a more efficient solution I've missed. (Note that the items will be a fairly short list, typically under 10 elements, so I am not concerned about the O(N) search of "combination" in the inner loop).

Joel Cornett · Accepted Answer

from itertools import product, repeat
given = [1, 2]
with_nones = zip(given, repeat(None))
print(list(product(*with_nones)))

rsegal · Answer

You could start with an empty list, for every element in your seed you can copy all the final lists and add the seed at the end.

e.g.

solutions = []
solutions.append([])
for elem in seed:
    newPartials = []
    for partial in solutions:
        newPartial = partial[:]
        newPartial.append(elem)
        newPartials.append(newPartial)
    solutions.extend(newPartials)

or, you could create the number of possible solutions, 2^n, where n is the length of your seed list, and using modular arithmetic, remove elements, like so:

solutions = []
for i in xrange(2**n):
    solutions.append(seed[:])
seedLen = len(seed)
for i in xrange(2**(n-1)): // % 0 case of following loop
    solutions[i].pop(0)
for elemLoc in xrange(1,seedLen):
    for solutionNum in xrange(2**n):
        if solutionNum % elemLoc = 0:
            solutions[solutionNum].pop(elemLoc)

This solution is hilariously inefficient, I mostly included it because it's an interesting way to solve the problem.

the wolf · Answer

An alternate -- in case you want to show off how silly you are to NOT use itertools:

>>> given=[1,2]
>>> gz=zip(given,[None]*len(given))
>>> [(i,j) for i in gz[0] for j in gz[1]]
[(1, 2), (1, None), (None, 2), (None, None)]

Most efficient way to generate "ordered subsets" of a sequence

Tags:

python

dcrosta

3 Answers

Joel Cornett

rsegal

the wolf

Recent Activity

Donate For Us

Most efficient way to generate "ordered subsets" of a sequence

Tags:

python

dcrosta

3 Answers

Joel Cornett

rsegal

the wolf

Related questions

Recent Activity

Donate For Us