Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Most efficient way to generate "ordered subsets" of a sequence

Tags:

python

I need to generate all "ordered subsets" (apologies if I'm not using correct mathematical terminology) of a sequence in Python, with omitted elements replaced with None.Given [1, 2], I want [(1, 2), (1, None), (None, 2), (None, None)]. Each "ordered subset" should have the property that at each position, it is either exactly the same element as in the seed sequence, or it is None.

I can fairly easily generate subsets with omitted elements missing with the following:

from itertools import combinations
for length in xrange(len(items), 0, -1):
    for combination in combinations(items, length):
        yield combination

I can't figure out what the most effective way of reconstructing the missing elements, would be though. My first thought is to do something like this:

from itertools import combinations
indexes = range(len(items))
for length in xrange(len(items), 0, -1):
    for combination in combinations(indexes, length):
        yield tuple(items[i] if i in combination else None for i in indexes)

Just wondering if anyone can spot any obvious deficiencies in this, or if there's a more efficient solution I've missed. (Note that the items will be a fairly short list, typically under 10 elements, so I am not concerned about the O(N) search of "combination" in the inner loop).

like image 804
dcrosta Avatar asked Aug 17 '12 20:08

dcrosta


3 Answers

from itertools import product, repeat
given = [1, 2]
with_nones = zip(given, repeat(None))
print(list(product(*with_nones)))
like image 61
Joel Cornett Avatar answered Nov 20 '22 16:11

Joel Cornett


You could start with an empty list, for every element in your seed you can copy all the final lists and add the seed at the end.

e.g.

solutions = []
solutions.append([])
for elem in seed:
    newPartials = []
    for partial in solutions:
        newPartial = partial[:]
        newPartial.append(elem)
        newPartials.append(newPartial)
    solutions.extend(newPartials)

or, you could create the number of possible solutions, 2^n, where n is the length of your seed list, and using modular arithmetic, remove elements, like so:

solutions = []
for i in xrange(2**n):
    solutions.append(seed[:])
seedLen = len(seed)
for i in xrange(2**(n-1)): // % 0 case of following loop
    solutions[i].pop(0)
for elemLoc in xrange(1,seedLen):
    for solutionNum in xrange(2**n):
        if solutionNum % elemLoc = 0:
            solutions[solutionNum].pop(elemLoc)

This solution is hilariously inefficient, I mostly included it because it's an interesting way to solve the problem.

like image 36
rsegal Avatar answered Nov 20 '22 17:11

rsegal


An alternate -- in case you want to show off how silly you are to NOT use itertools:

>>> given=[1,2]
>>> gz=zip(given,[None]*len(given))
>>> [(i,j) for i in gz[0] for j in gz[1]]
[(1, 2), (1, None), (None, 2), (None, None)]
like image 2
the wolf Avatar answered Nov 20 '22 16:11

the wolf