I need to generate all "ordered subsets" (apologies if I'm not using correct mathematical terminology) of a sequence in Python, with omitted elements replaced with None
.Given [1, 2]
, I want [(1, 2), (1, None), (None, 2), (None, None)]
. Each "ordered subset" should have the property that at each position, it is either exactly the same element as in the seed sequence, or it is None
.
I can fairly easily generate subsets with omitted elements missing with the following:
from itertools import combinations
for length in xrange(len(items), 0, -1):
for combination in combinations(items, length):
yield combination
I can't figure out what the most effective way of reconstructing the missing elements, would be though. My first thought is to do something like this:
from itertools import combinations
indexes = range(len(items))
for length in xrange(len(items), 0, -1):
for combination in combinations(indexes, length):
yield tuple(items[i] if i in combination else None for i in indexes)
Just wondering if anyone can spot any obvious deficiencies in this, or if there's a more efficient solution I've missed. (Note that the items
will be a fairly short list, typically under 10 elements, so I am not concerned about the O(N) search of "combination" in the inner loop).
from itertools import product, repeat
given = [1, 2]
with_nones = zip(given, repeat(None))
print(list(product(*with_nones)))
You could start with an empty list, for every element in your seed you can copy all the final lists and add the seed at the end.
e.g.
solutions = []
solutions.append([])
for elem in seed:
newPartials = []
for partial in solutions:
newPartial = partial[:]
newPartial.append(elem)
newPartials.append(newPartial)
solutions.extend(newPartials)
or, you could create the number of possible solutions, 2^n
, where n
is the length of your seed list, and using modular arithmetic, remove elements, like so:
solutions = []
for i in xrange(2**n):
solutions.append(seed[:])
seedLen = len(seed)
for i in xrange(2**(n-1)): // % 0 case of following loop
solutions[i].pop(0)
for elemLoc in xrange(1,seedLen):
for solutionNum in xrange(2**n):
if solutionNum % elemLoc = 0:
solutions[solutionNum].pop(elemLoc)
This solution is hilariously inefficient, I mostly included it because it's an interesting way to solve the problem.
An alternate -- in case you want to show off how silly you are to NOT use itertools:
>>> given=[1,2]
>>> gz=zip(given,[None]*len(given))
>>> [(i,j) for i in gz[0] for j in gz[1]]
[(1, 2), (1, None), (None, 2), (None, None)]
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With