I have a list of lists, and I would like to duplicate the effect of itertools.product()
without using any element more than once.
>>> list = [['A', 'B'], ['C', 'D'], ['A', 'B']]
>>> [''.join(e) for e in itertools.product(*list)]
['ACA', 'ACB', 'ADA', 'ADB', 'BCA', 'BCB', 'BDA', 'BDB']
>>> # Desired output: ['ACB', 'ADB', 'BCA', 'BDA']
The list I need to use this on is too large to compute itertools.product
and remove the unneeded elements. (25 billion permutations from itertools.product
, while my desired output only has ~500,000). Preferably, an answer will be iterable.
Edit: I know "product" is the wrong term for what I need, but I'm struggling to find the word I'm looking for.
Edit2: This is the list that I desire to perform this operation on:
[['A', 'H', 'L'], ['X', 'B', 'I'], ['Q', 'C', 'V'], ['D', 'N'], ['E', 'F'], ['E', 'F'], ['G'], ['A', 'H', 'L'], ['X', 'B', 'I'], ['W', 'U', 'J', 'K', 'M'], ['W', 'U', 'J', 'K', 'M'], ['A', 'H', 'L'], ['W', 'U', 'J', 'K', 'M'], ['D', 'N'], ['P', 'O', 'T'], ['P', 'O', 'T'], ['Q', 'C', 'V'], ['R'], ['S'], ['P', 'O', 'T'], ['W', 'U', 'J', 'K', 'M'], ['Q', 'C', 'V'], ['W', 'U', 'J', 'K', 'M'], ['X', 'B', 'I']]
A simple stack-based implementation:
def product1(l): return product1_(l,0,[])
def product1_(l,i,buf):
if i==len(l): yield buf
else:
for x in l[i]:
if x not in buf:
buf.append(x)
yield from product1_(l,i+1,buf)
buf.pop()
This is a bit slower than Patrick Haugh's answer (I get 18 s for your test case), but it gives the results in a predictable order.
Note that you have to process the values as it generates "them", since they're all the same list buf
; you could write yield tuple(buf)
or yield "".join(buf)
to generate separate "cooked" values (at a cost of less than one additional second).
If the values are letters, you could use a "bitmask" list to implement the collision test, which reduces the time to about 13 s (but using a set
is just as fast). Other possible optimizations include processing lists with fewer eligible elements first, to reduce backtracking; this can get this case down to 11 s.
test1 = [['A', 'B'], ['C', 'D'], ['A', 'B']]
test2 = [['A', 'H', 'L'], ['X', 'B', 'I'], ['Q', 'C', 'V'], ['D', 'N'], ['E', 'F'], ['E', 'F'], ['G'], ['A', 'H', 'L'],
['X', 'B', 'I'], ['W', 'U', 'J', 'K', 'M'], ['W', 'U', 'J', 'K', 'M'], ['A', 'H', 'L'],
['W', 'U', 'J', 'K', 'M'], ['D', 'N'], ['P', 'O', 'T'], ['P', 'O', 'T'], ['Q', 'C', 'V'], ['R'], ['S'],
['P', 'O', 'T'], ['W', 'U', 'J', 'K', 'M'], ['Q', 'C', 'V'], ['W', 'U', 'J', 'K', 'M'], ['X', 'B', 'I']]
def prod(curr, *others):
if not others:
for x in curr:
yield {x} # (x,) for tuples
else:
for o in prod(*others):
for c in curr:
if c not in o:
yield {c, *o} # (c, *o) for tuples
print([''.join(x) for x in prod(*test1)])
# ['ABC', 'ABD', 'ABC', 'ABD']
print(sum(1 for x in prod(*test2)))
# 622080
The longer input takes about five seconds to run on my machine. I use set
s to pass values around because they are much more efficent than tuples or lists when it comes to membership checks. If necessary, you can use tuples, it will just be slower.
Some questions to think about: does order matter? What do you want to happen when we can't use an item from the current list (because they've all already been used)?
Your specific case has an interesting property. If we arrange it in a counter, we see that every list occurs as many times as its entries:
Counter({('A', 'H', 'L'): 3,
('D', 'N'): 2,
('E', 'F'): 2,
('G',): 1,
('P', 'O', 'T'): 3,
('Q', 'C', 'V'): 3,
('R',): 1,
('S',): 1,
('W', 'U', 'J', 'K', 'M'): 5,
('X', 'B', 'I'): 3})
In other words, ignoring order, the sequences you want are the cartesian products of permutations of your lists. Suppose your list is called l
. Then we can assemble a list of all the permutations of the sublists and take their product:
c = set(tuple(i) for i in l)
permutations = [list(itertools.permutations(i)) for i in c]
permutation_products = itertools.products(*permutations)
An element of permutation_products
looks something like:
(('W', 'U', 'J', 'K', 'M'),
('E', 'F'),
('X', 'B', 'I'),
('Q', 'C', 'V'),
('P', 'O', 'T'),
('D', 'N'),
('G',),
('S',),
('R',),
('A', 'L', 'H'))
We have to get it back into the right order. Suppose our permutation is called perm
. For each sublist of your list, we have to locate the correct element of perm
and then take the next letter in the permutation. We can do this by making a dictionary:
perm_dict = {frozenset(p): list(p) for p in perm}
Then, to construct a single permutation, we have:
s = "".join([perm_dict[frozenset(i)].pop() for i in l])
We can combine all this into a generator:
def permute_list(l):
c = set(tuple(i) for i in l)
permutations = [list(itertools.permutations(i)) for i in c]
permutation_products = itertools.product(*permutations)
for perm in permutation_products:
perm_dict = {frozenset(p): list(p) for p in perm}
yield "".join([perm_dict[frozenset(i)].pop() for i in l])
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With