How can I skip the tuples that have duplicate elements when I iterate over itertools.product? Or rather, is there any way to avoid generating them in the first place? Skipping them may be time-consuming if there are many lists.
Example,
lis1 = [1,2]
lis2 = [2,4]
lis3 = [5,6]
[i for i in product(lis1,lis2,lis3)] should be [(1,2,5), (1,2,6), (1,4,5), (1,4,6), (2,4,5), (2,4,6)]
It should not contain (2,2,5) and (2,2,6), since 2 is duplicated in those tuples. How can I do that?
The itertools functions are Python versions of iterator building blocks found in other languages (the documentation cites APL, Haskell, and SML). Because CPython implements them in C, they are often faster than an equivalent hand-written Python for loop.
The module has a number of functions that construct and return iterators. One such function is zip_longest. It makes an iterator that aggregates elements from each of the iterables, and the iteration continues until the longest iterable is exhausted; shorter iterables are padded with a fill value (None by default).
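As a small illustration of the padding behaviour, here is a minimal sketch. The lists names and scores are made up for the example, and the fill value 0 is an arbitrary choice.

```python
from itertools import zip_longest

names = ["a", "b", "c"]   # hypothetical sample data
scores = [1, 2]           # one element shorter than names

# zip() would stop after two pairs; zip_longest() keeps going until the
# longest input is exhausted and pads the shorter one with fillvalue.
pairs = list(zip_longest(names, scores, fillvalue=0))
print(pairs)  # [('a', 1), ('b', 2), ('c', 0)]
```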
islice(iterable, start, stop[, step]): Make an iterator that returns selected elements from the iterable. If start is non-zero, then elements from the iterable are skipped until start is reached. Afterward, elements are returned consecutively unless step is set higher than one, which results in items being skipped.
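The three calling patterns can be sketched as follows. The string letters is just sample input chosen for the example.

```python
from itertools import islice

letters = "ABCDEFG"  # hypothetical sample input

# One positional argument after the iterable is treated as stop.
first_two = list(islice(letters, 2))

# start and stop: skip two elements, then return elements up to index 5.
middle = list(islice(letters, 2, 5))

# stop=None means "until the input is exhausted"; step=2 skips every other item.
every_other = list(islice(letters, 0, None, 2))

print(first_two)    # ['A', 'B']
print(middle)       # ['C', 'D', 'E']
print(every_other)  # ['A', 'C', 'E', 'G']
```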
A repeating sequence is easy to generate with the itertools.cycle() function. This function takes an iterable, inputs, as an argument and returns an infinite iterator over the values in inputs that returns to the beginning once the end of inputs is reached.
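Since cycle() never stops on its own, it has to be bounded by the caller. A minimal sketch, assuming an alternating 1, -1 pattern as the example input and using islice to take the first six values:

```python
from itertools import cycle, islice

inputs = [1, -1]  # hypothetical pattern to repeat

# cycle(inputs) is infinite, so islice() is used to take a finite prefix.
alternating = list(islice(cycle(inputs), 6))
print(alternating)  # [1, -1, 1, -1, 1, -1]
```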
itertools generally works on unique positions within inputs, not on unique values. So when you want to remove duplicate values, you generally have to either post-process the itertools result sequence, or "roll your own". Because post-processing can be very inefficient in this case, roll your own:
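def uprod(*seqs):
    def inner(i):
        # All n positions are filled: emit the finished tuple.
        if i == n:
            yield tuple(result)
            return
        # Only try values that are not already used earlier in the tuple.
        for elt in sets[i] - seen:
            seen.add(elt)
            result[i] = elt
            for t in inner(i+1):
                yield t
            # Backtrack so elt is available again for other branches.
            seen.remove(elt)

    # Converting each input to a set drops duplicates within that input.
    sets = [set(seq) for seq in seqs]
    n = len(sets)
    seen = set()
    result = [None] * n
    for t in inner(0):
        yield t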
Then, e.g.,
>>> print(list(uprod([1, 2, 1], [2, 4, 4], [5, 6, 5])))
[(1, 2, 5), (1, 2, 6), (1, 4, 5), (1, 4, 6), (2, 4, 5), (2, 4, 6)]
>>> print(list(uprod([1], [1, 2], [1, 2, 4], [1, 5, 6])))
[(1, 2, 4, 5), (1, 2, 4, 6)]
>>> print(list(uprod([1], [1, 2, 4], [1, 5, 6], [1])))
[]
>>> print(list(uprod([1, 2], [3, 4])))
[(1, 3), (1, 4), (2, 3), (2, 4)]
This can be much more efficient, since a duplicate value is never even considered (neither within an input iterable, nor across them).
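To get a feel for how much work post-processing can waste, here is a rough sketch that counts the tuples itertools.product generates against the ones that survive a distinctness filter. The sizes n and k are arbitrary values picked for the illustration, and the worst case is assumed: every list holds the same values.

```python
from itertools import product
from math import factorial

n, k = 5, 5              # hypothetical sizes: k lists, each holding 0..n-1
lists = [range(n)] * k

generated = 0            # tuples produced by product()
kept = 0                 # tuples whose elements are all different
for t in product(*lists):
    generated += 1
    if len(set(t)) == len(t):
        kept += 1

print(generated, kept)   # 3125 120

# With identical inputs, product() yields n**k tuples, but only
# n! / (n-k)! of them have no repeated value.
assert generated == n ** k
assert kept == factorial(n) // factorial(n - k)
```

In this setup fewer than 4% of the generated tuples are kept, and the ratio shrinks further as the lists grow.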
from itertools import product

lis1 = [1,2]
lis2 = [2,4]
lis3 = [5,6]
# Keep only the tuples whose three elements are all distinct.
print([i for i in product(lis1,lis2,lis3) if len(set(i)) == 3])
Output
[(1, 2, 5), (1, 2, 6), (1, 4, 5), (1, 4, 6), (2, 4, 5), (2, 4, 6)]
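The hard-coded 3 only works for exactly three lists. A minimal sketch of the same filter for any number of lists follows; the name distinct_product is made up for this example, and the elements are assumed to be hashable.

```python
from itertools import product

def distinct_product(*seqs):
    """Yield tuples from product(*seqs) whose elements are all different."""
    for t in product(*seqs):
        # A set drops repeated values, so equal lengths mean no duplicates.
        if len(set(t)) == len(t):
            yield t

result = list(distinct_product([1, 2], [2, 4], [5, 6]))
print(result)
# [(1, 2, 5), (1, 2, 6), (1, 4, 5), (1, 4, 6), (2, 4, 5), (2, 4, 6)]
```

This still generates every tuple before discarding the unwanted ones, so it is simple rather than fast. It also repeats a tuple if an input list itself contains a repeated value.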