
itertools.product eliminating repeated elements

How can I skip the tuples that contain duplicate elements when I iterate over itertools.product? Or rather, is there a way to avoid generating them at all? Filtering them out afterwards may be time-consuming if there are many lists.

Example:
lis1 = [1,2]
lis2 = [2,4]
lis3 = [5,6]

[i for i in product(lis1,lis2,lis3)] should be [(1,2,5), (1,2,6), (1,4,5), (1,4,6), (2,4,5), (2,4,6)]

The result should not contain (2,2,5) or (2,2,6), since 2 appears twice in each of them. How can I do that?

asked Nov 02 '13 17:11 by genclik27


2 Answers

itertools generally works on unique positions within inputs, not on unique values. So when you want to remove duplicate values, you generally have to either post-process the itertools result sequence, or "roll your own". Because post-processing can be very inefficient in this case, roll your own:

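def uprod(*seqs):
    # Like itertools.product, but yields only tuples with no repeated values.
    def inner(i):
        # All n positions are filled: emit the current combination.
        if i == n:
            yield tuple(result)
            return
        # Try only the values not already used at an earlier position.
        for elt in sets[i] - seen:
            seen.add(elt)
            result[i] = elt
            for t in inner(i+1):
                yield t
            seen.remove(elt)  # backtrack before trying the next value

    sets = [set(seq) for seq in seqs]  # drops duplicates within each input
    n = len(sets)
    seen = set()  # values used in the partial tuple built so far
    result = [None] * n
    for t in inner(0):
        yield t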

Then, e.g.,

>>> list(uprod([1, 2, 1], [2, 4, 4], [5, 6, 5]))
[(1, 2, 5), (1, 2, 6), (1, 4, 5), (1, 4, 6), (2, 4, 5), (2, 4, 6)]
>>> list(uprod([1], [1, 2], [1, 2, 4], [1, 5, 6]))
[(1, 2, 4, 5), (1, 2, 4, 6)]
>>> list(uprod([1], [1, 2, 4], [1, 5, 6], [1]))
[]
>>> list(uprod([1, 2], [3, 4]))
[(1, 3), (1, 4), (2, 3), (2, 4)]

This can be much more efficient, since a duplicate value is never even considered (neither within an input iterable, nor across them).
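
To get a rough feel for the difference, here is a small sketch (not a benchmark). It restates the generator above in Python 3 style with `yield from` and compares it with a post-processing baseline. The helper name `filtered` and the test input (six copies of `range(6)`) are made up for illustration.

```python
from itertools import product

def uprod(*seqs):
    """Python 3 restatement of the generator above, using `yield from`."""
    sets = [set(seq) for seq in seqs]
    n = len(sets)
    seen = set()
    result = [None] * n

    def inner(i):
        if i == n:
            yield tuple(result)
            return
        # sets[i] - seen builds a new set, so mutating `seen` in the loop is safe.
        for elt in sets[i] - seen:
            seen.add(elt)
            result[i] = elt
            yield from inner(i + 1)
            seen.remove(elt)

    yield from inner(0)

def filtered(*seqs):
    """Hypothetical baseline: build every tuple, keep the duplicate-free ones."""
    return (t for t in product(*seqs) if len(set(t)) == len(t))

# Illustrative input: six identical sequences, so most tuples repeat a value.
seqs = [range(6)] * 6
examined = 6 ** 6                      # tuples product() has to generate
kept = sum(1 for _ in uprod(*seqs))    # tuples that survive (the 6! permutations)

# Both approaches produce the same set of tuples.
assert set(uprod(*seqs)) == set(filtered(*seqs))
print(examined, kept)  # 46656 720
```

With this input the filter has to look at 46656 tuples to keep 720 of them, while the recursive version abandons a partial tuple as soon as a value would repeat.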

answered Sep 28 '22 07:09 by Tim Peters


from itertools import product

lis1 = [1,2]
lis2 = [2,4]
lis3 = [5,6]
# Keep only the tuples whose three elements are all distinct.
print([i for i in product(lis1,lis2,lis3) if len(set(i)) == 3])

Output

[(1, 2, 5), (1, 2, 6), (1, 4, 5), (1, 4, 6), (2, 4, 5), (2, 4, 6)]
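
The hard-coded 3 only fits three input lists. A minimal sketch of the same filter for any number of inputs, written as a lazy generator, might look like this. `distinct_product` is a hypothetical helper name, not part of itertools, and the elements are assumed to be hashable.

```python
from itertools import product

def distinct_product(*seqs):
    """Yield the tuples of product(*seqs) whose elements are all different.

    Hypothetical helper, not an itertools function. Each tuple is checked
    with set(), so the elements must be hashable.
    """
    for t in product(*seqs):
        # A tuple has no repeated value exactly when its set is as long as it is.
        if len(set(t)) == len(t):
            yield t

lis1 = [1, 2]
lis2 = [2, 4]
lis3 = [5, 6]
result = list(distinct_product(lis1, lis2, lis3))
print(result)
# [(1, 2, 5), (1, 2, 6), (1, 4, 5), (1, 4, 6), (2, 4, 5), (2, 4, 6)]
```

This still generates every tuple of the full product before discarding the ones with repeats, so it is simple but does the extra work the question wants to avoid.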

answered Sep 28 '22 08:09 by thefourtheye