Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What is the guarantee made by itertools.combinations?

Tags:

python

The docs for itertools.combinations state:

Combinations are emitted in lexicographic sort order. So, if the input iterable is sorted, the combination tuples will be produced in sorted order.

Elements are treated as unique based on their position, not on their value. So if the input elements are unique, there will be no repeat values in each combination.

[Emphasis mine]

What is the exact guarantee being made here? An empirical check shows that elements are always emitted as if by

for i in range(len(iterable)):
    for j in range(i + 1, len(iterable)):
        for k in range(j + 1, len(iterable)):
            ...
                yield iterable[i], iterable[j], iterable[k], ...

What is the meaning of "lexicographixal order" in this case? In particular, I believe that the emphasized sentence is crucial, but I am not 100% of what the connection is. I think it means that the lexicographixal order is applied to the indices of the elements regardless of value, but I'd love to have someone confirm that.

like image 980
Mad Physicist Avatar asked Nov 02 '18 04:11

Mad Physicist


People also ask

What does Itertools combinations return?

What does itertools. combinations() do ? It returns r length subsequences of elements from the input iterable. Combinations are emitted in lexicographic sort order.

How does Itertools combinations work?

combinations() provides us with all the possible tuples a sequence or set of numbers or letters used in the iterator and the elements are assumed to be unique on the basis of there positions which are distinct for all elements. All these combinations are emitted in lexicographical order.

What is the time complexity of combinations in Python?

Talking about the time complexity of the combination function, we can say that to generate valid combinations exactly once, it will take O(n C r), and for getting r combinations, it is O(r). Hence, the total time complexity is O( r * [nCr]).

Is Itertools faster than for loop?

That being said, the iterators from itertools are often significantly faster than regular iteration from a standard Python for loop.


1 Answers

To translate these paragraphs from computer sciencese into English:

Combinations are emitted in lexicographic sort order. So, if the input iterable is sorted, the combination tuples will be produced in sorted order.

"Lexicographic" here is a mathematical term and does not mean alphabetically, but "in whatever order the lexicon defines".

early 17th century: modern Latin, from Greek lexikon (biblion) ‘(book) of words’, from lexis ‘word’, from legein ‘speak’.

The "lexicon" here being your input. Put plainly, your input defines the order in which the output will be produced. If you want alphabetically sorted output, sort your input.

Elements are treated as unique based on their position, not on their value. So if the input elements are unique, there will be no repeat values in each combination.

This just says that combinations will not look at or care about the actual values themselves, it just combines elements by their position. It won't deduplicate based on the values, it deduplicates combinations of positions. If you want unique combinations, deduplicate your input.

like image 58
deceze Avatar answered Nov 09 '22 21:11

deceze