What is the purpose of Python's itertools.repeat?

For every use of Python's itertools.repeat() class that I can think of, there is another equally (or possibly more) acceptable way to achieve the same effect. For example:

>>> import itertools
>>> [i for i in itertools.repeat('example', 5)]
['example', 'example', 'example', 'example', 'example']
>>> ['example'] * 5
['example', 'example', 'example', 'example', 'example']

>>> list(map(str.upper, itertools.repeat('example', 5)))
['EXAMPLE', 'EXAMPLE', 'EXAMPLE', 'EXAMPLE', 'EXAMPLE']
>>> ['example'.upper()] * 5
['EXAMPLE', 'EXAMPLE', 'EXAMPLE', 'EXAMPLE', 'EXAMPLE']

Is there any case in which itertools.repeat() would be the most appropriate solution? If so, under what circumstances?

Tyler Crompton asked Jan 30 '12 03:01



3 Answers

The primary purpose of itertools.repeat is to supply a stream of constant values to be used with map or zip:

>>> from itertools import repeat
>>> list(map(pow, range(10), repeat(2)))     # list of squares
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
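
The same idea carries over to zip, where repeat supplies a constant on one side of each pair. The following is a minimal sketch; the `hosts` list and the port number are made up for illustration only:

```python
from itertools import repeat

# Hypothetical data, used only for illustration.
hosts = ["alpha", "beta", "gamma"]

# repeat(8080) has no count, so it is infinite; zip stops at the shortest input.
endpoints = list(zip(hosts, repeat(8080)))
print(endpoints)

# With map, repeat supplies a constant second argument to every call.
cubes = list(map(pow, range(5), repeat(3)))
print(cubes)
```

Because zip and map stop when the shortest input is exhausted, the infinite repeat never needs an explicit count here.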

The secondary purpose is that it gives a very fast way to loop a fixed number of times like this:

for _ in itertools.repeat(None, 10000):
    do_something()

This is faster than:

for i in range(10000):
    do_something()

The former wins because all it needs to do is update the reference count for the existing None object. The latter loses because range() or xrange() needs to manufacture 10,000 distinct integer objects.
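
One way to check this claim on a given interpreter is to time both loops. This is only a rough sketch: `do_something` is a placeholder, and the size of the gap (or whether there is one at all) depends on the Python version and the machine.

```python
import timeit
from itertools import repeat

def do_something():
    """Placeholder workload (hypothetical); substitute the real call."""
    pass

def loop_with_repeat(n):
    # Yields the same None object n times.
    for _ in repeat(None, n):
        do_something()

def loop_with_range(n):
    # Yields a new loop value on each iteration.
    for _ in range(n):
        do_something()

n = 10000
t_repeat = timeit.timeit(lambda: loop_with_repeat(n), number=100)
t_range = timeit.timeit(lambda: loop_with_range(n), number=100)
print(f"repeat: {t_repeat:.4f}s  range: {t_range:.4f}s")
```

If the loop body does any real work, the difference in loop overhead is likely to be negligible by comparison.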

Note that Guido himself uses this fast looping technique in the timeit module. See the source at https://hg.python.org/cpython/file/2.7/Lib/timeit.py#l195 :

    if itertools:
        it = itertools.repeat(None, number)
    else:
        it = [None] * number
    gcold = gc.isenabled()
    gc.disable()
    try:
        timing = self.inner(it, self.timer)

Raymond Hettinger answered Oct 19 '22 06:10


The itertools.repeat function is lazy; it only uses the memory required for one item. On the other hand, the (a,) * n and [a] * n idioms build a sequence of n references to the same object in memory (the object itself is not copied, but the container grows with n). For five items, the multiplication idiom is probably better, but you might notice a resource problem if you had to repeat something, say, a million times.
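
The difference can be observed with sys.getsizeof. This is a sketch only: the exact byte counts are implementation details of CPython, and getsizeof reports the size of the container, not of the objects it refers to.

```python
import sys
from itertools import repeat

n = 1_000_000
item = "example"

as_list = [item] * n        # materializes n references up front
as_iter = repeat(item, n)   # stores the item and a counter only

print(sys.getsizeof(as_list))  # grows with n
print(sys.getsizeof(as_iter))  # small and independent of n

# Every slot of the list refers to the same object, not to a copy.
same_object = all(x is item for x in as_list[:10])
print(same_object)
```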

Still, it is hard to imagine many static uses for itertools.repeat. However, the fact that itertools.repeat is a function allows you to use it in many functional applications. For example, you might have some library function func which operates on an iterable of input. Sometimes, you might have pre-constructed lists of various items. Other times, you may just want to operate on a uniform list. If the list is big, itertools.repeat will save you memory.
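
As a sketch of that situation, suppose a library exposes something like the hypothetical `weighted_total` below (the name and signature are invented for this example). A caller can pass either a pre-built list or a repeat object without the function knowing the difference:

```python
from itertools import repeat

def weighted_total(values, weights):
    """Hypothetical library function: sum of value * weight over paired inputs."""
    return sum(v * w for v, w in zip(values, weights))

values = [3, 5, 7]

# Caller with a pre-constructed list of weights.
print(weighted_total(values, [1, 2, 3]))

# Caller that wants uniform weights; no need to build [2] * len(values).
print(weighted_total(values, repeat(2)))
```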

Finally, repeat makes possible the so-called "iterator algebra" described in the itertools documentation. Even the itertools module itself uses the repeat function. For example, the following code is given as an equivalent implementation of itertools.izip_longest (even though the real code is probably written in C). Note the use of repeat seven lines from the bottom:

from itertools import chain, repeat

class ZipExhausted(Exception):
    pass

def izip_longest(*args, **kwds):
    # izip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-
    fillvalue = kwds.get('fillvalue')
    counter = [len(args) - 1]
    def sentinel():
        if not counter[0]:
            raise ZipExhausted
        counter[0] -= 1
        yield fillvalue
    fillers = repeat(fillvalue)
    iterators = [chain(it, sentinel(), fillers) for it in args]
    try:
        while iterators:
            yield tuple(map(next, iterators))
    except ZipExhausted:
        pass
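
In Python 3 the built-in version of this function is named itertools.zip_longest. A short usage sketch showing the behaviour described in the comment at the top of the implementation above:

```python
from itertools import zip_longest

# The shorter input is padded with the fill value once it runs out.
pairs = list(zip_longest("ABCD", "xy", fillvalue="-"))
print(pairs)

joined = ["".join(p) for p in pairs]
print(joined)
```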

HardlyKnowEm answered Oct 19 '22 05:10


Your example of foo * 5 looks superficially similar to itertools.repeat(foo, 5), but it is actually quite different.

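If you write foo * 100000 (with foo being a list or tuple), the interpreter must build the entire result, with 100,000 times as many element references as foo, before it can give you an answer. The elements themselves are not copied, but the memory used by the result still grows in proportion to the repeat count, so it can be an expensive and memory-unfriendly operation.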

But if you write itertools.repeat(foo, 100000), the interpreter can return an iterator that serves the same function, and doesn't need to compute a result until you need it -- say, by using it in a function that wants to know each result in the sequence.

That's the major advantage of iterators: they can defer the computation of a part (or all) of a list until you really need the answer.
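
A small sketch of that deferral, using itertools.islice to take only the first few items; the string "foo" simply stands in for whatever object is being repeated:

```python
from itertools import islice, repeat

# Creating the iterator produces no items yet.
stream = repeat("foo", 100000)

# Only three items are actually produced here.
first_three = list(islice(stream, 3))
print(first_three)

# The rest are produced only when something asks for them.
remaining = sum(1 for _ in stream)
print(remaining)
```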

John Feminella answered Oct 19 '22 04:10