Why is the range() function slower than multiplying items to get copies inside a nested list?


To create copies of a nested list inside an existing list, it is unfortunately not sufficient to simply multiply it: that produces references to the same inner list rather than independent lists, as this example shows:

x = [[1, 2, 3]] * 2
x[0] is x[1]  # will evaluate to True
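
To see why the aliasing matters, here is a quick illustration (my addition, not from the original question): mutating one inner list changes both rows, because there is only one inner list.

x = [[1, 2, 3]] * 2
x[0].append(4)
x  # [[1, 2, 3, 4], [1, 2, 3, 4]] -- both entries changed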

To get independent copies instead, you can use the range function in a list comprehension, for example:

x = [[1, 2, 3] for _ in range(2)]
x[0] is x[1]  # will evaluate to False (wanted behaviour)

This is a good way to replicate items in a list without just creating references, and it is explained this way on many different websites.

However, there is a more efficient way to copy the list elements. The following code seems a little faster to me (measured with timeit from the command line, for different parameters n ∈ {1, 50, 100, 10000}, comparing the code below against range(n) in the code above):

x = [[1, 2, 3] for _ in [0] * n]

But I wonder, why does this code run faster? Are there other disadvantages (more memory consumption or similar)?

python -m timeit '[[1, 2, 3] for _ in range(1)]'
1000000 loops, best of 3: 0.243 usec per loop

python -m timeit '[[1, 2, 3] for _ in range(50)]'
100000 loops, best of 3: 3.79 usec per loop

python -m timeit '[[1, 2, 3] for _ in range(100)]'
100000 loops, best of 3: 7.39 usec per loop

python -m timeit '[[1, 2, 3] for _ in range(10000)]'
1000 loops, best of 3: 940 usec per loop

python -m timeit '[[1, 2, 3] for _ in [0] * 1]'
1000000 loops, best of 3: 0.242 usec per loop

python -m timeit '[[1, 2, 3] for _ in [0] * 50]'
100000 loops, best of 3: 3.77 usec per loop

python -m timeit '[[1, 2, 3] for _ in [0] * 100]'
100000 loops, best of 3: 7.3 usec per loop

python -m timeit '[[1, 2, 3] for _ in [0] * 10000]'
1000 loops, best of 3: 927 usec per loop


# difference will be greater for larger n

python -m timeit '[[1, 2, 3] for _ in range(1000000)]'
10 loops, best of 3: 144 msec per loop

python -m timeit '[[1, 2, 3] for _ in [0] * 1000000]'
10 loops, best of 3: 126 msec per loop
asked Jun 13 '18 by colidyre

1 Answer

This is correct; range, even in Python 3 where it produces a compact range object, is more complicated than a list, in the classical tradeoff between computation and storage.
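
As a rough illustration of that storage difference (my addition; sizes assume a 64-bit CPython and are approximate):

import sys

# A range object only stores start, stop and step, so its size is constant,
# while [0] * n allocates one pointer per element.
sys.getsizeof(range(1000000))  # 48 bytes, independent of n
sys.getsizeof([0] * 1000000)   # roughly 8 MB of pointers (all to the same int 0)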

Once the data grows too large to fit in cache (the primary concern if we care about performance), the range object runs into a different issue: as each number in the range is produced, the previous int object is destroyed and a new one created (the first 256 or so are cheaper because they are interned, but since they are still distinct objects, they can cost a few cache misses). The repeated list keeps referring to the same object.
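
One way to observe this in CPython (illustrative, not part of the original answer) is to keep the produced values alive and compare object identities:

# Each value produced by range is a distinct int object (when kept alive),
# whereas [0] * n repeats one and the same object.
vals = list(range(1000, 1005))
len({id(v) for v in vals})      # 5 distinct objects
len({id(v) for v in [0] * 5})   # 1 shared object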

There are still more efficient options, though; a bytearray, for instance, would consume far less memory than the list. Probably the best function for the task is hidden away in itertools: repeat. Like a range object, it doesn't need storage for all the copies, but like the repeated list it doesn't need to create distinct objects. Something like for _ in repeat(None, x) would therefore just poke at the same few cache lines (iteration count and reference count for the object).
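
A minimal sketch of that repeat-based variant (my addition, following the answer's suggestion):

from itertools import repeat

n = 10000
# repeat(None, n) yields the same None object n times: no backing storage
# for counter values and no new int objects created per iteration.
x = [[1, 2, 3] for _ in repeat(None, n)]

To time it the same way as the examples above:

python -m timeit -s 'from itertools import repeat' '[[1, 2, 3] for _ in repeat(None, 10000)]'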

In the end, the main reason people stick to using range is because it's what's prominently presented (both in the idiom for a fixed count loop and among the builtins).

In other Python implementations, it's quite possible for range to be faster than repeat; this would be because the counter itself already holds the value. I'd expect such behaviours from Cython or PyPy.

answered Oct 11 '22 by Yann Vernier