Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why is the fastest way to print a list of ints so unintuitive? [closed]

I have a list of ints (ints) and I want to print each one in their own line as fast as possible.

My first shot was this:

list(map(print,ints))

and running it on 107 ints together with a function to read them from standard in this takes an average of 1708 ms from 30 repetitions.

As this fairly well-received answer points out, using list() to force-expand a map-result is inefficient because it has to create a long list of irrelevant data, specifically in our case a list of None values, returned by print. That makes intuitive sense, right?

So I tried a number of other approaches:

# n print calls
list(map(print,ints))                         # map
for n in ints: print(n)                       # loop
for i in range(0,len(ints)): print(ints[i])   # range
[print(n) for n in ints]                      # compr
deque(map(print,ints),0)                      # deque

# one print call
print('\n'.join(map(str,ints)))               # cat
print(*ints, sep='\n')                        # sep

Measured in-situ together with ints = list(map(int,sys.stdin.readlines())) the following are the runtimes.

map    1707 ms
loop   1926 ms
range  2027 ms
compr  1828 ms
deque  1699 ms
--------------
cat    1084 ms
sep    1462 ms

This is not a fluke. I don't want to overload this question with too much statistics, but these times are not outliers.

This is entirely counter intuitive. What I would have guessed to be the fastest (from the n-print set) is significantly slower than my first version ('map') despite that one creating a [None,...,None] list with ten million entries. Does python optimise that list construction out? List comprehension is in between and not hugely surprising, but the most astonishing is that in 'cat' creating a huge str and calling print on that is by far the fastest. I would have assumed that the memory allocations on that one str would have destroyed the runtime entirely, even if having only one io operation looks better on paper (note that flush=False is the default).

Main question: Why is the 'map' variant so much faster than other variants with n print calls?
And why is 'cat' so much faster yet?


Example map.py:

#!/usr/bin/python3

import sys

if __name__ == '__main__':
  ints = list(map(int,sys.stdin.readlines()))
  list(map(print,ints))

Run as

/usr/bin/time -f '%e' -o map.run-time -a ./map.py < input.log | wc -l > /dev/null
like image 901
bitmask Avatar asked Sep 08 '25 15:09

bitmask


1 Answers

list(map(print,ints)) does take extra time to build a large list, but all the hard work is done in C.

[print(n) for n in ints] does the same but in Python, which is slower than C.

for n in ints: print(n) doesn't build a list, but it's still interpreting Python, and evidently that's more detrimental than building the list. Also, n is a global variable. That's probably why it's slower than the list comprehension. You should see some speedup if you did it inside a function so it'd be a faster local variable.

for i in range(0,len(ints)): print(ints[i]) is just silly, that's trying to be slow with all the index objects and list subscriptions.

deque(map(print,ints),0) uses fastest possible consumption of the map iterator and doesn't build a large list, so it's fastest. I'm actually a bit disappointed its advantage is so small.


You can see more details by looking at the bytecode. For example import dis; dis.dis('list(map(print,ints))') shows that it just loads and calls the relevant pieces:

  1           LOAD_NAME                0 (list)
              PUSH_NULL
              LOAD_NAME                1 (map)
              PUSH_NULL
              LOAD_NAME                2 (print)
              LOAD_NAME                3 (ints)
              CALL                     2
              CALL                     1

Whereas the "loop" code has a Python-level loop that requires lots of interpreter steps:

  1           LOAD_NAME                0 (ints)
              GET_ITER
      L1:     FOR_ITER                11 (to L2)
              STORE_NAME               1 (n)
              LOAD_NAME                2 (print)
              PUSH_NULL
              LOAD_NAME                1 (n)
              CALL                     1
              POP_TOP
              JUMP_BACKWARD           13 (to L1)
      L2:     END_FOR
              POP_TOP
like image 160
Kelly Bundy Avatar answered Sep 10 '25 05:09

Kelly Bundy