Python 3.5 vs. 3.6: what made "map" slower compared to comprehensions?

I sometimes used map if there was a function/method written in C, to get a bit of extra performance. However, I recently revisited some of my benchmarks and noticed that the relative performance (compared to a similar list comprehension) changed drastically between Python 3.5 and 3.6.

That's not the actual code but just a minimal sample that illustrates the difference:

import random

lst = [random.randint(0, 10) for _ in range(100000)]
assert list(map((5).__lt__, lst)) == [5 < i for i in lst]
%timeit list(map((5).__lt__, lst))
%timeit [5 < i for i in lst]

I realize it's not a good idea to use (5).__lt__, but I couldn't come up with a better example right now.

The timings on Python-3.5 were in favor of the map approach:

15.1 ms ± 5.64 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
16.7 ms ± 35.6 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

While the Python-3.6 timings actually show that the comprehension is faster:

17.9 ms ± 755 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
14.3 ms ± 128 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

My question is: what changed in this case to make the list comprehension faster and the map solution slower? I realize the difference isn't huge; it just made me curious, because this was one of the "tricks" I sometimes (actually seldom) used in performance-critical code.

asked Aug 09 '17 by MSeifert

1 Answer

I think a fair comparison requires using the same function and the same testing conditions when comparing Python 3.5 to 3.6, as well as when comparing map to a list comprehension within a given Python version.

In my initial answer I performed multiple tests that showed map was still faster than the list comprehension by a factor of about two in both versions of Python. However, some results were not conclusive, so I performed some more tests.

First, let me quote some of the points stated in your question:

"... [I] noticed that the relative performance [of map] (compared to a similar list comprehension) drastically changed between Python 3.5 and 3.6"

You also ask:

"My question is what happened in this case that made the list-comprehension faster and the map solution slower?"

It is not entirely clear whether you mean that map is slower than the list comprehension in Python 3.6, or that map is slower in Python 3.6 than in 3.5 while the list comprehension's performance has increased (though not necessarily to the point of beating map).

Based on the more extensive tests I performed after my first answer to this question, I think I have an idea of what is going on.

However, first let's create conditions for "fair" comparisons. For this we need to:

  1. Compare the performance of map across Python versions using the same function;

  2. Compare the performance of map to the list comprehension within the same Python version using the same function;

  3. Run the tests on the same data;

  4. Minimize the contribution from the timing functions.

Here is version information about my system:

Python 3.5.3 |Continuum Analytics, Inc.| (default, Mar  6 2017, 12:15:08) 
[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.57)] on darwin
IPython 5.3.0 -- An enhanced Interactive Python.

and

Python 3.6.2 |Continuum Analytics, Inc.| (default, Jul 20 2017, 13:14:59) 
[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.57)] on darwin
IPython 6.1.0 -- An enhanced Interactive Python. Type '?' for help.

Let's first address the issue of "same data". Unfortunately, because you are effectively using seed(None), the data set lst is different in each of the two Python versions. This probably contributes to the difference in performance seen between the two versions. One fix would be to set, e.g., random.seed(0) (or something like that). I chose instead to create the list once, save it with numpy.save(), and then load it in each version. This is especially important because I chose to modify your tests slightly (the number of "loops" and "repeats") and I increased the length of your dataset to 100,000,000:

import numpy as np
import random

# Generate the data once and persist it to disk so that both Python
# versions run on exactly the same list.
lst = [random.randint(0, 10) for _ in range(100000000)]
np.save('lst', lst, allow_pickle=False)
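
(For completeness, a minimal sketch of the seeding alternative mentioned above, assuming random produces the same sequence under both interpreter versions, would be:)

import random

random.seed(0)  # fixed seed so both interpreters build the same list
lst = [random.randint(0, 10) for _ in range(100000000)]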

Second, let's use the timeit module instead of IPython's %timeit magic command. The reason comes from the following test performed in Python 3.5:

In [11]: f = (5).__lt__
In [12]: %timeit -n1 -r20 [f(i) for i in lst]
1 loop, best of 20: 9.01 s per loop

Compare this to the result of timeit in the same version of Python:

>>> t = timeit.repeat('[f(i) for i in lst]', setup="f = (5).__lt__; import numpy; lst = numpy.load('lst.npy').tolist()", repeat=20, number=1); print(min(t), max(t), np.mean(t), np.std(t))
7.442819457995938 7.703615028003696 7.5105415405 0.0550515642854

For reasons unknown to me, IPython's %timeit magic adds some overhead compared to the timeit module. Therefore, I will use timeit exclusively in my testing.

NOTE: In the discussion that follows I will use only the minimum timing (min(t)).
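
For readability, the repeated calls below could be wrapped in a small helper along these lines (a hypothetical bench() convenience function; the transcripts below keep the explicit calls so they can be pasted directly into the interpreter):

import timeit
import numpy as np

def bench(stmt, setup='pass', repeat=20, number=1):
    # Run `stmt` `repeat` times, `number` executions each, and report
    # min/max/mean/std; the minimum is the figure used in the discussion.
    t = timeit.repeat(stmt, setup=setup, repeat=repeat, number=number)
    return min(t), max(t), np.mean(t), np.std(t)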

Tests in Python 3.5.3:

Group 1: map and list comprehension tests

>>> import numpy as np
>>> import timeit

>>> t = timeit.repeat('list(map(f, lst))', setup="f = (5).__lt__; import numpy; lst = numpy.load('lst.npy').tolist()", repeat=20, number=1); print(min(t), max(t), np.mean(t), np.std(t))
4.666553302988177 4.811194089008495 4.72791638025 0.041115884397

>>> t = timeit.repeat('[f(i) for i in lst]', setup="f = (5).__lt__; import numpy; lst = numpy.load('lst.npy').tolist()", repeat=20, number=1); print(min(t), max(t), np.mean(t), np.std(t))
7.442819457995938 7.703615028003696 7.5105415405 0.0550515642854

>>> t = timeit.repeat('[5 < i for i in lst]', setup="import numpy; lst = numpy.load('lst.npy').tolist()", repeat=20, number=1); print(min(t), max(t), np.mean(t), np.std(t))
4.94656751700677 5.07807950800634 5.00670203845 0.0340474956945

>>> t = timeit.repeat('list(map(abs, lst))', setup="import numpy; lst = numpy.load('lst.npy').tolist()", repeat=20, number=1); print(min(t), max(t), np.mean(t), np.std(t))
4.167273573024431 4.320013975986512 4.2408865186 0.0378852782878

>>> t = timeit.repeat('[abs(i) for i in lst]', setup="import numpy; lst = numpy.load('lst.npy').tolist()", repeat=20, number=1); print(min(t), max(t), np.mean(t), np.std(t))
5.664627838006709 5.837686392012984 5.71560354655 0.0456700607748

Notice how the second test (the list comprehension using f(i)) is significantly slower than the third test (the list comprehension using 5 < i), indicating that f = (5).__lt__ is not identical (or even nearly identical) to 5 < i from the interpreter's point of view.

Group 2: "individual" function tests

>>> t = timeit.repeat('f(1)', setup="f = (5).__lt__", repeat=20, number=1000000); print(min(t), max(t), np.mean(t), np.std(t))
0.052280781004810706 0.05500587198184803 0.0531139718529 0.000877649561967

>>> t = timeit.repeat('5 < 1', repeat=20, number=1000000); print(min(t), max(t), np.mean(t), np.std(t))
0.030931947025237605 0.033691533986711875 0.0314959864045 0.000633274658428

>>> t = timeit.repeat('abs(1)', repeat=20, number=1000000); print(min(t), max(t), np.mean(t), np.std(t))
0.04685414198320359 0.05405496899038553 0.0483296330043 0.00162837880358

Notice how, again, the first test (of f(1)) is significantly slower than the second test (of 5 < 1), further supporting that f = (5).__lt__ is not identical (or even nearly identical) to 5 < i from the interpreter's point of view.
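
A quick way to see where that difference comes from (a minimal sketch, not part of the timing runs) is to compare the bytecode: 5 < i compiles down to a single COMPARE_OP instruction, while f(i) has to go through a generic function call, which adds call overhead on every element:

>>> import dis
>>> dis.dis(compile('5 < i', '<expr>', 'eval'))   # LOAD_CONST, LOAD_NAME, COMPARE_OP
>>> dis.dis(compile('f(i)', '<expr>', 'eval'))    # LOAD_NAME, LOAD_NAME, CALL_FUNCTION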

Tests in Python 3.6.2:

Group 1: map and list comprehension tests

>>> import numpy as np
>>> import timeit

>>> t = timeit.repeat('list(map(f, lst))', setup="f = (5).__lt__; import numpy; lst = numpy.load('lst.npy').tolist()", repeat=20, number=1); print(min(t), max(t), np.mean(t), np.std(t))
4.599696700985078 4.743880658003036 4.6631793691 0.0425774678203

>>> t = timeit.repeat('[f(i) for i in lst]', setup="f = (5).__lt__; import numpy; lst = numpy.load('lst.npy').tolist()", repeat=20, number=1); print(min(t), max(t), np.mean(t), np.std(t))
7.316072431014618 7.572676292009419 7.3837024617 0.0574811241553

>>> t = timeit.repeat('[5 < i for i in lst]', setup="import numpy; lst = numpy.load('lst.npy').tolist()", repeat=20, number=1); print(min(t), max(t), np.mean(t), np.std(t))
4.570452399988426 4.679144663008628 4.61264215875 0.0265541828693

>>> t = timeit.repeat('list(map(abs, lst))', setup="import numpy; lst = numpy.load('lst.npy').tolist()", repeat=20, number=1); print(min(t), max(t), np.mean(t), np.std(t))
2.742673939006636 2.8282236389932223 2.78504617405 0.0260357089928

>>> t = timeit.repeat('[abs(i) for i in lst]', setup="import numpy; lst = numpy.load('lst.npy').tolist()", repeat=20, number=1); print(min(t), max(t), np.mean(t), np.std(t))
6.2177103200228885 6.428813881997485 6.28722427145 0.0493010620999

Group 2: "individual" function tests

>>> t = timeit.repeat('f(1)', setup="f = (5).__lt__", repeat=20, number=1000000); print(min(t), max(t), np.mean(t), np.std(t))
0.051936342992121354 0.05764096099301241 0.0532974587506 0.00117079475737

>>> t = timeit.repeat('5 < 1', repeat=20, number=1000000); print(min(t), max(t), np.mean(t), np.std(t))
0.02675032999832183 0.032919151999522 0.0285137565021 0.00156522182488

>>> t = timeit.repeat('abs(1)', repeat=20, number=1000000); print(min(t), max(t), np.mean(t), np.std(t))
0.047831349016632885 0.0531779529992491 0.0482893927969 0.00112825297875

Notice how, again, the first test (of f(1)) is significantly slower than the second test (of 5 < 1), further supporting that f = (5).__lt__ is not identical (or even nearly identical) to 5 < i from the interpreter's point of view.

Discussion

I do not know how reliable these timing tests are, and it is also difficult to separate all the factors that contribute to the results. However, we can see from the "Group 2" tests that the only "individual" test whose timing changed significantly is the test of 5 < 1: it went down from 0.0309s in Python 3.5 to 0.0268s in Python 3.6. This makes the list comprehension test that uses 5 < i run faster in Python 3.6 than the similar test in Python 3.5. However, this does not mean that the list comprehension itself became faster in Python 3.6.

Let's compare the relative performance of map to the list comprehension for the same function within the same Python version. In Python 3.5 we get r(f) = 7.4428/4.6666 = 1.595 and r(abs) = 5.665/4.167 = 1.359, while in Python 3.6 we get r(f) = 7.316/4.5997 = 1.591 and r(abs) = 6.218/2.743 = 2.267. Based on these relative performances we can see that in Python 3.6 the performance of map relative to the list comprehension is at least as good as in Python 3.5 for the f = (5).__lt__ function, and the ratio has even improved for a function such as abs().
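
The ratios above are simply the list-comprehension minimum divided by the map minimum from the tables; as a quick sanity check:

# list-comprehension min / map min, taken from the timings above
print(7.4428 / 4.6666, 5.665 / 4.167)   # Python 3.5: ~1.59, ~1.36
print(7.316 / 4.5997, 6.218 / 2.743)    # Python 3.6: ~1.59, ~2.27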

In any case, I believe there is no evidence that the list comprehension became faster in Python 3.6, in either the relative or the absolute sense. The only performance improvement is for the [5 < i for i in lst] test, and that is because 5 < i itself became faster in Python 3.6, not because the list comprehension machinery became faster.

answered Oct 12 '22 by AGN Gazer