I sometimes used map
if there was a function/method that was written in C to get a bit extra performance. However I recently revisited some of my benchmarks and noticed that the relative performance (compared to a similar list comprehension) drastically changed between Python 3.5 and 3.6.
That's not the actual code but just a minimal sample that illustrates the difference:
import random
lst = [random.randint(0, 10) for _ in range(100000)]
assert list(map((5).__lt__, lst)) == [5 < i for i in lst]
%timeit list(map((5).__lt__, lst))
%timeit [5 < i for i in lst]
I realize that it's not a good idea to use (5).__lt__
but I couldn't come up with a useful example right now.
The timings on Python-3.5 were in favor of the map
approach:
15.1 ms ± 5.64 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
16.7 ms ± 35.6 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
While the Python-3.6 timings actually show that the comprehension is faster:
17.9 ms ± 755 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
14.3 ms ± 128 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
My question is what happened in this case that made the list-comprehension faster and the map
solution slower? I realize the difference isn't that much, it just made me curious because that was one of the "tricks" I sometimes (actually seldom) used in performance critical codes.
I think a fair comparison involves using the same function and same testing conditions in Python 3.5 and 3.6 as well as when comparing map
to list comprehension in a chosen Python version.
In my initial answer I have performed multiple tests that showed that map
was still faster by a factor of about two in both versions of Python when compared to list comprehension. However some results were not conclusive and so I performed some more tests.
First let me cite some of your points stated in the question:
"... [I] noticed that the relative performance [of
map
] (compared to a similar list comprehension) drastically changed between Python 3.5 and 3.6"
You also ask:
"My question is what happened in this case that made the list-comprehension faster and the map solution slower?"
It is not very clear if you mean that map is slower than list comprehension in Python 3.6 or if you mean that map is slower in Python 3.6 than in 3.5 and list comprehension's performance has increased (albeit not necessarily to the level of beating map
).
Based on more extensive tests that I have performed after my first answer to this question, I think I have an idea of what is going on.
However, first let's create conditions for "fair" comparisons. For this we need to:
Compare performance of map
in different Python versions using the same function;
Compare performance of map
to list comprehension in the same version using same function;
Run the tests on same data;
Minimize contribution from timing functions.
Here is version information about my system:
Python 3.5.3 |Continuum Analytics, Inc.| (default, Mar 6 2017, 12:15:08)
[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.57)] on darwin
IPython 5.3.0 -- An enhanced Interactive Python.
and
Python 3.6.2 |Continuum Analytics, Inc.| (default, Jul 20 2017, 13:14:59)
[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.57)] on darwin
IPython 6.1.0 -- An enhanced Interactive Python. Type '?' for help.
Let's first address the issue of "same data". Unfortunately because you effectively are using seed(None)
, each data set lst
is different on each of the two versions of Python. This probably contributes to the difference in performance seen on two Python versions. One fix would be to set, e.g., random.seed(0)
(or something like that). I chose to create the list once and save it using numpy.save()
and then load it in each version. This is especially important because I chose to modify your tests slightly (number of "loops" and "repeats") and I have increased the length of your dataset to 100,000,000:
import numpy as np
import random
lst = [random.randint(0, 10) for _ in range(100000000)]
np.save('lst', lst, allow_pickle=False)
Second, let's use timeit
module instead of IPython's magic command %timeit
. The reason for doing this comes from the following test performed in Python 3.5:
In [11]: f = (5).__lt__
In [12]: %timeit -n1 -r20 [f(i) for i in lst]
1 loop, best of 20: 9.01 s per loop
Compare this to the result of timeit
in same version of Python:
>>> t = timeit.repeat('[f(i) for i in lst]', setup="f = (5).__lt__;
... import numpy; lst = numpy.load('lst.npy').tolist()", repeat=20,
... number=1); print(min(t), max(t), np.mean(t), np.std(t))
7.442819457995938 7.703615028003696 7.5105415405 0.0550515642854
For unknown to me reasons, IPython's magic %timeit
is adding some time compared to timeit
package. Therefore, I will use timeit
exclusively in my testing.
NOTE: In the discussions that follows I will use only minimum timing (min(t)
).
>>> import numpy as np
>>> import timeit
>>> t = timeit.repeat('list(map(f, lst))', setup="f = (5).__lt__; import numpy; lst = numpy.load('lst.npy').tolist()", repeat=20, number=1); print(min(t), max(t), np.mean(t), np.std(t))
4.666553302988177 4.811194089008495 4.72791638025 0.041115884397
>>> t = timeit.repeat('[f(i) for i in lst]', setup="f = (5).__lt__; import numpy; lst = numpy.load('lst.npy').tolist()", repeat=20, number=1); print(min(t), max(t), np.mean(t), np.std(t))
7.442819457995938 7.703615028003696 7.5105415405 0.0550515642854
>>> t = timeit.repeat('[5 < i for i in lst]', setup="import numpy; lst = numpy.load('lst.npy').tolist()", repeat=20, number=1); print(min(t), max(t), np.mean(t), np.std(t))
4.94656751700677 5.07807950800634 5.00670203845 0.0340474956945
>>> t = timeit.repeat('list(map(abs, lst))', setup="import numpy; lst = numpy.load('lst.npy').tolist()", repeat=20, number=1); print(min(t), max(t), np.mean(t), np.std(t))
4.167273573024431 4.320013975986512 4.2408865186 0.0378852782878
>>> t = timeit.repeat('[abs(i) for i in lst]', setup="import numpy; lst = numpy.load('lst.npy').tolist()", repeat=20, number=1); print(min(t), max(t), np.mean(t), np.std(t))
5.664627838006709 5.837686392012984 5.71560354655 0.0456700607748
Notice how second test (list comprehension using f(i)
) is significantly slower than third test (list comprehension using 5 < i
) indicating that f = (5).__lt__
is not identical (or almost identical) to 5 < i
from the code perspective.
>>> t = timeit.repeat('f(1)', setup="f = (5).__lt__", repeat=20, number=1000000); print(min(t), max(t), np.mean(t), np.std(t))
0.052280781004810706 0.05500587198184803 0.0531139718529 0.000877649561967
>>> t = timeit.repeat('5 < 1', repeat=20, number=1000000); print(min(t), max(t), np.mean(t), np.std(t))
0.030931947025237605 0.033691533986711875 0.0314959864045 0.000633274658428
>>> t = timeit.repeat('abs(1)', repeat=20, number=1000000); print(min(t), max(t), np.mean(t), np.std(t))
0.04685414198320359 0.05405496899038553 0.0483296330043 0.00162837880358
Notice how again first test (of f(1)
) is significantly slower than second test (of 5 < 1
) further supporting that f = (5).__lt__
is not identical (or almost identical) to 5 < i
from the code perspective.
>>> import numpy as np
>>> import timeit
>>> t = timeit.repeat('list(map(f, lst))', setup="f = (5).__lt__; import numpy; lst = numpy.load('lst.npy').tolist()", repeat=20, number=1); print(min(t), max(t), np.mean(t), np.std(t))
4.599696700985078 4.743880658003036 4.6631793691 0.0425774678203
>>> t = timeit.repeat('[f(i) for i in lst]', setup="f = (5).__lt__; import numpy; lst = numpy.load('lst.npy').tolist()", repeat=20, number=1); print(min(t), max(t), np.mean(t), np.std(t))
7.316072431014618 7.572676292009419 7.3837024617 0.0574811241553
>>> t = timeit.repeat('[5 < i for i in lst]', setup="import numpy; lst = numpy.load('lst.npy').tolist()", repeat=20, number=1); print(min(t), max(t), np.mean(t), np.std(t))
4.570452399988426 4.679144663008628 4.61264215875 0.0265541828693
>>> t = timeit.repeat('list(map(abs, lst))', setup="import numpy; lst = numpy.load('lst.npy').tolist()", repeat=20, number=1); print(min(t), max(t), np.mean(t), np.std(t))
2.742673939006636 2.8282236389932223 2.78504617405 0.0260357089928
>>> t = timeit.repeat('[abs(i) for i in lst]', setup="import numpy; lst = numpy.load('lst.npy').tolist()", repeat=20, number=1); print(min(t), max(t), np.mean(t), np.std(t))
6.2177103200228885 6.428813881997485 6.28722427145 0.0493010620999
>>> t = timeit.repeat('f(1)', setup="f = (5).__lt__", repeat=20, number=1000000); print(min(t), max(t), np.mean(t), np.std(t))
0.051936342992121354 0.05764096099301241 0.0532974587506 0.00117079475737
>>> t = timeit.repeat('5 < 1', repeat=20, number=1000000); print(min(t), max(t), np.mean(t), np.std(t))
0.02675032999832183 0.032919151999522 0.0285137565021 0.00156522182488
>>> t = timeit.repeat('abs(1)', repeat=20, number=1000000); print(min(t), max(t), np.mean(t), np.std(t))
0.047831349016632885 0.0531779529992491 0.0482893927969 0.00112825297875
Notice how again first test (of f(1)
) is significantly slower than second test (of 5 < 1
) further supporting that f = (5).__lt__
is not identical (or almost identical) to 5 < i
from the code perspective.
I do not know how reliable are these timing tests and it is also difficult to separate all factors that contribute to these timing results. However we can notice from the "Group 2" of tests that the only "individual" test that significantly changed its timing is the test of 5 < 1
: it went down to 0.0268s in Python 3.6 from 0.0309s in Python 3.5. This makes the list comprehension test in Python 3.6 that uses 5 < i
to run faster than a similar test in Python 3.5. However, this does not mean that list comprehension become faster in Python 3.6.
Let's compare relative performance of map
to list comprehension for the same function in the same Python version. Then we get in Python 3.5: r(f) = 7.4428/4.6666 = 1.595
, r(abs) = 5.665/4.167 = 1.359
and in Python 3.6: r(f) = 7.316/4.5997 = 1.591
, r(abs) = 6.218/2.743 = 2.267
. Based on these relative performances we can see that in Python 3.6 performance of the map
relative to the performance of list comprehension is at least the same as in Python 3.5 for the f = (5).__lt__
function and this ratio has even improved for a function such as abs()
in Python 3.6.
In any case, I believe that there is no evidence that list comprehension became faster in Python 3.6 neither in the relative nor in the absolute sense. The only performance improvement is for [5 < i for i in lst]
test but that is because 5 < i
itself became faster in Python 3.6 and not due to the list comprehension itself being faster.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With