When using pythons random.shuffle
function, I noticed it went significantly faster to use sorted(l, key=lambda _: random.random())
than random.shuffle(l)
. As far as I understand, both ways produce completely random lists, so why does shuffle
take so much longer?
Below are the times using timeit
module.
from timeit import timeit
setup = 'import random\nl = list(range(1000))'
# 5.542 seconds
print(timeit('random.shuffle(l)', setup=setup, number=10000))
# 1.878 seconds
print(timeit('sorted(l, key=lambda _: random.random())', setup=setup, number=10000))
On CPython (the reference interpreter) random.shuffle
is implemented in Python (and implemented in terms of _randbelow
, itself a Python wrapper around getrandbits
, the C level function that ultimately implements it, and which can end up being called nearly twice as often as strictly necessary in an effort to ensure the outputs are unbiased); sorted
(and random.random
) are implemented in C. The overhead of performing work in Python is higher than performing similar work in C.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With