Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why is random.shuffle so much slower than using sorted function?

When using pythons random.shuffle function, I noticed it went significantly faster to use sorted(l, key=lambda _: random.random()) than random.shuffle(l). As far as I understand, both ways produce completely random lists, so why does shuffle take so much longer?

Below are the times using timeit module.

from timeit import timeit
setup = 'import random\nl = list(range(1000))'

# 5.542 seconds
print(timeit('random.shuffle(l)', setup=setup, number=10000))

# 1.878 seconds
print(timeit('sorted(l, key=lambda _: random.random())', setup=setup, number=10000))
like image 980
mazore Avatar asked Oct 15 '22 23:10

mazore


1 Answers

On CPython (the reference interpreter) random.shuffle is implemented in Python (and implemented in terms of _randbelow, itself a Python wrapper around getrandbits, the C level function that ultimately implements it, and which can end up being called nearly twice as often as strictly necessary in an effort to ensure the outputs are unbiased); sorted (and random.random) are implemented in C. The overhead of performing work in Python is higher than performing similar work in C.

like image 134
ShadowRanger Avatar answered Oct 19 '22 02:10

ShadowRanger