I'm in the process of working on a programming project that involves some pretty extensive Monte Carlo simulation in Python, and as such the generation of a tremendous number of random numbers. Very nearly all of them, if not all of them, can be generated by Python's built-in random module.
I'm something of a coding newbie, and unfamiliar with efficient and inefficient ways of doing things. Is it faster to generate, say, all the random numbers as a list and then iterate through that list, or to generate a new random number each time a function is called, inside a very large loop?
Or some other, undoubtedly more clever method?
Generate a random number each time. Since the inner workings of the loop only care about a single random number, generate and use it inside the loop.
Example:
# do this:
import random
for x in range(SOMEVERYLARGENUMBER):
    n = random.randint(1, 1000)  # whatever your range of random numbers is
    # Do stuff with n
# don't do this:
import random
# This list comprehension generates all the random numbers in a list up front
numbers = [random.randint(1, 1000) for x in range(SOMEVERYLARGENUMBER)]
for n in numbers:
    pass  # Do stuff with n
Obviously, in practical terms it really doesn't matter, unless you're dealing with billions and billions of iterations, but why bother generating all those numbers if you're only going to be using one at a time?
import random
for x in (random.randint(0, 80) for x in range(1000 * 1000)):
    print(x)
The expression between the parentheses is a generator: it produces only one item at a time, so it is memory-friendly.
Python's built-in random module, e.g. random.random() or random.randint() (some other distributions are also available; you probably want a Gaussian), does about 300K samples/s.
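(A rough, machine-dependent way to check that figure yourself, not part of the original answer, is to time the calls with timeit:)
import timeit

# Roughly measure the built-in generator's throughput on this machine.
n = 1_000_000
secs = timeit.timeit("random.random()", setup="import random", number=n)
print(f"random.random():    ~{n / secs:,.0f} samples/s")
secs = timeit.timeit("random.gauss(0, 1)", setup="import random", number=n)
print(f"random.gauss(0, 1): ~{n / secs:,.0f} samples/s")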
Since you are doing numerical computation, you probably use numpy anyway, which offers better performance if you cook up random numbers one array at a time instead of one number at a time, along with a wider choice of distributions. At roughly 60K arrays/s × 1024 samples per array, that's ~60M samples/s.
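A minimal sketch of the array-at-a-time approach, assuming numpy is installed (the default_rng Generator API shown here is one way to do it, not necessarily what this answer benchmarked):
import numpy as np

rng = np.random.default_rng()
uniform = rng.random(1_000_000)               # 1M floats in [0.0, 1.0)
ints = rng.integers(1, 1001, size=1_000_000)  # 1M ints in [1, 1000]
gauss = rng.normal(0.0, 1.0, size=1_000_000)  # 1M standard-normal samples
# Where possible, vectorise the Monte Carlo step over whole arrays instead of looping.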
You can also read /dev/urandom on Linux and OS X; my hw/sw (an OS X laptop) manages ~10 MB/s.
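A small, portable sketch of the same idea (my illustration, not the answer's code): os.urandom pulls from the same OS source, and the raw bytes still have to be turned into numbers yourself:
import os

chunk = os.urandom(1024 * 1024)  # 1 MiB of raw random bytes from the OS
# Interpret each 8-byte slice as an unsigned integer, then scale to [0.0, 1.0).
numbers = [int.from_bytes(chunk[i:i + 8], "little") / 2.0**64
           for i in range(0, len(chunk), 8)]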
Surely there must be faster ways to generate random numbers en masse, e.g.:
from Crypto.Cipher import AES
from Crypto.Util import Counter
import secrets

# CTR mode with a full 128-bit counter; a random key gives a fresh keystream each run.
aes = AES.new(secrets.token_bytes(16), AES.MODE_CTR, counter=Counter.new(128))
data = b"\0" * 2 ** 20  # 1 MiB of plaintext to turn into keystream per write
with open("filler.bin", "wb") as f:
    while True:
        f.write(aes.encrypt(data))
This generates 200 MB/s on a single core of an i5-4670K.
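The keystream above is raw bytes; if you need floats or integers rather than a file of bytes, one hedged option (assuming numpy and PyCryptodome are available) is to reinterpret the buffer:
import secrets
import numpy as np
from Crypto.Cipher import AES
from Crypto.Util import Counter

aes = AES.new(secrets.token_bytes(16), AES.MODE_CTR, counter=Counter.new(128))
block = aes.encrypt(b"\0" * 2 ** 20)          # 1 MiB of keystream bytes
u64 = np.frombuffer(block, dtype=np.uint64)   # reinterpret as 64-bit unsigned ints
uniform = u64 / 2.0**64                       # uniform floats in [0.0, 1.0)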
Common ciphers like AES and Blowfish manage 112 MB/s and 70 MB/s on my stack. Furthermore, modern processors with hardware AES instructions make AES even faster, up to some 700 MB/s; see this link to test runs on a few hardware combinations (edit: link broken). You could use the weaker ECB mode, provided you feed distinct inputs into it, and achieve up to 3 GB/s.
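A sketch of the ECB idea (my illustration, not this answer's code): encrypting a strictly increasing counter under ECB never reuses an input block, so the output is effectively a hand-rolled CTR keystream. Note that in pure Python the packing of the counter blocks dominates the runtime, so the 3 GB/s figure is about the cipher alone:
import secrets
from Crypto.Cipher import AES

ecb = AES.new(secrets.token_bytes(16), AES.MODE_ECB)
# 65536 distinct 16-byte inputs: a simple big-endian counter, never reused.
buf = b"".join(i.to_bytes(16, "big") for i in range(65536))
keystream = ecb.encrypt(buf)  # 1 MiB of pseudorandom bytes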
Stream ciphers are better suited for the task; e.g., RC4 tops out at 300 MB/s on my hardware. You may get the best results from the most popular ciphers, as more effort has been spent optimising them in both hardware and software.
Code to generate 10M random numbers efficiently and faster:
import random

l = 10000000
listrandom = []
for i in range(l):
    value = random.randint(0, l)
    listrandom.append(value)
print(listrandom)
Time taken, including the I/O time spent printing to the screen:
real    0m27.116s
user    0m24.391s
sys     0m0.819s
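For comparison, a numpy version of the same task (assuming numpy is installed) builds the 10M numbers in one vectorised call, which is typically much faster than the pure-Python loop:
import numpy as np

l = 10_000_000
# integers() uses an exclusive upper bound, so l + 1 matches randint(0, l).
listrandom = np.random.default_rng().integers(0, l + 1, size=l)
print(listrandom)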