I'm using Python 2.5 on Linux, in multiple parallel FCGI processes. I use
chars = string.ascii_letters + string.digits
cookie = ''.join([random.choice(chars) for x in range(32)])
to generate distinct cookies. Assuming that the RNG is seeded from /dev/urandom, and that the sequence of random numbers comes from the Mersenne twister, I would expect that there is practically zero chance of collision.
However, I do see regular collisions, even though only a few (<100) users are logged in at any time.
Why are the random numbers not more random?
Use the random. sample function when you want to choose multiple random items from a list without including the duplicates. Use random. choices function when you want to choose multiple items out of a list including repeated.
Python Random choice() Method The choice() method returns a randomly selected element from the specified sequence. The sequence can be a string, a range, a list, a tuple or any other kind of sequence.
In Python, you can randomly sample elements from a list with choice() , sample() , and choices() of the random module. These functions can also be applied to a string and tuple. choice() returns one random element, and sample() and choices() return a list of multiple random elements.
It shouldn't be generating duplicates.
import random
chars = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
def gen():
return ''.join([random.choice(chars) for x in range(32)])
test = [gen() for i in range(100000)]
print len(test), len(set(test)) # 100000 100000
The chances of duplicates is significant with chars = "ab"; 126 duplicates in 1000000 iterations. It's nonexistant with 62.
That said, this isn't a good way to generate cookies, because session cookies need to be unpredictable, to avoid attacks involving stealing other people's session cookies. The Mersenne Twister is not designed for generating secure random numbers. This is what I do:
import os, hashlib
def gen():
return hashlib.sha1(os.urandom(512)).hexdigest()
test = [gen() for i in range(100000)]
print len(test), len(set(test))
... which should be very secure (which is to say, difficult to take a string of session cookies and guess other existing session cookies from them).
This is definitely not a normal collision scenario:
Could this be a concurrency issue?
random.Random
instances for each threadthreading.local()
)os.urandom()
- not system time - so you should get different streams for each thread.If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With