I was reading about Python's random module in the standard library. It amazes me that when I set the seed and produce a few random numbers:
import random

random.seed(1)
for i in range(5):
    print(random.random())
The numbers produced are exactly the same as the sample in the article. I think it's safe to say the algorithm is deterministic when the seed is set.
And when the seed is not set, the standard library seeds with time.time().
Now suppose an online service uses random.random() to generate a captcha code. Can a hacker use the same random generator to reproduce the captcha easily?
Am I worrying too much, or is this a real vulnerability?
The numbers and data generated by Python's random module are not truly random; they are pseudo-random (the module is a PRNG), i.e., deterministic. The random module uses the seed value as the starting state from which it generates its numbers.
For "secure" random numbers, Python doesn't actually generate them: it gets them from the operating system, which has a special driver that gathers entropy from various real-world sources, such as variations in timing between keystrokes and disk seeks.
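As a rough illustration of asking the operating system for randomness, here is a minimal sketch using os.urandom and random.SystemRandom from the standard library. The variable names are only illustrative, and the byte count of 16 is an arbitrary choice.

```python
import os
import random

# os.urandom returns bytes drawn from the operating system's randomness source.
token_bytes = os.urandom(16)

# random.SystemRandom offers the familiar random API on top of os.urandom.
# Seeding it has no effect, so its output cannot be replayed from a seed.
system_rng = random.SystemRandom()
secure_float = system_rng.random()
secure_int = system_rng.randrange(10**6)

print(token_bytes.hex())
print(secure_float, secure_int)
```

Running this twice should give different values each time, since there is no seed to fix.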
Python, like most programming languages, uses a pseudo-random generator by default. Python's random module is based on the Mersenne Twister algorithm and produces floats with 53-bit precision.
Most random data generated with Python is not fully random in the scientific sense of the word. Rather, it is pseudorandom: generated with a pseudorandom number generator (PRNG), which is essentially any algorithm for generating seemingly random but still reproducible data.
It shouldn't surprise you that the sequence is deterministic after seeding. That's the whole point of seeding. random.random is backed by a PRNG, a pseudo-random number generator. This is not unique to Python; every language's simple random source is deterministic in this way.
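To make the determinism concrete, here is a small sketch that builds separate random.Random instances and compares their output. The helper name first_values is made up for this example.

```python
import random

def first_values(seed, count=5):
    """Return the first `count` floats from a generator started with `seed`."""
    rng = random.Random(seed)  # independent generator, does not touch the global one
    return [rng.random() for _ in range(count)]

run_a = first_values(1)
run_b = first_values(1)
run_c = first_values(2)

print(run_a == run_b)  # same seed, same sequence
print(run_a == run_c)  # a different seed gives a different sequence
```

Anyone who learns the seed (or enough of the generator's output to reconstruct its state) can reproduce the numbers that follow.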
And yes, people who are genuinely concerned about security will worry that an attacker could reproduce the sequence. That's why other sources of randomness are available, like os.urandom, but they are more expensive.
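For the captcha case specifically, one possible approach is the secrets module (Python 3.6+), which draws from the operating system's randomness source. This is only a sketch: the function name make_captcha_code, the alphabet, and the length of 6 are assumptions for illustration, not anything a particular service uses.

```python
import secrets
import string

# Hypothetical alphabet for the captcha; adjust to taste.
ALPHABET = string.ascii_uppercase + string.digits

def make_captcha_code(length=6):
    """Build a code by picking each character with an OS-backed random choice."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))

code = make_captcha_code()
print(code)
```

Because no seed is involved, an attacker cannot regenerate the code by running the same generator on their own machine.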
But the problem is not as bad as you say: for a web request, typically a process handles more than one request, so the module is initialized at some unknown point in the past, not when the web request was received.