
Python: where is random.random() seeded?

Say I have some Python code:

import random
r=random.random()

Where is the value of r seeded from in general?
And what if my OS has no random source, where is it seeded from then?
Why isn't this recommended for cryptography? Is there some way to know what the random number is?

Academiphile asked Dec 04 '14 01:12



1 Answer

Follow da code.

To see where the random module "lives" in your system, you can just do this in a Python shell:

>>> import random
>>> random.__file__
'/usr/lib/python2.7/random.pyc'

That gives you the path to the .pyc ("compiled") file, which usually sits right next to the original .py, where the readable code can be found.

Let's see what's going on in /usr/lib/python2.7/random.py:

You'll see that it creates an instance of the Random class and then (at the bottom of the file) "promotes" that instance's methods to module functions. Neat trick. The first time the random module is imported in an interpreter, a new instance of that Random class is created and seeded, and its methods are re-assigned as functions of the module. Later imports reuse the already loaded module, so the seeding happens once per Python interpreter instance, not once per import statement.

_inst = Random()
seed = _inst.seed
random = _inst.random
uniform = _inst.uniform
triangular = _inst.triangular
randint = _inst.randint
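
A quick way to convince yourself of that "promotion" trick (a small sketch, written for Python 3 rather than the 2.7 shown here; the variable names are just for illustration): the module-level functions are bound methods of one shared instance, so seeding through the module behaves exactly like seeding a private Random object.

```python
import random

# The module-level functions are bound methods of one shared, hidden instance.
shared = random.random.__self__
is_random_instance = isinstance(shared, random.Random)

# Seeding through the module seeds that shared instance...
random.seed(42)
from_module = [random.random() for _ in range(3)]

# ...so a private instance with the same seed replays the same values.
private = random.Random(42)
from_private = [private.random() for _ in range(3)]

print(is_random_instance, from_module == from_private)
```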

The only thing that this Random class does in its __init__ method is seed itself:

class Random(_random.Random):
    ...
    def __init__(self, x=None):
        self.seed(x)    
...
_inst = Random()
seed = _inst.seed

So... what happens if x is None (no seed has been specified)? Well, let's check that self.seed method:

def seed(self, a=None):
    """Initialize internal state from hashable object.

    None or no argument seeds from current time or from an operating
    system specific randomness source if available.

    If a is not None or an int or long, hash(a) is used instead.
    """

    if a is None:
        try:
            a = long(_hexlify(_urandom(16)), 16)
        except NotImplementedError:
            import time
            a = long(time.time() * 256) # use fractional seconds

    super(Random, self).seed(a)
    self.gauss_next = None

The docstring already tells what's going on... This method tries to use the random source provided by the OS, and if there's none, it falls back to the current time as the seed value.

But, wait... What the heck is that _urandom(16) thingy then?

Well, the answer lies at the beginning of this random.py file:

from os import urandom as _urandom
from binascii import hexlify as _hexlify

Tadaaa... The seed is a 16-byte (128-bit) number that came from os.urandom.

Let's say we're in a civilized OS, such as Linux (with a real random number generator). The seed used by the random module is obtained the same way as doing:

>>> import os, binascii
>>> long(binascii.hexlify(os.urandom(16)), 16)
46313715670266209791161509840588935391L
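
Putting the pieces together, here is a rough Python 3 sketch of the same seeding logic as the 2.7 excerpt above. The function name default_seed is made up for illustration, int.from_bytes stands in for the long(hexlify(...)) conversion (Python 3 has no long), and newer versions of random.py may differ in the details.

```python
import os
import time

def default_seed():
    """Pick a seed the way the 2.7 excerpt does when no seed is given."""
    try:
        # 16 bytes (128 bits) from the OS, read as one big integer.
        return int.from_bytes(os.urandom(16), "big")
    except NotImplementedError:
        # No OS source available: fall back to the clock, in 1/256 s steps.
        return int(time.time() * 256)

seed_value = default_seed()
print(seed_value.bit_length() <= 128)
```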

The reason why specifying a fixed seed value is considered not so great is that the random functions are not really "random"... They're a deterministic sequence of numbers that merely looks random, and that sequence will be the same given the same seed. You can try this yourself:

>>> import random
>>> random.seed(1)
>>> random.randint(0,100)
13
>>> random.randint(0,100)
85
>>> random.randint(0,100)
77

No matter when or how or even where you run that code (as long as the algorithm used to generate the random numbers remains the same), if your seed is 1, you will always get the integers 13, 85, 77... which kind of defeats the purpose (see this about Pseudorandom number generation). The values shown are from Python 2.7; Python 3 changed the algorithm, so it produces a different but equally repeatable sequence. On the other hand, there are use cases where this is actually a desirable feature, such as reproducible tests or simulations.
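
When repeatability is what you want, one common pattern is to give the code its own seeded Random instance instead of reseeding the shared module-level one. A minimal sketch for Python 3 (shuffled_deck is a made-up helper, not part of the random module):

```python
import random

def shuffled_deck(seed):
    # A private generator: the module-level random state is left untouched.
    rng = random.Random(seed)
    deck = list(range(10))
    rng.shuffle(deck)
    return deck

first = shuffled_deck(1)
second = shuffled_deck(1)

# Same seed, same algorithm: the two shuffles are identical.
print(first == second)
```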

That's why relying on the operating system's random number generator is considered "better". Its output is derived from hardware interrupts, which are very, very hard to predict (hard drive reads, keystrokes typed by the human user, moving a mouse around...). In Linux, that OS generator is /dev/random. Or, being a tad picky, /dev/urandom (that's what os.urandom reads on Linux in the Python 2.7 shown here). The difference is that /dev/random traditionally blocks: if not enough interrupts have come in, it can run out of estimated entropy and you might have to wait a little bit until you can get the next random bytes. /dev/urandom draws from the same kernel entropy pool, but it never blocks, so it always has random bytes ready for you. (Recent Linux kernels have largely removed this blocking behaviour from /dev/random.)
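
From Python you don't need to touch the device files yourself. A small sketch for Python 3 follows; the direct read of /dev/urandom is only there for comparison and assumes a Unix-like system where that path exists.

```python
import os

# Ask the OS generator for 16 bytes through the portable API.
chunk = os.urandom(16)
print(len(chunk), chunk.hex())

# For comparison only: read the device directly where it exists.
raw = b""
if os.path.exists("/dev/urandom"):
    with open("/dev/urandom", "rb") as dev:
        raw = dev.read(16)
    print(len(raw))
```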

If you're using Linux, just do cat /dev/random on a terminal (and prepare to hit Ctrl+C, because it will start outputting really, really random stuff):

borrajax@borrajax:/tmp$ cat /dev/random
_+�_�?zta����K�����q�ߤk��/���qSlV��{�Gzk`���#p$�*C�F"�B9��o~,�QH���ɭ�f�޺�̬po�2o𷿟�(=��t�0�p|m�e
���-�5�߁ٵ�ED�l�Qt�/��,uD�w&m���ѩ/��;��5Ce�+�M����
~ �4D��XN��?ס�d��$7Ā�kte▒s��ȿ7_���-     �d|����cY-�j>�
                    �b}#�W<դ���8���{�1»
.       75���c4$3z���/̾�(�(���`���k�fC_^C

Python uses the OS random generator or, failing that, the time as a seed. This means that the only place where I could imagine a potential weakness in how Python's random module is seeded is when it's used:

  • In an OS without an actual random number generator, and
  • In a device where time.time is always reporting the same time (has a broken clock, basically)
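
To see why a clock-based seed is weak, here is a toy sketch (Python 3; the timestamp and the plus or minus 2 second window are invented for the example). If an attacker knows roughly when the generator was seeded and sees one output, trying every candidate seed in that window is cheap.

```python
import random

def first_output(seed):
    # First value produced by a generator started from this seed.
    return random.Random(seed).random()

# Hypothetical victim: seeded from the clock, as in the time.time() fallback.
secret_time = 1417655520.37            # made-up timestamp
secret_seed = int(secret_time * 256)
observed = first_output(secret_seed)

# Attacker: knows the time to within about 2 seconds, i.e. 512 steps each way.
guess_center = int(1417655520 * 256)
recovered = None
for candidate in range(guess_center - 512, guess_center + 513):
    if first_output(candidate) == observed:
        recovered = candidate
        break

print(recovered == secret_seed)
```

Once the seed is recovered, every later "random" number from that generator can be replayed.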

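If you are concerned about the actual randomness of the random module (for cryptography, say), you can either go directly to os.urandom or use the random number generator in the pycrypto cryptographic library. Those are harder to predict: the random module's generator is deterministic, so anyone who learns its internal state can compute every number that follows, whereas os.urandom returns bytes from the OS generator, which is meant to be suitable for cryptographic use. I say more random because...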

https://stackoverflow.com/a/2146062/289011

Image inspiration came from this other SO answer
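
For the security-sensitive case, the standard library also has a couple of options the answer above doesn't mention, so take this as a sketch rather than the recommended way: random.SystemRandom offers the familiar random API on top of os.urandom, and the secrets module (Python 3.6+) is aimed at tokens and similar values.

```python
import random
import secrets  # standard library, Python 3.6+

# Same methods as the random module, but every value comes from os.urandom.
sys_rng = random.SystemRandom()
pin = sys_rng.randint(0, 9999)

# 16 random bytes rendered as 32 hexadecimal characters.
token = secrets.token_hex(16)

# Seeding is ignored here: there is no internal state to replay.
sys_rng.seed(1)

print("%04d" % pin, token)
```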

BorrajaX answered Sep 22 '22 08:09