
Best practices for generating a random seed to seed PyTorch? [closed]

Tags:

pytorch

What I really want is to seed the dataset and dataloader. I am adapting code from:

https://gist.github.com/kevinzakka/d33bf8d6c7f06a9d8c76d97a7879f5cb

Does anyone know how to seed this properly? What are the best practices for seeding things in PyTorch?

Honestly, I have no idea whether there is an algorithm-specific way for GPU vs. CPU. I mostly care about PyTorch in general and about making sure my code is "truly random", especially when it uses the GPU, I guess...


related:

  • https://discuss.pytorch.org/t/best-practices-for-seeding-random-numbers-on-gpu/18751
  • https://discuss.pytorch.org/t/the-random-seed/19516/4
  • https://discuss.pytorch.org/t/best-practices-for-generating-a-random-seed-to-seed-pytorch/52894/2

My answer was deleted and here is its content:

I don't know if this is the best for pytorch but this is what seems the best for any programming language:


Usually the best random sample you could get in any programming language is generated through the operating system. In Python, you can use the os module:

random_data = os.urandom(4)

This way you get a cryptographically secure random byte sequence, which you can convert to a numeric type to use as a seed.

seed = int.from_bytes(random_data, byteorder="big")

EDIT: these code snippets work only on Python 3.


# With more than 4 bytes I get this error:
#
# ValueError: Seed must be between 0 and 2**32 - 1

RAND_SIZE = 4
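
To illustrate the 4-byte limit: a minimal sketch, assuming you want a seed that always fits the 0 to 2**32 - 1 range from the error message. The helper names `os_seed` and `os_seed_32bit` are made up for this example. The second one draws more bytes and folds them into range with a modulo, which is just one possible workaround and not something the original snippet does.

```python
import os

SEED_BYTES = 4  # 4 bytes -> integers in [0, 2**32 - 1]


def os_seed(num_bytes: int = SEED_BYTES) -> int:
    """Build a non-negative integer from num_bytes bytes of OS randomness."""
    return int.from_bytes(os.urandom(num_bytes), byteorder="big")


def os_seed_32bit(num_bytes: int = 8) -> int:
    """Draw num_bytes of OS randomness, then reduce it into the 32-bit range."""
    return os_seed(num_bytes) % 2**32


seed = os_seed()        # always within the 32-bit range
wide = os_seed_32bit()  # drawn from 8 bytes, reduced to the same range
print(seed, wide)
```

Printing or logging the seed like this is what lets you rerun the exact same experiment later, even though the seed itself came from the OS.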

Charlie Parker asked Sep 01 '25 03:09


1 Answer

Have a look at https://pytorch.org/docs/stable/notes/randomness.html

This is what I use

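import os
import random

import numpy as np
import torch

def seed_everything(seed=42):
  random.seed(seed)  # Python's built-in RNG
  os.environ['PYTHONHASHSEED'] = str(seed)  # only affects Python processes started after this point
  np.random.seed(seed)  # NumPy's global RNG
  torch.manual_seed(seed)  # PyTorch RNGs (CPU and CUDA)
  torch.backends.cudnn.deterministic = True  # use deterministic cuDNN algorithms
  torch.backends.cudnn.benchmark = False  # disable cuDNN autotuning, which can vary between runs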

The last two settings (cudnn) only matter on GPU.

You can generate a seed as follows:

def get_truly_random_seed_through_os():
    """
    Usually the best random sample you could get in any programming language is generated through the operating system. 
    In Python, you can use the os module.

    source: https://stackoverflow.com/questions/57416925/best-practices-for-generating-a-random-seeds-to-seed-pytorch/57416967#57416967
    """
    RAND_SIZE = 4  # 4 bytes keeps the seed within 0 .. 2**32 - 1
    random_data = os.urandom(RAND_SIZE)  # RAND_SIZE random bytes suitable for cryptographic use
    random_seed = int.from_bytes(random_data, byteorder="big")
    return random_seed
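
Since the question is about the dataset and dataloader specifically, here is a rough sketch along the lines of the randomness notes linked at the top of this answer. It assumes a toy `TensorDataset` and that `numpy` and `torch` are installed. `seed_worker`, `make_loader` and `first_epoch_order` are illustrative names, not part of the original answer or of PyTorch.

```python
import random

import numpy as np
import torch
from torch.utils.data import DataLoader, TensorDataset


def seed_worker(worker_id):
    # Runs inside each worker process: derive a 32-bit seed from the
    # per-worker torch seed and reuse it for NumPy and Python's RNG.
    worker_seed = torch.initial_seed() % 2**32
    np.random.seed(worker_seed)
    random.seed(worker_seed)


def make_loader(seed, batch_size=4):
    # Toy dataset (an assumption for this sketch): the integers 0..15.
    dataset = TensorDataset(torch.arange(16))
    g = torch.Generator()
    g.manual_seed(seed)  # this generator drives the shuffling order
    return DataLoader(
        dataset,
        batch_size=batch_size,
        shuffle=True,
        worker_init_fn=seed_worker,
        generator=g,
    )


def first_epoch_order(seed):
    # Flatten the first epoch into a plain list of ints.
    return [int(x) for (batch,) in make_loader(seed) for x in batch]


print(first_epoch_order(123) == first_epoch_order(123))
```

With the same seed the shuffle order repeats from run to run. `worker_init_fn` only comes into play when `num_workers` is greater than 0, so in this single-process sketch it is passed purely to show where it goes.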

cookiemonster answered Sep 03 '25 17:09