
Python multiprocessing doesn't play nicely with uuid.uuid4()

I'm trying to generate a uuid for a filename, and I'm also using the multiprocessing module. Unpleasantly, all of my uuids end up exactly the same. Here is a small example:

import multiprocessing
import uuid

def get_uuid( a ):
    ## Doesn't help to cycle through a bunch.
    #for i in xrange(10): uuid.uuid4()

    ## Doesn't help to reload the module.
    #reload( uuid )

    ## Doesn't help to load it at the last minute.
    ## (I simultaneously comment out the module-level import).
    #import uuid

    ## uuid1() does work, but it differs only in the first 8 characters and includes identifying information about the computer.
    #return uuid.uuid1()

    return uuid.uuid4()

def main():
    pool = multiprocessing.Pool( 20 )
    uuids = pool.map( get_uuid, range( 20 ) )
    for id in uuids: print id

if __name__ == '__main__': main()

I peeked into uuid.py's code, and depending on the platform it seems to use OS-level routines for randomness, so I'm stumped as to a Python-level solution (something like reloading the uuid module or choosing a new random seed). I could use uuid.uuid1(), but only 8 digits differ, and I think they are derived exclusively from the time, which seems dangerous given that I'm multiprocessing (the code could be executing at exactly the same time). Is there some wisdom out there about this issue?

yig, asked May 03 '10 16:05





1 Answer

This is the correct way to generate your own uuid4, if you need to do that:

import os, uuid
def get_uuid(a):
    # Build a version-4 UUID from 16 bytes of OS-supplied randomness.
    return uuid.UUID(bytes=os.urandom(16), version=4)

Python should be doing this automatically; this code is taken straight from uuid.uuid4, which falls back to it when the native _uuid_generate_random doesn't exist. There must be something wrong with your platform's _uuid_generate_random.

If you have to do this, don't just work around it yourself and let everyone else on your platform suffer; report the bug.
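For reference, here is a minimal sketch of how that workaround might slot into the pool-based example from the question. It assumes Python 3, and the names make_uuid and generate_uuids are hypothetical helpers introduced only for illustration. Each worker builds its UUID directly from os.urandom, so the result should not depend on the platform's native generator.

```python
import multiprocessing
import os
import uuid


def make_uuid(_):
    # Hypothetical helper: build a version-4 UUID from 16 bytes of
    # OS-provided randomness instead of calling uuid.uuid4().
    return uuid.UUID(bytes=os.urandom(16), version=4)


def generate_uuids(count, workers=4):
    # Hypothetical helper: each worker process calls make_uuid independently.
    with multiprocessing.Pool(workers) as pool:
        return pool.map(make_uuid, range(count))


if __name__ == "__main__":
    uuids = generate_uuids(20)
    for u in uuids:
        print(u)
    # Report whether every generated value is distinct.
    print("all unique:", len(set(uuids)) == len(uuids))
```

If the values printed here are distinct while uuid.uuid4() still repeats across processes, that points at the platform's native generator rather than at multiprocessing itself, which is useful detail to include in the bug report.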

Glenn Maynard, answered Sep 30 '22 12:09
