The Python docs state that uuid1 uses the current time to form the UUID value, but I could not find a reference that guarantees UUID1 values are sequential.
>>> import uuid
>>> u1 = uuid.uuid1()
>>> u2 = uuid.uuid1()
>>> u1 < u2
True
>>>
Calling uuid.uuid1() without arguments gives non-sequential results (see the answer by @basil-bourque), but it can easily be made sequential by setting the clock_seq or node argument. In that case uuid1 uses the pure-Python implementation, which guarantees that the timestamp part of the UUID is unique and increasing within the current process:
import time
from uuid import uuid1, getnode
from random import getrandbits

_my_clock_seq = getrandbits(14)
_my_node = getnode()


def sequential_uuid(node=None):
    return uuid1(node=node, clock_seq=_my_clock_seq)


def alt_sequential_uuid(clock_seq=None):
    return uuid1(node=_my_node, clock_seq=clock_seq)


if __name__ == '__main__':
    from itertools import count
    old_n = uuid1()  # "Native"
    old_s = sequential_uuid()  # Sequential
    native_conflict_index = None

    t_0 = time.time()

    for x in count():
        new_n = uuid1()
        new_s = sequential_uuid()

        if old_n > new_n and not native_conflict_index:
            native_conflict_index = x

        if old_s >= new_s:
            print("OOops: non-sequential results for `sequential_uuid()`")
            break

        if (x >= 10*0x3fff and time.time() - t_0 > 30) or (native_conflict_index and x > 2*native_conflict_index):
            print('No issues for `sequential_uuid()`')
            break

        old_n = new_n
        old_s = new_s

    print(f'Conflicts for `uuid.uuid1()`: {bool(native_conflict_index)}')
    print(f"Tries: {x}")
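One caveat when checking order with `<`: UUID objects compare by their 128-bit integer value, and the most significant 32 bits of a version 1 UUID hold the low part of the timestamp. That field wraps roughly every 7 minutes (2^32 ticks of 100 ns), so `<` may stop matching creation order over longer periods. A minimal sketch, assuming that ordering by the full 60-bit timestamp exposed as `UUID.time` is acceptable for your use case:

```python
from random import getrandbits
from uuid import uuid1, getnode

_clock_seq = getrandbits(14)
_node = getnode()


def sequential_uuid():
    # Passing node/clock_seq selects the pure-Python code path.
    return uuid1(node=_node, clock_seq=_clock_seq)


ids = [sequential_uuid() for _ in range(1000)]

# UUID.time is the full 60-bit timestamp (100 ns ticks since 1582-10-15).
timestamps = [u.time for u in ids]
strictly_increasing = all(a < b for a, b in zip(timestamps, timestamps[1:]))

# Sorting by the timestamp reproduces creation order within this process.
sorted_by_time = sorted(ids, key=lambda u: u.time)
```

Using `key=lambda u: u.time` for sorting avoids depending on the byte layout of the UUID.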
BUT if you are running several parallel processes on the same machine, then:
- node, which defaults to uuid.getnode(), will be the same for all the processes;
- clock_seq has a small chance of being the same for some processes (a chance of 1/16384).
That might lead to conflicts! This is a general concern when using uuid.uuid1 in parallel processes on the same machine, unless you have access to SafeUUID from Python 3.7.
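To see what SafeUUID reports on a given platform, you can inspect the `is_safe` attribute of a generated UUID. This is only a quick check, not a guarantee about ordering:

```python
import uuid

# No arguments, so the platform's generator may be used if available.
u = uuid.uuid1()

# is_safe is a uuid.SafeUUID member: safe, unsafe, or unknown.
print(u.is_safe)

if u.is_safe is not uuid.SafeUUID.safe:
    print("The platform does not confirm multiprocessing-safe generation")
```

Note that when node or clock_seq is passed, the pure-Python path is used and `is_safe` is reported as `SafeUUID.unknown`.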
If you also make sure to set node to a unique value for each parallel process that runs this code, then conflicts should not happen.
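One possible way to get a per-process node is sketched below. It assumes a random 48-bit value is "unique enough" for your setup (a collision is extremely unlikely but not impossible); the helper names `process_node` and `process_uuid` are made up for illustration. The value is cached per PID so that a forked child does not reuse its parent's node:

```python
import os
import secrets
from random import getrandbits
from uuid import uuid1

_node_by_pid = {}             # hypothetical cache: PID -> node value
_clock_seq = getrandbits(14)  # fixed for the lifetime of the process


def process_node():
    """Return a random 48-bit node ID that is stable for the current process."""
    pid = os.getpid()
    if pid not in _node_by_pid:
        # Set the multicast bit, as RFC 4122 suggests for node IDs
        # that are not real MAC addresses.
        _node_by_pid[pid] = secrets.randbits(48) | (1 << 40)
    return _node_by_pid[pid]


def process_uuid():
    return uuid1(node=process_node(), clock_seq=_clock_seq)
```

If your processes already have a stable unique index (for example a worker number), deriving node from that index is a deterministic alternative.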
Even if you are using SafeUUID and set a unique node, it is still possible to get non-sequential IDs if they are generated in different processes.
If some lock-related overhead is acceptable, you can store clock_seq in some external atomic storage (for example, in a "locked" file) and increment it with each call. This lets you use the same node value in all parallel processes and also makes the IDs sequential. If all the parallel processes are subprocesses created using multiprocessing, clock_seq can be "shared" using multiprocessing.Value.