How does a Django UUIDField generate a UUID in Postgresql?

After reading this blog post, https://blog.starkandwayne.com/2015/05/23/uuid-primary-keys-in-postgresql/, I wanted to know more about how Django generates UUIDs, because I am using them as my primary keys. According to the docs, https://docs.djangoproject.com/es/1.9/ref/models/fields/#uuidfield, Django relies on the Python uuid module, https://docs.python.org/3/library/uuid.html#uuid.UUID. But there are many kinds of UUID, and it is not at all clear to me which one Django generates, or how to choose, assuming a choice is available.

Finally, given the fragmentation issue pointed out in the blog post, and assuming uuid_generate_v1mc is not available directly in Python or Django, is there a way to force them to use it?

Asked by Malik A. Rumi on Feb 04 '16 at 20:02



1 Answer

  • How do Django and/or Python generate a UUID in PostgreSQL?

  • But there are many kinds of UUID, and it is not at all clear to me which one is being generated in Django

When you use a UUIDField as a primary key in Django, the database doesn't generate the UUID for you; the value is generated in Python before the object is saved.

I don't know if things have changed since, but the last time I used a UUIDField, you had to supply the UUID value yourself: Django won't let you save an object with a blank UUID and have the database generate one. The samples in the Django documentation reinforce this, because they pass default=uuid.uuid4 (the function itself, without parentheses), e.g. for the primary key:

import uuid

from django.db import models

class MyUUIDModel(models.Model):
    id = models.UUIDField(primary_key=True, default=uuid.uuid4, editable=False)
                                                    ^
                                                    |__ Django calls uuid.uuid4() for each new object
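
To make the difference between passing the function and passing the result of a call concrete, here is a small stand-in in plain Python. The resolve_default helper is hypothetical: it only mimics the general idea of calling a callable default once per new object, and it is not Django's actual code.

```python
import uuid


def resolve_default(default):
    """Hypothetical helper: call the default if it is callable, else use it as-is."""
    return default() if callable(default) else default


# Passing the function: a fresh UUID is produced on every resolution.
fresh_a = resolve_default(uuid.uuid4)
fresh_b = resolve_default(uuid.uuid4)

# Passing the result of a call: the same UUID is reused every time.
shared = uuid.uuid4()
reused_a = resolve_default(shared)
reused_b = resolve_default(shared)

print(fresh_a != fresh_b)    # distinct values
print(reused_a == reused_b)  # identical values, which would clash on a primary key
```

This is why the field takes default=uuid.uuid4 rather than default=uuid.uuid4().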

Which UUID version to choose

For a comparison of the properties of the different UUID versions please see this question: Which UUID version to use?

For a lot of applications, UUID4 is just fine

If you just want to generate a UUID and get on with your life, uuid.uuid4(), as in the snippet above, is just fine. UUID4 is a random UUID (122 of its 128 bits are random), and the chances of a collision are so remote that you don't really need to worry about them, especially if you're not generating a ton of them per second.
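
To put a rough number on "remote", here is a minimal sketch using the standard birthday-bound approximation. The collision_probability name and the one-billion figure are just illustrative choices, and the estimate assumes a good random source.

```python
import uuid

RANDOM_BITS = 122  # a version-4 UUID has 128 bits, 6 of which are fixed (version and variant)


def collision_probability(n):
    """Birthday-bound approximation of the chance that n random UUID4s contain a duplicate."""
    return n * (n - 1) / 2 / 2 ** RANDOM_BITS


u = uuid.uuid4()
print(u.version)                       # 4
print(collision_probability(10 ** 9))  # estimate for one billion UUIDs
```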

Finally, given the fragmentation issue pointed out in the blog post, and assuming uuid_generate_v1mc is not available directly in Python or Django, is there a way to force them to use it?

A Python UUID1 with random MAC address, like uuid-ossp's uuid_generate_v1mc

The blog you linked mentions the use of UUID1. Python's uuid.uuid1() takes an optional node parameter that is used in place of the host's real hardware MAC address (48 bits). Because the node occupies the last 48 bits of the UUID1, randomizing it leaves the leading, timestamp-based fields untouched, and those are what the blog post relies on to limit index fragmentation.

So

uuid.uuid1(random_48_bits)

This should give you results similar to uuid_generate_v1mc, which generates a UUID1 with a random multicast MAC address instead of the real one.
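
As a quick check of where the node ends up, the sketch below passes a fixed, arbitrary 48-bit value (EXAMPLE_NODE is made up for illustration) and inspects the resulting UUID1.

```python
import uuid

EXAMPLE_NODE = 0x0123456789AB  # arbitrary 48-bit value chosen for illustration

u = uuid.uuid1(node=EXAMPLE_NODE)

print(u)                       # last group of hex digits is 0123456789ab
print(u.version)               # 1
print(u.node == EXAMPLE_NODE)  # True
print(u.time)                  # 60-bit timestamp (100 ns ticks since 1582-10-15)
```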

To generate 48 random bits, as a toy example we can use:

import random
random_48_bits = random.randint(0, 2**48 - 1)

Try it:

>>> import uuid
>>> import random
>>> 2 ** 48 - 1
281474976710655
>>> uuid.uuid1(random.randint(0, 281474976710655))
UUID('c5ecbde1-cbf4-11e5-a759-6096cb89d9a5')

Now wrap this in a function, and use it as the default for your Django UUIDField.
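
A minimal sketch of such a function follows. The name uuid1_random_node is made up here, and setting the multicast bit is an extra step beyond the snippet above, added so that the random node cannot be mistaken for a real hardware address (RFC 4122 recommends this for random node IDs). The Django field is shown only as a comment, since it needs a configured project.

```python
import secrets
import uuid


def uuid1_random_node():
    """Return a version-1 UUID whose node field is random rather than the host MAC."""
    # 48 random bits for the node field; setting the multicast bit (bit 40)
    # marks the value as not being a real hardware address.
    node = secrets.randbits(48) | (1 << 40)
    return uuid.uuid1(node=node)


# Hypothetical usage in a model (pass the function itself, not its result):
#   id = models.UUIDField(primary_key=True, default=uuid1_random_node, editable=False)

print(uuid1_random_node())
```

Keep the function at module level rather than using a lambda, so that Django migrations can reference it by name.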

Custom UUIDs, and an example from Instagram

Note that it's totally fine to come up with your own custom UUID scheme, and use the available bits to encode information that is useful to your application.

E.g. you may use a few bits to encode the country of a given user, a few bits with a timestamp, some bits for randomness etc.
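
As an illustration only, here is a sketch of one such layout. The bit widths (48 bits of millisecond timestamp, 10 bits of country code, 70 random bits) and the names custom_uuid and country_of are all hypothetical, and the result does not follow any of the standard UUID versions; it is just a 128-bit value stored in a uuid.UUID.

```python
import secrets
import time
import uuid

TIMESTAMP_BITS = 48
COUNTRY_BITS = 10
RANDOM_BITS = 70  # 48 + 10 + 70 = 128 bits in total


def custom_uuid(country_code, now_ms=None):
    """Pack a millisecond timestamp, a country code and random bits into a UUID."""
    if not 0 <= country_code < (1 << COUNTRY_BITS):
        raise ValueError("country_code does not fit in COUNTRY_BITS")
    if now_ms is None:
        now_ms = time.time_ns() // 1_000_000
    timestamp = now_ms & ((1 << TIMESTAMP_BITS) - 1)
    value = (
        (timestamp << (COUNTRY_BITS + RANDOM_BITS))
        | (country_code << RANDOM_BITS)
        | secrets.randbits(RANDOM_BITS)
    )
    return uuid.UUID(int=value)


def country_of(u):
    """Recover the country code packed by custom_uuid."""
    return (u.int >> RANDOM_BITS) & ((1 << COUNTRY_BITS) - 1)


example = custom_uuid(33)
print(example, country_of(example))
```

Putting the timestamp in the most significant bits means newer values sort after older ones, which is the property the fragmentation discussion cares about.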

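You may want to read how Instagram (built on Django and PostgreSQL) cooked up their own ID scheme to help with sharding. Their IDs are 64-bit integers rather than 128-bit UUIDs, but the idea of packing a timestamp and other fields into the ID is the same.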

Answered by bakkal on Oct 13 '22 at 08:10