
Excluding primary key in Django dumpdata with natural keys

How do you exclude the primary key from the JSON produced by Django's dumpdata when natural keys are enabled?

I've constructed a record that I'd like to "export" so others can use it as a template, by loading it into a separate database with the same schema without conflicting with other records in the same model.

As I understand Django's support for natural keys, this seems like what NKs were designed to do. My record has a unique name field, which is also used as the natural key.

So when I run:

from django.core import serializers
from myapp.models import MyModel
obj = MyModel.objects.get(id=123)
serializers.serialize('json', [obj], indent=4, use_natural_keys=True)

I would expect an output something like:

[
    {
        "model": "myapp.mymodel", 
        "fields": {
            "name": "foo", 
            "create_date": "2011-09-22 12:00:00", 
            "create_user": [
                "someusername"
            ]
        }
    }
]

which I could then load into another database using loaddata, expecting it to be assigned a new primary key dynamically. Note that my "create_user" field is a FK to Django's auth.User model, which supports natural keys, and it is output as its natural key instead of the integer primary key.

However, what's generated is actually:

[
    {
        "pk": 123,
        "model": "myapp.mymodel", 
        "fields": {
            "name": "foo", 
            "create_date": "2011-09-22 12:00:00", 
            "create_user": [
                "someusername"
            ]
        }
    }
]

which will clearly conflict with and overwrite any existing record with primary key 123.

What's the best way to fix this? I don't want to retroactively change all the auto-generated primary key integer fields to whatever the equivalent natural keys are, since that would cause a performance hit as well as be labor intensive.

Edit: This seems to be a bug that was reported...2 years ago...and has largely been ignored...

Cerin asked Feb 24 '12 19:02


2 Answers

Updating the answer for anyone coming across this in 2018 and beyond.

There is a way to omit the primary key: use natural keys together with the unique_together Meta option. You can test it with this command:

python manage.py dumpdata app.model --pks 1,2,3 --indent 4 --natural-primary --natural-foreign > dumpdata.json

The following is taken from the Django documentation on serialization:

Serialization of natural keys

So how do you get Django to emit a natural key when serializing an object? Firstly, you need to add another method – this time to the model itself:

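from django.db import models

# Manager that looks a Person up by its natural key when a fixture is loaded
class PersonManager(models.Manager):
    def get_by_natural_key(self, first_name, last_name):
        return self.get(first_name=first_name, last_name=last_name)

class Person(models.Model):
    objects = PersonManager()

    first_name = models.CharField(max_length=100)
    last_name = models.CharField(max_length=100)
    birthdate = models.DateField()

    # The natural key must identify exactly one row, hence unique_together below
    def natural_key(self):
        return (self.first_name, self.last_name)

    class Meta:
        unique_together = (('first_name', 'last_name'),)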

That method should always return a natural key tuple – in this example, (first name, last name). Then, when you call serializers.serialize(), you provide use_natural_foreign_keys=True or use_natural_primary_keys=True arguments:

serializers.serialize('json', [book1, book2], indent=2, use_natural_foreign_keys=True, use_natural_primary_keys=True)

When use_natural_foreign_keys=True is specified, Django will use the natural_key() method to serialize any foreign key reference to objects of the type that defines the method.

When use_natural_primary_keys=True is specified, Django will not provide the primary key in the serialized data of this object since it can be calculated during deserialization:

    {
        "model": "store.person",
        "fields": {
            "first_name": "Douglas",
            "last_name": "Adams",
            "birthdate": "1952-03-11"
        }
    }
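
To double-check a dump such as dumpdata.json, one option is to parse it and look for a top-level "pk" entry on each record. This is only a minimal sketch using the standard library, assuming the fixture is a JSON list in Django's usual shape; the function name records_with_pk and the sample strings are made up for illustration and are not part of Django:

```python
import json


def records_with_pk(fixture_text):
    """Return the fixture records that still carry a top-level "pk" key."""
    records = json.loads(fixture_text)
    return [record for record in records if "pk" in record]


# Sample data shaped like Django's JSON fixtures (values are made up).
with_pk = '[{"pk": 123, "model": "myapp.mymodel", "fields": {"name": "foo"}}]'
without_pk = '[{"model": "myapp.mymodel", "fields": {"name": "foo"}}]'

print(len(records_with_pk(with_pk)))
print(len(records_with_pk(without_pk)))
```

For a real file you would pass in its contents, for example the text read from dumpdata.json.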

Tomiwa answered Nov 06 '22 07:11


The problem with JSON is that you can't omit the pk field, since it is required when the fixture data is loaded again. If it is missing, loaddata fails with:

$ python manage.py loaddata some_data.json
[...]
File ".../django/core/serializers/python.py", line 85, in Deserializer
data = {Model._meta.pk.attname : Model._meta.pk.to_python(d["pk"])}
KeyError: 'pk'

As pointed out in the answer to this question, you can use YAML or XML if you really want to omit the pk attribute, or just replace the primary key value with null:

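import re
from django.core import serializers
from myapp.models import MyClass  # adjust to wherever your model lives

some_objects = MyClass.objects.all()
# use_natural_keys was the option name when this was written;
# Django 1.7+ calls it use_natural_foreign_keys
s = serializers.serialize('json', some_objects, use_natural_keys=True)
# Replace id values with null - adjust the regex to your needs
s = re.sub('"pk": [0-9]{1,5}', '"pk": null', s)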
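
A regex can also match "pk": text that happens to sit inside a field value. A sketch of the same idea that edits the parsed structure instead, assuming the serializer output is a JSON list of records; null_primary_keys is a hypothetical helper name, and the sample string stands in for real serializer output:

```python
import json


def null_primary_keys(serialized):
    """Set the top-level "pk" of every record to None (JSON null)."""
    records = json.loads(serialized)
    for record in records:
        if "pk" in record:
            record["pk"] = None
    return json.dumps(records, indent=4)


# Stand-in for the string returned by serializers.serialize('json', ...).
sample = '[{"pk": 123, "model": "myapp.mymodel", "fields": {"name": "pk: 7"}}]'
cleaned = null_primary_keys(sample)
print(cleaned)
```

Only the record-level key is touched, so a field value that merely contains the text "pk" is left alone.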

lutuh answered Nov 06 '22 07:11