I've added a UUID field to some of my models and then migrated with South. Any new objects I create have the UUID field populated correctly. However, the UUID fields on all of my older data are null.
Is there any way to populate UUID values for existing data?
To add UUID values to all existing records, first make sure your model declares the UUID field with blank=True, null=True.
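A minimal sketch of such a model, assuming django-extensions' UUIDField (the model name MyModel is illustrative):

from django.db import models
from django_extensions.db.fields import UUIDField

class MyModel(models.Model):
    # Nullable for now, so existing rows can be back-filled by the migration
    uuid = UUIDField(blank=True, null=True)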
Then run South's schemamigration command and open the resulting migration file. Edit the migration as shown in this post (quoted below).
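The schemamigration step would look something like this, with <appname> standing in for your app's label:

python ./manage.py schemamigration <appname> --auto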
Quote:
You'll need to import the following:

import uuid

Then, at the end of the forwards() function, add the following loop:

def forwards(self, orm):
    ...
    for a in MyModel.objects.all():
        a.uuid = u'' + str(uuid.uuid1().hex)
        a.save()
As stated, that will loop through the existing instances and add a UUID to each of them as part of the migration.
For the following sample class:
from django.db import models
from django_extensions.db.fields import UUIDField

class MyClass(models.Model):
    uuid = UUIDField(editable=False, blank=True)
    name = models.CharField(max_length=255)
If you're using South, create a data migration:
python ./manage.py datamigration <appname> <migration_name>
And then use the following code to update the migration with the specific logic to add a UUID:
from django_extensions.utils import uuid
def forwards(self, orm):
    for item in orm['myapp.myclass'].objects.all():
        if not item.uuid:
            item.uuid = uuid.uuid4()  # creates a random UUID
            item.save()

def backwards(self, orm):
    for item in orm['myapp.myclass'].objects.all():
        if item.uuid:
            item.uuid = None
            item.save()
You can create different types of UUIDs, each generated differently; the uuid.py module in django-extensions has the complete list of the types of UUIDs you can create.
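For illustration, here are the same constructors as they appear in the standard library's uuid module:

import uuid

uuid.uuid1()  # from the host ID, sequence number, and current time
uuid.uuid3(uuid.NAMESPACE_DNS, 'example.com')  # from the MD5 hash of a namespace UUID and a name
uuid.uuid4()  # random
uuid.uuid5(uuid.NAMESPACE_DNS, 'example.com')  # from the SHA-1 hash of a namespace UUID and a name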
It's important to note that if you run this migration in an environment with a lot of objects, it has the potential to time out (for instance, when deploying with Fabric). An alternative method of filling in existing fields will be required for production environments.
It's also possible to run out of memory while doing this for a large number of objects (we saw deployments fail this way with 17,000+ objects).
To get around this, you need to create a custom iterator in your migration (or stick it where it's really useful, and refer to it in your migration). It would look something like this:
def queryset_iterator(queryset, chunksize=1000):
    # Walk the queryset in primary-key order, chunksize rows at a time,
    # so the entire result set is never held in memory at once.
    import gc
    if not queryset.exists():
        return
    pk = 0
    last_pk = queryset.order_by('-pk')[0].pk
    queryset = queryset.order_by('pk')
    while pk < last_pk:
        for row in queryset.filter(pk__gt=pk)[:chunksize]:
            pk = row.pk
            yield row
        # Free Django's per-query caches between chunks
        gc.collect()
And then your migrations would change to look like this:
class Migration(DataMigration):

    def forwards(self, orm):
        for item in queryset_iterator(orm['myapp.myclass'].objects.all()):
            if not item.uuid:
                item.uuid = uuid.uuid1()
                item.save()

    def backwards(self, orm):
        for item in queryset_iterator(orm['myapp.myclass'].objects.all()):
            if item.uuid:
                item.uuid = None
                item.save()
There is now an excellent, updated answer to this exact question for Django 1.9 in the Django docs.
Saved me a lot of time!
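For reference, the approach in that docs howto boils down to a built-in models.UUIDField plus a migrations.RunPython data migration. A minimal sketch (the app, model, and migration names here are illustrative):

import uuid

from django.db import migrations


def gen_uuid(apps, schema_editor):
    # Use the historical model so the migration keeps working
    # even after the model class changes
    MyModel = apps.get_model('myapp', 'MyModel')
    for row in MyModel.objects.all():
        row.uuid = uuid.uuid4()
        row.save(update_fields=['uuid'])


class Migration(migrations.Migration):

    dependencies = [
        ('myapp', '0002_mymodel_uuid'),
    ]

    operations = [
        # A no-op reverse keeps the migration reversible
        migrations.RunPython(gen_uuid, reverse_code=migrations.RunPython.noop),
    ]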