I need to add a new column to a large (5 million row) Django table. I have a South schemamigration that creates the new column. Now I'm writing a datamigration script to populate the new column. It looks like this. (If you're not familiar with South migrations, just ignore the orm. prefix on the model name.)
print "Migrating %s articles." % orm.Article.objects.count()
cnt = 0
for article in orm.Article.objects.iterator():
if cnt % 500 == 0:
print " %s done so far" % cnt
# article.newfield = calculate_newfield(article)
article.save()
cnt += 1
I switched from objects.all() to objects.iterator() to reduce memory requirements. But something is still chewing up vast amounts of memory when I run this script. Even with the actually useful line commented out as above, the script grows to using 10+ GB of RAM before getting very far through the table, and I give up on it.
Seems like something is holding on to these objects in memory. How can I run this so it's not a memory hog?
FWIW, I'm using Python 2.6, Django 1.2.1, South 0.7.2, and MySQL 5.1.
Ensure settings.DEBUG is set to False. With DEBUG=True, Django stores every query sent to the RDBMS in connection.queries, and in a long-running script that log is never cleared, so memory grows without bound during database-intensive operations.
With Django 1.8 out, this matters less, since a hardcoded maximum of 9000 queries is now stored instead of an unbounded number as before.
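If flipping DEBUG off just for the migration run is awkward, another option is to clear the stored query log by hand inside the loop. Here is a minimal sketch of the same loop with django.db.reset_queries() added; reset_queries() is the standard Django call that empties the per-connection query log:

from django import db

print "Migrating %s articles." % orm.Article.objects.count()
cnt = 0
for article in orm.Article.objects.iterator():
    if cnt % 500 == 0:
        print " %s done so far" % cnt
        db.reset_queries()  # drop the query log that DEBUG=True keeps growing
    article.newfield = calculate_newfield(article)
    article.save()
    cnt += 1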