Django - bulk_create() leads to MemoryError

I have around 400,000 object instances to insert into Postgres. I am using bulk_create() to do so, but I hit a MemoryError.

My first idea was to chunk the list of instances:

def chunks(l, n):
    # Split list l into consecutive chunks of at most n items each.
    n = max(1, n)
    return [l[i:i + n] for i in range(0, len(l), n)]

for c in chunks(instances, 1000):
    Feature.objects.bulk_create(c)

But sometimes that strategy also leads to a MemoryError, because the instances' sizes can vary a lot, so one chunk may exceed the memory limit while others don't.

Is it possible to chunk the list of instances so that each chunk stays under a given size? What would be the best approach in this case?
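
One possible approach (a sketch, not from the original post: chunks_by_size and the 1 MB budget are illustrative names and values; instances and Feature come from the question) is to estimate each instance's footprint and close a chunk once a byte budget is reached. sys.getsizeof is shallow, so the estimate below sums the sizes of the values stored on the instance and should still be treated as approximate:

import sys

def chunks_by_size(instances, max_bytes):
    # Yield lists of instances whose estimated combined size stays under max_bytes.
    chunk, chunk_bytes = [], 0
    for instance in instances:
        # Rough per-instance estimate: the object itself plus the field values
        # stored on it. sys.getsizeof is shallow, so nested objects are
        # undercounted; treat the budget as approximate.
        size = sys.getsizeof(instance) + sum(
            sys.getsizeof(v) for v in instance.__dict__.values()
        )
        if chunk and chunk_bytes + size > max_bytes:
            yield chunk
            chunk, chunk_bytes = [], 0
        chunk.append(instance)
        chunk_bytes += size
    if chunk:
        yield chunk

for c in chunks_by_size(instances, 1_000_000):  # ~1 MB budget per chunk
    Feature.objects.bulk_create(c)

Note that bulk_create's own batch_size argument only splits the SQL queries; the full list of instances is still materialized in Python, so it does not by itself bound memory use.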

asked Mar 16 '15 by Below the Radar


1 Answer

If you are using Django in debug mode, it keeps track of all your SQL statements for debugging purposes. With many objects this query log can itself cause memory problems. You can reset it with:

from django import db
db.reset_queries()  # clear the query log Django keeps while DEBUG is on
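
Applied to the chunked loop from the question, that could look like this (a sketch; chunks, instances, and Feature are the names from the question):

from django import db

for c in chunks(instances, 1000):
    Feature.objects.bulk_create(c)
    db.reset_queries()  # drop the stored query log after each batch

Alternatively, running with DEBUG = False in settings.py stops Django from recording queries in the first place.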

See: why-is-django-leaking-memory

answered Sep 22 '22 by linqu