I have around 400,000 object instances to insert into Postgres. I am using bulk_create() to do so, but I run into a MemoryError.
My first idea was to chunk the list of instances:
def chunks(l, n):
    n = max(1, n)
    return [l[i:i + n] for i in range(0, len(l), n)]

for c in chunks(instances, 1000):
    Feature.objects.bulk_create(c)
But sometimes that strategy also leads to a MemoryError, because the size of the instances can vary a lot, so one chunk can exceed the memory limit while others don't.
Is it possible to chunk the list of instances so that each chunk stays within a bounded size? What would be the best approach in this case?
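For illustration, a size-aware chunker might look something like the sketch below. Note that sys.getsizeof only measures the top-level object, so it is a rough proxy for the real memory footprint of a model instance, and max_bytes is an arbitrary threshold, not something from the original setup:

import sys

def chunks_by_size(items, max_bytes):
    # Yield chunks whose estimated total size stays under max_bytes.
    # sys.getsizeof is only a heuristic, not an exact memory bound.
    chunk, chunk_size = [], 0
    for item in items:
        item_size = sys.getsizeof(item)
        if chunk and chunk_size + item_size > max_bytes:
            yield chunk
            chunk, chunk_size = [], 0
        chunk.append(item)
        chunk_size += item_size
    if chunk:
        yield chunk

for c in chunks_by_size(instances, max_bytes=5 * 1024 * 1024):  # hypothetical 5 MB cap
    Feature.objects.bulk_create(c)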
If you are using Django in debug mode, it keeps track of all your SQL statements for debugging purposes. With many objects this can cause memory problems. You can reset that with:
from django import db
db.reset_queries()
See why-is-django-leaking-memory for more background.
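A minimal sketch of how this could be combined with the chunked inserts, assuming the chunks() helper from the question:

from django import db

for c in chunks(instances, 1000):
    Feature.objects.bulk_create(c)
    # Clear Django's per-connection query log so it does not
    # grow without bound while DEBUG is True.
    db.reset_queries()

Alternatively, setting DEBUG = False in your settings avoids the query logging entirely.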