
How to prevent Django from loading objects into memory when using `delete()`?

I'm having memory issues because it looks like Django is loading the objects into memory when using delete(). Is there any way to prevent Django from doing that?

From the Django docs:

Django needs to fetch objects into memory to send signals and handle cascades. However, if there are no cascades and no signals, then Django may take a fast-path and delete objects without fetching into memory. For large deletes this can result in significantly reduced memory usage. The amount of executed queries can be reduced, too.

https://docs.djangoproject.com/en/1.8/ref/models/querysets/#delete

I don't use signals. I do have foreign keys on the model I'm trying to delete, but I don't see why Django would need to load the objects into memory. It looks like it does, because my memory is rising as the query runs.

gitaarik asked Jul 17 '15 14:07


2 Answers

You can import Django's database connection and run the delete as raw SQL. I had exactly the same problem and this helped me a lot. Here's a snippet (I'm using MySQL, but you can run any SQL statement):

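from django.db import connection

# `date` is the cutoff value. It is passed as a query parameter so the
# database driver escapes it, instead of being formatted into the SQL string.
sql_query = "DELETE FROM usage WHERE date < %s ORDER BY date"
cursor = connection.cursor()
try:
    cursor.execute(sql_query, [date])
finally:
    cursor.close()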

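This executes a single DELETE statement on that table. Django loads no objects, sends no signals, and does no cascade handling for related models, so anything that happens to related rows is left to the database's own foreign key constraints.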
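
If one DELETE over a very large table runs or holds locks for too long, the same raw-SQL idea can be applied in batches. The following is a minimal sketch, not the answer's own method: it uses the standard-library sqlite3 module as a stand-in for a DB-API connection, reuses the `usage` table and `date` column from the snippet above, and assumes an integer `id` primary key. The function name `delete_older_than` is hypothetical.

```python
import sqlite3


def delete_older_than(conn, cutoff, batch_size=1000):
    """Delete rows from `usage` with date < cutoff, batch_size rows at a time.

    Returns the total number of rows deleted.
    """
    total = 0
    while True:
        cursor = conn.cursor()
        try:
            # Values are bound by the driver, never formatted into the SQL text.
            cursor.execute(
                "DELETE FROM usage WHERE id IN ("
                "SELECT id FROM usage WHERE date < ? ORDER BY id LIMIT ?)",
                (cutoff, batch_size),
            )
            deleted = cursor.rowcount
        finally:
            cursor.close()
        # Commit after each batch so every transaction stays small.
        conn.commit()
        total += deleted
        if deleted == 0:
            return total
```

With Django's `connection.cursor()` the placeholder would be `%s` rather than `?`. On MySQL the subquery could probably be dropped in favour of `DELETE ... ORDER BY ... LIMIT`, since MySQL accepts `LIMIT` directly on a single-table `DELETE`.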

Shang Wang answered Oct 08 '22 06:10


You can use a function like this to iterate over a huge number of objects without using too much memory:

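import gc

def queryset_iterator(qs, batchsize=500, gc_collect=True):
    # Stream only the primary keys, so the full objects are never all in memory at once.
    iterator = qs.values_list('pk', flat=True).order_by('pk').distinct().iterator()
    eof = False
    while not eof:
        primary_key_buffer = []
        try:
            # Collect up to batchsize primary keys for the next batch.
            while len(primary_key_buffer) < batchsize:
                # The next() builtin works on both Python 2.6+ and Python 3.
                primary_key_buffer.append(next(iterator))
        except StopIteration:
            eof = True
        # Load and yield only the objects that belong to this batch.
        for obj in qs.filter(pk__in=primary_key_buffer).order_by('pk').iterator():
            yield obj
        if gc_collect:
            # Release the finished batch before fetching the next one.
            gc.collect()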

Then you can use the function to iterate over the objects to delete:

for obj in queryset_iterator(HugeQueryset.objects.all()):
    obj.delete()
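
The batching step inside `queryset_iterator` can also be isolated into a small helper. This is a sketch under the assumption that any iterable of primary keys is passed in; the name `chunked` is hypothetical and not part of Django.

```python
from itertools import islice


def chunked(iterable, size):
    """Yield lists of at most `size` items from `iterable`, in order."""
    iterator = iter(iterable)
    while True:
        # islice pulls at most `size` items without consuming the rest.
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk
```

Each list it yields could then be passed to `qs.filter(pk__in=...)`, as the function above does with `primary_key_buffer`.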

For more information you can check this blog post.

Augusto Destrero answered Oct 08 '22 04:10