Aggregating save()s in Django?

I'm using Django with an sqlite backend, and write performance is a problem. I may graduate to a "proper" db at some stage, but for the moment I'm stuck with sqlite. I think that my write performance problems are probably related to the fact that I'm creating a large number of rows, and presumably each time I save() one it's locking, unlocking and syncing the DB on disk.

How can I aggregate a large number of save() calls into a single database operation?

asked Aug 03 '10 by kdt


People also ask

What does aggregate do in Django?

When specifying the field to be aggregated in an aggregate function, Django will allow you to use the same double underscore notation that is used when referring to related fields in filters. Django will then handle any table joins that are required to retrieve and aggregate the related value.
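For instance, a minimal sketch (assuming the hypothetical Book and Author models from the Django docs, where Book has a ManyToManyField named authors and Author has an age field):

from django.db.models import Avg, Max

stats = Book.objects.aggregate(
    average_author_age=Avg('authors__age'),   # double underscore joins to the related Author table
    oldest_author=Max('authors__age'),
)
# stats is a plain dict, e.g. {'average_author_age': 41.2, 'oldest_author': 73}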

How do I save multiple items in Django?

You are going in the right direction with bulk loading. Generate a list of PostsModel objects and then use bulk_create to insert them into the database in a single batch. An important note: bulk_create won't work if the posts already exist in the database; to update existing posts, use bulk_update instead.
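A rough sketch of both calls (PostsModel, its fields, and the myapp import path are assumptions for illustration):

from myapp.models import PostsModel

new_posts = [
    PostsModel(title='First', body='...'),
    PostsModel(title='Second', body='...'),
]
PostsModel.objects.bulk_create(new_posts)   # inserts the new rows in a single batch

# For rows that already exist, change them in memory and write them back in one query:
existing = list(PostsModel.objects.filter(title__in=['First', 'Second']))
for post in existing:
    post.body = 'updated body'
PostsModel.objects.bulk_update(existing, ['body'])   # bulk_update is available since Django 2.2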

How use Django save method?

Whenever one tries to create an instance of a model, either from the admin interface or the Django shell, the save() function is run. We can override save() to apply some constraint or fill in read-only fields, such as a SlugField, before the data is stored in the database.
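A minimal sketch of such an override (the Article model and its fields are assumptions for illustration):

from django.db import models
from django.utils.text import slugify

class Article(models.Model):
    title = models.CharField(max_length=200)
    slug = models.SlugField(blank=True, editable=False)   # treated as read-only

    def save(self, *args, **kwargs):
        if not self.slug:                      # fill the field before the row is written
            self.slug = slugify(self.title)
        super().save(*args, **kwargs)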

What is super save Django?

When you override a method of a class, you can call the parent class's version of it using super. The save() method on a model records the instance in the database. The first super(Review, self).save() is there to obtain an id, since the id is generated automatically when an instance is first saved to the database.
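A sketch of that double-save pattern (the Review fields and the REV- prefix are assumptions for illustration):

from django.db import models

class Review(models.Model):
    body = models.TextField()
    reference_code = models.CharField(max_length=20, blank=True)

    def save(self, *args, **kwargs):
        super(Review, self).save(*args, **kwargs)        # first save: the database assigns self.pk
        if not self.reference_code:
            self.reference_code = 'REV-%d' % self.pk     # derive a value from the new id
            super(Review, self).save(update_fields=['reference_code'])   # second save: store it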


2 Answers

EDITED: commit_on_success is deprecated and was removed in Django 1.8. Use transaction.atomic instead. See Fraser Harris's answer.

Actually this is easier to do than you think. You can use transactions in Django. These batch database operations (specifically save, insert and delete) into a single operation. I've found the easiest one to use is commit_on_success. Essentially you wrap your database save operations into a function and then apply the commit_on_success decorator.

from django.db.transaction import commit_on_success

@commit_on_success
def lot_of_saves(queryset):
    for item in queryset:
        modify_item(item)
        item.save()
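For example (MyModel and the needs_update filter are assumptions for illustration), the decorated function is called like any other, and everything inside it is committed as one transaction:

lot_of_saves(MyModel.objects.filter(needs_update=True))   # one commit for the whole loop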

This will give a huge speed increase. You'll also get the benefit of rollbacks if any of the items fail. If you have millions of save operations then you may have to commit them in blocks using commit_manually and transaction.commit(), but I've rarely needed that.
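A sketch of that block-commit approach using the legacy API mentioned above (commit_manually was removed in Django 1.8; modify_item and the block size of 1000 are assumptions):

from django.db import transaction

@transaction.commit_manually
def lots_of_saves(queryset):
    try:
        for i, item in enumerate(queryset, start=1):
            modify_item(item)
            item.save()
            if i % 1000 == 0:
                transaction.commit()   # flush a block of work
        transaction.commit()           # commit whatever is left
    except Exception:
        transaction.rollback()
        raise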

Hope that helps,

Will

answered Oct 09 '22 by JudoWill


New as of Django 1.6 is atomic, a simple API to control DB transactions. Copied verbatim from the docs:

atomic is usable both as a decorator:

from django.db import transaction

@transaction.atomic
def viewfunc(request):
    # This code executes inside a transaction.
    do_stuff()

and as a context manager:

from django.db import transaction

def viewfunc(request):
    # This code executes in autocommit mode (Django's default).
    do_stuff()

    with transaction.atomic():
        # This code executes inside a transaction.
        do_more_stuff()

The legacy django.db.transaction functions autocommit(), commit_on_success(), and commit_manually() have been deprecated and will be removed in Django 1.8.
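Applied to the original question, a sketch might look like this (MyModel and modify_item are assumptions for illustration); every save() inside the block becomes part of one transaction, so SQLite locks and syncs once instead of once per row:

from django.db import transaction

with transaction.atomic():
    for item in MyModel.objects.all():
        modify_item(item)
        item.save()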

answered Oct 09 '22 by Fraser Harris