I'm using Django with an SQLite backend, and write performance is a problem. I may graduate to a "proper" database at some stage, but for the moment I'm stuck with SQLite. I think my write performance problems are related to the fact that I'm creating a large number of rows, and presumably each time I save() one it's locking, unlocking and syncing the database on disk.

How can I aggregate a large number of save() calls into a single database operation?
When specifying the field to be aggregated in an aggregate function, Django will allow you to use the same double underscore notation that is used when referring to related fields in filters. Django will then handle any table joins that are required to retrieve and aggregate the related value.
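For example (Book and Author here are hypothetical models, with Book holding a ForeignKey named author), the relation is followed directly inside the aggregate and Django performs the join:

from django.db.models import Avg, Max
from myapp.models import Book  # hypothetical app and model

# "author__age" follows the Book -> Author relation, just like in a filter.
Book.objects.aggregate(Avg("author__age"))
Book.objects.aggregate(oldest_author=Max("author__age"))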
You are going in the right direction with the bulk load. Generate a list of the PostsModel objects and then use bulk_create to insert them into the database in one go. An important note here is that it won't work if the posts already exist in the database. For updating existing posts, try bulk_update.
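A minimal sketch of both calls, assuming a PostsModel with title and body fields and an incoming_rows iterable (those names are assumptions, not from the question):

from myapp.models import PostsModel  # hypothetical import path

# Build the instances in memory first, then insert them in batched INSERT queries.
posts = [PostsModel(title=title, body=body) for title, body in incoming_rows]
PostsModel.objects.bulk_create(posts, batch_size=500)

# For rows that already exist, write them back in bulk instead;
# bulk_update needs the list of fields to update.
PostsModel.objects.bulk_update(existing_posts, ["title", "body"], batch_size=500)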
Whenever one creates an instance of a model, either from the admin interface or the Django shell, the save() method is run. We can override save() to apply some constraint or to fill read-only fields such as a SlugField before the data is stored in the database.
When you override a method of a class, you can call the parent class's version using super. The save() method on a model records the instance in the database. The first super(Review, self).save() is there to obtain an id, since the id is generated automatically when an instance is first saved to the database.
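A sketch of that pattern, assuming the Review model has a title field (the field names are assumptions; only the double save() around the SlugField comes from the description above):

from django.db import models
from django.utils.text import slugify

class Review(models.Model):
    title = models.CharField(max_length=200)
    slug = models.SlugField(blank=True)

    def save(self, *args, **kwargs):
        super(Review, self).save(*args, **kwargs)      # first save: the id is generated here
        if not self.slug:
            self.slug = slugify("%s-%s" % (self.title, self.pk))
            super(Review, self).save(update_fields=["slug"])  # second save: store the slug built from the id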
EDITED: commit_on_success is deprecated and was removed in Django 1.8. Use transaction.atomic instead. See Fraser Harris's answer.
Actually this is easier to do than you think. You can use transactions in Django. These batch database operations (specifically save, insert and delete) into one operation. I've found the easiest one to use is commit_on_success. Essentially you wrap your database save operations into a function and then apply the commit_on_success decorator.
from django.db.transaction import commit_on_success

@commit_on_success
def lot_of_saves(queryset):
    for item in queryset:
        modify_item(item)
        item.save()
This will give you a huge speed increase. You'll also get the benefit of having roll-backs if any of the items fail. If you have millions of save operations then you may have to commit them in blocks using commit_manually and transaction.commit(), but I've rarely needed that.
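A sketch of that block-wise approach with the legacy API (the batch size of 10,000 and modify_item are assumptions; commit_manually was removed along with commit_on_success in Django 1.8):

from django.db import transaction

@transaction.commit_manually
def lots_of_saves(queryset):
    try:
        for i, item in enumerate(queryset):
            modify_item(item)
            item.save()
            if i % 10000 == 0:
                transaction.commit()   # flush this block of saves in one go
        transaction.commit()           # commit the final partial block
    except Exception:
        transaction.rollback()
        raise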
Hope that helps,
Will
New as of Django 1.6 is atomic, a simple API to control DB transactions. Copied verbatim from the docs:
atomic is usable both as a decorator:
from django.db import transaction

@transaction.atomic
def viewfunc(request):
    # This code executes inside a transaction.
    do_stuff()
and as a context manager:
from django.db import transaction

def viewfunc(request):
    # This code executes in autocommit mode (Django's default).
    do_stuff()

    with transaction.atomic():
        # This code executes inside a transaction.
        do_more_stuff()
The legacy django.db.transaction functions autocommit(), commit_on_success(), and commit_manually() have been deprecated and will be removed in Django 1.8.
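Applied to the original question, the earlier lot_of_saves wrapper written against the current API would look roughly like this (modify_item is the same hypothetical helper as above):

from django.db import transaction

@transaction.atomic
def lot_of_saves(queryset):
    for item in queryset:
        modify_item(item)
        item.save()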