Python 2.7.9, Django 1.7, MySQL 5.6
I would like to populate a whole bunch of object instances belonging to multiple classes, stack them up into a single create()-like query, open a database connection, execute the query, then close it. My main motivation is performance, but code compactness is also a plus.
The functionality of bulk_create() appears to be exactly what I want, but I am in violation of at least one of the caveats listed in its documentation, i.e.
It does not work with many-to-many relationships.
and
It does not work with child models in a multi-table inheritance scenario.
These limitations are also described in the source code thus:
# So this case is fun. When you bulk insert you don't get the primary
# keys back (if it's an autoincrement), so you can't insert into the
# child tables which references this. There are two workarounds, 1)
# this could be implemented if you didn't have an autoincrement pk,
# and 2) you could do it by doing O(n) normal inserts into the parent
# tables to get the primary keys back, and then doing a single bulk
# insert into the childmost table. Some databases might allow doing
# this by using RETURNING clause for the insert query. We're punting
# on these for now because they are relatively rare cases.
But the error returned when I attempt it is the generic
ValueError: Can't bulk create an inherited model
My models do not appear to contain any many-to-many fields or foreign keys. It is not entirely clear to me which multi-table inheritance scenarios the documentation is referring to, so I'm not sure whether that is my problem. I was hoping I could slip by with my structure, which looks like this, but I got the generic error above, so no dice:
child class with OneToOneField---\
                                  \
child class with OneToOneField----->---concrete parent class
                                  /
child class with OneToOneField---/
As for the workarounds suggested in the source comment, #1 is not an option for me, and #2 does not look appealing because I assume it would mean sacrificing the performance gains I'm after.
Are there other workarounds that could simulate bulk_create() while handling inheritance like this, without forgoing the gains in performance? Do I need to go back down to raw SQL? I would not mind making a separate collection and executing a separate INSERT/create() for each child object type.
The workaround I settled on was wrapping all of my collected create()s in a with transaction.atomic(): block. This greatly reduced running time. Each create() still issues its own INSERT, but inside atomic() the whole set is committed once when the block exits, rather than once per statement as in Django's default autocommit mode.
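Roughly, the pattern looks like the sketch below. It assumes the rows are first collected in plain Python as (model class, field values) pairs; create_all and pending are made-up names for illustration, not part of Django, and the import sits inside the function only so the snippet can be loaded outside a configured project.

```python
def create_all(pending):
    """Create every (model, field_values) pair inside one transaction.

    `pending` is a hypothetical list built up front in plain Python, e.g.
    [(ChildA, {"name": "x"}), (ChildB, {"size": 3})].
    """
    # Imported lazily so this module loads without a configured Django project.
    from django.db import transaction

    created = []
    # Each create() still sends its own INSERT; atomic() defers the commit
    # until the block exits and rolls everything back if an exception escapes.
    with transaction.atomic():
        for model, field_values in pending:
            created.append(model.objects.create(**field_values))
    return created
```

Because create() goes through the normal save path, the parent row and the child row of an inherited model are both written, which is exactly what bulk_create() refuses to do.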
A downside could be that if any error at all is encountered, all changes are rolled back and the database is left untouched. This could be remedied by chunking the create()s into batches and opening and closing a transaction around each one. (In my case this was not the desired behavior because I wanted all of the data or none of it.)
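The batched variant could be sketched as follows. This is an illustration under assumptions, not tested against a real project: chunked and create_in_batches are hypothetical helpers, the work is assumed to be a list of (model class, field values) pairs, and the transaction context manager is passed in as a parameter (with Django you would pass transaction.atomic).

```python
from itertools import islice


def chunked(items, size):
    """Yield successive lists of at most `size` items."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def create_in_batches(pending, batch_size, atomic):
    """Run each batch of creates in its own transaction.

    `atomic` is the context-manager factory to use, e.g.
    django.db.transaction.atomic. Returns a list of
    (batch_index, exception) pairs for the batches that failed.
    """
    failures = []
    for index, batch in enumerate(chunked(pending, batch_size)):
        try:
            with atomic():
                for model, field_values in batch:
                    model.objects.create(**field_values)
        except Exception as exc:
            # Caught outside the atomic block, so only this batch is rolled
            # back; earlier batches stay committed. In real code, narrow
            # this to django.db.DatabaseError or whatever you expect.
            failures.append((index, exc))
    return failures
```

A call might look like create_in_batches(pending, 500, transaction.atomic); the batch size of 500 is arbitrary and would need tuning against your own data.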