I have a Django 1.10 model, and a simple model:
I have a simple REST endpoint designed for testing purposes:
@api_view(['POST'])
@transaction.atomic
def r_test(request):
for record in request.data:
serializer = TestSerializer(data = record)
if serializer.is_valid():
serializer.save()
...that takes 9 seconds (too slow) to execute for one 100 records.
If I rewrite it the following way, it executes instantly.
@api_view(['POST'])
def r_test(request):
obj_list = []
for record in request.data:
obj = Test(field1 = record['field1'])
obj_list.append(obj)
Test.objects.bulk_create(obj_list)
What bothers me is that I have read in many sources that wrapping function into a transaction (which I do by adding a decorator @transaction.atomic) would significantly improve insert operations in case of multiple operations. But I can't see this now.
So the question is, does only bulk_create() deliver super-fast speed for inserting big data, or there's something I am doing wrong with transaction.atomic?
Update: Moreover, I have ATOMIC_REQUESTS set to True in my settings. BTW, could it be that something is wrong in the settings? Like, say, Debug = True hinders Django from executing queries in a single transaction ?
Update 2 I have tried both ways with decorator and by wrapping the for loop in with transaction.atomic():. And still I observe instant execution only with bulk_create()
Update 3. My DB is MySQL
Transactions do generally speed up the inserting process. Since you are already in a transaction due to ATOMIC_REQUESTS = True, you won't notice a difference when using @transaction.atomic(). The main reason transaction are faster is that committing takes time. Without a transaction, Django uses auto-commit mode, so every query results in a commit.
Transactions are not a magic bullet when it comes to performance. You are still performing 100 queries, and 100 roundtrips to the database. Even if your database runs on the same system, that is going to take some time. That's where bulk_create comes into play. It performs a single query to insert all the data at once. You've just saved yourself 99 database roundtrips. That is way more significant than any speedup caused by transactions.
djangorestframework, right? I think you should improve your api function first:
def r_test(request):
serializer = TestSerializer(data=request.data, many=True)
if serializer.is_valid():
serializer.save()
Try it again.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With