
Django CRUD operations for multiple records - transaction.atomic vs bulk_create

Tags:

django

I have a Django 1.10 project with a simple model.
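A minimal sketch of the model and serializer (they aren't shown in the original question; the field type, max_length, and serializer shape are assumptions based on the endpoint code below, where only `field1` appears):

```python
# Hypothetical reconstruction -- only the field name `field1` is known
from django.db import models
from rest_framework import serializers

class Test(models.Model):
    field1 = models.CharField(max_length=255)  # assumed field type

class TestSerializer(serializers.ModelSerializer):
    class Meta:
        model = Test
        fields = ['field1']
```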

I have a simple REST endpoint designed for testing purposes:

from django.db import transaction
from rest_framework.decorators import api_view

@api_view(['POST'])
@transaction.atomic
def r_test(request):
    for record in request.data:
        serializer = TestSerializer(data=record)
        if serializer.is_valid():
            serializer.save()

...that takes 9 seconds (too slow) to execute for just 100 records.

If I rewrite it the following way, it executes instantly.

@api_view(['POST'])
def r_test(request):
    obj_list = []
    for record in request.data:
        obj = Test(field1=record['field1'])
        obj_list.append(obj)
    Test.objects.bulk_create(obj_list)

What bothers me is that many sources say that wrapping inserts in a transaction (which I do by adding the @transaction.atomic decorator) should significantly speed up multiple insert operations. But I'm not seeing that here.

So the question is: does only bulk_create() deliver super-fast speed for inserting big batches of data, or am I doing something wrong with transaction.atomic?

Update: Moreover, I have ATOMIC_REQUESTS set to True in my settings. Could something be wrong in the settings? For example, does DEBUG = True prevent Django from executing the queries in a single transaction?

Update 2: I have tried both ways, with the decorator and by wrapping the for loop in with transaction.atomic():. I still observe instant execution only with bulk_create().
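For reference, the with-statement variant presumably looked like this (same behavior as the decorator, just a narrower scope):

```python
@api_view(['POST'])
def r_test(request):
    with transaction.atomic():
        for record in request.data:
            serializer = TestSerializer(data=record)
            if serializer.is_valid():
                serializer.save()
```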

Update 3: My DB is MySQL.

Edgar Navasardyan asked Oct 19 '25

2 Answers

Transactions do generally speed up the inserting process. Since you are already in a transaction due to ATOMIC_REQUESTS = True, you won't notice a difference when adding @transaction.atomic. The main reason transactions are faster is that committing takes time: without a transaction, Django runs in autocommit mode, so every query results in its own commit.
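You can see the per-statement commit pattern in isolation with plain sqlite3 from the standard library (a sketch, not Django; table name and values are made up, and an in-memory database understates the commit cost you'd pay on a real MySQL server):

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (field1 TEXT)")
rows = [("value %d" % i,) for i in range(100)]

# Autocommit style: one commit per INSERT, which is what Django
# does for each save() outside a transaction.
start = time.perf_counter()
for row in rows:
    conn.execute("INSERT INTO test (field1) VALUES (?)", row)
    conn.commit()
per_row_commit = time.perf_counter() - start

# Single transaction: 100 INSERTs, one commit at the end, which is
# what @transaction.atomic (or ATOMIC_REQUESTS) gives you.
start = time.perf_counter()
for row in rows:
    conn.execute("INSERT INTO test (field1) VALUES (?)", row)
conn.commit()
one_commit = time.perf_counter() - start

count = conn.execute("SELECT COUNT(*) FROM test").fetchone()[0]
print(count)  # 200: both loops inserted their 100 rows
```

On a networked MySQL server each commit additionally forces the server to make the data durable, so the gap between the two loops is much larger than on in-memory SQLite.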

Transactions are not a magic bullet when it comes to performance, though. You are still executing 100 queries and making 100 round trips to the database. Even if your database runs on the same system, that is going to take some time. That's where bulk_create comes into play: it performs a single query to insert all the data at once. You've just saved yourself 99 database round trips, and that is far more significant than any speedup from transactions.
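bulk_create builds one multi-row INSERT statement, roughly like the following stdlib sqlite3 sketch (one statement and one round trip instead of 100; table name and values are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (field1 TEXT)")
values = ["value %d" % i for i in range(100)]

# One multi-row INSERT, comparable to the SQL bulk_create() sends:
#   INSERT INTO test (field1) VALUES (?), (?), ..., (?)
placeholders = ", ".join(["(?)"] * len(values))
conn.execute("INSERT INTO test (field1) VALUES " + placeholders, values)
conn.commit()

count = conn.execute("SELECT COUNT(*) FROM test").fetchone()[0]
print(count)  # 100 rows from a single statement
```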

knbk answered Oct 22 '25


djangorestframework, right? I think you should improve your API function first:

@api_view(['POST'])
def r_test(request):
    serializer = TestSerializer(data=request.data, many=True)
    if serializer.is_valid():
        serializer.save()

Try it again.
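One caveat: many=True batches validation, but DRF's default ListSerializer.create() still calls the child serializer's create() once per item, i.e. one INSERT per record. To get a single query you can pair it with bulk_create via a custom list serializer. A sketch, assuming the Test model from the question:

```python
from rest_framework import serializers

class TestListSerializer(serializers.ListSerializer):
    def create(self, validated_data):
        # One INSERT for all records instead of one per record
        return Test.objects.bulk_create(
            [Test(**item) for item in validated_data]
        )

class TestSerializer(serializers.ModelSerializer):
    class Meta:
        model = Test
        fields = ['field1']
        list_serializer_class = TestListSerializer
```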

gzerone answered Oct 22 '25


