I am just upgrading an older project to Python 3.6, and found out that there are these cool new async / await keywords.
My project contains a web crawler that is not very performant at the moment and takes about 7 minutes to complete. Since I already have Django REST framework in place to access data of my Django application, I thought it would be nice to have a REST endpoint from which I could start the crawler remotely with a simple POST request.
However, I don't want the client to wait synchronously for the crawler to complete. I just want to send back a message right away that the crawler has been started, and run the crawler in the background.
from rest_framework import status
from rest_framework.decorators import api_view
from rest_framework.response import Response
from django.conf import settings

from mycrawler import tasks


async def update_all_async(deep_crawl=True, season=settings.CURRENT_SEASON, log_to_db=True):
    await tasks.update_all(deep_crawl, season, log_to_db)


@api_view(['POST', 'GET'])
def start(request):
    """
    Start crawling.
    """
    if request.method == 'POST':
        print("Crawler: start {}".format(request))
        deep = request.data.get('deep', False)
        season = request.data.get('season', settings.CURRENT_SEASON)
        # this should be called async
        update_all_async(season=season, deep_crawl=deep)
        return Response({"success": "crawler started"}, status=status.HTTP_200_OK)
    else:
        return Response({"description": "Start the crawler by calling this endpoint via POST.",
                         "allowed_parameters": {
                             "deep": "boolean",
                             "season": "number"
                         }}, status.HTTP_200_OK)
I have read some tutorials, including how event loops work, but I don't really get it... Where should I start the loop in this case?
[EDIT] 20/10/2017:
I solved it using threading for now, since it really is a "fire and forget" task. However, I still would like to know how to achieve the same thing using async / await.
Here's my current solution:
import threading


@api_view(['POST', 'GET'])
def start(request):
    ...
    t = threading.Thread(target=tasks.update_all, args=(deep, season))
    t.start()
    ...
Django has support for writing asynchronous (“async”) views, along with an entirely async-enabled request stack if you are running under ASGI. Async views will still work under WSGI, but with performance penalties, and without the ability to have efficient long-running requests.
An async function defines a coroutine. Inside it, the await keyword releases the flow of control back to the event loop. To run a coroutine, you need to schedule it on the event loop; once scheduled, the coroutine is wrapped in a Task, which is a kind of Future.
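As a toy illustration of that schedule-then-await cycle (`crawl()` here is just a placeholder coroutine, not the crawler from the question):

```python
import asyncio


async def crawl():
    # Stand-in for a long-running crawler coroutine.
    await asyncio.sleep(0.1)
    return "done"


async def main():
    # create_task() wraps the coroutine in a Task and schedules it;
    # it starts running as soon as the event loop regains control.
    task = asyncio.create_task(crawl())
    # The caller is free to do other work here...
    result = await task  # ...and can await the Task (a Future) later.
    print(result)


asyncio.run(main())
```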
This is possible in Django 3.1+, after introducing asynchronous support.
Regarding the asynchronous running loop, you can make use of it by running Django with uvicorn or any other ASGI server, instead of gunicorn or other WSGI servers. The difference is that when using an ASGI server there's already a running loop, while with WSGI you would need to create one yourself. With ASGI, you can simply define async functions directly in views.py or in the inherited methods of its View classes.
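For example, a command-line sketch of running under ASGI (the module path `myproject.asgi` is a placeholder; substitute your own project's ASGI module):

```shell
# Install and run an ASGI server (uvicorn here; daphne or hypercorn also work)
pip install uvicorn
uvicorn myproject.asgi:application --host 0.0.0.0 --port 8000
```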
Assuming you go with ASGI, you have multiple ways of achieving this. I'll describe a couple (other options could make use of asyncio.Queue, for example):
By making start() async, you can make direct use of the existing running loop; and by using asyncio.Task, you can fire and forget into that loop. If you want to fire but remember, you can create another Task to follow up on the first one, i.e.:
from rest_framework import status
from rest_framework.decorators import api_view
from rest_framework.response import Response
from django.conf import settings

from mycrawler import tasks

import asyncio


async def update_all_async(deep_crawl=True, season=settings.CURRENT_SEASON, log_to_db=True):
    await tasks.update_all(deep_crawl, season, log_to_db)


async def follow_up_task(task: asyncio.Task):
    await asyncio.sleep(5)  # Or any other reasonable number, or a finite loop...
    if task.done():
        print('update_all task completed: {}'.format(task.result()))
    else:
        print('task not completed after 5 seconds, aborting')
        task.cancel()


@api_view(['POST', 'GET'])
async def start(request):
    """
    Start crawling.
    """
    if request.method == 'POST':
        print("Crawler: start {}".format(request))
        deep = request.data.get('deep', False)
        season = request.data.get('season', settings.CURRENT_SEASON)
        # Once the task is created, it begins running concurrently
        loop = asyncio.get_running_loop()
        task = loop.create_task(update_all_async(season=season, deep_crawl=deep))
        # Fire up a second task to track the previous one
        loop.create_task(follow_up_task(task))
        return Response({"success": "crawler started"}, status=status.HTTP_200_OK)
    else:
        return Response({"description": "Start the crawler by calling this endpoint via POST.",
                         "allowed_parameters": {
                             "deep": "boolean",
                             "season": "number"
                         }}, status.HTTP_200_OK)
Sometimes you can't have an async function to route the request to in the first place, as happens with DRF (as of today). For this, Django provides some useful async adapter functions, but be aware that switching from a sync to an async context, or vice versa, comes with a small performance penalty of approximately 1 ms. Note that this time, the running loop is gathered in the update_all_async function instead:
from rest_framework import status
from rest_framework.decorators import api_view
from rest_framework.response import Response
from django.conf import settings

from mycrawler import tasks

import asyncio
from asgiref.sync import async_to_sync


@async_to_sync
async def update_all_async(deep_crawl=True, season=settings.CURRENT_SEASON, log_to_db=True):
    # We can use the running loop here in this use case
    loop = asyncio.get_running_loop()
    task = loop.create_task(tasks.update_all(deep_crawl, season, log_to_db))
    loop.create_task(follow_up_task(task))


async def follow_up_task(task: asyncio.Task):
    await asyncio.sleep(5)  # Or any other reasonable number, or a finite loop...
    if task.done():
        print('update_all task completed: {}'.format(task.result()))
    else:
        print('task not completed after 5 seconds, aborting')
        task.cancel()


@api_view(['POST', 'GET'])
def start(request):
    """
    Start crawling.
    """
    if request.method == 'POST':
        print("Crawler: start {}".format(request))
        deep = request.data.get('deep', False)
        season = request.data.get('season', settings.CURRENT_SEASON)
        # update_all_async is already wrapped with async_to_sync, so this is a plain sync call
        update_all_async(season=season, deep_crawl=deep)
        return Response({"success": "crawler started"}, status=status.HTTP_200_OK)
    else:
        return Response({"description": "Start the crawler by calling this endpoint via POST.",
                         "allowed_parameters": {
                             "deep": "boolean",
                             "season": "number"
                         }}, status.HTTP_200_OK)
In both cases, the function will quickly return the 200, but technically the 2nd option is slower.
IMPORTANT: When using Django, it is common to have DB operations involved in these async operations. DB operations in Django can only be synchronous, at least for now, so you will have to consider this in asynchronous contexts. sync_to_async() becomes very handy for these cases.