I'm writing a command to randomly create 5M orders in a database.
from random import sample
from typing import Iterator


def constrained_sum_sample(
    number_of_integers: int, total: int = 5000000
) -> Iterator[int]:
    """Yield n randomly chosen positive integers summing to total.

    Args:
        number_of_integers (int): The number of integers.
        total (int): The total sum. Defaults to 5000000.

    Yields:
        int: The integers whose sum is equal to total.
    """
    # Pick n - 1 distinct cut points in [1, total), then yield the gaps between them.
    dividers = sorted(sample(range(1, total), number_of_integers - 1))
    for i, j in zip(dividers + [total], [0] + dividers):
        yield i - j

def create_orders():
    customers = Customer.objects.all()
    number_of_customers = Customer.objects.count()
    for customer, number_of_orders in zip(
        customers,
        constrained_sum_sample(number_of_integers=number_of_customers),
    ):
        for _ in range(number_of_orders):
            create_order(customer=customer)

number_of_customers will be greater than 1k, and the create_order function does at least 5 db operations: one to create the order, one to randomly get the order's store, one to create the order item (this can go up to 30, also randomly), one to get the item's product (or more, as many as there are items), and one to create the sales note.
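As a quick sanity check of the sampling helper, the sketch below repeats it using only the standard library and confirms that the parts are positive and add up to the total. The small sizes (10 customers sharing 1,000 orders) are illustrative assumptions, not the real workload.

```python
from random import sample
from typing import Iterator


def constrained_sum_sample(number_of_integers: int, total: int = 5_000_000) -> Iterator[int]:
    """Yield number_of_integers positive integers that sum to total."""
    # Choose n - 1 distinct cut points in [1, total) and yield the gaps between them.
    dividers = sorted(sample(range(1, total), number_of_integers - 1))
    for upper, lower in zip(dividers + [total], [0] + dividers):
        yield upper - lower


# Illustrative sizes only: 10 customers sharing 1,000 orders.
orders_per_customer = list(constrained_sum_sample(number_of_integers=10, total=1_000))
print(len(orders_per_customer), sum(orders_per_customer), min(orders_per_customer) >= 1)
```

Because the cut points are distinct and lie strictly between 0 and total, every customer gets at least one order.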
As you may suspect, the command takes a LONG time to complete. I've tried, unsuccessfully, to perform these operations asynchronously. All of my attempts (a dozen at least, most of them using sync_to_async) have raised the following error:
SynchronousOnlyOperation you cannot call this from an async context - use a thread or sync_to_async
Before I continue to break my head, I ask: is it possible to achieve what I desire? If so, how should I proceed?
Thank you very much!
Django has support for writing asynchronous (“async”) views, along with an entirely async-enabled request stack if you are running under ASGI. Async views will still work under WSGI, but with performance penalties, and without the ability to have efficient long-running requests.
Django detects an async view function by its async def declaration. Under WSGI, Django runs such a view in its own one-off event loop, so tasks can still run concurrently inside the view.
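To illustrate what running tasks concurrently inside an async function looks like, here is a minimal standard-library sketch. It is not a Django view: view_like_handler and fetch are hypothetical names, and asyncio.sleep stands in for real I/O such as a network call.

```python
import asyncio


async def fetch(name: str, delay: float) -> str:
    # Hypothetical I/O-bound task; asyncio.sleep stands in for a network call.
    await asyncio.sleep(delay)
    return name


async def view_like_handler() -> list:
    # gather runs both coroutines concurrently and returns results in argument order.
    return await asyncio.gather(fetch("stores", 0.02), fetch("products", 0.01))


gathered = asyncio.run(view_like_handler())
print(gathered)
```

The two waits overlap, so the handler takes roughly as long as the slower task rather than the sum of both.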
async_to_sync turns an awaitable into a synchronous callable, and asyncio.run executes a coroutine and returns the result. According to the documentation, a callable from async_to_sync works in a subthread.
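A minimal sketch of the asyncio.run half of that statement, using only the standard library: it starts an event loop, runs the coroutine to completion, and hands the return value back to synchronous code. The names fetch_answer and sync_caller are hypothetical, and async_to_sync from asgiref is not used here.

```python
import asyncio


async def fetch_answer() -> int:
    # Stand-in for real awaitable work (hypothetical example coroutine).
    await asyncio.sleep(0)
    return 42


def sync_caller() -> int:
    # asyncio.run creates an event loop, runs the coroutine, and returns its result.
    return asyncio.run(fetch_answer())


answer = sync_caller()
print(answer)
```

Note that asyncio.run cannot be called from code that is already running inside an event loop, which is one reason a helper like async_to_sync exists.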
Django Q is a native Django task queue, scheduler and worker application using Python multiprocessing.
Django 3.1 officially supports asynchronous views and middleware; however, if you try to call the ORM within an async function, you will get SynchronousOnlyOperation.
If you need to access the database from an async function, Django provides helpers such as async_to_sync and sync_to_async to switch between threaded and coroutine mode, as follows:
from asgiref.sync import sync_to_async
results = await sync_to_async(Blog.objects.get, thread_sensitive=True)(pk=123)
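# settings.py
# DJANGO_ALLOW_ASYNC_UNSAFE is an environment variable, not a Django setting; any value disables the async-safety check.
import os
os.environ["DJANGO_ALLOW_ASYNC_UNSAFE"] = "true"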
The reason thread_sensitive=True is needed in Django is that many libraries, specifically database adapters, require that they are accessed in the same thread that they were created in. Also, a lot of existing Django code assumes it all runs in the same thread, e.g. middleware adding things to a request for later use in views.
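The same-thread requirement can be illustrated with the standard library alone. The sketch below is an analogy, not asgiref's implementation: a single-worker executor guarantees that every offloaded blocking call runs on one and the same thread, separate from the event loop's thread. blocking_query is a hypothetical stand-in for a synchronous ORM call.

```python
import asyncio
import threading
from concurrent.futures import ThreadPoolExecutor


def blocking_query(pk: int) -> tuple:
    # Hypothetical stand-in for a synchronous ORM call; records the thread it ran on.
    return pk, threading.get_ident()


async def main() -> list:
    loop = asyncio.get_running_loop()
    # One worker thread, so every offloaded call runs on the same thread.
    with ThreadPoolExecutor(max_workers=1) as executor:
        calls = [loop.run_in_executor(executor, blocking_query, pk) for pk in range(5)]
        return await asyncio.gather(*calls)


query_results = asyncio.run(main())
thread_ids = {thread_id for _, thread_id in query_results}
print(len(query_results), len(thread_ids))
```

With more than one worker the calls could land on different threads, which is exactly what a thread-bound database connection cannot tolerate.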
More details are in the async documentation: https://docs.djangoproject.com/en/3.1/topics/async/
It's possible to achieve what you want; however, you need to approach the problem from a different angle.
Try using asynchronous task workers; simple options are RQ workers or Celery.
Use one of these libraries to run long-running tasks defined in Django in separate threads or processes.
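As a rough sketch of that idea, the work can be split into one independent job per customer and handed to a pool of workers. The example uses concurrent.futures from the standard library as a stand-in for RQ or Celery workers, which would receive the same kind of job through a queue. The names create_orders_for_customer and run_jobs, and the sample data, are assumptions for illustration only.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple


def create_orders_for_customer(job: Tuple[int, int]) -> int:
    # Hypothetical task body: a real task would call create_order() for the customer
    # number_of_orders times; here it only reports how many it would create.
    customer_id, number_of_orders = job
    return number_of_orders


def run_jobs(jobs: List[Tuple[int, int]], workers: int = 4) -> int:
    # Each job is independent, so the workers can process them in parallel.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(create_orders_for_customer, jobs))


# Illustrative data: (customer_id, number_of_orders) pairs.
jobs = [(1, 3), (2, 5), (3, 2)]
total_created = run_jobs(jobs)
print(total_created)
```

Passing plain ids and counts rather than model instances keeps each job small and easy to serialize, which matters once the jobs travel through a real task queue.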