Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Python Celery versus Threading Library for running async requests [closed]

Tags:

I am running a python method that parses a lot of data. Since it is time intensive, I would like to run it asynchronously on a separate thread so the user can still access the website/UI.

Do threads using the "from threading import thread" module terminate if a user exits the site or do they continue to run on the server?

What would be the advantages of using Celery versus simply using the threading module for something like this?

like image 457
user3426600 Avatar asked May 22 '14 02:05

user3426600


People also ask

Does threading improve performance in Python?

Using the threading module in Python or any other interpreted language with a GIL can actually result in reduced performance. If your code is performing a CPU bound task, such as decompressing gzip files, using the threading module will result in a slower execution time.

Is Asyncio better than threading?

One of the cool advantages of asyncio is that it scales far better than threading . Each task takes far fewer resources and less time to create than a thread, so creating and running more of them works well. This example just creates a separate task for each site to download, which works out quite well.

How many CPUs or cores will the Python threading library take advantage of simultaneously?

How many CPUs (or cores) will the Python threading library take advantage of simultaneously? Python threading is restricted to a single CPU at one time. The multiprocessing library will allow you to run code on different processors.

Does Asyncio use threads or processes?

asyncio is essentially threading where not the CPU but you, as a programmer (or actually your application), decide where and when does the context switch happen. In Python you use an await keyword to suspend the execution of your coroutine (defined using async keyword).


1 Answers

The Python interpreter is resolutely single threaded due to the (in)famous global interpreter lock aka (GIL). Thus threading in Python only provides parallelism when computation and IO can happen simultaneously. Compute bound tasks will see little benefit from threading in the Python threading model, at least under CPython 2 or 3.

On the other hand, those limitations don't apply to multiprocessing, which is what you'll be doing with a queuing system like celery. You can run several worker instances of Python which can execute simultaneously on multicore machines, or multiple machines for that matter.

If I understand your scenario - an interaction on a web site kicks off a long-running job - queuing is almost certainly the way to go. You get true parallelism and the easy option to move processing to other machines.

like image 101
cbare Avatar answered Oct 21 '22 07:10

cbare