Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

A semaphore-like mechanism for Celery

We're developing a distributed application in Python + Celery for our task queue.

Our application requires us to download emails from a remote ISP via IMAP (e.g.: gmail) and we're looking to have be able this task be done in parallel. For a given email account you're granted a limited to a number of simulations connections, so we need a way to atomically keep track of our active connections for all accounts being downloaded.

I've found multiple examples of atomic locks for Celery using Redis, but none that can keep track of a pool of limited resources like this, and all attempts to implement our own have resulted in difficult to debug race-conditions, causing our locks to intermittently never get released.

like image 511
NFicano Avatar asked May 03 '11 22:05

NFicano


1 Answers

As celery uses the multiprocessing library for processes, you should be able to use the process safe multiprocessing.Semaphore([value]).

You will want to create the semaphore upfront and pass it in, and you can set a default value equal to the maximum number of concurrent accesses you want to allow. Then acquire before your IMAP connection and release after you disconnect.

like image 60
underrun Avatar answered Sep 29 '22 03:09

underrun