We're developing a distributed application in Python + Celery for our task queue.
Our application downloads emails from a remote ISP via IMAP (e.g. Gmail), and we want this work done in parallel. For a given email account you're granted only a limited number of simultaneous connections, so we need a way to atomically track the active connections across all accounts being downloaded.
I've found multiple examples of atomic locks for Celery using Redis, but none that can track a pool of limited resources like this, and all our attempts to implement it ourselves have resulted in difficult-to-debug race conditions that cause our locks to intermittently never get released.
As Celery uses the multiprocessing library for its worker processes, you should be able to use the process-safe multiprocessing.Semaphore([value]).
You will want to create the semaphore up front and pass it in, setting its initial value to the maximum number of concurrent accesses you want to allow. Then acquire it before opening your IMAP connection and release it after you disconnect.
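A minimal sketch of that pattern, with the actual imaplib calls replaced by a placeholder and a limit of 2 chosen arbitrarily for illustration. Note this only coordinates processes on a single machine that inherit the semaphore; it will not limit workers spread across multiple hosts:

```python
import multiprocessing as mp

# Assumed per-account connection limit; your ISP's real limit may differ.
MAX_CONNECTIONS = 2

def download_account(sem, done, account_id):
    # Block until one of the limited connection slots is free.
    with sem:  # acquire(); released automatically on exit, even on error
        # ... imaplib.IMAP4_SSL(...), fetch mail, logout ...
        done.put(account_id)

def run_demo(n_accounts=5):
    sem = mp.Semaphore(MAX_CONNECTIONS)   # process-safe counting semaphore
    done = mp.Queue()
    procs = [mp.Process(target=download_account, args=(sem, done, i))
             for i in range(n_accounts)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    return sorted(done.get() for _ in range(n_accounts))

if __name__ == "__main__":
    print(run_demo())  # all accounts processed, at most 2 connected at once
```

Wrapping the acquire/release in a `with` block is the important detail: it guarantees the slot is returned even when the download raises, which is exactly the "lock never released" failure mode you described.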