I'm using Celery to process multiple data-mining tasks. One of these tasks connects to a remote service that allows a maximum of 10 simultaneous connections per user. In other words, the total CAN exceed 10 connections globally, but it CANNOT exceed 10 connections per individual job.
I THINK Token Bucket (rate limiting) is what I'm looking for, but I can't seem to find any implementation of it.
Celery features rate limiting, and contains a generic token bucket implementation.
Set rate limits for tasks: http://docs.celeryproject.org/en/latest/userguide/tasks.html#Task.rate_limit
Or at runtime: http://docs.celeryproject.org/en/latest/userguide/workers.html#rate-limits
The token bucket implementation is in Kombu (kombu.utils.limits.TokenBucket).
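To make the idea concrete, here is a minimal standard-library sketch of a token bucket. It is an illustration only, not Kombu's actual code: the class layout, the injectable clock parameter, and the example values are assumptions chosen for this sketch.

```python
import time


class TokenBucket:
    """Illustrative token bucket (hypothetical sketch, not Kombu's code)."""

    def __init__(self, fill_rate, capacity, clock=time.monotonic):
        self.fill_rate = float(fill_rate)  # tokens added per second
        self.capacity = float(capacity)    # maximum burst size
        self._clock = clock                # injectable so it can be tested
        self._tokens = float(capacity)     # start with a full bucket
        self._stamp = clock()

    def _refill(self):
        # Add the tokens earned since the last check, capped at capacity.
        now = self._clock()
        elapsed = now - self._stamp
        self._tokens = min(self.capacity, self._tokens + elapsed * self.fill_rate)
        self._stamp = now

    def can_consume(self, tokens=1):
        """Take `tokens` from the bucket if available; return True on success."""
        self._refill()
        if tokens <= self._tokens:
            self._tokens -= tokens
            return True
        return False


if __name__ == "__main__":
    bucket = TokenBucket(fill_rate=10, capacity=10)  # roughly 10 operations/second
    if bucket.can_consume():
        print("allowed to start one more operation")
```

Note that a token bucket caps how often operations start. It does not, by itself, cap how many are in flight at the same time, so slow connections can still pile up past 10.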
After much research, I found that Celery does not explicitly provide a way to limit the number of concurrent instances like this. Furthermore, doing so would generally be considered bad practice.
A better solution is to download concurrently within a single task, then use Redis or Memcached to store the results and distribute them to other tasks for processing.
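One way to keep the downloads inside a single task under the 10-connection limit is a bounded thread pool. The sketch below is an assumption about how that could look, not code from the original answer: fetch_all and its fetch argument are hypothetical names, and the Redis or Memcached hand-off is left out.

```python
from concurrent.futures import ThreadPoolExecutor

MAX_CONNECTIONS = 10  # the remote service's limit described in the question


def fetch_all(items, fetch, max_connections=MAX_CONNECTIONS):
    """Call fetch(item) for every item, with at most max_connections
    calls running at the same time. Results come back in input order."""
    with ThreadPoolExecutor(max_workers=max_connections) as pool:
        # The pool never runs more than max_workers calls concurrently,
        # which is what bounds the number of open connections.
        return list(pool.map(fetch, items))
```

Calling fetch_all(urls, download) from the body of a single Celery task, where download is whatever function opens one connection, keeps the cap local to that job. The returned list could then be written to the shared store for the downstream tasks.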