I have a web service with a load balancer that maps requests to multiple machines. Each of these requests end up sending a http call to an external API, and for that reason I would like to rate limit the number of requests I send to the external API.
My current design:
This doesn't work when I'm using multiple machines, because each machine has its own queue and rate limiter. For example: when I set my rate limiter to 10,000 requests/day, and I use 10 machines, I will end up processing 100,000 requests/day at full load because each machine processes 10,000 requests/day. I would like to rate limit so that only 10,000 requests get processed/day, while still load balancing those 10,000 requests.
I'm using Java and MYSQL.
To enforce rate limiting, first understand why it is being applied in this case, and then determine which attributes of the request are best suited to be used as the limiting key (for example, source IP address, user, API key). After you choose a limiting key, a limiting implementation can use it to track usage.
A better way: randomization While the reset option is one way to deal with rate limiting, you may want more granular control over your rate limit handling. For example, you might have a specific retry timeframe that you want to follow and not wait for the minute window to pass for your entire rate limit to be reset.
Leaky Bucket This algorithm is closely related to Token Bucket, where it provides a simple approach to rate limiting via a queue which you can think of as a bucket holding the requests.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With