I've migrated queue-triggered Azure WebJobs to Azure Functions. Based on my measurements, the wait time to pluck messages off the queue is 5x to 60x+ (yes, really) longer with the Functions.
In WebJob land, I observed that with BatchSize, NewBatchThreshold, and MaxPollingInterval at their defaults, queue wait times were generally sub-second.
With my Functions, I am seeing queue wait times often in excess of 45-60 seconds. There is a correlation between the number of items in the queue and wait times: if the number of items in the queue is in the low single digits, wait times are excessive, i.e. 60 seconds plus. This is despite my trying many different combinations of BatchSize and NewBatchThreshold.
Some specific details:
To get some scientific measurements, I instrumented my Functions to log the time the message was queued and the time the message was retrieved from the queue, in order to get the elapsed time. To further eliminate variables, I created several completely empty functions - that is, the body of the queue-triggered method contains nothing but the code to log the time. I saw massive wait times here as well.
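For anyone wanting to reproduce the measurement, the elapsed-time calculation can be sketched as below. This is a minimal illustration in Python, not the code from my Functions; the helper names `parse_utc` and `queue_wait_seconds` are made up for this example, and it assumes both timestamps are available as ISO 8601 strings with an explicit UTC offset.

```python
from datetime import datetime, timezone


def parse_utc(timestamp: str) -> datetime:
    """Parse an ISO 8601 timestamp with an explicit offset and normalize it to UTC."""
    return datetime.fromisoformat(timestamp).astimezone(timezone.utc)


def queue_wait_seconds(enqueued_at: datetime, dequeued_at: datetime) -> float:
    """Return how long a message sat in the queue, in seconds."""
    if enqueued_at.tzinfo is None or dequeued_at.tzinfo is None:
        # Mixing naive and aware datetimes would give misleading results.
        raise ValueError("timestamps must be timezone-aware")
    return (dequeued_at - enqueued_at).total_seconds()


if __name__ == "__main__":
    # Hypothetical timestamps, for illustration only.
    queued = parse_utc("2024-01-01T12:00:00+00:00")
    picked_up = parse_utc("2024-01-01T12:00:45.500000+00:00")
    print(f"queue wait: {queue_wait_seconds(queued, picked_up):.1f} s")
```

Logging this value as the first statement of an otherwise empty function isolates the trigger latency from any work the function does.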
If I take the queue-triggered methods and copy and paste them into an Azure WebJob, the queue wait times become 1 second or less.
Any guidance?
Not sure about WebJobs, but in Azure Functions the time between adding a message to the queue and the moment it's picked up varies - take a look at the details of the polling algorithm from the documentation:
The queue trigger implements a random exponential back-off algorithm to reduce the effect of idle-queue polling on storage transaction costs. The algorithm uses the following logic:
- When a message is found, the runtime waits two seconds and then checks for another message.
- When no message is found, it waits about four seconds before trying again.
- After subsequent failed attempts to get a queue message, the wait time continues to increase until it reaches the maximum wait time, which defaults to one minute.
- The maximum wait time is configurable via the maxPollingInterval property in the host.json file. For local development the maximum polling interval defaults to two seconds.
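To make the effect of that logic concrete, here is a rough simulation in Python. It is only a sketch of the behavior described in the list above, not the runtime's actual implementation: the doubling with ±20% jitter is an assumption (the documentation only says the back-off is random and exponential), and the names `next_poll_delay` and `idle_delays` are invented for this example.

```python
import random

FOUND_DELAY = 2.0       # wait after a message was found (seconds, per the quoted docs)
FIRST_IDLE_DELAY = 4.0  # wait after the first empty poll (seconds, per the quoted docs)


def next_poll_delay(previous_delay, found_message, max_delay=60.0, rng=random):
    """Return the wait before the next poll, in seconds."""
    if found_message:
        # A successful poll resets the back-off.
        return min(FOUND_DELAY, max_delay)
    if previous_delay < FIRST_IDLE_DELAY:
        return min(FIRST_IDLE_DELAY, max_delay)
    # ASSUMPTION: roughly double the delay with +/-20% jitter, then cap it.
    jitter = rng.uniform(0.8, 1.2)
    return min(previous_delay * 2.0 * jitter, max_delay)


def idle_delays(empty_polls, max_delay=60.0, seed=0):
    """Simulate the waits across a run of consecutive empty polls."""
    rng = random.Random(seed)
    delay = FOUND_DELAY
    delays = []
    for _ in range(empty_polls):
        delay = next_poll_delay(delay, False, max_delay=max_delay, rng=rng)
        delays.append(delay)
    return delays


if __name__ == "__main__":
    print([round(d, 1) for d in idle_delays(8)])
    print([round(d, 1) for d in idle_delays(8, max_delay=2.0)])
```

Under these assumptions, a queue that has been idle for a while is polled only once per maximum interval, so a message arriving just after a poll can wait nearly that long. That matches the observation that wait times are worst when the queue holds very few items.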
Based on that, it seems you need to decrease the value of maxPollingInterval. It's 60 seconds by default, so in the worst case you can expect the maximum delay to be around that value. If you decrease it to X, the worst-case time between adding the message and dequeuing it will be around X (probably a bit more due to various overheads).
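As a rough illustration, a host.json fragment along these lines lowers the cap. It assumes the Functions 2.x+ schema, where queue settings sit under `extensions.queues` and the interval is an `hh:mm:ss` string; in the 1.x runtime the setting sits in a top-level `queues` section and is given in milliseconds. The two-second value is just an example, not a recommendation.

```json
{
  "version": "2.0",
  "extensions": {
    "queues": {
      "maxPollingInterval": "00:00:02"
    }
  }
}
```

Keep in mind that a shorter interval means more polls against an idle queue, and therefore more storage transactions, which is the cost the back-off is designed to reduce.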