
How to maximize WebJob CPU Usage

I have an Azure Storage queue with over 100,000 items on it. The average processing time is about 1 minute per item (as reported in the WebJob dashboard).

I have set the max batch size for my WebJob to 32, like this:

using Microsoft.Azure.WebJobs;

JobHostConfiguration config = new JobHostConfiguration();
config.Queues.BatchSize = 32;

var host = new JobHost(config);
// The following code ensures that the WebJob will be running continuously.
host.RunAndBlock();

If I set it any higher than 32, the WebJob won't start and keeps flipping between Pending Restart and Starting, so I assume 32 is the maximum batch size.

However, my App Service plan is sitting at around 4% CPU utilization. I have enabled auto-scale based on CPU usage.

What I want to figure out is how to make the WebJob do more work in parallel, so it uses more of that CPU, triggers the auto-scale, and processes the queue faster. What levers can I pull to make my WebJob take better advantage of my App Service plan instances?

asked Feb 28 '16 by emseetea

2 Answers

Note that the BatchSize maximum of 32 is a limit imposed by Azure Queues that the WebJobs SDK doesn't control. A single queue listener can only pull a maximum of 32 messages at a time because that’s all queues allow. That's why your job is not starting properly when you set it greater than 32 - if you check your error logs you should see an error to that effect.
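By way of illustration (this snippet isn't from the original answer), the same ceiling shows up if you read the queue directly with the classic WindowsAzure.Storage client; the queue name and connection-string variable are placeholders:

    using System;
    using Microsoft.WindowsAzure.Storage;
    using Microsoft.WindowsAzure.Storage.Queue;

    class BatchLimitDemo
    {
        static void Main()
        {
            // Placeholder queue/connection string, just to show the per-call cap.
            CloudStorageAccount account =
                CloudStorageAccount.Parse(Environment.GetEnvironmentVariable("AzureWebJobsStorage"));
            CloudQueue queue = account.CreateCloudQueueClient().GetQueueReference("myqueue");

            // 32 is the most the service returns per call; asking for more
            // fails with a StorageException from the service (400, parameter out of range).
            foreach (CloudQueueMessage message in queue.GetMessages(32))
            {
                Console.WriteLine(message.AsString);
            }
        }
    }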

However, there is a second config knob that relates to parallel throughput: config.Queues.NewBatchThreshold. This value defaults to half the BatchSize when not explicitly set. It is the threshold that governs when a new batch is fetched: if you set it to 100, for example, a new batch of messages is fetched whenever the number of messages currently being processed dips below 100, so many more queue messages are processed in parallel.
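Putting the two settings together might look like this (building on the configuration from the question; 100 is just an illustrative number, not a recommendation):

    JobHostConfiguration config = new JobHostConfiguration();

    // Fetch up to 32 messages per call -- the Azure Queues maximum.
    config.Queues.BatchSize = 32;

    // Keep fetching new batches until roughly 100 messages are in flight,
    // so more messages are processed in parallel on each instance.
    config.Queues.NewBatchThreshold = 100;

    var host = new JobHost(config);
    host.RunAndBlock();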

You can also further increase throughput by scaling out your job to multiple instances. I recommend trying the NewBatchThreshold setting first and seeing where that gets you.

answered Nov 17 '22 by mathewc


This comment in the code explains the situation:

    // Azure Queues currently limits the number of messages retrieved to 32. We enforce this constraint here because
    // the runtime error message the user would receive from the SDK otherwise is not as helpful.
    private const int MaxBatchSize = 32;

More information about this can be found at https://azure.microsoft.com/en-us/documentation/articles/storage-dotnet-how-to-use-queues/:

There are two ways you can customize message retrieval from a queue. First, you can get a batch of messages (up to 32). [etc...]

So that's where this limit is coming from. However, I'm thinking that the WebJobs SDK could theoretically process multiple queue batches at the same time, so it doesn't have to be bound to this Storage Queue limitation. That's something that you should bring up on https://github.com/Azure/azure-webjobs-sdk/issues for further discussion to see what can be done. But as it stands, that is indeed the limitation.
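If you need more than 32 messages in flight per instance today, one rough workaround (my own sketch, not something the SDK does for you) is to step outside the queue trigger and run several fetch loops against the same queue, each capped at 32; the queue name, loop count, and visibility timeout below are placeholders:

    using System;
    using System.Linq;
    using System.Threading.Tasks;
    using Microsoft.WindowsAzure.Storage;
    using Microsoft.WindowsAzure.Storage.Queue;

    class ParallelBatchDemo
    {
        static void Main()
        {
            CloudStorageAccount account =
                CloudStorageAccount.Parse(Environment.GetEnvironmentVariable("AzureWebJobsStorage"));
            CloudQueue queue = account.CreateCloudQueueClient().GetQueueReference("myqueue");

            // Four loops x 32 messages = up to 128 messages in flight per instance.
            Task[] loops = Enumerable.Range(0, 4)
                .Select(_ => Task.Run(() => FetchLoop(queue)))
                .ToArray();
            Task.WaitAll(loops);
        }

        static void FetchLoop(CloudQueue queue)
        {
            while (true)
            {
                // Hide each fetched batch long enough to cover the ~1 minute of work per item.
                var batch = queue.GetMessages(32, TimeSpan.FromMinutes(5)).ToList();
                if (batch.Count == 0) break;

                foreach (CloudQueueMessage message in batch)
                {
                    // ... do the per-item work here ...
                    queue.DeleteMessage(message);
                }
            }
        }
    }

Each loop still obeys the per-call limit of 32, but several loops together keep many more messages in flight per instance, which is effectively what raising NewBatchThreshold does inside the SDK.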

answered Nov 17 '22 by David Ebbo