I am trying to use Scale Rules for the first time and am experimenting with the "Storage Queue" resource. While setting options I see the following choices under "Time Aggregation": Minimum, Maximum, Average, Total and Last. What I have learnt is that the value selected for Time Aggregation is applied over the specified "Duration" and compared against the specified "Threshold". However, I fail to understand what the sampling interval is for the data it aggregates. Also consider the following example:
Say I have the following rule: if AppxMsgCount >= 15, increase the instance count by 3, where Threshold = 15 and Time Aggregation is set to "Average".
System state: AppxMsgCount = 20, Current Instance Count = 2
So the first time autoscale kicks in with the above system state, the instance count increases to 5.
Now, with the increased instances, is AppxMsgCount supposed to come down? My hunch says it must, but then what would the maths for it be? Is it 20*2/5?
Second, what do the other options mean here, and when should they be used? i.e. when should I use Total versus when should I use Average?
It would be helpful if I could get a link for reference.
The properties work together as follows: the “statistic” of “metricName” will be calculated every “timeGrain”. Every “timeGrain”, autoscale will take the “timeAggregation” of the previous “timeWindow” amount of data and compare that with the “threshold” based on the “operator”. Using the specific example below, this means:
The average of percentage CPU will be calculated every minute. Every minute, autoscale will take the average of the previous 5 minutes of data and check if it’s greater than 60%. If it is, it will trigger the scale rule.
"rules": [{
"metricTrigger": {
"metricResourceUri": "[resourceId('Microsoft.Compute/virtualMachineScaleSets', 'myScaleSet')]",
"metricName": "Percentage CPU",
"timeGrain": "PT1M",
"statistic": "Average",
"timeWindow": "PT5M",
"timeAggregation": "Average",
"operator": "GreaterThan",
"threshold": 60
},
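The evaluation loop described above can be sketched in Python. The sample numbers below are made up for illustration; the window size, threshold, and operator mirror the JSON rule:

```python
from statistics import mean

# Per-minute "Percentage CPU" samples, one per timeGrain (PT1M).
# Hypothetical data for illustration.
samples = [40, 55, 70, 75, 80, 85]

TIME_WINDOW = 5   # PT5M -> the last five one-minute grains
THRESHOLD = 60    # percent

def rule_fires(samples):
    """Every timeGrain, take the timeAggregation (average) of the
    previous timeWindow of data and compare it to the threshold."""
    window = samples[-TIME_WINDOW:]
    return mean(window) > THRESHOLD  # operator: GreaterThan

print(rule_fires(samples))  # average of [55, 70, 75, 80, 85] = 73 -> True
```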
Note: for a single VM, percentage CPU is simply one number. However, in the case of a scale set, each VM reports a number for percentage CPU. To consolidate these, the scale set calculates the “statistic” across all of the VMs. For example, let’s imagine we had “statistic” as “max”, “timeGrain” as 1 minute, “timeAggregation” as “average”, and “timeWindow” as 5 minutes. This would mean that every minute, the scale set emits the max percentage CPU across all VMs in the scale set. For instance, if there were two VMs in the scale set, one running at 0% CPU, and another running at 90% CPU, for that minute, the scale set would emit a max of 90%. Autoscale would then average the last 5 minutes of these maxes and compare this to the threshold.
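That two-stage aggregation (the "statistic" across VMs each timeGrain, then the "timeAggregation" across the timeWindow) can be sketched like this, using the made-up per-VM numbers from the example above:

```python
from statistics import mean

# Each inner list is one timeGrain (one minute): per-VM "Percentage CPU".
# Made-up numbers: one VM idles at 0%, the other runs at 90%.
per_minute_per_vm = [
    [0, 90],
    [0, 90],
    [0, 90],
    [0, 90],
    [0, 90],
]

# Stage 1: the scale set applies the "statistic" (here: max) across all
# VMs each minute, so it emits 90 for every minute.
emitted = [max(vms) for vms in per_minute_per_vm]

# Stage 2: autoscale applies the "timeAggregation" (here: average) over
# the 5-minute timeWindow and compares the result to the threshold.
window_value = mean(emitted)
print(window_value)  # 90.0
```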
Hopefully this helps! It's a bit confusing, and the info is spread across different documentation pages, so I put together a quickstart blog on the basics of autoscaling scale sets here: https://negatblog.wordpress.com/2018/07/06/autoscaling-scale-sets-based-on-metrics/. Hopefully it's useful :)
Cheers, Neil
The link below is to the Microsoft documentation on autoscale best practices: https://docs.microsoft.com/en-us/azure/monitoring-and-diagnostics/insights-autoscale-best-practices
However, I will attempt to answer your above questions also.
1) I believe you can set a scale-in option for when to decrease back down to a smaller number of instances. In previous versions, where the scale-in option wasn't present, I believe it scaled back to the initial number of instances once the threshold was no longer being triggered, i.e. below an AppxMsgCount of 15.
2) Not quite sure on this one; in my experience I have always used a Total metric rather than Average. Total sums all the samples in the time window, whereas Average smooths them out, so I think Average is best used when the threshold describes a sustained level over a specific time frame.
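As a rough illustration of the difference between the two aggregations (made-up queue-depth samples, one per minute over a 5-minute window):

```python
from statistics import mean

# Hypothetical AppxMsgCount samples, one per minute over a 5-minute window.
samples = [10, 20, 30, 40, 50]

# "Average" compares the mean of the window to the threshold: useful when
# the threshold describes a sustained level (e.g. ">= 15 messages on average").
print(mean(samples))  # 30

# "Total" compares the sum of the window to the threshold: the same
# threshold of 15 would fire far more easily here.
print(sum(samples))  # 150
```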
Hopefully this helps somewhat