We are using Application Insights and have added a couple of web tests to monitor our site. Each web test runs from three locations every 5 minutes, and all three locations need to fail within 5 minutes for the alert to go off.
Is there a report in Application Insights that we could use to report availability for the previous month to our customer? We would need the availability percentage with at least one decimal place.
UPDATE: Based on the answer by @ZakiMa I ended up with the following query:
let lastmonthstart = startofmonth(now(), -1);
let lastmonthend = endofmonth(lastmonthstart);
availabilityResults
| where timestamp between(lastmonthstart .. lastmonthend)
| summarize failurecount=countif(success == 0), successcount=countif(success == 1) by name, bin(timestamp, 5m)
| project failure = iff(failurecount > 0 and successcount == 0, 1, 0), name, bin(timestamp, 5m)
| summarize totalFailures = sum(failure), totalTests = count(failure) by name
| project ["Name"] = name, ["SLA"] = todouble(totalTests - totalFailures) / todouble(totalTests) * 100
| order by ["SLA"]
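A quick way to sanity-check the reporting window used above is to print the two bounds on their own. This is only a sketch: the column names `windowStart` and `windowEnd` are made up for illustration, and the values depend on when the query is run.

```kusto
// Same window definition as in the query above
let lastmonthstart = startofmonth(now(), -1);
let lastmonthend = endofmonth(lastmonthstart);
// windowStart / windowEnd are illustrative column names, not part of any schema
print windowStart = lastmonthstart, windowEnd = lastmonthend
```

If the two values are the first and last instant of the previous calendar month, the `between` filter covers exactly the month being reported.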
You can calculate SLA using Application Insights Analytics. Click the "Analytics" menu item on your overview page and use the following query:
availabilityResults
| where timestamp > ago(30d)
| summarize _successCount=todouble(countif(success == 1)),
    _errorCount=todouble(countif(success == 0)),
    _totalCount=todouble(count()) by name
| project
    ["Name"] = name,
    ["SLA"] = _successCount / _totalCount * 100
| order by ["SLA"]
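Since the question asks for at least one decimal, the percentage can also be rounded explicitly. The following is a minimal sketch under the same assumptions as the query above (same `availabilityResults` columns). The precision of 2 is an arbitrary choice, and `_errorCount` is left out because it is not needed for the result.

```kusto
availabilityResults
| where timestamp > ago(30d)
| summarize _successCount = countif(success == 1), _totalCount = count() by name
// Multiplying by 100.0 forces floating-point division; round() keeps two decimals
| project ["Name"] = name, ["SLA"] = round(100.0 * _successCount / _totalCount, 2)
| order by ["SLA"]
```

Rounding only changes how the value is displayed. For reporting against a contractual threshold it may be safer to keep the unrounded value and round at presentation time.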
You should get a table with one row per web test, showing its name and SLA percentage.
You can then pin it to your dashboard (there is a pin icon in the top right corner).
This is just an example, and you can adjust it to your own definition of SLA. The full Analytics query language reference, which is quite powerful, is here: https://learn.microsoft.com/en-us/azure/application-insights/app-insights-analytics-reference
EDIT: Here is an example of a query that is closer to your question. It counts a 5-minute bin as failed only when all three locations failed within it:
availabilityResults
| where timestamp > ago(30d)
// check whether location failed within 5m bin
| summarize _failure=iff(countif(success == 0)>0, 1, 0) by name, location, bin(timestamp, 5m)
// check whether all locations failed within 5m bin
| summarize _failureAll=iff(sum(_failure)>=3, 1, 0) by name, bin(timestamp, 5m)
// count all failed 5 minute bins and total number of bins
| summarize _failuresCount=sum(_failureAll), _totalCount=count() by name
| project ["Name"] = name,
    ["SLA"] = todouble(_totalCount - _failuresCount) / todouble(_totalCount) * 100
| order by ["SLA"]
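To see how the three-location rule behaves, the same pipeline can be run against a small inline table instead of real telemetry. Everything in the `datatable` below is hypothetical sample data (test name `home`, locations `A`, `B`, `C`, and `success` declared as an integer for simplicity). Only the summarize steps come from the query above.

```kusto
// Hypothetical sample: one test, three locations, two 5-minute bins
datatable(timestamp: datetime, name: string, location: string, success: int)
[
    // Bin 00:00 - all three locations fail, so the bin counts as a failure
    datetime(2024-01-01 00:00:10), "home", "A", 0,
    datetime(2024-01-01 00:01:10), "home", "B", 0,
    datetime(2024-01-01 00:02:10), "home", "C", 0,
    // Bin 00:05 - only one location fails, so the bin counts as available
    datetime(2024-01-01 00:05:10), "home", "A", 0,
    datetime(2024-01-01 00:06:10), "home", "B", 1,
    datetime(2024-01-01 00:07:10), "home", "C", 1
]
// check whether location failed within 5m bin
| summarize _failure = iff(countif(success == 0) > 0, 1, 0) by name, location, bin(timestamp, 5m)
// check whether all locations failed within 5m bin
| summarize _failureAll = iff(sum(_failure) >= 3, 1, 0) by name, bin(timestamp, 5m)
// count all failed 5 minute bins and total number of bins
| summarize _failuresCount = sum(_failureAll), _totalCount = count() by name
| project ["Name"] = name,
    ["SLA"] = todouble(_totalCount - _failuresCount) / todouble(_totalCount) * 100
```

With this sample, one of the two bins is a full failure, so the single row for `home` shows an SLA of 50. Note that the threshold of 3 is hard-coded to the number of locations in the question. If the tests run from a different number of locations, that value has to change accordingly.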