I'm monitoring several containers using Prometheus, cAdvisor and Prometheus Alertmanager. What I want is to get an alert if a container goes down for some reason. The problem is that when a container dies, cAdvisor stops collecting metrics for it, so any query for that container returns 'no data' since there are no matching series.
Take a look at the Prometheus function absent().
absent(v instant-vector) returns an empty vector if the vector passed to it has any elements and a 1-element vector with the value 1 if the vector passed to it has no elements.
This is useful for alerting on when no time series exist for a given metric name and label combination.
Examples:
absent(nonexistent{job="myjob"}) => {job="myjob"}
absent(nonexistent{job="myjob",instance=~".*"}) => {job="myjob"}
absent(sum(nonexistent{job="myjob"})) => {}
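For the cAdvisor case in the question, you can point absent() at any per-container series. A minimal sketch, assuming cAdvisor exposes your container under the container_last_seen metric with a name label (swap in whatever metric and labels your setup actually produces):

absent(container_last_seen{name="kibana"})

This returns a 1-element vector with value 1 as soon as that series disappears, which is exactly what you want to alert on.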
Here is an example of an alert rule (Prometheus 1.x rule syntax):
ALERT kibana_absent
  IF absent(container_cpu_usage_seconds_total{com_docker_compose_service="kibana"})
  FOR 5s
  LABELS {
    severity = "page"
  }
  ANNOTATIONS {
    summary = "Instance {{ $labels.instance }} down",
    description = "Service/job {{ $labels.job }} on instance {{ $labels.instance }} has been down for more than 5 seconds."
  }
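Note that the rule above uses the old Prometheus 1.x rule syntax. If you are on Prometheus 2.x, the same alert goes into a YAML rules file instead; a rough equivalent sketch (the group name is just a placeholder, adjust labels and thresholds to your setup):

groups:
  - name: container_alerts
    rules:
      - alert: kibana_absent
        expr: absent(container_cpu_usage_seconds_total{com_docker_compose_service="kibana"})
        for: 5s
        labels:
          severity: page
        annotations:
          summary: "Instance {{ $labels.instance }} down"
          description: "Service/job {{ $labels.job }} on instance {{ $labels.instance }} has been missing for more than 5 seconds."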
I use a small tool called Docker Event Monitor that runs as a container on the Docker host and sends out alerts to Slack, Discord or SparkPost if certain events are triggered. You can configure which events trigger alerts.
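If you just want to see these container events on the host ad hoc, the plain Docker CLI can also stream them; for example (standard docker events filters, adjust to the events you care about):

docker events --filter 'type=container' --filter 'event=die'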