I want to check whether a certain metric has been unavailable in Prometheus for 5 minutes.
I am using absent(K_KA_GCPP) with a 5-minute threshold, but it seems I cannot group the absent function by certain labels such as Site Id.
absent works only if the metric is missing for all 4 Site Ids. I want to detect when the metric is missing for just 1 of the 4 Site Ids, and I don't want to hardcode the Site Id label values in the query; it should be generic. Is there any way I can do that?
I was able to achieve this by doing something like this:
count(up{job="prometheus"} offset 1h) by (project) unless count(up{job="prometheus"} ) by (project)
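This returns every project that had the metric 1 hour ago but does not have it now, so an alert on this expression triggers when a metric that existed an hour ago has gone missing.
You can add any labels you need in the by clause (that's helpful in alerting, for example).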
Source: Prometheus Alert for missing metrics and labels
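Applied to the metric from the question, the same pattern might look like the sketch below. The label name site_id is an assumption, since the question only says "Site Id"; substitute whatever label your series actually carry.

```promql
# Site Ids that had K_KA_GCPP one hour ago...
count by (site_id) (K_KA_GCPP offset 1h)
unless
# ...minus the Site Ids that still have it now.
count by (site_id) (K_KA_GCPP)
```

Because both sides are aggregated by the same label, unless can match them on that label alone and keeps only the Site Ids missing from the right-hand side.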
The offset is, I feel, a great starting point, but it has a big weakness: if there is no sample at the offset time (now minus the offset), the query doesn't return what you'd like it to.
I reworked the answer from Ahmed to this:
group(present_over_time(myMetric{label1="asd"}[3h])) by (labels) unless group(myMetric{label1="asd"}) by (labels)
present_over_time should fix the aforementioned problem.
group() aggregation, since you don't need the value.
up{} is the state of the scraped target, not the "metric is present" information, and I feel the two might not be equivalent.
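For the original question, this reworked form could be sketched as follows. The site_id label name and the 1h window are assumptions for illustration; the window only needs to be long enough to cover the period in which a Site Id was last seen.

```promql
# Site Ids that reported K_KA_GCPP at any point in the last hour...
group by (site_id) (present_over_time(K_KA_GCPP[1h]))
unless
# ...minus the Site Ids that are reporting it right now.
group by (site_id) (K_KA_GCPP)
```

To get the 5-minute threshold from the question, one option is to use this expression in an alerting rule with for: 5m, so the alert fires only after a Site Id has stayed in the result for 5 minutes. Note that a Site Id drops out of the result again once its last sample is older than the range window.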