I have run into an issue with a Prometheus memory alert. When I take a GitLab backup, memory usage goes up to 95%. I want to snooze the memory alert for a specific time window.
For example, if I take a backup at 2 AM, I need to snooze the Prometheus memory alert during that time. Is that possible?
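As Marcelo said, there is no way to schedule a silence, but if the backup runs at a regular interval (say every night from 2am to 3am), you can include that in the alert expression. Here, hour() >= 2 <= 3 matches UTC hours 2 and 3 (02:00 to 03:59), and absent(...) only returns a value outside that window, so the alert cannot fire during it:
- alert: OutOfMemory
  expr: node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes * 100 < 10 AND ON() absent(hour() >= 2 <= 3)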
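An equivalent way to write the same exclusion, given here only as a sketch, is `unless on()`, which drops every result on the left-hand side whenever the right-hand side returns anything. The rule is otherwise unchanged:

```yaml
- alert: OutOfMemory
  # Fires when less than 10% of memory is available,
  # except during UTC hours 2 and 3 (02:00 to 03:59).
  expr: |
    node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes * 100 < 10
    unless on() (hour() >= 2 <= 3)
```

Some people find this form easier to read than `and on() absent(...)`, but the two should behave the same.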
This can rapidly become tedious if you want to silence many rules (or if you want more complex inhibition schedules). In that case, you can use Alertmanager inhibition rules in the following way.
The first step is to define an alert in Prometheus that fires during the time you want the inhibition to take place:
- alert: BackupHours
  expr: hour() >= 2 <= 3
  for: 1m
  labels:
    notification: none
  annotations:
    description: 'This alert fires during backup hours to inhibit others'
Remember to add a route in Alertmanager so that this alert itself does not send any notification:
routes:
- match:
    notification: none
  receiver: do_nothing
receivers:
- name: do_nothing
Then use inhibition rules to silence the target alerts during that time:
inhibit_rules:
- source_match:
    alertname: BackupHours
  target_match:
    # here can be any other selection of alert
    alertname: OutOfMemory
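Newer Alertmanager releases (0.22 and later, as far as I know) deprecate `source_match` and `target_match` in favor of `source_matchers` and `target_matchers`. Assuming such a version, the same inhibition rule might look like this:

```yaml
inhibit_rules:
  - source_matchers:
      - alertname = BackupHours
    target_matchers:
      # any other selection of alerts can go here
      - alertname = OutOfMemory
```

Check the documentation for the version you run before switching syntax.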
Note that this only works out of the box for UTC, because hour() is evaluated in UTC. If you need to follow DST, it requires more boilerplate (with recording rules, for example).
As a side note, if you are monitoring your backup process, you may already have a metric that indicates the backup is under way. If so, you could use this metric to inhibit the other alerts, and you wouldn't need to maintain a schedule.
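As a rough idea of what that boilerplate could look like, here is an untested sketch that assumes Central European Time (UTC+1, or UTC+2 in summer). The group and rule names (`local_time`, `is_european_summer_time`, `local_hour`) are made up for illustration. The switch is approximated at day granularity, whereas the real change happens at 01:00 UTC on the last Sunday of March and of October.

```yaml
groups:
  - name: local_time  # hypothetical group name
    rules:
      # 1 while European summer time is (approximately) in effect, 0 otherwise.
      # day_of_month() - day_of_week() is the date of the most recent Sunday;
      # a value >= 25 means the last Sunday of the month has been reached.
      - record: is_european_summer_time
        expr: |
          (vector(1) and (month() > 3 and month() < 10))
          or
          (vector(1) and (month() == 3 and (day_of_month() - day_of_week()) >= 25))
          or
          (vector(1) and (month() == 10 and (day_of_month() - day_of_week()) < 25))
          or
          vector(0)
      # Local hour: UTC+1 base offset (an assumption), plus one hour in summer.
      - record: local_hour
        expr: (hour() + 1 + on() is_european_summer_time) % 24
      # Same inhibiting alert as before, but driven by the local hour.
      - alert: BackupHours
        expr: local_hour >= 2 <= 3
        for: 1m
        labels:
          notification: none
```

For another time zone, the base offset and the switch dates would have to be adapted.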
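A minimal sketch of that idea, assuming your backup job exposes a gauge that is 1 while a backup runs. The metric name `backup_in_progress` is hypothetical; substitute whatever your own monitoring provides:

```yaml
# "backup_in_progress" is a hypothetical gauge, not a metric GitLab is known to export.
- alert: BackupInProgress
  expr: backup_in_progress == 1
  labels:
    notification: none
  annotations:
    description: 'This alert fires while a backup is running to inhibit others'
```

The inhibition rule would then name BackupInProgress as its source alert instead of a schedule-based one.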