My scrape interval and evaluation interval are far apart, as shown below (15s vs. 4m). When I feed metrics to the endpoint, I find that the rules are evaluated every 4 minutes, which is expected. What I don't understand is that Prometheus does not evaluate the rules against all the metrics fed in during the last 4 minutes. I am having a hard time understanding how the two clocks (scrape and evaluation) work together, and the documentation on this is very sparse. Any pointers would be a great help. I have no hesitation in changing the scrape interval and evaluation interval to, say, 15 seconds each, but I need to understand the ramifications of setting the two clocks apart.
# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 4m # Evaluate rules every 4 minutes. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - testmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  - "/etc/prometheus/xyz_rule.yml"
  - "/etc/prometheus/pqr_rule.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'
    # metrics_path defaults to '/metrics'
    metrics_path: /v1/metrics/xyz
    # scheme defaults to 'http'.
    static_configs:
      - targets: ['test:7070']
In general, it's not advisable to use scrape intervals of more than 2 minutes in Prometheus. This is due to the default staleness period of 5 minutes: a scrape interval of 2 minutes still allows for one failed scrape without the metrics being treated as stale.
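The arithmetic behind that 2-minute rule of thumb can be sketched in a few lines. This is a simplified model, not how Prometheus is implemented: it treats the 5-minute staleness period as a plain lookback window and ignores explicit staleness markers, and the function name `tolerated_failed_scrapes` is made up for illustration.

```python
STALENESS_PERIOD_S = 300  # default staleness period of 5 minutes


def tolerated_failed_scrapes(scrape_interval_s, staleness_s=STALENESS_PERIOD_S):
    """Consecutive failed scrapes that still leave a sample inside the window.

    With k failed scrapes in a row, two successful samples are
    (k + 1) * scrape_interval_s apart. That gap has to fit inside the
    staleness period, so k = staleness // interval - 1.
    A negative result means there are gaps even when every scrape succeeds.
    """
    return staleness_s // scrape_interval_s - 1


for interval_s in (15, 60, 120, 240):
    print(f"scrape_interval={interval_s}s -> "
          f"tolerates {tolerated_failed_scrapes(interval_s)} failed scrape(s)")
```

Under this model a 2-minute interval tolerates exactly one missed scrape, while a 15-second interval tolerates many more.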
In this case the global setting is to scrape every 15 seconds. The evaluation_interval option controls how often Prometheus will evaluate rules. Prometheus uses rules to create new time series and to generate alerts. The rule_files block specifies the location of any rules we want the Prometheus server to load.
Open up your Prometheus config and check the scrape_interval setting. We recommend sticking with the Prometheus default of 60s (a DPM, or data points per minute, of 1) and adjusting per-job scrape intervals as needed.
Prometheus is configured to scrape metrics every 20 seconds, and the evaluation interval is 1 minute.
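The two processes are independent: PromQL and recording rules have no knowledge of what your scrape interval is. Each rule is evaluated as a query at the evaluation time, so a plain selector sees only the most recent sample of each series (provided it is not stale), not every sample scraped since the previous evaluation. If a rule needs to take all samples from the last 4 minutes into account, the expression itself has to use a range selector such as [4m] together with an aggregation over time. Whatever rule you specify will therefore evaluate in the same way, with the same result, when evaluated at a given time, no matter what the evaluation interval is.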
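To make the independence of the two clocks concrete, here is a small simulation. It is only a sketch of the behaviour described above, not Prometheus code: the helper names (`scrape`, `instant_value`, `range_values`), the synthetic sample values, and the left-open window boundary are all assumptions made for illustration.

```python
from bisect import bisect_right

SCRAPE_INTERVAL_S = 15   # scrape_interval: 15s
EVAL_INTERVAL_S = 240    # evaluation_interval: 4m
STALENESS_S = 300        # default staleness period (5 minutes)


def scrape(duration_s, interval_s=SCRAPE_INTERVAL_S):
    """Return synthetic (timestamp, value) samples, one per scrape."""
    return [(t, t // interval_s) for t in range(0, duration_s + 1, interval_s)]


def instant_value(samples, eval_time, staleness_s=STALENESS_S):
    """Model a plain selector: the latest sample at or before eval_time."""
    timestamps = [t for t, _ in samples]
    idx = bisect_right(timestamps, eval_time) - 1
    if idx < 0 or eval_time - samples[idx][0] > staleness_s:
        return None  # nothing recent enough: the series counts as stale
    return samples[idx][1]


def range_values(samples, eval_time, window_s):
    """Model a range selector: all samples in (eval_time - window_s, eval_time]."""
    return [v for t, v in samples if eval_time - window_s < t <= eval_time]


samples = scrape(480)  # 8 minutes of scrapes at 15s
for eval_time in range(EVAL_INTERVAL_S, 481, EVAL_INTERVAL_S):
    latest = instant_value(samples, eval_time)
    window = range_values(samples, eval_time, EVAL_INTERVAL_S)
    print(f"t={eval_time}s: plain selector sees 1 sample (value {latest}), "
          f"4m range sees {len(window)} samples (max {max(window)})")
```

In this model every evaluation of a plain selector touches one sample per series, while a 4-minute range covers all 16 scrapes that happened since the previous evaluation. That is why a rule that must react to any value seen during the gap needs a range selector in its expression.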
For simplicity and sanity it's best to have the two intervals the same, so I'd suggest having both as 15s here.