
Prometheus cAdvisor docker monitoring

I've set up a Docker monitoring stack using Prometheus, Grafana and cAdvisor. I'm using this query to count running containers:

count_scalar(container_last_seen{name=~"container1|container2"})

It picks up the containers all right: as soon as I launch a new container, it is counted right away. The problem is that when a container is stopped or removed, the query does not notice; it still counts it as a running container.

On the cAdvisor /metrics endpoint, the container is removed as soon as it stops.

Is there something wrong with the query?

(This is what I used for the stack: https://github.com/vegasbrianc/prometheus)

asked Jun 06 '17 09:06 by A.Jac



1 Answer

It seems to be related to the amount of time cAdvisor stores the data in memory.

As long as cAdvisor keeps the data in memory, the container_last_seen metric still carries a valid timestamp, so count_scalar still 'sees' the container.

In my test setup, cAdvisor keeps the data for 5 minutes. After that, your query returns the right count because the container_last_seen metric has disappeared.

You can change this retention period with cAdvisor's --storage_duration flag:

--storage_duration=2m0s: How long to store data.

As an alternative, if you want quick alerting, you could run a query that compares the last-seen timestamp with the current time:

count_scalar(time()-container_last_seen{name=~"container1|container2"}<=60)
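To make the comparison in that query concrete, here is a minimal Python sketch of the same idea, not of how Prometheus evaluates it internally. The function name `count_recently_seen` and the sample timestamps are hypothetical; the 60-second threshold is the one used in the query.

```python
import time


def count_recently_seen(last_seen, now=None, max_age=60):
    """Count containers whose last-seen timestamp is at most max_age seconds old.

    last_seen maps a container name to its container_last_seen value
    (Unix seconds). This mirrors `time() - container_last_seen <= 60`.
    """
    if now is None:
        now = time.time()
    return sum(1 for ts in last_seen.values() if now - ts <= max_age)


# Hypothetical sample data, fixed so the result is reproducible.
now = 1_496_740_000
samples = {
    "container1": now - 5,    # still running, seen 5 s ago
    "container2": now - 180,  # stopped 3 min ago, still exported by cAdvisor
}

running = count_recently_seen(samples, now=now)
print(running)  # 1: only container1 was seen within the last 60 s
```

The stopped container is still present in the data, just as it is while cAdvisor keeps it in memory, but it no longer counts because its timestamp is too old.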
answered Sep 30 '22 12:09 by François Maturel