
How do I get a pod's (milli)core CPU usage with Prometheus in Kubernetes?

I run a v1.9.2 custom setup of Kubernetes and scrape various metrics with Prometheus v2.1.0. Among others, I scrape the kubelet and cAdvisor metrics.

I want to answer the question: "How much of the CPU resources defined by requests and limits in my deployment are actually used by a pod (and its containers) in terms of (milli)cores?"

There are a lot of scraped metrics available, but none that reports this directly. Maybe it could be calculated from the CPU usage time in seconds, but I don't know how.

I assumed it wasn't possible, until a friend told me she runs Heapster in her cluster, which has a graph in the built-in Grafana that shows exactly that: the individual CPU usage of a pod and its containers in (milli)cores.

Since Heapster also uses kubelet and cAdvisor metrics, I wonder: how can I calculate the same? The metric in InfluxDB is named cpu/usage_rate but even with Heapster's code, I couldn't figure out how they calculate it.

Any help is appreciated, thanks!

Alex asked Feb 19 '18 18:02



2 Answers

We're using the container_cpu_usage_seconds_total metric to calculate Pod CPU usage. This metric contains the cumulative CPU time, in seconds, consumed per container and per core (this is important, as a Pod may consist of multiple containers, each of which can be scheduled across multiple cores; however, the metric has a pod_name label that we can use for aggregation). Of special interest is the rate of change of that metric, which can be calculated with PromQL's rate() function. If the counter increases by 1 within one second, the Pod consumes 1 CPU core (or 1000 millicores) in that second.

The following PromQL query does just that: it computes the CPU usage of each Pod, in cores, averaged over five minutes (the sum(...) by (pod_name) aggregation adds up all containers and cores belonging to a Pod):

sum(rate(container_cpu_usage_seconds_total[5m])) by (pod_name)
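To make the arithmetic behind rate() concrete, here is a minimal Python sketch of the core idea: the increase of a CPU-seconds counter divided by the elapsed wall-clock time gives the average number of cores in use. The function names and the sample values are hypothetical, and this is not how Prometheus implements rate(), which also handles counter resets and extrapolates to the window boundaries.

```python
def cpu_cores(t0, v0, t1, v1):
    """Average CPU cores used between two samples of a CPU-seconds counter.

    t0, t1: sample timestamps in seconds
    v0, v1: counter values (cumulative CPU seconds) at those timestamps
    Simplification: assumes the counter did not reset between the samples.
    """
    if t1 <= t0:
        raise ValueError("t1 must be later than t0")
    return (v1 - v0) / (t1 - t0)


def to_millicores(cores):
    """Convert cores to millicores (1 core = 1000 millicores)."""
    return cores * 1000.0


# Hypothetical samples: the counter grew by 75 CPU seconds over a 300 s window.
cores = cpu_cores(0.0, 1200.0, 300.0, 1275.0)
millicores = to_millicores(cores)
```

With these made-up numbers the container averaged a quarter of a core over the window. In PromQL the same unit conversion is a multiplication of the query result by 1000.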
helmbert answered Nov 19 '22 21:11


The following PromQL query returns the number of CPU cores used per pod on Kubernetes v1.16 and newer:

sum(rate(container_cpu_usage_seconds_total{container!=""}[5m])) by (pod)

The {container!=""} filter is needed to exclude the hierarchical cgroup stats, which are already included in the per-container stats. See this answer for more details.
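The double counting that the filter avoids can be illustrated with a small Python sketch. The series below are hypothetical per-series CPU rates, and the assumption (labeled in the code) is that the series with an empty container label carries the parent cgroup total for the pod, as described above.

```python
from collections import defaultdict

# Hypothetical per-series CPU rates in cores, as rate() might return them.
# Assumption: the series with container="" is the parent cgroup total,
# which already includes the usage of the individual containers.
SERIES = [
    {"pod": "web-1", "container": "app", "cores": 0.5},
    {"pod": "web-1", "container": "sidecar", "cores": 0.25},
    {"pod": "web-1", "container": "", "cores": 0.75},
]


def sum_by_pod(series, skip_empty_container=True):
    """Sum per-series rates by pod, optionally mimicking {container!=""}."""
    totals = defaultdict(float)
    for s in series:
        if skip_empty_container and s["container"] == "":
            continue  # drop the aggregate series to avoid double counting
        totals[s["pod"]] += s["cores"]
    return dict(totals)


filtered = sum_by_pod(SERIES)
unfiltered = sum_by_pod(SERIES, skip_empty_container=False)
```

In this made-up data set, summing without the filter reports twice the pod's real usage, because the aggregate series is added on top of the containers it already covers.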

The following PromQL query must be used for Kubernetes versions below v1.16, because they use different label names (container_name instead of container, and pod_name instead of pod; see this issue for details):

sum(rate(container_cpu_usage_seconds_total{container_name!=""}[5m])) by (pod_name)
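If the same dashboard or script has to work against clusters on both sides of the label rename, the choice between the two queries can be made programmatically. The following is a minimal sketch; pod_cpu_query is a hypothetical helper, and it assumes a Kubernetes 1.x cluster whose minor version you already know.

```python
def pod_cpu_query(k8s_minor, window="5m"):
    """Build the per-pod CPU usage query for a Kubernetes 1.<k8s_minor> cluster.

    Label names follow the two queries quoted above:
    v1.16 and newer use container/pod, older versions use
    container_name/pod_name.
    """
    if k8s_minor >= 16:
        container_label, pod_label = "container", "pod"
    else:
        container_label, pod_label = "container_name", "pod_name"
    return (
        f'sum(rate(container_cpu_usage_seconds_total{{{container_label}!=""}}[{window}]))'
        f" by ({pod_label})"
    )


new_query = pod_cpu_query(16)
old_query = pod_cpu_query(9)
```

The resulting strings match the two queries in this answer and can be passed to whatever client you use to talk to Prometheus.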
valyala answered Nov 19 '22 21:11