 

What does the container_cpu_cfs_throttled_seconds_total metric mean?

cAdvisor exposes two metrics, container_cpu_cfs_throttled_seconds_total and container_cpu_cfs_throttled_periods_total.

I am confused about what they mean.

I have found two explanations:

  1. A container runs with a CPU limit. When its CPU usage goes over the limit, the container is "throttled" and the throttled time is added to container_cpu_cfs_throttled_seconds_total.

    That means:
     (1). rate(container_cpu_cfs_throttled_seconds_total) > 0 only when the container's CPU usage goes over its limit.
     (2). We can use this metric to alert when a container hits its CPU limit.

  2. When the host is under heavy CPU pressure, it throttles containers according to Pod QoS (Guaranteed > Burstable > Best-Effort).

    That means:
     (1). Increases in container_cpu_cfs_throttled_seconds_total are unrelated to how much CPU the container uses or to its CPU limit.
     (2). This metric cannot be used to alert when a container hits its CPU limit.
    
gpl asked Nov 13 '18 08:11




2 Answers

container_cpu_cfs_throttled_seconds_total is the sum of all throttle durations, i.e. the time during which the container was throttled (stopped from running) by CFS cgroup bandwidth control.

Since each stopped thread adds its own throttled duration to container_cpu_cfs_throttled_seconds_total, this number can become huge and is hard to interpret (unless you have a known, fixed number of threads).

That is why alerting on CPU throttling is usually based on the throttled percentage := container_cpu_cfs_throttled_periods_total / container_cpu_cfs_periods_total, i.e. the fraction of CFS periods in which the container ran but was throttled (stopped before the end of the period).
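As a rough illustration of that ratio, here is a minimal Python sketch that computes it from two successive samples of the cumulative counters. The `CfsSample` type, the function names, and the 0.25 threshold are hypothetical and only serve the example; in practice the same division is usually written as a query over rates of the two counters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CfsSample:
    """One scrape of the two cumulative counters for a single container."""
    periods_total: float            # container_cpu_cfs_periods_total
    throttled_periods_total: float  # container_cpu_cfs_throttled_periods_total


def throttled_ratio(prev: CfsSample, curr: CfsSample) -> float:
    """Fraction of CFS periods between two samples in which the container was throttled."""
    elapsed = curr.periods_total - prev.periods_total
    throttled = curr.throttled_periods_total - prev.throttled_periods_total
    # No elapsed periods, or a counter reset (e.g. container restart): report 0.
    if elapsed <= 0 or throttled < 0:
        return 0.0
    return throttled / elapsed


def should_alert(prev: CfsSample, curr: CfsSample, threshold: float = 0.25) -> bool:
    """Hypothetical alert rule: fire when the ratio exceeds the threshold."""
    return throttled_ratio(prev, curr) > threshold


if __name__ == "__main__":
    before = CfsSample(periods_total=1000, throttled_periods_total=100)
    after = CfsSample(periods_total=1300, throttled_periods_total=250)
    print(throttled_ratio(before, after))  # 150 of 300 periods were throttled
    print(should_alert(before, after))
```

Working on deltas rather than on the raw totals matters because both metrics are counters that only ever grow, so the raw quotient describes the whole lifetime of the container rather than its recent behaviour.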

For more detail, you can watch this talk on CFS and CPU scheduling, or read the corresponding article.

DaveFar answered Sep 19 '22 16:09


Let's say an httpbin container is running on machine1, that httpbin has a limit set in its deployment to use a maximum of 1 CPU, and that machine1 has 2 CPUs. This means httpbin can use at most half of the available CPU.

If the httpbin container tries to use more than 1 CPU, Kubernetes will not kill the container; the container is throttled instead. If this happens frequently, you may want to be alerted and fix the deployment. Another scenario is several containers on machine1 competing for scarce CPU: the scheduler then divides CPU time among them according to their CPU shares, but that contention alone does not increase the throttling counters, which only grow when a container with a limit exhausts its quota.
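To make the arithmetic behind a 1 CPU limit concrete, here is a small Python sketch. It assumes the common default CFS period of 100 ms, so a limit of 1 CPU becomes a quota of 100 ms of CPU time per period, and it assumes a simplified model in which every busy thread is runnable the whole time on its own core. The function name and parameters are hypothetical; the real scheduler hands out quota in slices, so treat the result as an approximation.

```python
def throttled_wall_ms(limit_cpus: float, busy_threads: int, period_ms: float = 100.0) -> float:
    """Approximate wall-clock time per CFS period during which the container is throttled.

    Simplified model: each busy thread burns CPU continuously on its own core
    until the shared quota for the period is used up.
    """
    if busy_threads <= 0:
        return 0.0
    quota_ms = limit_cpus * period_ms            # CPU time allowed per period
    exhausted_after_ms = quota_ms / busy_threads  # wall time until the quota is gone
    return max(0.0, period_ms - exhausted_after_ms)


if __name__ == "__main__":
    # httpbin limited to 1 CPU on a 2-CPU machine:
    print(throttled_wall_ms(limit_cpus=1, busy_threads=1))  # one busy thread never exceeds the quota
    print(throttled_wall_ms(limit_cpus=1, busy_threads=2))  # two busy threads use it up halfway through
```

Under this model a single busy thread is never throttled with a 1 CPU limit, while two busy threads exhaust the quota after about 50 ms and sit idle for the rest of each period.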

container_cpu_cfs_throttled_seconds_total is the total time, in seconds, that the container has been throttled. container_cpu_cfs_throttled_periods_total is the number of period intervals in which it was throttled.
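These counters come from the kernel's per-cgroup `cpu.stat` file. The Python sketch below shows how that file's fields line up with the metric names, assuming the cgroup v1 layout (`nr_periods`, `nr_throttled`, and `throttled_time` in nanoseconds). The function name and the sample values are made up for illustration, and this is not cAdvisor's actual code.

```python
def parse_cpu_stat_v1(text: str) -> dict:
    """Map cgroup v1 cpu.stat fields onto the cAdvisor-style metric names."""
    fields = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:  # expected form: "<name> <integer>"
            fields[parts[0]] = int(parts[1])
    return {
        "container_cpu_cfs_periods_total": fields.get("nr_periods", 0),
        "container_cpu_cfs_throttled_periods_total": fields.get("nr_throttled", 0),
        # throttled_time is reported in nanoseconds under cgroup v1.
        "container_cpu_cfs_throttled_seconds_total": fields.get("throttled_time", 0) / 1e9,
    }


if __name__ == "__main__":
    # Invented sample content, not taken from a real host.
    sample = "nr_periods 1200\nnr_throttled 300\nthrottled_time 45000000000\n"
    for name, value in parse_cpu_stat_v1(sample).items():
        print(name, value)
```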

ffran09 answered Sep 22 '22 16:09