I need to monitor the memory usage of my containers running on a Kubernetes cluster. After reading some articles, I found two recommended metrics: container_memory_rss and container_memory_working_set_bytes.
The definitions of both metrics are taken from the cAdvisor code. cAdvisor (Container Advisor) gives container users an understanding of the resource usage and performance characteristics of their running containers; it is a daemon that collects, aggregates, processes, and exports information about running containers.
One of the articles describes container_memory_rss (RSS) as, at a high level, memory usage not related to the file cache, and notes that RSS is used by the kernel for out-of-memory (OOM) scoring and for killing processes when memory hits the limit.
I think both metrics represent the number of bytes a process uses in physical memory, but the two values differ on my Grafana dashboard.
My questions are:
1. What is the difference between the two metrics?
2. Which metric is the proper one for monitoring memory usage?
You are right. I will try to address your questions in more detail.
What is the difference between the two metrics?
container_memory_rss equals the value of total_rss from the /sys/fs/cgroup/memory/memory.stat file:
// The amount of anonymous and swap cache memory (includes transparent
// hugepages).
// Units: Bytes.
RSS uint64 `json:"rss"`
It is the total amount of anonymous and swap cache memory (it includes transparent hugepages), and it equals the value of total_rss from the memory.stat file. This should not be confused with the true resident set size or the amount of physical memory used by the cgroup; rss + mapped_file will give you the resident set size of the cgroup. It does not include memory that is swapped out. It does include memory from shared libraries, as long as the pages from those libraries are actually in memory. It does include all stack and heap memory.
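To make that concrete, here is a minimal sketch (in Go, not the actual cAdvisor implementation) that reads total_rss and total_mapped_file straight from the cgroup v1 memory.stat file. The path assumes cgroup v1 and that the program runs inside the container, so adjust it for your environment:

package main

import (
    "bufio"
    "fmt"
    "log"
    "os"
    "strconv"
    "strings"
)

// readMemoryStat parses the "<key> <value>" lines of memory.stat into a map.
func readMemoryStat(path string) (map[string]uint64, error) {
    f, err := os.Open(path)
    if err != nil {
        return nil, err
    }
    defer f.Close()

    stats := make(map[string]uint64)
    s := bufio.NewScanner(f)
    for s.Scan() {
        fields := strings.Fields(s.Text())
        if len(fields) != 2 {
            continue
        }
        if v, err := strconv.ParseUint(fields[1], 10, 64); err == nil {
            stats[fields[0]] = v
        }
    }
    return stats, s.Err()
}

func main() {
    // cgroup v1 path; on a cgroup v2 host the layout and file names differ.
    stats, err := readMemoryStat("/sys/fs/cgroup/memory/memory.stat")
    if err != nil {
        log.Fatalf("read memory.stat: %v", err)
    }

    rss := stats["total_rss"]                // what container_memory_rss reports
    mappedFile := stats["total_mapped_file"] // file-backed mappings

    fmt.Printf("total_rss (container_memory_rss):    %d bytes\n", rss)
    fmt.Printf("total_rss + total_mapped_file (RSS): %d bytes\n", rss+mappedFile)
}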
container_memory_working_set_bytes (as already mentioned by Olesya) is the total usage minus inactive_file. It is an estimate of how much memory cannot be evicted:
// The amount of working set memory, this includes recently accessed memory,
// dirty memory, and kernel memory. Working set is <= "usage".
// Units: Bytes.
WorkingSet uint64 `json:"working_set"`
Working Set is the current size, in bytes, of the Working Set of this process. The Working Set is the set of memory pages touched recently by the threads in the process.
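As a rough illustration of that formula, here is a sketch (assuming cgroup v1 paths, not the cAdvisor code itself) that reproduces the working set as memory.usage_in_bytes minus total_inactive_file, clamped at zero to avoid underflow:

package main

import (
    "bufio"
    "fmt"
    "log"
    "os"
    "strconv"
    "strings"
)

// readUint parses a file containing a single integer, e.g. memory.usage_in_bytes.
func readUint(path string) uint64 {
    data, err := os.ReadFile(path)
    if err != nil {
        log.Fatalf("read %s: %v", path, err)
    }
    v, err := strconv.ParseUint(strings.TrimSpace(string(data)), 10, 64)
    if err != nil {
        log.Fatalf("parse %s: %v", path, err)
    }
    return v
}

// statValue extracts one "<key> <value>" entry from memory.stat.
func statValue(path, key string) uint64 {
    f, err := os.Open(path)
    if err != nil {
        log.Fatalf("open %s: %v", path, err)
    }
    defer f.Close()
    s := bufio.NewScanner(f)
    for s.Scan() {
        fields := strings.Fields(s.Text())
        if len(fields) == 2 && fields[0] == key {
            v, err := strconv.ParseUint(fields[1], 10, 64)
            if err != nil {
                log.Fatalf("parse %s: %v", key, err)
            }
            return v
        }
    }
    log.Fatalf("%s not found in %s", key, path)
    return 0
}

func main() {
    const base = "/sys/fs/cgroup/memory"
    usage := readUint(base + "/memory.usage_in_bytes")
    inactiveFile := statValue(base+"/memory.stat", "total_inactive_file")

    // working_set = usage - inactive_file, never below zero.
    workingSet := uint64(0)
    if usage > inactiveFile {
        workingSet = usage - inactiveFile
    }
    fmt.Printf("usage=%d inactive_file=%d working_set=%d bytes\n",
        usage, inactiveFile, workingSet)
}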
Which metric is the proper one for monitoring memory usage? Some posts say to watch both, because when either of them reaches the limit, the container is OOM-killed.
If you are limiting the resource usage of your pods, then you should monitor both, as reaching a particular resource limit will cause an OOM kill.
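For example, a small Go program can watch both metrics for a pod through the Prometheus HTTP API with the prometheus/client_golang client. The Prometheus address and the namespace/pod label values below are placeholders for your own environment:

package main

import (
    "context"
    "fmt"
    "log"
    "time"

    "github.com/prometheus/client_golang/api"
    v1 "github.com/prometheus/client_golang/api/prometheus/v1"
)

func main() {
    // Placeholder address; point this at your Prometheus server.
    client, err := api.NewClient(api.Config{Address: "http://prometheus.example:9090"})
    if err != nil {
        log.Fatalf("create client: %v", err)
    }
    promAPI := v1.NewAPI(client)

    ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
    defer cancel()

    // Watch both metrics recommended above; compare them against the
    // container's memory limit to see how close you are to an OOM kill.
    queries := []string{
        `container_memory_rss{namespace="default", pod="my-app-0"}`,
        `container_memory_working_set_bytes{namespace="default", pod="my-app-0"}`,
    }
    for _, q := range queries {
        result, warnings, err := promAPI.Query(ctx, q, time.Now())
        if err != nil {
            log.Fatalf("query %q: %v", q, err)
        }
        if len(warnings) > 0 {
            log.Printf("warnings: %v", warnings)
        }
        fmt.Printf("%s\n%v\n\n", q, result)
    }
}

Running the same queries in Grafana works just as well; the point is to alert on both series relative to the container's memory limit.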
I also recommend this article, which shows an example explaining the assertion below:
You might think that memory utilization is easily tracked with container_memory_usage_bytes; however, this metric also includes cached (think filesystem cache) items that can be evicted under memory pressure. The better metric is container_memory_working_set_bytes, as this is what the OOM killer is watching for.
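To see how much of container_memory_usage_bytes is actually evictable cache, a quick sketch (again assuming cgroup v1 paths) can print the usage figure next to the cache and inactive-file counters:

package main

import (
    "bufio"
    "fmt"
    "log"
    "os"
    "strconv"
    "strings"
)

func main() {
    // memory.usage_in_bytes is what container_memory_usage_bytes reports (cgroup v1).
    raw, err := os.ReadFile("/sys/fs/cgroup/memory/memory.usage_in_bytes")
    if err != nil {
        log.Fatalf("read usage: %v", err)
    }
    usage, err := strconv.ParseUint(strings.TrimSpace(string(raw)), 10, 64)
    if err != nil {
        log.Fatalf("parse usage: %v", err)
    }

    // Pull the cache counters from memory.stat.
    f, err := os.Open("/sys/fs/cgroup/memory/memory.stat")
    if err != nil {
        log.Fatalf("open memory.stat: %v", err)
    }
    defer f.Close()
    stats := make(map[string]uint64)
    s := bufio.NewScanner(f)
    for s.Scan() {
        fields := strings.Fields(s.Text())
        if len(fields) == 2 {
            if v, err := strconv.ParseUint(fields[1], 10, 64); err == nil {
                stats[fields[0]] = v
            }
        }
    }

    fmt.Printf("usage_in_bytes:      %d (container_memory_usage_bytes)\n", usage)
    fmt.Printf("total_cache:         %d (filesystem cache counted in usage)\n", stats["total_cache"])
    fmt.Printf("total_inactive_file: %d (evictable under memory pressure)\n", stats["total_inactive_file"])
}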
EDIT:
Adding some additional sources as a supplement:
A Deep Dive into Kubernetes Metrics — Part 3 Container Resource Metrics
#1744
Understanding Kubernetes Memory Metrics
Memory_working_set vs Memory_rss in Kubernetes, which one you should monitor?
Managing Resources for Containers
cAdvisor code