Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Docker stats with CPU percentage more than 100

I have a question about docker stats command if anyone can help me. I am new in Docker area and I want to monitor the cpu usage of a docker container.

The physical machine has 8 cores (CPU0...CPU7). I already create a container and limit its cpu resource to 1 core (CPU0) using the following command: docker run -itd --cpuset-cpus=0 -p 8081:8080 binfalse/bives-webapp

I stress the container by sending requests from Jmeter and then monitor the cpu usage of the container via docker stats command which gives me values greater than 100%.

I don't understand why it gives more than 100% even if only one core is allocated to the container!. Do you have any idea about the reason? Does this cpu value represents the cpu usage of some system processes in addition to the container?

Thanks in advance for your help.

docker version: Client: Version: 17.06.0-ce API version: 1.30 Go version: go1.8.3 Git commit: 02c1d87 Built: Fri Jun 23 21:23:31 2017 OS/Arch: linux/amd64

Server: Version: 17.06.0-ce API version: 1.30 (minimum version 1.12) Go version: go1.8.3 Git commit: 02c1d87 Built: Fri Jun 23 21:19:04 2017 OS/Arch: linux/amd64 Experimental: true

docker info result: Containers: 2 Running: 1 Paused: 0 Stopped: 1 Images: 10 Server Version: 17.06.0-ce Storage Driver: aufs Root Dir: /var/lib/docker/aufs Backing Filesystem: extfs Dirs: 141 Dirperm1 Supported: true Logging Driver: json-file Cgroup Driver: cgroupfs Plugins: Volume: local Network: bridge host ipvlan macvlan null overlay Log: awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog Swarm: inactive Runtimes: runc Default Runtime: runc Init Binary: docker-init containerd version: cfb82a876ecc11b5ca0977d1733adbe58599088a runc version: 2d41c047c83e09a6d61d464906feb2a2f3c52aa4 init version: 949e6fa Security Options: apparmor seccomp Profile: default Kernel Version: 4.4.0-98-generic Operating System: Ubuntu 16.04.2 LTS OSType: linux Architecture: x86_64 CPUs: 8 Total Memory: 15.56GiB Name: logti048131 ID: RHOG:IR6N:FVC4:YDI5:A6T4:QA4Y:DDYF:7HZN:AI3L:WVLE:BNHY:6YNV Docker Root Dir: /var/lib/docker Debug Mode (client): false Debug Mode (server): false Registry: https://index.docker.io/v1/ Experimental: true Insecure Registries: 127.0.0.0/8 Live Restore Enabled: false

WARNING: No swap limit support

like image 238
SMA Avatar asked Nov 20 '17 21:11

SMA


People also ask

How can CPU usage be more than 100?

If the CPU usage is around 100%, this means that your computer is trying to do more work than it has the capacity for. This is usually OK, but it means that programs may slow down a little. Computers tend to use close to 100% of the CPU when they are doing computationally-intensive things like running games.

What is CPU in Docker stats?

CPU is expressed as a percentage (%) of the overall host capacity. One can optimize the resource usage of Docker hosts by being aware of how much CPU the hosts and containers consume. One active/busy container shouldn't slow down other containers by consuming all of the CPU resources.

How much CPU can a docker container use?

On windows, a container defaults to using two CPUs. If hyperthreading is available this is one core and two logical processors. If hyperthreading is not available this is two cores and two logical processors.

Can I limit the file size CPU utilization for a docker in my machine?

Limit Docker Container CPU Usage You can also use the --cpu-shares option to give the container a greater or lesser proportion of CPU cycles. By default, this is set to 1024. To find more options for limiting container CPU usage, please refer to Docker's official documentation.


1 Answers

On Linux, cgroups and Docker CPU stats deal in "time slices" of CPU, the number of nanoseconds the CPU has been in use for. To get the percentage, the container cgroup value of "time used" is compared to the overall system value for "time available" from /proc/stat.

Due to the stored "time slice" values being cumulative, the current values are compared to the previous collected values to get a more instantaneous percentage. I think this comparison is the basis of the issue.

Stats Collection

The docker stats command actually does a lot of the leg work for this information in the client. The client queries for all containers, watches events for container start/stops and opens an individual stats stream for each running container. These streams of container stats are used to calculate the percentages on each dump of stats data from a stream.

For the container stats stream, the Docker daemon collects the systems used cpu time first. It then uses libcontainer to read in a containers cgroup files and parse the text into values. Here are all the stats data structures. That is all then sent to the client as a JSON response for processing.

I believe at least part of the problem stems from reading and parsing the /proc/stat system information and container cgroup stats at different times. Every time the goroutine that reads the container info is delayed a bit, more nanoseconds are included in that sample compared to the system. As the collection process is scheduled to run every X seconds, the next read then includes less total nanoseconds so the values can bounce up on a busy system, then back down the same amount as there is not a full 'tick' included in the second sample.

The issue compounds the more containers you run and the busier the system gets. The stats gathering and forwarding to the client seems to be a relatively heavyweight process, just docker stats with a large number of containers is enough to cause more inaccuracy. My best guess is contention in the goroutines that are all trying to read the stats. I'm not sure that would account for quite the level of inaccuracy the Docker shows. I'm either completely wrong or there's something else adding to the problem.

Linux cgroups

Each Docker containers usage is tracked in a cgroup. The CPU accounting information can be viewed via the cgroup file system:

→ find /sys/fs/cgroup/cpuacct/docker -type d
/sys/fs/cgroup/cpuacct/docker
/sys/fs/cgroup/cpuacct/docker/f0478406663bb57d597d4a63a031fc2e841de279a6f02d206b27eb481913c0ec
/sys/fs/cgroup/cpuacct/docker/5ac4753f955acbdf38beccbcc273f954489b2a00049617fdb0f9da6865707717
/sys/fs/cgroup/cpuacct/docker/a4e00d69819a15602cbfb4f86028a4175e16415ab9e2e9a9989fafa35bdb2edf
/sys/fs/cgroup/cpuacct/docker/af00983b1432d9ffa6de248cf154a1f1b88e6b9bbebb7da2485d94a38f9e7e15

→ cd /sys/fs/cgroup/cpuacct/docker/f0478406663bb57d597d4a63a031fc2e841de279a6f02d206b27eb481913c0ec
→ ls -l
total 0
-rw-r--r--    1 root     root             0 Nov 20 22:31 cgroup.clone_children
-rw-r--r--    1 root     root             0 Nov 20 04:35 cgroup.procs
-r--r--r--    1 root     root             0 Nov 20 21:51 cpuacct.stat
-rw-r--r--    1 root     root             0 Nov 20 21:51 cpuacct.usage
-r--r--r--    1 root     root             0 Nov 20 22:31 cpuacct.usage_all
-r--r--r--    1 root     root             0 Nov 20 21:51 cpuacct.usage_percpu
-r--r--r--    1 root     root             0 Nov 20 22:31 cpuacct.usage_percpu_sys
-r--r--r--    1 root     root             0 Nov 20 22:31 cpuacct.usage_percpu_user
-r--r--r--    1 root     root             0 Nov 20 22:31 cpuacct.usage_sys
-r--r--r--    1 root     root             0 Nov 20 22:31 cpuacct.usage_user
-rw-r--r--    1 root     root             0 Nov 20 22:31 notify_on_release
-rw-r--r--    1 root     root             0 Nov 20 22:31 tasks

→ cat cpuacct.usage_percpu
3625488147 6265485043 6504277830 

Each value is the cumulative usage in nano seconds on that CPU.

→ grep -w ^cpu /proc/stat
cpu  475761 0 10945 582794 2772 0 159 0 0 0

Values here are USER_HZ == 1/100 of a second, so get some conversion in Docker.

like image 172
Matt Avatar answered Oct 21 '22 09:10

Matt