
What does "cpu_time" represent exactly in libvirt?

I can pull the following CPU values from libvirt:

virsh domstats vm1 --cpu-total
Domain: 'vm1'
  cpu.time=6173016809079111
  cpu.user=26714880000000
  cpu.system=248540680000000

virsh cpu-stats vm1 --total
Total:
    cpu_time       6173017.263233824 seconds
    user_time        26714.890000000 seconds
    system_time     248540.700000000 seconds

What does the cpu_time figure represent here exactly?

I'm looking to calculate CPU utilization as a percentage using this data.

Thanks

JoeM asked Nov 07 '16 15:11


2 Answers

This was a surprisingly difficult question to answer! After poring over the kernel code for a good while, I've figured out what's going on here, and it's quite nice to understand it.

Normally, for a process on Linux, the overall CPU usage is simply the sum of the time spent in userspace and the time spent in kernel space. So naively one would expect user_time + system_time to equal cpu_time. What I've discovered is that Linux tracks the time vCPU threads spend executing guest code separately from both userspace and kernel-space time.

Thus cpu_time == user_time + system_time + guest_time

So you can think of system_time + user_time as the overhead of QEMU / KVM on the host side, and of cpu_time - (user_time + system_time) as the actual amount of time the guest OS was running on its CPUs.
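As a minimal sketch of that split, the snippet below applies the subtraction to the domstats figures from the question, which are in nanoseconds. The helper names parse_domstats and split_cpu_time are made up for illustration and are not part of libvirt.

```python
def parse_domstats(text):
    """Parse 'key=value' lines from `virsh domstats` output into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition("=")
        if sep and value.isdigit():
            stats[key] = int(value)
    return stats


def split_cpu_time(stats):
    """Return (host-side overhead, guest time), both in nanoseconds."""
    overhead_ns = stats["cpu.user"] + stats["cpu.system"]
    guest_ns = stats["cpu.time"] - overhead_ns
    return overhead_ns, guest_ns


# Output copied from the question (virsh domstats vm1 --cpu-total).
SAMPLE = """Domain: 'vm1'
  cpu.time=6173016809079111
  cpu.user=26714880000000
  cpu.system=248540680000000
"""

overhead_ns, guest_ns = split_cpu_time(parse_domstats(SAMPLE))
print("QEMU/KVM overhead (s):", overhead_ns / 1e9)
print("guest time (s):       ", guest_ns / 1e9)
```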

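To calculate CPU usage, you probably just want to record cpu_time every N seconds and take the delta between two consecutive samples, with cpu_time expressed in seconds (the cpu.time value from domstats is in nanoseconds), e.g. usage % = 100 * (cpu_time_2 - cpu_time_1) / N. Since cpu_time is a total across all vCPUs, the result can exceed 100% for a guest with more than one vCPU.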
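The delta formula can be sketched as a small function. The function name and the optional vcpus parameter (which divides by the vCPU count to keep the result within 0 to 100) are assumptions added for illustration. Both samples are expected in seconds.

```python
def cpu_usage_percent(cpu_time_1, cpu_time_2, interval_seconds, vcpus=1):
    """CPU usage in percent between two cpu_time samples (in seconds).

    interval_seconds is the wall-clock time N between the two samples.
    vcpus is an optional normalisation (an assumption, not from the answer):
    with vcpus=1 the result is relative to a single CPU.
    """
    if interval_seconds <= 0 or vcpus <= 0:
        raise ValueError("interval_seconds and vcpus must be positive")
    return 100.0 * (cpu_time_2 - cpu_time_1) / (interval_seconds * vcpus)


# Hypothetical samples: 6 s of CPU time consumed over a 10 s interval.
print(cpu_usage_percent(100.0, 106.0, 10))           # relative to one CPU
print(cpu_usage_percent(100.0, 106.0, 10, vcpus=2))  # normalised over 2 vCPUs
```

When sampling from `virsh domstats`, divide cpu.time by 1e9 first to convert nanoseconds to seconds.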

DanielB answered Sep 28 '22 22:09


As of libvirt master pulled on 2018-07-10 from https://github.com/libvirt/libvirt/, and as far as QEMU/KVM is concerned, it comes down to:

  • cpu.time = cpuacct.usage cgroup metric
  • cpu.{user,system} = cpuacct.stat cgroup metrics
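For reference, here is a rough sketch of reading those two cgroup v1 files by hand. It assumes the usual cgroup v1 formats: cpuacct.usage holds a single nanosecond counter, and cpuacct.stat holds "user" and "system" lines counted in USER_HZ ticks. The function names are made up, and this is not a copy of libvirt's own code. The location of the files under the cgroup mount differs between systems, so the functions take the file contents as strings.

```python
import os


def parse_cpuacct_usage(text):
    """Convert the contents of cpuacct.usage (nanoseconds) to seconds."""
    return int(text.strip()) / 1e9


def parse_cpuacct_stat(text, ticks_per_second=None):
    """Convert the contents of cpuacct.stat (USER_HZ ticks) to seconds."""
    if ticks_per_second is None:
        # USER_HZ as reported by the running system (commonly 100).
        ticks_per_second = os.sysconf("SC_CLK_TCK")
    result = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 2:
            result[fields[0]] = int(fields[1]) / ticks_per_second
    return result


# Hypothetical file contents, for illustration only.
print(parse_cpuacct_usage("2500000000\n"))
print(parse_cpuacct_stat("user 100\nsystem 250\n", ticks_per_second=100))
```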

One problem you may encounter is that guest load = time load - system load - user load sometimes comes out negative (?!?). For example, for a running QEMU/KVM guest (values in seconds) on the Debian 9 stock kernel (4.9):

time                   system    user     total
2018-07-10T13:19:20Z 62308.67 9278.59 107968.33
2018-07-10T13:20:20Z 62316.08 9279.73 107970.73
               delta     7.41    1.14      2.40 (2.40 < 7.41+1.14 ?!?)

Kernel bug? (At least one other person has experienced something similar: https://lkml.org/lkml/2017/11/1/101)
One thing is certain: cpuacct.usage and cpuacct.stat use different logic to gather their metrics; this might explain the discrepancy (?).
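The discrepancy in the table above can be reproduced with a few lines of arithmetic. The sketch below only recomputes the deltas from the two samples shown; the variable and function names are made up for illustration.

```python
# (timestamp, system, user, total), in seconds, copied from the table above.
samples = [
    ("2018-07-10T13:19:20Z", 62308.67, 9278.59, 107968.33),
    ("2018-07-10T13:20:20Z", 62316.08, 9279.73, 107970.73),
]


def guest_delta(first, second):
    """Return (d_system, d_user, d_total, d_guest) between two samples."""
    _, system_1, user_1, total_1 = first
    _, system_2, user_2, total_2 = second
    d_system = system_2 - system_1
    d_user = user_2 - user_1
    d_total = total_2 - total_1
    # Guest time as defined above: total minus host-side system and user time.
    return d_system, d_user, d_total, d_total - d_system - d_user


d_system, d_user, d_total, d_guest = guest_delta(samples[0], samples[1])
print(f"system={d_system:.2f} user={d_user:.2f} total={d_total:.2f} guest={d_guest:.2f}")
if d_guest < 0:
    print("guest delta is negative: total grew less than system + user")
```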

Cédric Dufour answered Sep 28 '22 21:09