I am trying to measure resource usage (user and system time) for various function calls using getrusage. The results I get only come in multiples of 10 milliseconds, e.g. 0s 70000us, 10000us, etc. Please let me know if there is a way to set the precision/granularity for getrusage.
My program is simple:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <sys/time.h>
#include <sys/resource.h>
#include <unistd.h>

int main() {
    struct rusage usage;
    struct timeval start, end;
    int i, j, k = 0;

    getrusage(RUSAGE_SELF, &usage);
    start = usage.ru_utime;           /* user CPU time consumed so far */

    printf("buffer check\n");
    char *str = "---";
    int arr[100], ctr;
    for (ctr = 0; ctr < 100; ctr++) {
        arr[ctr] = ctr + 1000;
    }
    for (i = 0; i < 10000; i++) {
        for (j = 0; j < 10000; j++) {
            k += 20;
        }
    }

    getrusage(RUSAGE_SELF, &usage);
    end = usage.ru_utime;             /* user CPU time after the loops */

    printf("Started at: %ld.%lds\n", start.tv_sec, start.tv_usec);
    printf("Ended at: %ld.%lds\n", end.tv_sec, end.tv_usec);
    return 1;
}
Result: Started at: 0.0s Ended at: 0.200000s
I added another for loop and got a result like: Started at: 0.0s Ended at: 0.700000s
I browsed a lot to find a possible way to get more accurate timings. I came across the 3-parameter getrusage in the Linux sources, but I am not sure how to use it since it requires a task pointer as the first parameter. One of the links suggested it has to do with the Linux version. Regardless, please let me know if there is any way to set the precision/granularity. If not, let me know if there is an alternative to getrusage. gettimeofday does not seem to give resource usage details, so I am looking at the actual implementation of getrusage in case I cannot set the precision.
Many operating systems don't do precise accounting of the time used by processes. In many cases it's too expensive to read clocks on every context switch and system call; in other cases the hardware might not even have a clock that allows you to time things with any precision.
A very commonly used accounting method, and the one behind the numbers you get from getrusage, is a timer interrupt (most often 100Hz, although 64Hz and 1024Hz are common too) that samples what is happening on the system at the moment of the interrupt. So 100 times per second the kernel checks what is currently running and where (user space for ru_utime or kernel space for ru_stime) and increments a counter. That counter is then interpreted as your program having run for 10ms.
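If you're curious what tick rate your system advertises, here is a small sketch that reads it through sysconf(_SC_CLK_TCK) and prints the granularity it implies. Bear in mind this is only the user-visible tick rate (typically 100), which may differ from the kernel's internal timer frequency, so treat it as a hint rather than a guarantee.

#include <stdio.h>
#include <unistd.h>

int main(void) {
    /* Ticks per second as reported to user space, e.g. 100 on many Linux systems. */
    long ticks_per_sec = sysconf(_SC_CLK_TCK);
    if (ticks_per_sec > 0) {
        printf("clock ticks per second: %ld\n", ticks_per_sec);
        printf("implied accounting granularity: %ld us per tick\n",
               1000000L / ticks_per_sec);
    }
    return 0;
}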
You can experiment with clock_gettime on your system and see if it has per-process counters; sometimes those can be more precise than the getrusage counters. But I wouldn't get my hopes up: if 10ms resolution is the best getrusage can do, it's likely that clock_gettime won't have better resolution either, or any per-process clocks at all.
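For example, here is a rough sketch of what I mean, using the per-process CPU-time clock (CLOCK_PROCESS_CPUTIME_ID) and clock_getres to check the advertised resolution. Note that this clock counts user and system time combined, so it doesn't give you the ru_utime/ru_stime split that getrusage does; the loop is just a stand-in workload.

#include <stdio.h>
#include <time.h>

int main(void) {
    struct timespec res, start, end;
    volatile long k = 0;
    long i, j;

    /* Advertised resolution of the per-process CPU-time clock. */
    if (clock_getres(CLOCK_PROCESS_CPUTIME_ID, &res) == 0)
        printf("advertised resolution: %ld.%09ld s\n",
               (long)res.tv_sec, res.tv_nsec);

    clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &start);
    for (i = 0; i < 10000; i++)
        for (j = 0; j < 10000; j++)
            k += 20;
    clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &end);

    /* Subtract the two timespecs, borrowing a second if needed. */
    long borrow = end.tv_nsec < start.tv_nsec;
    printf("CPU time: %ld.%09ld s\n",
           (long)(end.tv_sec - start.tv_sec - borrow),
           end.tv_nsec - start.tv_nsec + (borrow ? 1000000000L : 0));
    return 0;
}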
If the clocks in the operating system aren't good enough for your measurements, your only option is to repeat your test many times (enough to run for several minutes) and divide whatever result you get by the number of runs, as in the sketch below.
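Something along these lines, with getrusage taken once around many repetitions. The repetition count and the dummy workload are just placeholders you would replace with your own function, sized so the whole run lasts long enough to swamp the 10ms granularity:

#include <stdio.h>
#include <sys/time.h>
#include <sys/resource.h>

#define REPS 1000   /* placeholder: pick a count that gives a run of seconds or minutes */

static void workload(void) {
    /* Placeholder for the function you actually want to measure. */
    volatile long k = 0;
    long i;
    for (i = 0; i < 100000; i++)
        k += 20;
}

int main(void) {
    struct rusage before, after;
    long r;

    getrusage(RUSAGE_SELF, &before);
    for (r = 0; r < REPS; r++)
        workload();
    getrusage(RUSAGE_SELF, &after);

    /* Total user time across all repetitions, divided by the repetition count. */
    double total_us =
        (after.ru_utime.tv_sec  - before.ru_utime.tv_sec)  * 1e6 +
        (after.ru_utime.tv_usec - before.ru_utime.tv_usec);
    printf("average user time per run: %.3f us\n", total_us / REPS);
    return 0;
}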
The fact that gettimeofday is more precise doesn't mean much; gettimeofday can also be relatively expensive. Think about the work the kernel would have to do to accurately keep track of user and system time for a process. Every time you make a system call it would have to take a time stamp twice (once at the start of the system call and once at the end) just to keep track of how much system time you use. For keeping track of user time, you'd need a time stamp every time the system switches to another process. Many systems do keep track of the second one, but not the first one, since system calls are much more common than process context switches (that's why I suggest checking clock_gettime, since it can have a timer that accumulates total system and user time for a process).
Clocks in modern systems are quite annoying because, even though taking time stamps is one of the most common system calls, we still often need to trawl through a slow bus and do heavy locking to get them. Other solutions like cycle counters on the CPU have been used, but those are notoriously inaccurate because they might not be synchronized between CPUs, might have a variable frequency, can stop outside the control of the operating system, etc., and you need to know the exact model of your CPU to be able to use them reliably. The operating system has a lot of heuristics to figure out which clocks to use, but that can mean there's a huge difference between two machines that are almost the same. One might get a cycle counter with sub-nanosecond precision that costs one instruction to read, while the other needs to go through the ISA bus to a 30-year-old chip design with microsecond precision or worse that takes thousands of cycles to read.