
Getting TSC rate from x86 kernel

I have an embedded Linux system running on an Atom, which is a new enough CPU to have an invariant TSC (time stamp counter), whose frequency the kernel measures on startup. I use the TSC in my own code to keep time (avoiding kernel calls), and my startup code measures the TSC rate, but I'd rather just use the kernel's measurement. Is there any way to retrieve this from the kernel? It's not in /proc/cpuinfo anywhere.

Paul DeRocco asked Feb 01 '16 05:02


2 Answers

BPFtrace

As root, you can retrieve the kernel's TSC rate with bpftrace:

# bpftrace -e 'BEGIN { printf("%u\n", *kaddr("tsc_khz")); exit(); }' | tail -n 1

(tested it on CentOS 7 and Fedora 29)

That is the value that is defined, exported and maintained/calibrated in arch/x86/kernel/tsc.c.

GDB

Alternatively, also as root, you can read it from /proc/kcore, e.g.:

# gdb /dev/null /proc/kcore -ex 'x/uw 0x'$(grep '\<tsc_khz\>' /proc/kallsyms \
    | cut -d' ' -f1) -batch 2>/dev/null | tail -n 1 | cut -f2

(tested it on CentOS 7 and Fedora 29)

SystemTap

If the system has neither bpftrace nor gdb but does have SystemTap, you can get it like this (as root):

# cat tsc_khz.stp 
#!/usr/bin/stap -g

function get_tsc_khz() %{ /* pure */
    THIS->__retvalue = tsc_khz;
%}
probe oneshot {
    printf("%u\n", get_tsc_khz());
}
# ./tsc_khz.stp

Of course, you can also write a small kernel module that provides access to tsc_khz via the /sys pseudo file system. Even better, somebody already did that and a tsc_freq_khz module is available on GitHub. With that the following should work:

# modprobe tsc_freq_khz
$ cat /sys/devices/system/cpu/cpu0/tsc_freq_khz

(tested on Fedora 29, reading the sysfs file doesn't require root)
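Since the question is about using the value from your own code, here is a minimal C sketch for reading such a sysfs attribute at startup. The helper name read_khz_file is made up for illustration, and it assumes the file holds a single unsigned decimal number, which is what the cat output above suggests:

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: reads one unsigned decimal number from a text file
 * such as the sysfs attribute above. Returns 0 on success, -1 on error. */
static int read_khz_file(const char *path, uint64_t *khz)
{
    assert(path && khz);
    FILE *f = fopen(path, "r");
    if (!f)
        return -1;               /* module not loaded, or wrong path */
    int ok = fscanf(f, "%" SCNu64, khz) == 1;
    fclose(f);
    return ok ? 0 : -1;
}
```

You would call it as read_khz_file("/sys/devices/system/cpu/cpu0/tsc_freq_khz", &khz) and fall back to your own calibration when it returns -1.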

Kernel Messages

If none of the above is an option, you can parse the TSC rate from the kernel logs. This gets ugly fast, though, because different hardware and kernels produce different messages. For example, on a Fedora 29 i7 system:

$ journalctl --boot | grep 'kernel: tsc:' -i | cut -d' ' -f5-
kernel: tsc: Detected 2800.000 MHz processor
kernel: tsc: Detected 2808.000 MHz TSC

But on a Fedora 29 Intel Atom just:

kernel: tsc: Detected 2200.000 MHz processor

While on a CentOS 7 i5 system:

kernel: tsc: Fast TSC calibration using PIT
kernel: tsc: Detected 1895.542 MHz processor
kernel: tsc: Refined TSC clocksource calibration: 1895.614 MHz
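If you do go the log-parsing route from a program, something along these lines might cope with the three message variants shown above. This is only a sketch: parse_tsc_mhz is a made-up name, and the two prefixes are taken from the sample output here, so other kernels may well print something it doesn't recognize:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: extracts the MHz figure from one kernel log line.
 * Returns 1 and stores the value in *mhz if the line matches one of the
 * message formats shown above, otherwise returns 0. */
static int parse_tsc_mhz(const char *line, double *mhz)
{
    static const char *const prefixes[] = {
        "tsc: Refined TSC clocksource calibration: ",
        "tsc: Detected ",
    };
    assert(line && mhz);
    for (size_t i = 0; i < sizeof prefixes / sizeof prefixes[0]; i++) {
        const char *p = strstr(line, prefixes[i]);
        if (p && sscanf(p + strlen(prefixes[i]), "%lf MHz", mhz) == 1)
            return 1;
    }
    return 0;
}
```

In all three samples above the most specific value (the "MHz TSC" or "Refined" line) comes last, so feeding the lines through in order and keeping the last match would pick it. That ordering is an observation from these samples, not a guarantee.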

Perf Values

The Linux kernel doesn't yet provide an API to read the TSC rate directly. It does, however, provide one for getting the mult and shift values used to convert TSC counts to nanoseconds. Those values are derived from tsc_khz in arch/x86/kernel/tsc.c, the same file where tsc_khz is initialized and calibrated, and they are shared with userspace.

Example program that uses the perf API and accesses the shared page:

#include <asm/unistd.h>
#include <inttypes.h>
#include <linux/perf_event.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

static long perf_event_open(struct perf_event_attr *hw_event, pid_t pid,
           int cpu, int group_fd, unsigned long flags)
{
    return syscall(__NR_perf_event_open, hw_event, pid, cpu, group_fd, flags);
}

The actual code:

int main(int argc, char **argv)
{
    struct perf_event_attr pe = {
        .type = PERF_TYPE_HARDWARE,
        .size = sizeof(struct perf_event_attr),
        .config = PERF_COUNT_HW_INSTRUCTIONS,
        .disabled = 1,
        .exclude_kernel = 1,
        .exclude_hv = 1
    };
    int fd = perf_event_open(&pe, 0, -1, -1, 0);
    if (fd == -1) {
        perror("perf_event_open failed");
        return 1;
    }
    void *addr = mmap(NULL, 4*1024, PROT_READ, MAP_SHARED, fd, 0);
    if (addr == MAP_FAILED) {
        perror("mmap failed");
        return 1;
    }
    struct perf_event_mmap_page *pc = addr;
    if (pc->cap_user_time != 1) {
        fprintf(stderr, "Perf system doesn't support user time\n");
        return 1;
    }
    printf("%16s   %5s\n", "mult", "shift");
    printf("%16" PRIu32 "   %5" PRIu16 "\n", pc->time_mult, pc->time_shift);
    close(fd);
}

Tested on Fedora 29; it also works for non-root users.

Those values can be used to convert a TSC count to nanoseconds with a function like this one:

static uint64_t mul_u64_u32_shr(uint64_t cyc, uint32_t mult, uint32_t shift)
{
    __uint128_t x = cyc;
    x *= mult;
    x >>= shift;
    return x;
}
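If you want an actual rate rather than a conversion function, you can run the arithmetic backwards. Since ns = (cycles * mult) >> shift, the number of cycles in one millisecond (1,000,000 ns) is roughly (1000000 << shift) / mult, which is the rate in kHz. A sketch, with khz_from_mult_shift being a made-up name; the result is only approximate because mult is a rounded integer:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: approximate counter rate in kHz recovered from the
 * perf mult/shift pair by inverting ns = (cycles * mult) >> shift. */
static uint64_t khz_from_mult_shift(uint32_t mult, uint16_t shift)
{
    assert(mult != 0 && shift < 64);
    __uint128_t ns_per_ms = 1000000;   /* nanoseconds in one millisecond */
    return (uint64_t)((ns_per_ms << shift) / mult);
}
```

For example, mult = 512 with shift = 10 means each cycle is 0.5 ns, and the function returns 2000000 (2 GHz).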

CPUID/MSR

Another way to obtain the TSC rate is to follow DPDK's lead.

DPDK on x86_64 basically uses the following strategy:

  1. Read the 'Time Stamp Counter and Nominal Core Crystal Clock Information Leaf' via cpuid intrinsics (doesn't require special privileges), if available
  2. Read it from the MSR (requires the rawio capability and read permissions on /dev/cpu/*/msr), if possible
  3. Calibrate it in userspace by other means, otherwise

FWIW, a quick test shows that the cpuid leaf doesn't seem to be widely available, e.g. an i7 Skylake and a Goldmont Atom don't have it. Also, as can be seen from the DPDK code, using the MSR requires a bunch of intricate case distinctions.
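For reference, step 1 on its own can be sketched like this with the cpuid.h helpers shipped by GCC and Clang. This is not DPDK's code; the function names are invented here. Leaf 0x15 reports the TSC/crystal ratio in EBX/EAX and the crystal frequency in Hz in ECX, and any of them may read as zero, in which case the sketch gives up and returns 0:

```c
#include <stdint.h>
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/* Leaf 0x15: EAX = denominator and EBX = numerator of the TSC/crystal
 * ratio, ECX = crystal frequency in Hz. Zero means "not enumerated". */
static uint64_t tsc_hz_from_leaf15(uint32_t eax, uint32_t ebx, uint32_t ecx)
{
    if (eax == 0 || ebx == 0 || ecx == 0)
        return 0;
    return (uint64_t)ecx * ebx / eax;
}

/* Returns the TSC rate in Hz, or 0 if the CPU does not report it
 * (or if this is not built for x86 at all). */
static uint64_t tsc_hz_from_cpuid(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
    if (__get_cpuid_max(0, 0) < 0x15)
        return 0;                      /* leaf 0x15 not implemented */
    __cpuid_count(0x15, 0, eax, ebx, ecx, edx);
    (void)edx;                         /* reserved in this leaf */
    return tsc_hz_from_leaf15(eax, ebx, ecx);
#else
    return 0;
#endif
}
```

A return value of 0 is the cue to move on to steps 2 or 3.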

However, in case the program already uses DPDK, getting the TSC rate, getting TSC values or converting TSC values is just a matter of using the right DPDK API.

maxschlepzig answered Nov 27 '22 02:11


I had a brief look and there doesn't seem to be a built-in way to directly get this information from the kernel.

However, the symbol tsc_khz (which I'm guessing is what you want) is exported by the kernel. You could write a small kernel module that exposes a sysfs interface and use that to read out the value of tsc_khz from userspace.

If writing a kernel module is not an option, it may be possible to use some Dark Magic™ to read out the value directly from the kernel memory space. Parse the kernel binary or System.map file to find the location of the tsc_khz symbol and read it from /dev/{k}mem. This is, of course, only possible provided that the kernel is configured with the appropriate options.
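The symbol lookup part could look roughly like the sketch below. System.map and /proc/kallsyms share the same "address type name" line format, so one parser covers both. find_symbol is a name invented for this example, and actually reading the value at that address from kernel memory is left out:

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: scans "address type name" lines and returns 1 with
 * the address stored in *addr if the named symbol is found, else 0. */
static int find_symbol(FILE *map, const char *name, uint64_t *addr)
{
    char line[512], sym[256], type;
    uint64_t a;
    assert(map && name && addr);
    while (fgets(line, sizeof line, map)) {
        if (sscanf(line, "%" SCNx64 " %c %255s", &a, &type, sym) == 3 &&
            strcmp(sym, name) == 0) {
            *addr = a;
            return 1;
        }
    }
    return 0;
}
```

Keep in mind that /proc/kallsyms typically shows all-zero addresses to unprivileged readers, so a returned address of 0 should be treated as "not available".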

Lastly, from reading the kernel source comments, it looks like there's a possibility that the TSC may be unstable on some platforms. I don't know much about the inner workings of the x86 arch but this may be something you want to take into consideration.

tangrs answered Nov 27 '22 02:11