
How are percpu pointers implemented in the Linux kernel?

On a multiprocessor system, each core can have its own variables. I assumed these are distinct variables at different addresses, even though they belong to the same process and share the same name.

But how does the kernel implement this? Does it allocate a region of memory to hold all the per-CPU data, and on every access redirect the pointer to the right address by adding an offset or something similar?

asked Jun 07 '13 07:06 by dspjm



1 Answer

Normal global variables are not per-CPU. Automatic variables live on the stack, and different CPUs use different stacks, so each CPU naturally gets its own copy.

I guess you're referring to Linux's per-CPU variable infrastructure.
Most of the magic is here (asm-generic/percpu.h):

extern unsigned long __per_cpu_offset[NR_CPUS];

#define per_cpu_offset(x) (__per_cpu_offset[x])

/* Separate out the type, so (int[3], foo) works. */
#define DEFINE_PER_CPU(type, name) \
    __attribute__((__section__(".data.percpu"))) __typeof__(type) per_cpu__##name

/* var is in discarded region: offset to particular copy we want */
#define per_cpu(var, cpu) (*RELOC_HIDE(&per_cpu__##var, __per_cpu_offset[cpu]))
#define __get_cpu_var(var) per_cpu(var, smp_processor_id())

The macro RELOC_HIDE(ptr, offset) simply advances ptr by the given offset in bytes (regardless of the pointer type).
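To make the "offset in bytes" point concrete, here is a minimal user-space sketch. BYTE_ADVANCE is a hypothetical stand-in, not the kernel's RELOC_HIDE: it only reproduces the byte-wise arithmetic and relies on the GCC/Clang __typeof__ extension. The helper functions byte_distance and element_distance are likewise made up for illustration.

```c
#include <stdint.h>

/* Hypothetical stand-in for RELOC_HIDE: advance ptr by off BYTES while
 * keeping its pointer type. Unlike the kernel macro, this sketch makes
 * no attempt to hide the arithmetic from the optimizer. */
#define BYTE_ADVANCE(ptr, off) \
    ((__typeof__(ptr))((uintptr_t)(ptr) + (uintptr_t)(off)))

/* Number of bytes between p and BYTE_ADVANCE(p, off). */
static unsigned long byte_distance(int *p, unsigned long off)
{
    int *q = BYTE_ADVANCE(p, off);
    return (unsigned long)((uintptr_t)q - (uintptr_t)p);
}

/* For contrast: ordinary pointer arithmetic scales by sizeof(int). */
static unsigned long element_distance(int *p, unsigned long n)
{
    int *q = p + n;
    return (unsigned long)((uintptr_t)q - (uintptr_t)p);
}
```

With an int buffer, byte_distance(buf, 8) moves 8 bytes, whereas element_distance(buf, 8) moves 8 * sizeof(int) bytes. That difference is why the per-CPU offsets can be stored as plain byte distances, independent of the variable's type.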

What does it do?

  1. Writing DEFINE_PER_CPU(int, x) creates an integer per_cpu__x in the special .data.percpu section.
  2. When the kernel is loaded, this section is loaded multiple times - once per CPU (this part of the magic isn't in the code above).
  3. The __per_cpu_offset array is filled with the distances between the copies. Supposing 1000 bytes of per cpu data are used, __per_cpu_offset[n] would contain 1000*n.
  4. The symbol per_cpu__x will be relocated, during load, to CPU 0's per_cpu__x.
  5. __get_cpu_var(x), when running on CPU 3, will translate to *RELOC_HIDE(&per_cpu__x, __per_cpu_offset[3]). This starts with CPU 0's x, adds the offset between CPU 0's data and CPU 3's, and eventually dereferences the resulting pointer.
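The steps above can be imitated in user space. The following is only a rough sketch under stated assumptions, not kernel code: a struct stands in for the .data.percpu section, a static array stands in for the per-CPU copies made at load time, and every name ending in _sim (plus PER_CPU_SIM and the initial value 7) is hypothetical.

```c
#include <stdint.h>
#include <string.h>

#define NR_CPUS 4   /* assumed CPU count for this sketch */

/* Stand-in for the .data.percpu section: every per-CPU variable is a field. */
struct percpu_section_sim {
    int x;
    long counter;
};

/* Initial contents of the section, as the linker would have laid it out. */
static const struct percpu_section_sim percpu_template_sim = { .x = 7, .counter = 0 };

/* One copy of the section per CPU, laid out back to back (step 2). */
static struct percpu_section_sim percpu_area_sim[NR_CPUS];

/* Byte distance from CPU 0's copy to each CPU's copy (step 3). */
static unsigned long per_cpu_offset_sim[NR_CPUS];

static void setup_percpu_sim(void)
{
    for (int cpu = 0; cpu < NR_CPUS; cpu++) {
        memcpy(&percpu_area_sim[cpu], &percpu_template_sim,
               sizeof(percpu_template_sim));
        per_cpu_offset_sim[cpu] = (unsigned long)
            ((uintptr_t)&percpu_area_sim[cpu] - (uintptr_t)&percpu_area_sim[0]);
    }
}

/* Steps 4-5: start from CPU 0's copy of the variable, add the byte offset
 * of the requested CPU, then dereference. Uses the GCC/Clang __typeof__
 * extension. */
#define PER_CPU_SIM(var, cpu) \
    (*(__typeof__(&percpu_area_sim[0].var)) \
        ((uintptr_t)&percpu_area_sim[0].var + per_cpu_offset_sim[cpu]))
```

After setup_percpu_sim(), an assignment such as PER_CPU_SIM(x, 3) = 42 changes only the fourth copy; the copies for the other CPUs keep their initial value. In this sketch per_cpu_offset_sim[n] equals n * sizeof(struct percpu_section_sim), which plays the role of the 1000*n figure in step 3.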

answered Oct 04 '22 09:10 by ugoren