When I `cat /proc/cpuinfo`, I see 8 cores, with IDs from `0` to `7`. Is there an `x86` instruction that will report the core ID of the core that the instruction itself is running on? I looked at `cpuid`, but it does not seem to return a core ID under any parameter setting.

Is there an `x86` instruction to tell which core the instruction is being run on?

2 Answers

Some newer x86/x86_64 CPUs have RDTSCP, a variant of the RDTSC instruction:

http://ref.x86asm.net/coder32-abc.html#R

RDTSC   EAX EDX IA32_TIM…               0F  31
        P1+         f2              Read Time-Stamp Counter
RDTSCP  EAX EDX ECX ...         0F  01  F9  7   
        C7+         f2              Read Time-Stamp Counter and Processor ID

"C7+" means that the instruction (opcode 0F 01 F9) was introduced with some "Core i7" processors.

Opcodes

| Hex | Mnemonic | Encoding | Long Mode | Legacy Mode | Description |
|---|---|---|---|---|---|
| 0F 01 F9 | RDTSCP | A | Valid | Valid | Read 64-bit time-stamp counter and 32-bit IA32_TSC_AUX value into EDX:EAX and ECX. |

The OS should write the core ID into IA32_TSC_AUX (Linux does), and RDTSCP makes this value accessible.

Linux encodes the NUMA node ID (shifted left by 12) and the CPU number (low 12 bits) into TSC_AUX:

        if (cpu_has(&cpu_data(cpu), X86_FEATURE_RDTSCP))
                write_rdtscp_aux((node << 12) | cpu);

        /*
         * Store cpu number in limit so that it can be loaded quickly
         * in user space in vgetcpu. (12 bits for the CPU and 8 bits for the node)
         */
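As a rough sketch of how user space could read and decode this value, assuming GCC or Clang (which provide the `__rdtscp()` intrinsic in `<x86intrin.h>`) and assuming the kernel uses the 12-bit CPU layout described in the comment above. The helper names `tsc_aux_cpu`, `tsc_aux_node` and `read_tsc_and_aux` are made up for illustration:

```c
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <x86intrin.h>   /* __rdtscp() on GCC and Clang */
#endif

/* Low 12 bits: CPU number (layout assumed from the kernel comment). */
static inline unsigned tsc_aux_cpu(uint32_t aux)
{
    return aux & 0xfffu;
}

/* Upper bits: NUMA node, stored by the kernel as (node << 12). */
static inline unsigned tsc_aux_node(uint32_t aux)
{
    return aux >> 12;
}

#if defined(__x86_64__) || defined(__i386__)
/*
 * Hypothetical helper: execute RDTSCP once, returning the time-stamp
 * counter and storing IA32_TSC_AUX in *aux. The CPU must support
 * RDTSCP, otherwise the instruction faults.
 */
static inline uint64_t read_tsc_and_aux(uint32_t *aux)
{
    unsigned int a;
    uint64_t tsc = __rdtscp(&a);
    *aux = a;
    return tsc;
}
#endif
```

A caller would pass the `aux` value from `read_tsc_and_aux()` to the two decode helpers. The result is only a snapshot: unless the thread is pinned, the scheduler may move it to another core right after the instruction retires.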

In Linux there is also the vsyscall getcpu ("__vdso_getcpu"), which obtains the CPU ID via rdtscp (if the CPU has the instruction) or via the GDT (GDT_ENTRY_PER_CPU); see __getcpu in include/asm/vsyscall.h from 3.13. From the man page:

getcpu() was added in kernel 2.6.19 for x86_64 and i386.

Linux makes a best effort to make this call as fast as possible. The intention of getcpu() is to allow programs to make optimizations with per-CPU data or for NUMA optimization.
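A minimal sketch of calling getcpu from C on Linux, using the raw system call so that no particular libc wrapper is assumed. The helper name `current_cpu` is made up for illustration. Note that going through `syscall()` bypasses the vDSO fast path mentioned above; glibc's `sched_getcpu()` is the usual way to get that path.

```c
#include <unistd.h>
#include <sys/syscall.h>

/*
 * Hypothetical helper: return the number of the CPU the calling
 * thread is running on, or -1 on error (errno is set by syscall()).
 */
static inline int current_cpu(void)
{
    unsigned int cpu = 0, node = 0;

    /* The third argument (tcache) is unused by current kernels. */
    if (syscall(SYS_getcpu, &cpu, &node, NULL) != 0)
        return -1;
    return (int)cpu;
}
```

As with RDTSCP, the value may already be stale when the function returns, unless the thread's CPU affinity is restricted to a single CPU.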

From an Intel white paper: http://www.intel.com/content/dam/www/public/us/en/documents/white-papers/ia-32-ia-64-benchmark-code-execution-paper.pdf#page=15

3.2 Improvements Using RDTSCP Instruction

The RDTSCP instruction is described in the Intel® 64 and IA-32 Architectures Software Developer’s Manual Volume 2B ([3]) as an assembly instruction that, at the same time, reads the timestamp register and the CPU identifier. The value of the timestamp register is stored into the EDX and EAX registers; the value of the CPU id is stored into the ECX register (“On processors that support the Intel 64 architecture, the high order 32 bits of each of RAX, RDX, and RCX are cleared”). What is interesting in this case is the “pseudo” serializing property of RDTSCP. The manual states:

“The RDTSCP instruction waits until all previous instructions have been executed before reading the counter. However, subsequent instructions may begin execution before the read operation is performed.”

This means that this instruction guarantees that everything that is above its call in the source code is executed before the instruction itself is called. It cannot, however, guarantee that - for optimization purposes - the CPU will not execute, before the RDTSCP call, instructions that, in the source code, are placed after the RDTSCP function call itself. If this happens, a contamination caused by instructions in the source code that come after the RDTSCP will occur in the code under measurement.

A description is also available at http://www.felixcloutier.com/x86/RDTSCP.html, which is a clone of https://github.com/zneak/x86doc

UPDATE: There will be a separate instruction, RDPID, that reads only the IA32_TSC_AUX register, without the timestamp counter that RDTSCP also returns:

https://hjlebbink.github.io/x86doc/html/RDPID.html

Reads the value of the IA32_TSC_AUX MSR (address C0000103H) into the destination register. The value of CS.D and operand-size prefixes (66H and REX.W) do not affect the behavior of the RDPID instruction.
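
| Opcode/Instruction | Op/En | 64/32-bit Mode | CPUID Feature Flag | Description |
|---|---|---|---|---|
| F3 0F C7 /7 RDPID r32 | M | N.E./V | RDPID | Read IA32_TSC_AUX into r32. |
| F3 0F C7 /7 RDPID r64 | M | V/N.E. | RDPID | Read IA32_TSC_AUX into r64. |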

It is slated to be available starting with the "Ice Lake" microarchitecture (2018), as declared in https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf (319433-030, October 2017).
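A hedged sketch of how RDPID might be used from C with GCC or Clang. It assumes the assembler recognizes the `rdpid` mnemonic and that support is reported in CPUID.(EAX=07H,ECX=0):ECX[22]. The helper names `cpu_has_rdpid` and `read_tsc_aux_rdpid` are made up for illustration:

```c
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>   /* __get_cpuid_count() on GCC and Clang */

/* Returns 1 if CPUID leaf 07H, subleaf 0, ECX bit 22 reports RDPID. */
static inline int cpu_has_rdpid(void)
{
    unsigned int eax, ebx, ecx, edx;

    if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
        return 0;            /* leaf 07H not available */
    return (int)((ecx >> 22) & 1u);
}

/*
 * Read IA32_TSC_AUX with RDPID. Call this only after cpu_has_rdpid()
 * returned 1; on other CPUs the instruction raises an invalid-opcode
 * fault.
 */
static inline uint32_t read_tsc_aux_rdpid(void)
{
#if defined(__x86_64__)
    unsigned long long aux;  /* r64 operand in 64-bit mode */
#else
    unsigned int aux;        /* r32 operand in 32-bit mode */
#endif
    __asm__ volatile("rdpid %0" : "=r"(aux));
    return (uint32_t)aux;
}
#else
/* Not an x86 target: RDPID does not exist here. */
static inline int cpu_has_rdpid(void) { return 0; }
#endif
```

On CPUs where `cpu_has_rdpid()` returns 0, RDTSCP remains the fallback for reading the same register.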

answered Oct 04 '22 04:10

osgx

The Intel® 64 and IA-32 Architectures Software Developer's Manual Volume 3A: System Programming Guide, Part 1, section 8.4.5 Identifying Logical Processors in an MP System lists, among others:

This APIC ID is reported by CPUID.0BH:EDX[31:0]

Note that this doesn't directly equate to the Linux kernel's numbering. In the kernel there is an x86_cpu_to_apicid table that you can read. Of course the kernel also knows which CPU the code is executing on, without having to consult the APIC:

 * smp_processor_id(): get the current CPU ID.
 *
 * if DEBUG_PREEMPT is enabled then we check whether it is
 * used in a preemption-safe way. (smp_processor_id() is safe
 * if it's used in a preemption-off critical section, or in
 * a thread that is bound to the current CPU.)
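For the CPUID route quoted at the top of this answer, a minimal user-space sketch could look like the following, assuming GCC or Clang and their `<cpuid.h>` helper `__get_cpuid_count()`. The function name `current_x2apic_id` is made up for illustration, and the returned APIC ID still has to be mapped to the kernel's CPU numbering separately:

```c
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>   /* __get_cpuid_count() on GCC and Clang */
#endif

/*
 * Hypothetical helper: return the APIC ID reported by
 * CPUID.0BH:EDX[31:0] for the logical processor executing the
 * instruction, or -1 if leaf 0BH is not usable on this machine.
 */
static inline int64_t current_x2apic_id(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;

    if (!__get_cpuid_count(0x0b, 0, &eax, &ebx, &ecx, &edx))
        return -1;           /* leaf 0BH beyond the maximum leaf */
    if ((ebx & 0xffffu) == 0)
        return -1;           /* leaf present but reports no levels */
    return (int64_t)edx;
#else
    return -1;               /* not an x86 target */
#endif
}
```

Unlike `smp_processor_id()` inside a preemption-off section, nothing here prevents the thread from being migrated between the CPUID instruction and the use of its result.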

answered Oct 04 '22 03:10

Jester

Tags: x86, assembly, x86-64

merlin2011
