I would like to programmatically disable hardware prefetching.
According to the Intel articles "Optimizing Application Performance on Intel® Core™ Microarchitecture Using Hardware-Implemented Prefetchers" and "How to Choose between Hardware and Software Prefetch on 32-Bit Intel® Architecture", I need to update an MSR to disable hardware prefetching.
Here is a relevant snippet:
"DPL Prefetch and L2 Streaming Prefetch settings can also be changed programmatically by writing a device driver utility for changing the bits in the IA32_MISC_ENABLE register – MSR 0x1A0. Such a utility offers the ability to enable or disable prefetch mechanisms without requiring any server downtime.
The table below shows the bits in the IA32_MISC_ENABLE MSR that have to be changed in order to control the DPL and L2 Streaming Prefetch:
Prefetcher Type                               MSR (0x1A0) Bit   Value
DPL (Hardware Prefetch)                       Bit 9             0 = Enable, 1 = Disable
L2 Streamer (Adjacent Cache Line Prefetch)    Bit 19            0 = Enable, 1 = Disable"
I tried using http://etallen.com/msr.html, but this did not work. I also tried using wrmsr from asm/msr.h directly, but that segfaults. I tried doing this in a kernel module ... and killed the machine.
BTW - I am using kernel 2.6.18-92.el5 and it has MSR linked in the kernel:
$ grep -i msr /boot/config-$(uname -r)
CONFIG_X86_MSR=y
...
The hardware prefetchers can throttle themselves in response to software prefetching, so even if hardware prefetching is not effective for a certain application, it does not need to be disabled because it will remain mostly inactive.
Prefetching with low predictive accuracy can improve performance only in over-provisioned systems. However, the data cache is obviously under-provisioned, as it can keep only a subset of the data set. The prefetched data typically shares the cache space with demand-fetched data.
Hardware-based prefetching is typically accomplished by having a dedicated hardware mechanism in the processor that watches the stream of instructions or data being requested by the executing program, recognizes the next few elements that the program might need based on this stream, and prefetches into the processor's ...
You can enable or disable the hardware prefetchers using msr-tools: http://www.kernel.org/pub/linux/utils/cpu/msr-tools/
The following enables the hardware prefetcher (by clearing bit 9):
[root@... msr-tools-1.2]# ./wrmsr -p 0 0x1a0 0x60628e2089
[root@... msr-tools-1.2]# ./rdmsr 0x1a0
60628e2089
The following disables the hardware prefetcher (by setting bit 9):
[root@... msr-tools-1.2]# ./wrmsr -p 0 0x1a0 0x60628e2289
[root@... msr-tools-1.2]# ./rdmsr 0x1a0
60628e2289
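The two values written above differ only in bit 9 (0x200). Rather than hard-coding a full register value, which will differ from machine to machine, it is probably safer to read the current value and change just that bit. A minimal sketch of the bit arithmetic in C follows; the helper names are made up for illustration, and the bit position is the one from the Intel table quoted in the question, so it may not apply to other microarchitectures.

```c
#include <stdint.h>

/* Bit 9 of IA32_MISC_ENABLE (MSR 0x1A0): 1 = hardware prefetcher disabled.
 * Bit position taken from the Intel table quoted in the question. */
#define HW_PREFETCH_DISABLE_BIT (UINT64_C(1) << 9)

/* Hypothetical helper: value to write back to turn the prefetcher off. */
static inline uint64_t disable_hw_prefetch(uint64_t msr_value)
{
    return msr_value | HW_PREFETCH_DISABLE_BIT;
}

/* Hypothetical helper: value to write back to turn the prefetcher on. */
static inline uint64_t enable_hw_prefetch(uint64_t msr_value)
{
    return msr_value & ~HW_PREFETCH_DISABLE_BIT;
}
```

Feeding the value read by rdmsr through one of these helpers and passing the result to wrmsr leaves every other bit of the register untouched.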
Programmatically, you can do this as root by opening /dev/cpu/<cpunumber>/msr and using pwrite to write to the msr "file" at the 0x1a0 offset.
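Here is a rough sketch of that approach in C. It assumes the msr device node exists for the CPU in question, that the process runs as root, and that bit 9 is the right bit for your processor (taken from the Intel table quoted in the question). The function names open_msr and set_hw_prefetch are my own, not part of any library.

```c
#define _XOPEN_SOURCE 700   /* for pread/pwrite */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

#define IA32_MISC_ENABLE    0x1a0
#define HW_PREFETCH_DISABLE (UINT64_C(1) << 9)  /* 1 = prefetcher disabled */

/* Open the msr device of one CPU. Returns a file descriptor, or -1 on error. */
static inline int open_msr(int cpu)
{
    char path[64];
    snprintf(path, sizeof path, "/dev/cpu/%d/msr", cpu);
    return open(path, O_RDWR);
}

/* Read-modify-write bit 9 of IA32_MISC_ENABLE through an open msr descriptor.
 * The file offset selects the MSR number; each access transfers 8 bytes.
 * Returns 0 on success, -1 if the read or the write failed. */
static inline int set_hw_prefetch(int fd, int enable)
{
    uint64_t value;

    if (pread(fd, &value, sizeof value, IA32_MISC_ENABLE) != (ssize_t)sizeof value)
        return -1;

    if (enable)
        value &= ~HW_PREFETCH_DISABLE;
    else
        value |= HW_PREFETCH_DISABLE;

    if (pwrite(fd, &value, sizeof value, IA32_MISC_ENABLE) != (ssize_t)sizeof value)
        return -1;
    return 0;
}
```

To disable the prefetcher on CPU 0 you would call open_msr(0), check the descriptor, call set_hw_prefetch(fd, 0), and close the descriptor. The MSR is per core, so the same steps would have to be repeated for every CPU number under /dev/cpu.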
From the Intel reference:
This instruction must be executed at privilege level 0 or in real-address mode; otherwise, a general protection exception #GP(0) will be generated. Specifying a reserved or unimplemented MSR address in ECX will also cause a general protection exception.
...
The CPUID instruction should be used to determine whether MSRs are supported (EDX[5]=1) before using this instruction.
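So, your fault might come from executing the instruction outside privilege level 0 (which is always the case in a user-space program), from a CPU that doesn't support MSRs, or from using the wrong MSR address.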
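The CPUID check mentioned in the Intel text can be done from user space. The sketch below assumes GCC or Clang on x86, where <cpuid.h> provides __get_cpuid; the function names edx_reports_msr and cpu_supports_msr are made up for illustration. On non-x86 builds the check simply reports no support.

```c
#include <stdbool.h>
#include <stdint.h>
#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>   /* GCC/Clang helper header for the CPUID instruction */
#endif

/* CPUID leaf 1, EDX bit 5: RDMSR/WRMSR are supported. */
#define CPUID_EDX_MSR (UINT32_C(1) << 5)

/* Hypothetical helper: test the MSR feature flag in a leaf-1 EDX value. */
static inline bool edx_reports_msr(uint32_t edx)
{
    return (edx & CPUID_EDX_MSR) != 0;
}

/* Hypothetical helper: query the running CPU for MSR support. */
static inline bool cpu_supports_msr(void)
{
#if defined(__i386__) || defined(__x86_64__)
    unsigned int eax, ebx, ecx, edx;

    /* __get_cpuid returns 0 when the requested leaf is unavailable. */
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return false;
    return edx_reports_msr(edx);
#else
    return false;
#endif
}
```

A positive result only says the instructions exist. It does not make them usable from user space, and it says nothing about whether a particular MSR address such as 0x1a0 is implemented.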
There are lots of examples of using the MSRs in the kernel source.
For a single CPU, arch/i386/kernel/cpu/intel.c demonstrates disabling prefetch for the Xeon in the function:
static void __cpuinit Intel_errata_workarounds(struct cpuinfo_x86 *c)
rdmsr is a macro: its arguments are the MSR number, a variable that receives the low 32-bit word, and a variable that receives the high 32-bit word (the variables themselves, not pointers to them).
The wrmsr arguments are the MSR number, the low 32-bit word value, and the high 32-bit word value.
On multi-core or SMP systems, use the per-CPU variants, which take the CPU number as the first argument (and here the read variant does take pointers):
void rdmsr_on_cpu(unsigned int cpu, u32 msr_no, u32 *l, u32 *h);
void wrmsr_on_cpu(unsigned int cpu, u32 msr_no, u32 l, u32 h);