
How can I load values from memory without polluting the cache?

I want to read a memory location without polluting the cache. I am working on an x86 Linux machine. I tried using the MOVNTDQA assembler instruction:

  asm("movntdqa %[source], %[dest] \n\t"
      : [dest] "=x" (my_var) : [source] "m" (my_mem[0]) : "memory");

my_mem is an int* allocated with new, my_var is an int.

I have two problems with this approach:

  1. The code compiles, but I get an "Illegal instruction" error when running it. Any ideas why?
  2. I am not sure what type of memory is allocated with new. I would assume it is WB. According to the documentation, the MOVNTDQA instruction works only with the USWC memory type. How can I find out what memory type I am working with?

To summarize, my question is:

How can I read a memory location without polluting the cache on an X86 machine? Is my approach in the right direction, and can it be fixed to work?

Thanks.

Anna asked Aug 12 '09 10:08



1 Answer

The problem with the movntdqa instruction with an %xmm register as the target (loading from memory) is that it is only available with SSE4.1 and later. So far that means newer Core 2 (45 nm) or i7 processors only; on anything older the CPU raises an illegal-instruction fault. The other direction (storing data to memory) is available in earlier SSE versions.
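To avoid the crash on older CPUs, one option is to check for SSE4.1 at runtime before issuing the streaming load. The following is only a sketch, assuming GCC or Clang on x86: it uses the compiler builtin `__builtin_cpu_supports` and the `_mm_stream_load_si128` intrinsic (which compiles to movntdqa) instead of inline asm. The function names `have_sse41`, `load_first_int_nt` and `read_first_int` are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <smmintrin.h>   /* SSE4.1 intrinsics: _mm_stream_load_si128 */

/* Returns nonzero if the running CPU supports SSE4.1 (GCC/Clang builtin). */
int have_sse41(void)
{
    __builtin_cpu_init();
    return __builtin_cpu_supports("sse4.1");
}

/* Streaming (movntdqa) load of the first int of a 16-byte block.
   The target attribute lets this compile without -msse4.1. */
__attribute__((target("sse4.1")))
int load_first_int_nt(const int *src)
{
    /* movntdqa requires a 16-byte-aligned address */
    assert(((uintptr_t)src & 15) == 0);
    __m128i v = _mm_stream_load_si128((__m128i *)src);
    return _mm_cvtsi128_si32(v);   /* lowest 32 bits of the loaded vector */
}

/* Hypothetical wrapper: falls back to an ordinary load without SSE4.1. */
int read_first_int(const int *src)
{
    return have_sse41() ? load_first_int_nt(src) : src[0];
}
```

Note that the instruction always reads a full 16 bytes, so the buffer must be 16-byte aligned and at least 16 bytes long even if only the first int is of interest.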

For this instruction, the processor moves the data into one of a very few, very small read buffers (Intel doesn't specify the exact size, but assume it is in the range of 16 bytes), where it is readily available but gets evicted after a few other loads.

And it does not pollute the other caches, so if you have streaming data, your approach is viable.

Remember that you need to issue an sfence instruction afterwards.

Prefetching exists in two variants that matter here: prefetcht0 (prefetches data into all cache levels) and prefetchnta (prefetches non-temporal data). Prefetching into all caches is usually the right thing to do, but for a streaming data loop the latter is better, provided you use the streaming instructions consistently.

You use it with the address of an object you want to use in the near future, usually some iterations ahead if you have a loop. The prefetch instruction doesn't wait or block; it just makes the processor start fetching the data at the specified memory location.
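As a rough illustration of prefetching some iterations ahead, here is a minimal sketch using the `_mm_prefetch` intrinsic with the non-temporal hint `_MM_HINT_NTA` (which corresponds to prefetchnta). The function name `sum_streaming` and the distance `PREFETCH_AHEAD` are assumptions for the example; a sensible distance depends on the CPU and the work done per element and has to be measured.

```c
#include <assert.h>
#include <stddef.h>
#include <xmmintrin.h>   /* _mm_prefetch, _MM_HINT_NTA, _MM_HINT_T0 */

/* Assumed tuning value: how many ints ahead of the current index to prefetch. */
#define PREFETCH_AHEAD 64

/* Sums n ints while hinting the CPU to fetch upcoming data as non-temporal. */
long sum_streaming(const int *data, size_t n)
{
    assert(data != NULL || n == 0);
    long total = 0;
    for (size_t i = 0; i < n; ++i) {
        /* Issue one prefetch every 16 ints (64 bytes, an assumed cache-line
           size) and only while the target index is still inside the array. */
        if ((i % 16) == 0 && i + PREFETCH_AHEAD < n)
            _mm_prefetch((const char *)&data[i + PREFETCH_AHEAD], _MM_HINT_NTA);
        total += data[i];
    }
    return total;
}
```

Swapping `_MM_HINT_NTA` for `_MM_HINT_T0` gives the "all caches" variant; the loop itself does not change, since the prefetch is only a hint and does not affect the result.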

Gunther Piez answered Oct 16 '22 09:10
