
Software prefetching across page boundary on x86

My understanding is that hardware prefetching will never cross page boundaries. I'm wondering whether a software prefetch has the same restriction, i.e. can I use a software prefetch to avoid a future TLB miss? From searching around, it appears to be possible, but I couldn't find anything definitive in the documentation, so a reference would be good.

I'm specifically interested in Nehalem, Sandy Bridge and Westmere.

jmetcalfe asked Feb 08 '13 22:02




2 Answers

According to Intel's Optimization Reference Manual, it depends on the processor. From section 7.4.3:

There are cases where a PREFETCH will not perform the data prefetch. These include:

  • PREFETCH causes a DTLB (Data Translation Lookaside Buffer) miss. This applies to Pentium 4 processors with CPUID signature corresponding to family 15, model 0, 1, or 2. PREFETCH resolves DTLB misses and fetches data on Pentium 4 processors with CPUID signature corresponding to family 15, model 3.
  • An access to the specified address that causes a fault/exception.

Software prefetching may or may not avoid TLB misses, depending on the processor. It will not fetch the data if it would cause a page fault.
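To make the processor dependence concrete, here is a minimal sketch that decodes a CPUID signature (the EAX value returned by CPUID leaf 1) and applies the rule quoted above. The function names are hypothetical, and the check covers only the Pentium 4 models named in the quote; it says nothing about any other processor.

```c
#include <stdbool.h>
#include <stdint.h>

/* Display family from a CPUID leaf-1 EAX signature.
   The extended family field is added only when the base family is 15. */
static unsigned cpu_family(uint32_t eax) {
    unsigned family = (eax >> 8) & 0xF;
    if (family == 0xF)
        family += (eax >> 20) & 0xFF;
    return family;
}

/* Display model; the extended model field is used
   only when the base family is 6 or 15. */
static unsigned cpu_model(uint32_t eax) {
    unsigned base_family = (eax >> 8) & 0xF;
    unsigned model = (eax >> 4) & 0xF;
    if (base_family == 0x6 || base_family == 0xF)
        model |= ((eax >> 16) & 0xF) << 4;
    return model;
}

/* Hypothetical helper: true only for the parts the quoted manual text
   lists as dropping a PREFETCH on a DTLB miss (family 15, model 0, 1 or 2). */
static bool prefetch_dropped_on_dtlb_miss(uint32_t eax) {
    return cpu_family(eax) == 15 && cpu_model(eax) <= 2;
}
```

On GCC or Clang targeting x86, the signature can be read with `__get_cpuid(1, &eax, &ebx, &ecx, &edx)` from `<cpuid.h>` and passed to these helpers.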

If you want to ensure you avoid TLB misses, you could do a dummy read to load the data instead of using a prefetch instruction. This could cause a page fault to swap in a page, which could be either good or bad depending on your use case.
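A rough sketch of the two options, assuming GCC or Clang (`__builtin_prefetch` is a compiler builtin, and on x86 it is typically emitted as a PREFETCH instruction). The function names and the page size passed in are illustrative assumptions, not anything from the manual.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hint only: never faults, and on some processors it is dropped
   when the address misses the DTLB. */
static inline void touch_with_prefetch(const void *p) {
    __builtin_prefetch(p, 0, 3); /* read access, keep in all cache levels */
}

/* Dummy read: a real load, so the translation must complete.
   It can page-fault if the page is not resident. */
static inline uint8_t touch_with_load(const void *p) {
    return *(const volatile uint8_t *)p;
}

/* Read one byte every page_size bytes of [buf, buf + len) and return
   the number of reads issued. Assumes buf is page-aligned; otherwise
   a trailing partial page may be skipped. */
static size_t touch_pages(const void *buf, size_t len, size_t page_size) {
    assert(page_size > 0);
    const uint8_t *p = buf;
    size_t touched = 0;
    for (size_t off = 0; off < len; off += page_size) {
        (void)touch_with_load(p + off);
        touched++;
    }
    return touched;
}
```

The `volatile` cast is there so the compiler does not optimize the otherwise unused load away.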

ughoavgfhw answered Sep 20 '22 01:09


In modern processors (Nehalem, Westmere and Sandy Bridge), a software prefetch that misses the DTLB is no longer dropped, so it can fetch across a page boundary.

From the Intel optimization guide (section 7.3.3):

In older microarchitectures, PREFETCH causing a Data Translation Lookaside Buffer (DTLB) miss would be dropped. In processors based on Nehalem, Westmere, Sandy Bridge, and newer microarchitectures, Intel Core 2 processors, and Intel Atom processors, PREFETCH causing a DTLB miss can be fetched across a page boundary.
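As an illustration of how this might be used, here is a minimal sketch of a loop that prefetches one page ahead of the current position, so the prefetch address regularly lands in a page that has not been touched yet. It assumes GCC or Clang for `__builtin_prefetch`; `PREFETCH_DISTANCE`, the 64-byte line size and the function name are hypothetical choices that would need tuning by measurement.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical distance: one 4 KiB page ahead of the current element. */
#define PREFETCH_DISTANCE 4096

/* Sum a byte array, issuing one prefetch per assumed 64-byte cache line
   for the address PREFETCH_DISTANCE bytes ahead. */
static uint64_t sum_with_prefetch(const uint8_t *data, size_t len) {
    assert(data != NULL || len == 0);
    uint64_t sum = 0;
    for (size_t i = 0; i < len; i++) {
        /* Stay inside the buffer so the hint always targets valid data. */
        if ((i % 64) == 0 && i + PREFETCH_DISTANCE < len)
            __builtin_prefetch(data + i + PREFETCH_DISTANCE, 0, 3);
        sum += data[i];
    }
    return sum;
}
```

Whether this actually hides the TLB miss depends on the processor, as the quoted passage says, so it is worth benchmarking with and without the prefetch.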

jleahy answered Sep 22 '22 01:09