Background: I've implemented a stochastic algorithm that requires random ordering for best convergence. Doing so obviously destroys memory locality, however. I've found that by prefetching the next iteration's data, the performance drop is minimized.
I can prefetch n cache lines using _mm_prefetch in a simple, mostly OS- and compiler-portable fashion - but what's the length of a cache line? Right now I'm using a hardcoded value of 64, which seems to be the norm on x64 processors nowadays - but I don't know how to detect this at runtime, and a question about this last year found no simple solution.
I've seen GetLogicalProcessorInformation on Windows, but I'm leery of using such a complex API for something so simple, and it won't work on Macs or Linux anyhow.
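(For reference, a rough, untested sketch of what per-platform runtime detection could look like; the glibc sysconf name and the Darwin sysctl key are platform-specific extensions, and the fallback of 64 is an assumption:)

    #include <stddef.h>

    #if defined(_WIN32)
    #include <windows.h>
    #include <stdlib.h>
    static size_t cache_line_size(void)
    {
        size_t line = 64;                               /* assumed fallback */
        DWORD len = 0;
        GetLogicalProcessorInformation(NULL, &len);     /* query buffer size */
        SYSTEM_LOGICAL_PROCESSOR_INFORMATION *info = malloc(len);
        if (info && GetLogicalProcessorInformation(info, &len)) {
            for (DWORD i = 0; i < len / sizeof(*info); ++i)
                if (info[i].Relationship == RelationCache &&
                    info[i].Cache.Level == 1 &&
                    info[i].Cache.Type != CacheInstruction)
                    line = info[i].Cache.LineSize;      /* L1 data/unified line */
        }
        free(info);
        return line;
    }
    #elif defined(__APPLE__)
    #include <sys/sysctl.h>
    static size_t cache_line_size(void)
    {
        size_t line = 0, len = sizeof(line);
        if (sysctlbyname("hw.cachelinesize", &line, &len, NULL, 0) == 0 && line != 0)
            return line;
        return 64;                                      /* assumed fallback */
    }
    #else   /* assume Linux/glibc */
    #include <unistd.h>
    static size_t cache_line_size(void)
    {
        long line = sysconf(_SC_LEVEL1_DCACHE_LINESIZE);  /* glibc extension */
        return line > 0 ? (size_t)line : 64;              /* assumed fallback */
    }
    #endif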
Perhaps there's some other API/intrinsic entirely that prefetches a memory region specified in bytes (or words, or whatever), and would let me prefetch without knowing the cache line length?
Basically, is there a reasonable alternative to _mm_prefetch with #define CACHE_LINE_LEN 64?
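(To make the pattern concrete, here is a minimal sketch of prefetching the next iteration's data with the hard-coded line length; ELEM_SIZE, order, data, and process() are hypothetical placeholders for the real algorithm's structures:)

    #include <xmmintrin.h>   /* _mm_prefetch, _MM_HINT_T0 */
    #include <stddef.h>

    #define CACHE_LINE_LEN 64     /* the hard-coded value from the question */
    #define ELEM_SIZE      256    /* hypothetical size of one iteration's record */

    /* Hypothetical stand-in for the real per-iteration update. */
    static void process(unsigned char *elem) { elem[0] += 1; }

    /* While the current (randomly ordered) element is processed, prefetch
       every cache line of the element the next iteration will touch. */
    static void run_pass(unsigned char *data, const size_t *order, size_t n)
    {
        for (size_t i = 0; i < n; ++i) {
            if (i + 1 < n) {
                const char *next = (const char *)data + order[i + 1] * ELEM_SIZE;
                for (size_t off = 0; off < ELEM_SIZE; off += CACHE_LINE_LEN)
                    _mm_prefetch(next + off, _MM_HINT_T0);
            }
            process(data + order[i] * ELEM_SIZE);
        }
    }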
Prefetching with low predictive accuracy can improve performance only in over-provisioned systems. The data cache, however, is inherently under-provisioned, since it can hold only a subset of the data set, and prefetched data typically shares that cache space with demand-fetched data.
Cache prefetching is a technique used by computer processors to boost execution performance by fetching instructions or data from their original storage in slower memory to a faster local memory before it is actually needed (hence the term 'prefetch').
In computer architecture, prefetching refers to retrieving data and storing it in the buffer memory (cache) before the processor requires it. When the processor then needs the data, it is already available and can be accessed within a very short period of time.
A hardware prefetcher is a data prefetching technique implemented as a hardware component in the processor. Any other prefetching technique, such as software prefetching via explicit prefetch instructions like _mm_prefetch, is a non-hardware prefetcher.
There's a question asking just about the same thing here. You can read the cache line size from CPUID if you feel like delving into some assembly (or compiler intrinsics). You'll have to write platform-specific code for this, of course.
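(On GCC/Clang this doesn't even need hand-written assembly; a sketch using the <cpuid.h> intrinsic, where leaf 1 reports the CLFLUSH line size in 8-byte units, which on current x86 parts matches the cache line size, and the 64-byte fallback is an assumption:)

    #include <cpuid.h>
    #include <stddef.h>

    /* Derive the cache line size from the CLFLUSH line size reported by
       CPUID leaf 1 (EBX bits 15:8, in 8-byte units). MSVC would use
       __cpuid from <intrin.h> instead. */
    static size_t cache_line_size_cpuid(void)
    {
        unsigned int eax, ebx, ecx, edx;
        if (__get_cpuid(1, &eax, &ebx, &ecx, &edx))
            return ((ebx >> 8) & 0xff) * 8;   /* CLFLUSH line size */
        return 64;                            /* assumed fallback */
    }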
You're probably already familiar with Agner Fog's optimization manuals, which give cache information for many popular processors. If you can determine which CPUs you expect to encounter, you can hard-code their cache line sizes and look up the CPU vendor information at runtime to select the right one.
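(A sketch of that idea: the vendor string comes from CPUID leaf 0, assembled from EBX, EDX, ECX in that order; the sizes in the table are placeholders you'd fill in from Agner Fog's tables for the CPUs you actually target:)

    #include <cpuid.h>
    #include <string.h>

    static unsigned line_size_by_vendor(void)
    {
        unsigned int eax, ebx, ecx, edx;
        char vendor[13] = {0};
        if (!__get_cpuid(0, &eax, &ebx, &ecx, &edx))
            return 64;                          /* assumed fallback */
        memcpy(vendor + 0, &ebx, 4);
        memcpy(vendor + 4, &edx, 4);
        memcpy(vendor + 8, &ecx, 4);
        /* Placeholder table: replace with values for your target CPUs. */
        if (strcmp(vendor, "GenuineIntel") == 0) return 64;
        if (strcmp(vendor, "AuthenticAMD") == 0) return 64;
        return 64;                              /* conservative default */
    }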