How can I decrease the number of cache misses when designing a C++ program? Does inlining functions always help, or is it only worthwhile when the program is CPU-bound (i.e. the program is computation-oriented rather than I/O-oriented)?


decreasing cache misses through good design

2 Answers

Here are some things that I like to consider when working on this kind of code.

Consider whether you want "structures of arrays" or "arrays of structures". Which one works better depends on how each part of the data is accessed.
Try to size structures so that they pack cache lines evenly, i.e. a size that divides, or is a multiple of, the cache line size (commonly 64 bytes on current x86 CPUs, 32 bytes on some older ones).
Partition your data into hot and cold elements. If you have an array of objects of class o, and you use o.x, o.y, o.z together frequently but only occasionally need to access o.i, o.j, o.k, then consider putting o.x, o.y, and o.z together and moving the i, j, and k parts to a parallel auxiliary data structure.
If you have multidimensional arrays of data, then with the usual row-major layout, access will be very fast when scanning along the preferred dimension and very slow along the others. Mapping the data along a space-filling curve instead will help to balance access speeds when traversing in any dimension. (Blocking techniques are similar; they're just Z-order with a larger radix.)
If you must incur a cache miss, then try to do as much as possible with that data in order to amortize the cost.
Are you doing anything multi-threaded? Watch out for slowdowns from cache coherence protocols (false sharing). Pad flags and small counters so that they'll be on separate cache lines.
SSE on Intel provides some prefetch intrinsics (for example _mm_prefetch), which can help if you know what you'll be accessing far enough ahead of time.
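
As a rough illustration of two of the points above (hot/cold partitioning and padding small counters), here is a minimal C++ sketch. It is not taken from the answer itself: the names ParticleHot, ParticleCold, Particles, sum_coordinates, PaddedCounter and kCacheLine are made up for the example, and the 64-byte cache line size is an assumption you should check for your target CPU.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed cache line size (hypothetical constant for this sketch).
// 64 bytes is common on current x86-64 CPUs, but verify for your target.
constexpr std::size_t kCacheLine = 64;

// Hot fields: read together on every pass over the data.
struct ParticleHot {
    float x, y, z;
};

// Cold fields: needed only occasionally, so they live in a parallel
// array that uses the same index as the hot array.
struct ParticleCold {
    std::int32_t i, j, k;
};

struct Particles {
    std::vector<ParticleHot> hot;
    std::vector<ParticleCold> cold;

    void add(float x, float y, float z,
             std::int32_t i, std::int32_t j, std::int32_t k) {
        hot.push_back({x, y, z});
        cold.push_back({i, j, k});
        // Invariant: both arrays always describe the same objects.
        assert(hot.size() == cold.size());
    }
};

// Scans only the hot array, so no cache space is spent on i, j, k.
inline float sum_coordinates(const Particles& p) {
    float total = 0.0f;
    for (const ParticleHot& h : p.hot) {
        total += h.x + h.y + h.z;
    }
    return total;
}

// A counter aligned and padded to a full (assumed) cache line, so that
// adjacent counters in an array never share a line. This is one way to
// avoid false sharing between threads that each update their own counter.
struct alignas(kCacheLine) PaddedCounter {
    std::uint64_t value = 0;
};

static_assert(sizeof(PaddedCounter) == kCacheLine,
              "each counter should occupy exactly one cache line");
```

With this layout, a loop such as sum_coordinates touches 12 bytes per object instead of 24, and an array like PaddedCounter counters[num_threads] gives each thread its own line at the cost of some wasted space. Whether either change pays off depends on the access pattern, so measure before and after.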

answered Sep 27 '22 21:09

Boojum

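For data-bound operations:

Use arrays and vectors in preference to lists, maps, and sets.
Process by rows rather than by columns (C++ arrays are laid out row-major, so a row is contiguous in memory).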
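
To make the row-versus-column point concrete, here is a small sketch, assuming a matrix stored row-major in a single std::vector. The Matrix type and the two function names are hypothetical and only serve the illustration.

```cpp
#include <cstddef>
#include <vector>

// A rows x cols matrix stored row-major in one contiguous vector
// (hypothetical helper type for this sketch).
struct Matrix {
    std::size_t rows;
    std::size_t cols;
    std::vector<double> data;

    Matrix(std::size_t r, std::size_t c) : rows(r), cols(c), data(r * c, 0.0) {}

    double& at(std::size_t r, std::size_t c) { return data[r * cols + c]; }
    double at(std::size_t r, std::size_t c) const { return data[r * cols + c]; }
};

// Cache-friendly order: the inner loop walks consecutive addresses,
// so every byte of each fetched cache line gets used.
inline double sum_by_rows(const Matrix& m) {
    double total = 0.0;
    for (std::size_t r = 0; r < m.rows; ++r) {
        for (std::size_t c = 0; c < m.cols; ++c) {
            total += m.at(r, c);
        }
    }
    return total;
}

// Same elements, but the inner loop jumps by `cols` elements per step.
// For wide matrices each access tends to land on a different cache line.
inline double sum_by_columns(const Matrix& m) {
    double total = 0.0;
    for (std::size_t c = 0; c < m.cols; ++c) {
        for (std::size_t r = 0; r < m.rows; ++r) {
            total += m.at(r, c);
        }
    }
    return total;
}
```

Both functions visit the same elements; only the order of memory accesses differs. On large matrices the row-wise version is typically faster, though the size of the gap depends on the hardware and the matrix dimensions.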

answered Sep 27 '22 23:09

Chris


Josef
