Does alignment really matter for performance in C++11?

Stroustrup's book advises ordering the members of a struct from the largest to the smallest. But I wonder whether anyone has actually measured if this makes a difference, and whether it is worth thinking about when writing code.

524

asked Dec 28 '13 00:12

user3111311

2 Answers

Alignment matters not only for performance, but also for correctness. Some architectures will fail with a processor trap if data is not aligned correctly, or will access the wrong memory location. On others, an access to an unaligned variable is broken into multiple accesses and bit shifts (often inside the hardware, sometimes by an OS trap handler), losing atomicity.
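As a minimal sketch of how to stay on the safe side when bytes arrive at arbitrary offsets (for example inside a network or file buffer), the value can be copied with std::memcpy instead of dereferencing a cast pointer. The helper names load_u32 and store_u32 are hypothetical, chosen only for this illustration; the value is read in the host's native byte order.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: read a 32-bit value from a possibly unaligned address.
// std::memcpy has no alignment requirement on its arguments, so this avoids
// the misaligned access that *reinterpret_cast<const std::uint32_t*>(p)
// could perform.
inline std::uint32_t load_u32(const unsigned char* p) {
    std::uint32_t v;
    std::memcpy(&v, p, sizeof v);
    return v;
}

// Hypothetical helper: write a 32-bit value to a possibly unaligned address.
inline void store_u32(unsigned char* p, std::uint32_t v) {
    std::memcpy(p, &v, sizeof v);
}
```

Optimizing compilers commonly turn such a fixed-size memcpy into a single load or store on targets that tolerate unaligned access, so this form usually costs nothing there.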

The advice to sort members in descending order of size is for optimal packing (minimum space wasted by padding), not for alignment or speed. Members will be correctly aligned no matter what order you list them in, unless you request non-conformant layout using specialized pragmas (e.g. the non-portable #pragma pack) or keywords. Total structure size is affected by padding and does affect speed, but the ordering that minimizes size is often not the one that is optimal for speed.
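A small sketch of that point, assuming a compiler that supports #pragma pack (GCC, Clang and MSVC do) and a typical ABI where uint32_t is 4-byte aligned. The struct names Natural and Packed are made up for the illustration.

```cpp
#include <cstddef>
#include <cstdint>

// Default layout: the compiler inserts padding after `tag` so that
// `value` sits at an offset that is a multiple of its alignment.
struct Natural {
    std::uint8_t  tag;
    std::uint32_t value;
};

// Non-portable: pack(1) removes the padding, so `value` is misaligned.
#pragma pack(push, 1)
struct Packed {
    std::uint8_t  tag;
    std::uint32_t value;
};
#pragma pack(pop)

static_assert(offsetof(Natural, value) % alignof(std::uint32_t) == 0,
              "default layout keeps value aligned");
static_assert(offsetof(Packed, value) == 1,
              "packed layout places value right after tag");
static_assert(sizeof(Packed) == 5,
              "packed layout has no padding at all");
```

The packed version saves space, but every access to Packed::value is then an unaligned access with the costs described in the first paragraph of this answer.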

For best performance, you should try to get members which are used together into the same cache line, and members that are accessed by different threads into different cache lines. Sometimes that means a lot of padding to get a cross-thread shared variable alone in its own cache line. But that's better than taking a performance hit from false sharing.
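One way to sketch the padding approach in C++11 is alignas. The 64-byte line size below is an assumption (it is common on x86-64 but not universal), and the type names are hypothetical.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>

// Assumption: 64-byte cache lines. Check the value for your target.
constexpr std::size_t kCacheLine = 64;

// Two counters written by different threads. They are adjacent in memory,
// so they will usually share one cache line and suffer false sharing.
struct SharedCounters {
    std::atomic<std::uint64_t> produced;
    std::atomic<std::uint64_t> consumed;
};

// alignas rounds both the alignment and the size of the struct up to a
// full cache line, so each counter gets a line to itself.
struct alignas(kCacheLine) PaddedCounter {
    std::atomic<std::uint64_t> value{0};
};

struct IsolatedCounters {
    PaddedCounter produced;  // written by the producer thread
    PaddedCounter consumed;  // written by the consumer thread
};

static_assert(sizeof(PaddedCounter) == kCacheLine,
              "each counter occupies exactly one assumed cache line");
static_assert(sizeof(IsolatedCounters) == 2 * kCacheLine,
              "the two counters never share a line");
```

IsolatedCounters is eight times the size of SharedCounters on a typical 64-bit platform, which is the "lot of padding" trade-off mentioned above.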

103

answered Oct 05 '22 20:10

Ben Voigt

Just to add to Ben's great answer:

Defining struct members in the same order in which your application later accesses them will reduce cache misses and possibly increase performance. This helps provided the entire structure does not fit into L1 cache.

On the other hand, ordering the members from biggest to smallest may reduce overall memory usage, which may be important when storing an array of small structures.

Let's assume an architecture (I don't know them that well; I think this is the case for 32-bit gcc with default settings, and someone will correct me in the comments) on which this structure:

    struct MemoryUnused {
        uint8_t  val0;
        uint16_t val1;
        uint8_t  val2;
        uint16_t val3;
        uint8_t  val4;
        uint32_t val5;
        uint8_t  val6;
    };

takes 20 bytes in memory, while this:

    struct MemoryNotLost {
        uint32_t val5;
        uint16_t val1;
        uint16_t val3;
        uint8_t  val0;
        uint8_t  val2;
        uint8_t  val4;
        uint8_t  val6;
    };

takes 12. That's 8 bytes lost to padding, which makes the first struct about 67% larger than the smaller one. With a large array of such structs the saving would be significant and, simply because less memory is used, it will reduce the number of cache misses.
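The numbers can be checked with sizeof and offsetof. This is a sketch that assumes a typical ABI where uint16_t is 2-byte aligned and uint32_t is 4-byte aligned (for example x86 or x86-64 with GCC or Clang); the helper print_sizes is a hypothetical name added for the illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>

struct MemoryUnused {
    std::uint8_t  val0;  // offset 0, then 1 byte of padding
    std::uint16_t val1;  // offset 2
    std::uint8_t  val2;  // offset 4, then 1 byte of padding
    std::uint16_t val3;  // offset 6
    std::uint8_t  val4;  // offset 8, then 3 bytes of padding
    std::uint32_t val5;  // offset 12
    std::uint8_t  val6;  // offset 16, then 3 bytes of tail padding
};

struct MemoryNotLost {
    std::uint32_t val5;  // offset 0
    std::uint16_t val1;  // offset 4
    std::uint16_t val3;  // offset 6
    std::uint8_t  val0;  // offsets 8..11 hold the four single bytes
    std::uint8_t  val2;
    std::uint8_t  val4;
    std::uint8_t  val6;
};

// Under the stated ABI assumption these hold at compile time.
static_assert(sizeof(MemoryUnused) == 20, "8 of the 20 bytes are padding");
static_assert(sizeof(MemoryNotLost) == 12, "no padding needed");

// Hypothetical helper: print both sizes for inspection on your own target.
inline void print_sizes() {
    std::printf("MemoryUnused:  %zu bytes\n", sizeof(MemoryUnused));
    std::printf("MemoryNotLost: %zu bytes\n", sizeof(MemoryNotLost));
}
```

If a static_assert fires on your platform, its alignment rules differ from the ones assumed here, and the printed sizes show what the compiler actually chose.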

answered Oct 05 '22 18:10

Dariusz

Tags: c++, c++11, memory-alignment
