
C++ alignment (when to use alignas)

I've recently learned about sizes and alignments of structs. I am quite familiar with how to use the alignas() specifier and how it works. I have seen examples of proper usage (concerning semantics, not real-life use cases) and the way it changes the size of a type/variable.

However, I don't know when it is useful in my code. Could you list some use cases where a developer should manually specify the alignment of data?

smutnyjoe asked Aug 03 '17 08:08



1 Answer

There are plenty of use cases where alignas is handy in multithreaded, latency-sensitive applications, e.g. high-frequency trading applications.

alignas gives you tighter control over how your objects are laid out across CPU cache lines, which can make access to them faster. The two goals below are the main use cases for alignas:

  1. You want to avoid unnecessary invalidation of your data in cache lines.
  2. You want to reduce the number of cache lines the CPU must read to access an object, so that CPU cycles are not wasted.

How does alignment to cache lines using alignas help?
Use 1 - Avoiding unnecessary invalidation of data in a cache line. You can use alignas to keep the objects used by separate threads on separate cache lines, so that one thread does not inadvertently invalidate a cache line held by another core.

How this happens: Consider a thread in your process, thread 1, running on core 0 and writing to some address, say xxxx. The cache line containing this address is now loaded into the L1 cache of core 0. Thread 2, running on another core, is accessing address xxxx + n bytes. If both addresses happen to be on the same cache line, then any write by thread 2 will unnecessarily invalidate the copy of that line in core 0's cache. Thread 1 is then delayed until the cache line has been loaded again. This effect is commonly called false sharing, and it hampers performance in multithreaded environments.
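As a minimal sketch of Use 1, assuming a 64-byte cache line (common on x86-64, but not guaranteed), per-thread counters could be padded as below. The names kCacheLine, PlainCounter, PaddedCounter and same_cache_line are hypothetical and only serve the illustration.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>

// Assumed cache line size; 64 bytes is typical but hardware dependent.
constexpr std::size_t kCacheLine = 64;

// Unpadded: neighbouring array elements sit only a few bytes apart,
// so counters written by different threads can share one cache line.
struct PlainCounter {
    std::atomic<std::uint64_t> value{0};
};

// Padded: alignas raises the alignment to kCacheLine, and the size is
// rounded up to a multiple of the alignment, so neighbouring array
// elements always start on different cache lines.
struct alignas(kCacheLine) PaddedCounter {
    std::atomic<std::uint64_t> value{0};
};

// Returns true if both addresses fall into the same kCacheLine-sized block.
inline bool same_cache_line(const void* a, const void* b) {
    const auto ua = reinterpret_cast<std::uintptr_t>(a);
    const auto ub = reinterpret_cast<std::uintptr_t>(b);
    return ua / kCacheLine == ub / kCacheLine;
}
```

With an array such as PaddedCounter counters[2], where thread 1 only writes counters[0] and thread 2 only writes counters[1], same_cache_line(&counters[0], &counters[1]) is false, so the two writers no longer invalidate each other's line. Note that heap allocation of over-aligned types only honours the alignment automatically from C++17 onwards.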

Use 2 - Aligning your objects to cache line boundaries, so that an object is not spread across more cache lines than necessary. This saves CPU cycles. E.g. if your object size is 118 bytes, it's better to align it to 64 bytes, since the cache line size is 64 bytes on most processors now.

If you don't, your object may be laid out as follows on 64-byte cache lines. (The example assumes the object has an actual size of 118 bytes and a natural alignment of 4, so its size is rounded up to a multiple of 4, i.e. 120 bytes.)

Cache line 1 <--- Object 1 60 bytes ---> <--- Your object 4 bytes --->
Cache line 2 <--------------- Your object 64 bytes --------------->
Cache line 3 <--- Your object 52 bytes ---> <--- Some other object 12 bytes --->

Since the CPU reads memory in whole cache lines, reading your object would require fetching 3 cache lines. Consider alignas(64) if you want to optimize this. With it, your object always starts on a cache line boundary (its size is padded to 128 bytes) and always spans exactly 2 cache lines.
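The arithmetic above can be checked with a small sketch. It assumes a 64-byte cache line and a 4-byte int; the types Order and AlignedOrder and the helper lines_touched are hypothetical names made up for this illustration.

```cpp
#include <cstddef>

// Assumed cache line size; 64 bytes is typical but hardware dependent.
constexpr std::size_t kCacheLine = 64;

// 118 bytes of members with a natural alignment of 4,
// so sizeof is rounded up to 120 (assuming a 4-byte int).
struct Order {
    int  id;            // 4 bytes
    char payload[114];  // 114 bytes, 118 in total
};

// Same members, but forced onto a cache line boundary.
// sizeof is rounded up to the next multiple of 64, i.e. 128.
struct alignas(kCacheLine) AlignedOrder {
    int  id;
    char payload[114];
};

// Number of cache lines touched by an object of `size` bytes that
// starts `offset` bytes into a cache-line-aligned buffer.
constexpr std::size_t lines_touched(std::size_t offset, std::size_t size) {
    return (offset + size - 1) / kCacheLine - offset / kCacheLine + 1;
}
```

Here lines_touched(60, sizeof(Order)) gives 3, matching the layout in the diagram, while lines_touched(0, sizeof(AlignedOrder)) gives 2. The price is 8 extra bytes of padding per object.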

Caveats: Please note that you need to examine your objects carefully before considering alignas, because a careless approach leads to more padding and thus wastes more cache space (for example in the L2 cache). There are simple techniques for ordering data members so that padding is kept to a minimum.
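One such technique is to declare members from largest to smallest alignment. The sketch below uses two hypothetical structs, Wasteful and Compact, with identical members. The byte counts in the comments assume a typical 64-bit ABI such as x86-64 Linux and may differ elsewhere.

```cpp
#include <cstdint>

// Declaration order forces padding before each larger-aligned member.
struct Wasteful {
    char         flag;   // 1 byte, then 7 bytes of padding
    double       price;  // 8 bytes
    char         side;   // 1 byte, then 3 bytes of padding
    std::int32_t qty;    // 4 bytes
};                       // typically 24 bytes

// Same members, ordered from largest to smallest alignment.
struct Compact {
    double       price;  // 8 bytes
    std::int32_t qty;    // 4 bytes
    char         flag;   // 1 byte
    char         side;   // 1 byte, then 2 bytes of tail padding
};                       // typically 16 bytes
```

Reordering like this before applying alignas means less padding is added on top, so fewer cache lines are spent on bytes that carry no data.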

Hope this helps and Good Luck!

Happy ITWala answered Nov 18 '22 20:11
