I couldn't find a decent document that explains how the alignment system works and why some types are more strictly aligned than others.
Data alignment: Data alignment means placing data in memory at an address that is a multiple of some fixed size, typically the word size or the size of the data type itself. This improves performance because of the way the CPU accesses memory.
Aligned pointer means a pointer whose adjacent memory locations can be accessed by adding a constant and its multiples. For char a[5] = "12345"; a acts as a constant pointer to the first element: if you add the size of a char to it each time, you access the next character. That is, a + 1 (one step of sizeof(char)) will access '2'.
Alignment helps the CPU fetch data from memory efficiently: fewer cache misses/flushes, fewer bus transactions, etc. Some memory types (e.g. RDRAM, DRAM etc.) need to be accessed in a structured manner (aligned "words" and in "burst transactions", i.e. many words at one time) in order to yield efficient results.
Every complete object type has a property called alignment requirement, which is an integer value of type size_t representing the number of bytes between successive addresses at which objects of this type can be allocated.
I'll try to explain in short.
The architecture of your computer is composed of a processor and memory. Memory is organized in cells, so:
0x00 | data | 0x01 | ... | 0x02 | ... |
Each memory cell has a fixed size, that is, the number of bits it can store. This is architecture dependent.
When you define a variable in your C/C++ program, one or more different cells are occupied by your program.
For example
int variable = 12;
Suppose each cell contains 32 bits and the int type size is 32 bits; then somewhere in your memory:
variable: | 0 0 0 c | // c is hexadecimal of 12.
When your CPU has to operate on that variable, it needs to bring it into one of its registers. In "1 clock" a CPU can fetch only a small number of bits from memory; that size is usually called a WORD. This size is architecture dependent as well.
Now suppose you have a variable which is stored, because of some offset, in two cells.
For example, I have two different pieces of data to store (I'm going to use a "string representation" to make it clearer):
data1: "ab" data2: "cdef"
So the memory will be laid out this way (2 different cells):
|a b c d| |e f 0 0|
That is, data1 occupies just half of the cell, so data2 occupies the remaining part and a part of a second cell.
Now suppose your CPU wants to read data2. The CPU needs 2 clocks in order to access the data, because within one clock it reads the first cell and within the other clock it reads the remaining part in the second cell.
If we align data2 in accordance with this memory example, we can introduce a sort of padding and shift data2 entirely into the second cell.
|a b 0 0| |c d e f| --- padding
In that way the CPU will spend only "1 clock" in order to access data2.
An alignment system just introduces that padding in order to align the data with the memory of the system, in accordance with the architecture.
I will not go into depth in this answer. However, broadly speaking, memory alignment comes from the requirements of the context.
In the example above, having padding (so the data is memory-aligned) can save the CPU cycles needed to retrieve the data. This can affect the execution performance of the program because fewer memory accesses are needed.
However, beyond the above example (made only for the sake of the explanation), there are many other scenarios where memory alignment is useful or even needed.
For example, some architectures might have strict requirements on how memory can be accessed. In such cases, the padding helps to allocate memory in a way that fulfills the platform constraints.