I'm a bit confused about the concept of memory alignment. So here's my doubt: What text says is that say if you wanna read 4 bytes of data, starting from an address that is not divisible by 4, you have the case of an unaligned memory access. Example, if I wanna read 10 bytes starting at address 05, this will be termed as an unaligned access (http://www.mjmwired.net/kernel/Documentation/unaligned-memory-access.txt).
Will this case be specific to a 4 byte word addressable architecture or does this hold valid for a byte addressable architecture as well? If the above case is unaligned for a byte addressable architecture, why is it so?
Thanks!
The CPU can operate on an aligned word of memory atomically, meaning that no other instruction can interrupt that operation. This is critical to the correct operation of many lock-free data structures and other concurrency paradigms.
For instance, if the address of a data is 12FEECh (1244908 in decimal), then it is 4-byte alignment because the address can be evenly divisible by 4. (You can divide it by 2 or 1, but 4 is the highest number that is divisible evenly.)
What does it mean to be 4-byte aligned? A 1-byte variable (typically a char in C/C++) is always aligned. A 2-byte variable (typically a short in C/C++) in order to be aligned must lie at an address divisible by 2. A 4-byte variable (typically an int in C/C++) must lie at an address divisible by 4 and so on.09-Apr-2020.
An object that is "8 bytes aligned" is stored at a memory address that is a multiple of 8. Many CPUs will only load some data types from aligned locations; on other CPUs such access is just faster.
As a general rule, bit 0 in memory is gated onto a bus and bit 0 of that bus is connected to bit 0 of every register. It goes on like this until bit 31. There may be special hardware that directs each byte (bits 15:8, 23:16, and 31:24) onto the low order byte, bits 7:0. (When you get to bit "32", it's actually bit 0 of the 4-byte word at address 4.)
However, in the nominal case there is not any special hardware that moves bytes to any position other than the one they are nominally connected to in the natural order and, maybe, byte lane 0.
Imagine a simple memory chip with 32 data pins and a simple CPU with 32 data pins. A given data pin on each chip is wired to the corresponding one on the other, and only to that one. There simply is no way for a simple CPU to do the misaligned read at all.
So, consider a read from 0. The next 4 bytes all fall into a register as wired, and this also happens for a read from address 4. But what if you read (32 bits) from address 1? Or 2? Or 3? Although the read cannot be done directly in hardware, a fancy controller can cause a whole lot of things to happen:
All of these things take extra time.
Note. In reality the data bus is typically a multiple of 32-bits and so is the memory. Special hardware may exist for realigning objects. But even then, because it's an abnormal case, it may not get the pipeline optimizations that properly aligned reads get, and even with special hardware there is probably a time penalty for running operands through it.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With