
Why does CPU access memory on a word boundary?

I have often heard that data should be properly aligned in memory for efficient access, because the CPU accesses memory on word boundaries.

So in the following scenario, the CPU has to make 2 memory accesses to get a single word.

Supposing: 1 word = 4 bytes ("|" stands for word boundary. "o" stands for byte boundary)

|----o----o----o----|----o----o----o----|   (The word boundary in CPU's eye)
           ----o----o----o----              (What I want to read from memory)

Why should this happen? What is the root cause of the CPU only being able to read at word boundaries?

If the CPU can only access memory at 4-byte word boundaries, the address bus should only need to be 30 bits wide, not 32, because the last 2 bits are always 0 in the CPU's eyes.
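To make the arithmetic behind that 30-bit remark concrete, here is a minimal sketch, assuming 32-bit byte addresses and 4-byte words as in the scenario above. The helper names (`word_index`, `byte_offset`, `is_word_aligned`, `word_address`) are made up for illustration and do not describe any particular CPU.

```c
#include <assert.h>
#include <stdint.h>

#define WORD_SIZE 4u  /* assumption taken from the question: 1 word = 4 bytes */

/* Index of the word that contains the byte at `addr` (the upper 30 bits). */
static uint32_t word_index(uint32_t addr) { return addr / WORD_SIZE; }

/* Position of that byte inside its word (the lower 2 bits). */
static uint32_t byte_offset(uint32_t addr) { return addr % WORD_SIZE; }

/* An address is word-aligned exactly when its two low bits are zero. */
static int is_word_aligned(uint32_t addr) { return byte_offset(addr) == 0; }

/* Rebuild the byte address of a word from its 30-bit index. */
static uint32_t word_address(uint32_t index) {
    assert(index < (1u << 30));  /* only 30 bits carry information */
    return index * WORD_SIZE;
}
```

For example, byte address 6 lies in word 1 at offset 2, so it is not word-aligned, while `word_address(2)` gives back the aligned address 8.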

ADD 1

Moreover, if we accept that the CPU must read at a word boundary, why can't the boundary start wherever I want to read? It seems that the boundaries are fixed in the CPU's eyes.

ADD 2

According to AnT, it seems that the boundaries are hardwired, and that they are hardwired by the memory access hardware. The CPU is innocent as far as this is concerned.

smwikipedia asked Sep 07 '10 05:09




1 Answer

The meaning of "can" (in "...CPU can access...") in this case depends on the hardware platform.

On the x86 platform, CPU instructions can access data aligned on absolutely any boundary, not only on a "word boundary". A misaligned access might be less efficient than an aligned one, but the reasons for that have absolutely nothing to do with the CPU. It has everything to do with how the underlying low-level memory access hardware works. It is quite possible that in this case the memory-related hardware will have to make two accesses to the actual memory, but that's something CPU instructions don't know about and don't need to know about. As far as the CPU is concerned, it can access any data on any boundary. The rest is implemented transparently to CPU instructions.

On hardware platforms like Sun SPARC, the CPU cannot access misaligned data (put simply, your program will crash if you attempt it), which means that if for some reason you need to perform this kind of misaligned access, you'll have to implement it manually and explicitly: split it into two (or more) CPU instructions and thus explicitly perform two (or more) memory accesses.
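As a rough illustration of what "manually and explicitly" can look like in C, the sketch below assembles a 32-bit value from four single-byte loads, which have no alignment requirement. The function name `load_u32_le` and the little-endian byte order are assumptions made for the example. A big-endian platform would combine the bytes in the opposite order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Load a 32-bit value from any byte address using four byte-sized
 * accesses. Little-endian byte order is assumed for illustration. */
static uint32_t load_u32_le(const unsigned char *p) {
    assert(p != NULL);
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```

Called on a pointer one byte past the start of a buffer, this never issues a 4-byte load at a misaligned address, at the cost of several smaller accesses. Copying the bytes into an aligned variable with the standard `memcpy` is another common way to get the same effect.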

As for why it is so... well, that's just how modern computer memory hardware works. The data has to be aligned. If it is not aligned, the access either is less efficient or does not work at all.

A very simplified model of modern memory would be a grid of cells (rows and columns), each cell storing a word of data. A programmable robotic arm can put a word into a specific cell and retrieve a word from a specific cell, one at a time. If your data is spread across several cells, you have no other choice but to make several consecutive trips with that robotic arm. On some hardware platforms the task of organizing these consecutive trips is hidden from the CPU (meaning that the arm itself knows what to do to assemble the necessary data from several pieces); on other platforms it is visible to the CPU (meaning that it is the CPU that is responsible for organizing these consecutive trips of the arm).
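The robotic-arm model can be sketched as a toy simulation. Everything here is an assumption made for illustration: the 4-byte cell size, the tiny four-cell grid, little-endian byte numbering, and the names `cells`, `trips`, `fetch_cell` and `read_word`. It is not how any real memory controller is implemented. It only shows why a misaligned read costs a second trip.

```c
#include <assert.h>
#include <stdint.h>

#define WORD_SIZE 4u
#define NUM_WORDS 4u

/* The "grid of cells": each cell holds one 4-byte word. */
static const uint32_t cells[NUM_WORDS] = {
    0x33221100u, 0x77665544u, 0xBBAA9988u, 0xFFEEDDCCu
};

static unsigned trips;  /* how many times the "arm" has fetched a cell */

/* One trip of the arm: fetch a whole cell by its index. */
static uint32_t fetch_cell(uint32_t index) {
    assert(index < NUM_WORDS);
    trips++;
    return cells[index];
}

/* Read 4 bytes starting at any byte address. The caller must keep
 * the whole read inside the grid. */
static uint32_t read_word(uint32_t byte_addr) {
    uint32_t index = byte_addr / WORD_SIZE;
    uint32_t shift = (byte_addr % WORD_SIZE) * 8u;
    uint32_t low = fetch_cell(index);
    if (shift == 0)
        return low;                         /* aligned: one trip */
    uint32_t high = fetch_cell(index + 1);  /* misaligned: second trip */
    return (low >> shift) | (high << (32u - shift));
}
```

Reading at byte address 4 takes one trip, while reading at byte address 1 takes two trips and stitches the result together from both cells. Whether that stitching happens in hardware or in your own code is exactly the difference between the two kinds of platform described above.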

AnT answered Sep 21 '22 08:09