Prohibit unaligned memory accesses on x86/x86_64

I want to emulate a system that prohibits unaligned memory accesses on x86/x86_64. Is there a debugging tool or a special mode that does this?

I want to run many (CPU-intensive) tests on several x86/x86_64 PCs while working with software (C/C++) designed for SPARC or a similar CPU, but my access to SPARC hardware is limited.

As far as I know, SPARC always checks that memory reads and writes are naturally aligned (a byte can be read from any address, but a 4-byte word can be read only when the address is divisible by 4).
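
To make the natural-alignment rule concrete, here is a minimal C sketch of a software check. The helper name is_naturally_aligned is hypothetical (it is not from any library), and it assumes the access size is a nonzero power of two, as it is for the usual 1-, 2-, 4- and 8-byte accesses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if an access of `size` bytes at `addr`
 * would be naturally aligned (address divisible by the access size),
 * and 0 otherwise. Assumes `size` is a nonzero power of two. */
static int is_naturally_aligned(const void *addr, size_t size)
{
    assert(size != 0 && (size & (size - 1)) == 0);
    /* For a power-of-two size, addr % size == addr & (size - 1). */
    return ((uintptr_t)addr & (size - 1)) == 0;
}
```

Calling such a helper before each multi-byte load or store in test builds would only approximate what the hardware does on SPARC, since it checks the accesses you instrument rather than every access the program makes.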

Maybe Valgrind or PIN has such a mode? Or is there a special compiler mode? I'm looking for a non-commercial Linux tool, but Windows tools are acceptable too.

Or maybe there is a secret CPU flag in EFLAGS?

osgx asked Aug 06 '12 23:08


2 Answers

I've just read the question "Does unaligned memory access always cause bus errors?", which links to the Wikipedia article on Segmentation Fault.

The article has a wonderful reminder of a rather uncommon Intel processor flag: AC, the Alignment Check flag.

Here's how to enable it. This is taken from Wikipedia's Bus Error example, with a red-zone clobber bug fixed for x86-64 System V so that it is safe on Linux and macOS, and converted from Basic asm, which is never a good idea inside functions because you want changes to AC to be ordered with respect to memory accesses:

#if defined(__GNUC__)
# if defined(__i386__)
    /* Enable Alignment Checking on x86 */
    __asm__("pushf\n orl $0x40000,(%%esp)\n popf" ::: "memory");
# elif defined(__x86_64__) 
     /* Enable Alignment Checking on x86_64 */
    __asm__("add $-128, %%rsp \n"    // skip past the red-zone, in case there is one and the compiler has local vars there.
            "pushf\n"
            "orl $0x40000,(%%rsp)\n"
            "popf \n"
            "sub $-128, %%rsp"       // and restore the stack pointer.
           ::: "memory");       // ordered wrt. other mem access
# endif
#endif

Once enabled, it works much like the ARM alignment settings in /proc/cpu/alignment; see the answer to "How to trap unaligned memory access?" for examples.

Additionally, if you're using GCC, I suggest you enable -Wcast-align warnings. When building for a target with strict alignment requirements (ARM for example), GCC will report locations that might lead to unaligned memory access.
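
As an illustration of the kind of code this warning targets, here is a hedged sketch. The function name load_u32 is hypothetical. The commented-out cast is the pattern -Wcast-align is meant to report; on x86 the plain option typically stays silent because the target tolerates unaligned accesses, so GCC 8 and later also offer -Wcast-align=strict, which warns regardless of the target. The memcpy form shown is one common portable alternative, not the only one.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read a 32-bit value from a possibly unaligned
 * byte pointer without casting to a more strictly aligned type. */
static uint32_t load_u32(const unsigned char *p)
{
    /* Pattern the warning is aimed at (may fault on strict-alignment CPUs):
     *     return *(const uint32_t *)p;
     */
    uint32_t v;
    assert(p != NULL);
    memcpy(&v, p, sizeof v);   /* copies bytes, so no alignment is assumed for p */
    return v;
}
```

On a strict-alignment target the compiler is expected to lower this copy to accesses that are safe for any address. On x86 it will usually become a single ordinary load, which may itself be unaligned.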

But note that libc's handwritten asm for memcpy and other functions will still make unaligned accesses, so setting AC is often not practical on x86 (including x86-64). GCC will sometimes emit asm that makes unaligned accesses even if your source doesn't, e.g. as an optimization to copy or zero two adjacent array elements or struct members at once.

Yann Droneaud answered Oct 26 '22 23:10


It's tricky and I haven't done it personally, but I think you can do it in the following way:

x86_64 CPUs (I've specifically checked the Intel Core i7, but I guess others as well) have a performance counter, MISALIGN_MEM_REF, which counts misaligned memory references.

So, first of all, you can run your program under the "perf" tool on Linux to get a count of the misaligned accesses your code has made.

A trickier and more interesting hack would be to write a kernel module that programs the performance counter to generate an interrupt on overflow and arranges for it to overflow on the first unaligned load/store. The kernel module would respond to this interrupt by sending a signal to your process.

This will, in effect, turn the x86_64 into a core that doesn't support unaligned access.

This won't be simple, though: besides your code, the system libraries also use unaligned accesses, so it will be tricky to separate them from your own code.

gby answered Oct 27 '22 00:10