ARM Cortex M7 unaligned access and memcpy

I am compiling this code for a Cortex M7 using GCC:

#include <stdint.h>
#include <string.h>

// copy manually
void write_test_plain(uint8_t * ptr, uint32_t value)
{
    *ptr++ = (uint8_t)(value);
    *ptr++ = (uint8_t)(value >> 8);
    *ptr++ = (uint8_t)(value >> 16);
    *ptr++ = (uint8_t)(value >> 24);
}

// copy using memcpy
void write_test_memcpy(uint8_t * ptr, uint32_t value)
{
    void *px = (void*)&value;
    memcpy(ptr, px, 4);
}

int main(void) 
{
    extern uint8_t data[];
    extern uint32_t value;

    // I added some offsets to data to
    // make sure the compiler cannot
    // assume it is aligned in memory

    write_test_plain(data + 2, value);
    __asm volatile("": : :"memory"); // just to split inlined calls
    write_test_memcpy(data + 5, value);

    ... do something with data ...
}

And I get the following Thumb2 assembly with -O2:

// write_test_plain(data + 2, value);
800031c:    2478        movs    r4, #120 ; 0x78
800031e:    2056        movs    r0, #86  ; 0x56
8000320:    2134        movs    r1, #52  ; 0x34
8000322:    2212        movs    r2, #18  ; 0x12
8000324:    759c        strb    r4, [r3, #22]
8000326:    75d8        strb    r0, [r3, #23]
8000328:    7619        strb    r1, [r3, #24]
800032a:    765a        strb    r2, [r3, #25]

// write_test_memcpy(data + 5, value);
800032c:    4ac4        ldr r2, [pc, #784]  ; (8000640 <main+0x3a0>)
800032e:    923b        str r2, [sp, #236]  ; 0xec
8000330:    983b        ldr r0, [sp, #236]  ; 0xec
8000332:    f8c3 0019   str.w   r0, [r3, #25]

Can someone explain how the memcpy version works? It looks like an inlined 32-bit store to the destination address, but isn't that a problem, since data + 5 is almost certainly not aligned to a 4-byte boundary?

Is this perhaps some optimization which happens due to some undefined behavior in my source?

721

asked Jun 14 '18 22:06

Lou

2 Answers

On Cortex-M processors that implement ARMv7-M, such as the Cortex-M7, unaligned half-word and word loads and stores are allowed by default, and GCC takes advantage of this when generating code unless it is told otherwise. That is why the inlined memcpy becomes a single str.w to an odd address. If you want to prevent GCC from assuming that unaligned accesses are OK, use the -mno-unaligned-access compiler flag.

With this flag, GCC no longer inlines the call to memcpy, and write_test_memcpy looks like this:

write_test_memcpy(unsigned char*, unsigned long):
  push {lr}
  sub sp, sp, #12
  movs r2, #4
  add r3, sp, #8
  str r1, [r3, #-4]!
  mov r1, r3
  bl memcpy
  add sp, sp, #12
  ldr pc, [sp], #4
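
On the question of undefined behavior: memcpy places no alignment requirement on either pointer, so the memcpy version is well defined regardless of the flag; only the way the compiler lowers it changes. A minimal sketch of an alignment-safe store and load, where the helper names store_u32_memcpy, load_u32_memcpy and store_u32_bytes are hypothetical and not taken from the question:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: store a 32-bit value at any address.
   memcpy has no alignment requirement, so this is defined even when
   dst is odd. The compiler may lower it to one word store where
   unaligned stores are permitted, or to byte stores / a call otherwise. */
void store_u32_memcpy(uint8_t *dst, uint32_t value)
{
    assert(dst != NULL);
    memcpy(dst, &value, sizeof value);
}

/* Hypothetical helper: read a 32-bit value back from any address. */
uint32_t load_u32_memcpy(const uint8_t *src)
{
    uint32_t value;
    assert(src != NULL);
    memcpy(&value, src, sizeof value);
    return value;
}

/* Hypothetical helper: byte-wise store with an explicit little-endian
   layout, equivalent to write_test_plain in the question. */
void store_u32_bytes(uint8_t *dst, uint32_t value)
{
    assert(dst != NULL);
    dst[0] = (uint8_t)(value);
    dst[1] = (uint8_t)(value >> 8);
    dst[2] = (uint8_t)(value >> 16);
    dst[3] = (uint8_t)(value >> 24);
}
```

The memcpy pair uses the native byte order of the target, while the byte-wise version fixes the layout to little-endian; on a little-endian Cortex-M7 both write the same bytes.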

200

answered Oct 03 '22 01:10

Johan

Cortex-M3, M4, M7, and M33 support unaligned access. Cortex-M0, M0+, and M23 do not.

However, you can disable unaligned access support on the Cortex-M7 by setting the UNALIGN_TRP bit in the Configuration and Control Register (CCR); any unaligned access will then generate a UsageFault.

From the compiler's perspective, the default is that the generated code may perform unaligned accesses, unless you disable this with the -mno-unaligned-access compile flag.
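
A minimal sketch of enabling that trap. The address 0xE000ED14 and bit position 3 are the ARMv7-M definitions of CCR and UNALIGN_TRP; the helper name enable_unaligned_trap and its pointer parameter are my own assumptions, chosen so the function can also be exercised against an ordinary variable off-target. With the CMSIS headers, the equivalent statement is `SCB->CCR |= SCB_CCR_UNALIGN_TRP_Msk;`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* ARMv7-M Configuration and Control Register and its UNALIGN_TRP bit. */
#define CCR_ADDRESS      0xE000ED14u
#define CCR_UNALIGN_TRP  (1u << 3)

/* Hypothetical helper: set UNALIGN_TRP in the register ccr points to,
   leaving all other bits unchanged. */
void enable_unaligned_trap(volatile uint32_t *ccr)
{
    assert(ccr != NULL);
    *ccr |= CCR_UNALIGN_TRP;
}

/* On the target this would be called as:
   enable_unaligned_trap((volatile uint32_t *)CCR_ADDRESS); */
```

With the trap enabled, code built with the default unaligned-access setting, such as the inlined str.w in the question, would fault, so the trap is normally paired with -mno-unaligned-access.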

answered Oct 03 '22 01:10

dhokar.w

Tags: c, embedded, memory-alignment, memcpy, cortex-m
