Embedded: memcpy/memset not used by most CRT startup code ― why?

Question

Context:
I'm working on an ARM target, more specifically a Cortex-M4F microcontroller from ST. On such platforms (microcontrollers in general) there is obviously no OS; to get a working C/C++ "environment" (and to be standard-compliant with regard to the initialization of variables) there must be some startup code, run at reset, that does the minimum setup required before explicitly calling main. Such startup code, as I hinted, must initialize the initialized global and static variables (such as int foo = 42; at global scope) and zero out the other globals (such as int bar; at global scope). Then, if necessary, global "ctors" are called.

On a microcontroller, that simply means the startup code has to copy data from flash to RAM for every initialized global (all in section '.data') and clear the others (all in '.bss'). Because I use GCC, I must supply such startup code myself, and I happily analyzed several startup files (and their associated linker scripts!) bundled with numerous examples I found on the Internet, all using the same demo board I'm developing on.
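To make the two steps concrete, here is a minimal sketch of the kind of hand-written loops such startup files typically contain. The function names are hypothetical, and it assumes both regions are word-aligned and a multiple of four bytes long, which a linker script usually guarantees but which is not stated above.

```c
#include <stdint.h>

/* Copy the initial values of .data, one 32-bit word at a time, from
 * their load address (flash) to their run address (RAM).
 * No library function is called. */
void copy_data(uint32_t *dst, const uint32_t *src, const uint32_t *src_end)
{
    while (src < src_end) {
        *dst++ = *src++;
    }
}

/* Zero the .bss range [start, end), one 32-bit word at a time. */
void zero_bss(uint32_t *start, const uint32_t *end)
{
    while (start < end) {
        *start++ = 0u;
    }
}
```

On a real target the arguments would be addresses exported by the linker script; symbol names such as _sidata, _sdata, _edata, _sbss and _ebss are common in example projects, but they are an assumption here and depend on the script in use.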

Question:
As stated, I've seen numerous startup files, and they initialize globals in different ways, some more efficient in terms of space and time than others. But they all have something odd in common: none of them uses memset or memcpy; they resort to hand-written loops instead. Since it seems natural to me to use standard functions when possible (the simple "DRY principle"), I tried the following in lieu of the hand-written loops:

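/* Initialize .data section */
/* memcpy(dest, src, n): dest in r0, src in r1, n in r2. */
/* Assumes DATA_LOAD holds the flash load address and DATA_START the RAM address. */
ldr r0, DATA_START   /* destination: .data in RAM */
ldr r1, DATA_LOAD    /* source: initial values in flash */
ldr r2, DATA_SIZE    /* byte count */
bl  memcpy       /* memcpy(DATA_START, DATA_LOAD, DATA_SIZE); */

/* Initialize .bss section */
ldr r0, BSS_START    /* destination: .bss in RAM */
mov r1, #0           /* fill value */
ldr r2, BSS_SIZE     /* byte count */
bl  memset       /* memset(BSS_START, 0, BSS_SIZE); */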

... and it worked perfectly. The space savings are negligible, but the code is clearly dead simple now.
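For comparison, the same two library calls can be sketched in C. The struct and function names below are hypothetical; on a real target the fields would be filled from linker-script symbols, and the description itself would need to live in read-only memory so that it is valid before .data is copied.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical description of the two regions the startup code handles. */
struct section_layout {
    void       *data_start; /* run address of .data in RAM */
    const void *data_load;  /* load address of .data in flash */
    size_t      data_size;  /* size of .data in bytes */
    void       *bss_start;  /* start of .bss in RAM */
    size_t      bss_size;   /* size of .bss in bytes */
};

/* Initialize both regions with the standard library routines.
 * Note the argument order: memcpy(destination, source, count). */
void init_sections(const struct section_layout *s)
{
    memcpy(s->data_start, s->data_load, s->data_size);
    memset(s->bss_start, 0, s->bss_size);
}
```

This is only safe if the memcpy and memset being linked need no initialized state of their own.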

So I thought about it, and I see no reason to write the loops by hand in this case:

memcpy and memset are very likely to be linked into the executable anyway, because the programmer will use them directly, or indirectly through another library;
The resulting startup code is smaller;
Speed is not a very important factor for startup code, but the library routines are nevertheless likely to be faster;
It's nearly impossible to get it wrong.

Any idea why one wouldn't rely on memcpy and memset for startup code?

R.. GitHub STOP HELPING ICE · Accepted Answer

I suspect the startup code does not want to make assumptions about the implementation of memcpy and similar functions in libc. For example, the implementation of memcpy might use a global variable, set by libc initialization code, that reports which CPU extensions are available, in order to provide optimized SIMD copying on machines that support such operations. At the point where the early "crt" startup code is running, the storage for such a global might be completely uninitialized (containing random junk), in which case it would be dangerous to call memcpy. Even if making the call works for you, that is a consequence of the implementation (or maybe even of the unpredictable results of UB...) making it work; this is probably not something the crt code wants to depend on.
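A minimal sketch of the kind of implementation described above, to show where the dependency hides. Everything here is hypothetical (cpu_has_wide_copy, sketch_memcpy and the two helpers are made-up names, not taken from any real libc), and the "wide" routine is only a stand-in that defers to the byte copy.

```c
#include <stddef.h>

/* Hypothetical feature flag, meant to be set by library initialization
 * code. As a zero-initialized global it is placed in .bss, so on a bare
 * target it holds whatever was in RAM until .bss has been cleared. */
int cpu_has_wide_copy;

/* Portable fallback: copy one byte at a time. */
static void *copy_bytewise(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    while (n--) {
        *d++ = *s++;
    }
    return dst;
}

/* Stand-in for an optimized routine. A real one would use instructions
 * that may not exist on every CPU; this one just defers to the fallback. */
static void *copy_wide(void *dst, const void *src, size_t n)
{
    return copy_bytewise(dst, src, n);
}

/* Dispatches on the global flag, so its behavior depends on state that
 * early startup code has not set up yet. */
void *sketch_memcpy(void *dst, const void *src, size_t n)
{
    return cpu_has_wide_copy ? copy_wide(dst, src, n)
                             : copy_bytewise(dst, src, n);
}
```

If the startup code called a routine like this to populate .data before clearing .bss, the flag could read as non-zero junk and select a path the CPU does not support. A hand-written loop has no such hidden state.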

Tags: c, assembly, embedded

Jarhmander
