How to provide an implementation of memcpy

Tags:

c

gcc

memset

I am trying to write some bare metal code with a memset-style loop in it:

for (int i = 0; i < N; ++i) {
  arr[i] = 0;
}

It is compiled with GCC, and GCC is smart enough to turn that into a call to memset(). Unfortunately, because it's bare metal, I have no memset() (normally provided by libc), so I get a link error.

 undefined reference to `memset'

It seems like the optimisation that does this transformation is -ftree-loop-distribute-patterns:

Perform loop distribution of patterns that can be code generated with calls to a library. This flag is enabled by default at -O2 and higher, and by -fprofile-use and -fauto-profile.

So one person's solution was to just lower the optimisation level. Not very satisfying.

I also found this really helpful page that explains that -ffreestanding is not enough to get GCC not to do this, and there's basically no option but to provide your own implementations of memcpy, memmove, memset and memcmp. I'm happy to do that, but how?

If I just write memset, the compiler will detect the loop inside it and transform it into a call to memset! In fact, in the code provided by the vendor of the CPU I'm using, I found this comment:

/*
// This is commented out because the assembly code that the compiler generates appears to be
// wrong.  The code would recursively call the memset function and eventually overruns the
// stack space.
void * memset(void *dest, int ch, size_t count)
...

So I assume that is the issue they ran into.

How do I supply a C implementation of memset without the compiler optimising it to a call to itself and without disabling that optimisation?

asked Apr 22 '21 09:04

Timmmm

2 Answers

Aha, I checked the glibc code and there's an inhibit_loop_to_libcall function modifier, which sounds like it should do this. It is defined like this:

/* Add the compiler optimization to inhibit loop transformation to library
   calls.  This is used to avoid recursive calls in memset and memmove
   default implementations.  */
#ifdef HAVE_CC_INHIBIT_LOOP_TO_LIBCALL
# define inhibit_loop_to_libcall \
    __attribute__ ((__optimize__ ("-fno-tree-loop-distribute-patterns")))
#else
# define inhibit_loop_to_libcall
#endif
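
As a rough sketch of how that modifier might be applied, here is a byte-wise fill function carrying the attribute. The name bare_memset is hypothetical (on a real bare-metal target you would name it memset), and the compiler check replacing HAVE_CC_INHIBIT_LOOP_TO_LIBCALL is my assumption rather than what glibc's configure does. I have not checked this against every GCC version, so it is worth inspecting the generated assembly to confirm there is no call back into memset.

```c
#include <stddef.h>

/* Assumption: plain GCC honours the optimize attribute; on other
   compilers the macro expands to nothing, as in the glibc fallback. */
#if defined(__GNUC__) && !defined(__clang__)
# define inhibit_loop_to_libcall \
    __attribute__ ((__optimize__ ("-fno-tree-loop-distribute-patterns")))
#else
# define inhibit_loop_to_libcall
#endif

/* Hypothetical name for illustration; rename to memset on the target.
   The attribute disables the loop-to-libcall pattern for this function
   only, so the rest of the program keeps the optimisation. */
inhibit_loop_to_libcall
void *bare_memset(void *dest, int ch, size_t count)
{
    unsigned char *p = dest;

    /* Simple byte-by-byte fill; ch is converted to unsigned char,
       matching the standard memset contract. */
    for (size_t i = 0; i < count; ++i) {
        p[i] = (unsigned char)ch;
    }
    return dest;
}
```

Calling bare_memset(buf, 0, sizeof buf) should then zero the buffer and return buf, just like the libc version would.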

answered Oct 19 '22 04:10

Timmmm

You mention in your question:

It seems like the optimisation that does this transformation is -ftree-loop-distribute-patterns

All you need to do to turn off this optimization is pass -fno-tree-loop-distribute-patterns to the compiler. This turns off the optimization globally.

answered Oct 19 '22 03:10

S.S. Anne
