Merging global arrays at link time / filling a global array from multiple compilation units

Question

I want to define an array of things, like event handlers. The contents of this array are completely known at compile time, but are defined across multiple compilation units, distributed amongst multiple libraries that are fairly decoupled, at least until the final (static) link. I'd like to keep it that way too, so that adding or deleting a compilation unit automatically adds or removes its event handler without having to modify a central list of event handlers.

Here's an example of what I'd like to do (but does not work).

central.h:

typedef void (*callback_t)(void);

callback_t callbacks[];

central.c:

#include "central.h"

void do_callbacks(void) {
    int i;
    for (i = 0; i < sizeof(callbacks) / sizeof(*callbacks); ++i)
        callbacks[i]();
}

foo.c:

#include "central.h"

void callback_foo(void) { }

callback_t callbacks[] = {
    &callback_foo
};

bar.c:

#include "central.h"

void callback_bar(void) { }

callback_t callbacks[] = {
    &callback_bar
};

What I'd like to happen is to get a single callbacks array, which contains two elements: &callback_foo and &callback_bar. With the code above, there are obviously two problems:

The callbacks array is defined multiple times.
sizeof(callbacks) isn't known when compiling central.c.

It seems to me that the first point could be solved by having the linker merge the two callbacks symbols instead of throwing an error (possibly through some attribute on the variable), but I'm not sure if there is something like that. Even if there is, the sizeof problem should somehow also be solved.

I realize that a common solution to this problem is to just have a startup function or constructor that "registers" the callback. However, I can see only two ways to implement this:

Use dynamic memory (realloc) for the callbacks array.
Use static memory with a fixed (bigger than usually needed) size.

Since I'm running on a microcontroller platform (Arduino) with limited memory, neither of these approaches appeals to me. And given that the entire contents of the array are known at compile time, I'm hoping for a way to let the compiler also see this.
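For reference, the second option (static memory with a fixed size) could look roughly like the sketch below. MAX_CALLBACKS, register_callback and register_foo are made-up names for illustration, and the constructor attribute is a GCC extension, so take this as an assumption about how such a registry is typically written rather than a recommended design.

```c
#include <stddef.h>

typedef void (*callback_t)(void);

/* Hypothetical capacity: it has to cover the worst case, so some slots
 * are usually wasted. */
#define MAX_CALLBACKS 8

static callback_t callbacks[MAX_CALLBACKS];
static size_t callback_count;

/* Returns 0 on success, -1 when the table is already full. */
int register_callback(callback_t cb) {
    if (callback_count >= MAX_CALLBACKS)
        return -1;
    callbacks[callback_count++] = cb;
    return 0;
}

void do_callbacks(void) {
    for (size_t i = 0; i < callback_count; ++i)
        callbacks[i]();
}

/* What foo.c would contribute. */
static int foo_calls;
static void callback_foo(void) { ++foo_calls; }

/* GCC extension: this function runs before main(). */
__attribute__((constructor))
static void register_foo(void) {
    register_callback(callback_foo);
}
```

The array costs MAX_CALLBACKS pointers of RAM no matter how many callbacks are actually registered, which is exactly the waste I'd like to avoid.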

I've found this and this solution, but those require a custom linker script, which is not feasible in the compilation environment I'm running (especially since this would require explicitly naming each of these special arrays in the linker script, so just having a single linker script addition doesn't work here).

This solution is the best I found so far. It uses a linked list that is filled at runtime, but uses memory allocated statically in each compilation unit separately (e.g. a next pointer is allocated with each function pointer). Still, the overhead of these next pointers should not be required - is there any better approach?
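A minimal sketch of that linked-list idea, as I understand it, follows. The struct and function names are my own (hypothetical), and registration again relies on the GCC constructor attribute.

```c
#include <stddef.h>

typedef void (*callback_t)(void);

/* One node per callback, allocated statically in the unit that defines it. */
struct callback_node {
    callback_t fn;
    struct callback_node *next;   /* the per-entry overhead in question */
};

static struct callback_node *callback_head;

void callback_register(struct callback_node *node) {
    node->next = callback_head;   /* push to the front, no allocation needed */
    callback_head = node;
}

void do_callbacks(void) {
    for (struct callback_node *n = callback_head; n != NULL; n = n->next)
        n->fn();
}

/* What bar.c would contribute. */
static int bar_calls;
static void callback_bar(void) { ++bar_calls; }
static struct callback_node bar_node = { callback_bar, NULL };

/* GCC extension: this function runs before main(). */
__attribute__((constructor))
static void register_bar(void) {
    callback_register(&bar_node);
}
```

No slot is wasted here, but every entry carries an extra pointer in RAM, and the nodes cannot be const because they are linked at runtime.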

Perhaps having a dynamic solution combined with link-time optimization can somehow result in a static allocation?

Suggestions on alternative approaches are also welcome, though the required elements are having a static list of things, and memory efficiency.

Furthermore:

Using C++ is fine, I just used some C code above for illustrating the problem, most Arduino code is C++ anyway.
I'm using gcc / avr-gcc and though I'd prefer a portable solution, something that is gcc only is also ok.
I have template support available, but not STL.
In the Arduino environment that I use, I have no Makefile or other way to easily run some custom code at compile time, so I'm looking for something that can be entirely implemented in the code.

Pedro · Accepted Answer

As commented in some previous answer, the best option is to use a custom linker script (with a KEEP(*(SORT(.whatever.*))) input section).

Anyway, it can be done without modifying the linker scripts (working sample code below), at least on some platforms with gcc (tested on an xtensa embedded device and Cygwin).

Assumptions:

We want to avoid using RAM as much as possible (embedded)
We do not want the calling module to know anything about the modules with callbacks (it is a lib)
No fixed size for the list (unknown size at library compile time)
I am using GCC. The principle may work on other compilers, but I have not tested it
Callback functions in this sample receive no arguments, but it is quite simple to modify this if needed

How to do it:

We need the linker to somehow allocate at link time an array of pointers to functions
As we do not know the size of the array, we also need the linker to somehow mark the end of the array

This is quite specific, as the right way is using a custom linker script, but it happens to be feasible without doing so if we find a section in the standard linker script that is always "kept" and "sorted".

Normally, this is true for the .ctors.* input sections: GCC emits numbered .ctors.NNNNN sections to order constructors by priority, and the standard ld scripts keep and sort them (note the *(SORT(.ctors.*)) line in the linker map below), so we can hack a little and give it a try.

Just take into account that it may not work for all platforms (I have tested it in xtensa embedded architecture and CygWIN, but this is a hacking trick, so...).

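Also, as we are putting the pointers in the constructors section, the C runtime will call each of them as a constructor during startup. To turn those startup calls into no-ops we need one small flag in RAM for the whole program (an int in the sample below), which is only set once the callbacks are really invoked.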
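The effect of that flag can be sketched in plain C without any section tricks. In this simulation an ordinary array stands in for the .ctors contents and fake_runtime_init stands in for the startup code; both names, like the rest of the identifiers here, are made up for illustration and are not part of the sample files.

```c
#include <stddef.h>

typedef void (*callback)(void);

static int callback_ctor_stub;   /* stays 0 while the C runtime starts up */
static int user_calls;

static void user_callback(void) { ++user_calls; }

/* The wrapper is what actually sits in the pointer table. */
static void user_callback_ctor(void) {
    if (callback_ctor_stub)
        user_callback();
}

/* Stand-in for the contents of the .ctors section. */
static const callback table[] = { user_callback_ctor };

/* Stand-in for the startup code that walks .ctors and calls every entry. */
void fake_runtime_init(void) {
    for (size_t i = 0; i < sizeof table / sizeof table[0]; ++i)
        table[i]();
}

/* Mirrors what CALLBACKS() does: set the flag, then call every entry. */
void run_callbacks(void) {
    callback_ctor_stub = 1;
    for (size_t i = 0; i < sizeof table / sizeof table[0]; ++i)
        table[i]();
}
```

Calling fake_runtime_init() leaves user_calls at 0; only run_callbacks() reaches the real callback.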

test.c:

A library that registers a module called test, and calls its callbacks at some point

#include "callback.h"

CALLBACK_LIST(test);

void do_something_and_call_the_callbacks(void) {

        // ... doing something here ...

        CALLBACKS(test);

        // ... doing something else ...
}

callme1.c:

Client code registering two callbacks for module test. The generated functions appear to have no name (they do have one, but it is generated from the line number so that it is unique inside the compilation unit)

#include <stdio.h>
#include "callback.h"

CALLBACK(test) {
        printf("%s: %s\n", __FILE__, __FUNCTION__);
}

CALLBACK(test) {
        printf("%s: %s\n", __FILE__, __FUNCTION__);
}

void callme1(void) {} // stub to be called in the test sample to include the compilation unit. Not needed in real code...

callme2.c:

Client code registering another callback for module test...

#include <stdio.h>
#include "callback.h"

CALLBACK(test) {
        printf("%s: %s\n", __FILE__, __FUNCTION__);
}

void callme2(void) {} // stub to be called in the test sample to include the compilation unit. Not needed in real code...

callback.h:

And the magic...

#ifndef __CALLBACK_H__
#define __CALLBACK_H__

#ifdef __cplusplus
extern "C" {
#endif

typedef void (* callback)(void);
int __attribute__((weak)) _callback_ctor_stub = 0;

#ifdef __cplusplus
}
#endif

#define _PASTE(a, b)    a ## b
#define PASTE(a, b)     _PASTE(a, b)

#define CALLBACK(module) \
        static inline void PASTE(_ ## module ## _callback_, __LINE__)(void); \
        static void PASTE(_ ## module ## _callback_ctor_, __LINE__)(void); \
        static __attribute__((section(".ctors.callback." #module "$2"))) __attribute__((used)) const callback PASTE(__ ## module ## _callback_, __LINE__) = PASTE(_ ## module ## _callback_ctor_, __LINE__); \
        static void PASTE(_ ## module ## _callback_ctor_, __LINE__)(void) { \
                 if(_callback_ctor_stub) PASTE(_ ## module ## _callback_, __LINE__)(); \
        } \
        inline void PASTE(_ ## module ## _callback_, __LINE__)(void)

#define CALLBACK_LIST(module) \
        static __attribute__((section(".ctors.callback." #module "$1"))) const callback _ ## module ## _callbacks_start[0] = {}; \
        static __attribute__((section(".ctors.callback." #module "$3"))) const callback _ ## module ## _callbacks_end[0] = {}

#define CALLBACKS(module) do { \
        const callback *cb; \
        _callback_ctor_stub = 1; \
        for(cb =  _ ## module ## _callbacks_start ; cb <  _ ## module ## _callbacks_end ; cb++) (*cb)(); \
} while(0)

#endif
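A note on why callback.h pastes through two macro levels: operands of ## are not macro-expanded, so pasting __LINE__ directly would produce the literal token __LINE__ in the name instead of a number. The sketch below shows the difference with FAKE_LINE standing in for __LINE__ (so the result does not depend on where the code sits in the file). PASTE_DIRECT, FAKE_LINE and the cb_ prefix are hypothetical names used only here.

```c
/* One-level paste: the arguments are pasted exactly as written. */
#define PASTE_DIRECT(a, b)  a ## b
/* Two-level paste: the arguments are macro-expanded first, then pasted. */
#define PASTE(a, b)         PASTE_DIRECT(a, b)

/* Stand-in for __LINE__, which is expanded the same way. */
#define FAKE_LINE 7

/* Expands to: int cb_7(void) */
int PASTE(cb_, FAKE_LINE)(void)        { return 7; }

/* Expands to: int cb_FAKE_LINE(void), because FAKE_LINE is never expanded */
int PASTE_DIRECT(cb_, FAKE_LINE)(void) { return -1; }
```

With the real __LINE__, the one-level version would give every callback in a file the same name, and the second CALLBACK(test) in callme1.c would fail to compile.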

main.c:

If you want to give it a try... this is the entry point for a standalone program (tested and working on gcc-cygwin)

void do_something_and_call_the_callbacks(void);

int main() {
    do_something_and_call_the_callbacks();
}

output:

This is the (relevant) output on my embedded device. The function names are generated in callback.h and can have duplicates across files, as the functions are static

app/callme1.c: _test_callback_8
app/callme1.c: _test_callback_4
app/callme2.c: _test_callback_4

And in CygWIN...

$ gcc -c -o callme1.o callme1.c
$ gcc -c -o callme2.o callme2.c
$ gcc -c -o test.o test.c
$ gcc -c -o main.o main.c
$ gcc -o testme test.o callme1.o callme2.o main.o
$ ./testme
callme1.c: _test_callback_4
callme1.c: _test_callback_8
callme2.c: _test_callback_4

linker map:

This is the relevant part of the map file generated by the linker

 *(SORT(.ctors.*))
 .ctors.callback.test$1    0x4024f040    0x0    .build/testme.a(test.o)
 .ctors.callback.test$2    0x4024f040    0x8    .build/testme.a(callme1.o)
 .ctors.callback.test$2    0x4024f048    0x4    .build/testme.a(callme2.o)
 .ctors.callback.test$3    0x4024f04c    0x0    .build/testme.a(test.o)
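The map can be checked against the three registered callbacks with a little arithmetic: the $1 and $3 markers are empty, and the bytes between them are the pointers. The helper below is only a worked calculation; entries_between is a hypothetical name, and the 4-byte pointer size is an assumption that matches the 0x8 and 0x4 section sizes shown for two callbacks and one callback.

```c
#include <stdint.h>

/* Number of pointer-sized entries between two section marker addresses. */
uint32_t entries_between(uint32_t start, uint32_t end, uint32_t pointer_size) {
    return (end - start) / pointer_size;
}
```

For the addresses above, entries_between(0x4024f040, 0x4024f04c, 4) gives 0xC / 4 = 3, which matches the two callbacks from callme1.o plus the one from callme2.o.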

Lundin · Answer

Try to solve the actual problem. What you need are multiple callback functions, that are defined in various modules, that aren't in the slightest related to each other.

What you have done though, is to place a global variable in a header file, which is accessible by every module including that header. This introduces a tight coupling between all such files, even though they are not related to each other. Furthermore, it seems only the callback handler .c function needs to actually call the functions, yet they are exposed to the whole program.

So the actual problem here is the program design and nothing else.

And there is actually no apparent reason why you need to allocate this array at compile time. The only sane reason would be to save RAM, which is of course a valid reason on an embedded system. In that case the array should be declared as const and initialized at compile time.

You can keep something similar to your design if you store the array as a read-write object. If the array must be read-only for the purpose of saving RAM, you must do a drastic re-design.

I'll give both versions, consider which one is most suitable for your case:

RAM-based read/write array

(Advantage: flexible, can be changed at runtime. Disadvantages: RAM consumption. Slight code overhead for initialization. RAM is more exposed to bugs than flash.)

Let callback.h and callback.c form a module which is only concerned with the handling of the callback functions. This module is responsible for how the callbacks are allocated and when they are executed.
In callback.h define a type for the callback functions. This should be a function pointer type just as you have done. But remove the variable declaration from the .h file.

In callback.c, declare the callback array of functions as

 static callback_t callbacks [LARGE_ENOUGH_FOR_WORST_CASE];

There is no way you can avoid "LARGE_ENOUGH_FOR_WORST_CASE". You are on an embedded system with limited RAM, so you have to actually consider what the worst-case scenario is and reserve enough memory for that, no more, no less. On a microcontroller embedded system, there are no such things as "usually needed" nor "let's save some RAM for other processes". Your MCU either has enough memory to cover the worst-case scenario, or it does not, in which case no amount of clever allocations will save you.
In callback.c, declare a size variable that keeps track of how much of the callback array has been initialized: static size_t callback_size;.
Write an init function void callback_init(void) which initializes the callback module. The prototype should be in the .h file and the caller is responsible for executing it once, at program startup.
Inside the init function, set callback_size to 0. The reason I propose to do this at runtime is that you have an embedded system where a .bss segment may not be present or may even be undesired. You might not even have copy-down code that initializes all static variables to zero. Such behavior is non-conformant with the C standard but very common in embedded systems. Therefore, never write code which relies on static variables getting automatically initialized to zero.
Write a function void callback_add (callback_t callback);. Every module that uses your callback module will call this function to add its specific callback functions to the list. (callback_t is already a pointer type, so it is passed by value.)
Keep your do_callbacks function as it is (though as a minor remark, consider renaming to callback_traverse, callback_run or similar).
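Put together, the steps above could look roughly like this sketch of callback.c. The value of LARGE_ENOUGH_FOR_WORST_CASE, the int return value of callback_add (used to report a full table), passing callback_t by value, and the timer_tick sample callback are all my assumptions for illustration.

```c
#include <stddef.h>

typedef void (*callback_t)(void);

/* Hypothetical value: pick the real worst case for your project. */
#define LARGE_ENOUGH_FOR_WORST_CASE 4

static callback_t callbacks[LARGE_ENOUGH_FOR_WORST_CASE];
static size_t callback_size;

/* Call once at program startup. */
void callback_init(void) {
    callback_size = 0;   /* do not rely on zero-initialized statics */
}

/* Returns 0 on success, -1 if the array is full (return value is an
 * addition to the prototype described above). */
int callback_add(callback_t callback) {
    if (callback_size >= LARGE_ENOUGH_FOR_WORST_CASE)
        return -1;
    callbacks[callback_size++] = callback;
    return 0;
}

void callback_run(void) {
    for (size_t i = 0; i < callback_size; ++i)
        callbacks[i]();
}

/* Sample callback that some other module would provide. */
static int ticks;
static void timer_tick(void) { ++ticks; }
```

A module would then call callback_add(timer_tick) from its own init code, after callback_init() has run.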

Flash-based read-only array

(Advantages: saves RAM, true read-only memory safe from memory corruption bugs. Disadvantages: less flexible, depends on every module used in the project, possibly slightly slower access because it's in flash.)

In this case, you'll have to turn the whole program upside-down. By the nature of compile-time solutions, it will be a whole lot more "hard-coded".

Instead of having multiple unrelated modules including a callback handler module, you'll have to make the callback handler module include everything else. The individual modules still don't know when a callback will get executed or where it is allocated. They just declare one or several functions as callbacks. The callback module is then responsible for adding every such callback function to its array at compile-time.

// callback.c

#include "timer_module.h"
#include "spi_module.h"
...

static const callback_t CALLBACKS [] = 
{
  &timer_callback1,
  &timer_callback2,
  &spi_callback,
  ...
};

The advantage of this is that you'll automatically get the worst case scenario handed to you by your own program. The size of the array is now known at compile time, it is simply sizeof(CALLBACKS)/sizeof(callback_t).
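As a self-contained sketch of this variant: the three callback functions below are stand-ins for what timer_module.h and spi_module.h would declare, and CALLBACK_COUNT and callback_run are hypothetical names.

```c
#include <stddef.h>

typedef void (*callback_t)(void);

/* Stand-ins for the functions the individual modules would provide. */
static int timer_calls, spi_calls;
void timer_callback1(void) { ++timer_calls; }
void timer_callback2(void) { ++timer_calls; }
void spi_callback(void)    { ++spi_calls; }

/* The complete list, fixed at compile time. */
static const callback_t CALLBACKS[] = {
    &timer_callback1,
    &timer_callback2,
    &spi_callback,
};

/* Element count, computed by the compiler. */
#define CALLBACK_COUNT (sizeof(CALLBACKS) / sizeof(CALLBACKS[0]))

void callback_run(void) {
    for (size_t i = 0; i < CALLBACK_COUNT; ++i)
        CALLBACKS[i]();
}
```

One caveat for the Arduino case: as far as I know, on avr-gcc a const array normally still ends up in RAM unless it is marked PROGMEM and read through the pgm_read_* helpers, which this sketch leaves out.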

Of course this isn't nearly as elegant as the generic callback module. You get a tight coupling from the callback module to every other module in the project, but not the other way around. Essentially, the callback.c is a "main()".

You can still use a function pointer typedef in callback.h though, but it is no longer actually needed: the individual modules must ensure that they have their callback functions written in the desired format anyhow, with or without such a type present.

Tags: c++, c, embedded, compile-time, arduino

Matthijs Kooijman
