I want to implement higher-order functions (HOFs) in C as a syntactic sugar with minimal effort. For example, for the following code
function add(int x) {
return int(int y) {
return x + y;
};
}
int main() {
function add1 = add(1);
return add1(2);
}
it is transcompiled into pure C as
#include <stdlib.h>
typedef struct {
void *ctx;
void* (*fun)(void *arg, void *ctx);
} function;
function new_function(void *ctx, void* (*fun)(void *, void *)) {
function f = {.ctx=ctx, .fun=fun};
return f;
}
void* apply(function f, void *arg) {
return (*(f.fun))(arg, f.ctx);
}
typedef struct {
int x;
} context$1;
void* new_context$1(int x) {
context$1 *ctx = malloc(sizeof(context$1));
ctx->x = x;
return ctx;
}
void* function$1(void *arg, void *ctx) {
int y = (int)arg;
int x = ((context$1*)ctx)->x;
return (void*)(x + y);
}
function add(int x) {
return new_function(new_context$1(x), function$1);
}
int main() {
function add1 = add(1);
return (int)apply(add1, (void*)2);
}
I have run this (manually) transcompiled version and it works fine. For implementation, I believe some AST manipulation and lambda lifting would be enough.
Are there any potential flaws in my approach? Are there any easier methods for HOFs, or can I improve my approach to make it easier to implement?
Although C is not the language that supports many FP concepts out of the box, higher-order functions can be implemented using function pointers and also with compiler support.
Higher-order functions are useful when you want to make your code terse. But they are especially useful to make your code more flexible and maintainable while still remaining relevant and useful.
The map function is one of the many higher-order functions built into the language. sort, reduce, filter, forEach are other examples of higher-order functions built into the language.
Higher-order functions are functions that take functions as parameters or return functions when called. Curried functions are higher-order functions that take one argument at a time returning a series of functions until all parameters are passed.
There are two obvious problems by now:
Be aware that your generated code invokes undefined behavior. In several places, you are converting between integral types and pointer types via direct cast. This isn't legal in C. You have no guarantee that a pointer and an int
are the same size or that you can even cast between them without altering or corrupting the value. The code may work on your particular system by coincidence, but it will break on many other systems.
The only way that you can generically handle both pointers and integral types (as well as structures, for that matter) is to pass parameters using variable-length argument lists. That way, you can extract each argument regardless of the size of the underlying data type.
Another option is to drop the generic function
structure and create a customized structure for each HOF in the code. The function pointer could then list all of the necessary parameters directly instead of trying to abstract them behind a generic interface, which would eliminate the problematic casts and allow the code to work as expected regardless of the data types used.
As far as usability goes, consider changing your interface so that the return type of the HOF is specified (or at least listed) on the line where it is declared. C objects always list their type in the declaration. In your example, the declaration is function add1 = add(1);
. To find out what data type this function returns, you have to dig through the definition for the HOF. This isn't a difficult task for your sample code, but this could be non-trivial for more complex code. You may not have code for the HOF at all if it is coming from a library. Perhaps something like function(int) add1 = add(1);
might be more explicit.
On a side note, I recommend that you define auto-generated functions as static
to help prevent name collisions between modules.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With