Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is there a way to do currying in C?

People also ask

Where can you use function currying?

Currying is helpful when you have to frequently call a function with a fixed argument. Considering, for example, the following function: If we want to define the function error , warn , and info , for every type, we have two options. Currying provides a shorter, concise, and more readable solution.

What is currying in programming?

Currying is a process in functional programming in which we can transform a function with multiple arguments into a sequence of nesting functions. It returns a new function that expects the next argument inline.

Is currying supported in scheme?

It is possible to generate curried functions in Scheme. The function curry2 generates a curried version of a function, which accepts two parameters. The curried version takes one parameter at a time. Similarly, curry3 generates a curried version of a function that takes three parameters.

What is currying in C++?

Currying simply means a transformation of a function of several arguments to a function of a single argument. This is most easily illustrated using an example: Take a function f that accepts three arguments: int f(int a,std::string b,float c) { // do something with a, b, and c return 0; }


I found a paper by Laurent Dami that discusses currying in C/C++/Objective-C:

More Functional Reusability in C/C++/Objective-c with Curried Functions

Of interest to how it is implemented in C:

Our current implementation uses existing C constructs to add the currying mechanism. This was much easier to do than modifying the compiler, and is sufficient to prove the interest of currying. This approach has two drawbacks, however. First, curried functions cannot be type-checked, and therefore require careful use in order to avoid errors. Second, the curry function cannot know the size of its arguments, and counts them as if they were all of the size of an integer.

The paper does not contain an implementation of curry(), but you can imagine how it is implemented using function pointers and variadic functions.


GCC provides an extension for the definition of nested functions. Although this is not ISO standard C, this may be of some interest, since it enables to answer the question quite conveniently. In short, nested function can access the parent function local variables and pointers to them can be returned by the parent function as well.

Here is a short, self-explanatory example:

#include <stdio.h>

typedef int (*two_var_func) (int, int);
typedef int (*one_var_func) (int);

int add_int (int a, int b) {
    return a+b;
}

one_var_func partial (two_var_func f, int a) {
    int g (int b) {
        return f (a, b);
    }
    return g;
}

int main (void) {
    int a = 1;
    int b = 2;
    printf ("%d\n", add_int (a, b));
    printf ("%d\n", partial (add_int, a) (b));
}

There is however a limitation to this construction. If you keep a pointer to the resulting function, as in

one_var_func u = partial (add_int, a);

the function call u(0) may result in an unexpected behaviour, as the variable a which u reads was destroyed just after partial terminated.

See this section of GCC's documentation.


Here's my first guess off the top of my head (may not be the best solution).

The curry function could allocate some memory off the heap, and put the parameter values into that heap-allocated memory. The trick is then for the returned function to know that it's supposed to read its parameters from that heap-allocated memory. If there's only one instance of the returned function, then a pointer to those parameters can be stored in a singleton/global. Otherwise if there's more than one instance of the returned function, then I think that curry needs to create each instance of the returned function in the heap-allocated memory (by writing opcodes like "get that pointer to the parameters", "push the parameters", and "invoke that other function" into the heap-allocated memory). In that case you need to beware whether allocated memory is executable, and maybe (I don't know) even be afraid of anti-virus programs.


The good news: There is a way to write a program that will do this in standard ANSI C, without using any compiler-specific features. (In particular, it does not require gcc's nested function support.)

The bad news: It requires creating little bits of executable code to serve as trampoline functions at runtime. This means that the implementation will be dependent on:

  • The processor instruction set
  • The ABI (in particular, the function calling convention)
  • The operating system's ability to mark data as executable

The best news: If you just need to do this in real, production code… you should use the closure API of libffi. It's permissively-licensed and contains careful, flexible implementations for many platforms and ABIs.


If you're still here, you want to nerd out and understand how to implement this “from scratch.”

The program below demonstrates how to curry a 2-parameter function into a 1-parameter function in C, given…

  • x86-64 processor architecture
  • System V ABI
  • Linux operating system

It is based on the “Trampoline Illustration” from Infectious Executable Stacks, but with the trampoline structure stored on the heap (via malloc) rather on than the stack. This is much safer because it means we don't have to disable the compiler's stack execution protection (no gcc -Wl,-z,execstack).

It uses the Linux mprotect system call to make the heap object executable.

The essence of the program is that it takes a pointer to a two-parameter function (uint32_t (*fp2)(uint32_t a, uint32_t b)) and converts it to a pointer to a one-parameter function (uint32_t (*fp1)(uint32_t a)) which invokes fp1 with a pre-set value for the parameter b. It does this by creating little 3-instruction trampoline functions:

movl $imm32, %esi  /* $imm32 filled in with the value of 'b' */
movq %imm64, %rax  /* $imm64 filled in with the value of 'fp2' */
jmpq *%rax

With the values of b and fp2 stitched into it appropriately, a pointer to the block of memory containing these 3 instructions can be used as the one-parameter function pointer fp1 precisely as described above. That's because it obeys the x86-64 System V calling convention, in which one-parameter functions receive their first parameter in the %edi/%rdi register, and two-parameter functions receive the second parameter in the %esi/%rsi register. In this case, the one-parameter trampoline function receives its uint32_t parameter in %edi, then fills in the value of the second uint32_t parameter in %esi, then jumps directly to the “real” two-parameter function which expects its two parameter in exactly those registers.

Here's the complete working code, which I've also put up on GitHub at dlenski/c-curry:

#include <stdio.h>
#include <string.h>
#include <stdlib.h>

#include <unistd.h>
#include <sys/mman.h>
#include <stdint.h>

#define PAGE_START(P) ((uintptr_t)(P) & ~(pagesize-1))
#define PAGE_END(P)   (((uintptr_t)(P) + pagesize - 1) & ~(pagesize-1))

/* x86-64 ABI passes parameters in rdi, rsi, rdx, rcx, r8, r9
 * (https://wiki.osdev.org/System_V_ABI), and return value
 * goes in %rax.
 *
 * Binary format of useful opcodes:
 *
 *       0xbf, [le32] = movl $imm32, %edi (1st param)
 *       0xbe, [le32] = movl $imm32, %esi (2nd param)
 *       0xba, [le32] = movl $imm32, %edx (3rd param)
 *       0xb9, [le32] = movl $imm32, %ecx (4rd param)
 *       0xb8, [le32] = movl $imm32, %eax
 * 0x48, 0x__, [le64] = movq $imm64, %r__
 *       0xff, 0xe0   = jmpq *%rax
 */

typedef uint32_t (*one_param_func_ptr)(uint32_t);
one_param_func_ptr curry_two_param_func(
    void *two_param_func, 
    uint32_t second_param)
{
    /* This is a template for calling a "curried" version of 
     * uint32_t (*two_param_func)(uint32_t a, uint32_t b),
     * using the Linux x86-64 ABI. The curried version can be
     * treated as uint32_t (*one_param_func)(uint32_t a).
     */
    uintptr_t fp = (uintptr_t)two_param_func;
    uint8_t template[] = {
        0xbe, 0, 0, 0, 0,                                   /* movl $imm32, %esi */
        0x48, 0xb8, fp >>  0, fp >>  8, fp >> 16, fp >> 24, /* movq fp, %rax */
                    fp >> 32, fp >> 40, fp >> 48, fp >> 56,
        0xff, 0xe0                                          /* jmpq *%rax */
    };
    
    /* Now we create a copy of this template on the HEAP, and
     * fill in the second param. */
    uint8_t *buf = malloc(sizeof(template));
    if (!buf)
        return NULL;
    
    memcpy(buf, template, sizeof(template));
    buf[1] = second_param >> 0;
    buf[2] = second_param >> 8;
    buf[3] = second_param >> 16;
    buf[4] = second_param >> 24;
    
    /* We do NOT want to make the stack executable,
     * but we NEED the heap-allocated buf to be executable.
     * Compiling with 'gcc -Wl,-z,execstack' would do BOTH.
     *
     * This appears to be the right way to only make a heap object executable:
     *   https://stackoverflow.com/a/23277109/20789
     */
    uintptr_t pagesize = sysconf(_SC_PAGE_SIZE);
    mprotect((void *)PAGE_START(buf),
             PAGE_END(buf + sizeof(template)) - PAGE_START(buf),
             PROT_READ|PROT_WRITE|PROT_EXEC);
    
    return (one_param_func_ptr)buf;
}

/********************************************/

int print_both_params(int a, int b)
{
    printf("Called with a=%d, b=%d\n", a, b);
    return a+b;
}

int main(int argc, char **argv)
{
    one_param_func_ptr print_both_params_b4 =
        curry_two_param_func(print_both_params, 4);
    one_param_func_ptr print_both_params_b256 = 
        curry_two_param_func(print_both_params, 256);
    
    print_both_params_b4(3);    // "Called with a=3, b=4"
    print_both_params_b256(6);  // "Called with a=6, b=256"

    return 0;
}