 

Implementation of nested functions

I recently found out that gcc allows the definition of nested functions. In my opinion, this is a cool feature, but I wonder how to implement it.

While it is certainly not difficult to implement direct calls to nested functions by passing a context pointer as a hidden argument, gcc also allows taking a pointer to a nested function and passing it to an arbitrary other function, which in turn can call the nested function with access to its enclosing context. Because that caller knows only the type of the nested function, it obviously can't pass a context pointer.

I know that other languages like Haskell, which have a more convoluted calling convention, allow partial application to support such things, but I see no way to do that in C. How is it possible to implement this?

Here is a small example of a case that illustrates the problem:

    int foo(int x, int (*f)(int, int (*)(void))) {
        int counter = 0;
        int g(void) { return counter++; }

        return f(x, g);
    }

This function calls a function that calls a function that returns a counter from the context and increments it at the same time.

asked Nov 18 '11 08:11 by fuz





1 Answer

GCC uses something called a trampoline.

Information: http://gcc.gnu.org/onlinedocs/gccint/Trampolines.html

A trampoline is a piece of code that GCC creates on the stack at run time when you need a pointer to a nested function. In your code, the trampoline is necessary because you pass g as a parameter to a function call. A trampoline initializes some registers so that the nested function can refer to variables in the outer function, then it jumps to the nested function itself. Trampolines are very small -- you "bounce" off a trampoline and into the body of the nested function.
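To make the idea concrete, here is a minimal sketch in standard C of what a trampoline accomplishes, without generating any code at run time. It is not what GCC emits: the names foo_frame, g_lowered, closure and call_closure are hypothetical, and the context pointer that a real trampoline loads into a register is passed here as an ordinary explicit argument.

```c
/* Hypothetical stand-in for the part of foo's stack frame that g needs. */
struct foo_frame {
    int counter;
};

/* The nested function g from the question, rewritten to take its
   context (the "static chain") as an explicit first argument. */
int g_lowered(struct foo_frame *chain)
{
    return chain->counter++;
}

/* What a trampoline bundles together: the code to run and the context
   to hand to it. A real trampoline hides this pair behind a plain
   function pointer by encoding both as immediates in a few instructions. */
struct closure {
    int (*code)(struct foo_frame *);
    struct foo_frame *chain;
};

/* "Bouncing" through the pair: supply the context, then call the code. */
int call_closure(const struct closure *c)
{
    return c->code(c->chain);
}
```

Calling call_closure three times on a closure whose frame starts with counter = 0 returns 0, 1 and 2 and leaves counter at 3. The catch, and the reason GCC needs real trampolines, is that a callee expecting a plain int (*)(void) has no room for the chain field.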

Using nested functions this way requires an executable stack, which is discouraged these days. There is not really any way around it.

Dissection of a trampoline:

Here is an example of a nested function in GCC's extended C:

    void func(int (*param)(int));

    void outer(int x)
    {
        int nested(int y)
        {
            // If x is not used somewhere in here,
            // then the function will be "lifted" into
            // a normal, non-nested function.
            return x + y;
        }
        func(nested);
    }

It's very simple so we can see how it works. Here is the resulting assembly of outer, minus some stuff:

    subq    $40, %rsp
    movl    $nested.1594, %edx
    movl    %edi, (%rsp)
    leaq    4(%rsp), %rdi
    movw    $-17599, 4(%rsp)
    movq    %rsp, 8(%rdi)
    movl    %edx, 2(%rdi)
    movw    $-17847, 6(%rdi)
    movw    $-183, 16(%rdi)
    movb    $-29, 18(%rdi)
    call    func
    addq    $40, %rsp
    ret

You'll notice that most of what it does is write registers and constants to the stack. We can follow along, and find that at SP+4 it places a 19-byte object with the following data (in GAS syntax):

    .word -17599
    .int $nested.1594
    .word -17847
    .quad %rsp
    .word -183
    .byte -29

This is easy enough to run through a disassembler. Suppose that $nested.1594 is 0x01234567 and %rsp is 0x0123456789abcdef. The resulting disassembly, provided by objdump, is:

        0:   41 bb 67 45 23 01       mov    $0x1234567,%r11d
        6:   49 ba ef cd ab 89 67    mov    $0x123456789abcdef,%r10
        d:   45 23 01
       10:   49 ff e3                rex.WB jmpq   *%r11
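The byte layout can be cross-checked with a small sketch in standard C that assembles the same 19 bytes from the constants in the listing. The helper names put_le and build_trampoline are hypothetical, and the code only fills a buffer; it does not execute anything. The negative .word and .byte values are written as their unsigned equivalents (for example, -17599 is 0xBB41 as a 16-bit value).

```c
#include <stddef.h>
#include <stdint.h>

enum { TRAMPOLINE_SIZE = 19 };

/* Store the low n bytes of v in little-endian order, as x86-64 does. */
static void put_le(unsigned char *p, uint64_t v, size_t n)
{
    for (size_t i = 0; i < n; i++)
        p[i] = (unsigned char)(v >> (8 * i));
}

/* Fill out[] with the trampoline image for the given addresses. */
void build_trampoline(unsigned char out[TRAMPOLINE_SIZE],
                      uint32_t nested_addr, uint64_t frame_addr)
{
    put_le(out + 0,  0xBB41, 2);      /* .word -17599: 41 bb, mov $imm32,%r11d */
    put_le(out + 2,  nested_addr, 4); /* .int  address of the nested function  */
    put_le(out + 6,  0xBA49, 2);      /* .word -17847: 49 ba, mov $imm64,%r10  */
    put_le(out + 8,  frame_addr, 8);  /* .quad outer's stack pointer           */
    put_le(out + 16, 0xFF49, 2);      /* .word -183:   49 ff                   */
    put_le(out + 18, 0xE3, 1);        /* .byte -29:    e3, completes jmp *%r11 */
}
```

With nested_addr = 0x01234567 and frame_addr = 0x0123456789abcdef, the buffer holds 41 bb 67 45 23 01 49 ba ef cd ab 89 67 45 23 01 49 ff e3, which matches the bytes shown by objdump.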

So, the trampoline loads the address of the nested function into %r11 and the outer function's stack pointer into %r10, then jumps to the nested function's body through %r11. The nested function body looks like this:

    movl    (%r10), %eax
    addl    %edi, %eax
    ret

As you can see, the nested function uses %r10 to access the outer function's variables.
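Read back into C, the nested body behaves roughly like an ordinary function with one extra pointer parameter. The sketch below is only an illustration under that assumption: nested_lowered and outer_lowered are hypothetical names, the static_chain parameter stands in for %r10, and outer_lowered calls the lowered function directly instead of going through func and a trampoline.

```c
/* static_chain plays the role of %r10: it points at the slot in outer's
   frame where x was stored (the movl %edi, (%rsp) in outer's assembly). */
int nested_lowered(const int *static_chain, int y)
{
    return *static_chain + y;   /* movl (%r10), %eax ; addl %edi, %eax */
}

/* A simplified outer: keep x in the frame, then call the nested body
   with a pointer to that slot as its context. */
int outer_lowered(int x, int y)
{
    int frame_x = x;
    return nested_lowered(&frame_x, y);
}
```

For example, outer_lowered(40, 2) evaluates to 42, the same x + y that the nested function computes through %r10.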

Of course, it's fairly silly that the trampoline is larger than the nested function itself. You could easily do better. But not very many people use this feature, and this way, the trampoline can stay the same size (19 bytes) no matter how large the nested function is.

Final note: At the bottom of the assembly, there is a final directive:

 .section        .note.GNU-stack,"x",@progbits 

This instructs the linker to mark the stack as executable.

answered Sep 28 '22 05:09 by Dietrich Epp
