Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Compiling Tail-Call Optimization In Mutual Recursion Across C and Haskell

I'm experimenting with the foreign-function interface in Haskell. I wanted to implement a simple test to see if I could do mutual recursion. So, I created the following Haskell code:

module MutualRecursion where
import Data.Int

foreign import ccall countdownC::Int32->IO ()
foreign export ccall countdownHaskell::Int32->IO()

countdownHaskell::Int32->IO()
countdownHaskell n = print n >> if n > 0 then countdownC (pred n) else return ()

Note that the recursive case is a call to countdownC, so this should be tail-recursive.

In my C code, I have

#include <stdio.h>

#include "MutualRecursionHaskell_stub.h"

void countdownC(int count)
{
    printf("%d\n", count);
    if(count > 0)
        return countdownHaskell(count-1);
}

int main(int argc, char* argv[])
{
    hs_init(&argc, &argv);

    countdownHaskell(10000);

    hs_exit();
    return 0;
}

Which is likewise tail recursive. So then I make a

MutualRecursion: MutualRecursionHaskell_stub
    ghc -O2 -no-hs-main MutualRecursionC.c MutualRecursionHaskell.o -o MutualRecursion
MutualRecursionHaskell_stub:
    ghc -O2 -c MutualRecursionHaskell.hs

and compile with make MutualRecursion.

And... upon running, it segfaults after printing 8991. Just as a test to make sure gcc itself can handle tco in mutual recursion, I did

void countdownC2(int);

void countdownC(int count)
{
    printf("%d\n", count);
    if(count > 0)
        return countdownC2(count-1);
}

void countdownC2(int count)
{
    printf("%d\n", count);
    if(count > 0)
        return countdownC(count-1);
}

and that worked quite fine. It also works in the single-recursion case of just in C and just in Haskell.

So my question is, is there a way to indicate to GHC that the call to the external C function is tail recursive? I'm assuming that the stack frame does come from the call from Haskell to C and not the other way around, since the C code is very clearly a return of a function call.

like image 258
Crazycolorz5 Avatar asked Nov 03 '15 18:11

Crazycolorz5


People also ask

Does C support tail recursion optimization?

Since many Scheme compilers use C as an intermediate target code, the tail recursion must be encoded in C without growing the stack, even if the C compiler does not optimize tail calls. Many implementations achieve this by using a device known as a trampoline, a piece of code that repeatedly calls functions.

Does Haskell use tail recursion?

Note that foldl is tail recursive. The important concept to know in Haskell is guarded recursion (see tail recursion modulo cons), where any recursive calls occur within a data constructor (such as foldr , where the recursive call to foldr occurs as an argument to (:) ).

How does Haskell optimize recursion?

Haskell uses lazy-evaluation to implement recursion, so treats anything as a promise to provide a value when needed (this is called a thunk). Thunks get reduced only as much as necessary to proceed, no more. This resembles the way you simplify an expression mathematically, so it's helpful to think of it that way.


1 Answers

I believe cross-language C-Haskell tail calls are very, very hard to achieve.

I do not know the exact details, but the C runtime and the Haskell runtime are vastly different. The main factors for this difference, as far as I can see, are:

  • different paradigm: purely functional vs imperative
  • garbage collection vs manual memory management
  • lazy semantics vs strict one

The kinds of optimizations which are likely to survive across language boundaries given such differences are next to zero. Perhaps, in theory, one could invent an ad hoc C runtime together with a Haskell runtime so that some optimizations are feasible, but GHC and GCC were not designed in this way.

Just to show an example of the potential differences, assume we have the following Haskell code

p :: Int -> Bool
p x = x==42

main = if p 42
       then putStrLn "A"     -- A
       else putStrLn "B"     -- B

A possible implementation of the main could be the following:

  • push the address of A on the stack
  • push the address of B on the stack
  • push 42 on the stack
  • jump to p
  • A: print "A", jump to end
  • B: print "B", jump to end

while p is implemented as follows:

  • p: pop x from the stack
  • pop b from stack
  • pop a from stack
  • test x against 42
  • if equal, jump to a
  • jump to b

Note how p is invoked with two return addresses, one for each possible result. This is different from C, whose standard implementations use only one return address. When crossing boundaries the compiler must account for this difference and compensate.

Above I also did not account for the case when the argument of p is a thunk, to keep it simple. The GHC allocator can also trigger garbage collection.

Note that the above fictional implementation was actually used in the past by GHC (the so called "push/enter" STG machine). Even if now it is no longer in use, the "eval/apply" STG machine is only marginally closer to the C runtime. I'm not even sure about GHC using the regular C stack: I think it does not, using its own one.

You can check the GHC developer wiki to see the gory details.

like image 128
chi Avatar answered Sep 24 '22 14:09

chi