Roughly following this tutorial, I managed to get this toy project working. It calls a Haskell function from a C++ program.
Foo.hs
{-# LANGUAGE ForeignFunctionInterface #-}
module Foo where
foreign export ccall foo :: Int -> Int -> IO Int
foo :: Int -> Int -> IO Int
foo n m = return . sum $ f n ++ f m
f :: Int -> [Int]
f 0 = []
f n = n : f (n-1)
bar.c++
#include "HsFFI.h"
#include FOO // Haskell module (path defined in build script)
#include <iostream>
int main(int argc, char *argv[]) {
    hs_init(&argc, &argv);
    std::cout << foo(37, 19) << "\n";
    hs_exit();
    return 0;
}
call-haskell-from-cxx.cabal
name: call-haskell-from-cxx
version: 0.1.0.0
build-type: Simple
cabal-version: >=1.10
executable foo.so
  main-is: Foo.hs
  build-depends: base >=4.10 && <4.11
  ghc-options: -shared -fPIC -dynamic
  extra-libraries: HSrts-ghc8.2.1
  default-language: Haskell2010
build script
#!/bin/bash
hs_lib="foo.so"
hs_obj="dist/build/$hs_lib/$hs_lib"
ghc_version="8.2.1" # May need to be tweaked,
ghc_libdir="/usr/local/lib/ghc-$ghc_version" # depending on system setup.
set -x
cabal build
g++ -I "$ghc_libdir/include" -D"FOO=\"${hs_obj}-tmp/Foo_stub.h\"" -c bar.c++ -o test.o
g++ test.o "$hs_obj" \
    -L "$ghc_libdir/rts" "-lHSrts-ghc$ghc_version" \
    -o test
env LD_LIBRARY_PATH="dist/build/$hs_lib:$ghc_libdir/rts:$LD_LIBRARY_PATH" \
    ./test
This works (Ubuntu 16.04, GCC 5.4.0) and prints 893, but it isn't robust: if I remove the actual invocation of the Haskell function, i.e. the line std::cout << foo(37, 19) << "\n";, then the linking step fails with the error message
/usr/local/lib/ghc-8.2.1/rts/libHSrts-ghc8.2.1.so: undefined reference to `base_GHCziTopHandler_flushStdHandles_closure'
/usr/local/lib/ghc-8.2.1/rts/libHSrts-ghc8.2.1.so: undefined reference to `base_GHCziStable_StablePtr_con_info'
/usr/local/lib/ghc-8.2.1/rts/libHSrts-ghc8.2.1.so: undefined reference to `base_GHCziPtr_FunPtr_con_info'
/usr/local/lib/ghc-8.2.1/rts/libHSrts-ghc8.2.1.so: undefined reference to `base_GHCziWord_W8zh_con_info'
/usr/local/lib/ghc-8.2.1/rts/libHSrts-ghc8.2.1.so: undefined reference to `base_GHCziIOziException_cannotCompactPinned_closure'
...
Apparently, using the Haskell function pulls in additional libraries that are needed. How do I explicitly depend on everything necessary, to avoid such brittleness?
Output of the build script when the foo call is included, with ldd on the final executable:
++ cabal build
Preprocessing executable 'foo.so' for call-haskell-from-C-0.1.0.0..
Building executable 'foo.so' for call-haskell-from-C-0.1.0.0..
Linking a.out ...
Linking dist/build/foo.so/foo.so ...
++ g++ -I /usr/local/lib/ghc-8.2.1/include '-DFOO="dist/build/foo.so/foo.so-tmp/Foo_stub.h"' -c bar.c++ -o test.o
++ g++ test.o dist/build/foo.so/foo.so -L /usr/local/lib/ghc-8.2.1/rts -lHSrts-ghc8.2.1 -o test
++ env LD_LIBRARY_PATH=dist/build/foo.so:/usr/local/lib/ghc-8.2.1/rts: sh -c 'ldd ./test; ./test'
linux-vdso.so.1 => (0x00007fff23105000)
foo.so => dist/build/foo.so/foo.so (0x00007fdfc5360000)
libHSrts-ghc8.2.1.so => /usr/local/lib/ghc-8.2.1/rts/libHSrts-ghc8.2.1.so (0x00007fdfc52f8000)
libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007fdfc4dbe000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fdfc49f4000)
libHSbase-4.10.0.0-ghc8.2.1.so => /usr/local/lib/ghc-8.2.1/base-4.10.0.0/libHSbase-4.10.0.0-ghc8.2.1.so (0x00007fdfc4020000)
libHSinteger-gmp-1.0.1.0-ghc8.2.1.so => /usr/local/lib/ghc-8.2.1/integer-gmp-1.0.1.0/libHSinteger-gmp-1.0.1.0-ghc8.2.1.so (0x00007fdfc528b000)
libHSghc-prim-0.5.1.0-ghc8.2.1.so => /usr/local/lib/ghc-8.2.1/ghc-prim-0.5.1.0/libHSghc-prim-0.5.1.0-ghc8.2.1.so (0x00007fdfc3b80000)
libgmp.so.10 => /usr/lib/x86_64-linux-gnu/libgmp.so.10 (0x00007fdfc3900000)
libffi.so.6 => /usr/local/lib/ghc-8.2.1/rts/libffi.so.6 (0x00007fdfc36f3000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fdfc33ea000)
librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007fdfc31e2000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fdfc2fde000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fdfc2dc1000)
/lib64/ld-linux-x86-64.so.2 (0x00007fdfc5140000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007fdfc2bab000)
This answer explains what happens during linking, why the solution with -Wl,--no-as-needed works, and what should be done instead for a somewhat more robust approach.
In a nutshell: libHSrts-ghcXXX.so depends on libHSbaseXXX.so, libHSinteger-gmpXXX.so and libHSghc-primXXX.so, which must be provided to the linker at link time.
The solution proposed here involves a lot of manual work and does not scale well. I don't know enough about cabal to tell you how to automate it, but I hope you can take that last step yourself.
Or maybe you will be fine with the -Wl,--no-as-needed solution once you know what happens behind the scenes.
Let's start by stepping through the linking process for the version without the call to foo, in a somewhat simplified manner (here is a great article from Eli Bendersky, even if it is about static linkage):
1. The linker maintains a table of symbols and has to find a definition (machine code) for each of them. Let's simplify and assume that at the beginning the table contains only the symbol main, whose definition is unknown.
2. The definition of main is found in the object file test.o. However, main uses the functions hs_init and hs_exit, so the definition of main is of no use unless we also know theirs. The linker now has to look for the definitions of hs_init and hs_exit.
3. Next, the linker looks at foo.so, but foo.so doesn't define any symbol we are interested in (foo is not used!), so the linker skips foo.so and never looks back.
4. The linker looks at -lHSrts-ghcXXX. There it finds the definitions of hs_init and hs_exit. Thus the whole content of the shared library is used, but it in turn needs definitions of symbols such as base_GHCziTopHandler_flushStdHandles_closure. The linker starts to look for definitions of these symbols.
5. There are no more libraries on the command line, so the linker has nothing left to look at, and the link fails because the definitions of some symbols are missing.
What is different when foo is used? After step 2, not only hs_init and hs_exit are wanted but also foo, which is found in foo.so. So foo.so must be included.
Because of the way the library foo.so was built, it contains the following information:
>>> readelf -d dist/build/foo.so/foo.so | grep NEEDED
0x0000000000000001 (NEEDED) Shared library: [libHSrts-ghc7.10.3.so]
0x0000000000000001 (NEEDED) Shared library: [libHSbase-4.8.2.0-HQfYBxpPvuw8OunzQu6JGM-ghc7.10.3.so]
0x0000000000000001 (NEEDED) Shared library: [libHSinteger-gmp-1.0.0.0-2aU3IZNMF9a7mQ0OzsZ0dS-ghc7.10.3.so]
0x0000000000000001 (NEEDED) Shared library: [libHSghc-prim-0.4.0.0-8TmvWUcS1U1IKHT0levwg3-ghc7.10.3.so]
0x0000000000000001 (NEEDED) Shared library: [libgmp.so.10]
0x0000000000000001 (NEEDED) Shared library: [libc.so.6]
>>> readelf -d dist/build/foo.so/foo.so | grep RPATH
0x000000000000000f (RPATH) Library rpath: [
/usr/lib/ghc/base_HQfYBxpPvuw8OunzQu6JGM:
/usr/lib/ghc/rts:
/usr/lib/ghc/ghcpr_8TmvWUcS1U1IKHT0levwg3:
/usr/lib/ghc/integ_2aU3IZNMF9a7mQ0OzsZ0dS]
From this information, the linker knows which shared libraries are needed (the NEEDED entries) and where they can be found on your system (RPATH). These libraries are found, opened and processed (i.e. marked as needed), and thus all necessary definitions are present.
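The NEEDED entries can be turned into linker flags mechanically. The following is only a minimal sketch, assuming the readelf -d output format shown above; needed_to_flags is a hypothetical helper name, not part of any tool:

```bash
#!/bin/bash
# needed_to_flags (hypothetical helper): reads `readelf -d` output on stdin
# and prints one -l flag for every libHS*.so named in a NEEDED entry.
# System libraries such as libgmp.so.10 or libc.so.6 are skipped on purpose,
# because the compiler driver normally takes care of those itself.
needed_to_flags() {
  sed -n 's/.*(NEEDED).*\[lib\(HS[^]]*\)\.so\].*/-l\1/p'
}
```

It could be used as readelf -d dist/build/foo.so/foo.so | needed_to_flags. The matching -L directories still have to come from the RPATH entry or from the GHC installation.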
You can follow the whole process by adding the following options to the link step:
g++ ...
    -Wl,--trace-symbol=base_GHCziTopHandler_flushStdHandles_closure \
    -Wl,--verbose \
    -o test
The same thing happens if we force foo.so to be included in the resulting executable via -Wl,--no-as-needed, as suggested by @Yuras.
What is the consequence of this analysis?
We should provide the needed libraries on the command line (after -lHSrts-ghcXXX) and not depend on them being added by chance through other shared libraries. Obviously, the somewhat cryptic names are only valid for my installation:
g++ ...
-L/usr/lib/ghc/base_HQfYBxpPvuw8OunzQu6JGM -lHSbase-4.8.2.0-HQfYBxpPvuw8OunzQu6JGM-ghc7.10.3 \
-L/usr/lib/ghc/integ_2aU3IZNMF9a7mQ0OzsZ0dS -lHSinteger-gmp-1.0.0.0-2aU3IZNMF9a7mQ0OzsZ0dS-ghc7.10.3 \
-L/usr/lib/ghc/ghcpr_8TmvWUcS1U1IKHT0levwg3 -lHSghc-prim-0.4.0.0-8TmvWUcS1U1IKHT0levwg3-ghc7.10.3 \
...
-o test
Now it builds, but doesn't load at run time (after all, the right rpath is only set in foo.so, but foo.so isn't used). To fix it we could either extend LD_LIBRARY_PATH or add -rpath to the link command line:
g++ ...
-L... -lHSbase-... -Wl,-rpath,/usr/lib/ghc/base_HQfYBxpPvuw8OunzQu6JGM \
-L... -lHSinteger-gmp-... -Wl,-rpath,/usr/lib/ghc/integ_2aU3IZNMF9a7mQ0OzsZ0dS \
-L... -lHSghc-prim-... -Wl,-rpath,/usr/lib/ghc/ghcpr_8TmvWUcS1U1IKHT0levwg3 \
...
-o test
There must be a utility to get the paths and library names automatically (cabal seems to do it when building foo.so), but I don't know how to do it because I have no experience with Haskell/cabal.
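One possible starting point, offered only as a sketch: ghc --numeric-version and ghc-pkg field with --simple-output can report the version, library directory and library base name of a package. The helper link_flags is hypothetical, and the sketch assumes that the dynamic library sits in the directory reported by library-dirs and is named after hs-libraries with a -ghcVERSION suffix, as in the ldd output above. Both assumptions may not hold for every installation.

```bash
#!/bin/bash
# link_flags DIR HSLIB GHC_VERSION (hypothetical helper)
# Prints the -L, -l and -rpath flags for one dynamic Haskell library,
# assuming the file is called lib<HSLIB>-ghc<GHC_VERSION>.so inside DIR.
link_flags() {
  local dir="$1" hslib="$2" ghc_version="$3"
  printf '%s %s %s\n' "-L${dir}" "-l${hslib}-ghc${ghc_version}" "-Wl,-rpath,${dir}"
}

# Query the toolchain only when it is actually installed.
if command -v ghc >/dev/null 2>&1 && command -v ghc-pkg >/dev/null 2>&1; then
  ghc_version="$(ghc --numeric-version)"
  # Package list taken from the NEEDED entries; adjust for your GHC version.
  for pkg in base integer-gmp ghc-prim; do
    dir="$(ghc-pkg field "$pkg" library-dirs --simple-output)"
    hslib="$(ghc-pkg field "$pkg" hs-libraries --simple-output)"
    link_flags "$dir" "$hslib" "$ghc_version"
  done
fi
```

The printed flags would go after the RTS library in the g++ link command. A package that reports several library directories would need extra handling.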
Usually ghc links executables with the -Wl,--no-as-needed option, and you should use it too. (You can check how ghc links an executable, e.g. using cabal build --ghc-options=-v3.)
You can find more details here. My understanding is as follows: foo.so requires libHSbase-4.10.0.0-ghc8.2.1.so to be loaded at runtime as needed, i.e. when we need a symbol from it (check readelf -a dist/build/foo.so/foo.so | grep NEEDED). So if you don't call foo, then base.so is not loaded. But ghc needs all libraries to be loaded (I don't know why). The --no-as-needed option forces all libraries to be loaded.
Note that the --no-as-needed option is position-dependent, so put it before the shared library.
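To illustrate the ordering, here is a sketch based on the link step of the build script from the question. link_cmd is a hypothetical helper that only prints the command so the argument order can be inspected; its three parameters correspond to hs_obj, ghc_libdir and ghc_version in that script.

```bash
#!/bin/bash
# link_cmd HS_OBJ GHC_LIBDIR GHC_VERSION (hypothetical helper)
# Prints the g++ link command with -Wl,--no-as-needed placed in front of the
# Haskell shared object, so that the option applies to it and to the
# libraries that follow on the command line.
link_cmd() {
  local hs_obj="$1" ghc_libdir="$2" ghc_version="$3"
  echo g++ test.o -Wl,--no-as-needed "$hs_obj" \
    -L "$ghc_libdir/rts" "-lHSrts-ghc$ghc_version" \
    -o test
}
```

For example, link_cmd dist/build/foo.so/foo.so /usr/local/lib/ghc-8.2.1 8.2.1 prints the command for the setup in the question.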