Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How much does it cost for Haskell FFI to go into C and back?

If I want to call more than one C function, each one depending on the result of the previous one, is it better to create a wrapper C function that handles the three calls? Will it cost the same as using Haskell FFI without converting types?

Suppose I have the following Haskell code:

foo :: CInt -> IO CInt
foo x = do
  a <- cfA x
  b <- cfB a
  c <- cfC c
  return c

Each function cf* is a C call.

Is it better, in terms of performance, to create a single C function like cfABC and make only one foreign call in Haskell?

int cfABC(int x) {
   int a, b, c;
   a = cfA(x);
   b = cfB(a);
   c = cfC(b);
   return c;
}

Haskell code:

foo :: CInt -> IO CInt
foo x = do
  c <- cfABC x
  return c

How to measure the performace cost of a C call from Haskell? Not the cost of the C function itself, but the cost of the "context-switching" from Haskell to C and back.

like image 295
Thiago Negri Avatar asked Jan 25 '13 10:01

Thiago Negri


1 Answers

The answer depends mostly on whether the foreign call is a safe or an unsafe call.

An unsafe C call is basically just a function call, so if there's no (nontrivial) type conversion, there are three function calls if you make three foreign calls, and between one and four when you write a wrapper in C, depending on how many of the component functions can be inlined when compiling the C, since a foreign call into C cannot be inlined by GHC. Such a function call is generally very cheap (it's just a copy of the arguments and a jump to the code), so the difference is small either way, the wrapper should be slightly slower when no C function can be inlined into the wrapper, and slightly faster when all can be inlined [and that was indeed the case in my benchmarking, +1.5ns resp. -3.5ns where the three foreign calls took about 12.7ns for everything just returning the argument]. If the functions do something nontrivial, the difference is negligible (and if they're not doing anything nontrivial, you'd probably better write them in Haskell to let GHC inline the code).

A safe C call involves saving some nontrivial amount of state, locking, possibly spawning a new OS thread, so that takes much longer. Then the small overhead of perhaps calling one function more in C is negligible compared to the cost of the foreign calls [unless passing the arguments requires an unusual amount of copying, many huge structs or so]. In my do-nothing benchmark

{-# LANGUAGE ForeignFunctionInterface #-}
module Main (main) where

import Criterion.Main
import Foreign.C.Types
import Control.Monad

foreign import ccall safe "funcs.h cfA" c_cfA :: CInt -> IO CInt
foreign import ccall safe "funcs.h cfB" c_cfB :: CInt -> IO CInt
foreign import ccall safe "funcs.h cfC" c_cfC :: CInt -> IO CInt
foreign import ccall safe "funcs.h cfABC" c_cfABC :: CInt -> IO CInt

wrap :: (CInt -> IO CInt) -> Int -> IO Int
wrap foo arg = fmap fromIntegral $ foo (fromIntegral arg)

cfabc = wrap c_cfABC

foo :: Int -> IO Int
foo = wrap (c_cfA >=> c_cfB >=> c_cfC)

main :: IO ()
main = defaultMain
            [ bench "three calls" $ foo 16
            , bench "single call" $ cfabc 16
            ]

where all the C functions just return the argument, the mean for the single wrapped call is a bit above 100ns [105-112], and for the three separate calls around 300ns [290-315].

So a safe c call takes roughly 100ns and usually, it is then faster to wrap them up into a single call. But still, if the called functions do something sufficiently nontrivial, the difference won't matter.

like image 89
Daniel Fischer Avatar answered Oct 17 '22 07:10

Daniel Fischer