
Idiomatic Usage of Haskell Foreign Types?

Tags: c, haskell, ffi

Problem: Going from Haskell types to Foreign types and back requires a lot of boilerplate code.

For example, suppose we're working with the following Haskell data structure:

data HS_DataStructure = HS_DataStructure {
       a1 :: String    
    ,  b1 :: String
    ,  c1 :: Int
}

In order to get this data structure into the land of C, we will need to consider its struct analogue:

typedef struct {
        char *a;
        char *b;
        int   c;
} c_struct;

But in order to pass such a struct into C from Haskell, we have to transform HS_DataStructure into the following:

data HS_Struct = HS_Struct { 
      a :: CString
    , b :: CString
    , c :: CInt
} deriving Show

We then have to make HS_Struct an instance of Storable:

instance Storable HS_Struct where
    sizeOf    _ = #{size c_struct}
    alignment _ = alignment (undefined :: CString)

    poke p c_struct = do
        #{poke c_struct, a} p $ a c_struct
        #{poke c_struct, b} p $ b c_struct
        #{poke c_struct, c} p $ c c_struct

    peek p = return HS_Struct
              `ap` (#{peek c_struct, a} p)
              `ap` (#{peek c_struct, b} p)
              `ap` (#{peek c_struct, c} p)

(In the above I'm using hsc2hs syntax).

Now finally, in order to convert between HS_Struct and HS_DataStructure, we are forced to use the following helper functions(!):

makeStruct :: HS_DataStructure -> IO HS_Struct
makeStruct hsds = do
    -- newCString allocates with malloc; the caller must free both strings
    str1 <- newCString (a1 hsds)
    str2 <- newCString (b1 hsds)
    -- fromIntegral converts Int to CInt (wrapping silently if out of range)
    return (HS_Struct str1 str2 (fromIntegral (c1 hsds)))

makeDataStructure :: Ptr HS_Struct -> IO HS_DataStructure
makeDataStructure p = do
    hss <- peek p
    hs1 <- peekCString (a hss)
    hs2 <- peekCString (b hss)
    return (HS_DataStructure hs1 hs2 (fromIntegral (c hss)))

This seems to be an insane amount of boilerplate to go back and forth between Haskell and C.

Questions

  1. Is there any way to minimize the boilerplate above?
  2. With Haskell projects that involve a heavy amount of FFI, is it idiomatic to just give in and primarily use Haskell's C types (i.e., CInt, CString, and so forth)? This would at least save you the hassle of having to convert back and forth between the types.
George asked Dec 18 '16 00:12


1 Answer

"Idiomatically", I think there is no way getting around the boilerplate for marshaling values between C and Haskell. However, there are answers to your question that you might find useful. After having written and contributed to a number of libraries that wrap C libraries for Haskell users, I would highly recommend using the bindings-dsl library to address your problems, which uses hsc2hs under the hood. Specifically, to answer your questions:

1. Is there any way to minimize the boilerplate above?

You can eliminate some of it, but marshaling between C types and Haskell types in most cases requires care to make sure that Haskell values remain well founded. This is a feature, not a bug, since C types are inherently different representations. For String and CString, for example, you need to specify what happens if you call

makeStruct $ HS_DataStructure (repeat 'a') (repeat 'b') 0

... or how to handle memory allocation (your example will leak without a corresponding deleteStruct function). Similarly, there are concerns about integer semantics with Int and CInt. What happens if you get a CInt that's out of range of Int? Clamp? Wrap? These answers are usually application-specific, and interoperability with other libraries requires that the invariants hold for all Haskell programs.
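The allocation and range questions above can be answered in more than one way. As a minimal sketch of one possible policy (not necessarily what any particular library does): let withCString own the C strings, so they are freed when the callback returns, and narrow the Int with Data.Bits.toIntegralSized, which yields Nothing instead of wrapping. The names withStruct and roundTrip are hypothetical, and the two record types are restated without a Storable instance so the block compiles on its own. An infinite String such as repeat 'a' still diverges here; bounding the length would be a separate decision.

```haskell
import Data.Bits (toIntegralSized)
import Foreign.C.String (CString, peekCString, withCString)
import Foreign.C.Types (CInt)

-- Types restated from the question so this sketch is self-contained.
data HS_DataStructure = HS_DataStructure
    { a1 :: String
    , b1 :: String
    , c1 :: Int
    } deriving (Eq, Show)

data HS_Struct = HS_Struct
    { a :: CString
    , b :: CString
    , c :: CInt
    }

-- Hypothetical helper: build a temporary HS_Struct, run an action on it,
-- and release both C strings afterwards (withCString frees them on exit).
-- Returns Nothing when c1 does not fit into a CInt, rather than wrapping.
withStruct :: HS_DataStructure -> (HS_Struct -> IO r) -> IO (Maybe r)
withStruct hsds act =
    case toIntegralSized (c1 hsds) of
        Nothing -> return Nothing
        Just cc ->
            withCString (a1 hsds) $ \ca ->
            withCString (b1 hsds) $ \cb ->
                Just <$> act (HS_Struct ca cb cc)

-- Hypothetical usage: marshal to the C representation and straight back.
-- The strings are copied out with peekCString before they are freed.
roundTrip :: HS_DataStructure -> IO (Maybe HS_DataStructure)
roundTrip hsds = withStruct hsds $ \s ->
    HS_DataStructure
        <$> peekCString (a s)
        <*> peekCString (b s)
        <*> pure (fromIntegral (c s))
```

The trade-off is that the HS_Struct is only valid inside the callback. If the C side keeps the pointers, this scoping no longer fits, and a newCString-based makeStruct paired with an explicit free function is the more likely shape.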

Using bindings-dsl, we can at least get rid of the need for writing your own Storable instances by defining a .hsc file with the following:

module MyModule where

#include <bindings.dsl.h>
#include <c_struct.h>

#starttype struct HS_Struct
#field a, CString
#field b, CString
#field c, CInt
#stoptype

If you add this module to your cabal file, cabal should recognize that it needs to use hsc2hs and will compile it properly adding all of the additional instances. Look at any of the links above for examples. Your code for makeDataStructure can be made (a bit) simpler, too:

makeDataStructure :: Ptr HS_Struct -> IO HS_DataStructure
makeDataStructure p = do
    HS_Struct ca cb cc <- peek p
    HS_DataStructure <$> peekCString ca <*> peekCString cb <*> pure (fromIntegral cc)

The real "boilerplate reduction" win from using bindings-DSL comes from the Haskell-side FFI definitions of functions that exist in the headers.

2. With Haskell projects that involve a heavy amount of FFI, is it idiomatic to just give in and primarily use Haskell's C types (i.e., CInt, CString, and so forth)? This would at least save you the hassle of having to convert back and forth between the types.

Whether something is "idiomatic" is a bit subjective (and hence susceptible to both bikeshedding and gatekeeping), so take any answer to this question with a grain of salt; the question may also not be the best fit for this site. In my opinion, primarily using C types is not idiomatic. Those types exist for the sole purpose of interfacing with C, and the Haskell runtime is optimized around Haskell types. If you find yourself writing so much FFI that you feel the need to primarily use FFI types, that might be a good indication that you should be writing a C library instead.

The exception is, of course, if you're writing a Haskell library that's wrapping a C library. In this case, I'd argue that the whole point of the Haskell library is to implement the boilerplate in a way that is consistent between the Haskell runtime and the C library interface.
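To illustrate what "implementing the boilerplate" at the boundary can look like, here is a small sketch in which the C types appear only in the raw import and the exported function speaks plain Haskell types. The C standard library's strlen stands in for a real wrapped library function; the wrapper name stringLength is hypothetical.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}

import Foreign.C.String (CString, withCString)
import Foreign.C.Types (CSize (..))

-- Raw binding: C types only, not meant to be exported to users.
foreign import ccall unsafe "string.h strlen"
    c_strlen :: CString -> IO CSize

-- Hypothetical wrapper: takes and returns ordinary Haskell types.
-- withCString allocates a temporary NUL-terminated copy and frees it
-- when the callback returns, so nothing leaks.
stringLength :: String -> IO Int
stringLength str = withCString str $ \cstr ->
    fromIntegral <$> c_strlen cstr
```

Note that the result counts bytes in the current locale encoding up to the first NUL, not Haskell characters. Deciding and documenting that kind of detail is exactly the work the wrapper library takes on so that its users do not have to.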


Mokosha answered Oct 11 '22 17:10