
Interchange structured data between Haskell and C

First, I'm a Haskell beginner.

I'm planning to integrate Haskell into C for a realtime game. Haskell does the logic, C does the rendering. To do this, I have to pass a huge, complexly structured piece of data (the game state) back and forth on each tick (at least 30 times per second), so passing the data should be lightweight. The state data may be laid out in a contiguous region of memory. Both the Haskell and the C parts should be able to access every part of the state freely.

In the best case, the cost of passing the data is that of copying a pointer. In the worst case, it is copying the whole data with conversion.

I'm reading the Haskell FFI cookbook (http://www.haskell.org/haskellwiki/FFICookBook#Working_with_structs). The Haskell code there looks like it specifies the memory layout explicitly.

I have a few questions.

  1. Can Haskell specify memory layout explicitly (to match a C struct exactly)?
  2. Is this the real memory layout, or is some kind of conversion required (with a performance penalty)?
  3. If Q#2 is true, is there any performance penalty when the memory layout is specified explicitly?
  4. What is the #{alignment foo} syntax? Where can I find documentation about it?
  5. If I want to pass huge data with the best performance, how should I do that?

PS: The explicit memory layout feature I mean is something like C#'s [StructLayout] attribute, which specifies in-memory position and size explicitly. http://www.developerfusion.com/article/84519/mastering-structs-in-c/

I'm not sure whether Haskell has a language construct that maps onto the fields of a C struct.

eonil asked Dec 21 '10 17:12


3 Answers

I would strongly recommend using a preprocessor. I like c2hs, but hsc2hs is very common because it's included with ghc. Greencard appears to be abandoned.

To answer your questions:

1) Yes, through the definition of the Storable instance. Using Storable is the only safe mechanism to pass data through the FFI. The Storable instance defines how to marshal data between a Haskell type and raw memory (either a Haskell Ptr, ForeignPtr, or StablePtr, or a C pointer). Here's an example:

data PlateC = PlateC {
  numX :: Int,
  numY :: Int,
  v1   :: Double,
  v2   :: Double } deriving (Eq, Show)

instance Storable PlateC where
  alignment _ = alignment (undefined :: CDouble)
  sizeOf _ = {#sizeof PlateC#}
  peek p =
    PlateC <$> fmap fI ({#get PlateC.numX #} p)
           <*> fmap fI ({#get PlateC.numY #} p)
           <*> fmap realToFrac ({#get PlateC.v1 #} p)
           <*> fmap realToFrac ({#get PlateC.v2 #} p)
  poke p (PlateC xv yv v1v v2v) = do
    {#set PlateC.numX #} p (fI xv)
    {#set PlateC.numY #} p (fI yv)
    {#set PlateC.v1 #}   p (realToFrac v1v)
    {#set PlateC.v2 #}   p (realToFrac v2v)

The {# ... #} fragments are c2hs code. fI is fromIntegral. The values in the get and set fragments refer to the following struct from an included header, not the Haskell type of the same name:

struct PlateCTag ;

typedef struct PlateCTag {
  int numX;
  int numY;
  double v1;
  double v2;
} PlateC ;

c2hs converts this to the following plain Haskell:

instance Storable PlateC where
  alignment _ = alignment (undefined :: CDouble)
  sizeOf _ = 24
  peek p =
    PlateC <$> fmap fI ((\ptr -> do {peekByteOff ptr 0 ::IO CInt}) p)
           <*> fmap fI ((\ptr -> do {peekByteOff ptr 4 ::IO CInt}) p)
           <*> fmap realToFrac ((\ptr -> do {peekByteOff ptr 8 ::IO CDouble}) p)
           <*> fmap realToFrac ((\ptr -> do {peekByteOff ptr 16 ::IO CDouble}) p)
  poke p (PlateC xv yv v1v v2v) = do
    (\ptr val -> do {pokeByteOff ptr 0 (val::CInt)}) p (fI xv)
    (\ptr val -> do {pokeByteOff ptr 4 (val::CInt)}) p (fI yv)
    (\ptr val -> do {pokeByteOff ptr 8 (val::CDouble)})   p (realToFrac v1v)
    (\ptr val -> do {pokeByteOff ptr 16 (val::CDouble)})   p (realToFrac v2v)

The offsets are of course architecture-dependent, so using a preprocessor allows you to write portable code.

You use this by allocating space for your data type (new, malloc, etc.) and poking the data into the Ptr (or ForeignPtr).
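As a minimal sketch of that allocate-and-poke step without any preprocessor, the following plain Haskell hard-codes the same offsets that c2hs generated above. The offsets (0, 4, 8, 16) and the size (24) are assumptions that hold only for ABIs matching that output, and roundTrip is a hypothetical helper added for illustration.

```haskell
import Foreign.C.Types (CDouble, CInt)
import Foreign.Marshal.Alloc (alloca)
import Foreign.Storable (Storable (..))

-- Haskell-side view of the C struct PlateC.
data PlateC = PlateC
  { numX :: Int
  , numY :: Int
  , v1   :: Double
  , v2   :: Double
  } deriving (Eq, Show)

-- ASSUMPTION: size 24 and offsets 0/4/8/16 are copied from the c2hs output
-- shown above; they are not portable across architectures.
instance Storable PlateC where
  sizeOf _    = 24
  alignment _ = alignment (undefined :: CDouble)
  peek p = do
    x <- peekByteOff p 0  :: IO CInt
    y <- peekByteOff p 4  :: IO CInt
    a <- peekByteOff p 8  :: IO CDouble
    b <- peekByteOff p 16 :: IO CDouble
    return (PlateC (fromIntegral x) (fromIntegral y) (realToFrac a) (realToFrac b))
  poke p (PlateC x y a b) = do
    pokeByteOff p 0  (fromIntegral x :: CInt)
    pokeByteOff p 4  (fromIntegral y :: CInt)
    pokeByteOff p 8  (realToFrac a :: CDouble)
    pokeByteOff p 16 (realToFrac b :: CDouble)

-- Hypothetical helper: allocate temporary space sized by the instance,
-- write the value into it, and read it back.
roundTrip :: PlateC -> IO PlateC
roundTrip plate = alloca $ \p -> do
  poke p plate
  peek p
```

In GHCi, roundTrip (PlateC 3 4 1.5 2.5) should give back an equal value. In real code the pointer would typically be handed to a C function between the poke and the peek.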

2) This is the real memory layout.

3) There is a penalty for reading/writing with peek/poke. If you have a lot of data, it's better to convert only what you need, e.g. reading just one element from a C array instead of marshalling the entire array to a Haskell list.

4) Syntax depends upon the preprocessor you choose; see the c2hs docs or the hsc2hs docs. Confusingly, hsc2hs uses the syntax #stuff or #{stuff}, while c2hs uses {#stuff #}.

5) @sclv's suggestion is what I would do as well. Write a Storable instance and keep a pointer to the data. You can either write C functions to do all the work and call them through the FFI, or (less good) write low-level Haskell using peek and poke to operate on just the parts of the data you need. Marshalling the whole thing back and forth (i.e. calling peek or poke on the entire data structure) will be expensive, but if you only pass pointers around the cost will be minimal.
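To illustrate touching only the part of the data you need, here is a small sketch that updates a single element of a C array of doubles through a pointer, without marshalling the array into a Haskell list. The names scaleElem and demoScale are hypothetical, and withArray only stands in for a buffer that would normally be owned by the C side.

```haskell
import Foreign.C.Types (CDouble)
import Foreign.Marshal.Array (withArray)
import Foreign.Ptr (Ptr)
import Foreign.Storable (peekElemOff, pokeElemOff)

-- Multiply element i of a C array by k, in place.
-- Only that one element is read and written.
scaleElem :: Ptr CDouble -> Int -> CDouble -> IO ()
scaleElem arr i k = do
  x <- peekElemOff arr i
  pokeElemOff arr i (x * k)

-- Demo: build a temporary 4-element array, scale element 2 by 10,
-- and read that element back.
demoScale :: IO CDouble
demoScale = withArray [1, 2, 3, 4] $ \arr -> do
  scaleElem arr 2 10
  peekElemOff arr 2
```

The cost here is independent of the array length, which is the point of keeping a pointer instead of converting the whole structure.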

Calling imported functions through the FFI has a significant penalty unless they're marked "unsafe". Declaring an import "unsafe" means that the function must not call back into Haskell; if it does, the behavior is undefined. If you're using concurrency or parallelism, it also means that all Haskell threads on the same capability (i.e. CPU) will block until the call returns, so it should return fairly quickly. If those conditions are acceptable, an "unsafe" call is relatively fast.
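For reference, the difference is a single keyword in the import declaration. The sketch below imports the C library's sin twice, once "unsafe" and once "safe" (the default); the Haskell names c_sinUnsafe and c_sinSafe are made up for this example.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
import Foreign.C.Types (CDouble (..))

-- "unsafe": low call overhead, but the C function must not call back
-- into Haskell and should return quickly.
foreign import ccall unsafe "math.h sin"
  c_sinUnsafe :: CDouble -> CDouble

-- "safe": more overhead per call, but callbacks into Haskell are allowed.
foreign import ccall safe "math.h sin"
  c_sinSafe :: CDouble -> CDouble
```

Both imports compute the same result; only the calling convention between the Haskell runtime and C differs.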

There are a lot of packages on Hackage that deal with this sort of thing. I can recommend hsndfile and hCsound as exhibiting good practice with c2hs. It's probably easier if you look at a binding to a small C library you're familiar with though.

John L answered Nov 15 '22 09:11


Even though you can get a deterministic memory layout for strict unboxed Haskell structures, there are no guarantees, and relying on it is a really, really bad idea.

If you're willing to live with conversion, there's Storable: http://www.haskell.org/ghc/docs/6.12.3/html/libraries/base-4.2.0.2/Foreign-Storable.html

What I'd do is construct the C structures, and then construct Haskell functions that operate directly on them using the FFI, rather than trying to produce Haskell "equivalents" to them.

Alternately, you can decide that you only need to pass a select bit of information to the C -- not the whole game state, but just a few pieces of information about what objects are where in the world, with your actual information on how to draw them living solely in the C side of the equation. Then you do all the logic in Haskell, operating on native Haskell structures, and only project out to the C world that tiny subset of data which the C actually needs to render with.
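A rough sketch of that projection idea, under invented assumptions: the Entity type, its fields, and the flat x,y float layout of the render buffer are all hypothetical, and allocaArray stands in for a buffer the C renderer would normally provide.

```haskell
import Foreign.C.Types (CFloat)
import Foreign.Marshal.Array (allocaArray, peekArray, pokeArray)
import Foreign.Ptr (Ptr)

-- Hypothetical native Haskell game object; the logic works on this type.
data Entity = Entity
  { entX      :: Double
  , entY      :: Double
  , entHealth :: Int     -- logic-only field, never sent to C
  }

-- Write only the positions into a flat buffer laid out as x0,y0,x1,y1,...
-- The buffer must have room for 2 * length es floats.
projectPositions :: Ptr CFloat -> [Entity] -> IO ()
projectPositions buf es =
  pokeArray buf (concatMap (\e -> [realToFrac (entX e), realToFrac (entY e)]) es)

-- Demo: project two entities into a temporary buffer and read it back.
demoProject :: IO [CFloat]
demoProject =
  let es = [Entity 1 2 100, Entity 3.5 4 50]
      n  = 2 * length es
  in allocaArray n $ \buf -> do
       projectPositions buf es
       peekArray n buf
```

Only the fields the renderer needs cross the boundary each tick; everything else stays in ordinary Haskell data.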

Edit: I should add that matrices and other common C structures already have excellent libraries/bindings that keep the heavy lifting on the C side.

sclv answered Nov 15 '22 09:11


hsc2hs, c→hs, and Green Card all provide automated Haskell⇆C structure peek/poke or marshalling. I would recommend their use over manually determining sizes and offsets and using pointer manipulation in Haskell, although that's possible too.

  1. Not as far as I know, if I'm understanding you correctly. Haskell doesn't have any built-in handling of foreign aggregate data structures.
  2.  
  3.  
  4. As that page describes, it's hsc2hs with some C magic.

ephemient answered Nov 15 '22 08:11