
Garbage collector issues in Haskell runtime when (de)allocations are managed in C

I would like to share data (in the simplest case an array of integers) between C and Haskell using Haskell's FFI functionality. The C side creates the data (allocating memory accordingly), but never modifies it until it is freed, so I thought the following method would be "safe":

  • After the data is created, the C function passes the length of the array and a pointer to its start.
  • On the Haskell side, we create a ForeignPtr, setting up a finalizer which calls a C function that frees the pointer.
  • We build a Vector using that foreign pointer which can be (immutably) used in Haskell code.

However, this approach causes non-deterministic crashes. Small examples tend to work, but once the GC kicks in, I start to get various errors, from segmentation faults to "barf"s in the "evacuation" part of GHC's GC.

What am I doing wrong here? What would be the "right way" of doing something like this?

An Example

I have a C header with the following declarations:

typedef struct CVector {
    const int32_t *pointer;
    size_t length;
} Vector;

void create_c_vector(struct CVector *vector);
void free_buffer(void *buff);

The Haskell code is generated from the following .chs file using c2hs:

import Foreign.C.Types
import Foreign.Concurrent
import Foreign.Marshal.Alloc
import Foreign.Ptr
import Foreign.Storable

import qualified Data.Vector.Storable as V

#include <cvector.h>


data ForeignVector = ForeignVector
  { pointerFV  :: Ptr CInt
  , lengthFV   :: CULong
  }

instance Storable ForeignVector where
  sizeOf _ = {#sizeof CVector #}
  alignment _ = {#alignof CVector #}
  peek p =
    ForeignVector
      <$> {#get CVector->pointer #} p
      <*> {#get CVector->length #} p
  poke p (ForeignVector vecP l) =
    do {#set CVector.pointer #} p (castPtr vecP)
       {#set CVector.length #} p l

peekUnit :: Storable a => Ptr () -> IO a
peekUnit = peek . castPtr

{#fun create_c_vector as ^ { alloca- `ForeignVector' peekUnit*} -> `()' #}
{#fun free_buffer as ^ { `Ptr ()' } -> `()' #}

fromForeign :: ForeignVector -> IO (V.Vector CInt)
fromForeign (ForeignVector p l) =
  V.unsafeFromForeignPtr0
    <$> newForeignPtr p (freeBuffer . castPtr $ p)
    <*> pure (fromIntegral l)

createVector :: IO (V.Vector CInt)
createVector = fromForeign =<< createCVector

One particular test I did yielded "internal error: evacuate: strange closure type 177" after a few thousand calls to createVector.

PS: Here is why I would like to use Foreign.Concurrent.newForeignPtr instead of the more "standard" Foreign.ForeignPtr.newForeignPtr: In some more complicated cases I am anticipating, while freeing the pointer one should also clean up other things which can potentially depend on parameters that are passed from Haskell. Therefore I would like to have a "finalizer with multiple arguments" and pass a partial application as the actual finalizer. This means that I can't use a pointer to a C function as the finalizer. While I've read that one can cook up the FinalizerPtr required for the finalizer from Haskell functions using a "wrapping" mechanism, according to the documentation, function pointers obtained this way need to be explicitly deallocated with freeHaskellFunPtr and I don't want to do bookkeeping for that.
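A minimal sketch of the "finalizer with multiple arguments" idea from this postscript, using only base. The names release, newTagged and demo, the String tag, and the IORef log are hypothetical stand-ins for whatever extra cleanup state would be passed from Haskell; the buffer here comes from mallocBytes rather than from the C library. finalizeForeignPtr is called only to make the finalizer run at a predictable point for the demonstration.

```haskell
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)
import qualified Foreign.Concurrent as FC
import Foreign.ForeignPtr (ForeignPtr, finalizeForeignPtr)
import Foreign.Marshal.Alloc (free, mallocBytes)
import Foreign.Ptr (Ptr)

-- Hypothetical cleanup taking extra arguments besides the pointer itself.
release :: IORef [String] -> String -> Ptr a -> IO ()
release logRef tag p = do
  free p
  modifyIORef' logRef (tag :)

-- The finalizer is an ordinary IO action, so a partial application works
-- and no FunPtr needs to be created or freed.
newTagged :: IORef [String] -> String -> IO (ForeignPtr ())
newTagged logRef tag = do
  p <- mallocBytes 16
  FC.newForeignPtr p (release logRef tag p)

demo :: IO [String]
demo = do
  logRef <- newIORef []
  fp <- newTagged logRef "buffer-1"
  finalizeForeignPtr fp -- run the finalizer now instead of waiting for the GC
  readIORef logRef
```

Running demo in GHCi should return the list of tags whose buffers have been released.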

PPS: Here is a base64-encoded tarball with the complete source code of the example above (including code for an executable that reproduces the aforementioned error):

H4sIAAAAAAAAA+1Ze1PbOhbv3/oUZ0JnSQAb50VmeM1QKNvMwIUpLZ2dbjdRbDnx4liuZAO5vXz3
PUd+hAS4LC2XbmejYYitc3Se0u9Isu8HVqB1KqzxlbAcu27X13kcr796xuZg67Tb5hfb/K95rrec
dqPptFtO85VTb9abzVfQfk4jHmqpTrhCU35Uzrxzv0jz78m/GwaD55wAT8j/xkazTflvdFqL/L9E
uy//Wrk/Z/03Njr4Y9Z/e7H+X6Q9lP/jT2+fbQ781/lvNTbwj/K/4Szy/yLtz/J/KJUIhtG5cBOp
bHekv1MHxWOj1Xoo/41mcz7/HdwIvIIXCeL/ef7H0ktDAZhueybdcDUSSjAWjGOpEshp9r79YRIL
fadbRm6qlIgSZlkwR8x/TxM1P+yYKz3iob0XhtKdJ97Df4aG8UE4NetrysPAD4QHBzzhdj5TCzbg
Gs4ZWwoiN0w9AdvuZcYw2mWMeTgCZn3emX1nAN8glkGUCHV4DrC5CWgU7HfRTYA1CEU0TEZEIdL+
xyMZDZFwg+ZFOKsiV0Bpyn3BBdDB7+LEhx5q/rZEL9KH/Zxn6QYZ0L1hNMa45jzmfZ4pFuICYtjB
R7jjAbXt17s4diiSYpy1m7uFAiAuuFbucGUeFkyxvBCopzrrC8b0FMJart6T5MlUhn1bEVRdrhOK
IQ2q5XrnBtzSCSFj5NzHKEgoxNPEws6uyUW1BtYudE+ATxl3soDYkCtj7NuSn0bgKsET0XN72Syg
2fEvTDCnycct6M+4tQyFvJUbUtGv1pYp2pkoXwnRG6S+L0ox/cycZZhhZ34mtFgTrovqoKITD/fY
9gj+RpIqGAj3EB/Ix8M0MpLoH8+dq9ZqKEnJcW6i4ZtJQs53ni8BM0drM0PmshYXKTu300hzXxxO
eVG1w4p5E4mraTelkCx+k7lehhheQ4yZsECqzbkRmWPZHKMZFqdKkBA5RhPFUPEQLWEsS05uHLp3
jzczLDtw27md7e08vfksYj8bV3+Vdl/9p/P/MQ8i+7sr/mx7pP7X662N4vxXbzjIV281286i/r9E
K+o/pnuu5N/ZEpQUrPaJkqF9IER8Jr7Odx/LiHtFp6nLR4FOHtlJnE10IsZ29+RptZ3pIAwnZ+nY
YAaWyQwITYkuSZEBOl+GXrgM1X/CNUyIr3oNqzCpQR9j0Idmu1OvgUOYZ7BKiTgMXESUYxyPUFQM
X82Z4DYcIYCNKYI5cNWyN9KK9fAStq0Z7qzujU7T5CxRRxFgNRCKMLSyb7g8yCrUJlRgdRX0SF7B
5QODKr8h0QgPoiFg9ZOXVJMiL38Y0jq27Uo2XIuvqcB9Sa8ovZ/JQE0GTqNV0LIR0PcwzTiwj8oV
1vJCc8n3BwYIBXyuO84aNBwHbBvw2flSMHxZoPH/bLsP/2f6bJcPePhDOh7D/412vTz/tZuE/3j8
by7w/yWaya6FUKEDGW0Cpr/B6Az3YRRo8AME2hEi7UCICIYiEsrAE229IObuBR8Ke8LHIQwmMKIO
yCWBYzdbdstGUSRNC7EJoySJ9eb6+jBIRunAduV4Xctw3YxjLOJj5Jm2mUnISgPzZiYq05NIxjrQ
RfcejIMoGOOW8kqqCwJEcc3HMTrh034/gsPDLhihDLFdRHqq8TQdYNeBJOBmgzQIPSvB+pTRzwIS
wsR1orilZapcYVFs9KaBuPdv9w6O39pjz7yZ2/PypHm3y2WofKC4miBNXMdSC8/KynAuD+6pvQAy
wfI8z3jKk5HuYax6xq0exQqrhC6s9AJV8mrl4tNw5FoyTjCYGDbrpAHWJzqSWMOGMYc8zMwLplpy
0/Etj4yU4ZTYGOmS4olYRF5JG3AtzOMalCI84fM0TKyQR8MUJ9AmvOP6QoRhw6k7DIMs3DQxJX4W
mwYBVRzKj0UZv7VJfVpwHg4AWCrRSNJ34vL9bufFM3+bndRPCsxftP7vw/9yPj+Tjkf3/5329PtP
q2Puf7EkLPD/BdqSwcQuzQA4zsHzUw6ebzPwZMwUg1jJf+NUxWk6xqWTUCXQwHHvyfUIcLf793f7
yxqGXA1w7oIrwzA7qSMJuYRKaEuMyphG5EV4kTbrJtPqgtKotFxhcYCzJKstcPThDOodrEk2Y0tL
8IYWG1rG2EkkIBLC05DIbA0CIgDsQw6tKJNuXjIS+ULUfDlh5VLJGp52AncELke4F7gLjsSaqRJ5
xVijkqbSKCJ1/X6fDV0XLD3iCo20JOkpAF1LsPzT7j5MEZ4GoLektE/g3wcEYkOc2GS8K+bMxfji
Ht6brM0amoccjUSHjDMJpJr8qdAFo8eVxwwqVVDShcgCB3RjQmU9C9r73AnWjZCCW3cTMjxAEcTi
IzpplCK+kiX9O6jbXwN5O9xxmtAANu8ZFB6/PjroHXXfvN97/4/e6d6Hd30Q0WWg8HhobjIvMfmk
3F4cC35+e/D7f7mD+XEdj+C/06yX3/+x1NUR/9vIv8D/l2jTjyN4rMfTPX0bmekz99S7jNFuGHco
ePpXqVve1sO3bINLFQHXf9Js9BJYye/8twyNPmtgZ3atv8VuIBu5xdilDLz5W/nqnPyVrLu2lXHf
univmo4VekHqz47jr9oeXf/uj+t4bP1v4Jm/uP/tdOj83647ncX6f4lWrvVKCfiVp61MAwEzixx2
oNWgxV8CAi1S7K2WHTU8yNF3t2qrASuQffksqLWaGetD1aztHGJyDQGKcTJgod1adQt7tgtwgWB1
teCno5nvfw6+4IAgG3Bj/l/OfQHdMYxbM7TSjwK1cDCIUItc+F0Zv308OnpAhjH3ht2wP4UwI5lo
1RzRbhaYtmiLtmh/afsPAHfp2gAuAAA=
Asked by aclow, May 22 '21 23:05



1 Answer

Copied and extended from my earlier comment.

You may have a faulty cast or poke. One thing I make a point of doing, both as a defensive guideline and when debugging, is this:

Explicitly annotate the type of everything that can undermine types. That way, you always know what you’re getting. Even if a poke, castPtr, or unsafeCoerce has my intended type now, that may not be stable under code motion. And even if this doesn’t identify the issue, it can at least help think through it.

For example, I was once writing a null terminator into a byte buffer, and it corrupted adjacent memory by writing beyond the end, because I was using '\NUL', which is not a C char (one byte) but a Haskell Char, stored as 32 bits! The reason was that pokeByteOff is polymorphic in the value it writes: it has type (Storable a) => Ptr b -> Int -> a -> IO (), not … => Ptr a -> …, so the pointer's type does nothing to constrain the size of the write.
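To make the pokeByteOff pitfall concrete, here is a small sketch using only base. The helper names (bytesAfter, terminateAsChar, terminateAsByte, demo) are made up for illustration. The buffer is deliberately 8 bytes long, so that the 4-byte write stays in bounds and can be observed safely.

```haskell
import Data.Word (Word8)
import Foreign.Marshal.Alloc (allocaBytesAligned)
import Foreign.Marshal.Array (peekArray, pokeArray)
import Foreign.Ptr (Ptr)
import Foreign.Storable (pokeByteOff)

-- Fill an 8-byte buffer with 0xFF, run a write against it, return the bytes.
bytesAfter :: (Ptr Word8 -> IO ()) -> IO [Word8]
bytesAfter write =
  allocaBytesAligned 8 4 $ \buf -> do
    pokeArray buf (replicate 8 0xFF)
    write buf
    peekArray 8 buf

-- The value is a Char, so its Storable instance writes 4 bytes,
-- even though the pointer is a Ptr Word8.
terminateAsChar :: Ptr Word8 -> IO ()
terminateAsChar buf = pokeByteOff buf 0 '\NUL'

-- With an explicit annotation, exactly one byte is written.
terminateAsByte :: Ptr Word8 -> IO ()
terminateAsByte buf = pokeByteOff buf 0 (0 :: Word8)

demo :: IO ([Word8], [Word8])
demo = (,) <$> bytesAfter terminateAsChar <*> bytesAfter terminateAsByte
```

The first list returned by demo has four zeroed bytes, the second only one. Had the buffer been a single byte long, the first write would have clobbered three bytes of whatever came after it.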

This turned out to be the case in your code! Quoth @aclow:

The createVector generated by c2hs was equivalent to something like alloca $ \ ptr -> createCVector'_ ptr >> peek ptr, where createCVector'_ :: Ptr () -> IO (), which meant that alloca allocated only enough space to hold a unit. Changing the in-marshaller to alloca' f = alloca $ f . (castPtr :: Ptr ForeignVector -> Ptr ()) seems to solve the issue.
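The fix quoted above can be sketched without c2hs, using only base. Everything named here is a hypothetical stand-in: FV plays the role of ForeignVector, with a hand-written Storable instance that assumes a pointer immediately followed by a size_t, and fillVector plays the role of the C function, which only ever sees an untyped pointer. The sketch relies on the Storable instance for () in recent versions of base, whose size is zero.

```haskell
import Foreign.C.Types (CInt, CSize)
import Foreign.Marshal.Alloc (alloca)
import Foreign.Ptr (Ptr, castPtr, nullPtr)
import Foreign.Storable (Storable (..))

-- Hypothetical stand-in for ForeignVector: a pointer followed by a length.
data FV = FV (Ptr CInt) CSize
  deriving (Eq, Show)

ptrSize :: Int
ptrSize = sizeOf (nullPtr :: Ptr ())

instance Storable FV where
  sizeOf _ = ptrSize + sizeOf (0 :: CSize)
  alignment _ = alignment (nullPtr :: Ptr ())
  peek p = FV <$> peekByteOff p 0 <*> peekByteOff p ptrSize
  poke p (FV q n) = pokeByteOff p 0 q >> pokeByteOff p ptrSize n

-- Stand-in for the C function: it writes a whole struct through a Ptr ().
fillVector :: Ptr () -> IO ()
fillVector p = poke (castPtr p) (FV nullPtr 3)

-- The annotation pins alloca to FV, so it reserves sizeOf FV bytes.
-- Without it, alloca would be used at type Ptr () and reserve sizeOf () bytes.
allocaFV :: (Ptr () -> IO b) -> IO b
allocaFV f = alloca $ f . (castPtr :: Ptr FV -> Ptr ())

createFixed :: IO FV
createFixed = allocaFV $ \p -> fillVector p >> peek (castPtr p)

-- Returns (size of (), size of FV, the struct read back).
demo :: IO (Int, Int, FV)
demo = do
  v <- createFixed
  pure (sizeOf (), sizeOf v, v)
```

The first two components of demo show the gap between what the unannotated alloca would have reserved and what the callee actually writes.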

Things that turned out not to be the case, but could’ve been:

I’ve encountered a similar crash when a closure was getting corrupted by somebody (read: me) writing beyond an array. If you’re doing any writes without bounds checking, it may be helpful to replace them with checked versions to see if you can get an exception rather than heap corruption. In a way this is what was happening here, except that the write was to the alloca-allocated region, not the array.
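One way to get "an exception rather than heap corruption" is a thin wrapper around pokeElemOff. This is only a sketch: checkedPoke and demo are hypothetical names, and the length has to be passed in by the caller because a raw Ptr carries no size information.

```haskell
import Control.Exception (ArrayException (IndexOutOfBounds), throwIO, try)
import Data.Int (Int32)
import Foreign.Marshal.Array (allocaArray, peekArray, pokeArray)
import Foreign.Ptr (Ptr)
import Foreign.Storable (Storable, pokeElemOff)

-- Write element i of an array of the given length, or throw if out of range.
checkedPoke :: Storable a => Int -> Ptr a -> Int -> a -> IO ()
checkedPoke len p i x
  | i < 0 || i >= len =
      throwIO (IndexOutOfBounds ("index " ++ show i ++ ", length " ++ show len))
  | otherwise = pokeElemOff p i x

demo :: IO ([Int32], Either ArrayException ())
demo = allocaArray 4 $ \p -> do
  pokeArray p [0, 0, 0, 0]
  checkedPoke 4 p 1 7              -- in range: performs the write
  r <- try (checkedPoke 4 p 4 9)   -- one past the end: throws, writes nothing
  xs <- peekArray 4 p
  pure (xs, r)
```

Swapping such a wrapper in temporarily while debugging turns a silent overrun into an exception at the offending call.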

Alternatively, consider lifetime issues: the ForeignPtr could be getting dropped, and the buffer freed, earlier than you expect, giving you a use-after-free. In a particularly frustrating case, I’ve had to use touchForeignPtr to keep a ForeignPtr alive for that reason.
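A sketch of the two usual ways to keep a ForeignPtr alive while its raw pointer is in use, using only base. The names newBuffer, sumScoped, sumTouched and demo are hypothetical, and the buffer is allocated with mallocArray here rather than by a C library.

```haskell
import Data.Int (Int32)
import qualified Foreign.Concurrent as FC
import Foreign.ForeignPtr (ForeignPtr, touchForeignPtr, withForeignPtr)
import Foreign.ForeignPtr.Unsafe (unsafeForeignPtrToPtr)
import Foreign.Marshal.Alloc (free)
import Foreign.Marshal.Array (mallocArray, peekArray, pokeArray)

-- Allocate a buffer and attach a Haskell finalizer that frees it.
newBuffer :: [Int32] -> IO (ForeignPtr Int32)
newBuffer xs = do
  p <- mallocArray (length xs)
  pokeArray p xs
  FC.newForeignPtr p (free p)

-- Preferred: withForeignPtr keeps the ForeignPtr alive for the whole action.
sumScoped :: Int -> ForeignPtr Int32 -> IO Int32
sumScoped n fp = withForeignPtr fp $ \p -> sum <$> peekArray n p

-- Manual: after extracting the raw pointer, nothing keeps fp alive
-- unless it is touched after the last use of p.
sumTouched :: Int -> ForeignPtr Int32 -> IO Int32
sumTouched n fp = do
  let p = unsafeForeignPtrToPtr fp
  s <- sum <$> peekArray n p
  touchForeignPtr fp
  pure s

demo :: IO (Int32, Int32)
demo = do
  fp <- newBuffer [1, 2, 3, 4]
  (,) <$> sumScoped 4 fp <*> sumTouched 4 fp
```

Both functions read the same data. The difference is only in who is responsible for keeping the buffer alive until the read has finished.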

Answered by Jon Purdy, Oct 18 '22 22:10
