Is there a way to make sure that objects of some specific type (basically ForeignPtr) are garbage-collected very aggressively? I have a simple type like this:
data SomePtr = SomePtr { ptr :: ForeignPtr CUChar, size :: CSize }
alloc :: CSize -> IO SomePtr
free :: SomePtr -> IO ()
free = finalizeForeignPtr . ptr
I think the standard theory is that reference counting (which is how I would do this myself in, say, C++) is slower than the GC that GHC uses, which is why GHC doesn't use it. But the problem for me is that when working with externally allocated objects, like GPU memory, the promise that the memory will eventually be freed isn't enough. The memory is quite scarce and, as far as I can tell, the ForeignPtr finalizer is not actually called when I would like it to be. I would like the memory to be freed as soon as possible, so I end up calling finalizeForeignPtr myself.
Is there some way to tell GHC to be really aggressive in destroying some specific types of objects?
Or am I going about this the wrong way?
Here is example code to illustrate what I mean:
{-# LANGUAGE RecordWildCards #-}
import Foreign.ForeignPtr
import Foreign.Ptr
import Foreign.Marshal.Alloc
import Foreign.Storable
import Control.Monad
import Foreign.C.Types
import Text.Printf
data FPtr = FPtr { fptr :: ForeignPtr CUChar, size :: CSize }
foreign import ccall "falloc" falloc :: CSize -> Ptr (Ptr CUChar) -> IO CInt
foreign import ccall "&ffree" ffree :: FunPtr (Ptr CUChar -> IO ())
newFPtr :: CSize -> IO FPtr
newFPtr size =
  alloca $ \ptr -> do
    result <- falloc size ptr
    printf "Result: %d\n" (fromIntegral result :: Int)
    fptr <- newForeignPtr ffree =<< peek ptr
    return FPtr{..}
freeFPtr :: FPtr -> IO ()
freeFPtr = finalizeForeignPtr . fptr
main :: IO ()
main = forM_ [1 .. 5] $ const work
  where
    work = do x <- newFPtr 1024
              -- freeFPtr x
              return ()
#include <cstdio>
using namespace std;
extern "C" {
    int falloc(size_t size, unsigned char** ptr);
    void ffree(unsigned char* ptr);
}
int some_counter = 0;
int falloc(size_t size, unsigned char** ptr) {
    some_counter++;
    // Note: *ptr is printed before it is assigned, so the third value is
    // whatever happened to be in the caller's buffer.
    printf("falloc(%zu, %#lx, %#lx); %d\n",
           size, (unsigned long)ptr, (unsigned long)*ptr, some_counter);
    *ptr = new unsigned char[size];
    return 0;
}
void ffree(unsigned char* ptr) {
    printf("ffree(%#lx)\n", (unsigned long)ptr);
    delete[] ptr;
}
Output with the freeFPtr line commented out (all finalizers run at the very end):
falloc(1024, 0x100606010, 0); 1
Result: 0
falloc(1024, 0x100606028, 0); 2
Result: 0
falloc(1024, 0x100606040, 0); 3
Result: 0
falloc(1024, 0x100606058, 0); 4
Result: 0
falloc(1024, 0x100606070, 0); 5
Result: 0
ffree(0x101026400)
ffree(0x101027800)
ffree(0x101027c00)
ffree(0x101028000)
ffree(0x101028400)
Output with the freeFPtr line enabled (each buffer is freed right after use):
falloc(1024, 0x100606010, 0); 1
Result: 0
ffree(0x101026400)
falloc(1024, 0x100606028, 0); 2
Result: 0
ffree(0x100802200)
falloc(1024, 0x100606040, 0); 3
Result: 0
ffree(0x100802200)
falloc(1024, 0x100606058, 0); 4
Result: 0
ffree(0x100802200)
falloc(1024, 0x100606070, 0); 5
Result: 0
ffree(0x100802200)
If we want to be independent of GHC's garbage collection, we need to introduce some kind of determinism, and therefore explicit deallocation. Allocation is usually something of type IO a, and the corresponding deallocation is of type a -> IO () (just as in your example).
Now, what if we had the following functions?
autoAllocate :: IO a -> (a -> IO ()) -> Alloc a
runAlloc :: Alloc a -> IO a
autoAllocate should take both an allocation and a deallocation and give you the result of the allocation in the new (hypothetical) Alloc monad, and runAlloc runs all actions and deallocations. Your example wouldn't change much, except for the end:
allocateFPtr size = autoAllocate (newFPtr size) freeFPtr
main :: IO ()
main = forM_ [1 .. 5] $ runAlloc . const work
  where
    work = do x <- allocateFPtr 1024
              return ()
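To make the idea concrete, here is a minimal sketch of how such an Alloc monad could be written with base alone. Everything in it (the Alloc newtype, liftAlloc, allocDemo, and the string "resources" that only write to a log) is an assumption made up for illustration, not how any library implements it. It also stops running cleanups if one of them throws, which a real implementation would want to handle.

```haskell
import Control.Exception (finally, mask_)
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)

-- Hypothetical Alloc monad: an IO action with access to a list of
-- pending cleanup actions (most recently registered first).
newtype Alloc a = Alloc { unAlloc :: IORef [IO ()] -> IO a }

instance Functor Alloc where
  fmap f (Alloc g) = Alloc (fmap f . g)

instance Applicative Alloc where
  pure x = Alloc (\_ -> pure x)
  Alloc f <*> Alloc g = Alloc (\ref -> f ref <*> g ref)

instance Monad Alloc where
  Alloc g >>= k = Alloc (\ref -> g ref >>= \x -> unAlloc (k x) ref)

-- Run the allocation and register its deallocation. mask_ keeps an
-- asynchronous exception from landing between the two steps.
autoAllocate :: IO a -> (a -> IO ()) -> Alloc a
autoAllocate acquire dispose = Alloc $ \ref -> mask_ $ do
  x <- acquire
  modifyIORef' ref (dispose x :)
  return x

-- Lift a plain IO action into Alloc.
liftAlloc :: IO a -> Alloc a
liftAlloc io = Alloc (const io)

-- Run the block, then run every registered deallocation, newest first,
-- even if the block throws.
runAlloc :: Alloc a -> IO a
runAlloc (Alloc g) = do
  ref <- newIORef []
  g ref `finally` (readIORef ref >>= sequence_)

-- Demo with fake resources that only append to a log.
allocDemo :: IO [String]
allocDemo = do
  logRef <- newIORef []
  let note s = modifyIORef' logRef (++ [s])
      resource name = autoAllocate (note ("alloc " ++ name) >> return name)
                                   (\n -> note ("free " ++ n))
  runAlloc $ do
    _ <- resource "a"
    _ <- resource "b"
    liftAlloc (note "working")
  readIORef logRef
```

Calling allocDemo from main or GHCi should give the log "alloc a", "alloc b", "working", "free b", "free a": deallocation happens at the end of runAlloc, independent of the garbage collector.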
Now, autoAllocate, runAlloc and Alloc already exist in resourcet as allocate, runResourceT and ResourceT, and the actual code would look like this:
allocateFPtr size = fmap snd $ allocate (newFPtr size) freeFPtr
main :: IO ()
main = forM_ [1 .. 5] $ runResourceT . const work
  where
    work = do x <- allocateFPtr 1024
              return ()
Result:
falloc(1024, 0x1e04014, 0); 1
Result: 0
ffree(0x6abc60)
falloc(1024, 0x1e04020, 0); 2
Result: 0
ffree(0x6abc60)
falloc(1024, 0x1e0402c, 0); 3
Result: 0
ffree(0x6abc60)
falloc(1024, 0x1e04038, 0); 4
Result: 0
ffree(0x6abc60)
falloc(1024, 0x1e04044, 0); 5
Result: 0
ffree(0x6abc60)
But you said that some of your pointers should actually live longer. That's also not a problem, since allocate actually returns m (ReleaseKey, a), and the ReleaseKey can be used either to release the memory earlier than runResourceT would (using release) or to remove the automatic release mechanism (using unprotect, which returns the deallocation action).
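A small sketch of both uses of the ReleaseKey, assuming the resourcet package is installed. The acquire and dispose helpers and the log are hypothetical stand-ins for a real resource such as your FPtr; only allocate, release, unprotect and runResourceT come from resourcet.

```haskell
import Control.Monad.IO.Class (liftIO)
import Control.Monad.Trans.Resource (allocate, release, runResourceT, unprotect)
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)

-- Hypothetical stand-in for a scarce resource: acquiring and freeing
-- it only appends an entry to a log.
acquire :: IORef [String] -> String -> IO String
acquire logRef name = do
  modifyIORef' logRef (++ ["alloc " ++ name])
  return name

dispose :: IORef [String] -> String -> IO ()
dispose logRef name = modifyIORef' logRef (++ ["free " ++ name])

releaseKeyDemo :: IO [String]
releaseKeyDemo = do
  logRef <- newIORef []
  freeB <- runResourceT $ do
    (keyA, _) <- allocate (acquire logRef "a") (dispose logRef)
    (keyB, _) <- allocate (acquire logRef "b") (dispose logRef)
    _         <- allocate (acquire logRef "c") (dispose logRef)
    release keyA                -- free "a" right now
    liftIO (modifyIORef' logRef (++ ["working"]))
    unprotect keyB              -- "b" is no longer freed by runResourceT
  modifyIORef' logRef (++ ["after runResourceT"])
  -- unprotect handed back the release action (Nothing if the resource
  -- had already been released); from here on freeing "b" is our job.
  sequence_ freeB
  readIORef logRef
```

Here "a" is freed early, "c" is freed when runResourceT finishes, and "b" survives the block until the returned action is run by hand.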
So, all in all, I guess your scenario could be handled well with ResourceT. After all, its synopsis is "Deterministic allocation and freeing of scarce resources".
You are thinking about this in the wrong way for Haskell.
In C++, RAII is used to ensure that resources are released -- promptly. Since C++ lacks a finally construct, there is no other way to ensure that resources are released in the presence of exceptions. Also, since C++ lacks a garbage collector, reference counting and RAII are the order of the day.
In Haskell (and other garbage collected languages), however, the situation is different. One does not rely on finalizers running promptly. In fact, one should not rely on finalizers running at all, since they could be delayed for an arbitrary amount of time if the amount of available memory is high enough -- and might never be executed at all if the program terminates before the finalizer (or even the garbage collector) has a chance to run since the object became unreachable.
Instead, one uses explicit resource deallocation. This seems bad, but isn't. For reasons of memory safety, one should put the object in a "zombie" state, so that any further attempts to use the object throw exceptions (since they are bugs).
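One possible shape for such a "zombie" state, as a rough sketch: keep the resource in an IORef (Maybe r) and clear it on release. The Guarded type and its three functions are hypothetical names invented here. The sketch does not protect against one thread releasing the resource while another is in the middle of using it.

```haskell
import Control.Exception (ErrorCall (..), throwIO, try)
import Data.IORef (IORef, atomicModifyIORef', newIORef, readIORef)

-- Hypothetical wrapper: Just r while the resource is live,
-- Nothing once it has been released (the "zombie" state).
newtype Guarded r = Guarded (IORef (Maybe r))

newGuarded :: r -> IO (Guarded r)
newGuarded r = Guarded <$> newIORef (Just r)

-- Release at most once; later calls are no-ops.
releaseGuarded :: (r -> IO ()) -> Guarded r -> IO ()
releaseGuarded dispose (Guarded ref) = do
  old <- atomicModifyIORef' ref (\m -> (Nothing, m))
  mapM_ dispose old

-- Use the resource, or throw if it has already been released.
withGuarded :: Guarded r -> (r -> IO a) -> IO a
withGuarded (Guarded ref) use = do
  m <- readIORef ref
  case m of
    Just r  -> use r
    Nothing -> throwIO (ErrorCall "resource used after release")

-- Demo: one use before release succeeds, one use after release throws.
zombieDemo :: IO (Int, Bool)
zombieDemo = do
  g <- newGuarded (42 :: Int)
  v <- withGuarded g return
  releaseGuarded (\_ -> return ()) g
  r <- try (withGuarded g return) :: IO (Either ErrorCall Int)
  return (v, either (const True) (const False) r)
```

zombieDemo returns (42, True): the first access works, and the access after release is reported as an exception instead of touching freed memory.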
Alternatively, if the resources are such that they are automatically deallocated on process exit, one can rely on finalizers -- but note that they may not be called promptly (as you mentioned), and so an explicit performGC may be needed if the resource is exhausted. I suspect that not knowing when the life of truly scarce resources is over (at least conservatively) is probably a code smell even in C++ -- it means that there is no upper bound on the amount of the resource consumed.
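If you go the finalizer route, the "performGC when exhausted" part could look roughly like this. It is only a sketch: allocWithGC and the fake allocator in retryDemo are hypothetical, and it assumes your allocation function reports exhaustion by returning Nothing. Finalizers are not guaranteed to have finished by the time performGC returns, so the retry can still fail.

```haskell
import Data.IORef (IORef, atomicModifyIORef', newIORef, readIORef)
import System.Mem (performGC)

-- Try an allocation that signals exhaustion with Nothing. On failure,
-- force a major GC so unreachable objects can be finalized, then
-- retry exactly once.
allocWithGC :: IO (Maybe a) -> IO (Maybe a)
allocWithGC tryAlloc = do
  first <- tryAlloc
  case first of
    Just x  -> return (Just x)
    Nothing -> performGC >> tryAlloc

-- Demo with a fake allocator that fails on its first call only.
retryDemo :: IO (Maybe Int, Int)
retryDemo = do
  attempts <- newIORef (0 :: Int)
  let fakeAlloc = do
        n <- atomicModifyIORef' attempts (\k -> (k + 1, k + 1))
        return (if n >= 2 then Just 1024 else Nothing)
  r <- allocWithGC fakeAlloc
  n <- readIORef attempts
  return (r, n)
```

retryDemo returns (Just 1024, 2): the first attempt fails, a collection is forced, and the second attempt succeeds.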
In the very limited case where you are concerned about just freeing some memory that can live on the Haskell heap, there is a special edge case available to you.
mallocForeignPtr allocates the memory as a pinned mutable byte array on the Haskell heap, so when the ForeignPtr (and the mutable byte array) gets GC'd, the memory is automatically reclaimed with no finalizer invocation.
This is considerably cheaper than adding a manual hook to call some free corresponding to a system malloc, but it only helps in the limited circumstances where you can live with the limitations.
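A short sketch of that edge case, using mallocForeignPtrBytes (the byte-count variant from the same module). The newBuffer and fillAndSum helpers are hypothetical names for illustration.

```haskell
import Data.Word (Word8)
import Foreign.ForeignPtr (ForeignPtr, mallocForeignPtrBytes, withForeignPtr)
import Foreign.Storable (peekByteOff, pokeByteOff)

-- Allocate n bytes of pinned memory on the Haskell heap. No finalizer
-- is attached; the GC reclaims the buffer like any other heap object.
newBuffer :: Int -> IO (ForeignPtr Word8)
newBuffer = mallocForeignPtrBytes

-- Write 0 .. n-1 into the buffer through a raw pointer, read the bytes
-- back, and return their sum.
fillAndSum :: Int -> IO Int
fillAndSum n = do
  buf <- newBuffer n
  withForeignPtr buf $ \p -> do
    mapM_ (\i -> pokeByteOff p i (fromIntegral i :: Word8)) [0 .. n - 1]
    bytes <- mapM (\i -> peekByteOff p i :: IO Word8) [0 .. n - 1]
    return (sum (map fromIntegral bytes))
```

fillAndSum 10 returns 45. The pointer is only valid inside withForeignPtr, which is what keeps the buffer alive while foreign code might be looking at it.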
However, if you are relying on freeing another resource (e.g. through a file handle object, or memory or resource IDs out on the GPU or something else) you're still hosed.
In general, don't rely on the GC to free up valuable external resources for you, except as a sort of "apology" pass for leaked things during, say, exceptions or the like. Your usual control flow should still free up the external resources you use.
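As a sketch of that division of labour: bracket frees the resource on the normal path (and on exceptions), while the finalizer stays attached only as a backstop for handles that escape. The malloc'd buffer here is just a stand-in for a real external resource, and acquireBuffer, withBuffer and roundTrip are hypothetical names.

```haskell
import Control.Exception (bracket)
import Data.Word (Word8)
import Foreign.ForeignPtr
  (ForeignPtr, finalizeForeignPtr, newForeignPtr, withForeignPtr)
import Foreign.Marshal.Alloc (finalizerFree, mallocBytes)
import Foreign.Storable (peek, poke)

-- A malloc'd buffer with free() attached as a finalizer. The finalizer
-- is only the safety net for buffers nobody released explicitly.
acquireBuffer :: Int -> IO (ForeignPtr Word8)
acquireBuffer n = mallocBytes n >>= newForeignPtr finalizerFree

-- Normal control flow: finalizeForeignPtr runs the finalizer as soon
-- as the body finishes or throws, so the GC has nothing left to do.
withBuffer :: Int -> (ForeignPtr Word8 -> IO a) -> IO a
withBuffer n = bracket (acquireBuffer n) finalizeForeignPtr

-- Demo: write one byte into a scoped buffer and read it back.
roundTrip :: Word8 -> IO Word8
roundTrip x = withBuffer 16 $ \buf ->
  withForeignPtr buf $ \p -> poke p x >> peek p
```

The ForeignPtr must not be used after withBuffer returns; combining this with a "zombie" check on the handle would turn such a bug into an exception.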