I want to implement a function in C++ via the Haskell FFI that should have the (final) type String -> String. Is it possible, say, to re-implement the following function in C++ with the exact same signature?
import Data.Char
toUppers :: String -> String
toUppers s = map toUpper s
In particular, I want to avoid having IO in the return type, because introducing impurity (by which I mean the IO monad) for this simple task is logically unnecessary. All the examples involving a C string that I have seen so far return either an IO something or a Ptr, which cannot be converted back to a pure String.
The reason I want to do this is that I have the impression that marshalling is messy with the FFI. If I can solve the simplest case above (beyond primitive types such as int), then I can do whatever data parsing I want on the C++ side, which should be easy.
The cost of parsing is negligible compared to the computation that I want to do between the marshalling to/from strings.
Thanks in advance.
You need to involve IO at least at some point, to allocate buffers for the C strings. The straightforward solution here would probably be:
import Foreign
import Foreign.C
import System.IO.Unsafe as Unsafe

foreign import ccall "touppers" c_touppers :: CString -> IO ()

toUppers :: String -> String
toUppers s =
  Unsafe.unsafePerformIO $
    withCString s $ \cs ->
      c_touppers cs >> peekCString cs
Here we use withCString to marshal the Haskell string into a temporary buffer, change it to upper case in place, and finally unmarshal the (changed!) buffer contents into a new Haskell string.
Another solution would be to delegate the messing with IO to the bytestring library. That could be a good idea anyway if you are interested in performance. The solution would look roughly as follows:
import Data.ByteString.Internal
import Foreign
import Foreign.C.Types (CInt)

-- The length is declared as CInt so that it matches the C++ parameter `int l`.
foreign import ccall "touppers2"
  c_touppers2 :: CInt -> Ptr Word8 -> Ptr Word8 -> IO ()

toUppers2 :: ByteString -> ByteString
toUppers2 s =
  unsafeCreate l $ \p2 ->
    withForeignPtr fp $ \p1 ->
      c_touppers2 (fromIntegral l) (p1 `plusPtr` o) p2
  where (fp, o, l) = toForeignPtr s
This is a bit more elegant, as we now don't actually have to do any marshalling, just convert pointers. On the other hand, the C++ side changes in two respects - we have to handle possibly non-null-terminated strings (need to pass the length) and now have to write to a different buffer, as the input is not a copy anymore.
For reference, here are two quick-and-dirty C++ functions that fit the above imports:
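#include <ctype.h>

// In-place variant: expects a null-terminated, writable buffer.
extern "C" void touppers(char *s) {
    // Cast to unsigned char first: passing a negative char to toupper is undefined.
    for (; *s; s++) *s = toupper((unsigned char)*s);
}

// Length-based variant: reads l bytes from s and writes l bytes to t.
extern "C" void touppers2(int l, char *s, char *t) {
    for (int i = 0; i < l; i++) t[i] = toupper((unsigned char)s[i]);
}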
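To check the two contracts without going through Haskell at all, a small C++ harness along the following lines may help. It is only a sketch: the helpers upper_in_place, upper_with_length and self_check are hypothetical names introduced here, and the two exported functions are repeated (with an unsigned char cast for toupper) so that the block compiles on its own. The first helper roughly mimics what withCString followed by peekCString does, the second what the unsafeCreate version does.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// The two exported functions, repeated so this block is self-contained.
extern "C" void touppers(char *s) {
    for (; *s; s++)
        *s = static_cast<char>(std::toupper(static_cast<unsigned char>(*s)));
}

extern "C" void touppers2(int l, char *s, char *t) {
    for (int i = 0; i < l; i++)
        t[i] = static_cast<char>(std::toupper(static_cast<unsigned char>(s[i])));
}

// Hypothetical helper: copy into a null-terminated scratch buffer, modify it
// in place, then read it back up to the first NUL byte.
std::string upper_in_place(const std::string &in) {
    std::vector<char> buf(in.begin(), in.end());
    buf.push_back('\0');
    touppers(buf.data());
    return std::string(buf.data());
}

// Hypothetical helper: pass an explicit length and a separate output buffer,
// so no terminator is needed and the input bytes are left untouched.
std::string upper_with_length(const std::string &in) {
    std::string src = in;              // touppers2 takes a non-const pointer
    std::string out(in.size(), '\0');
    touppers2(static_cast<int>(src.size()), &src[0], &out[0]);
    return out;
}

// Hypothetical self-check covering both contracts.
void self_check() {
    assert(upper_in_place("hello") == "HELLO");
    assert(upper_with_length("hello") == "HELLO");

    // An embedded NUL truncates the C-string route but not the length route.
    const std::string with_nul("ab\0cd", 5);
    assert(upper_in_place(with_nul) == "AB");
    assert(upper_with_length(with_nul) == std::string("AB\0CD", 5));
}
```

Calling self_check() from a main function exercises both variants. The embedded-NUL case shows why the ByteString version has to pass the length explicitly: the null-terminated route stops at the first zero byte.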