I am considering the general architecture of my next project. For the back-end, Haskell looks like a very good fit, but not for the front-end, where Python would be better and likely easier to code. The heavy computations would be done in Haskell, and the result displayed in a GUI built with Python.
So, I need to pick the right conduit and the right format to communicate between these two processes.
The message sent from the Python process to the Haskell process would be quite simple, like a document with a few but diverse values. (JSON could be used for that, I suppose.)
But the message from the Haskell process to the Python process would be much heavier, with big (float) arrays. That's where I need to be more careful: whatever libs I use will need to have a fast implementation in Python and to be reasonably stable in Haskell.
So, what are the options?
I'm using google's protocol buffers over zeromq at our company. It's quite happily shuffling data between python, C++ and C# code and I've had a play with haskell successfully too.
Essentially you could split this up into two concerns, serialization and transport.
As ehird mentions in their answer, there are a multitude of options for serialization. Having used protocol buffers a lot, that's what I'd recommend, but I've heard good things about thrift, and there's also msgpack (msgpack.org), which seems pretty solid.
Transport-wise, I recommend zeromq hands down; it's awesome! It supports a wide variety of messaging patterns and it's screaming fast. Here's a little example of zmq Sources and Sinks for the conduit library (I haven't released it yet): https://github.com/boothead/zeromq-conduit
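To make the serialization/transport split concrete, here is a minimal sketch of a transport layer that only ever sees opaque bytes. It is not zeromq: it uses a plain stdlib socket with a 4-byte length prefix as a stand-in, and the function names (send_frame, recv_frame, recv_exact) and the framing itself are my own assumptions for illustration. Whatever serializer you pick (protocol buffers, msgpack, ...) would produce the payload bytes.

```python
import socket
import struct

# Assumed framing (hypothetical, not zeromq's wire format):
# a 4-byte big-endian length, followed by that many payload bytes.
LENGTH = struct.Struct(">I")

def send_frame(sock, payload):
    """Send one already-serialized message; the transport never inspects it."""
    sock.sendall(LENGTH.pack(len(payload)) + payload)

def recv_exact(sock, n):
    """Read exactly n bytes, since recv() may return fewer than requested."""
    chunks = []
    while n > 0:
        chunk = sock.recv(n)
        if not chunk:
            raise ConnectionError("peer closed the connection mid-frame")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)

def recv_frame(sock):
    """Receive one message and return its payload bytes, still serialized."""
    (length,) = LENGTH.unpack(recv_exact(sock, LENGTH.size))
    return recv_exact(sock, length)
```

Because the two concerns don't know about each other, you can swap this stand-in for zeromq later without touching the serialization code, and vice versa.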
I would consider using the cereal or blaze-builder packages from Haskell to define your own binary serialisation format, and then writing code to manually unpack it in Python (e.g. using struct). This is likely to be a pain if you have a lot of structures to transfer, but if there's only one or two, then this is likely to be more compact and simpler than finding a binary serialisation format that's well-supported in both languages.
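As a rough sketch of what the Python side could look like with struct: the layout below (a 4-byte big-endian count followed by that many big-endian 64-bit doubles) is a made-up format for illustration, and the names pack_floats and unpack_floats are hypothetical. pack_floats only stands in for the Haskell writer so the example can be run on its own.

```python
import struct

# Hypothetical wire format (an assumption, not taken from any library):
#   4 bytes    big-endian unsigned int, the number of values n
#   8*n bytes  n big-endian IEEE-754 doubles
HEADER = struct.Struct(">I")

def pack_floats(values):
    """Stand-in for the Haskell writer, used here only to produce sample bytes."""
    return HEADER.pack(len(values)) + struct.pack(">%dd" % len(values), *values)

def unpack_floats(payload):
    """Decode one message in the format above into a list of Python floats."""
    (count,) = HEADER.unpack_from(payload, 0)
    expected = HEADER.size + 8 * count
    if len(payload) != expected:
        raise ValueError("expected %d bytes, got %d" % (expected, len(payload)))
    # One unpack call for the whole array avoids a Python-level loop.
    return list(struct.unpack_from(">%dd" % count, payload, HEADER.size))
```

Fixing the byte order and the width of every field up front is the important part; both sides have to agree on them exactly.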
cereal handles both serialisation and deserialisation, but blaze-builder only does serialisation; on the other hand, I think blaze-builder is faster. cereal's primary purpose is serialising something in a format you're not particularly picky about so you can read it back later with Haskell. That means it uses a type class extensively, so you have to be careful about the standard instances, which do undesirable things like serialising arbitrary bignum Integers rather than fixed-size integers. blaze-builder is more about custom formats. Still, it's pretty easy to use cereal with a custom format, and if you want to deserialise the structures from Haskell too, it's the obvious choice.
A quick glance at Hackage shows a well-maintained BSON package; that might be a good option if your structures are complex, but otherwise might be overkill.
I think using JSON for the Python→Haskell transport is probably the best idea; while you lose the nicety of having the same serialisation format being used both ways, JSON is very standard, and well-supported in Haskell by aeson. If you choose BSON for the Haskell→Python route, that could work too.
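For the Python→Haskell direction, the Python side could be as small as this sketch. The field names in the usage note are invented for illustration, and the one-request-per-line convention is just an assumption that makes the message easy to delimit on the Haskell side; decode_request is only there to check the round trip.

```python
import json

def encode_request(params):
    """Encode one request as a single line of UTF-8 JSON."""
    # json.dumps never emits raw newlines, so "\n" safely marks the end.
    return json.dumps(params, sort_keys=True).encode("utf-8") + b"\n"

def decode_request(line):
    """Inverse of encode_request, handy for testing from Python."""
    return json.loads(line.decode("utf-8"))
```

A request might then be built with something like encode_request({"method": "simulate", "steps": 1000, "tolerance": 1e-6}), where all three keys are hypothetical.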
Other options I can think of: