I've read a lot of articles about distributed Haskell. Much work has been done but seems to be in the area of distributing computations. I saw the remote package which seems to implement Erlang-style messaging passing but it is 0.1 and early stage.
I'd like to implement a system where there are many separate processes that provide distinct services, and are tied together by several main processes. This seems to be a natural fit for Erlang, but not so for Haskell. But I like Haskell's type safety.
Has there been any recent adoption of Erlang-style process management in Haskell?
If you want to learn more about the remote
package, a.k.a CloudHaskell, see the paper as well as Jeff Epstein's thesis. It aims to provide precisely the actor abstraction you want, but as you say it is in the early stages. There is active discussion regarding improvements on the parallel-haskell mailing list, so if you have specific needs that remote
doesn't provide, we'd be happy for you to jump in and help us decide its future directions.
More mature but lower-level than remote
is the haskell-mpi
package. If you stick to the Simple
interface, messages can be sent containing arbitrary Serialize
instances, but the abstraction is still way lower than remote
.
There are some experimental systems, such as described in Implementing a High-level Distributed-Memory Parallel Haskell in Haskell (Patrick Maier and Phil Trinder, IFL 2011, can't find a pdf online). It blends a monad-par
approach of deterministic dataflow parallelism with a limited ability to make the I-structures serializable over the network. These sorts of abstraction have promise for doing distributed computation, but since the focus is on computing purely-functional values rather than providing Erlang-style processes, they probably wouldn't be a good fit for your application.
Also, for completeness, I should point out the Haskell wiki page on cloud and HPC Haskell, which covers what I describe here, as well as the subsection on distributed Haskell, which seems in need of a refresh.
I frequently get the feeling that IPC and actors are an oversold feature. There are plenty of attractive messaging systems out there that have Haskell bindings e.g. MessagePack, 0MQ or Thrift. IMHO the only thing you have to add is proper addressing of processes and decide who/what is managing this addressing capability.
By the way: a number of coders adopt e.g. 0MQ into their Erlang environments, simply because it offers the possibility to structure messaging via message brokers rather then relying on pure process to process messaging in super scale.
In a "massively multicore world" I personally assume that shared memory approaches will eventually be outperforming messaging. Someone can then always come and argue with asynchrony of course. But already when you write that you want to "tie together" your processes by "several main processes" you in fact speak about synchronization. Also, you can of course challenge whether a single function, process or thread is the right level of parallelization.
In short: I would probably see whether MessagePack or 0MQ could fit my needs in Haskell and care for the rest in my code.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With