I have heard that in Haskell, creating a multi-threaded application is as easy as taking a standard Haskell application and compiling it with the -threaded flag. Other accounts, however, describe using a par command within the actual source code.
What is the state of Haskell multi-threading? How easy is it to introduce into programs? Is there a good multi-threading tutorial that goes over these different commands and their uses?
For explicit concurrency and/or parallelism, Haskell implementations have a light-weight thread system that schedules logical threads on the available operating system threads. These light and cheap threads can be created with forkIO.
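As a rough illustration of the above, here is a minimal sketch that starts a few lightweight threads with forkIO and collects their results through MVars. The helper name forkWithResult is made up for this example and is not a library function.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, takeMVar)

-- Hypothetical helper: run an action on a new lightweight thread and
-- return an MVar that will eventually hold its result.
forkWithResult :: IO a -> IO (MVar a)
forkWithResult action = do
  box <- newEmptyMVar
  _ <- forkIO (action >>= putMVar box)
  return box

main :: IO ()
main = do
  -- One lightweight thread per input; ($!) forces each sum inside its thread.
  boxes <- mapM (\n -> forkWithResult (return $! sum [1 .. n]))
                [1000, 2000, 3000 :: Int]
  -- takeMVar blocks until the corresponding thread has delivered its result.
  results <- mapM takeMVar boxes
  print results
```

Note that this sketch does not handle exceptions: if the forked action throws, the MVar is never filled and the waiting thread will not get a result.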
Multithreading isn't hard. Properly using synchronization primitives, though, is really, really, hard. You probably aren't qualified to use even a single lock properly. Locks and other synchronization primitives are systems level constructs.
Multi-threaded programming is probably the most difficult solution to concurrency. It basically is quite a low level abstraction of what the machine actually does. There's a number of approaches, such as the actor model or (software) transactional memory, that are much easier.
Without -threaded, the Haskell process uses a single OS thread only, and multithreaded foreign calls are not supported.
What is the state of Haskell multi-threading?
Mature. The implementation is around 15 years old, with transactional memory for 5 years. GHC is a widely used compiler, with large open source support, and commercial backing.
How easy is it to introduce into programs?
This depends on the algorithm. Sometimes it can be a one-line use of par to gain parallelism. Sometimes new algorithms must be developed. In general it will be easier to introduce safe parallelism and concurrency in Haskell than in typical languages, and performance is good.
Is there a good multi-threading tutorial that goes over these different commands and their uses?
There are 3 major parallel and concurrent programming models in Haskell.
par
These are the main things. In all cases you compile with -threaded to use the multicore runtime, but how easy it is to parallelise a particular problem depends on the algorithm you use, and the parallel programming model you adopt from that list.
Here is an introduction to the main parallel programming models in Haskell, and how to achieve speedups.
I think Chapter 24 of Real World Haskell is a good tutorial.
There is also the related term concurrency.
Without any changes to your code, the Haskell RTS will try to use the available cores for some internal processes, but to use them in your application you have to give a hint. That is done with par b (f a b), which makes Haskell less lazy about calculating b, even if f does not need it for the result.
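A minimal sketch of that hint, assuming a deliberately slow fib as the workload (fib and parSum are names invented for this example). par and pseq are imported from GHC.Conc in base here; the Control.Parallel module of the parallel package exports the same two functions.

```haskell
import GHC.Conc (par, pseq)

-- Naive Fibonacci, used only as a deliberately expensive function.
fib :: Int -> Integer
fib n
  | n < 2 = toInteger n
  | otherwise = fib (n - 1) + fib (n - 2)

-- Spark b for possible parallel evaluation, evaluate a on the current
-- thread, then combine the two results.
parSum :: Int -> Int -> Integer
parSum m n = b `par` (a `pseq` (a + b))
  where
    a = fib m
    b = fib n

main :: IO ()
main = print (parSum 27 28)
```

To actually run the spark on another core, the program has to be compiled with -threaded and started with something like +RTS -N. Without that it still computes the same result, just sequentially.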
One of the reasons not to do that for every function that requires all of its arguments (like a+b) is that synchronization (scheduling calculations and waiting for results) adds some overhead, and you probably don't want to spend extra ticks on (2*3)+(3*4) just because you can calculate the multiplications in parallel. You will probably also lose some cache hits, or other optimizations that apply when everything runs on a single processor (i.e. you'll need to pass the result from one processor to another anyway).
Of course, code that uses par is ugly, and when you fold a list or some other data structure with light sub-elements, you'll probably want to evaluate those light elements in chunks so that the ratio of overhead to computation stays really small. To address that, you can look at the parallel package.
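For instance, the chunking idea could be sketched with Control.Parallel.Strategies from the parallel package (it has to be installed separately). The function step and the chunk size of 1000 are arbitrary choices for this illustration, not recommendations.

```haskell
import Control.Parallel.Strategies (parListChunk, rseq, using)

-- Hypothetical cheap per-element function.
step :: Int -> Int
step x = x * x + 1

-- Map step over the list, creating one spark per chunk of 1000 elements
-- rather than one spark per element. rseq evaluates each element to weak
-- head normal form, which for Int means fully evaluated.
parMapChunked :: [Int] -> [Int]
parMapChunked xs = map step xs `using` parListChunk 1000 rseq

main :: IO ()
main = print (sum (parMapChunked [1 .. 100000]))
```

The strategy is kept separate from the algorithm, so map step xs stays readable and the chunk size can be tuned without touching it.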
There is also Data Parallel Haskell (DPH).
If your program is more about the IO monad, then you definitely need more changes. See forkIO, Software Transactional Memory (STM) and many others from the Concurrency category.
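A small sketch of how forkIO and STM can fit together, assuming the stm package that ships with GHC. The function countConcurrently is a made-up example: several threads increment one shared TVar, and each increment runs as an atomic transaction.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Concurrent.STM (atomically, modifyTVar', newTVarIO, readTVarIO)
import Control.Monad (replicateM, replicateM_)

-- Start `workers` threads that each increment a shared counter
-- `perWorker` times, wait for all of them, and return the final count.
countConcurrently :: Int -> Int -> IO Int
countConcurrently workers perWorker = do
  counter <- newTVarIO 0
  dones <- replicateM workers $ do
    done <- newEmptyMVar
    _ <- forkIO $ do
      -- Each increment is its own transaction, so no update is lost.
      replicateM_ perWorker (atomically (modifyTVar' counter (+ 1)))
      putMVar done ()
    return done
  -- Block until every worker has signalled completion.
  mapM_ takeMVar dones
  readTVarIO counter

main :: IO ()
main = countConcurrently 4 1000 >>= print
```

Because the increments are transactional, the final count is always workers * perWorker, however the threads are interleaved.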