I've been trying to get a understanding of concurrency, and I've been trying to work out what's better, one big IORef
lock or many TVar
s. I've came to the following guidelines, comments will be appreciated, regarding whether these are roughly right or whether I've missed the point.
Lets assume our concurrent data structure is a map m
, accessed like m[i]
. Lets also say we have two functions, f_easy
and f_hard
. The f_easy
is quick, f_hard
takes a long time. We'll assume the arguments to f_easy/f_hard
are elements of m
.
(1) If your transactions look roughly like this m[f_easy(...)] = f_hard(...)
, use an IORef
with atomicModifyIORef
. Laziness will ensure that m
is only locked for a short time as it's updated with a thunk. Calculating the index effectively locks the structure (as something is going to get updated, but we don't know what yet), but once it's known what that element is, the thunk over the entire structure moves to a thunk only over that particular element, and then only that particular element is "locked".
(2) If your transactions look roughly like this m[f_hard(...)] = f_easy(...)
, and the don't conflict too much, use lots of TVar
s. Using an IORef
in this case will effectively make the app single threaded, as you can't calculate two indexes at the same time (as there will be an unresolved thunk over the entire structure). TVar
s let you work out two indexes at the same time, however, the negative is that if two concurrent transactions both access the same element, and one of them is a write, one transaction must be scrapped, which wastes time (which could have been used elsewhere). If this happens a lot, you may be better with locks that come (via blackholing) from IORef
, but if it doesn't happen very much, you'll get better parallelism with TVar
s.
Basically in case (2), with IORef
you may get 100% efficiency (no wasted work) but only use 1.1 threads, but with TVar
if you have a low number of conflicts you might get 80% efficiency but use 10 threads, so you still end up 7 times faster even with the wasted work.
Concurrent Haskell is the name given to GHC's concurrency extension. It is enabled by default, so no special flags are required. The Concurrent Haskell paper is still an excellent resource, as is Tackling the awkward squad.
Concurrent Haskell extends Haskell 98 with explicit concurrency. Its two main underlying concepts are: A primitive type MVar α implementing a bounded/single-place asynchronous channel, which is either empty or holds a value of type α . The ability to spawn a concurrent thread via the forkIO primitive.
Parallelism and Concurrency in HaskellHaskell supports both pure parallelism and explicit concurrency.
Your guidelines are somewhat similar to the findings of [1] (Section 6) where the performance of the Haskell STM is analyzed:
"In particular, for programs that do not perform much work inside transactions, the commit overhead appears to be very high. To further observe this overhead, an analysis needs to be conducted on the performance of commit-time course-grain and fine-grain STM locking mechanisms."
I use atomicModifyIORef
or an MVar
when all the synchronization I need is something that simple locking will ensure. When looking at concurrent accesses to a data structure, it also depends on how this data structure is implemented. For example, if you store your data inside a IORef Data.Map
and frequently perform read/write access then I think atmoicModifyIORef
will degrade to a single thread performance, as you have conjectured, but the same will be true for a TVar Data.Map
. My point is that it's important to use a data structure that is suitable for concurrent programming (balanced trees aren't).
That said, in my opinion the winning argument for using STM is composability: you can combine multiple operations into a single transactions without headaches. In general, this isn't possible using IORef
or MVar
without introducing new locks.
[1] The limits of software transactional memory (STM): dissecting Haskell STM applications on a many-core environment. http://dx.doi.org/10.1145/1366230.1366241
Answer to @Clinton's comment:
If a single IORef
contains all your data, you can simply use atomicModifyIORef
for composition. But if you need to process lots of parallel read/write requests to that data, the performance loss might become significant, since every pair of parallel read/write requests to that data might cause a conflict.
The approach that I would try is to use a data structure where the entries themselves are stored inside a TVar
(vs putting the whole data structure into a single TVar
). That should reduce the possibility of livelocks, as transactions won't conflict that often.
Of course, you still want to keep your transactions as small as possible and use composability only if it's absolutely necessary to guarantee consistency. So far I haven't encountered a scenario where combining more than a few insert/lookup operations into a single transaction was necessary.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With