Software transactional memory with a big, shared piece of data

Tags: haskell, stm

The original question

I'm new to STM. One thing I'd like to do in Haskell involves a big piece of data, and lots of lightweight threads reading and writing to small parts of said big piece of data. The locations read from and written to can be considered essentially random and small. STM seems great for this, but I have some questions regarding how to attack the problem. I see several possible approaches, each with some drawbacks, and some that seem plain stupid. Some comments on these, or other ideas for alternatives, would be much appreciated.

Let's for simplicity assume that the big piece of data is a Data.Vector a, where the elements are themselves small.

  1. The entire piece of data as one TVar (Vector a). I guess this will cause lots of copying of the huge piece of data, as STM will think each individual write has possibly affected the entire shared data. Surely there's no magic going on where STM identifies that the reads and writes are very localized, and that consistency across the big piece of data is not required?

  2. A huge number of TVar as, essentially one for each element, giving fully localized STM, but essentially duplicating the entire Vector a. This seems stupid.

  3. A compromise between 1 and 2 by segmenting the data so that I have a reasonable number of TVar (Vector a)s corresponding to subvectors of the data (a rough sketch of what I mean follows after this list). I feel this solution depends too much on heuristics, like how big the segments should be.

  4. Messaging. Instead of each worker reading and writing to the data using STM, each writes messages with requests for data to be read or chunks of data to be written through some STM mechanism, such as a TChan. A special thread receives these messages, passing out data requested through another TChan, or taking received data and writing it to the shared data structure. This solution seems free of the problems plaguing solutions 1-3, but it also seems to me that it essentially gives up on using the niceties of STM to keep the data consistent. Rather, it's just message passing. Granted, the message passing part is implemented with STM, but my real problem is solved with message passing. STM seems so great, message passing is so... meh.
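
To make approach 3 concrete, here is a rough, untested sketch of what I have in mind. The segment size and the names (Segmented, segSize, mkSegmented, readElem, writeElem) are made up for illustration, and nothing here is benchmarked:

    -- Sketch of approach 3: the data is split into fixed-size segments,
    -- each guarded by its own TVar, so a transaction only touches the
    -- segment containing the element it reads or writes.
    import Control.Concurrent.STM (STM, TVar, newTVarIO, readTVar, writeTVar)
    import qualified Data.Vector as V
    import qualified Data.Vector.Unboxed as VU

    type Segmented = V.Vector (TVar (VU.Vector Int))

    segSize :: Int
    segSize = 1024  -- pure heuristic; exactly the knob I'm unsure about

    mkSegmented :: Int -> IO Segmented
    mkSegmented n =
      V.generateM ((n + segSize - 1) `div` segSize) $ \_ ->
        newTVarIO (VU.replicate segSize 0)

    readElem :: Segmented -> Int -> STM Int
    readElem segs i = do
      let (s, o) = i `divMod` segSize
      seg <- readTVar (segs V.! s)
      return (seg VU.! o)

    writeElem :: Segmented -> Int -> Int -> STM ()
    writeElem segs i x = do
      let (s, o) = i `divMod` segSize
          tv     = segs V.! s
      seg <- readTVar tv
      writeTVar tv (seg VU.// [(o, x)])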

Am I thinking about this correctly? Does anybody have any hints or other suggestions?

Keep in mind that I have no experience with STM, nor have I tried the above solutions yet. I'll get out of my armchair, but sometimes it can be good to think about these things before trying anything.

Addendum: A fifth approach comes from Nathan Howell and uses TArray. It sounds like what I want, but the documentation says:

It is currently implemented as Array ix (TVar e), but it may be replaced by a more efficient implementation in the future (the interface will remain the same, however).

I take this to mean that TArray is just my approach number 2 in nicer clothes. The documentation's hint about "a more efficient implementation" is interesting, as it suggests there actually is a nicer approach out there.
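
For reference, using TArray looks something like the following minimal, untested sketch (the bounds, the element type and the particular update are arbitrary):

    import Control.Concurrent.STM        (atomically)
    import Control.Concurrent.STM.TArray (TArray)
    import Data.Array.MArray             (newArray, readArray, writeArray)

    main :: IO ()
    main = do
      -- a 10-element transactional array, all zeros
      arr <- atomically (newArray (0, 9) 0) :: IO (TArray Int Int)
      -- read one cell and write another in a single atomic transaction
      atomically $ do
        x <- readArray arr 3
        writeArray arr 4 (x + 1)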

Follow-up on Vagif Verdi's answer

Vagif Verdi's answer is very interesting, so I made a little experiment to try it out. I don't have time to slim down the code right now, so those interested will have to bear with me on the code containing more than the bare essentials. I decided to use a mutable vector of 10^8 Ints as the "big shared piece of data", and to let the multiple readers/writers correspond to threads taking input on a network socket.

Note that the code does not even read from or write to the shared piece of data yet. It's simply there, and each thread holds a TVar to it.
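
The full program is longer, but a condensed sketch of the setup looks roughly like this (not the exact code I ran; the port number and the handleClient name are illustrative, and it assumes the old Network module from the network package):

    import Control.Concurrent     (forkIO)
    import Control.Concurrent.STM (TVar, newTVarIO)
    import Control.Monad          (forever)
    import qualified Data.Vector.Unboxed.Mutable as VM
    import Network                (PortID (PortNumber), accept, listenOn, withSocketsDo)
    import System.IO              (BufferMode (LineBuffering), Handle, hGetLine, hSetBuffering)

    main :: IO ()
    main = withSocketsDo $ do
      vec  <- VM.replicate (10 ^ 8) (0 :: Int)  -- the big shared piece of data (~800 MB)
      tvar <- newTVarIO vec                     -- every client thread holds this TVar
      sock <- listenOn (PortNumber 4242)
      forever $ do
        (h, _, _) <- accept sock
        forkIO (handleClient tvar h)

    -- Just echoes whatever the client sends; it never touches the vector.
    handleClient :: TVar (VM.IOVector Int) -> Handle -> IO ()
    handleClient _tvar h = do
      hSetBuffering h LineBuffering
      forever (hGetLine h >>= putStrLn)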

So what happens? I run the program, and immediately it takes up about 780 MB of RAM, which is to be expected (it's roughly what the 10^8 Ints need). Now, if I use netcat to connect a few clients and write a bit of text, which the program is just supposed to print out and not even write to the shared data, the process' CPU usage spikes to 100% for upwards of a second! There's a noticeable lag before the text is displayed. On the bright side, memory usage stays constant (as per Vagif Verdi's answer). If I remove the vector and the TVar, i.e. take out all STM and shared data, everything is very quick and responsive, and there's no noticeable CPU usage whenever a client writes something.

So, while it's very nice to see that the shared data is in fact not duplicated (although I suppose I should actually write to it in order to verify this fully), there's a very heavy performance penalty involved in maintaining coherent state. My question then remains: how should one properly attack this problem while keeping the niceties of STM?

Thanks to Vagif Verdi for bringing up some very interesting points.

asked Mar 03 '12 by gspr

1 Answer

STM isn't magic. If you have one giant TVar, it is going to have to block all threads but one at any given time. This is equivalent to "coarse-grained locking" and has the benefit that it is easy to use, but the whole point of STM is to not be forced into coarse-grained locking. Effectively, only one thread can write at a time with this approach. The same goes for your "message passing" system: one thread is a bottleneck, limiting scalability.

I don't think there is a big problem with using an array of TVars and I don't know why you describe your approach 2 as "stupid". This is exactly what STM was invented to do.
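
To illustrate with a toy sketch (my own names, not from the question; it assumes a reasonably recent stm package for modifyTVar'): two workers update disjoint halves of a vector of TVars, and because each transaction touches a single TVar, the workers never invalidate each other.

    import Control.Concurrent      (forkIO)
    import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
    import Control.Concurrent.STM  (atomically, modifyTVar', newTVarIO, readTVarIO)
    import Control.Monad           (forM_, replicateM)
    import qualified Data.Vector as V

    main :: IO ()
    main = do
      tvars <- fmap V.fromList (replicateM 1000 (newTVarIO (0 :: Int)))
      done  <- newEmptyMVar
      let worker lo hi = do
            forM_ [lo .. hi] $ \i ->
              atomically (modifyTVar' (tvars V.! i) (+ 1))  -- touches one TVar only
            putMVar done ()
      _ <- forkIO (worker 0   499)  -- the two workers write disjoint TVars,
      _ <- forkIO (worker 500 999)  -- so their transactions never conflict
      takeMVar done
      takeMVar done
      total <- fmap sum (mapM readTVarIO (V.toList tvars))
      print total  -- 1000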

Edit: I encourage interested parties to watch this video, or at least the beginning of it, for a discussion of some of the motivation for STM. It is a few years old, and the stuff on Transactional Boosting isn't really relevant, but Herlihy is brilliant and one of the computer scientists who manage to make an area interesting even if it isn't your thing.

answered Oct 02 '22 by Philip JF