Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Non-Toy Software Transactional Memory for C or Java

I'm thinking about the possibility of teaching the use of Software Transactional Memory through 1 or 2 guided laboratories for a university course. I only know about Haskell's STM, but the students of the course probably never heard a word about it.

I already found some lists of such libraries online or in other questions (e.g., http://en.wikipedia.org/wiki/Software_transactional_memory#C.2FC.2B.2B). I'm checking them out as you read this, but many of them do not seem to have a very nice documentation (most are research prototypes only vaguely described in papers, and I would rather teach about something more used and well documented). Furthermore, many of the links provided by wikipedia are dangling.

To sum it up, are there STM implementations aimed to industrial projects (or at least non-toy ones, to ensure a certain level of quality) and well documented (to give some good pointers to the students)?

EDIT: I'm not the teacher of the course, I just help him with the laboratories. Of course the students will be taught basics of concurrency and distributed algorithms before. This was just an idea to propose something different towards the end of the course.

like image 249
Riccardo T. Avatar asked Feb 25 '13 09:02

Riccardo T.


3 Answers

Production-quality STM-Libraries are not intended as a teaching tool, not even as "best practice". What is worth learning for any college/university-course is maybe 1% of the code; the remaining 99% is nitty-gritty platform-dependent intrinsic corner-cases. The 1% that is interesting is not highlighted in any way so you have no way of finding it.

What I recommend for a college/university-course (no matter if introductory or advanced) is to implement STM-buildingblocks yourself (and only for 1 platform).

Start by introducing the problems: concurrency, cache...

Then introduce the atomic helpers we have: cas/cmpxchg, fence.

Then build examples together with your students, first easy, then harder and more complex.

like image 141
Bernd Elkemann Avatar answered Oct 31 '22 02:10

Bernd Elkemann


Start by introducing the problems: concurrency, cache...

Leading on from eznme, some good problems that I covered while at University for concurrency.

  • Dining philosophers problem

    In computer science, the dining philosophers problem is an example problem often used in concurrent algorithm design to illustrate synchronization issues and techniques for resolving them.

    dining phil
    (source: wikimedia.org)

Using the same implementation from here, by Je Magee and Je Kramer, and solving the problem using monitors.

Most shared memory applications are more efficient with Integers than Strings (due to AtomicInteger class for Java). So the best way to demonstrate shared memory in my opinion is to get the students to write an application that uses a threadpool to calculate prime numbers, or to calculate some integral.

Or a good example of threads and shared memory is the Producer-consumer problem.

The producer-consumer problem (also known as the bounded-buffer problem) is a classic example of a multi-process synchronization problem.

producer
(source: csusb.edu)

Implementation found here, there is also an implementation from Massey from the professor in Software Eng Jenz Dietrich.

For distributed algorithms MapReduce and Hadoop are highly documented distributed data structures. And for Distributed Programming Libraries look into MPI (Message Passing Interface) and OpenMP (or Pragma for C++). There is also implementations of Dijkstra shortest path algorithm in parallel too.

like image 40
classicjonesynz Avatar answered Oct 31 '22 02:10

classicjonesynz


There are three good ways to do STM today.

The first way is to use gcc and do TM in C or C++. As of gcc 4.7, transactional memory is supported via the -fgnu-tm flag. The gcc maintainers have done a lot of work, and as of the 4.9 (trunk) branch, you can even use hardware TM (e.g., Intel Haswell TSX). There is a draft specification for the interface to the TM at http://justingottschlich.com/tm-specification-for-c-v-1-1/, which is not too painful. You can also find use cases of gcc's TM from the TM community (see, for example, the application track papers from transact 2014: http://transact2014.cse.lehigh.edu).

The implementation itself is a bit complex, but that's what it takes to be correct. There's a lot of literature on the things that can go wrong, especially in a type-unsafe language like C or C++. GCC gets all of these things right. Really.

The second way is to use Java. You can either use DeuceSTM, which is very easy to extend (type safety makes TM implementation much easier!), or use Scala's Akka library for STM. I prefer Deuce, because it's easier to extend and easier to use (you just annotate a method as @Atomic, and Deuce's java agents do the rest).

The third way is to use Scala. I've not done much in this space, but researchers seem to love Akka. If you're affiliated with a parallel/distributed class, you might even be using Scala already.

like image 34
Mike Spear Avatar answered Oct 31 '22 02:10

Mike Spear