I'm progressing in my Clojure quest (about 80 problems solved on 4clojure.com) and I keep on reading and coding and trying to "get it".
Now I'm a bit confused by Clojure being designed for "lockless concurrency". I know all too well about deadlocks (as in: "I've written poor Java code that ended up in deadlocks", not as in "I'm an expert in concurrency"). I've also read this:
Why is lockless concurrency such a big deal (in Clojure)?
I realize how great it is that Clojure programs cannot deadlock.
But I'm a bit confused: is such a feat achieved by implementing lock-free algorithms under the hood, or are potentially "deadlockable" algorithms used, but with a correct implementation that is guaranteed never to deadlock (and somehow "hidden" from Clojure programmers)?
There's been a recent discussion on Hacker News about lock-free algorithms:
http://news.ycombinator.com/item?id=4103921
referencing the following "Lock-free algorithms" page on 1024cores.net:
http://www.1024cores.net/home/lock-free-algorithms
I don't understand the relation between this article and how concurrency works under Clojure.
And it got me totally confused: when I'm developing concurrent programs in Clojure, does it mean "locks and lock-free algorithms" are a non-issue for me?
In general, lock-free algorithms are less efficient per thread: you're doing more work, as you mentioned, to implement a lock-free algorithm than a simple lock. However, they do tend to dramatically improve the overall throughput of the algorithm as a whole in the face of contention.
At its essence, lock-free is a property used to describe some code, without saying too much about how that code was actually written. Basically, if some part of your program guarantees that at least one thread makes progress no matter how the threads are scheduled, then that part can rightfully be considered lock-free.
Lock-free algorithms increase the overall throughput of a system by occasionally increasing the latency of a particular transaction. Most high-end database systems are based on lock-free algorithms, to varying degrees.
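To make the per-thread-cost vs. throughput tradeoff concrete, here is a minimal Java sketch (Java rather than Clojure, since this is JVM-level machinery) contrasting a lock-based counter with a CAS-based one. The class and method names are made up for illustration:

```java
import java.util.concurrent.atomic.AtomicLong;

// Illustrative contrast: the locked version blocks other threads while
// one holds the lock (and a descheduled holder stalls everyone); the
// lock-free version may waste work retrying under contention, but some
// thread is always able to make progress.
class Counters {
    static long locked = 0;
    static final Object lock = new Object();
    static final AtomicLong lockFree = new AtomicLong(0);

    // One thread at a time inside the critical section.
    static void incrementLocked() {
        synchronized (lock) { locked++; }
    }

    // CAS retry loop: a failed compareAndSet means another thread won
    // the race, so we re-read and try again -- more work per thread,
    // but no blocking.
    static void incrementLockFree() {
        long seen;
        do {
            seen = lockFree.get();
        } while (!lockFree.compareAndSet(seen, seen + 1));
    }
}
```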
In general, Clojure avoids the problem of locks by handling time properly. In many systems, time for an object is a very loose concept, because an object at time-1 (before an update) is edited in place to become that object at time-2 (after the update). During that process it is neither the first nor the second, so we use locks to ensure it is only ever visible before or after this transition. The need to coordinate locks follows from this, and deadlocks follow from that ...
It's a combination of algorithms, data structures, and time.
Clojure does this by combining immutable data structures, functional programming, and a coordinated time model (refs, atoms, agents, etc.). In this model a function takes something and produces the next version of it, while preserving the past for as long as anyone is looking at it (until the GC gets to it).
Immutable Data Structures: Clojure's collections are persistent in the FP sense of the word. Old copies "persist" after new versions are made. This way observers don't need to lock objects, because they will never change out from under them. Newer versions may exist that are based on the version they are looking at, though nothing will change their copy.
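The "old versions persist" idea can be sketched with the simplest possible persistent structure, a cons list, in Java (Clojure's real vectors and maps use trees for efficiency, and the class here is invented purely for illustration):

```java
// Tiny persistent list: "adding" returns a NEW version that shares its
// tail with the old one. The old version is never modified, so a reader
// holding v1 needs no lock -- nothing can change out from under it.
final class PList {
    final int head;
    final PList tail;
    PList(int head, PList tail) { this.head = head; this.tail = tail; }
    static PList cons(int x, PList xs) { return new PList(x, xs); }
}
```

Usage: `PList v1 = PList.cons(2, PList.cons(1, null));` then `PList v2 = PList.cons(3, v1);` — v2 is a newer version that shares structure with v1, and v1 is completely unchanged.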
Functional Programming: Pure functions (or as close as makes no difference) take a collection at one point in time and produce the next version without sharing their internal state, so no locking is required. This has many other benefits as well.
Coordinated time: When multiple objects do need to be coordinated, as is the case in any interesting system, Clojure's time model comes into play, with different mechanisms for different purposes. Internally there is one lock used to count the increments of time, so that there is exactly one time zero, one time one, and one time N; so it is not strictly lock-free. The STM contains locks that you never* need to interact with.
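A rough sketch of how an atom-style time model can sit on top of CAS, written in Java against `java.util.concurrent` (the class is an illustrative toy, not Clojure's actual implementation): a pure function is applied to the current immutable value, and the result becomes the next point in time only if nobody else advanced time in between; otherwise we retry against the newer value.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.UnaryOperator;

// Toy atom: successive values are successive "points in time".
// Readers just grab the current value; writers retry a pure function
// until their compareAndSet wins, so no locks are involved.
class AtomSketch<T> {
    private final AtomicReference<T> state;
    AtomSketch(T initial) { state = new AtomicReference<>(initial); }

    T deref() { return state.get(); }

    T swap(UnaryOperator<T> f) {
        while (true) {
            T oldV = state.get();
            T newV = f.apply(oldV);              // pure: no shared mutation
            if (state.compareAndSet(oldV, newV)) // install time N+1 only if
                return newV;                     // time is still N; else retry
        }
    }
}
```

For example, `new AtomSketch<>(0).swap(n -> n + 1)` plays the role of Clojure's `(swap! a inc)`.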
If you grep around the Clojure source, and in particular the .java files, you will find quite a few references to the java.util.concurrent package. The java.util.concurrent package is the culmination of literally decades of research on concurrency by Doug Lea at SUNY Oswego. Specifically, there are references to the atomic variable classes (e.g. AtomicReference) that allow access to the "compare and swap" (also called "compare and set", or CAS) instruction. The CAS instruction is a bit difficult to explain (I provide a reference below), but proper use of CAS is at the heart of what it means for an algorithm to be "lock-free" (at least in the Java world). Lock-free algorithms ultimately lead to higher throughput and less contention for highly concurrent applications: exactly the domain that Clojure is targeting.
For an in-depth look on this subject, read Java Concurrency in Practice by Brian Goetz. Also see this article by the same author.
As a side note, I always found it difficult to use the java.util.concurrent package directly, even as it evolved. It just felt too low-level to me. The wonderful thing about Clojure is that it provides access to that expert concurrency library through terrific, easy-to-use software transactional memory (STM) abstractions. That is really quite an accomplishment.