Memory barriers and coding style over a Java VM

Suppose I have a static complex object that gets periodically updated by a pool of threads, and read more or less continually in a long-running thread. The object itself is always immutable and reflects the most recent state of something.

    class Foo { int a, b; }

    static Foo theFoo;

    void updateFoo(int newA, int newB) {
        Foo f = new Foo();
        f.a = newA;
        f.b = newB;
        // HERE
        theFoo = f;
    }

    void readFoo() {
        Foo f = theFoo;
        // use f...
    }

I do not care in the least whether my reader sees the old or the new Foo; however, I need to see a fully initialized object. IIUC, the Java spec says that without a memory barrier at HERE, I may see an object with f.b initialized but f.a not yet committed to memory. My program is a real-world program that will sooner or later commit stuff to memory, so I don't need to actually commit the new value of theFoo to memory right away (though it wouldn't hurt).

What do you think is the most readable way to implement the memory barrier? I am willing to pay a small performance price for the sake of readability if need be. I think I can just synchronize the assignment to theFoo and that would work, but I'm not sure it's obvious to someone reading the code why I do that. I could also synchronize the whole initialization of the new Foo, but that would introduce more locking than is actually needed.

How would you write it so that it's as readable as possible?
Bonus kudos for a Scala version :)

asked Oct 18 '10 by Jean

1 Answer

Short Answers to the Original Question

  • If Foo is immutable, simply making the fields final will ensure complete initialization and consistent visibility of fields to all threads irrespective of synchronization.
  • Whether or not Foo is immutable, publication via a volatile theFoo or an AtomicReference<Foo> theFoo is sufficient to ensure that writes to its fields are visible to any thread reading via the theFoo reference.
  • With a plain assignment to theFoo, reader threads are never guaranteed to see any update.
  • In my opinion, and based on JCiP, the "most readable way to implement the memory barrier" is AtomicReference<Foo>, with explicit synchronization coming in second, and use of volatile coming in third.
  • Sadly, I have nothing to offer in Scala.

You can use volatile

I blame you. Now I'm hooked, I've broken out JCiP, and now I'm wondering if any code I've ever written is correct. The code snippet above is, in fact, potentially inconsistent. (Edit: see the section below on Safe Publication via volatile.) The reading thread could also see stale values (in this case, whatever the default values for a and b were) for an unbounded time. You can do one of the following to introduce a happens-before edge:

  • Publish via volatile, which creates a happens-before edge equivalent to a monitorenter (read side) or monitorexit (write side)
  • Use final fields and initialize the values in a constructor before publication
  • Introduce a synchronized block when writing the new values to the theFoo object
  • Use AtomicInteger fields (a rough sketch follows this list)
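
As a rough sketch of that last option (my own illustration, not code from the original answer), each field can be made atomic so its value is visible across threads once written:

    import java.util.concurrent.atomic.AtomicInteger;

    class Foo {
        // each field is individually visible to other threads once set;
        // note this alone does not make the pair (a, b) atomic as a unit
        final AtomicInteger a = new AtomicInteger();
        final AtomicInteger b = new AtomicInteger();
    }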

Any of these gets the write ordering solved (and solves the fields' visibility issues). Then you need to address visibility of the new theFoo reference. Here, volatile is appropriate -- JCiP says in section 3.1.4, "Volatile variables" (and here, the variable is theFoo):

You can use volatile variables only when all the following criteria are met:
  • Writes to the variable do not depend on its current value, or you can ensure that only a single thread ever updates the value;
  • The variable does not participate in invariants with other state variables; and
  • Locking is not required for any other reason while the variable is being accessed
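
As a hypothetical aside (mine, not JCiP's): the first criterion is what rules out things like a volatile counter, because an increment is a read-modify-write of the current value, whereas publishing a freshly built Foo satisfies all three criteria:

    // NOT sufficient: the write depends on the variable's current value,
    // so two threads can interleave the read and the write and lose an update
    volatile int count;
    void increment() { count = count + 1; }

    // Fine: the write to theFoo does not depend on theFoo's current value
    volatile Foo theFoo;
    void publish(int newA, int newB) { theFoo = new Foo(newA, newB); }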

If you do the following, you're golden:

    class Foo {
        // it turns out these fields need not be final: with the volatile publish,
        // the values will be seen under the new JMM
        final int a, b;
        Foo(final int a, final int b) { this.a = a; this.b = b; }
    }

    // without volatile here, a separate thread A' calling readFoo()
    // may never see the new theFoo value written by thread A
    static volatile Foo theFoo;

    void updateFoo(int newA, int newB) {
        Foo f = new Foo(newA, newB);
        theFoo = f;
    }

    void readFoo() {
        final Foo f = theFoo;
        // use f...
    }
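
To make the intended usage concrete, here is a small, hypothetical harness of my own (the class name and scheduling parameters are made up for illustration): a couple of pool threads update theFoo periodically while a long-running thread reads it continually, and the reader always sees a fully constructed Foo, old or new.

    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;

    class FooDemo {
        static class Foo {
            final int a, b;
            Foo(int a, int b) { this.a = a; this.b = b; }
        }

        // volatile publication, as in the snippet above
        static volatile Foo theFoo = new Foo(0, 0);

        public static void main(String[] args) throws InterruptedException {
            ScheduledExecutorService updaters = Executors.newScheduledThreadPool(2);
            updaters.scheduleAtFixedRate(
                () -> { theFoo = new Foo(1, 2); }, 0, 100, TimeUnit.MILLISECONDS);

            Thread reader = new Thread(() -> {
                while (!Thread.currentThread().isInterrupted()) {
                    Foo f = theFoo;          // old or new Foo, never half-built
                    int unused = f.a + f.b;
                }
            });
            reader.start();

            Thread.sleep(1000);
            reader.interrupt();
            updaters.shutdownNow();
        }
    }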

Straightforward and Readable

Several folks on this and other threads (thanks @John V) note that the authorities on these issues emphasize the importance of documenting synchronization behavior and assumptions. JCiP talks in detail about this, provides a set of annotations that can be used for documentation and static checking, and you can also look at the JMM Cookbook for indicators about specific behaviors that would require documentation and links to the appropriate references. Doug Lea has also prepared a list of issues to consider when documenting concurrency behavior. Documentation is appropriate particularly because of the concern, skepticism, and confusion surrounding concurrency issues (on SO: "Has java concurrency cynicism gone too far?"). Also, tools like FindBugs now provide static checking rules to notice violations of JCiP annotation semantics, like "Inconsistent synchronization: IS_FIELD_NOT_GUARDED".

Until you think you have a reason to do otherwise, it's probably best to proceed with the most readable solution, something like this (thanks, @Burleigh Bear), using the @Immutable and @GuardedBy annotations.

    @Immutable
    class Foo {
        final int a, b;
        Foo(final int a, final int b) { this.a = a; this.b = b; }
    }

    static final Object theFooSync = new Object();

    @GuardedBy("theFooSync")
    static Foo theFoo;

    void updateFoo(final int newA, final int newB) {
        final Foo f = new Foo(newA, newB);
        synchronized (theFooSync) { theFoo = f; }
    }

    void readFoo() {
        final Foo f;
        synchronized (theFooSync) { f = theFoo; }
        // use f...
    }

or, possibly, since it's cleaner:

    static final AtomicReference<Foo> theFoo = new AtomicReference<Foo>();

    void updateFoo(final int newA, final int newB) {
        theFoo.set(new Foo(newA, newB));
    }

    void readFoo() {
        final Foo f = theFoo.get();
        // use f...
    }

When is it appropriate to use volatile

First, note that while this topic pertains to the question here, it has been addressed many, many times on SO:

  • When exactly do you use volatile?
  • Do you ever use the volatile keyword in Java
  • For what is used "volatile"
  • Using volatile keyword
  • Java volatile boolean vs. AtomicBoolean

In fact, a Google search for "site:stackoverflow.com +java +volatile +keyword" returns 355 distinct results. Use of volatile is, at best, a volatile decision. When is it appropriate? JCiP gives some abstract guidance (cited above). I'll collect some more practical guidelines here:

  • I like this answer: "volatile can be used to safely publish immutable objects", which neatly encapsulates most of the range of use one might expect from an application programmer.
  • @mdma's answer here: "volatile is most useful in lock-free algorithms" summarizes another class of uses—special purpose, lock-free algorithms which are sufficiently performance sensitive to merit careful analysis and validation by an expert.

  • Safe Publication via volatile

    Following up on @Jed Wesley-Smith, it appears that volatile now provides stronger guarantees (since JSR-133), and the earlier assertion "You can use volatile provided the object published is immutable" is sufficient but perhaps not necessary.

    Looking at the JMM FAQ, the two entries How do final fields work under the new JMM? and What does volatile do? aren't really dealt with together, but I think the second gives us what we need:

    The difference is that it is now no longer so easy to reorder normal field accesses around them. Writing to a volatile field has the same memory effect as a monitor release, and reading from a volatile field has the same memory effect as a monitor acquire. In effect, because the new memory model places stricter constraints on reordering of volatile field accesses with other field accesses, volatile or not, anything that was visible to thread A when it writes to volatile field f becomes visible to thread B when it reads f.
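
    A small sketch of what that guarantee buys you (the field and method names here are my own, purely for illustration):

        int data;                   // plain, non-volatile field
        volatile boolean ready;     // volatile guard

        // Thread A
        void writer() {
            data = 42;              // ordinary write...
            ready = true;           // ...made visible by the volatile write (release)
        }

        // Thread B
        void reader() {
            if (ready) {            // volatile read (acquire)
                assert data == 42;  // the write to data is visible once ready is seen true
            }
        }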

    I'll note that, despite several rereadings of JCiP, the relevant text there didn't leap out to me until Jed pointed it out. It's on p. 38, section 3.1.4, and it says more or less the same thing as this preceding quote -- the published object need only be effectively immutable, no final fields required, QED.

    Older stuff, kept for accountability

    One comment: Any reason why newA and newB can't be arguments to the constructor? Then you can rely on publication rules for constructors...

    Also, using an AtomicReference likely clears up any uncertainty (and may buy you other benefits depending on what you need to get done in the rest of the class...). And someone smarter than me can tell you if volatile would solve this, but it always seems cryptic to me...

    In further review, I believe that the comment from @Burleigh Bear above is correct --- (EDIT: see below) you actually don't have to worry about out-of-sequence ordering here, since you are publishing a new object to theFoo. While another thread could conceivably see inconsistent values for newA and newB as described in JLS 17.11, that can't happen here because they will be committed to memory before the other thread gets ahold of a reference to the new f = new Foo() instance you've created... this is safe one-time publication. On the other hand, if you wrote

        void updateFoo(int newA, int newB) {
            Foo f = new Foo();
            theFoo = f;        // published before the fields are written
            f.a = newA;
            f.b = newB;
        }

    then a reader could observe f with its fields still at their default values. But in that case the synchronization issues are fairly transparent, and ordering is the least of your worries. For some useful guidance on volatile, take a look at this developerWorks article.

    However, you may have an issue where separate reader threads can see the old value of theFoo for an unbounded amount of time. In practice, this seldom happens, but the JVM is allowed to cache the value of the theFoo reference in another thread's context. I'm quite sure marking theFoo as volatile will address this, as will any kind of synchronizer or AtomicReference.
