In Effective Java, Item 74, Joshua Bloch demonstrates the safe use of a parameterless constructor with a separate initialization method in the following code snippet.
import java.util.concurrent.atomic.AtomicReference;

abstract class AbstractFoo {
    private int x, y; // Our state

    // This enum and field are used to track initialization
    private enum State {
        NEW, INITIALIZING, INITIALIZED
    };

    private final AtomicReference<State> init = new AtomicReference<State>(
            State.NEW);

    public AbstractFoo(int x, int y) {
        initialize(x, y);
    }

    // This constructor and the following method allow
    // subclass's readObject method to initialize our state.
    protected AbstractFoo() {
    }

    protected final void initialize(int x, int y) {
        if (!init.compareAndSet(State.NEW, State.INITIALIZING))
            throw new IllegalStateException("Already initialized");
        this.x = x;
        this.y = y;
        // ...Do anything else the original constructor did
        init.set(State.INITIALIZED);
    }

    // These methods provide access to internal state so it can
    // be manually serialized by subclass's writeObject method.
    protected final int getX() {
        checkInit();
        return x;
    }

    protected final int getY() {
        checkInit();
        return y;
    }

    // Must call from all public and protected instance methods
    private void checkInit() {
        if (init.get() != State.INITIALIZED)
            throw new IllegalStateException("Uninitialized");
    }
}
What puzzles me is the use of AtomicReference. His explanation reads:
Note that the initialized field is an atomic reference (java.util.concurrent.atomic.AtomicReference). This is necessary to ensure object integrity in the face of a determined adversary. In the absence of this precaution, if one thread were to invoke initialize on an instance while a second thread attempted to use it, the second thread might see the instance in an inconsistent state.
I fail to understand how this strengthens the object's safety against being used in an inconsistent state. In my understanding, if one thread runs initialize() and a second one runs any of the accessors, there cannot be a situation where the second thread reads the value of the x or y field without initialization having been marked as completed.
The other possible issue I can see here is that AtomicReference should be thread-safe (probably with a volatile field inside). This would ensure that a change to the init variable is immediately visible to other threads, which would prevent getting an IllegalStateException when the initialization has in fact been done but the thread executing the accessor methods cannot see it. But is this what the author is talking about?
Is my reasoning correct? Or is there another explanation for this?
This is a long answer, and it sounds like you already have some grasp of the issue, so I'm adding headers to try and make it easier for you to fast-forward past the parts you already know.
Multithreading is a bit tricky, and one of the trickier bits is that the compiler/JVM is allowed to reorder operations across threads in the absence of synchronization. That is, if thread A does:
field1 = "hello";
field2 = "world";
and thread B does:
System.out.println(field2);
System.out.println(field1);
Then it's possible that thread B would print out "world" followed by "null" (assuming that's what field1 was initially). This "shouldn't" happen, because you set field2 after field1 in the code; so if field2 has been set, then surely field1 must be, too? Nope! The compiler is allowed to reorder things so that thread B sees the assignments as happening like this:
field2 = "world";
field1 = "hello";
(It could even see field2 = "world" and never see field1 = "hello", or it could never see either assignment, or other possibilities.) There are various reasons why this could happen: it might be more efficient due to how the compiler wants to use registers, or it could be that it's a more efficient way to share memory across CPU cores. Point is, it's allowed.
One of the more unintuitive concepts here is that a constructor generally doesn't provide any special guarantees against reordering (except that it does for final fields). So don't think of the constructor as anything other than a method, and don't think of a method as anything other than a grouping of actions, and don't think of an object's state as anything other than a grouping of fields. It seems obvious that an assignment in a constructor would be seen by anyone who has that object (after all, how can you read an object's state before you finished making the object?), but that notion is incorrect due to reorderings. What you think of as foo = new ConcreteFoo() is actually:
- create a new instance of ConcreteFoo (call it this); call initialize, do some stuff...
- this.x = x
- this.y = y
- foo = <the newly constructed object>
You can see how the bottom three assignments could be reordered; thread B could see them as happening in various ways, including (but not limited to):
- foo = <the newly constructed object, with default values for all fields>
- foo.getX(), which returns 0
- this.x = x (possibly a long time later)
- (this.y = y is never seen by thread B)
However, there are ways to solve that problem. Let's put the AtomicReference to the side for a moment...
The way to solve the problem is with a happens-before (HB) relationship. If there is an HB relationship between the writes and the reads, then neither the compiler nor the CPU is allowed to do the reordering above.
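Specifically, if one action happens-before another, then the first is visible to and ordered before the second (JLS 17.4.5).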
That's pretty abstract, so let me make it more concrete. One way you can establish a happens-before edge is with a volatile field: there's an HB relationship between one thread writing to that field and another thread subsequently reading from it. So, if thread A writes to a volatile field, and thread B then reads that value from the same field, then thread B must see the world as thread A saw it at the time of the write (well, at least as recently as that: thread B could also see some subsequent actions).
So, let's say field2 were volatile. In that case:
Thread 1:
field1 = "hello";
field2 = "world"; // point 1
Thread 2:
System.out.println(field2); // point 2
System.out.println(field1); // point 3
Here, point 1 "starts" an HB relationship that point 2 "finishes." That means that as of point 2 (provided the read there actually observes "world"), thread 2 must see everything that thread 1 saw at point 1: specifically, the assignment field1 = "hello" (as well as field2 = "world"). And so, thread 2 will print out "world" followed by "hello", as expected.
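As a rough illustration of that volatile handoff, here is a minimal sketch. The class name VolatileHandoff and its members are hypothetical and not part of the original example; the reader spins until it observes the volatile write, so by the rule above it must then also see the earlier plain write.

```java
// Sketch only: VolatileHandoff and all of its members are hypothetical names.
final class VolatileHandoff {
    private int data;                // plain field, written before the volatile write
    private volatile boolean ready;  // the volatile write/read pair creates the HB edge

    // Writer side: the plain write comes first, then the volatile write ("point 1").
    void publish(int value) {
        data = value;
        ready = true;
    }

    // Reader side: spin on the volatile read ("point 2"), then read the plain field.
    int awaitAndRead() {
        while (!ready) {
            Thread.yield();
        }
        return data;
    }

    // Starts a writer thread and reads the value from the calling thread.
    static int demo() {
        VolatileHandoff handoff = new VolatileHandoff();
        Thread writer = new Thread(() -> handoff.publish(42));
        writer.start();
        return handoff.awaitAndRead();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 42
    }
}
```

If ready were not volatile, the reader would have no such guarantee: it might spin forever, or it might leave the loop and still read a stale value of data.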
So, what does all this have to do with AtomicReference? The secret lies in the javadoc for the java.util.concurrent.atomic package:
The memory effects for accesses and updates of atomics generally follow the rules for volatiles, as stated in section 17.4 of The Java™ Language Specification.
In other words, there is an HB relationship between myAtomicRef.set and myAtomicRef.get. Or, as in the example above, between myAtomicRef.compareAndSet and myAtomicRef.get.
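To make those three calls concrete, here is a small sketch of an AtomicReference used as a one-shot state flag, in the same spirit as the init field in the question. The class InitFlagDemo and its method names are hypothetical, chosen only for illustration.

```java
import java.util.concurrent.atomic.AtomicReference;

// Sketch only: InitFlagDemo and its method names are hypothetical.
final class InitFlagDemo {
    enum State { NEW, INITIALIZING, INITIALIZED }

    private final AtomicReference<State> init = new AtomicReference<>(State.NEW);

    // Atomically moves NEW -> INITIALIZING; returns true for exactly one caller.
    boolean tryBegin() {
        return init.compareAndSet(State.NEW, State.INITIALIZING);
    }

    // Volatile-style write: publishes everything the caller wrote before it.
    void finish() {
        init.set(State.INITIALIZED);
    }

    // Volatile-style read: pairs with the write in finish().
    State current() {
        return init.get();
    }

    public static void main(String[] args) {
        InitFlagDemo flag = new InitFlagDemo();
        System.out.println(flag.tryBegin()); // true: this call won the transition
        System.out.println(flag.tryBegin()); // false: the state is no longer NEW
        flag.finish();
        System.out.println(flag.current());  // INITIALIZED
    }
}
```

The compareAndSet call is what lets only one caller start initializing, while the set/get pair is what carries the visibility guarantee.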
AbstractFoo
Without the AtomicReference actions, there are no HB relationships established in AbstractFoo. If one thread assigns a value to this.x (as it does in initialize, called by the constructor) and another thread reads the value of this.x (as it does during getX), you could have the reordering problem mentioned above, and have getX return the default value for x (that is, 0).
But AbstractFoo does take specific measures to establish HB relationships: initialize calls init.set after it assigns this.x = x, and getX calls init.get (via checkInit) before it reads this.x to return it (similarly with y). That establishes the HB relationship, ensuring that thread B calling getX, by the time it reads this.x, sees the world as thread A saw it at the end of initialize, when it called init.set; specifically, thread B sees the action this.x = x before it performs the action return [this.]x.
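A sketch of how this plays out across two threads follows. CondensedFoo is a trimmed stand-in for AbstractFoo that keeps only x, and DemoFoo and FooVisibilityDemo are hypothetical helpers; none of these names come from the book.

```java
import java.util.concurrent.atomic.AtomicReference;

// Condensed stand-in for AbstractFoo: same init protocol, only x is kept.
abstract class CondensedFoo {
    private int x;

    private enum State { NEW, INITIALIZING, INITIALIZED }

    private final AtomicReference<State> init = new AtomicReference<>(State.NEW);

    protected CondensedFoo() { }

    protected final void initialize(int x) {
        if (!init.compareAndSet(State.NEW, State.INITIALIZING))
            throw new IllegalStateException("Already initialized");
        this.x = x;                   // plain write...
        init.set(State.INITIALIZED);  // ...published by this volatile-style write
    }

    protected final int getX() {
        if (init.get() != State.INITIALIZED)  // volatile-style read
            throw new IllegalStateException("Uninitialized");
        return x;                     // must see the write made in initialize
    }
}

// Hypothetical subclass, used only to drive the demo.
final class DemoFoo extends CondensedFoo {
    void init(int x) { initialize(x); }
    int x() { return getX(); }
}

final class FooVisibilityDemo {
    // One thread initializes; the caller polls until getX stops throwing.
    static int demo() {
        DemoFoo foo = new DemoFoo();
        Thread initializer = new Thread(() -> foo.init(7));
        initializer.start();
        while (true) {
            try {
                return foo.x();       // either throws or returns 7, never 0
            } catch (IllegalStateException notYet) {
                Thread.yield();
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 7
    }
}
```

The polling thread can see an IllegalStateException any number of times, but once init.get() returns INITIALIZED it cannot observe the default value of x.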
There are a few other ways to establish happens-before edges, but they're out of scope for this answer. They're listed in JLS 17.4.4.
And the obligatory reference to Java Concurrency in Practice (JCIP), a great book on multithreading issues in general and their applicability to Java in particular.