Instructions reordering in Java JVM

Tags:

concurrency

I was reading this blogpost.

And the author was talking about breaking the hashCode() in String in multithread environment.

By having:

public int hashCode() {      int h = hash;      if (h == 0) {          int off = offset;          char val[] = value;          int len = count;           for (int i = 0; i < len; i++) {              h = 31*h + val[off++];          }          hash = h;      }      return h;  }

Changed to:

public int hashCode() {      if (hash == 0) {          int off = offset;          char val[] = value;          int len = count;           int h = 0;          for (int i = 0; i < len; i++) {              h = 31*h + val[off++];          }          hash = h;      }      return hash;  }

Which the author says and I quote:

"What I've done here is to add an additional read: the second read of hash, before the return. As odd as it sounds, and as unlikely as it is to happen, the first read can return the correctly computed hash value, and the second read can return 0! This is allowed under the memory model because the model allows extensive reordering of operations. The second read can actually be moved, in your code, so that your processor does it before the first!"

So further going through comments, someone says it can be reordered to

int h = hash; if (hash == 0) {   ... } return h;

How is that possible? I thought reordering only involves moving program statements up and down. What rules is it following? I've Googled, read the JSR133 FAQ, checked the Java Concurrency in Practice book, but I can't seem to find a place that helps me to understand particularly on reordering. If anyone can point me to the right direction, I would really appreciate it.

Thanks to Louis clarifying the meaning of "Reordering", I wasn't thinking in terms of "byteCode"

However, I still don't understand why is it allowed to move the 2nd Read to the front, this is my naive attempt to translate it to somewhat "bytecode" format.

For simplification purpose, operations that are used to calculate the hashcode are express as calchash(). Therefore, I express the program as:

if (hash == 0)  {            h = calchash();     hash = h; } return hash;

And my attempt to express it in "bytecode" form:

R1,R2,R3 are in the operands stack, or the registers h is in the array of local variables

In program order:

if (hash == 0)  {       ---------- R1 = read hash from memory (1st read)                         ---------- Compare (R1 == 0)     h = calchash();     ---------- R2 = calchash()                         ---------- h = R2 (Storing the R2 to local variable h)     hash = h;           ---------- Hash = h (write to hash) } return hash             ---------- R3 = read hash from memory again(2nd read)                         ---------- return R3

Reordered transformation (My version based on comments):

                        ---------- R3 = read hash from memory (2nd read) *moved* if (hash == 0)  {       ---------- R1 = read hash from memory (1st read)                         ---------- Compare (R1 == 0)     h = calchash();     ---------- R2 = calchash()                         ---------- h = R2 (Storing the R2 to local variable h)     hash = h;           ---------- hash = h (write to hash) } return hash             ---------- return R3

Checking the comments again, I found this answered by the author:

Reordered Transformation (From the blog)

r1 = hash; if (hash == 0) {   r1 = hash = // calculate hash } return r1;

This case actually works on single thread, but it's possible to fail with multiple threads.

It seems that the JVM are making simplifications based on

h = hash and it simplifies the use of R1, R2, R3 to single R1

Therefore, JVM does more than reordering instructions, it also seems reducing the amount of registers being used.

917

asked Sep 23 '12 17:09

HBZ

1 Answers

In your modified code:

public int hashCode() {      if (hash == 0) { // (1)          int off = offset;          char val[] = value;          int len = count;           int h = 0;          for (int i = 0; i < len; i++) {              h = 31*h + val[off++];          }          hash = h;      }      return hash; // (2)  }

(1) and (2) could be reordered: (1) could read a non null value while (2) would read 0. That can't happen in the actual implementation in the String class because the calculation is made on the local variable and the return value is also that local variable, which, by definition, is thread safe.

The issue is that the Java Memory Model provides no guarantee when a shared variable (hash) is accessed without proper synchronization - in particular it does not guarantee that all executions will be sequentially consistent. Had hash been volatile, there would be no problem with the modified code.

ps: the author of that blog, I believe, is one of the writers of the Chapter 17 (Java Memory Model) of the JLS - so I would tend to believe him anyway ;-)

UPDATE

Following the various edits / comments - let's look at the bytecode in more details with these two methods (I assume that the hashcode is always 1 to keep things simple):

public int hashcode_shared() {     if (hash == 0) { hash = 1; }     return hash; }  public int hashcode_local() {     int h = hash;     if (h == 0) { hash = h = 1; }     return h; }

The java compiler on my machine generates the following bytecode:

public int hashcode_shared();    0: aload_0                           //read this    1: getfield      #6                  //read hash (r1)    4: ifne          12                  //compare r1 with 0    7: aload_0                           //read this    8: iconst_1                          //constant 1    9: putfield      #6                  //put 1 into hash (w1)   12: aload_0                           //read this   13: getfield      #6                  //read hash (r2)   16: ireturn                           //return r2  public int hashcode_local();    0: aload_0                           //read this    1: getfield      #6                  //read hash (r1)    4: istore_1                          //store r1 in local variable h    5: iload_1                           //read h    6: ifne          16                  //compare h with 0    9: aload_0                           //read this   10: iconst_1                          //constant 1   11: dup                               //constant again   12: istore_1                          //store 1 into h   13: putfield      #6                  //store 1 into hash (w1)   16: iload_1                           //read h   17: ireturn                           //return h

In the first example, there are 2 reads of the shared variable hash: r1 and r2. As discussed above, because there is no synchronization and the variable is shared, the Java Memory Model applies and a compiler/JVM is allowed to reorder the two reads: line #13 could be inserted before line #1*.

In the second example, all the operations on h, the local variable, need to be sequentially consistent because because of intra-thread semantics and program order guarantee on non-shared variables.

Note: as always, the fact that the reordering is allowed does not mean it will be performed. It is actually unlikely to happen on current x86/hotspot combinations. But it could happen on other current or future architectures/JVM.

*That is a bit of a shortcut, what could happen in practice is that the compiler might rewrite hashcode_shared like this:

public int hashcode_shared() {     int h = hash;     if (hash != 0) return h;     return (hash = 1); }

The code is strictly equivalent in a single threaded environment (it will always return the same value as the original method) so the reordering is allowed. But in a multi-threaded environment, it is clear that if hash is changed from 0 to 1 by another thread between the first two lines, this reordered method will incorrectly return 0.

answered Sep 17 '22 16:09

assylias

Related questions
                            
                                Windows error 2 occured while loading the Java VM
                            
                                Why does this Java code with "+ +" compile?
                            
                                Deleting all the rows in a JTable
                            
                                Conversion of nanoseconds to milliseconds and nanoseconds < 999999 in Java
                            
                                Examples of good Java desktop applications [closed]
                            
                                Implementing USSD features. Binding a service to the PhoneUtils without restarting the phone on every update
                            
                                FileObserver does not work on external storage in Android 6.0 Marshmallow (API 23)
                            
                                Eclipse Maven Projects get rid of wb-resource warnings
                            
                                How to use `setResultTransformer` after Hibernate 5.2?
                            
                                Does TestNG has runner like SpringJUnit4ClassRunner
                            
                                What's behind the hashCode() method for String in Java? [duplicate]
                            
                                Why can't I drag execution point in IntelliJ (I can in Visual Studio)
                            
                                Fast scroll thumb disappears while scrolling AlphabetIndexer
                            
                                Massive memory hemorrhage that causes heap size to go from about 64mb, to 1.5gb in roughly 8 seconds. Issue with garbage collector?
                            
                                Just a little recursion issue in Java
                            
                                Why reading a volatile and writing to a field member is not scalable in Java?
                            
                                Method breakpoints may dramatically slow down debugging
                            
                                How to detect all the Devices connected in a WiFi network from Android App
                            
                                Jar hell: how to use a classloader to replace one jar library version with another at runtime
                            
                                Recommended Java library for creating a video programmatically [closed]

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Instructions reordering in Java JVM

Tags:

java

concurrency

HBZ

People also ask

1 Answers

assylias

Recent Activity

Donate For Us