Is volatile expensive?

People also ask

Is volatile slow?

volatile fields can be slower than non-volatile fields, because every access must go to memory rather than reuse a value held in a register. But they may be useful to avoid concurrency problems.

Is volatile safe?

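volatile is neither thread-safe nor non-thread-safe on its own. It guarantees that each individual read or write of a single field is atomic and visible to other threads, so it can be used for single thread-safe reads or single thread-safe writes. Compound operations such as count++ are still not atomic.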

What is volatile in Java?

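The volatile keyword marks a field whose value may be read and modified by different threads. A write to a volatile field is visible to every subsequent read of that field from any thread. It does not by itself make a class thread-safe, because compound operations still need synchronization or the atomic classes. The volatile keyword can be applied to fields of either primitive or reference type; for a reference, it covers the reference itself, not the object it points to.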

On Intel, an uncontended volatile read is quite cheap. Consider the following simple case:

public static long l;

public static void run() {        
    if (l == -1)
        System.exit(-1);

    if (l == -2)
        System.exit(-1);
}

Using Java 7's ability to print assembly code (the diagnostic option -XX:+PrintAssembly, which needs the hsdis disassembler plugin), the run method looks something like:

# {method} 'run2' '()V' in 'Test2'
#           [sp+0x10]  (sp of caller)
0xb396ce80: mov    %eax,-0x3000(%esp)
0xb396ce87: push   %ebp
0xb396ce88: sub    $0x8,%esp          ;*synchronization entry
                                    ; - Test2::run2@-1 (line 33)
0xb396ce8e: mov    $0xffffffff,%ecx
0xb396ce93: mov    $0xffffffff,%ebx
0xb396ce98: mov    $0x6fa2b2f0,%esi   ;   {oop('Test2')}
0xb396ce9d: mov    0x150(%esi),%ebp
0xb396cea3: mov    0x154(%esi),%edi   ;*getstatic l
                                    ; - Test2::run@0 (line 33)
0xb396cea9: cmp    %ecx,%ebp
0xb396ceab: jne    0xb396ceaf
0xb396cead: cmp    %ebx,%edi
0xb396ceaf: je     0xb396cece         ;*getstatic l
                                    ; - Test2::run@14 (line 37)
0xb396ceb1: mov    $0xfffffffe,%ecx
0xb396ceb6: mov    $0xffffffff,%ebx
0xb396cebb: cmp    %ecx,%ebp
0xb396cebd: jne    0xb396cec1
0xb396cebf: cmp    %ebx,%edi
0xb396cec1: je     0xb396ceeb         ;*return
                                    ; - Test2::run@28 (line 40)
0xb396cec3: add    $0x8,%esp
0xb396cec6: pop    %ebp
0xb396cec7: test   %eax,0xb7732000    ;   {poll_return}
;... lines removed

If you look at the two references to getstatic, the first involves a load from memory, while the second skips the load because the value is reused from the registers it was already loaded into (a long is 64 bits, so on my 32-bit laptop it occupies two registers).

If we make the l variable volatile, the resulting assembly is different.

# {method} 'run2' '()V' in 'Test2'
#           [sp+0x10]  (sp of caller)
0xb3ab9340: mov    %eax,-0x3000(%esp)
0xb3ab9347: push   %ebp
0xb3ab9348: sub    $0x8,%esp          ;*synchronization entry
                                    ; - Test2::run2@-1 (line 32)
0xb3ab934e: mov    $0xffffffff,%ecx
0xb3ab9353: mov    $0xffffffff,%ebx
0xb3ab9358: mov    $0x150,%ebp
0xb3ab935d: movsd  0x6fb7b2f0(%ebp),%xmm0  ;   {oop('Test2')}
0xb3ab9365: movd   %xmm0,%eax
0xb3ab9369: psrlq  $0x20,%xmm0
0xb3ab936e: movd   %xmm0,%edx         ;*getstatic l
                                    ; - Test2::run@0 (line 32)
0xb3ab9372: cmp    %ecx,%eax
0xb3ab9374: jne    0xb3ab9378
0xb3ab9376: cmp    %ebx,%edx
0xb3ab9378: je     0xb3ab93ac
0xb3ab937a: mov    $0xfffffffe,%ecx
0xb3ab937f: mov    $0xffffffff,%ebx
0xb3ab9384: movsd  0x6fb7b2f0(%ebp),%xmm0  ;   {oop('Test2')}
0xb3ab938c: movd   %xmm0,%ebp
0xb3ab9390: psrlq  $0x20,%xmm0
0xb3ab9395: movd   %xmm0,%edi         ;*getstatic l
                                    ; - Test2::run@14 (line 36)
0xb3ab9399: cmp    %ecx,%ebp
0xb3ab939b: jne    0xb3ab939f
0xb3ab939d: cmp    %ebx,%edi
0xb3ab939f: je     0xb3ab93ba         ;*return
;... lines removed

In this case both of the getstatic references to the variable l involve a load from memory, i.e. the value cannot be kept in a register across multiple volatile reads. To ensure that the read is atomic, the value is read from memory into an SSE (XMM) register with movsd 0x6fb7b2f0(%ebp),%xmm0, making the read operation a single instruction (from the previous example we saw that a 64-bit value would normally require two 32-bit reads on a 32-bit system).
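The practical effect of "cannot be kept in a register" is easiest to see with a stop flag. The following is a minimal sketch, not code from the original answer: the class and method names are made up for illustration, and it assumes Java 8+ for the lambda. Because the flag is volatile, the loop has to re-read it on every iteration, so the worker is guaranteed to notice the write.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; the names are illustrative only.
class VolatileStopFlag {
    // volatile: every loop iteration must re-read the field from memory
    // instead of reusing a value cached in a register.
    private static volatile boolean stop;

    // Starts a spinning worker, asks it to stop, and reports whether it exited
    // within the given timeout.
    static boolean workerStopsWithin(long timeoutMillis) {
        stop = false;
        Thread worker = new Thread(() -> {
            while (!stop) {
                // busy-wait until another thread sets the flag
            }
        });
        worker.setDaemon(true); // do not keep the JVM alive if something goes wrong
        worker.start();
        try {
            TimeUnit.MILLISECONDS.sleep(50); // let the worker spin for a while
            stop = true;                     // volatile write, visible to the worker
            worker.join(timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("worker stopped: " + workerStopsWithin(2000));
    }
}
```

Without volatile on the flag, the JIT would be allowed to hoist the read out of the loop, which is the same register reuse seen in the first assembly listing, and the worker might never stop.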

So the overall cost of a volatile read is roughly equivalent to a memory load and can be as cheap as an L1 cache access. However, if another core is writing to the volatile variable, the cache line will be invalidated, requiring a main memory or perhaps an L3 cache access. The actual cost depends heavily on the CPU architecture. Even between Intel and AMD the cache coherency protocols are different.

Generally speaking, on most modern processors a volatile load is comparable to a normal load. A volatile store takes about 1/3 the time of a monitor-enter/monitor-exit pair. This is seen on systems that are cache coherent.

To answer the OP's question, volatile writes are expensive while the reads usually are not.
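If you want to get a feel for this on your own machine, a rough timing loop like the one below can serve as a starting point. This is only a sketch: the class, field and method names are invented for illustration, and hand-rolled loops like this are easily distorted by JIT optimisations (the plain-write loop in particular may be collapsed), so treat the numbers as indicative at best and use a proper harness such as JMH for anything serious.

```java
// Hypothetical micro-benchmark sketch; names are illustrative only.
class VolatileCostSketch {
    static long plainField;
    static volatile long volatileField;
    static final int ITERATIONS = 5_000_000;

    // Times ITERATIONS stores to the plain field. The JIT is free to
    // collapse this loop, which is part of why plain stores look cheap.
    static long nanosForPlainWrites() {
        long start = System.nanoTime();
        for (int i = 0; i < ITERATIONS; i++) {
            plainField = i;
        }
        return System.nanoTime() - start;
    }

    // Times ITERATIONS stores to the volatile field; each one must be performed.
    static long nanosForVolatileWrites() {
        long start = System.nanoTime();
        for (int i = 0; i < ITERATIONS; i++) {
            volatileField = i;
        }
        return System.nanoTime() - start;
    }

    // Times ITERATIONS uncontended loads of the volatile field.
    static long nanosForVolatileReads() {
        long sum = 0;
        long start = System.nanoTime();
        for (int i = 0; i < ITERATIONS; i++) {
            sum += volatileField;
        }
        long elapsed = System.nanoTime() - start;
        plainField = sum; // keep the result live
        return elapsed;
    }

    public static void main(String[] args) {
        // Several rounds so that later ones run JIT-compiled code.
        for (int round = 0; round < 5; round++) {
            System.out.printf("plain writes: %d ns, volatile writes: %d ns, volatile reads: %d ns%n",
                    nanosForPlainWrites(), nanosForVolatileWrites(), nanosForVolatileReads());
        }
    }
}
```

Note that this only measures the uncontended case on a single thread; with another core writing to the same cache line the read cost changes, as described above.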

Does this mean that volatile read operations can be done without an explicit cache invalidation on x86, and are as fast as a normal variable read (disregarding the reordering constraints of volatile)?

Yes. Sometimes when reading a field the CPU may not even hit main memory; it can instead snoop on the other cores' caches and get the value from there (a very general explanation).

However, I second Neil's suggestion that if you have a field accessed by multiple threads you should wrap it in an AtomicReference. An AtomicReference gives roughly the same throughput for reads and writes, and it also makes it more obvious that the field will be accessed and modified by multiple threads.
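As a rough illustration of that suggestion, here is a small sketch of a shared field held in an AtomicReference. The holder class and its method names are hypothetical; only get, set and compareAndSet are the real java.util.concurrent.atomic API. get and set have the memory effects of a volatile read and write, and compareAndSet adds an atomic check-then-update that a plain volatile field cannot offer.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical holder for a value shared between threads; names are illustrative.
class SharedConfigHolder {
    private static final AtomicReference<String> current =
            new AtomicReference<>("initial");

    // Same memory effects as reading a volatile field.
    static String read() {
        return current.get();
    }

    // Same memory effects as writing a volatile field.
    static void publish(String value) {
        current.set(value);
    }

    // Atomically replaces the value only if nobody changed it in the meantime.
    // Note: compareAndSet compares references with ==, not equals().
    static boolean replaceIfUnchanged(String expected, String next) {
        return current.compareAndSet(expected, next);
    }

    public static void main(String[] args) {
        String seen = read();
        System.out.println(replaceIfUnchanged(seen, "v2")); // succeeds: value unchanged
        System.out.println(replaceIfUnchanged(seen, "v3")); // fails: 'seen' is now stale
        System.out.println(read());
    }
}
```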

Edit to answer OP's edit:

Cache coherence is a fairly complicated protocol, but in short: CPUs share memory at the granularity of cache lines, and each CPU's cache tracks the state of the lines it holds. If a CPU loads a line and no other CPU holds it, that CPU assumes it has the most up-to-date value. If another CPU then tries to load the same memory location, the CPU that already holds the line is aware of the request and shares its cached copy with the requesting CPU, which now has a copy of that memory in its own cache. (It never had to look in main memory for the value.)

There is quite a bit more to the protocol, but this gives an idea of what is going on. Also, to answer your other question: in the absence of multiple processors, volatile reads/writes can in fact be faster than with multiple processors. There are some applications that would in fact run faster concurrently on a single CPU than on multiple.

In the words of the Java Memory Model (as defined for Java 5+ in JSR 133), a write to a volatile variable happens-before every subsequent read of that same variable. This means that the compiler and JIT are forced to avoid certain optimisations, such as reordering instructions around the access within the thread or keeping the value only in a register or the local cache.
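The happens-before rule is what makes the common "volatile guard" idiom work. The sketch below is illustrative only (class and member names are made up, Java 8+ assumed): an ordinary field is written before the volatile flag, so a thread that sees the flag set is guaranteed to see the ordinary write as well.

```java
// Hypothetical demo of publication through a volatile flag; names are illustrative.
class VolatilePublication {
    private static int payload;             // deliberately NOT volatile
    private static volatile boolean ready;  // the volatile guard

    // Publishes 42 through the guard and returns what the reader thread observed.
    static int publishAndRead() {
        payload = 0;
        ready = false;
        final int[] observed = new int[1];

        Thread reader = new Thread(() -> {
            while (!ready) {
                // spin until the volatile write becomes visible
            }
            // The volatile read of 'ready' happens-after the volatile write,
            // so the earlier ordinary write to 'payload' is visible here too.
            observed[0] = payload;
        });
        reader.start();

        payload = 42;  // ordinary write ...
        ready = true;  // ... published by the volatile write that follows it

        try {
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return observed[0];
    }

    public static void main(String[] args) {
        System.out.println(publishAndRead()); // prints 42
    }
}
```

If the two writes could be reordered, or if ready were not volatile, the reader could legally observe 0 or spin forever; forbidding that is exactly the optimisation restriction described above.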

Since some optimisations are not available, the resulting code is necessarily slower than it would have been, though probably not by very much.

Nevertheless you shouldn't make a variable volatile unless you know that it will be accessed from multiple threads outside of synchronized blocks. Even then you should consider whether volatile is the best choice versus synchronized, AtomicReference and its friends, the explicit Lock classes, etc.
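To make the alternatives concrete, here is a small sketch of the same shared counter protected three ways: with synchronized, with an explicit Lock, and with one of AtomicReference's friends, AtomicLong. The class and method names are invented for illustration. A volatile long on its own would not be enough here, because count++ is a read followed by a write and volatile does not make that pair atomic.

```java
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical comparison of three ways to guard a shared counter.
class CounterChoices {
    private static final Object monitor = new Object();
    private static final ReentrantLock lock = new ReentrantLock();
    private static final AtomicLong atomicCount = new AtomicLong();
    private static long syncCount; // guarded by 'monitor'
    private static long lockCount; // guarded by 'lock'

    // Runs the given number of threads, each incrementing all three counters,
    // and returns { synchronized count, lock count, atomic count }.
    static long[] countWith(int threads, int incrementsPerThread) {
        syncCount = 0;
        lockCount = 0;
        atomicCount.set(0);

        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    synchronized (monitor) {
                        syncCount++;
                    }
                    lock.lock();
                    try {
                        lockCount++;
                    } finally {
                        lock.unlock();
                    }
                    atomicCount.incrementAndGet();
                }
            });
            workers[t].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join(); // join makes the workers' writes visible here
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new long[] { syncCount, lockCount, atomicCount.get() };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(countWith(4, 10_000))); // [40000, 40000, 40000]
    }
}
```

volatile remains a good fit when one thread writes and others only read, or when the new value does not depend on the old one.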

Accessing a volatile variable is in many ways similar to wrapping access to an ordinary variable in a synchronized block, at least as far as visibility and ordering go (it provides no mutual exclusion). For instance, access to a volatile variable prevents the compiler and CPU from reordering the instructions before and after the access, and this generally slows down execution (though I can't say by how much).

More generally, on a multi-processor system I don't see how access to a volatile variable can be done without penalty -- there must be some way to ensure a write on processor A will be synchronized to a read on processor B.



