In my scenario, I have DirtyArray
objects that are basically primitive array wrappers that set a boolean "dirty" flag when a write access happens.
public class DirtyArray {
    private byte[] data;
    public DirtyArray(byte[] data) {
        this.data = data;
    }
    private boolean dirty = false;
    public void setValue(int index, byte value) {
        dirty = true;
        data[index] = value;
    }
    public boolean isDirty() {
        return dirty;
    }
}
The dirty flag only ever goes from false to true.
I need to make this safe for concurrent use:
There are one or more threads that may modify the array (setValue).
There are one or more threads that catch the DirtyArray before it is GCed and are supposed to write it off to disk if it has been modified (isDirty).
Now, if I understand correctly, it is not safe to do this like above: effectively, from the point of view of the isDirty thread, the data[index] = value store could be reordered before the dirty = true store. Therefore, seeing isDirty() == false does not guarantee that data has not been modified.
Is this correct?
Assuming that yes, then making the dirty flag volatile should fix this problem.
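(Concretely, that is just a one-line change to the class above:)
    private volatile boolean dirty = false; // the only change: the flag becomes volatile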
However, in the following benchmark, I see a ~50x-100x slowdown when doing that.
@Benchmark
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
public void touchAll()
{
    for (int i = 0; i < numEntities; i++)
        bytes.setValue(i, ( byte ) 2);
}
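(For reference, the benchmark state is roughly the following; the field names match the snippet above, but the class name, array size, and setup shown here are only placeholders, not the actual values I use:)
@State(Scope.Benchmark)
public class DirtyArrayBenchmark { // enclosing benchmark class, name chosen for illustration
    int numEntities = 1_000_000;   // placeholder size
    DirtyArray bytes;
    @Setup
    public void setUp() {
        bytes = new DirtyArray(new byte[numEntities]);
    }
    // touchAll() from above lives here
}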
Using AtomicBoolean instead, with the memory-ordering get/set variants introduced in Java 9, I have this variant:
public class DirtyArray {
    private byte[] data;
    public DirtyArray(byte[] data) {
        this.data = data;
    }
    private AtomicBoolean dirty = new AtomicBoolean();
    public void setValue(int index, byte value) {
        if (!dirty.getPlain())
            dirty.setRelease(true);
        data[index] = value;
    }
    public boolean isDirty() {
        return dirty.getAcquire();
    }
}
which has the same performance (in the above benchmark) as the original non-volatile version.
Is this safe to do? That is, does it guarantee that when data is modified I will see isDirty() == true?
(In my understanding it should be, but only because dirty only ever goes from false to true, never back.)
Are there other variations to achieve this guarantee, possibly even allowing dirty to be reset to false, and ideally without negatively impacting performance?
I agree with the general assessment of the answers so far that the only way to guarantee consistency between the changed data array and the dirty flag is to synchronize both setValue and isDirty. The race conditions that Pak Uula pointed out are the real problem, not so much making the dirty flag visible. So basically the question I asked above is the wrong question...
For more context: this is about storing pixels for transparently cached images in https://github.com/imglib. It is used in very tight loops and taking the hit from synchronization is not really an option. The typical usage scenario is:
Pixels are written (via setValue()) into possibly many DirtyArrays. The isDirty() check happens on another thread that catches the DirtyArray before it is garbage-collected (via a PhantomReference on the holder of the DirtyArray) and, if it is dirty, writes it to disk.
My opinion now is that this should be approached on a coarser level than individual setValue() calls. There are kind of "natural" synchronization points that happen because threads switch between DirtyArrays by getting them from a ConcurrentHashMap (glossing over the details, of course), threads are in a thread pool and take jobs from a shared queue, or threads in some other way wait for each other. At these synchronization points, the effects of earlier (in program order) setValue()s must become visible. So I tend to just use the plain unsynchronized version and rely on synchronization at the coarser level.
The only thing that gives me slight headaches is that the clean-up is triggered by garbage collection, and I have to make sure that (the holder of) the DirtyArray is not collected before the coarse-level synchronization point. But I think I can make sure of that by keeping strong references and adding reachability fences if necessary.
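(A sketch of what I mean; Reference.reachabilityFence exists since Java 9, and the holder parameter and method name here are placeholders for whatever object the PhantomReference is actually registered on:)
void processJob(Object holder, DirtyArray pixels, int numEntities) {
    try {
        for (int i = 0; i < numEntities; i++)
            pixels.setValue(i, (byte) 2);   // plain, unsynchronized writes
        // ... reach the coarse synchronization point, e.g. hand the job back to the shared queue ...
    } finally {
        // keep the holder strongly reachable until here, so the PhantomReference-triggered
        // write-back cannot run before the coarse synchronization point
        java.lang.ref.Reference.reachabilityFence(holder);
    }
}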
AtomicBoolean (or any other member of the atomic family) does not guarantee synchronization with a different variable. So no, the code does not guarantee that when data is modified, you will get isDirty() == true. The only guarantee you have is that all threads always see the same value of isDirty(). In fact, neither of the options listed makes that guarantee.
The only way to make the guarantee is to have an exclusive lock on the whole code block inside the set method: the if statement together with the assignment. This can be achieved with the synchronized keyword (either on the method or on a code block) or with one of the locking mechanisms in java.util.concurrent.locks.
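For illustration, a lock-based variant could look roughly like this (a sketch; the class name is only for this example):
import java.util.concurrent.locks.ReentrantLock;
public class LockedDirtyArray {
    private final ReentrantLock lock = new ReentrantLock();
    private final byte[] data;
    private boolean dirty = false;
    public LockedDirtyArray(byte[] data) {
        this.data = data;
    }
    public void setValue(int index, byte value) {
        lock.lock();
        try {
            dirty = true;
            data[index] = value;
        } finally {
            lock.unlock();
        }
    }
    public boolean isDirty() {
        lock.lock();
        try {
            return dirty;
        } finally {
            lock.unlock();
        }
    }
}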
Several notes about your solution with AtomicBoolean.
What kind of thread-safety are you looking for? I don't see why you care that much about the read-write ordering of the dirty flag. Get/set operations on this flag might be perfectly ordered in time and still result in races.
In both your examples I see a race condition between setValue and isDirty:
1. Thread 1 calls setValue, updates the dirty flag, and is preempted in favor of Thread 2 before setting data[index] = value.
2. Thread 2 calls isDirty; it returns true, but the array data is not updated yet, because Thread 1 was preempted.
3. Thread 1 resumes and executes data[index] = value.
Changing the order of operations solves this race, but introduces another one.
public void setValue(int index, byte value) {
    data[index] = value;
    dirty = true;
}
Consider the following sequence:
1. Thread 1 calls setValue for the first time: the array is updated and dirty became true.
2. Thread 1 calls setValue again: data[1] = value got executed, and Thread 1 is preempted just before dirty = true.
3. Thread 2 sees the dirty flag, copies the array (which already contains data[1] = value) to external storage, and resets the flag.
4. Thread 1 resumes and sets dirty = true.
The race is that the array is coherent with an external copy, but marked as dirty. This race might lead to excessive copy operations.
Your use of AtomicBoolean is non-atomic. A thread could be preempted just between the conditional and the then-branch. The atomic equivalent of the if construct you used is dirty.compareAndSet(false, true).
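In code, that would be something like:
public void setValue(int index, byte value) {
    dirty.compareAndSet(false, true);   // atomically flips false -> true; no-op if already true
    data[index] = value;
}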
Actually, you don't need it this way. dirty.set(true) is sufficient, since you unconditionally set the dirty flag whenever you update any value.
But the sad story is that even AtomicBoolean.set doesn't save you from the race condition. A thread could be preempted just between setting the dirty flag and updating the data. The race condition described above doesn't depend on the atomicity of the flag update.
Since the very beginning, Java has provided a pattern exactly for this case: synchronized.
public class SynchronizedDirtyArray {
    private byte[] data;
    public SynchronizedDirtyArray(byte[] data) {
        this.data = data;
    }
    private boolean dirty = false;
    public synchronized void setValue(int index, byte value) {
        dirty = true;
        data[index] = value;
    }
    public synchronized boolean isDirty() {
        return dirty;
    }
}
synchronized ensures that no thread can interfere between dirty = true and data[index] = value.
It might be slower compared to the unsynchronized solution, but it saves you from the race conditions. Try this solution with your benchmarks.
I made my own benchmark, very simple, but revealing.
I implemented several variants of DirtyArray with different sync primitives:
DirtyArrayNoSync: the plain implementation without any sync
DirtyArraySync: using synchronized
DirtyArrayLock: using ReentrantLock
DirtyArrayVolatile: using volatile boolean
DirtyArrayAtomic: the implementation from the topic starter
DirtyArrayAtomicSet: using AtomicBoolean.set and AtomicBoolean.get
The benchmark was to call setValue for every element in an array with 100 million elements. The output is the duration in milliseconds.
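(Schematically, each measurement does something like the following; DirtyArraySync stands for whichever variant is being measured, and its constructor is assumed to take the backing byte[]:)
byte[] backing = new byte[100_000_000];
DirtyArraySync array = new DirtyArraySync(backing);
long start = System.nanoTime();
for (int i = 0; i < backing.length; i++)
    array.setValue(i, (byte) 2);
long millis = (System.nanoTime() - start) / 1_000_000;
System.out.println(DirtyArraySync.class.getName() + ": " + millis);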
Configuration: OpenJDK 11.0.6, Core i7, Windows 10 Home
org.example.sync.DirtyArrayNoSync: 112
org.example.sync.DirtyArraySync: 2222
org.example.sync.DirtyArrayLock: 16752
org.example.sync.DirtyArrayVolatile: 7555
org.example.sync.DirtyArrayAtomic: 7591
org.example.sync.DirtyArrayAtomicSet: 3066
Yes, synchronization makes it 20 times slower, but it is the second fastest solution. All others, even with AtomicBoolean, are slower.
UPDATE 2.
The benchmark for jdk-14.0.1 showed practically the same results:
org.example.sync.DirtyArrayNoSync: 102
org.example.sync.DirtyArraySync: 2323
org.example.sync.DirtyArrayLock: 16801
org.example.sync.DirtyArrayVolatile: 7942
org.example.sync.DirtyArrayAtomic: 7984
org.example.sync.DirtyArrayAtomicSet: 3320
I think that Sharon's answer has nailed it. All attempts to solve this using volatile or Atomic* classes will lead to race conditions where either isDirty() returns true before the array updates can be seen, OR array updates can be seen while isDirty() returns false.
The solution is to use locking. (In this case, primitive locking should be fastest. Pak's benchmarking seems to confirm this.)
But what I really wanted to say is that you may well be worrying too much about the concurrency overheads of locking versus atomic types / volatile. In real applications, locking is a concern if the lock is contended, either because there are lots of threads trying to use it, or because the lock is held for a long time.
In this example, the second does not apply: a lock is likely to be held for 10 or fewer machine instructions. But the flip side is that your benchmark does not look remotely realistic.
If you were going to use a DirtyArray class like that, you would be better off with a setValue method that sets multiple values in one call. Another alternative is that you could relax your requirements. For example, does your application actually require isDirty() to be correct at all times? If not, your initial solution may be good enough.
Finally, one thing that bugs me about your problem is how your application could make use of the strong consistency properties of the isDirty flag:
Suppose that you want to know if the array is dirty so that you can do something with the updated values in their current state. That implies that you need to stop the updates, but there is nothing in your class that will do that.
Suppose that you want to know if the array is dirty to determine if it (still) needs to be updated. Consider this attempt at doing this:
if (!da.isDirty()) {
da.setValue(...);
}
There is a race condition in the above. Some other thread could jump in between the test and the setValue and make its own setValue call. So you would end up with two changes to the array.
Maybe the real version of this class has a more complicated API and/or other use-cases. But if that is so, I am not convinced that lessons learned from this question will translate into the more complicated problem.
You could try to use AtomicReferenceArray (see the Javadoc). This array class gives you guarantees with respect to concurrent access.
I found a usage example, which I copy-paste here for you to try:
import java.util.concurrent.atomic.AtomicReferenceArray;

public class TestThread {
    private static String[] source = new String[10];
    private static AtomicReferenceArray<String> atomicReferenceArray
        = new AtomicReferenceArray<String>(source);

    public static void main(final String[] arguments) throws InterruptedException {
        for (int i = 0; i < atomicReferenceArray.length(); i++) {
            atomicReferenceArray.set(i, "item-2");
        }
        Thread t1 = new Thread(new Increment());
        Thread t2 = new Thread(new Compare());
        t1.start();
        t2.start();
        t1.join();
        t2.join();
    }

    static class Increment implements Runnable {
        public void run() {
            for (int i = 0; i < atomicReferenceArray.length(); i++) {
                String add = atomicReferenceArray.getAndSet(i, "item-" + (i + 1));
                System.out.println("Thread " + Thread.currentThread().getId()
                    + ", index " + i + ", value: " + add);
            }
        }
    }

    static class Compare implements Runnable {
        public void run() {
            for (int i = 0; i < atomicReferenceArray.length(); i++) {
                System.out.println("Thread " + Thread.currentThread().getId()
                    + ", index " + i + ", value: " + atomicReferenceArray.get(i));
                boolean swapped = atomicReferenceArray.compareAndSet(i, "item-2", "updated-item-2");
                System.out.println("Item swapped: " + swapped);
                if (swapped) {
                    System.out.println("Thread " + Thread.currentThread().getId()
                        + ", index " + i + ", updated-item-2");
                }
            }
        }
    }
}
Output:
Thread 9, index 0, value: item-2
Thread 10, index 0, value: item-1
Item swapped: false
Thread 10, index 1, value: item-2
Item swapped: true
Thread 9, index 1, value: updated-item-2
Thread 10, index 1, updated-item-2
Thread 10, index 2, value: item-3
Item swapped: false
Thread 10, index 3, value: item-2
Item swapped: true
Thread 10, index 3, updated-item-2
Thread 10, index 4, value: item-2
Item swapped: true
Thread 10, index 4, updated-item-2
Thread 10, index 5, value: item-2
Item swapped: true
Thread 10, index 5, updated-item-2
Thread 10, index 6, value: item-2
Thread 9, index 2, value: item-2
Item swapped: true
Thread 9, index 3, value: updated-item-2
Thread 10, index 6, updated-item-2
Thread 10, index 7, value: item-2
Thread 9, index 4, value: updated-item-2
Item swapped: true
Thread 9, index 5, value: updated-item-2
Thread 10, index 7, updated-item-2
Thread 9, index 6, value: updated-item-2
Thread 10, index 8, value: item-2
Thread 9, index 7, value: updated-item-2
Item swapped: true
Thread 9, index 8, value: updated-item-2
Thread 10, index 8, updated-item-2
Thread 9, index 9, value: item-2
Thread 10, index 9, value: item-10
Item swapped: false
It seems to me as if the dirty flag is only ever changed from false to true. Once it is set to true, it is never reset.
If, as you suggest in your question, you manage the visibility of changes to other threads by means other than synchronization and volatile, would it not suffice to synchronize only around the state change of the dirty flag, with something like this:
public void setValue(int index, byte value) {
    if (dirty) {
        data[index] = value;
    }
    else {
        synchronized (this) {
            dirty = true;
            data[index] = value;
        }
    }
}
This way, the synchronization overhead only applies to the first method call, and subsequent calls will be much faster. Just to make sure I have pointed it out: with this code, modifications of the data array are not immediately visible to other threads.
If you want to make sure that changes to the data array are visible to other threads, you might consider further access methods which do not just update a single byte but, for example, fill ranges with a given value or copy ranges from a source array, like:
void fill(byte value, int offset, int length);
void setValue(byte[] source, int offset, int length);
With such methods, you might be able to properly synchronize and propagate changes without too much extra impact on the runtime.
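A sketch of how these could be implemented on top of the synchronized variant (illustrative only; I am assuming offset refers to the target array):
public synchronized void fill(byte value, int offset, int length) {
    dirty = true;
    java.util.Arrays.fill(data, offset, offset + length, value);
}
public synchronized void setValue(byte[] source, int offset, int length) {
    dirty = true;
    System.arraycopy(source, 0, data, offset, length);
}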
You should also consider whether this level of synchronization is good enough for other purposes as well. Your benchmark is perhaps not representative of real usage, but if you are updating the content of a DirtyArray by setting one byte at a time, and you usually update several bytes at once, do you want other threads to access the object and see a partially updated byte array?
I'd consider other options.
First, you shouldn't use finalizers, cleaners or reference queues to implement such application requirements. For instance, although unlikely, someone could use a no-GC JVM instance. On a more practical note, there's no guarantee that finalizers, cleaners and reference queues will run when the application is exiting.
You should use other mechanisms, such as parallel streams and/or completable futures, to handle this. You can actually use both approaches.
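For example, a sketch of flushing dirty arrays at a well-defined point with completable futures (the flusher class and writeToDisk are placeholders, not part of the question's code):
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
class DirtyArrayFlusher {
    private final ExecutorService io = Executors.newFixedThreadPool(4);
    // called at a well-defined point (cache eviction, shutdown, ...), not from a finalizer
    CompletableFuture<Void> flush(List<DirtyArray> arrays) {
        List<CompletableFuture<Void>> writes = new ArrayList<>();
        for (DirtyArray array : arrays) {
            if (array.isDirty())
                writes.add(CompletableFuture.runAsync(() -> writeToDisk(array), io));
        }
        return CompletableFuture.allOf(writes.toArray(new CompletableFuture[0]));
    }
    private void writeToDisk(DirtyArray array) {
        // placeholder: serialize the wrapped byte[] to the backing store
    }
}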
Then, you're locking or atomically updating on every pixel operation, which will without a doubt add a significant overhead. You should definitely make it more coarse and handle a bunch of pixels at a time. Although at first it seems like many of the embarrassingly-parallelizable algorithms could be run in parallel on each pixel, the truth is that doing so may add even more overhead.
You could have something in your library that lets you configure how many pixels are worth the overhead of parallelizing the work. Perhaps even have a benchmarking method that does a binary search over the time taken to process pixels sequentially and in parallel, until it finds an amount where the total parallel time is lower than the sequential time, and perhaps the overhead is low enough to justify it, given the degree of parallelism.
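A sketch of what such a configurable cut-off could look like (names and threshold are made up; below the threshold the work stays sequential):
import java.util.function.IntUnaryOperator;
import java.util.stream.IntStream;
class PixelProcessor {
    private final int parallelThreshold; // determined empirically, e.g. by the benchmarking method described above
    PixelProcessor(int parallelThreshold) {
        this.parallelThreshold = parallelThreshold;
    }
    void process(byte[] pixels, IntUnaryOperator op) {
        if (pixels.length < parallelThreshold) {
            for (int i = 0; i < pixels.length; i++)      // too small: not worth the parallel overhead
                pixels[i] = (byte) op.applyAsInt(pixels[i]);
        } else {
            IntStream.range(0, pixels.length).parallel()
                     .forEach(i -> pixels[i] = (byte) op.applyAsInt(pixels[i]));
        }
    }
}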
In this particular case, if multiple threads target the same array, you may be a victim of false sharing, where you might see worse performance due to frequent cache-line invalidation. So your benchmark may hopefully find a value very close to the cache line size.
You can't get a better than linear speedup. An n-core machine can't run better than 1/n the original sequential time, due at least to synchronization and cache invalidation, and generally to everything else that the machine is doing. Perhaps an overhead of about 1% to 5% might be good enough, for instance, a 2-core machine running the parallel approach in 52% of the sequential time.
Anyway, I believe you should definitely rethink this approach on these two points: not relying on GC to store your data, and having coarser operations properly synchronized. I'm assuming your operations are totally independent of each other, because otherwise this isn't about embarrassingly parallel algorithms.