There are multiple threads, say B, C and D, each writing small packets of data to a buffer at a high frequency. They own their buffer and nobody else ever writes to it. Writing must be as fast as possible, and I've determined that using synchronized
makes it unacceptably slow.
The buffers are simply byte arrays, along with the index of the first free element:
byte[] buffer;
int index;
public void write(byte[] data) {
// some checking that the buffer won't overflow... not important now
System.arraycopy(data, 0, buffer, index, data.length);
index += data.length;
}
Every once in a while, thread A comes along to flush everybody's buffer to a file. It's okay if this part has some overhead, so using synchronized
here is no problem.
Now the trouble is, that some other thread might be writing to a buffer, while thread A is flushing it. This means that two threads attempt to write to index
around the same time. That would lead to data corruption, which I would like to prevent, but without using synchronized
in the write()
method.
I've got the feeling that, using the right order of operations and probably some volatile
fields, this must be possible. Any bright ideas?
Have you tried a solution which uses synchronization, and found it doesn't perform well enough? You say you've determined that it's unacceptably slow - how slow was it, and do you already have a performance budget? Normally, obtaining an uncontested lock is extremely cheap, so I wouldn't expect it to be a problem.
There may well be some clever lock-free solution - but it's likely to be significantly more complicated than just synchronizing whenever you need to access shared data. I understand that lock-free coding is all the rage, and scales beautifully when you can do it - but if you've got one thread interfering with another's data, it's very hard to do it safely. Just to be clear, I like using lock-free code when I can use high-level abstractions created by experts - things like the Parallel Extensions in .NET 4. I just don't like working with low-level abstractions like volatile variables if I can help it.
Try locking, and benchmark it. Work out what performance is acceptable, and compare the performance of a simple solution with that goal.
Of course, one option is redesigning... does the flushing have to happen actively in a different thread? Could the individual writer threads not just hand off the buffer to the flushing thread (and start a different buffer) periodically? That would make things a lot simpler.
EDIT: Regarding your "flush signal" idea - I'd been thinking along similar lines. But you need to be careful about how you do it so that the signal can't get lost even if one thread takes a long time to process whatever it's doing. I suggest you make thread A publish a "flush counter"... and each thread keeps its own counter of when it last flushed.
EDIT: Just realized this is Java, not C# - updated :)
Use AtomicLong.incrementAndGet()
to increment from thread A, and AtomicLong.get()
to read from the other threads. Then in each thread, compare whether you're "up to date", and flush if necessary:
private long lastFlush; // Last counter for our flush
private Flusher flusher; // The single flusher used by all threads
public void write(...)
{
long latestFlush = flusher.getCount(); // Will use AtomicLong.get() internally
if (latestFlush > lastFlush)
{
flusher.Flush(data);
// Do whatever else you need
lastFlush = latestFlush; // Don't use flusher.getCount() here!
}
// Now do the normal write
}
Note that this assumes you only ever need to check for flushing in the Write method. Obviously that may not be the case, but hopefully you can adapt the idea.
You can use volatile alone to safely read/write to a buffer (if you have only one writer) however, only one thread can safely flush the data. To do this you can use a ring buffer.
I would add to @Jon's comment that this is significantly more complicated to test. e.g. I had one "solution" which worked for 1 billion messages consistently one day but kept breaking the next because the box was more loaded.
With synchronized your latency should be below 2 micro-seconds. With Lock, you could get this down to 1 micro-second. with busy waiting on a volatile you can get this down to 3-6 ns per byte (The time it takes to transfer data between threads becomes important)
Note: as the volume of data increases the relative cost of the lock becomes less important. e.g. if you are typically writing 200 bytes or more I wouldn't worry about the difference.
One approach I take is to use the Exchanger with two direct ByteBuffers and avoid writing any data in the critical path (i.e. only write the data after I have processed everything and it doesn't matter so much)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With