Say I have a bitmap, and several threads (running on several CPUs) are setting bits in it. No synchronization is used, and no atomic operations. Also, no bits are ever reset. To my understanding, when two threads try to set two different bits in the same word, only one operation may eventually stick. The reason is that to set a bit, the whole word has to be read and written back, so when both reads happen at the same time, one write-back overwrites the other. Is that correct?
If the above is true, is it always so for byte operations as well? Namely, if a word is 2 bytes, and each thread tries to set a different byte to 1, will they too overwrite each other when done concurrently, or do some systems support writing back the result to only a part of a word?
The reason for asking is that I am trying to figure out how much space I have to give up in order to omit synchronization in bit/byte/word-map operations.
In short, it's very CPU and compiler dependent.
Say you have a 32-bit value containing zero, and thread A wants to set bit 0 and thread B wants to set bit 1.
As you describe, these are read-modify-write operations, and the synchronization issue is 'what happens if they collide'.
The case you need to avoid is this:
A: Reads (gets 0)
B: Reads (also gets zero)
A: Logical-OR bit 0, result = 1
A: Writes 1
B: Logical-OR bit 1, result = 2
B: Writes 2 - oops, should have been 3
... when the correct result is this...
A: Reads (gets 0)
A: Logical-OR bit 0, result = 1
A: Writes 1
B: Reads (gets 1)
B: Logical-OR bit 1, result = 3
B: Writes 3 - correct
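To make the two orderings concrete, here is a minimal single-threaded sketch that replays them step by step. The function names and the local variables `a` and `b` (standing in for each thread's private register) are my own for illustration; no real threads are involved, so the result is deterministic.

```cpp
#include <cstdint>

// Thread A sets bit 0 and thread B sets bit 1 of a shared word that starts at 0.
// Each "thread" is modelled by a private copy so the interleaving is explicit.

// Colliding order: both reads happen before either write.
std::uint32_t colliding_interleaving() {
    std::uint32_t word = 0;
    std::uint32_t a = word;  // A: reads 0
    std::uint32_t b = word;  // B: also reads 0
    a |= 1u << 0;            // A: OR bit 0, result = 1
    word = a;                // A: writes 1
    b |= 1u << 1;            // B: OR bit 1, result = 2
    word = b;                // B: writes 2, so A's bit is lost
    return word;
}

// Serialized order: B reads only after A has written.
std::uint32_t serialized_interleaving() {
    std::uint32_t word = 0;
    std::uint32_t a = word;  // A: reads 0
    a |= 1u << 0;            // A: OR bit 0, result = 1
    word = a;                // A: writes 1
    std::uint32_t b = word;  // B: reads 1
    b |= 1u << 1;            // B: OR bit 1, result = 3
    word = b;                // B: writes 3, both bits survive
    return word;
}
```

The first function returns 2 and the second returns 3, matching the two traces above.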
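On some processors, the read-modify-write will be three separate instructions, so you WILL need synchronization. On others, it will be a single instruction, which is enough on a single-core system. On multi-core or multi-CPU systems it may still be a single instruction, BUT another core or CPU can access the word between the read and the write unless the instruction is explicitly locked (the x86 LOCK prefix, for example), so again you will need synchronization.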
Doing it with bytes can be the same. On some processor memory architectures the smallest unit you can write is a 32-bit memory value, so a byte update is itself a read-modify-write, as before.
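If you are writing C++11 or later, one way to trade space for simplicity is a map with one byte per flag. This is only a sketch under assumptions: `ByteFlagMap` is a hypothetical name, and it relies on the C++11 memory model, which treats each array element as its own memory location, so it is the compiler's job to make a store to one byte leave its neighbours alone. Relaxed atomic stores are used so that two threads setting the same flag is also well defined; on common hardware these typically compile to plain byte stores, but check your own target.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical flag map that spends one byte per flag instead of one bit.
class ByteFlagMap {
public:
    explicit ByteFlagMap(std::size_t count) : flags_(count) {
        // Start with every flag cleared.
        for (auto& f : flags_) f.store(0, std::memory_order_relaxed);
    }

    // Setting a flag is a plain store: no read-modify-write of neighbours.
    void set(std::size_t i) { flags_[i].store(1, std::memory_order_relaxed); }

    bool test(std::size_t i) const {
        return flags_[i].load(std::memory_order_relaxed) != 0;
    }

    std::size_t size() const { return flags_.size(); }

private:
    std::vector<std::atomic<std::uint8_t>> flags_;
};
```

The cost is eight times the memory of a packed bitmap, which is the space-versus-synchronization trade-off the question asks about.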
Update for the x86 architecture (and Windows, specifically)
Windows provides a set of atomic "Interlocked" operations on 32-bit values, including logical OR. These could be a big help to you in avoiding critical sections, but beware: as Raymond Chen points out, they don't solve everything. Keep reading that post until you understand it!
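Outside Windows, standard C++11 offers the same kind of atomic OR through `std::atomic<T>::fetch_or`. The following is a rough sketch of a packed bitmap built on it; the class name `AtomicBitmap` and its methods are made up for illustration, and the default (sequentially consistent) memory ordering is used to keep it simple rather than fast.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical packed bitmap: 32 flags per word, set with an atomic OR.
class AtomicBitmap {
public:
    explicit AtomicBitmap(std::size_t bits) : words_((bits + 31) / 32) {
        // Start with every bit cleared.
        for (auto& w : words_) w.store(0);
    }

    // Atomically sets one bit; returns true if it was already set.
    bool set(std::size_t bit) {
        const std::uint32_t mask = std::uint32_t{1} << (bit % 32);
        // fetch_or does the read, the OR and the write as one indivisible step.
        const std::uint32_t old = words_[bit / 32].fetch_or(mask);
        return (old & mask) != 0;
    }

    bool test(std::size_t bit) const {
        const std::uint32_t mask = std::uint32_t{1} << (bit % 32);
        return (words_[bit / 32].load() & mask) != 0;
    }

    // Raw value of one 32-bit word, mainly for inspection.
    std::uint32_t word(std::size_t index) const { return words_[index].load(); }

private:
    std::vector<std::atomic<std::uint32_t>> words_;
};
```

With this, thread A calling `set(0)` and thread B calling `set(1)` on the same word leave it at 3 whichever one runs first, because neither OR can slip in between the other's read and write.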
The specifics will be system-dependent, and possibly compiler-dependent. I imagine you might have to go all the way to a 32-bit integer before you are free from the effects you fear.