Let's assume there is a data structure like a std::vector and a global variable int syncToken initialized to zero. Given exactly two threads, one reader and one writer, why is the following (pseudo) code valid or invalid?
void reader_thread(){
    while(1){
        if(syncToken!=0){
            while(the_vector.length()>0){
                // ... process the std::vector
            }
            syncToken = 0; // let the writer do its work
        }
        sleep(1);
    }
}
void writer_thread(){
    while(1){
        std::string data = waitAndReadDataFromSomeResource(the_resource);
        if(syncToken==0){
            the_vector.push(data);
            syncToken = 1; // would syncToken++; be a difference here?
        }
        // drop data in case we couldn't write to the vector
    }
}
Although this code is not (time-)efficient, as far as I can see it is valid, because the two threads only synchronize on the value of the global variable, in a way that should not produce undefined behaviour. The only problem would be concurrent use of the vector, but that shouldn't happen because the synchronization value only switches between zero and one, right?
UPDATE Since I made the mistake of asking just a yes/no question, I have updated it to a why question, in the hope of getting a very specific case as an answer. Judging from the answers, the question also seems to paint the wrong picture, so I'll elaborate on what my problem with the above code is.
Beforehand, I want to point out that I'm asking for a specific use case/example/proof/detailed explanation which demonstrates exactly what goes out of sync. Even C example code that makes a counter increase non-monotonically would only answer the yes/no question, not the why! I'm interested in the why. So if you provide an example which demonstrates the problem, please also explain why it happens.
By (my) definition, the above code shall be called synchronized if and only if the code within the if statement, excluding the syncToken assignment at the bottom of the if block, can only be executed by one of the two given threads at any given time.
Based on this, I'm looking for an example, perhaps assembler based, where both threads execute the if block at the same time, meaning they are out of sync, i.e. not synchronized.
As a reference, let's look at the relevant part of assembler code produced by gcc:
; just the declaration of an integer global variable on a 64bit cpu initialized to zero
syncToken:
.zero 4
.text
.globl main
.type main, @function
; writer (Cpu/Thread B): if syncToken == 0, jump not equal to label .L1
movl syncToken(%rip), %eax
testl %eax, %eax
jne .L1
; reader (Cpu/Thread A): if syncToken != 0, jump to Label L2
movl syncToken(%rip), %eax
testl %eax, %eax
je .L2
; set syncToken to be zero
movl $0, syncToken(%rip)
Now my problem is that I don't see how those instructions can get out of sync.
Assume both threads run on their own CPU core: Thread A runs on core A, Thread B on core B. The initialization is global and done before both threads begin execution, so we can ignore it and assume both threads start with syncToken = 0.
Example:
Honestly, every example I've constructed works well; it only demonstrates that I don't see how the variable could go out of sync such that both threads execute the if block concurrently. My point is: although a context switch will result in an inconsistency between %eax and the actual value of syncToken in RAM, the code should still do the right thing and simply not execute the if block when it is not the one thread allowed to run it.
UPDATE 2 It can be assumed that syncToken will only be used as in the code shown; no other function (like waitAndReadDataFromSomeResource) is allowed to use it in any way.
UPDATE 3 Let's go one step further by asking a slightly different question: Is it possible to synchronize two threads, one reader and one writer, using an int syncToken such that the threads never go out of sync by executing the if block concurrently? If yes - that's very interesting ^^ If no - why?
The basic problem is that you are assuming updates to syncToken are atomic with respect to updates to the vector, which they aren't.
On a multi-core CPU there's no guarantee that these two threads won't be running on different cores, and there's no guarantee of the order in which memory updates get written to main memory, or of when a core's cache gets refreshed from main memory.
So when the reader thread sets syncToken to zero, the writer thread may see that change before it sees the changes to the vector's memory, and could start pushing data onto an out-of-date end of the vector.
Similarly, when you set the token in the writer thread, the reader may start accessing an old version of the contents of the vector. Even more fun, depending on how the vector is implemented, the reader might see a vector header containing an old pointer to the vector's storage.
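To see this in isolation, here is a minimal sketch of the same publish-then-signal pattern; the names payload and flag are mine, standing in for the vector contents and syncToken, and the race on plain ints is deliberate (and undefined behaviour). On a weakly ordered CPU, or simply after compiler reordering, the reader can observe flag == 1 while payload is still 0; with optimisation enabled the reader may even spin forever, because nothing tells the compiler that flag can change behind its back:
#include <cstdio>
#include <thread>

int payload = 0; // stands in for the vector's contents
int flag = 0;    // stands in for syncToken

int main() {
    std::thread writer([] {
        payload = 42; // "push to the vector"
        flag = 1;     // "syncToken = 1" - nothing orders this store after the payload store
    });
    std::thread reader([] {
        while (flag == 0) { }         // spin until we "see" the token
        std::printf("%d\n", payload); // may legally print 0 instead of 42
    });
    writer.join();
    reader.join();
}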
Looking at the original code again, with the implicit memory effects annotated:
void reader_thread(){
    while(1){
        if(syncToken!=0){
            while(the_vector.length()>0){
                // ... process the std::vector
            }
            syncToken = 0; // let the writer do its work
        }
        sleep(1); // this sleep will cause a memory flush as it goes to the OS,
                  // but there's no guarantee of the order of that flush, or of
                  // the order in which the writer thread will see it
    }
}
void writer_thread(){
    while(1){
        std::string data = waitAndReadDataFromSomeResource(the_resource);
        // ^ this might cause a memory flush; on the other hand, it might not
        if(syncToken==0){
            the_vector.push(data);
            syncToken = 1; // would syncToken++; be a difference here?
        }
        // drop data in case we couldn't write to the vector
    }
}
Using syncToken++ would (in general) not help, as it performs a read/modify/write, so if the other thread happens to be modifying the variable at the same time, you could get any sort of result out of it.
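As a sketch of why (the names are mine, and the race on a plain int is itself undefined behaviour): a non-atomic increment is a separate load, add, and store, so two threads can both read the same old value and one increment is lost. The final count below is typically well under 2,000,000:
#include <cstdio>
#include <thread>

int counter = 0; // plain int, deliberately not std::atomic<int>

void hammer() {
    for (int i = 0; i < 1000000; ++i)
        ++counter; // load, add, store - not one indivisible step
}

int main() {
    std::thread a(hammer);
    std::thread b(hammer);
    a.join();
    b.join();
    std::printf("%d\n", counter); // typically much less than 2000000
}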
To be safe you need to use memory synchronisation or locks to ensure memory gets read/written in the correct order.
In this code, you would need to use a read synchronisation barrier before you read syncToken and a write synchronisation barrier before you write it.
Using the write synchronisation ensures that all memory updates up to that point are visible to main memory before any updates afterwards are - so that the_vector is appropriately updated before syncToken is set to one.
Using the read synchronisation before you read syncToken will ensure that what is in your cache is consistent with main memory.
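In C++11 terms, one way to express those barriers (a sketch only, with the surrounding names taken from the question and the loops reduced to single iterations) is to make syncToken a std::atomic<int> and pair a release store with an acquire load. Because only the writer ever stores 1 and only the reader ever stores 0, each thread touches the vector only while it "owns" the token - which in effect answers UPDATE 3: it works, but only once the int is atomic and the ordering is specified:
#include <atomic>
#include <string>
#include <vector>

std::vector<std::string> the_vector; // from the question
std::atomic<int> syncToken{0};

void reader_iteration() {
    // acquire: if we see 1, we also see every write the writer made before its release store
    if (syncToken.load(std::memory_order_acquire) != 0) {
        while (!the_vector.empty()) {
            // ... process the_vector.back() ...
            the_vector.pop_back();
        }
        // release: our vector updates are visible before the writer can see 0
        syncToken.store(0, std::memory_order_release);
    }
}

void writer_iteration(const std::string& data) {
    if (syncToken.load(std::memory_order_acquire) == 0) {
        the_vector.push_back(data);
        // release: the push_back above is visible before the reader can see 1
        syncToken.store(1, std::memory_order_release);
    }
}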
Generally this can be rather tricky to get right, and you'd be better off using mutexes or semaphores to ensure the synchronisation, unless performance is very critical.
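For comparison, a mutex-based sketch (again with placeholder names) removes the need to reason about ordering at all, because locking and unlocking a std::mutex already provide the necessary barriers:
#include <mutex>
#include <string>
#include <vector>

std::vector<std::string> the_vector;
std::mutex vec_mutex;

void writer_push(const std::string& data) {
    std::lock_guard<std::mutex> lock(vec_mutex); // lock/unlock act as the barriers
    the_vector.push_back(data);
}

bool reader_pop(std::string& out) {
    std::lock_guard<std::mutex> lock(vec_mutex);
    if (the_vector.empty())
        return false;
    out = the_vector.back();
    the_vector.pop_back();
    return true;
}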
As noted by Anders, the compiler is still free to re-order accesses to syncToken with accesses to the_vector (if it can determine what these functions do, which with std::vector it probably can) - adding memory barriers will stop this re-ordering. Making syncToken volatile will stop the compiler from optimising away or re-ordering accesses to syncToken itself, but it won't address the issues with memory coherency on a multicore system, and it won't allow you to safely do read/modify/writes to the same variable from 2 threads.
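If the token really did need a read/modify/write from both threads, the atomic version also covers that case, whereas a volatile int would not; a minimal sketch, assuming the same global:
#include <atomic>

std::atomic<int> syncToken{0};

void bump() {
    // one indivisible read-modify-write; with a volatile int this would still
    // be a separate load, add, and store, and a data race
    syncToken.fetch_add(1, std::memory_order_acq_rel);
}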