What does "release sequence" mean?

Question

I don't understand, why will be problems without release sequence, if we have 2 threads in the example below. We have only 2 operations on the atomic variable count. count is decremented sequently as shown in the output.

From C++ Concurrency in Action by Antony Williams:

I mentioned that you could get a synchronizes-with relationship between a store to an atomic variable and a load of that atomic variable from another thread, even when there’s a sequence of read-modify-write operations between the store and the load, provided all the operations are suitably tagged. If the store is tagged with memory_order_release, memory_order_acq_rel, or memory_order_seq_cst, and the load is tagged with memory_order_consume, memory_order_acquire, or memory_order_seq_cst, and each operation in the chain loads the value written by the previous operation, then the chain of operations constitutes a release sequence and the initial store synchronizes-with (for memory_order_acquire or memory_order_seq_cst) or is dependency-ordered-before (for memory_order_consume) the final load. Any atomic read-modify-write operations in the chain can have any memory ordering (even memory_order_relaxed).

To see what this means (release sequence) and why it’s important, consider an atomic<int> being used as a count of the number of items in a shared queue, as in the following listing.

One way to handle things would be to have the thread that’s producingthe data store the items in a shared buffer and then do count.store(number_of_items, memory_order_release) #1 to let the other threads know that data is available. The threads consuming the queue items might then do count.fetch_sub(1,memory_ order_acquire) #2 to claim an item from the queue, prior to actually reading the shared buffer #4. Once the count becomes zero, there are no more items, and the thread must wait #3.

#include <atomic>
#include <thread>
#include <vector>
#include <iostream>
#include <mutex>

std::vector<int> queue_data;
std::atomic<int> count;
std::mutex m;
void process(int i)
{

    std::lock_guard<std::mutex> lock(m);
    std::cout << "id " << std::this_thread::get_id() << ": " << i << std::endl;
}


void populate_queue()
{
    unsigned const number_of_items = 20;
    queue_data.clear();
    for (unsigned i = 0;i<number_of_items;++i)
    {
        queue_data.push_back(i);
    }

    count.store(number_of_items, std::memory_order_release); //#1 The initial store
}

void consume_queue_items()
{
    while (true)
    {
        int item_index;
        if ((item_index = count.fetch_sub(1, std::memory_order_acquire)) <= 0) //#2 An RMW operation
        {
            std::this_thread::sleep_for(std::chrono::milliseconds(500)); //#3
            continue;
        }
        process(queue_data[item_index - 1]); //#4 Reading queue_data is safe
    }
}

int main()
{
    std::thread a(populate_queue);
    std::thread b(consume_queue_items);
    std::thread c(consume_queue_items);
    a.join();
    b.join();
    c.join();
}

output (VS2015):

id 6836: 19
id 6836: 18
id 6836: 17
id 6836: 16
id 6836: 14
id 6836: 13
id 6836: 12
id 6836: 11
id 6836: 10
id 6836: 9
id 6836: 8
id 13740: 15
id 13740: 6
id 13740: 5
id 13740: 4
id 13740: 3
id 13740: 2
id 13740: 1
id 13740: 0
id 6836: 7

If there’s one consumer thread, this is fine; the fetch_sub() is a read, with memory_order_acquire semantics, and the store had memory_order_release semantics, so the store synchronizes-with the load and the thread can read the item from the buffer.

If there are two threads reading, the second fetch_sub() will see the value written by the first and not the value written by the store. Without the rule about the release sequence, this second thread wouldn’t have a happens-before relationship with the first thread, and it wouldn’t be safe to read the shared buffer unless the first fetch_sub() also had memory_order_release semantics, which would introduce unnecessary synchronization between the two consumer threads. Without the release sequence rule or memory_order_release on the fetch_sub operations, there would be nothing to require that the stores to the queue_data were visible to the second consumer, and you would have a data race.

What does he mean? That both threads should see the value of count is 20? But in my output count is sequently decremented in threads.

Thankfully, the first fetch_sub() does participate in the release sequence, and so the store() synchronizes-with the second fetch_sub(). There’s still no synchronizes-with relationship between the two consumer threads. This is shown in figure 5.7. The dotted lines in figure 5.7 show the release sequence, and the solid lines show the happens-before relationships

pqnet · Accepted Answer

it means that the initial store is synchronized-with the final load even if the value read by the final load isn't directly the same value stored at beginning, but it is the value modified by one of the atomic instruction which could race into. A simpler example, assuming there are three threads racing which executes these instruction (assume x initialized to 0 before the race)

// Thread 1:
A;
x.store(2, memory_order_release);

// Thread 2:
B;
int n = x.fetch_add(1, memory_order_relaxed);
C;

// Thread 3:
int m = x.load(memory_order_acquire);
D;

What are the possible values read for n and m according to possible results of the race? And what are the guarantees that we have on the ordering of instructions A, B, C, and D based on what we read on m and n? For n we have two cases, either 0 or 2. For m we could read 0, 1, 2, and 3. There are six valid combinations of the two. Let's see each case:

m = 0, n = 0. We don't have any synchronizes-with relationship, thus we can't infer any happens-before relationship except for the obvious B happens-before C
m = 0, n = 2. Even though the fetch_add operation read the value written by the store, since the fetch_add has a relaxed memory ordering there is no synchronizes-with relationship between the two instruction. We can't say that A happens-before C
m = 1, n = 0. Similarly as before, since fetch_add don't have a release semantic we can't infer a synchronizes-with relationship between the fetch_add and the load operation, hence we don't know whether B happens-before D
m = 2, n = 0. The value we read with the acquire semantic load has been written with a release semantic store. We are guaranteed that the store synchronizes-with the load, hence A happens-before D
m = 2, n = 2. Same as above, the store synchronizes-with the load, hence A happens-before D. As usual, the fact that the value read from fetch_add is the same as the one stored from thread 1 do not imply any synchronization relationship.
m = 3, n = 2. In this case the data read by the load has been written by the fetch_add, and the data read by the fetch_add has been written by the store. However because fetch_add has relaxed semantic, no synchronization can be assumed between store and fetch_add and between fetch_add and load. Apparently, in this case no synchronization can be assumed, same as the case m = 0, n = 0. Here is where the release sequence concept comes in handy: the release semantic store in thread 1 will synchronize-with the acquire semantic load in thread 3 as long as the value that is being read has been written in the release sequence, which includes
1. all the stores performed later in the same thread as the release operation
2. all the atomic read-modify-write operation which read a value from the same release sequence.
In this case since fetch_add is an atomic read-modify-write operation we know that the store in thread 1 synchronizes-with the load in thread 3, and thus A happens-before D. We still can't say anything about the ordering of B and C though.

In your case you have this pseoudocode, assuming number_of_items = 2:

// Thread 1
Item[0] = ...;
Item[1] = ...;
count.store(2,memory_order_release);

// Thread 2
int i2 = 0;
while (i2 = count.fetch_sub(1,memory_order_acquire) <= 0 ) sleep();
auto x2 = Item[i2-1];
process(x2);

// Thread 3
int i3 = 0;
while (i3 = count.fetch_sub(1,memory_order_acquire) <= 0 ) sleep();
auto x3 = Item[i3-1];
process(x3);

Let's assume that the first positive value read into i2 is 2, and thus the first positive value read into i3 is 1. Since the value read from Thread 2 has been written from the store in Thread 1, the store synchronizes-with the load, and we know that Item[1] = ...; from Thread 1 happens-before auto x2 = Item[1]; in Thread 2. However the value 1 read from Thread 3 has been written by Thread 2, with fetch_sub which has no release semantic. The fetch_sub from Thread 2 thus does not synchronizes-with the fetch_sub from Thread 3, however since the fetch_sub from Thread 2 is part of the release chain of the store in Thread 1, the store in Thread 1 also synchronizes-with the fetch_sub in Thread 3, from which we know that Item[0] = ...; happens-before auto x3 = Item[0];

phön · Answer

i stumbled over the exact same question like you did. i thought i got the understanding right and then he comes in with this example and only uses std::memory_order_aquire. it was difficult to find any good information on this, but finally i found some helpful sources. the main information i was not aware of was the simple fact, that read-modify-write operations ALWAYS work on the newest/latest value, no matter what memory order given (even std::memory_order_relaxed). this ensures, that you wont have the same index two times in the example. still the ordering of operations can mix up (so you dont know which fetch_sub will happens before the other).

this is an answer of anthony williams himself stating that read-modify-write operations always work on the latest value: Concurrency: Atomic and volatile in C++11 memory model

additionally, someone asked about the fetch_sub in combination with the shared_ptr ref count. here anthony williams responded too and brings clarity into the situation with the reordering of the fetch_sub: https://groups.google.com/a/isocpp.org/forum/#!topic/std-discussion/OHv-oNSuJuk

What does "release sequence" mean?

Tags:

c++

multithreading

c++11

memory-model

stdatomic

Ivan Kush

2 Answers

pqnet

phön

Recent Activity

Donate For Us

What does "release sequence" mean?

Tags:

c++

multithreading

c++11

memory-model

stdatomic

Ivan Kush

2 Answers

pqnet

phön

Related questions

Recent Activity

Donate For Us