Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

One occasional writer, multiple frequent readers for std::map

I am facing a concurrency problem here. I have a std::map, there is one occasional writer and multiple frequent readers from different threads, this writer will occasionally add keys (key is a std::string)to the map, and I can not guarantee when exactly the readers perform reading and stop reading. I don't want to put locks for the readers, since reading is very frequent and checking locks frequently will hurt performance.

If the readers will always access the map by keys (not map iterators), will it be always thread-safe? If not, any idea how to design the code so that the readers will always access valid keys (or map iterators )?

Other approaches using different containers solving this problem are also welcome.

like image 236
Allanqunzi Avatar asked Aug 13 '15 15:08

Allanqunzi


2 Answers

I have to disagree with the previous answer. When they talk about "concurrently accessing existing elements" (when talking about insert()), that presumes that you already have a pointer/reference/iterator to the existing element. This is basically acknowledging that the map isn't going to move the elements around in memory after the insertion. It also acknowledges that iterating the map is not safe during an insert.

Thus, as soon as you have an insert, attempting to do an at() on the same container (at the same time) is a data race. During the insert the map must change some sort of internal state (pointers to tree nodes, perhaps). If the at() catches the container during that manipulation, the pointers may not be in a consistent state.

You need some sort of external synchronization (such as a reader-writer lock) as soon as you have the possibility of both an insert() and at() (or operator[]) occurring at the same time.

like image 92
Andre Kostur Avatar answered Nov 13 '22 03:11

Andre Kostur


Attention: fundamentally edited answer

As a reflex I would put a lock.

At first sight, it seems not required to put a lock in your case:

  • For insert(), it's said that "Concurrently accessing existing elements is safe, although iterating ranges in the container is not."
  • For at() , it's said that: "Concurrently accessing or modifying other elements is safe."

The standard library addresses thread-safe aspects:

23.2.2. Container data races

1) For purposes of avoiding data races (17.6.5.9), implementations shall consider the following functions to be const: begin, end, rbegin, rend, front, back, data, find, lower_bound, upper_bound, equal_range, at and, except in associative or unordered associative containers, operator[].

2) Notwithstanding (17.6.5.9), implementations are required to avoid data races when the contents of the contained object in different elements in the same sequence, excepting vector,are modified concurrently.

There are several other SO answers which interpret this as thread-safe guarantee, as I originally did.

Nevertheless, we know that iterating ranges in the container is not safe when an insert is done. And access to an element requires before somehow iterating to find the element. So, while the standard clarifies safety for concurent access to different elements when you already have their address, the wording leaves potential container concurrency issues open.

I have tried a simulation scenario with multiple read and single write on MSVC, and it never failed. But this is not engough to make the point : implementations are allowed to avoid more data races than what is foressen in the standard (see 17.5.6.9) (or maybe I was simply many times lucky).

Finally, I have found two serious (post C++11) references stating unambiguously that a user lock is required to be safe :

  • GNU document on concurrency in the standard library: "The standard places requirements on the library to ensure that no data races are caused by the library itself (...) The user code must guard against concurrent function calls which access any particular library object's state when one or more of those accesses modifies the state."

  • GotW #95 Solution: Thread Safety and Synchronization, by Herb Sutter : "Is the code correctly synchronized (...) ? No. The code has one thread reading (via const operations) from some_obj, and a second thread writing to the same variable. If those threads can execute at the same time, that’s a race and a direct non-stop ticket to undefined behavior land."

Based on these two almost authoritative interpretations, I revise my first answer and come back to my initial reflex : you'll have to lock your concurrent accesses.

Alternatively you could use non standard-libraries with concurrent implementation of maps such as for example Microsoft's concurrent_unordered_map from the Parallel Pattern Library or Intel's concurrent_unordered_map from the Threading Building Blocks (TBB) or lock-free library as described in this SO answer

like image 45
Christophe Avatar answered Nov 13 '22 05:11

Christophe