Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Clang's thread sanitizer warning while using std::string in a multi-threaded environment

While working with clang's thread sanitizer we noticed data race warnings. We think it's due to std::string's copy-on-write technique not being thread safe, but we could be wrong. We reduced the warning we were seeing to this code:

void test3() {
  std::unique_ptr<std::thread> thread;

  {
    auto output = make_shared<string>();
    std::string str = "test";
    thread.reset(new std::thread([str, output]() { *output += str; }));
    // The str string now goes out of scope but due to COW
    // the captured string may not have the copy of the content yet.
  }

  thread->join();
}

When compiled with thread sanitizer enabled:

clang++ -stdlib=libc++ -std=c++11 -O0 -g -fsanitize=thread -lpthread -o test main.cpp

or

clang++ -std=c++11 -O0 -g -fsanitize=thread -lpthread -o test main.cpp

And when run multiple times, it eventually produces this warning:

WARNING: ThreadSanitizer: data race (pid=30829)
  Write of size 8 at 0x7d0c0000bef8 by thread T62:
    #0 operator delete(void*) <null>:0
    ...

  Previous write of size 1 at 0x7d0c0000befd by thread T5:
    #0 std::__1::char_traits<char>::assign(char&, char const&) string:639
    ...

Is this a false positive from the thread sanitizer or is it a real data race? If the later, can it be work arounded without changing the code (e.g. by passing some flags to the compiler), is this a know bug in the string implemntation (or something else)?

UPDATE: clang --version outputs:

Ubuntu clang version 3.5-1ubuntu1 (trunk) (based on LLVM 3.5)
Target: x86_64-pc-linux-gnu
Thread model: posix

UPDATE: The cpp I use to reproduce this warning.

like image 306
Peter Jankuliak Avatar asked Jan 01 '15 10:01

Peter Jankuliak


2 Answers

[edit] Assumptions below turn out to be faulty, see link in comments. T5, not T62 is the thread spawned in the code above.

It would be useful to understand the thread ID's, but I assume that T5 is the main thread and T62 is the spawned thread. It looks like the copy is made on the main thread (before the new thread is spwaned) and destroyed on the new thread (obviously). This is safe because the new thread cannot race with the main thread before it exists.

Hence, this is a thread sanitizer bug. It failed to check whether thread T62 existed at the time of the previous write.

like image 120
MSalters Avatar answered Oct 20 '22 07:10

MSalters


This is quite tricky. I've summarized the logic in your code below:

    
    In thread T62:
        Create string s (with reference count)
        Create output_1 pointing to s in the thread storage for T62 
        Create thread T5
        Create output_2 pointing to s in the thread storage for T5
    Sync point
    In thread T5:
        Append to s   ** MODIFY **
        Thread-safe decrement of reference count for s (not a sync point)
        End of output_2 lifetime
        Exit
    In thread T62:
        Thread-safe decrement of reference count for s (not a sync point)
        End of output_1 lifetime
        Deallocate s  ** MODIFY **
        Join
    Sync point
    In thread T62:
        Destroy T5

As far as I can tell, the standard makes no guarantees about synchronization with regard to calling the shared_ptr deleter:

(20.8.2.2/4) For purposes of determining the presence of a data race, member functions shall access and modify only the shared_ptr and weak_ptr objects themselves and not objects they refer to.

I take this to mean that any modifications that do actually happen to the pointed-to object while calling a member function of the shared_ptr, such as any modifications the deleter might make, are considered to be outside the scope of the shared_ptr, and therefore it is not the responsibility of the shared_ptr to make sure they do not introduce a data race. For example, the modifications made to the string by T5 may not be visible to T62 by the time thread T62 tries to destroy it.

However, Herb Sutter, in his "Atomic<> weapons" talk, indicated he saw it as a bug to have the atomic decrement of the reference count in the shared_ptr destructor without both acquire and release semantics, but I'm not sure how it violates the standard.

like image 22
Vaughn Cato Avatar answered Oct 20 '22 06:10

Vaughn Cato