
Current state of drd and helgrind support for std::thread

As I transition my code to C++11, I would very much like to convert my pthread code to std::thread. However, both drd and helgrind seem to report false race conditions on very simple programs.

#include <thread>

int main(int argc, char** argv)
{
    std::thread t( []() { } );
    t.join();
    return 0;
}

A Helgrind output snippet is included at the end of this post. I get similar errors in drd. I am using gcc 4.6.1 and valgrind 3.7.0 on Ubuntu 11.10 amd64.

My questions are:

  • Sanity check: am I doing anything wrong? Are others getting similar false reports on simple std::thread programs?
  • What are current users of std::thread using to detect race-conditions?

I am reluctant to port a ton of code from pthread to std::thread until some crucial tools like helgrind/drd have caught up.

==19347== ---Thread-Announcement------------------------------------------
==19347== 
==19347== Thread #1 is the program's root thread
==19347== 
==19347== ---Thread-Announcement------------------------------------------
==19347== 
==19347== Thread #2 was created
==19347==    at 0x564C85E: clone (clone.S:77)
==19347==    by 0x4E37E7F: do_clone.constprop.3 (createthread.c:75)
==19347==    by 0x4E39604: pthread_create@@GLIBC_2.2.5 (createthread.c:256)
==19347==    by 0x4C2B3DA: pthread_create_WRK (hg_intercepts.c:255)
==19347==    by 0x4C2B55E: pthread_create@* (hg_intercepts.c:286)
==19347==    by 0x50BED02: std::thread::_M_start_thread(std::shared_ptr<std::thread::_Impl_base>) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.16)
==19347==    by 0x400D51: _ZNSt6threadC1IZ4mainEUlvE_IEEEOT_DpOT0_ (in /mnt/home/kfeng/dev/robolab/cpp/sbx/sandbox)
==19347==    by 0x400C60: main (in /mnt/home/kfeng/dev/robolab/cpp/sbx/sandbox)
==19347== 
==19347== ----------------------------------------------------------------
==19347== 
==19347== Possible data race during write of size 8 at 0x5B8E060 by thread #1
==19347== Locks held: none
==19347==    at 0x40165E: _ZNSt6thread5_ImplISt12_Bind_resultIvFZ4mainEUlvE_vEEED1Ev (in /mnt/home/kfeng/dev/robolab/cpp/sbx/sandbox)
==19347==    by 0x401895: _ZNKSt19_Sp_destroy_inplaceINSt6thread5_ImplISt12_Bind_resultIvFZ4mainEUlvE_vEEEEEclEPS6_ (in /mnt/home/kfeng/dev/robolab/cpp/sbx/sandbox)
==19347==    by 0x4016D8: _ZNSt19_Sp_counted_deleterIPNSt6thread5_ImplISt12_Bind_resultIvFZ4mainEUlvE_vEEEESt19_Sp_destroy_inplaceIS6_ESaIS6_ELN9__gnu_cxx12_Lock_policyE2EE10_M_disposeEv (in /mnt/home/kfeng/dev/robolab/cpp/sbx/sandbox)
==19347==    by 0x401B83: std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() (in /mnt/home/kfeng/dev/robolab/cpp/sbx/sandbox)
==19347==    by 0x401B3E: std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count() (in /mnt/home/kfeng/dev/robolab/cpp/sbx/sandbox)
==19347==    by 0x401A93: std::__shared_ptr<std::thread::_Impl_base, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr() (in /mnt/home/kfeng/dev/robolab/cpp/sbx/sandbox)
==19347==    by 0x401AAD: std::shared_ptr<std::thread::_Impl_base>::~shared_ptr() (in /mnt/home/kfeng/dev/robolab/cpp/sbx/sandbox)
==19347==    by 0x400D5D: _ZNSt6threadC1IZ4mainEUlvE_IEEEOT_DpOT0_ (in /mnt/home/kfeng/dev/robolab/cpp/sbx/sandbox)
==19347==    by 0x400C60: main (in /mnt/home/kfeng/dev/robolab/cpp/sbx/sandbox)
==19347== 
==19347== This conflicts with a previous read of size 8 by thread #2
==19347== Locks held: none
==19347==    at 0x50BEABE: ??? (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.16)
==19347==    by 0x4C2B547: mythread_wrapper (hg_intercepts.c:219)
==19347==    by 0x4E38EFB: start_thread (pthread_create.c:304)
==19347==    by 0x564C89C: clone (clone.S:112)
==19347== 
==19347== Address 0x5B8E060 is 32 bytes inside a block of size 64 alloc'd
==19347==    at 0x4C29059: operator new(unsigned long) (vg_replace_malloc.c:287)
==19347==    by 0x4012E9: _ZN9__gnu_cxx13new_allocatorISt23_Sp_counted_ptr_inplaceINSt6thread5_ImplISt12_Bind_resultIvFZ4mainEUlvE_vEEEESaIS8_ELNS_12_Lock_policyE2EEE8allocateEmPKv (in /mnt/home/kfeng/dev/robolab/cpp/sbx/sandbox)
==19347==    by 0x40117C: _ZNSt14__shared_countILN9__gnu_cxx12_Lock_policyE2EEC1INSt6thread5_ImplISt12_Bind_resultIvFZ4mainEUlvE_vEEEESaISA_EIS9_EEESt19_Sp_make_shared_tagPT_RKT0_DpOT1_ (in /mnt/home/kfeng/dev/robolab/cpp/sbx/sandbox)
==19347==    by 0x4010B9: _ZNSt12__shared_ptrINSt6thread5_ImplISt12_Bind_resultIvFZ4mainEUlvE_vEEEELN9__gnu_cxx12_Lock_policyE2EEC1ISaIS6_EIS5_EEESt19_Sp_make_shared_tagRKT_DpOT0_ (in /mnt/home/kfeng/dev/robolab/cpp/sbx/sandbox)
==19347==    by 0x401063: _ZNSt10shared_ptrINSt6thread5_ImplISt12_Bind_resultIvFZ4mainEUlvE_vEEEEEC1ISaIS6_EIS5_EEESt19_Sp_make_shared_tagRKT_DpOT0_ (in /mnt/home/kfeng/dev/robolab/cpp/sbx/sandbox)
==19347==    by 0x401009: _ZSt15allocate_sharedINSt6thread5_ImplISt12_Bind_resultIvFZ4mainEUlvE_vEEEESaIS6_EIS5_EESt10shared_ptrIT_ERKT0_DpOT1_ (in /mnt/home/kfeng/dev/robolab/cpp/sbx/sandbox)
==19347==    by 0x400EF7: _ZSt11make_sharedINSt6thread5_ImplISt12_Bind_resultIvFZ4mainEUlvE_vEEEEIS5_EESt10shared_ptrIT_EDpOT0_ (in /mnt/home/kfeng/dev/robolab/cpp/sbx/sandbox)
==19347==    by 0x400E17: _ZNSt6thread15_M_make_routineISt12_Bind_resultIvFZ4mainEUlvE_vEEEESt10shared_ptrINS_5_ImplIT_EEEOS7_ (in /mnt/home/kfeng/dev/robolab/cpp/sbx/sandbox)
==19347==    by 0x400D2B: _ZNSt6threadC1IZ4mainEUlvE_IEEEOT_DpOT0_ (in /mnt/home/kfeng/dev/robolab/cpp/sbx/sandbox)
==19347==    by 0x400C60: main (in /mnt/home/kfeng/dev/robolab/cpp/sbx/sandbox)
==19347== 
==19347== ----------------------------------------------------------------
==19347==
kfmfe04 asked Dec 06 '11 00:12


2 Answers

std::thread uses a shared pointer internally. What you are seeing are false positives on the reference count of that shared pointer object. You can avoid these false positives by adding the four lines of code shown below in each source file just before the C++ header include directives. Note: this only works with the version of libstdc++ included with gcc 4.6.0 or later.

#include <valgrind/drd.h>
#define _GLIBCXX_SYNCHRONIZATION_HAPPENS_BEFORE(addr) ANNOTATE_HAPPENS_BEFORE(addr)
#define _GLIBCXX_SYNCHRONIZATION_HAPPENS_AFTER(addr) ANNOTATE_HAPPENS_AFTER(addr)
#define _GLIBCXX_EXTERN_TEMPLATE -1

For more information, see also the Data Race Hunting section in the libstdc++ manual (http://gcc.gnu.org/onlinedocs/libstdc++/manual/debug.html).
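
As a rough sketch of how those four lines might sit in a source file, here they are combined with the program from the question. The `__has_include` guard and the helper name `run_empty_thread` are assumptions added for illustration and are not part of the original answer. The guard needs a newer compiler (GCC 5 or later); with older compilers the header would have to be included unconditionally.

```cpp
// The annotation macros must be defined before any C++ standard header
// is included, otherwise libstdc++ has already picked its defaults.
// Assumption: __has_include is available; it only lets the file build
// on machines without the Valgrind headers installed.
#if defined(__has_include)
#  if __has_include(<valgrind/drd.h>)
#    include <valgrind/drd.h>
#    define _GLIBCXX_SYNCHRONIZATION_HAPPENS_BEFORE(addr) ANNOTATE_HAPPENS_BEFORE(addr)
#    define _GLIBCXX_SYNCHRONIZATION_HAPPENS_AFTER(addr)  ANNOTATE_HAPPENS_AFTER(addr)
#    define _GLIBCXX_EXTERN_TEMPLATE -1
#  endif
#endif

#include <thread>

// Hypothetical helper: starts and joins an empty thread, as in the question.
// Returns true once the thread has been joined.
bool run_empty_thread()
{
    std::thread t([]() {});
    t.join();
    return !t.joinable();
}
```

Calling `run_empty_thread()` from `main` and running the binary under `valgrind --tool=drd` should then show whether the reference-count reports disappear on a given setup.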

user251384 answered Nov 10 '22 01:11


Most likely what you are seeing are false positives. I am observing a similar behaviour in my code.

Specifically, the warnings seem related to the implementation of the shared pointer class. My understanding is that on your platform (which I presume is x86/x86-64?) GCC uses optimized atomic assembly instructions in the shared pointer reference-counting machinery. The problem is that valgrind can detect errors when the POSIX primitives (locks, mutexes, etc.) are used, but it cannot cope with lower-level primitives.

What I've done so far is simply filter these warnings out of valgrind's output (you could probably write a suppression file that does the job properly).
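
The reference-counting behaviour described above can be illustrated with a minimal sketch, assuming two threads that repeatedly copy the same std::shared_ptr. The helper name `shared_count_after_threads` is hypothetical. The count is updated with atomic operations instead of a mutex, so the program is race-free, yet a tool that only tracks POSIX primitives may still flag the accesses.

```cpp
#include <memory>
#include <thread>

// Hypothetical helper: two threads copy and release the same shared_ptr.
// Returns the reference count seen after both threads have been joined.
long shared_count_after_threads()
{
    auto p = std::make_shared<int>(42);

    // Capture by reference so the lambda itself holds no extra copy.
    auto worker = [&p]() {
        for (int i = 0; i < 1000; ++i) {
            std::shared_ptr<int> local = p; // atomic increment of the count
            (void)local;                    // atomic decrement at scope exit
        }
    };

    std::thread a(worker);
    std::thread b(worker);
    a.join();
    b.join();

    return p.use_count(); // only p is left once both threads are done
}
```

No mutex or other pthread primitive is involved in the loop, which is why a checker has nothing to base a happens-before relation on unless the library is annotated.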

bluescarni answered Nov 09 '22 23:11