
Why does valgrind (helgrind) report "Possible data race" when a virtual function is called on my thread class?

Tags:

c++

valgrind

When I started learning the valgrind (helgrind) tool, I ran into a tedious issue that I have not been able to resolve.

In short, I create a user-defined thread class with a virtual function that is called from the thread's entry routine. In this case helgrind reports a possible data race. If I simply omit the virtual keyword, no such error is reported. Why does this happen? Is something wrong with my code? Or is there a workaround?

Below is a simple threaded application that demonstrates the issue, including the cpp file, the Makefile, and the messages that helgrind reports.

/* main.cpp */
#include <memory.h>
#include <pthread.h>

class thread_s {
public:
  pthread_t       th;
  thread_s(void);
  ~thread_s(void);
  virtual void* routine(); /* if omit virtual, no error would be generated */
  void stop(void);
};
static void* routine(void*);
int main(int, const char*[])
{
  thread_s s_v;
  pthread_create(&s_v.th, 0, routine, &s_v);
  return 0;
}
static void* routine(void* arg)
{
  thread_s *pV = reinterpret_cast<thread_s*>(arg);
  pV->routine();
  return 0;
}
void* thread_s::routine(void)
{
  return 0;
}
thread_s::thread_s(void)
{
  th = 0;
}
thread_s::~thread_s(void)
{
  stop();
}
void thread_s::stop(void)
{
  void *v = 0;
  pthread_join(th, &v);
}

=======================================

/* Makefile */
all: main test_helgrind

main: main.cpp
        g++ -o main main.cpp \
        -g -Wall -O0 \
        -lpthread

test_helgrind:
        valgrind \
                --tool=helgrind \
                ./main

clean:
        rm -f main

.PHONY: all test_helgrind clean

=======================================

g++ -o main main.cpp \
        -g -Wall -O0 \
        -lpthread
valgrind \
                --tool=helgrind \
                ./main
==7477== Helgrind, a thread error detector
==7477== Copyright (C) 2007-2010, and GNU GPL'd, by OpenWorks LLP et al.
==7477== Using Valgrind-3.6.1 and LibVEX; rerun with -h for copyright info
==7477== Command: ./main
==7477==
==7477== Thread #1 is the program's root thread
==7477==
==7477== Thread #2 was created
==7477==    at 0x4259728: clone (clone.S:111)
==7477==    by 0x40484B5: pthread_create@@GLIBC_2.1 (createthread.c:256)
==7477==    by 0x4026E2D: pthread_create_WRK (hg_intercepts.c:257)
==7477==    by 0x4026F8B: pthread_create@* (hg_intercepts.c:288)
==7477==    by 0x8048560: main (main.cpp:18)
==7477==
==7477== Possible data race during write of size 4 at 0xbeab24c8 by thread #1
==7477==    at 0x80485C9: thread_s::~thread_s() (main.cpp:35)
==7477==    by 0x8048571: main (main.cpp:17)
==7477==  This conflicts with a previous read of size 4 by thread #2
==7477==    at 0x804858B: routine(void*) (main.cpp:24)
==7477==    by 0x4026F60: mythread_wrapper (hg_intercepts.c:221)
==7477==    by 0x4047E98: start_thread (pthread_create.c:304)
==7477==    by 0x425973D: clone (clone.S:130)
==7477==
==7477==
==7477== For counts of detected and suppressed errors, rerun with: -v
==7477== Use --history-level=approx or =none to gain increased speed, at
==7477== the cost of reduced accuracy of conflicting-access information
==7477== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 1 from 1)
asked Aug 31 '12 09:08 by user1638062


2 Answers

I don't know if this is the reason for helgrind's complaint, but you have a serious problem in your program. You create a thread, passing it a pointer to a local thread_s instance (s_v in main()).

However, main() will soon return without any kind of synchronization with the thread. Nothing ensures that s_v is still alive when the thread function routine() gets hold of the pointer and uses it to call pV->routine().

See if adding the following after the pthread_create() call prevents helgrind from complaining:

pthread_join( s_v.th, NULL);

Actually, looking more closely at the helgrind output, this will almost certainly remove helgrind's complaint, since the log is pointing to the thread_s destructor as one participant in the data race.
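A minimal sketch of that suggestion, under stated assumptions. The question's destructor already calls pthread_join() through stop(), so joining in main() as well would join the same thread twice, which POSIX leaves undefined. The sketch therefore adds a flag so that the join happens only once. The names joined_thread, entry, start(), run_and_join() and the started_ and ran_ members are hypothetical and do not appear in the question. The intent is that the join completes before the destructor begins, which should give helgrind the ordering it needs.

```cpp
#include <cassert>
#include <pthread.h>

class joined_thread;
static void* entry(void* arg);

// Hypothetical variant of thread_s from the question: same shape,
// plus a flag so the thread is joined at most once.
class joined_thread {
public:
  joined_thread() : th_(), started_(false), ran_(false) {}
  virtual ~joined_thread() { stop(); }  // no-op if stop() already ran

  bool start() {
    started_ = (pthread_create(&th_, 0, entry, this) == 0);
    return started_;
  }

  void stop() {
    if (started_) {
      int rc = pthread_join(th_, 0);
      assert(rc == 0);
      (void)rc;  // keeps -Wall quiet when NDEBUG removes the assert
      started_ = false;
    }
  }

  virtual void* routine() { ran_ = true; return 0; }
  bool ran() const { return ran_; }

private:
  pthread_t th_;
  bool started_;  // touched only by the creating thread
  bool ran_;      // written by the worker, read only after the join
};

static void* entry(void* arg) {
  return static_cast<joined_thread*>(arg)->routine();
}

// Joins explicitly while the object is still fully alive, so the
// destructor runs only after the worker thread has finished.
bool run_and_join() {
  joined_thread t;
  if (!t.start()) return false;
  t.stop();
  return t.ran();
}
```

Calling run_and_join() from main() exercises the pattern. The explicit t.stop() is the important line: relying on the destructor alone to join reproduces the ordering shown in the log in the question.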

answered Sep 22 '22 16:09 by Michael Burr


In one case the vptr is written (in the destructor, per the log); in the other it is read (to dispatch the virtual call in the thread). Neither access happens with a lock held. Helgrind cannot know whether some other mechanism in your program prevents the two accesses from happening in two threads simultaneously, so it flags them. If you can guarantee that the object is not destroyed while another thread is calling a function on it, you can generate a suppression for this report.
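To illustrate why a destructor touches the vptr at all, here is a small single-threaded sketch. The names base_s, derived_s, g_seen_in_dtor and dispatch_changes_during_destruction() are hypothetical and not from the question. C++ requires that a virtual call made while a class's destructor is running resolves to that class's own override. Compilers such as g++ typically implement this by storing the class's vtable pointer into the object when the destructor starts, and that store is the kind of write helgrind reports in ~thread_s().

```cpp
#include <cassert>
#include <cstring>

// Records which override the base destructor ended up calling.
static const char* g_seen_in_dtor = 0;

struct base_s {
  // While this destructor runs, the object is treated as a base_s,
  // so name() resolves to base_s::name().
  virtual ~base_s() { g_seen_in_dtor = name(); }
  virtual const char* name() const { return "base_s"; }
};

struct derived_s : base_s {
  virtual const char* name() const { return "derived_s"; }
};

// Returns true if dispatch through a base pointer reaches derived_s
// while the object is alive, but reaches base_s during destruction.
bool dispatch_changes_during_destruction() {
  const char* before = 0;
  {
    derived_s d;
    base_s* p = &d;
    before = p->name();  // object fully alive: derived_s::name()
  }                      // ~derived_s() runs, then ~base_s()
  assert(before != 0 && g_seen_in_dtor != 0);
  return std::strcmp(before, "derived_s") == 0 &&
         std::strcmp(g_seen_in_dtor, "base_s") == 0;
}
```

In the question there is no base class, but the log still shows a 4-byte write at the start of ~thread_s(), which is consistent with the same mechanism. Without the virtual keyword the class has no vptr, so there is nothing for the destructor to write at that point.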

answered Sep 23 '22 16:09 by PlasmaHH