
segfault during __cxa_allocate_exception in SWIG wrapped library

While developing a SWIG wrapped C++ library for Ruby, we came across an unexplained crash during exception handling inside the C++ code.

I'm not sure of the specific circumstances needed to reproduce the issue, but it first happened during a call to std::uncaught_exception and then, after some code changes, moved to __cxa_allocate_exception during exception construction. Neither GDB nor valgrind provided any insight into the cause of the crash.

I've found several references to similar problems, including:

  • http://wiki.fifengine.de/Segfault_in_cxa_allocate_exception
  • http://forums.fifengine.de/index.php?topic=30.0
  • http://code.google.com/p/osgswig/issues/detail?id=17
  • https://bugs.launchpad.net/ubuntu/+source/libavg/+bug/241808

The overriding theme seems to be a combination of circumstances:

  • A C application is linked to more than one C++ library
  • More than one version of libstdc++ was used during compilation
  • Generally the second version of C++ used comes from a binary-only implementation of libGL
  • The problem does not occur when linking your library with a C++ application, only with a C application

The "solution" is to explicitly link your library with libstdc++ and possibly also with libGL, forcing the order of linking.

After trying many combinations with my code, the only thing I found that works is running the script as LD_PRELOAD="libGL.so libstdc++.so.6" ruby scriptname. That is, none of the compile-time linking solutions made any difference.

My understanding of the issue is that the C++ runtime is not being properly initialized. By forcing the order of linking you bootstrap the initialization process and it works. The problem occurs only with C applications calling C++ libraries because the C application does not itself link to libstdc++ and so does not initialize the C++ runtime. SWIG (or boost::python) is a common way of calling a C++ library from a C application, which is why SWIG often comes up when researching the problem.
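To make the scenario concrete, here is a minimal sketch of a C-callable entry point in a C++ shared library. The names mylib_parse_positive and parse_positive are hypothetical and are not taken from the library in question; the point is only that the exception is thrown and caught entirely inside the library, yet throwing it still depends on the C++ runtime in the host process.

```cpp
#include <stdexcept>
#include <string>

namespace {
// Hypothetical library function that reports failure by throwing.
int parse_positive(const std::string& text) {
    int value = std::stoi(text);  // throws std::invalid_argument or std::out_of_range
    if (value <= 0) {
        throw std::domain_error("value must be positive");
    }
    return value;
}
}  // namespace

// C-callable entry point of the kind a C host process ends up calling.
// The exception never crosses the boundary, but constructing and throwing it
// still goes through the C++ runtime (__cxa_allocate_exception, __cxa_throw).
extern "C" int mylib_parse_positive(const char* text, int* out) {
    if (text == nullptr || out == nullptr) {
        return -1;
    }
    try {
        *out = parse_positive(text);
        return 0;   // success
    } catch (const std::exception&) {
        return -1;  // failure reported as a plain C error code
    }
}
```

A C caller would check the return code, for example: int v; if (mylib_parse_positive("42", &v) == 0) { /* use v */ }. When such a library is linked with gcc or ld rather than g++, libstdc++ is not pulled in automatically, which is where an explicit -lstdc++ comes in.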

Is anyone out there able to give more insight into this problem? Is there an actual solution or do only workarounds exist?

Thanks.

asked May 06 '10 03:05 by lefticus



2 Answers

Following Michael Dorgan's suggestion, I'm copying my comment into an answer:

Found the real cause of the problem; hopefully this will help someone else who runs into this bug. You probably have some static data somewhere that is not being properly initialized. We did, and in our code base the solution was in boost-log: https://sourceforge.net/projects/boost-log/forums/forum/710022/topic/3706109. The real problem is the delay-loaded library (plus statics), not the potentially multiple versions of the C++ runtime coming from different libraries. For more info: http://parashift.com/c++-faq-lite/ctors.html#faq-10.13
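One common way to avoid touching a static before it has been constructed is the construct-on-first-use idiom. The sketch below uses a hypothetical Registry type and registry() accessor; it illustrates the general technique and is not necessarily the change that was made in boost-log.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical registry standing in for any library-level static
// (a logging core, a singleton, a type table, and so on).
class Registry {
public:
    void add(const std::string& name) { names_.push_back(name); }
    std::size_t size() const { return names_.size(); }

private:
    std::vector<std::string> names_;
};

// Construct-on-first-use: the function-local static is built the first time
// registry() is called, so it is always constructed before it is used,
// regardless of the order in which other statics are initialized.
Registry& registry() {
    static Registry instance;
    return instance;
}

// Another static object can now register itself from its own constructor
// without depending on initialization order.
struct AutoRegister {
    explicit AutoRegister(const char* name) { registry().add(name); }
};

static AutoRegister register_plugin("plugin_a");
```

With a plain namespace-scope Registry object instead of the accessor, register_plugin could run before the registry's constructor if the two lived in different translation units or libraries.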

Since encountering this problem and its solution, I've learned that it's important to understand how statics are shared, or not shared, between your statically and dynamically linked libraries. On Windows this requires explicitly exporting the symbols for the shared statics (including things like singletons meant to be accessed across different libraries). The behavior differs subtly across the major platforms.
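As a rough sketch of what exporting a shared static can look like, the example below exports only an accessor function, so the static lives in exactly one module. MYLIB_API, MYLIB_BUILD, Counter and shared_counter are hypothetical names, and the non-Windows branch assumes GCC or Clang.

```cpp
// MYLIB_BUILD would normally be defined by the build system only while
// compiling the library itself; it is defined here to keep the sketch
// self-contained.
#define MYLIB_BUILD

#if defined(_WIN32)
#  if defined(MYLIB_BUILD)
#    define MYLIB_API __declspec(dllexport)  // building the library
#  else
#    define MYLIB_API __declspec(dllimport)  // consuming the library
#  endif
#else
#  define MYLIB_API __attribute__((visibility("default")))  // GCC/Clang assumed
#endif

// Hypothetical shared state that several modules need to see as one object.
struct Counter {
    int hits = 0;
};

// Only the accessor is exported. Every module that calls it reaches the
// single instance owned by the library that defines shared_counter().
MYLIB_API Counter& shared_counter();

Counter& shared_counter() {
    static Counter instance;
    return instance;
}

// Example client code: increments and returns the shared hit count.
int record_hit() {
    return ++shared_counter().hits;
}
```

Client modules include the declaration without MYLIB_BUILD defined, so on Windows they import the accessor instead of defining their own copy of the static.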

answered Sep 29 '22 11:09 by lefticus


I recently ran into this problem as well. My build produces a shared object module that is used as a Python C++ extension. A recent OS upgrade from RHEL 6.4 to 6.5 exposed the problem.

Following the tips here, I merely added -lstdc++ to my link switches and that solved the problem.

answered Sep 29 '22 12:09 by John Boia