<h3>Code</h3> Here is the program that gives the segfault. <pre class="prettyprint"><code>#include <iostream>
#include <vector>
#include <memory>

int main()
{
    std::cout << "Hello World" << std::endl;

    std::vector<std::shared_ptr<int>> y {};

    std::cout << "Hello World" << std::endl;
}
</code></pre> Of course, there is nothing wrong with the program itself. The root cause of the segfault lies in the environment in which it is built and run. <hr> <h3>Background</h3> We, at Amazon, use a build system that builds and deploys the binaries (<code>lib</code> and <code>bin</code>) in an almost machine-independent way. In our case, that means it deploys the executable (built from the above program) into <code>$project_dir/build/bin/</code> and almost all of its dependencies (i.e. the shared libraries) into <code>$project_dir/build/lib/</code>. I say "almost" because for shared libraries such as <code>libc.so</code>, <code>libm.so</code>, <code>ld-linux-x86-64.so.2</code> and possibly a few others, the executable picks them up from the system (i.e. from <code>/lib64</code>). Note that it is supposed to pick <code>libstdc++</code> from <code>$project_dir/build/lib</code>, though. Now I run it as follows: <pre class="prettyprint"><code>$ LD_LIBRARY_PATH=$project_dir/build/lib ./build/bin/run
segmentation fault
</code></pre> However, if I run it without setting <code>LD_LIBRARY_PATH</code>, it runs fine. <hr> <h3>Diagnostics</h3> <h3>1. 
ldd</h3> Here is the <code>ldd</code> output for both cases (please note that I've edited the output to show the full version of the libraries wherever there is a difference): <pre class="prettyprint"><code>$ LD_LIBRARY_PATH=$project_dir/build/lib ldd ./build/bin/run
linux-vdso.so.1 =>  (0x00007ffce19ca000)
libstdc++.so.6 => $project_dir/build/lib/libstdc++.so.6.0.20
libgcc_s.so.1 => $project_dir/build/lib/libgcc_s.so.1
libc.so.6 => /lib64/libc.so.6
libm.so.6 => /lib64/libm.so.6
/lib64/ld-linux-x86-64.so.2 (0x0000562ec51bc000)
</code></pre> and without <code>LD_LIBRARY_PATH</code>: <pre class="prettyprint"><code>$ ldd ./build/bin/run
linux-vdso.so.1 =>  (0x00007fffcedde000)
libstdc++.so.6 => /usr/lib64/libstdc++.so.6.0.16
libgcc_s.so.1 => /lib64/libgcc_s-4.4.6-20110824.so.1
libc.so.6 => /lib64/libc.so.6
libm.so.6 => /lib64/libm.so.6
/lib64/ld-linux-x86-64.so.2 (0x0000560caff38000)
</code></pre> <h3>2. gdb when it segfaults</h3> <pre class="prettyprint"><code>Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7dea45c in _dl_fixup () from /lib64/ld-linux-x86-64.so.2
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.209.62.al12.x86_64
(gdb) bt
#0  0x00007ffff7dea45c in _dl_fixup () from /lib64/ld-linux-x86-64.so.2
#1  0x00007ffff7df0c55 in _dl_runtime_resolve () from /lib64/ld-linux-x86-64.so.2
#2  0x00007ffff7b1dc41 in std::locale::_S_initialize() () from $project_dir/build/lib/libstdc++.so.6
#3  0x00007ffff7b1dc85 in std::locale::locale() () from $project_dir/build/lib/libstdc++.so.6
#4  0x00007ffff7b1a574 in std::ios_base::Init::Init() () from $project_dir/build/lib/libstdc++.so.6
#5  0x0000000000400fde in _GLOBAL__sub_I_main () at $project_dir/build/gcc-4.9.4/include/c++/4.9.4/iostream:74
#6  0x00000000004012ed in __libc_csu_init ()
#7  0x00007ffff7518cb0 in __libc_start_main () from /lib64/libc.so.6
#8  0x0000000000401021 in _start ()
(gdb)
</code></pre> <h3>3. 
LD_DEBUG=all</h3> I also tried to inspect the dynamic linker's activity by enabling <code>LD_DEBUG=all</code> for the segfault case. I found something suspicious: it searches for the <code>pthread_once</code> symbol, and when it is unable to find it, the program segfaults (that is my interpretation of the following output snippet, BTW): <pre class="prettyprint"><code>initialize program: $project_dir/build/bin/run

symbol=_ZNSt8ios_base4InitC1Ev;  lookup in file=$project_dir/build/bin/run [0]
symbol=_ZNSt8ios_base4InitC1Ev;  lookup in file=$project_dir/build/lib/libstdc++.so.6 [0]
binding file $project_dir/build/bin/run [0] to $project_dir/build/lib/libstdc++.so.6 [0]: normal symbol `_ZNSt8ios_base4InitC1Ev' [GLIBCXX_3.4]
symbol=_ZNSt6localeC1Ev;  lookup in file=$project_dir/build/bin/run [0]
symbol=_ZNSt6localeC1Ev;  lookup in file=$project_dir/build/lib/libstdc++.so.6 [0]
binding file $project_dir/build/lib/libstdc++.so.6 [0] to $project_dir/build/lib/libstdc++.so.6 [0]: normal symbol `_ZNSt6localeC1Ev' [GLIBCXX_3.4]
symbol=pthread_once;  lookup in file=$project_dir/build/bin/run [0]
symbol=pthread_once;  lookup in file=$project_dir/build/lib/libstdc++.so.6 [0]
symbol=pthread_once;  lookup in file=$project_dir/build/lib/libgcc_s.so.1 [0]
symbol=pthread_once;  lookup in file=/lib64/libc.so.6 [0]
symbol=pthread_once;  lookup in file=/lib64/libm.so.6 [0]
symbol=pthread_once;  lookup in file=/lib64/ld-linux-x86-64.so.2 [0]
</code></pre> But I don't see any <code>pthread_once</code> lookup in the case when it runs successfully! <hr> <h3>Questions</h3> I know that it's very difficult to debug like this, and I've probably not given much information about the environment. But still, my question is: what could be the root cause of this segfault? How can I debug further and find it? Once I find the issue, the fix should be easy. <hr> <h3>Compiler and Platform</h3> I'm using GCC 4.9 on RHEL5. 
<hr> <h3>Experiments</h3> <h3>E#1</h3> If I comment out the following line: <pre class="prettyprint"><code>std::vector<std::shared_ptr<int>> y {};
</code></pre> it compiles and runs fine! <h3>E#2</h3> I added the following header to my program: <pre class="prettyprint"><code>#include <boost/filesystem.hpp>
</code></pre> and linked accordingly. Now it works without any segfault. So it seems that by having a dependency on <code>libboost_system.so.1.53.0</code>, some requirements are met, or the problem is circumvented! <h3>E#3</h3> Since it worked when the executable was linked against <code>libboost_system.so.1.53.0</code>, I did the following things step by step. Instead of using <code>#include <boost/filesystem.hpp></code> in the code itself, I used the original code and ran it with <code>libboost_system.so</code> preloaded via <code>LD_PRELOAD</code>, as follows: <pre class="prettyprint"><code>$ LD_PRELOAD=$project_dir/build/lib/libboost_system.so $project_dir/build/bin/run
</code></pre> and it ran successfully! Next I ran <code>ldd</code> on <code>libboost_system.so</code>, which gave a list of libs, two of which were: <pre class="prettyprint"><code>  /lib64/librt.so.1
  /lib64/libpthread.so.0
</code></pre> So instead of preloading <code>libboost_system</code>, I preloaded <code>librt</code> and <code>libpthread</code> separately: <pre class="prettyprint"><code>$ LD_PRELOAD=/lib64/librt.so.1 $project_dir/build/bin/run
$ LD_PRELOAD=/lib64/libpthread.so.0 $project_dir/build/bin/run
</code></pre> In both cases, it ran successfully. My conclusion is that by loading either <code>librt</code> or <code>libpthread</code> (or both), some requirements are met or the problem is circumvented! I still don't know the root cause of the issue, though. <hr> <h3>Compilation and Linking Options</h3> The build system is complex and passes lots of options by default. I tried explicitly adding <code>-lpthread</code> using CMake's <code>set</code> command, and then it worked, which is consistent with the earlier observation that preloading <code>libpthread</code> works! In order to see the build difference between these two cases (when it works and when it segfaults), I built it in verbose mode by passing <code>-v</code> to GCC, to see the compilation stages and the options it actually passes to <code>cc1plus</code> (compiler) and <code>collect2</code> (linker). (Note that paths have been edited for brevity, using a dollar sign and dummy paths.) <blockquote> $/gcc-4.9.4/cc1plus -quiet -v -I /a/include -I /b/include -iprefix $/gcc-4.9.4/ -MMD main.cpp.d -MF main.cpp.o.d -MT main.cpp.o -D_GNU_SOURCE -D_REENTRANT -D __USE_XOPEN2K8 -D _LARGEFILE_SOURCE -D _FILE_OFFSET_BITS=64 -D __STDC_FORMAT_MACROS -D __STDC_LIMIT_MACROS -D NDEBUG $/lab/main.cpp -quiet -dumpbase main.cpp -msse -mfpmath=sse -march=core2 -auxbase-strip main.cpp.o -g -O3 -Wall -Wextra -std=gnu++1y -version -fdiagnostics-color=auto -ftemplate-depth=128 -fno-operator-names -o /tmp/ccxfkRyd.s </blockquote> Whether it works or not, the command-line arguments to <code>cc1plus</code> are exactly the same, so the compilation step does not explain the difference. The difference, however, is at link time. 
Here is what I see in the case when it works: <blockquote> $/gcc-4.9.4/collect2 -plugin $/gcc-4.9.4/liblto_plugin.so -plugin-opt=$/gcc-4.9.4/lto-wrapper -plugin-opt=-fresolution=/tmp/cchl8RtI.res -plugin-opt=-pass-through=-lgcc_s -plugin-opt=-pass-through=-lgcc -plugin-opt=-pass-through=-lpthread -plugin-opt=-pass-through=-lc -plugin-opt=-pass-through=-lgcc_s -plugin-opt=-pass-through=-lgcc --eh-frame-hdr -m elf_x86_64 -export-dynamic -dynamic-linker /lib64/ld-linux-x86-64.so.2 -o run /usr/lib/../lib64/crt1.o /usr/lib/../lib64/crti.o $/gcc-4.9.4/crtbegin.o -L/a/lib -L/b/lib -L/c/lib -lpthread --as-needed main.cpp.o -lboost_timer -lboost_wave -lboost_chrono -lboost_filesystem -lboost_graph -lboost_locale -lboost_thread -lboost_wserialization -lboost_atomic -lboost_context -lboost_date_time -lboost_iostreams -lboost_math_c99 -lboost_math_c99f -lboost_math_c99l -lboost_math_tr1 -lboost_math_tr1f -lboost_math_tr1l -lboost_mpi -lboost_prg_exec_monitor -lboost_program_options -lboost_random -lboost_regex -lboost_serialization -lboost_signals -lboost_system -lboost_unit_test_framework -lboost_exception -lboost_test_exec_monitor -lbz2 -licui18n -licuuc -licudata -lz -rpath /a/lib:/b/lib:/c/lib: -lstdc++ -lm -lgcc_s -lgcc -lpthread -lc -lgcc_s -lgcc $/gcc-4.9.4/crtend.o /usr/lib/../lib64/crtn.o </blockquote> As you can see, <code>-lpthread</code> is mentioned twice! The first <code>-lpthread</code> (the one followed by <code>--as-needed</code>) is missing in the case that segfaults. That is the only difference between the two cases. <hr> <h3>Output of <code>nm -C</code> in both cases</h3> Interestingly, the output of <code>nm -C</code> is identical in both cases (if you ignore the integer values in the first column). 
<pre class="prettyprint"><code>0000000000402580 d _DYNAMIC
0000000000402798 d _GLOBAL_OFFSET_TABLE_
0000000000401000 t _GLOBAL__sub_I_main
0000000000401358 R _IO_stdin_used
                 w _ITM_deregisterTMCloneTable
                 w _ITM_registerTMCloneTable
                 w _Jv_RegisterClasses
                 U _Unwind_Resume
0000000000401150 W std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_destroy()
0000000000401170 W std::vector<std::shared_ptr<int>, std::allocator<std::shared_ptr<int> > >::~vector()
0000000000401170 W std::vector<std::shared_ptr<int>, std::allocator<std::shared_ptr<int> > >::~vector()
0000000000401250 W std::vector<std::unique_ptr<int, std::default_delete<int> >, std::allocator<std::unique_ptr<int, std::default_delete<int> > > >::~vector()
0000000000401250 W std::vector<std::unique_ptr<int, std::default_delete<int> >, std::allocator<std::unique_ptr<int, std::default_delete<int> > > >::~vector()
                 U std::ios_base::Init::Init()
                 U std::ios_base::Init::~Init()
0000000000402880 B std::cout
                 U std::basic_ostream<char, std::char_traits<char> >& std::endl<char, std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&)
0000000000402841 b std::__ioinit
                 U std::basic_ostream<char, std::char_traits<char> >& std::operator<< <std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const*)
                 U operator delete(void*)
                 U operator new(unsigned long)
0000000000401510 r __FRAME_END__
0000000000402818 d __JCR_END__
0000000000402818 d __JCR_LIST__
0000000000402820 d __TMC_END__
0000000000402820 d __TMC_LIST__
0000000000402838 A __bss_start
                 U __cxa_atexit
0000000000402808 D __data_start
0000000000401100 t __do_global_dtors_aux
0000000000402820 t __do_global_dtors_aux_fini_array_entry
0000000000402810 d __dso_handle
0000000000402828 t __frame_dummy_init_array_entry
                 w __gmon_start__
                 U __gxx_personality_v0
0000000000402838 t __init_array_end
0000000000402828 t __init_array_start
00000000004012b0 T __libc_csu_fini
00000000004012c0 T __libc_csu_init
                 U __libc_start_main
                 w __pthread_key_create
0000000000402838 A _edata
0000000000402990 A _end
000000000040134c T _fini
0000000000400e68 T _init
0000000000401028 T _start
0000000000401054 t call_gmon_start
0000000000402840 b completed.6661
0000000000402808 W data_start
0000000000401080 t deregister_tm_clones
0000000000401120 t frame_dummy
0000000000400f40 T main
00000000004010c0 t register_tm_clones
</code></pre>

This is likely a problem caused by subtle mismatches between <code>libstdc++</code> ABIs. GCC 4.9 is not the system compiler on Red Hat Enterprise Linux 5, so it's not quite clear what you are using there (DTS 3?). The locale implementation is known to be quite sensitive to ABI mismatches. See this thread on the gcc-help list: <ul> <li>Binary compatibility between an old static libstdc++ and a new dynamic one</li> <li>plus follow-ups in the next month</li> </ul> Your best bet is to figure out which bits of <code>libstdc++</code> were linked where, and somehow achieve consistency (either by hiding symbols, or by recompiling things so that they are compatible). It may also be useful to investigate the hybrid linkage model used for <code>libstdc++</code> in Red Hat's Developer Toolset (where newer bits are linked statically, but the bulk of the C++ standard library uses the existing system DSO), but the system <code>libstdc++</code> in Red Hat Enterprise Linux 5 might be too old for that if you need support for current language features.
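One way to start figuring out which <code>libstdc++</code> ends up where is to compare what the headers claim at compile time with what the dynamic loader maps at run time. The following is only a minimal sketch, assuming Linux with glibc (for <code>dl_iterate_phdr</code>); the helper names <code>loaded_object_paths</code>, <code>find_loaded_object</code> and <code>header_libstdcxx_stamp</code> are made up for illustration and are not part of any library.

```cpp
#include <link.h>      // dl_iterate_phdr, dl_phdr_info (glibc / Linux only)
#include <cstddef>
#include <string>
#include <vector>

// Callback for dl_iterate_phdr: record the path of every loaded object that
// has a non-empty name (the main executable usually reports an empty one).
static int collect_object(struct dl_phdr_info* info, std::size_t, void* data)
{
    auto* paths = static_cast<std::vector<std::string>*>(data);
    if (info->dlpi_name != nullptr && info->dlpi_name[0] != '\0')
        paths->push_back(info->dlpi_name);
    return 0; // returning 0 tells the loader to keep iterating
}

// Hypothetical helper: paths of all shared objects mapped into this process.
std::vector<std::string> loaded_object_paths()
{
    std::vector<std::string> paths;
    dl_iterate_phdr(collect_object, &paths);
    return paths;
}

// Hypothetical helper: first loaded path containing `needle`, or "" if none.
std::string find_loaded_object(const std::string& needle)
{
    for (const std::string& path : loaded_object_paths())
        if (path.find(needle) != std::string::npos)
            return path;
    return std::string();
}

// Hypothetical helper: the libstdc++ release stamp baked into the headers
// this file was compiled against (0 if another standard library is in use).
long header_libstdcxx_stamp()
{
#ifdef __GLIBCXX__
    return __GLIBCXX__;
#else
    return 0;
#endif
}
```

Printing <code>find_loaded_object("libstdc++")</code> and <code>header_libstdcxx_stamp()</code> from the affected program, once with and once without <code>LD_LIBRARY_PATH</code>, would show whether the headers and the loaded DSO come from the same toolchain. Note that a crash during static initialization (as in the backtrace in the question) may happen before <code>main</code> runs, so in that case <code>ldd</code> or <code>LD_DEBUG=libs</code> remains the more practical tool.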

Given the point of the crash, and the fact that preloading <code>libpthread</code> seems to fix it, I believe that the execution of the two cases diverges at <code>locale_init.cc:315</code>. Here is an extract of the code: <pre class="prettyprint"><code>  void
  locale::_S_initialize()
  {
#ifdef __GTHREADS
    if (__gthread_active_p())
      __gthread_once(&_S_once, _S_initialize_once);
#endif
    if (!_S_classic)
      _S_initialize_once();
  }
</code></pre> <code>__gthread_active_p()</code> returns true if your program is linked against pthread; specifically, it checks whether <code>pthread_key_create</code> is available. On my system, <code>__gthread_active_p</code> is defined in "/usr/include/c++/7.2.0/x86_64-pc-linux-gnu/bits/gthr-default.h" as <code>static inline</code>, hence it is a potential source of ODR violation. Notice that <code>LD_PRELOAD=libpthread.so</code> will always cause <code>__gthread_active_p()</code> to return true. <code>__gthread_once</code> is another inlined symbol that should always forward to <code>pthread_once</code>. It's hard to guess what's going on without debugging, but I suspect that you are hitting the true branch of <code>__gthread_active_p()</code> even when you shouldn't, and the program then crashes because there is no <code>pthread_once</code> to call. EDIT: So I did some experiments. The only way I see to get a crash in <code>std::locale::_S_initialize</code> is if <code>__gthread_active_p</code> returns true but <code>pthread_once</code> is not linked in. libstdc++ does not link directly against <code>pthread</code>, but it imports half of <code>pthread_xx</code> as weak objects, which means they can be undefined without causing a linker error. Obviously linking pthread will make the crash disappear, but if I am right, the main issue is that your <code>libstdc++</code> thinks that it is inside a multi-threaded executable even though pthread was not linked in. Now, <code>__gthread_active_p</code> uses <code>__pthread_key_create</code> to decide whether we have threads or not. 
This is defined in your executable as a weak object (it can be nullptr and still be fine). I am 99% sure that the symbol is there because of <code>shared_ptr</code> (remove it and check <code>nm</code> again to be sure). So, somehow <code>__pthread_key_create</code> gets bound to a valid address, maybe because of that last <code>-lpthread</code> in your linker flags. You can verify this theory by putting a breakpoint at <code>locale_init.cc:315</code> and checking which branch you take. EDIT2: To summarize the comments, the issue is only reproducible if all of the following hold: <ol> <li>Using <code>ld.gold</code> instead of <code>ld.bfd</code></li> <li>Using <code>--as-needed</code></li> <li>Forcing a weak definition of <code>__pthread_key_create</code>, in this case via instantiation of <code>std::shared_ptr</code>.</li> <li>Not linking to <code>pthread</code>, or linking <code>pthread</code> after <code>--as-needed</code>.</li> </ol> To answer the questions in the comments: <blockquote> Why does it use gold by default? </blockquote> By default it uses <code>/usr/bin/ld</code>, which on most distros is a symlink to either <code>/usr/bin/ld.bfd</code> or <code>/usr/bin/ld.gold</code>. This default can be changed using <code>update-alternatives</code>. I am not sure why in your case it is <code>ld.gold</code>; as far as I understand, RHEL5 ships with <code>ld.bfd</code> as the default. <blockquote> And why does gold not add pthread.so dependency to the binary if it is needed? </blockquote> Because the definition of what is needed is somewhat shady. <code>man ld</code> says (emphasis mine): <blockquote> --as-needed --no-as-needed This option affects ELF DT_NEEDED tags for dynamic libraries mentioned on the command line after the --as-needed option. Normally the linker will add a DT_NEEDED tag for each dynamic library mentioned on the command line, regardless of whether the library is actually needed or not. 
--as-needed causes a DT_NEEDED tag to only be emitted for a library that at that point in the link satisfies a non-weak undefined symbol reference from a regular object file or, if the library is not found in the DT_NEEDED lists of other needed libraries, a non-weak undefined symbol reference from another needed dynamic library. Object files or libraries appearing on the command line after the library in question do not affect whether the library is seen as needed. This is similar to the rules for extraction of object files from archives. --no-as-needed restores the default behaviour. </blockquote> Now, according to this bug report, <code>gold</code> honors the "non-weak undefined symbol" part, while <code>ld.bfd</code> treats weak symbols as needed. TBH I do not fully understand this, and there is some discussion at that link as to whether this should be considered a <code>ld.gold</code> bug or a <code>libstdc++</code> bug. <blockquote> Why do I need to mention -pthread and -lpthread both? (-pthread is passed by default by our build system, and I've passed -lpthread to make it work when gold is used). </blockquote> <code>-pthread</code> and <code>-lpthread</code> do different things (see pthread vs lpthread). It is my understanding that the former should imply the latter. Regardless, you can probably pass <code>-lpthread</code> only once, but you need to do it before <code>--as-needed</code>, or use <code>--no-as-needed</code> after the last library and before <code>-lpthread</code>. It is also worth mentioning that I was not able to reproduce this issue on my system (GCC 7.2), even using the gold linker. So I suspect that it has been fixed in a more recent version of libstdc++, which might also explain why it does not segfault if you use the system standard library.
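The weak-reference mechanism discussed above can be illustrated in isolation. This is a minimal sketch, not libstdc++'s actual code: it assumes GCC or Clang on an ELF platform (for <code>__attribute__((weak))</code>), and <code>hypothetical_optional_hook</code>, <code>hook_is_linked</code> and <code>run_once_guarded</code> are invented names standing in for the roles that <code>__pthread_key_create</code>, <code>__gthread_active_p</code> and the guarded <code>__gthread_once</code> call play in the answer.

```cpp
#include <string>

// Hypothetical optional function: declared weak and defined nowhere in this
// file. If no linked object or library defines it, the link still succeeds
// and the symbol's address is simply null at run time.
extern "C" void hypothetical_optional_hook() __attribute__((weak));

// Plays the role of a "__gthread_active_p"-style probe: is the weak symbol
// actually bound to something?
bool hook_is_linked()
{
    return &hypothetical_optional_hook != nullptr;
}

// Plays the role of the guarded call: only use the optional function when
// the probe says it is present, otherwise take the single-threaded path.
std::string run_once_guarded()
{
    if (hook_is_linked()) {
        hypothetical_optional_hook();
        return "hook";
    }
    return "fallback";
}
```

The pattern is only safe as long as the probe and the function that is eventually called agree. If the probe reports "present" based on one weak symbol while the function actually called is a different symbol that was left unresolved, the call goes through an unbound reference, which is the kind of inconsistency the answer suspects between <code>__pthread_key_create</code> and <code>pthread_once</code>.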

Segfault on declaring a variable of type vector<shared_ptr<int>>



asked Nov 09 '17 12:11

Nawaz

2 Answers


answered Sep 18 '22 13:09

Florian Weimer

Given the point of crash, and the fact that preloading libpthread seems to fix it, I believe that the execution of the two cases diverges at locale_init.cc:315. Here is an extract of the code:

  void
  locale::_S_initialize()
  {
#ifdef __GTHREADS
    if (__gthread_active_p())
      __gthread_once(&_S_once, _S_initialize_once);
#endif
    if (!_S_classic)
      _S_initialize_once();
  }

__gthread_active_p() returns true if your program is linked against pthread; specifically, it checks whether pthread_key_create is available. On my system, __gthread_active_p is defined in "/usr/include/c++/7.2.0/x86_64-pc-linux-gnu/bits/gthr-default.h" as static inline, hence it is a potential source of ODR violation.

Notice that LD_PRELOAD=libpthread.so will always cause __gthread_active_p() to return true.

__gthread_once is another inlined symbol that should always forward to pthread_once.

It's hard to guess what's going on without debugging, but I suspect that you are hitting the true branch of __gthread_active_p() even when it shouldn't, and the program then crashes because there is no pthread_once to call.
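To make the suspected failure mode concrete, here is a minimal sketch of how a weak reference can be used as an "is this library loaded?" test, and why calling through it without the test crashes. It is not the libstdc++ code: hypothetical_thread_hook is an invented symbol standing in for __pthread_key_create and pthread_once, and the weak attribute is a GCC/Clang extension, so this assumes a Linux/ELF toolchain.

```cpp
// Hypothetical optional symbol (an assumption for illustration only).
// Because the declaration is weak, the program links even when nothing
// defines it; the address then resolves to null at run time.
extern "C" void hypothetical_thread_hook() __attribute__((weak));

// Same idea as __gthread_active_p(): infer "the library is present"
// from whether the weak symbol resolved to a real address.
bool hook_available() {
    return &hypothetical_thread_hook != nullptr;
}

// Guarded dispatch, shaped like locale::_S_initialize().
// Returns true if the threaded path was taken.
// Calling hypothetical_thread_hook() without the guard while it is
// unresolved would jump to address 0 and segfault.
bool initialize() {
    if (hook_available()) {
        hypothetical_thread_hook();  // stands in for __gthread_once(...)
        return true;
    }
    return false;                    // stands in for _S_initialize_once()
}
```

In this sketch the test and the call use the same symbol, so they cannot disagree. In libstdc++ the test looks at one symbol and the call goes through another (pthread_once), which is why a stray binding of the first can lead to a crash in the second.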

EDIT: I did some experiments. The only way I see to get a crash in std::locale::_S_initialize is if __gthread_active_p returns true but pthread_once is not linked in.

libstdc++ does not link directly against pthread, but it imports half of pthread_xx as weak objects, which means they can be undefined and not cause a linker error.

Obviously linking pthread will make the crash disappear, but if I am right, the main issue is that your libstdc++ thinks that it is inside a multi-threaded executable even if we did not link pthread in.

Now, __gthread_active_p uses __pthread_key_create to decide whether we have threads or not. This is defined in your executable as a weak object (it can be nullptr and still be fine). I am 99% sure that the symbol is there because of shared_ptr (remove it and check nm again to be sure). So, somehow __pthread_key_create gets bound to a valid address, maybe because of that last -lpthread in your linker flags. You can verify this theory by putting a breakpoint at locale_init.cc:315 and checking which branch you take.

EDIT2:

Summary of the comments: the issue is reproducible only if all of the following hold:

Using ld.gold instead of ld.bfd.
Using --as-needed.
Forcing a weak definition of __pthread_key_create, in this case via instantiation of std::shared_ptr.
Not linking to pthread, or linking pthread after --as-needed.

To answer the questions in the comments:

Why does it use gold by default?

By default it uses /usr/bin/ld, which on most distros is a symlink to either /usr/bin/ld.bfd or /usr/bin/ld.gold. This default can be changed using update-alternatives. I am not sure why it is ld.gold in your case; as far as I understand, RHEL5 ships with ld.bfd as the default.

And why does gold not add pthread.so dependency to the binary if it is needed?

Because the definition of what is needed is somewhat shady. man ld says (emphasis mine):

--as-needed

--no-as-needed

This option affects ELF DT_NEEDED tags for dynamic libraries mentioned on the command line after the --as-needed option. Normally the linker will add a DT_NEEDED tag for each dynamic library mentioned on the command line, regardless of whether the library is actually needed or not. --as-needed causes a DT_NEEDED tag to only be emitted for a library that at that point in the link satisfies a non-weak undefined symbol reference from a regular object file or, if the library is not found in the DT_NEEDED lists of other needed libraries, a non-weak undefined symbol reference from another needed dynamic library. Object files or libraries appearing on the command line after the library in question do not affect whether the library is seen as needed. This is similar to the rules for extraction of object files from archives. --no-as-needed restores the default behaviour.

Now, according to this bug report, gold is honoring the "non-weak undefined symbol" part, while ld.bfd sees weak symbols as needed. TBH I do not have a full understanding of this, and there is some discussion on that link as to whether this is to be considered an ld.gold bug or a libstdc++ bug.
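The difference between the two readings of "needed" can be written down as a toy model. This is only a sketch of the rule as quoted from the man page and as described in the bug report, not code from either linker; UndefinedRef, emits_dt_needed, and the strict flag are names made up for the illustration.

```cpp
#include <string>
#include <vector>

// One undefined reference the linker has seen before reaching the library.
struct UndefinedRef {
    std::string symbol;
    bool weak;  // true for a weak reference such as __pthread_key_create
};

// Toy decision rule for a library that appears after --as-needed.
// strict == true : only non-weak references count (the man page wording,
//                  and the behaviour attributed to ld.gold above).
// strict == false: weak references count as well (the behaviour
//                  attributed to ld.bfd above).
bool emits_dt_needed(const std::vector<UndefinedRef>& refs,
                     const std::vector<std::string>& library_exports,
                     bool strict) {
    for (const UndefinedRef& ref : refs) {
        if (strict && ref.weak) continue;  // weak refs do not make a library "needed"
        for (const std::string& exported : library_exports) {
            if (exported == ref.symbol) return true;
        }
    }
    return false;
}
```

With a single weak reference to __pthread_key_create and a library that exports it, the strict rule returns false (no DT_NEEDED entry for libpthread) and the lenient rule returns true, which matches the difference observed between the two linkers in this question.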

Why do I need to mention both -pthread and -lpthread? (-pthread is passed by default by our build system, and I've passed -lpthread to make it work when gold is used.)

-pthread and -lpthread do different things (see pthread vs lpthread). It is my understanding that the former should imply the latter.

Regardless, you can probably pass -lpthread only once, but you need to do it before --as-needed, or use --no-as-needed after the last library and before -lpthread.

It is also worth mentioning that I was not able to reproduce this issue on my system (GCC 7.2), even using the gold linker. So I suspect that it has been fixed in a more recent version of libstdc++, which might also explain why it does not segfault if you use the system standard library.

answered Sep 19 '22 13:09

sbabbi

Segfault on declaring a variable of type vector<shared_ptr<int>>

Tags:

c++

gcc

segmentation-fault

redhat

ld
