
What would cause _mm_setzero_si128() to SIGSEGV? [duplicate]

Possible Duplicate:
Qt, GCC, SSE and stack alignment

I am converting a simulator from TinyPTC to WxWidgets. Some graphics routines are optimized with SSE intrinsics. During the initialization of the GUI, the initial state is rendered once, and all of the SSE routines work perfectly. However, if I call them later from an event handler, I get a SIGSEGV.

At first I thought those were some weird alignment issues, but it even happens for:

__m128i zero = _mm_setzero_si128();

When I replace the SSE routines with non-optimized code, everything works fine.

I suppose the event handling happens in a different thread than the initialization. Is there anything to watch out for when using SSE from different threads? What else could possibly cause this behavior?


The SIGSEGV happens at a movdqa %xmm0, -40(%ebp) instruction (there are several of those). If I compile with -O1, the movdqa instructions are completely optimized away, and the program runs fine. It seems to be an alignment issue with the stack after all, as already pointed out in the comments.

Here is the command CodeLite generates for compilation:

g++ -c "x:/some/folder/sse.cpp" -g -O1 -Wall -std=gnu++0x -msse3
-mthreads -DHAVE_W32API_H -D__WXMSW__ -D__WXDEBUG__ -D_UNICODE
-ID:\CodeLite\wxWidgets\lib\gcc_dll\mswud -ID:\CodeLite\wxWidgets\include
-DWXUSINGDLL -Wno-ctor-dtor-privacy -pipe -fmessage-length=0 -o ./Debug/sse.o -I.

Anything unusual? Is it possible that WxWidgets changes the alignment settings somewhere?

fredoverflow asked Oct 08 '22 02:10


1 Answer

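Your stack pointer is probably misaligned. Aligned SSE memory instructions such as movdqa require their memory operand to be 16-byte aligned, and raise a general-protection fault otherwise. The fault isn't caused by the _mm_setzero_si128 intrinsic itself, which just zeroes an SSE register, but by the instruction the compiler generated to store that register back into memory on the stack. This would also fit the symptom that the code only fails from an event handler: on 32-bit Windows, callers that were not compiled by GCC (for example the system code that dispatches window events) typically guarantee only 4-byte stack alignment, while GCC assumes 16 bytes.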

First make sure you're not using an outdated version of GCC (older versions had issues with stack alignment with SSE). Then try adding the -mstackrealign option for that translation unit. It makes each function realign the stack to 16 bytes on entry, at a very small runtime cost.

See Volume 2B page 4-67 of the Intel Architectures Software Developer Manuals for more details on the movdqa instruction and the exact conditions under which it can generate exceptions.

Adam Rosenfield answered Oct 13 '22 12:10