I have been struggling to capture a call stack from inside a Windows executable and have tried several approaches. Some examples follow. I simplified them slightly and removed the error handling to keep them readable, so they may not compile as is, but they should get the point across.
The simple way:
const int max_entries = 10;
void *entries[max_entries];
return CaptureStackBackTrace(0, max_entries, entries, 0);
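For reference, the one-call approach above could be wrapped in a complete function along these lines. This is only a sketch: the name capture_call_stack is made up for illustration, and the non-Windows branch exists only so the snippet compiles elsewhere (it captures nothing there).

```cpp
#include <cstddef>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical wrapper (name chosen for this example only).
// Stores up to max_entries return addresses in entries and returns
// how many were actually captured.
std::size_t capture_call_stack(void **entries, std::size_t max_entries) {
#ifdef _WIN32
    // Skip no frames and do not request a hash of the trace.
    return CaptureStackBackTrace(0, static_cast<DWORD>(max_entries),
                                 entries, nullptr);
#else
    // Assumption: off Windows this sketch captures nothing.
    (void)entries;
    (void)max_entries;
    return 0;
#endif
}
```

The caller owns the buffer, so the wrapper can be used with any fixed-size array such as the entries array shown above.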
The low level way:
const int max_entries = 10;
void *entries[max_entries];
void **frame = 0;
__asm { mov frame, ebp }
unsigned int i = 0;
while (frame && i < max_entries) {
    entries[i++] = frame[1];    // return address stored just above the saved EBP
    frame = (void **)frame[0];  // follow the saved EBP to the caller's frame
}
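To make the frame-pointer walk above easier to follow without inline assembly, here is a small portable model of the same loop. It is a sketch, not a real stack walker: Frame, walk_frames and demo_chain_depth are hypothetical names, and the layout assumes every function in the chain keeps a standard EBP frame, which is exactly the assumption the real loop relies on.

```cpp
#include <cstddef>

// Hypothetical model of one x86 stack frame when frame pointers are kept.
struct Frame {
    Frame *saved_ebp;       // corresponds to frame[0] in the loop above
    void  *return_address;  // corresponds to frame[1] in the loop above
};

// Follows the saved-EBP chain starting at frame and stores at most
// max_entries return addresses. Returns the number of entries stored.
std::size_t walk_frames(const Frame *frame, void **entries,
                        std::size_t max_entries) {
    std::size_t count = 0;
    while (frame != nullptr && count < max_entries) {
        entries[count++] = frame->return_address;
        frame = frame->saved_ebp;
    }
    return count;
}

// Builds a fake three-frame chain (inner -> middle -> outer) and walks it.
std::size_t demo_chain_depth() {
    int a = 0, b = 0, c = 0;  // stand-ins for code addresses
    Frame outer  = {nullptr, &a};
    Frame middle = {&outer,  &b};
    Frame inner  = {&middle, &c};
    void *entries[10];
    return walk_frames(&inner, entries, 10);
}
```

The walk stops either at a null saved pointer or when the buffer is full. If any link in a real chain does not hold a valid saved EBP, the loop follows garbage, so the model only illustrates the mechanism.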
The compatible way:
const int max_entries = 10;
void *entries[max_entries];
CONTEXT context;
RtlCaptureContext(&context);
STACKFRAME64 stack_frame;
ZeroMemory(&stack_frame, sizeof(STACKFRAME64));
stack_frame.AddrPC.Offset = context.Eip;
stack_frame.AddrPC.Mode = AddrModeFlat;
stack_frame.AddrFrame.Offset = context.Ebp;
stack_frame.AddrFrame.Mode = AddrModeFlat;
stack_frame.AddrStack.Offset = context.Esp;
stack_frame.AddrStack.Mode = AddrModeFlat;
unsigned int num_frames = 0;
// Stop once the buffer is full so entries[] is never overrun.
while (num_frames < max_entries) {
    if (!StackWalk64(IMAGE_FILE_MACHINE_I386, GetCurrentProcess(),
                     GetCurrentThread(), &stack_frame, &context, NULL,
                     SymFunctionTableAccess64, SymGetModuleBase64, NULL))
        break;
    if (stack_frame.AddrPC.Offset == 0)
        break;
    entries[num_frames++] = reinterpret_cast<void *>(stack_frame.AddrPC.Offset);
}
My problem is that all three work in an unoptimized build but not with full optimization enabled. In the optimized build I get a single broken entry and then the loops exit. In the debug build I get the full call stack, and when I later look up the symbols they are all correct.
I don't understand why this is so hard to get working in all builds when the debugger does it all the time. I can confirm that frame pointers are not omitted during code generation. To reproduce the failure, I build for debug first, then change only the optimization setting from none to full and rebuild.
Any hints to a solution will be greatly appreciated.
/Jonas
I got this working using the "compatible way" now. I use the following code to initialize the context:
#define GET_CURRENT_CONTEXT(c, contextFlags) \
    do { \
        memset(&c, 0, sizeof(CONTEXT)); \
        c.ContextFlags = contextFlags; \
        __asm call x \
        __asm x: pop eax \
        __asm mov c.Eip, eax \
        __asm mov c.Ebp, ebp \
        __asm mov c.Esp, esp \
    } while(0)
CONTEXT context;
GET_CURRENT_CONTEXT(context, CONTEXT_FULL);
I then fetch the stack using StackWalk64 as before:
void *entries[max_entries];
STACKFRAME64 stack_frame;
ZeroMemory(&stack_frame, sizeof(STACKFRAME64));
stack_frame.AddrPC.Offset = context.Eip;
stack_frame.AddrPC.Mode = AddrModeFlat;
stack_frame.AddrFrame.Offset = context.Ebp;
stack_frame.AddrFrame.Mode = AddrModeFlat;
stack_frame.AddrStack.Offset = context.Esp;
stack_frame.AddrStack.Mode = AddrModeFlat;
unsigned int num_frames = 0;
// Stop once the buffer is full so entries[] is never overrun.
while (num_frames < max_entries) {
    if (!StackWalk64(IMAGE_FILE_MACHINE_I386, GetCurrentProcess(),
                     GetCurrentThread(), &stack_frame, &context, NULL,
                     SymFunctionTableAccess64, SymGetModuleBase64, NULL))
        break;
    if (stack_frame.AddrPC.Offset == 0)
        break;
    entries[num_frames++] = reinterpret_cast<void *>(stack_frame.AddrPC.Offset);
}
I noticed that I had forgotten to clear the CONTEXT structure before passing it to RtlCaptureContext, so I tried the following, since I would prefer to use RtlCaptureContext:
CONTEXT context;
memset(&context, 0, sizeof(CONTEXT));
context.ContextFlags = CONTEXT_FULL;
RtlCaptureContext(&context);
Now RtlCaptureContext crashes, so I went back to using the GET_CURRENT_CONTEXT macro.