
Is there a programmatic way to check for stack corruption?

I am working on a multithreaded embedded application. Each thread is allocated a stack sized according to its functionality. Recently we found that one of the threads corrupted its stack by defining a local array that was larger than the stack. The OS is uItron.

My solution: I registered a timer that fires every 10 ms and checks for stack corruption.

Stack corruption checking method:

1. Initialize the stack memory with some unique pattern (I use 0x5A5A5A5A).
2. From the timer, check whether the top of the stack memory is still 0x5A5A5A5A.
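The fill-and-check scheme described above might look roughly like the following. This is only a sketch under assumptions: the helper names (`stack_fill`, `stack_guard_intact`, `stack_unused_words`) and `GUARD_WORDS` are hypothetical and not part of any Itron API, and the stack is assumed to grow downward, so the words an overflow hits first are at the lowest addresses of the region.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_FILL_PATTERN 0x5A5A5A5Au
#define GUARD_WORDS        4u  /* hypothetical: words checked at the overflow end */

/* Fill the whole stack region with the pattern before the thread starts. */
static void stack_fill(uint32_t *base, size_t words)
{
    assert(base != NULL);
    for (size_t i = 0; i < words; i++) {
        base[i] = STACK_FILL_PATTERN;
    }
}

/* Return 1 if the guard words are still intact, 0 otherwise.
 * Assumes a descending stack: base[0] is the last word before an overflow. */
static int stack_guard_intact(const uint32_t *base)
{
    assert(base != NULL);
    for (size_t i = 0; i < GUARD_WORDS; i++) {
        if (base[i] != STACK_FILL_PATTERN) {
            return 0;
        }
    }
    return 1;
}

/* Count untouched words from the overflow end: a rough headroom estimate.
 * A word that happens to hold the pattern as live data is miscounted. */
static size_t stack_unused_words(const uint32_t *base, size_t words)
{
    size_t n = 0;
    assert(base != NULL);
    while (n < words && base[n] == STACK_FILL_PATTERN) {
        n++;
    }
    return n;
}
```

The timer handler would call `stack_guard_intact()` for each thread's stack. Checking several guard words rather than one makes it a little less likely that a large local array skips over the checked location, and `stack_unused_words()` can be logged during testing to see how close each thread comes to its limit before it actually overflows.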

My question: is there a better way to check for this type of corruption?

Forgot to add: the OS is Itron, the processor is an ARM9, and the compiler is not GCC (it is an ARM9-specific compiler supplied by the processor vendor). There is no built-in support for stack checking.

asked Sep 15 '09 01:09 by Alphaneo




1 Answer

ARM9 has JTAG/ETM debugging support on-die; you should be able to set up a data access watchpoint covering e.g. 64 bytes near the top of your stacks, which would then trigger a data abort, which you could catch in your program or externally.

(The hardware I work with only supports 2 read/write watchpoints, not sure if that's a limitation of the on-chip stuff or the surrounding third-party debug kit.)

This document, which is an extremely low-level description of how to interface with the JTAG functionality, suggests you read your processor's Technical Reference Manual -- and I can vouch that there's a decent amount of higher-level info in chapter 9 ("Debug Support") for the ARM946E-S r1p1 TRM.

Before you dig into understanding all this stuff (unless you're just doing it for fun/education), double-check that the hardware and software you're using won't already manage breakpoints/watchpoints for you. The concept of "watchpoint" was a bit hard to find in the debugging software we use -- it was a tab labelled "Hardware" in the add breakpoint dialog.


Another alternative: your compiler may support a command-line option to add function calls at the entry and exit points of functions (some sort of "void enterFunc(const char * callingFunc)" and "void exitFunc(const char * callingFunc)"), for function cost profiling, more accurate stack tracing, or similar. You can then write these functions to check your stack canary value.

(As an aside, in our case we actually ignore the function name that is passed in (I wish I could get the linker to strip these) and just use the processor's link register (LR) value to record where we came from. We use this for getting accurate call traces as well as profiling information; checking the stack canaries at this point would be trivial too!)
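As a rough illustration of the hook idea, the entry and exit functions could check a canary word like this. It is a sketch under assumptions: `enterFunc` and `exitFunc` use the placeholder signatures from the paragraph above, while `set_canary_word`, `g_overflow_count` and `g_last_offender` are hypothetical names. The actual hook names and arguments depend on your compiler (GCC and Clang, for instance, offer `-finstrument-functions` with `__cyg_profile_func_enter` and `__cyg_profile_func_exit`, which receive addresses rather than names).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_CANARY 0x5A5A5A5Au

/* Hypothetical: address of the canary word for the running thread.
 * A real system would update this on every context switch. */
static const volatile uint32_t *g_canary_word;

static unsigned    g_overflow_count;  /* number of failed checks so far */
static const char *g_last_offender;   /* function active at the last failed check */

void set_canary_word(const uint32_t *word)
{
    assert(word != NULL);
    g_canary_word = word;
}

/* Record a failure instead of halting, so the sketch stays testable;
 * real code would more likely log and stop the system here. */
static void check_canary(const char *func)
{
    if (g_canary_word != NULL && *g_canary_word != STACK_CANARY) {
        g_overflow_count++;
        g_last_offender = func;
    }
}

/* Hooks the compiler is assumed to call around every instrumented function. */
void enterFunc(const char *callingFunc) { check_canary(callingFunc); }
void exitFunc(const char *callingFunc)  { check_canary(callingFunc); }
```

Compared with a periodic timer, this checks at every function boundary, so the function name recorded on failure is usually close to the code that caused the overflow.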

The problem is, of course, that calling these functions changes the register and stack profiles for the functions a bit... Not much, in our experiments, but a bit. The performance implications are worse, and wherever there's a performance implication there's the chance of a behavior change in the program, which may mean you e.g. avoid triggering a deep-recursion case that you might have before...


Very late update: these days, if you have a clang+LLVM based pipeline, you may be able to use AddressSanitizer (ASan) to catch some of these. Be on the lookout for similar features in your compiler! (It's worth knowing about UBSan and the other sanitizers too.)

answered Oct 20 '22 00:10 by leander