
Pointer randomly assigned mysterious values (A5A5A5A5 and FFFFFFFF) on a stm32 using freeRTOS causing hardfault

I have a problem with a hardfault that appears at seemingly random times, where a pointer ends up pointing to address A5A5A5A5 or FFFFFFFF (my allowed memory space is far below that, at 80000000 and up). It always seems to be the same pointer, with one of these two values.

I'm using an embedded system running an STM32F205RE processor, which communicates with an FM/Bluetooth/GPS chip called cg2900. That is where this error occurs.

Using a debugger I can see that the pointer holds A5A5A5A5 or FFFFFFFF, depending on the test run. However, it seems to happen at random times: sometimes I can run the test for an hour without a failure, while other times it crashes 20 seconds in.

I'm running FreeRTOS as a scheduler to switch between different tasks (one for radio, one for Bluetooth, one for other periodic maintenance), and these tasks might interfere with each other somehow.

What could be the cause of this? As it's running on custom hardware, a hardware issue cannot be ruled out. Any pointers (no pun intended) on how to approach debugging the issue?

EDIT:

After further investigation, it seems that the crash location is quite random and not limited to that specific pointer. I used a hardfault handler to capture the following register values (all values in hex):

Semi-long run before crash (minutes):

R0 = 1
R1 = fffffffd
R2 = 20000400
R3 = 20007f7c
R12 = 7
LR [R14] = 200000c8  subroutine call return address
PC [R15] = 1010101  program counter
PSR = 8013d0f
BFAR = e000ed38
CFSR = 10000
HFSR = 40000000
DFSR = 0
AFSR = 0
SCB_SHCSR = 0

Very short run before crash (seconds):

R0 = 40026088
R1 = fffffff1
R2 = cb3
R3 = 1
R12 = 34d
LR [R14] = 40026088  subroutine call return address
PC [R15] = a5a5a5a5  program counter
PSR = fffffffd
BFAR = e000ed38
CFSR = 100
HFSR = 40000000
DFSR = 0
AFSR = 0
SCB_SHCSR = 0

Another short one (seconds):

R0 = 0
R1 = fffffffd
R2 = 20000400
R3 = 20007f7c
R12 = 7
LR [R14] = 200000c8  subroutine call return address
PC [R15] = 1010101  program counter
PSR = 8013d0f
BFAR = e000ed38
CFSR = 1
HFSR = 40000000
DFSR = 0
AFSR = 0
SCB_SHCSR = 0

After a very long run (1hour +):

R0 = e80000d0
R1 = fffffffd
R2 = 20000400
R3 = 2000877c
R12 = 7
LR [R14] = 200000c8  subroutine call return address
PC [R15] = 1010101  program counter
PSR = 8013d0f
BFAR = 200400d4
CFSR = 8200
HFSR = 40000000
DFSR = 0
AFSR = 0
SCB_SHCSR = 0

Seems to crash at the same point most of the time. I adjusted the memory according to previous suggestions but I still seem to have the same issue.
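For reading the CFSR values in the dumps above, a small decoder may help. This is a minimal sketch that can be compiled on a PC, assuming the standard Cortex-M3 CFSR bit layout (MemManage flags in bits 0-7, BusFault flags in bits 8-15, UsageFault flags in bits 16-31). The names cfsr_flag and cfsr_decode are made up for this illustration and are not part of any library.

```c
#include <stdint.h>
#include <stdio.h>

/* One named bit of the Cortex-M3 Configurable Fault Status Register (CFSR). */
struct cfsr_flag {
    uint32_t mask;
    const char *name;
};

/* Bit positions as documented for the Cortex-M3 System Control Block. */
static const struct cfsr_flag cfsr_flags[] = {
    { 1u << 0,  "IACCVIOL"    },  /* MemManage: instruction access violation */
    { 1u << 1,  "DACCVIOL"    },  /* MemManage: data access violation */
    { 1u << 3,  "MUNSTKERR"   },  /* MemManage: fault on exception return unstacking */
    { 1u << 4,  "MSTKERR"     },  /* MemManage: fault on exception entry stacking */
    { 1u << 7,  "MMARVALID"   },  /* MMFAR holds a valid address */
    { 1u << 8,  "IBUSERR"     },  /* BusFault: instruction fetch error */
    { 1u << 9,  "PRECISERR"   },  /* BusFault: precise data bus error */
    { 1u << 10, "IMPRECISERR" },  /* BusFault: imprecise data bus error */
    { 1u << 11, "UNSTKERR"    },  /* BusFault: fault on exception return unstacking */
    { 1u << 12, "STKERR"      },  /* BusFault: fault on exception entry stacking */
    { 1u << 15, "BFARVALID"   },  /* BFAR holds a valid address */
    { 1u << 16, "UNDEFINSTR"  },  /* UsageFault: undefined instruction */
    { 1u << 17, "INVSTATE"    },  /* UsageFault: invalid EPSR state */
    { 1u << 18, "INVPC"       },  /* UsageFault: invalid PC load */
    { 1u << 19, "NOCP"        },  /* UsageFault: no coprocessor */
    { 1u << 24, "UNALIGNED"   },  /* UsageFault: unaligned access */
    { 1u << 25, "DIVBYZERO"   },  /* UsageFault: divide by zero */
};

/* Writes the names of all set flags into out, separated by '|'.
 * Stops early if the buffer is too small. Returns out. */
const char *cfsr_decode(uint32_t cfsr, char *out, size_t out_size)
{
    size_t used = 0;

    if (out_size == 0) {
        return out;
    }
    out[0] = '\0';

    for (size_t i = 0; i < sizeof cfsr_flags / sizeof cfsr_flags[0]; i++) {
        if ((cfsr & cfsr_flags[i].mask) == 0) {
            continue;
        }
        int n = snprintf(out + used, out_size - used, "%s%s",
                         used ? "|" : "", cfsr_flags[i].name);
        if (n < 0 || (size_t)n >= out_size - used) {
            break;  /* output truncated, stop here */
        }
        used += (size_t)n;
    }
    return out;
}
```

Applied to the four dumps, 10000 decodes to UNDEFINSTR, 100 to IBUSERR, 1 to IACCVIOL, and 8200 to PRECISERR|BFARVALID. The first three are the kind of faults one would expect when the PC itself holds garbage. Only the last dump has BFARVALID set, so it is the only one where BFAR should be read as a real faulting address.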

Thanks for your time!

Kind regards

ChewToy asked Dec 29 '12 09:12


3 Answers

In your comment you mention that this pointer is explicitly assigned once then never written to. In that case you should at least declare it const and use initialisation rather than assignment, e.g.

arraytype* const ptr = array ;

That will allow the compiler to detect any explicit writes. However, it is more likely that the pointer is being corrupted by some unrelated coding error.

The Cortex-M3 on-chip debug supports data access breakpoints; you should set such a breakpoint on the pointer in question so that all write accesses to it are trapped. You will get a break on initialisation, and after that on any modification, intentional or otherwise.

Likely causes are overrun of an adjacent array or of a thread stack.
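One detail worth noting here: FreeRTOS paints each task stack with the byte 0xA5 when the task is created, so a value of A5A5A5A5 is consistent with something reading stack memory that was never written. The sketch below imitates that painting on a plain buffer so it can be run on a PC. stack_paint and stack_unused_bytes are hypothetical helper names for this illustration, and the code assumes a descending stack (as on Cortex-M3), so the untouched region starts at the lowest address.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same fill value FreeRTOS uses when it paints a new task stack. */
#define STACK_FILL_BYTE 0xA5u

/* Fill the whole stack buffer with the marker byte before the task runs. */
void stack_paint(uint8_t *stack, size_t size)
{
    memset(stack, STACK_FILL_BYTE, size);
}

/* Count marker bytes still intact at the low-address end of the buffer.
 * On a descending stack this is the space the task has never touched.
 * A result at or near zero means the task has used (or overrun) its stack. */
size_t stack_unused_bytes(const uint8_t *stack, size_t size)
{
    size_t n = 0;

    while (n < size && stack[n] == STACK_FILL_BYTE) {
        n++;
    }
    return n;
}
```

On the target itself, FreeRTOS already provides this kind of check through uxTaskGetStackHighWaterMark() and the configCHECK_FOR_STACK_OVERFLOW option with its vApplicationStackOverflowHook() callback, so those are probably the first things to enable.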

Clifford answered Nov 15 '22 07:11


If you tried relocating the array and the problem persists, then some task's stack is overflowing.

As you mentioned, you are using FreeRTOS, and because the behavior is random it is likely that something is wrong with the STACK_SIZE settings in your calls to xTaskCreate.

This usually happens when the allocated size is smaller than what the task really needs.

If you read the documentation for usStackDepth, you will notice that it is specified in stack words, not in bytes, so the actual stack size in bytes is usStackDepth multiplied by the width of one stack word.
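To make the word/byte distinction concrete, here is a minimal sketch of the conversion, assuming a 32-bit stack word as on the Cortex-M3 port. stack_word_t is a stand-in defined here for the FreeRTOS stack type (StackType_t, or portSTACK_TYPE in older releases), and the two function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the FreeRTOS stack word type on a Cortex-M3 (assumed 32 bits). */
typedef uint32_t stack_word_t;

/* Bytes actually reserved when this depth (in words) is passed as usStackDepth. */
size_t stack_depth_to_bytes(uint16_t depth_words)
{
    return (size_t)depth_words * sizeof(stack_word_t);
}

/* Depth in words needed to hold at least the requested number of bytes,
 * rounded up to a whole word. */
size_t stack_bytes_to_depth(size_t bytes)
{
    return (bytes + sizeof(stack_word_t) - 1) / sizeof(stack_word_t);
}
```

Under that assumption, a usStackDepth of 128 reserves 512 bytes, and a task that needs 1000 bytes of stack needs a depth of at least 250.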

Personally, I would rule out hardware problems on your board for now and focus on the FreeRTOS configuration.

RTOSkit answered Nov 15 '22 09:11


It turns out that the problem was caused by the memory. As I was running the processor at its highest speed (120 MHz) on a 1.8 V supply (it's designed mainly for 3 V), I had some race conditions with the memory. I resolved it by using a higher number of wait states.
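The relationship between clock speed, supply voltage and wait states can be sketched as simple arithmetic: each wait state buys a certain amount of extra clock frequency, and that amount shrinks as the supply voltage drops. The function below is only an illustration. The function name is hypothetical, and the per-wait-state frequency step has to be taken from the flash latency table in the reference manual for the exact part and voltage range; the figures of roughly 30 MHz per step near 3 V and 16 MHz per step near 1.8 V used in the note after the code are assumptions from memory, not verified values.

```c
#include <stdint.h>

/* Smallest number of wait states such that
 *   hclk_hz <= (wait_states + 1) * max_hz_per_wait_state.
 * max_hz_per_wait_state is the highest clock allowed with zero wait states
 * at the given supply voltage (ASSUMPTION: look it up in the reference manual). */
uint32_t flash_wait_states(uint32_t hclk_hz, uint32_t max_hz_per_wait_state)
{
    if (hclk_hz == 0 || max_hz_per_wait_state == 0) {
        return 0;  /* nothing sensible to compute */
    }
    /* Round the ratio up, then subtract the step covered by zero wait states. */
    return (hclk_hz + max_hz_per_wait_state - 1) / max_hz_per_wait_state - 1;
}
```

With those assumed steps, 120 MHz would need 3 wait states at about 3 V but 7 at about 1.8 V, which would explain why a setting that is fine at 3 V produces corrupted fetches at 1.8 V. On the STM32 the chosen value goes into the LATENCY field of the FLASH_ACR register.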

ChewToy answered Nov 15 '22 08:11