
Executing code from RAM in STM32

Tags: c, gcc, linker, arm, stm32

I recently started programming on an STM32F4 Nucleo board. I have just found out that the flash can only be programmed a limited number of times (the limit is not small, but this is an evaluation board and it will be reprogrammed over and over to develop different projects). I then read somewhere that it is possible to program directly into RAM instead of flash, but I could not find any technical information about it.

Does anybody know how to modify the linker script/makefile so that the program is compiled and linked to execute from the start address of RAM rather than flash?

PS: I use code generated by STM32CubeMX for System Workbench, and a script to generate the makefile for the project.

Nixmd asked Mar 05 '17 18:03



2 Answers

If you recently started using it, then you have a long time before the flash wears out. You might be getting drive-full errors; if so, just unplug and replug the board. I have had these boards for years and have not worn out the flash yet. That's not to say it can't be done; it can, but you are not likely to be there unless you wrote a flash-thrashing program that wore it out.

You will need openocd (or some other debugger; maybe your IDE provides one, but I don't use those so I can't help there). openocd and the GNU tools are trivial to come by, so I am going to walk through that.

Run this from the directory that holds these config files, or after copying them from the openocd distribution:

openocd -f stlink-v2-1.cfg -f stm32f4x.cfg

(One or both might depend on other files they include; pull those in or do whatever it takes.)

It should end with something like this and not exit back to the command line:

Info : stm32f4x.cpu: hardware has 6 breakpoints, 4 watchpoints

In another window

telnet localhost 4444

Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
Open On-Chip Debugger
> 

In that window you can halt the processor

> halt
stm32f4x.cpu: target state: halted
target halted due to debug-request, current mode: Thread 
xPSR: 0x61000000 pc: 0x080000b2 msp: 0x20000ff0
> 

On full-sized ARM processors your entry point is an instruction and you just start executing there. The Cortex-M uses a vector table, which you cannot just branch to.

.thumb_func
.global _start
_start:
stacktop: .word 0x20001000   @ vector 0: initial stack pointer
.word reset                  @ vector 1: reset handler (lsbit set for thumb)
.word hang
.word hang
.word hang
.word hang
.word hang
.word hang
.word hang
.word hang
.word hang
.word hang
.word hang
.word hang
.word hang
.word hang

.thumb_func
reset:
    bl notmain
    b hang

.thumb_func
hang:   b .

You could in theory branch to the reset handler address, but the linker script is going to want that in flash, so anything position-dependent will not work. Your stack pointer also might not be set if you rely on the vector table to do that. So instead something like this would work, as part of a complete example:

sram.s

.cpu cortex-m0
.thumb

.thumb_func
.global _start
_start:
    ldr r0,stacktop
    mov sp,r0
    bl notmain
    b .

.align
stacktop: .word 0x20001000

.thumb_func
.globl PUT32
PUT32:
    str r1,[r0]
    bx lr

.thumb_func
.globl GET32
GET32:
    ldr r0,[r0]
    bx lr

notmain.c

void PUT32 ( unsigned int, unsigned int );
unsigned int GET32 ( unsigned int );

int notmain ( void )
{
    unsigned int ra;
    ra=GET32(0x20000400);
    PUT32(0x20000404,ra);
    PUT32(0x20000400,ra+1);
    return(0);
}

sram.ld

MEMORY
{    
    rom : ORIGIN = 0x08000000, LENGTH = 0x1000
    ram : ORIGIN = 0x20000000, LENGTH = 0x1000
}

SECTIONS
{
    .text : { *(.text*) } > ram
    .rodata : { *(.rodata*) } > ram
    .bss : { *(.bss*) } > ram
}

Basically, replace the rom references with ram. (Your linker script, if GNU, is likely way more complicated than this one, but this works just fine; you could add a .data section here as needed.)

arm-none-eabi-as --warn --fatal-warnings -mcpu=cortex-m0 flash.s -o flash.o
arm-none-eabi-gcc -Wall -Werror -O2 -nostdlib -nostartfiles -ffreestanding  -mcpu=cortex-m0 -mthumb -c notmain.c -o notmain.o
arm-none-eabi-ld -o notmain.flash.elf -T flash.ld flash.o notmain.o
arm-none-eabi-objdump -D notmain.flash.elf > notmain.flash.list
arm-none-eabi-objcopy notmain.flash.elf notmain.flash.bin -O binary
arm-none-eabi-as --warn --fatal-warnings -mcpu=cortex-m0 sram.s -o sram.o
arm-none-eabi-ld -o notmain.sram.elf -T sram.ld sram.o notmain.o
arm-none-eabi-objdump -D notmain.sram.elf > notmain.sram.list
arm-none-eabi-objcopy notmain.sram.elf notmain.sram.hex -O ihex
arm-none-eabi-objcopy notmain.sram.elf notmain.sram.bin -O binary

That is my build of both a flash version and an SRAM version of the program.

So now we have our telnet session into the openocd server and the processor is halted. Let's look at a memory location and change it:

> mdw 0x20000400
0x20000400: 7d7d5889 
> mww 0x20000400 0x12345678
> mdw 0x20000400           
0x20000400: 12345678 

and run our new sram based program

> load_image /path/to/notmain.sram.elf
64 bytes written at address 0x20000000
downloaded 64 bytes in 0.008047s (7.767 KiB/s)
> resume 0x20000001

Let it run. Even a script would probably be slow enough for the program to finish, and the time it takes to type the halt command is certainly plenty.

> halt
stm32f4x.cpu: target state: halted
target halted due to debug-request, current mode: Thread 
xPSR: 0x41000000 pc: 0x20000008 msp: 0x20001000
> mdw 0x20000400 10
0x20000400: 12345679 12345678 ce879a24 fc4ba5c7 997e5367 9db9a851 40d5083f fbfbcff8 
0x20000420: 035dce6b 65a7f13c 
> 

So the program ran: it reads 0x20000400, saves that value to 0x20000404, increments it, and saves the result back to 0x20000400, and it did all of that.

> load_image /path/to/notmain.sram.elf
64 bytes written at address 0x20000000
downloaded 64 bytes in 0.008016s (7.797 KiB/s)
> resume 0x20000000
> halt
stm32f4x.cpu: target state: halted
target halted due to debug-request, current mode: Thread 
xPSR: 0x41000000 pc: 0x20000008 msp: 0x20001000
> mdw 0x20000400 10                           
0x20000400: 1234567a 12345679 ce879a24 fc4ba5c7 997e5367 9db9a851 40d5083f fbfbcff8 
0x20000420: 035dce6b 65a7f13c 
> 

So we didn't need to OR the start address with one, which you do have to do for a BX. openocd must just shove the address right into the PC and/or do the right thing for us.

If you were to only modify your linker script to replace the roms with rams, the vector table version would end up like this:

20000000 <_start>:
20000000:   20001000
20000004:   20000041
20000008:   20000047
2000000c:   20000047
20000010:   20000047
20000014:   20000047
20000018:   20000047
2000001c:   20000047
20000020:   20000047
20000024:   20000047
20000028:   20000047
2000002c:   20000047
20000030:   20000047
20000034:   20000047
20000038:   20000047
2000003c:   20000047

20000040 <reset>:
20000040:   f000 f806   bl  20000050 <notmain>
20000044:   e7ff        b.n 20000046 <hang>

You could use the 0x20000041 address as your entry point (resume 0x20000041), but you have to deal with the stack pointer first.

Something like this does it:

> reg sp 0x20001000
sp (/32): 0x20001000
> reg sp
sp (/32): 0x20001000
> resume 0x20000041

Note that the RAM on these parts is faster than the flash and doesn't need wait states as you increase the clock frequency. So if you increase the clock frequency and debug only in RAM, the program may fail when you switch over to flash if you have not remembered to set the flash wait states. Other than that, and having significantly less room for programs, you can develop in RAM all day long if you want.

One nice feature is that you can keep halting and re-loading. One caution, and I don't know whether it applies to this device/debugger: if you turn on a cache (some Cortex-M4 parts have one, if not all), make sure it is off when you change programs. Writing to memory is a data operation, while fetching instructions is an instruction fetch, which can land in an instruction cache. Say you execute an instruction at 0x20000100 and it gets cached in the I cache. You then halt with the debugger and write a new program that covers the cached address (0x20000100). When you run it, the I cache has not been flushed, so you would be running a mixture of the prior program from the cache and the new program from memory, which is a disaster at best. So either never turn on the caches when running this way, or come up with a solution to this problem (clear the caches before you stop the program, use the reset button to reset the processor between runs, power cycle, etc.).

old_timer answered Oct 29 '22 23:10



First of all, don't think too much about saving flash. When I started with microcontrollers I had the same plan as you, but I later came to the conclusion that it doesn't really make sense. As an example, an STM32F4 chip has flash that guarantees a minimum of 10000 write/erase cycles. You would have to program your board 14 times a day, every single day, for two years straight to reach that value. And even after you reach it, that doesn't mean the flash stops working immediately; most likely you just couldn't count on the flash contents being retained for the guaranteed 20 years. All that effort is not worth the trouble given the endurance and usual usage (on average your board will see maybe a few write/erase cycles per day, and you probably won't be playing with it after several years anyway), especially if we are talking about cheap boards.

TL;DR: just don't try to save flash. It's not worth all the hassle.

If you really want to execute code from RAM and not write to flash at all, remember that this is only possible with a debugger. Otherwise you'd have to write your code to flash along with a small routine that copies it to RAM and then executes it from there, which would be totally pointless given your original idea of preserving the flash. Anyway, if you want to do that, it's pretty simple and all you have to do is modify the linker script. First, completely delete the "rom" (or maybe "flash" or something similar) memory block from the MEMORY section. Then replace all uses of the deleted memory with the RAM memory block, so you should probably replace all occurrences of "rom" with "ram" (or maybe "flash" with "sram" or something like that). At this stage it should actually work. The last thing you should do is completely remove the code that performs .data section initialization: that requires modifying the linker script (make sure the LMA of this section is identical to its VMA) and removing the initialization code from the Reset handler routine.

Please note that for this procedure to work you should either:

  • select "boot from SRAM" with BOOT0 & BOOT1 pins,
  • force PC and SP to correct addresses with debugger.

For your Nucleo board the first option is unfortunately not available, as the BOOT1 pin (which would have to be high in this case) is shorted to GND.

But again - just don't do it, it's not worth the trouble.

Freddie Chopin answered Oct 29 '22 22:10
