I'm currently playing around with the STM32F303xx family of chips. They feature core-coupled memory (CCM RAM) which, unlike the CCM found on the F4 series, allows execution of code. I have put the critical routines (e.g. ISRs) into the CCM and am wondering what the most efficient setup would be: putting the interrupt vector table into the CCM as well, or into the normal SRAM. I'm kind of stuck on that one. Can anybody point me in the right direction?
Core-coupled memory (CCM) is a special area of memory at an address offset from standard RAM. CCM cannot be used for DMA, but it does provide zero-wait-state access to memory.
The STM32 F2, F3, and F4 families have a special block of SRAM available called CCM (Core Coupled Memory). This memory has the drawback that it cannot be used for STM32 DMA operations. By default, the CCM memory is lumped in with the rest of memory when the NuttX heaps are created.
I am not sure that it makes any difference to code execution performance directly. The critical things are the bus architecture, where you place data and code, and whether you are executing DMA operations or writing to the flash memory.
The flash memory, the SRAM and the CCM are each on a separate bus; on many STM32 parts the SRAM, and on larger parts the flash, are further divided across more than one bus. So when code is executed from one memory, data can be fetched concurrently from another. If, however, you place your data and instructions in the same memory, instruction and data accesses must be serialised. Equally, if you have DMA operations to or from a memory, they can also impact both data access and instruction fetch from that same memory.
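As a rough illustration of splitting code and data across memories, the sketch below places a handler in CCM while leaving the data it touches in ordinary SRAM. It assumes a GCC-style toolchain and a linker script that defines an output section named `.ccmram` located in CCM; that section name, the `CCM_CODE` macro and the `tick_isr` handler are illustrative names, not taken from any particular project.

```c
#include <stdint.h>

/* Assumption: the linker script maps an output section called ".ccmram"
 * into CCM. On a non-ARM (host) build the macro expands to nothing so the
 * file still compiles and runs for testing. */
#if defined(__arm__) && defined(__GNUC__)
#define CCM_CODE __attribute__((section(".ccmram")))
#else
#define CCM_CODE /* host build: no special placement */
#endif

/* Data touched by the handler stays in normal SRAM (.bss), so on the
 * target instruction fetches (CCM) and data accesses (SRAM) can use
 * different buses. */
static volatile uint32_t tick_count;

/* Hypothetical time-critical handler, placed in CCM on the target. */
CCM_CODE void tick_isr(void)
{
    tick_count++;
}

/* Accessor used by non-critical code, left in flash by default. */
uint32_t ticks(void)
{
    return tick_count;
}
```

Whether this split is actually faster than running the handler from flash depends on the part and the access pattern, so it is worth measuring rather than assuming.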
For the most part, there is little or no latency for code execution from on-chip flash on an STM32 due to the flash accelerator, so there may be little to gain from placing code in the CCM at all. Code that needs to execute while programming the flash memory is an exception, since flash write/erase operations stall the bus for a significant length of time on STM32.
For performance it is best to arrange things so that DMA, instruction fetch and data access all occur on separate buses for the most part, bearing in mind that you cannot DMA to or from the CCM, nor access it through bit-banding. So CCM is good for instructions or for data (where DMA or bit-band access is not required), but ideally not both at the same time.
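Because a buffer that ends up in CCM silently cannot be used with DMA, a small address check before starting a transfer can catch placement mistakes early. The sketch below assumes a CCM window of 8 KB starting at 0x10000000; both values, and the function names, are assumptions to be checked against the datasheet and memory map of the exact part.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed CCM window; verify base and size for the exact device. */
#define CCM_BASE 0x10000000u
#define CCM_SIZE (8u * 1024u)

/* Returns true if any byte of [addr, addr + len) lies inside the CCM window.
 * The end address is computed in 64 bits so that addr + len cannot wrap. */
bool overlaps_ccm(uint32_t addr, uint32_t len)
{
    if (len == 0u) {
        return false;
    }
    uint64_t end = (uint64_t)addr + len;              /* one past last byte */
    uint64_t ccm_end = (uint64_t)CCM_BASE + CCM_SIZE; /* one past CCM */
    return (addr < ccm_end) && (end > CCM_BASE);
}

/* A buffer is usable for DMA only if it does not touch CCM at all. */
bool dma_buffer_ok(uint32_t addr, uint32_t len)
{
    return !overlaps_ccm(addr, len);
}
```

On the target, such a check would typically be called with the buffer address cast to `uint32_t` just before a DMA channel is configured.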
When either CCM or SRAM is used for code, you have the added linker and start-up complexity of placing code in RAM, plus the possibility of code corruption from errant code or security flaws, with little or no significant performance benefit compared to on-chip flash. External memory of any kind will be significantly slower, partly because of the clock rate of the EMIF, and also because all external memories share a single bus for both data and instructions.
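The start-up complexity mentioned above mostly comes down to copying the code image from its load address in flash to its run address in RAM before it is first called. A minimal sketch of that copy step follows; the linker symbols `_siccmram`, `_sccmram` and `_eccmram` and the function `init_ccm_code` are hypothetical names that would have to match whatever the linker script actually defines.

```c
#include <stdint.h>

/* Copy a load image word by word from [src, src_end) to dst. */
void copy_words(uint32_t *dst, const uint32_t *src, const uint32_t *src_end)
{
    while (src < src_end) {
        *dst++ = *src++;
    }
}

#if defined(__arm__)
/* Hypothetical linker-script symbols (names are assumptions):
 *   _siccmram : load address of the CCM image in flash
 *   _sccmram  : start of the section's run address in CCM
 *   _eccmram  : end of the section's run address in CCM */
extern uint32_t _siccmram, _sccmram, _eccmram;

/* Must run from flash early in start-up, before any CCM-resident
 * function or interrupt handler is used. */
void init_ccm_code(void)
{
    copy_words(&_sccmram, &_siccmram, &_siccmram + (&_eccmram - &_sccmram));
}
#endif
```

The copy has to complete before interrupts that vector into CCM are enabled, which is one more ordering constraint that code left in flash does not have.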