
ARM bare-metal multi-core core selection

For a multi-core ARM platform (e.g. a Cortex-A53 cluster of 4 CPUs):
How can we assign a specific core to run some function, as a base for writing a simple bare-metal scheduler?
How do the different mainline RTOSes implement such functionality on ARM?

Amir ElAttar asked Apr 20 '26 03:04



2 Answers

Well, assigning one core or several cores to a piece of software is not a hardware configuration setting that you change. It all depends on the use case and on how the software flow goes. Let's take an example.

Let's take a Cortex-A53 cluster of 4 cores. Normally a board-initializing firmware runs first, like the FSBL (first-stage bootloader) on the Xilinx ZCU102, which also has 4 Cortex-A53 cores. After that ATF (Arm Trusted Firmware) runs, and after that U-Boot. All of these run on core 0.

Important: when we launch any binary from U-Boot, it is launched on core 0. Let's say we launched Linux. After a couple of initializations, Linux will start the other cores using some SoC-specific registers. Normally two registers matter. When software on one core wants to start another core, say core 1, it loads the code it wants core 1 to execute into memory, programs one special register with that code's start address, and uses another special register to bring core 1 out of reset. Core 1 then starts executing that code.

So you see, it all depends on whether the software wants to use the other cores or not.

So on a platform that boots this way, write your code without any fear that it will automatically get executed by the other cores and everything goes bang!

One correction here: for simplicity I said above that Linux uses the registers directly to start the other cores. That is not the case on ARM platforms like this one. We use special calls called SMC (Secure Monitor Call). These calls go to the secure world, where ATF looks at the arguments passed with the SMC and performs the appropriate service. For starting a core, that service is the PSCI CPU_ON call.
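To make the SMC route a bit more concrete, here is a minimal sketch of a PSCI CPU_ON request in C. The function ID 0xC4000003 is the SMC64 CPU_ON ID from the PSCI specification; everything else (the smc_conduit_fn type, psci_cpu_on, recording_conduit, last_smc) is a hypothetical name made up for illustration. The conduit is passed in as a function pointer so the logic can be tried on a host machine; on real hardware it would be a small assembly routine that issues smc #0 with the four values in x0..x3.

```c
#include <stdint.h>

/* PSCI function ID for CPU_ON, SMC64 calling convention. */
#define PSCI_CPU_ON_SMC64 0xC4000003u

/* Hypothetical conduit type: on target this would wrap the `smc #0`
 * instruction (x0 = function ID, x1..x3 = arguments, result back in x0). */
typedef int64_t (*smc_conduit_fn)(uint64_t fid, uint64_t a1,
                                  uint64_t a2, uint64_t a3);

/* Ask the secure firmware to power on the core identified by target_mpidr
 * and start it at entry_point. context_id is handed to the new core in x0.
 * Returns the PSCI status code (0 means SUCCESS). */
int64_t psci_cpu_on(smc_conduit_fn smc, uint64_t target_mpidr,
                    uint64_t entry_point, uint64_t context_id)
{
    return smc(PSCI_CPU_ON_SMC64, target_mpidr, entry_point, context_id);
}

/* Host-side stand-in for the conduit (an assumption for off-target testing):
 * it only records what would have been passed to the firmware and reports
 * SUCCESS. */
struct smc_record { uint64_t fid, a1, a2, a3; };
struct smc_record last_smc;

int64_t recording_conduit(uint64_t fid, uint64_t a1, uint64_t a2, uint64_t a3)
{
    last_smc = (struct smc_record){ fid, a1, a2, a3 };
    return 0;
}
```

With this, bringing up core 1 at a hypothetical entry address would look like psci_cpu_on(conduit, 1, 0x80000, 0). The entry address must be code that sets up a stack for that core before it jumps into C.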

Extra stuff: to get started quickly, use the vendor-provided startup files, write a simple hello-world application that prints over the serial port, and load it from U-Boot with the following command:

fatload mmc 0:1 0x0 app.bin; go 0x0

It will load your application from the SD card and launch it on core 0 at address 0x0. Obviously you will have to change the address to the one your application is linked at, and also the device and partition numbers given after mmc.

Rajnesh answered Apr 22 '26 03:04



You are jumping ahead; first you need to see how the chip vendor manages those cores. It falls into two major categories: in one, the chip vendor releases reset on all the cores at once; in the other, the chip vendor releases reset on one core, and that core can then release reset on the other cores through CSRs (control/status registers). The Raspberry Pi family is an example of the former, Allwinner-based stuff is an example of the latter.

It's still very manageable though. The cores are all going to enter at the same place in memory, the reset exception address, so you either place code there that sorts the cores from the beginning, or, as you release each core, you change the reset handler somewhere to route each new core to a new place. If you look at the Raspberry Pi bare-metal forum you will see simple code that does this, and/or you can just dump the code that the GPU bootloader places at the beginning of ARM RAM to sort the cores (boot without a config.txt, which parks three of the cores and lets core 0 run; then put some code in there so core 0 can print out via the UART the contents of the first few dozen words, and disassemble that to see how they do it). Basically, each core has a unique ID that you can use to route that core's execution to its own code.
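The "sort the cores by their unique ID" idea can be sketched in C roughly as follows. On a Cortex-A53 the core number within the cluster is in the Aff0 field (bits [7:0]) of MPIDR_EL1, which you read on target with mrs x0, mpidr_el1. Here the register value is passed in as a plain argument so the routing logic can be tried on a host. The names core_id_from_mpidr, route_core, core_entry and the coreN_main functions are all hypothetical, and a real reset handler would do this in a few lines of assembly before any stack exists.

```c
#include <stdint.h>

#define NUM_CORES 4u

/* Aff0 (bits [7:0]) of MPIDR_EL1 is the core number inside an A53 cluster. */
unsigned core_id_from_mpidr(uint64_t mpidr)
{
    return (unsigned)(mpidr & 0xFFu);
}

typedef void (*core_entry_fn)(void);

/* Hypothetical per-core entry points; each one only marks that it ran. */
volatile unsigned core_ran[NUM_CORES];
void core0_main(void) { core_ran[0] = 1; }
void core1_main(void) { core_ran[1] = 1; }
void core2_main(void) { core_ran[2] = 1; }
void core3_main(void) { core_ran[3] = 1; }

/* One slot per core: this table is the whole "core selection" mechanism. */
const core_entry_fn core_entry[NUM_CORES] = {
    core0_main, core1_main, core2_main, core3_main
};

/* Common entry path: every core arrives here with its own MPIDR value and
 * leaves through its own slot. Returns 0 when routed, -1 for an unexpected
 * core number (on target you would park such a core instead). */
int route_core(uint64_t mpidr)
{
    unsigned id = core_id_from_mpidr(mpidr);
    if (id >= NUM_CORES) {
        return -1;
    }
    core_entry[id]();
    return 0;
}
```

For example, an MPIDR_EL1 value of 0x80000002 (bit 31 is always set, Aff0 is 2) ends up in core2_main.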

A ported OS should be doing all of this for you.

With the early multi-core parts it was pretty obvious, and shown in the technical reference manual, that each core had its own clock enable and reset, and it was up to the chip vendor to decide what to do with those. The newer cores and their documentation treat this as a black box, so I don't know how that works, but I do know that we see both flavors of solution among chip vendors. I find the Broadcom/Pi approach better only because there aren't hidden/undocumented CSRs to find or figure out, whereas with Allwinner you have to wait for someone to hack through that and figure it out. That doesn't mean all Broadcoms or all Allwinners are the same; each company is free to design each individual part however it likes, and may very well use different solutions. I wouldn't be surprised if Broadcom's Pi-related parts have a control register that the GPU fiddles with and that we might be able to fiddle with ourselves if we could find it.

Once you get the cores running, it is simply a matter of pointing the program counter of a specific core at a specific address. Either you control where that core goes from reset, or you send that core an interrupt or exception and then return control to a different address. It is no different from controlling a single-core processor. There is no magic here.
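One common way to do that "pointing", and a possible base for the simple scheduler the question asks about, is a per-core mailbox: each parked core loops on its own slot and calls whatever function pointer shows up there. The following is only a sketch under assumptions, not how any particular RTOS does it. The names core_mailbox, post_job, poll_mailbox and increment_job are hypothetical, and it uses C11 atomics so that the argument is visible before the function pointer is published. On target, the worker would typically sit in wfe between polls and the posting core would issue sev after posting.

```c
#include <stdatomic.h>
#include <stdint.h>

#define NUM_CORES 4u

typedef void (*job_fn)(void *arg);
#define NO_JOB ((job_fn)0)

struct core_mailbox {
    void *arg;             /* written before `job` is published */
    _Atomic(job_fn) job;   /* NO_JOB means nothing is posted */
};

struct core_mailbox mailbox[NUM_CORES];

/* Dispatcher side: hand fn(arg) to one specific core.
 * Returns 0 on success, -1 if the core number is bad or the core is busy. */
int post_job(unsigned core, job_fn fn, void *arg)
{
    if (core >= NUM_CORES || fn == NO_JOB) {
        return -1;
    }
    if (atomic_load_explicit(&mailbox[core].job, memory_order_acquire) != NO_JOB) {
        return -1;
    }
    mailbox[core].arg = arg;
    /* Release store: a core that sees `job` also sees `arg`. */
    atomic_store_explicit(&mailbox[core].job, fn, memory_order_release);
    return 0;
}

/* Worker side: one pass of the idle loop a parked core would run forever.
 * Runs at most one job; returns 1 if a job ran, 0 otherwise. */
int poll_mailbox(unsigned core)
{
    if (core >= NUM_CORES) {
        return 0;
    }
    job_fn fn = atomic_load_explicit(&mailbox[core].job, memory_order_acquire);
    if (fn == NO_JOB) {
        return 0;
    }
    fn(mailbox[core].arg);
    /* Clear the slot so the dispatcher can post the next job. */
    atomic_store_explicit(&mailbox[core].job, NO_JOB, memory_order_release);
    return 1;
}

/* Hypothetical sample job: increments the int that arg points to. */
void increment_job(void *arg)
{
    *(int *)arg += 1;
}
```

Core 0 would call post_job(1, increment_job, &counter), and core 1, looping on poll_mailbox(1), picks it up. A scheduler is then mostly a policy for deciding which slot gets which function.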

old_timer answered Apr 22 '26 03:04



