
Differences between ARM architectures from a C programmer's perspective?

I'm fairly new to programming for ARM. I've noticed there are several architectures like ARMv4, ARMv5, ARMv6, etc. What is the difference between these? Do they have different instruction sets or behaviors?

Most importantly, if I compile some C code for ARMv6, will it run on ARMv5? What about ARMv5 code running on ARMv6? Or would I only have to worry about the difference if I were writing kernel assembly code?

Jay Conrod asked Dec 07 '10 20:12


2 Answers

The ARM world is a bit messy.

For C programmers, things are simple: all ARM architectures offer a regular 32-bit programming model with flat addressing. As long as you stay with C source code, the only differences you may see are endianness and performance. Most ARM processors (even old models) can run as either big-endian or little-endian; the choice is made by the logic board and the operating system. Good C code is endian-neutral: it compiles and works correctly regardless of the platform endianness. Endian neutrality is good for reliability and maintainability, but also for performance: non-neutral code is code which accesses the same data through pointers of distinct sizes, and this wreaks havoc with the strict aliasing rules that the compiler uses to optimize code.
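As a minimal sketch of what endian-neutral code can look like (the helper names `load_le32` and `store_le32` are made up for this illustration), a 32-bit value in a defined byte order can be read and written with byte accesses and shifts only, with no pointer cast:

```c
#include <stdint.h>

/* Decode a 32-bit value stored in little-endian order in a byte buffer.
   Only byte reads and shifts are used, so the result is the same on
   big-endian and little-endian hosts, and no aliasing rule is broken. */
uint32_t load_le32(const unsigned char *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Encode a 32-bit value into a byte buffer in little-endian order. */
void store_le32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v & 0xFFu);
    p[1] = (unsigned char)((v >> 8) & 0xFFu);
    p[2] = (unsigned char)((v >> 16) & 0xFFu);
    p[3] = (unsigned char)((v >> 24) & 0xFFu);
}
```

The non-neutral alternative would be something like `*(uint32_t *)p`, which gives different results depending on the host byte order and is exactly the kind of access through a pointer of a different size described above.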

The situation is quite different if you consider binary compatibility (i.e. reusing code which has been compiled once):


  • There are several instruction sets:
    1. the original ARM instruction set with a 26-bit program counter (very old, very unlikely to be encountered nowadays)
    2. the ARM instruction set with a 32-bit program counter (often called "ARM code")
    3. the Thumb instruction set (16-bit simplified opcodes)
    4. the Thumb-2 instruction set (Thumb with extensions)

A given processor may implement several instruction sets. The newest processor which knows only ARM code is the StrongARM, an ARMv4 representative which is already quite old (15 years). The ARM7TDMI (ARMv4T architecture) knows both ARM and Thumb, as do almost all subsequent ARM systems except the Cortex-M. ARM and Thumb code can be mixed within the same application, as long as the proper glue is inserted where conventions change; this is called Thumb interworking and can be handled automatically by the C compiler.

The Cortex-M0 knows only Thumb instructions. It knows a few extensions, because in "normal" ARM processors, the operating system must use ARM code (for handling interrupts); thus, the Cortex-M0 knows a few Thumb-for-OS things. This does not matter for application code.

The other Cortex-M know only Thumb-2. Thumb-2 is mostly backward compatible with Thumb, at least at assembly level.


  • Some architectures add extra instructions.

Thus, if some code is compiled with a compiler switch stating that the target is an ARMv6, then the compiler may use one of the few instructions which the ARMv6 has but the ARMv5 does not. This is a common situation, encountered on almost all platforms: e.g., if you compile C code on a PC, with GCC, using the -march=core2 flag, then the resulting binary may fail to run on an older Pentium processor.
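One way to see which architecture level a given build was targeted at is to look at the compiler's predefined macros. The sketch below assumes a reasonably recent GCC or Clang, which define the ACLE macro `__ARM_ARCH` when compiling for ARM; older toolchains may only define per-version macros instead. The function name `target_arm_arch` is hypothetical.

```c
/* Report the ARM architecture level selected at compile time
   (for example through -march), or 0 when the compiler is not
   targeting ARM or does not define __ARM_ARCH. */
int target_arm_arch(void)
{
#if defined(__ARM_ARCH)
    return __ARM_ARCH;
#else
    return 0;
#endif
}
```

This only tells you what the binary was built for. It does not check what the processor it runs on actually supports, which is why a binary built for a newer architecture can still fault on an older core.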


  • There are several call conventions.

The call convention is the set of rules which specify how functions exchange parameters and return values. The processor knows only about its registers, and has no notion of a stack. The call convention states which registers parameters go in, and how they are encoded (e.g. if there is a char parameter, it goes in the low 8 bits of a register, but is the caller supposed to clear or sign-extend the upper 24 bits, or not?). It describes the stack structure and alignment. It normalizes alignment conditions and padding for structure fields.

There are two main conventions for ARM, called ATPCS (old) and AAPCS (new). They are quite different on the subject of floating point values. For integer parameters, they are mostly identical (but AAPCS requires a stricter stack alignment). Of course, conventions vary depending on the instruction set, and the presence of Thumb interworking.

In some cases, it is possible to have some binary code which conforms to both ATPCS and AAPCS, but that is not reliable and there is no warning on mismatch. So the bottom line is: you cannot have true binary compatibility between systems which use distinct call conventions.
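To make the "alignment and padding for structure fields" part concrete, here is a small sketch that exposes the layout the compiler picked for a structure. The struct and function names are invented for this example; the actual numbers depend on the ABI the compiler targets, so none are stated here.

```c
#include <stddef.h>
#include <stdint.h>

/* A structure whose layout depends on the ABI: the padding inserted
   between 'tag' and 'value' follows the alignment that the call
   convention requires for 64-bit integers. */
struct sample {
    char    tag;
    int64_t value;
};

/* Offset of 'value' inside the structure, as chosen by the compiler. */
size_t sample_value_offset(void)
{
    return offsetof(struct sample, value);
}

/* Total size of the structure, including any padding. */
size_t sample_size(void)
{
    return sizeof(struct sample);
}
```

If two object files were compiled under conventions that disagree on these values, passing a `struct sample` between them silently reads the wrong bytes, which is the kind of mismatch for which you get no warning.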


  • There are optional coprocessors.

The ARM architecture can be extended with optional elements, which add their own instructions to the core instruction set. The FPU is such an optional coprocessor (and it is very rarely encountered in practice). Another coprocessor is NEON, a SIMD instruction set found on some of the newer ARM processors.

Code which uses a coprocessor will not run on a processor which does not feature that coprocessor, unless the operating system traps the corresponding opcodes and emulates the coprocessor in software (this is more or less what happens with floating-point arguments when using the ATPCS call convention, and it is slow).
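As a hedged illustration of keeping source code usable with and without a coprocessor, the sketch below selects a NEON path at compile time and falls back to plain C otherwise. It assumes a compiler that defines the ACLE macro `__ARM_NEON` and ships `<arm_neon.h>` (recent GCC and Clang do); the function name `add_floats` is made up for this example.

```c
#include <stddef.h>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Element-wise dst[i] = a[i] + b[i] for n floats. */
void add_floats(float *dst, const float *a, const float *b, size_t n)
{
    size_t i = 0;
#if defined(__ARM_NEON)
    /* NEON path: process four floats per iteration. */
    for (; i + 4 <= n; i += 4) {
        float32x4_t va = vld1q_f32(a + i);
        float32x4_t vb = vld1q_f32(b + i);
        vst1q_f32(dst + i, vaddq_f32(va, vb));
    }
#endif
    /* Scalar path: the remaining elements, or everything when the
       build does not target NEON. */
    for (; i < n; i++)
        dst[i] = a[i] + b[i];
}
```

Note that the choice is made when compiling, not when running: a binary built with the NEON path still will not run on a processor without NEON, so this helps source portability only.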


To sum up, if you have C code, then recompile it. Do not try to reuse code compiled for another architecture or system.

Thomas Pornin answered Oct 17 '22 09:10


Think of this ARM vs ARM thing like a Wintel computer vs an Intel Mac. Assume you even have the same Intel chip (family) in both computers, so portions of your C code could be compiled once and run on both processors just fine. Where and why your programs vary has nothing to do with the Intel processor and everything to do with the chips and motherboard around it, plus the operating system in this case.

With ARM vs ARM, most of the differences are not in the core but in the vendor-specific logic that surrounds the core. So it is a loaded question. If your C code is some application calling standard API calls, then it should compile on ARM or Intel or PowerPC or whatever. If your application talks to on-chip or on-board peripherals then, no matter what the processor type is, each board and each chip will vary, and as a result your C code has to be written for that chip or motherboard. If you compile a binary for ARMv6, it can and will contain instructions that are undefined on an ARMv4, and these will cause an exception. If you compile for ARMv4, the ARMv6 should run it just fine.

At best, if you are in this application space, what you will likely see is just performance differences, some of which have to do with your choice of compiler options. Sometimes you can help with your code. I recommend avoiding divides and floating point wherever possible. I don't like multiplies, but will take a multiply instead of a divide if pushed. x86 has spoiled us with unaligned accesses; if you start now with aligned accesses, it will save you down the road as you get into other chips that also prefer aligned accesses, and when you get bitten by the various ways operating systems and bootloaders configure the ARM to react to unaligned accesses, none of which is what you were used to on x86. Keep this habit and your x86 code will run much faster as well.
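When data really does arrive at an odd address (a packed file format, a network packet), one portable way to avoid issuing an unaligned access yourself is to copy the bytes into a properly aligned variable. This is only a sketch; the helper name `load_u32_unaligned` is invented here.

```c
#include <stdint.h>
#include <string.h>

/* Read a 32-bit value in host byte order from an address that may not
   be 4-byte aligned. memcpy lets the compiler pick an access sequence
   that is legal on the target, instead of dereferencing a misaligned
   uint32_t pointer. */
uint32_t load_u32_unaligned(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}
```

Compilers commonly turn this into a single load on targets where unaligned loads are allowed and into byte loads elsewhere, though that is an optimization you should check for your own toolchain rather than assume.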

Get a copy of the ARM ARM (google: ARM Architecture Reference Manual; you can download it for free in many places. I don't know what the current rev is, rev I or something maybe). Browse through the ARM instruction set and see that most instructions are supported on all cores, and that some were added over time, like divide and byteswap and such. You will see there is nothing to fear between the cores.

Think from a systems perspective, the Wintel vs the Intel Mac. ARM does not make chips; they make and license cores. Most vendors that use an ARM in their chip have their own special sauce around it. So it is like the Wintel vs the Mac with the same processor in the middle, but completely different when it comes to all the stuff the processor touches and has to use. It doesn't stop with the ARM core: ARM sells peripherals, floating point units, caches, etc. So few if any ARMv4s are the same, for example. If your code touches the differences you will have problems; if it doesn't, you won't.

For the ARM portions of the chip, in addition to the ARM ARM there are TRMs (Technical Reference Manuals), but if you get the wrong TRM for the component you are using it may give you headaches. The TRM may have register descriptions and other such things that the ARM ARM doesn't, but if you are living in application space you likely won't need any of them, nor the ARM ARM. The ARM ARM is good for educational purposes if nothing else, such as understanding why you might not want to divide or use unaligned accesses.
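On the divide point, a common case where a divide can be avoided is wrapping an index in a buffer whose size is a power of two. The sketch below uses invented names (`next_index_mask`, `next_index_mod`) and assumes the caller guarantees that the size is a power of two.

```c
#include <stdint.h>

/* Advance an index in a ring buffer whose size is a power of two,
   using a bit mask instead of a remainder operation. */
uint32_t next_index_mask(uint32_t index, uint32_t size_pow2)
{
    return (index + 1u) & (size_pow2 - 1u);
}

/* The same computation written with a remainder. With a size known
   only at run time, a core without a divide instruction has to call a
   software division routine for this. */
uint32_t next_index_mod(uint32_t index, uint32_t size)
{
    return (index + 1u) % size;
}
```

Both functions return the same result for power-of-two sizes; the masked version simply never needs a divide, whichever core it ends up running on.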

old_timer answered Oct 17 '22 08:10