Differences between ARM architectures from a C programmer's perspective?

Tags: c, architecture, arm, instruction-set

I'm fairly new to programming for ARM. I've noticed there are several architectures like ARMv4, ARMv5, ARMv6, etc. What is the difference between these? Do they have different instruction sets or behaviors?

Most importantly, if I compile some C code for ARMv6, will it run on ARMv5? What about ARMv5 code running on ARMv6? Or would I only have to worry about the difference if I were writing kernel assembly code?

853

asked Dec 07 '10 20:12

Jay Conrod

2 Answers

The ARM world is a bit messy.

For C programmers, things are simple: all ARM architectures offer a regular 32-bit programming model with flat addressing. As long as you stay with C source code, the only differences you may see are endianness and performance. Most ARM processors (even old models) can be both big-endian and little-endian; the choice is then made by the logic board and the operating system. Good C code is endian neutral: it compiles and works correctly regardless of the platform endianness. Endian neutrality is good for reliability and maintainability, but also for performance: non-neutral code is code which accesses the same data through pointers of distinct sizes, and this wreaks havoc with the strict aliasing rules that the compiler uses to optimize code.
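As an illustration of endian-neutral code, here is a minimal sketch that decodes and encodes a 32-bit little-endian value one byte at a time. The function names `load_le32` and `store_le32` are made up for this example; the point is only that the buffer is never accessed through a wider pointer type, so the result does not depend on how the processor is configured.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode a 32-bit little-endian value from a byte buffer.
 * Gives the same result on big- and little-endian hosts because the
 * buffer is only ever read as individual bytes. */
uint32_t load_le32(const unsigned char *p)
{
    assert(p != NULL);
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Encode a 32-bit value as four little-endian bytes. */
void store_le32(unsigned char *p, uint32_t v)
{
    assert(p != NULL);
    p[0] = (unsigned char)(v & 0xFF);
    p[1] = (unsigned char)((v >> 8) & 0xFF);
    p[2] = (unsigned char)((v >> 16) & 0xFF);
    p[3] = (unsigned char)((v >> 24) & 0xFF);
}
```

The non-neutral alternative would be something like `*(uint32_t *)p`, which both depends on the host byte order and runs into the aliasing problem mentioned above.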

The situation is quite different if you consider binary compatibility (i.e. reusing code which has been compiled once):

There are several instruction sets:
1. the original ARM instruction set with a 26-bit program counter (very old, very unlikely to be encountered nowadays)
2. the ARM instruction set with a 32-bit program counter (often called "ARM code")
3. the Thumb instruction set (16-bit simplified opcodes)
4. the Thumb-2 instruction set (Thumb with extensions)

A given processor may implement several instruction sets. The newest processor which knows only ARM code is the StrongARM, an ARMv4 representative which is already quite old (15 years). The ARM7TDMI (ARMv4T architecture) knows both ARM and Thumb, as do almost all subsequent ARM systems except the Cortex-M. ARM and Thumb code can be mixed together within the same application, as long as the proper glue is inserted where conventions change; this is called Thumb interworking and can be handled automatically by the C compiler.

The Cortex-M0 knows only Thumb instructions. It knows a few extensions, because in "normal" ARM processors, the operating system must use ARM code (for handling interrupts); thus, the Cortex-M0 knows a few Thumb-for-OS things. This does not matter for application code.

The other Cortex-M know only Thumb-2. Thumb-2 is mostly backward compatible with Thumb, at least at assembly level.

Some architectures add extra instructions.

Thus, if some code is compiled with a compiler switch saying that the target is an ARMv6, then the compiler may use one of the few instructions which the ARMv6 has but the ARMv5 lacks. This is a common situation, encountered on almost all platforms: e.g., if you compile C code on a PC with GCC, using the -march=core2 flag, then the resulting binary may fail to run on an older Pentium processor.
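To see which architecture a given build was targeted at, one option is to look at the compiler's predefined macros. The sketch below assumes a reasonably recent GCC or Clang, which define `__ARM_ARCH` when compiling for ARM; older toolchains may only define per-version macros, so treat this as an illustration rather than a portable check. The helper names are invented for the example.

```c
#include <assert.h>
#include <stdio.h>

/* Value of __ARM_ARCH for this build, or 0 when the code is not being
 * compiled for ARM (or the compiler does not define the macro). */
int target_arm_arch(void)
{
#if defined(__ARM_ARCH)
    return __ARM_ARCH;
#else
    return 0;
#endif
}

/* Nonzero when the build targets at least the given architecture version. */
int targets_at_least(int version)
{
    assert(version > 0); /* architecture versions start at 1 */
    return target_arm_arch() >= version;
}

/* Print the raw macro value so the build target is visible at run time. */
void print_target(void)
{
    int arch = target_arm_arch();
    if (arch > 0)
        printf("__ARM_ARCH = %d\n", arch);
    else
        printf("not compiled for ARM\n");
}
```

A binary for which `targets_at_least(6)` is true may contain instructions that an ARMv5 core does not implement, which is exactly the compatibility problem described above.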

There are several call conventions.

The call convention is the set of rules which specify how functions exchange parameters and return values. The processor knows only of its registers, and has no notion of a stack. The call convention tells in which registers parameters go, and how they are encoded (e.g. if there is a char parameter, it goes in the low 8 bits of a register, but is the caller supposed to clear or sign-extend the upper 24 bits, or not?). It describes the stack structure and alignment. It normalizes alignment conditions and padding for structure fields.
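The effect of those alignment and padding rules can be observed from C. The following sketch uses a made-up structure, `struct sample`, and reports where the compiler placed each field. The numbers it prints are decided by the convention in use, so no particular output is claimed here; a 64-bit field is a typical place where two conventions may disagree.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical structure mixing field sizes, for illustration only. */
struct sample {
    char     tag;
    uint64_t big;
    char     flag;
};

/* Number of padding bytes the ABI inserted into struct sample. */
size_t padding_bytes(void)
{
    size_t payload = sizeof(char) + sizeof(uint64_t) + sizeof(char);
    return sizeof(struct sample) - payload;
}

/* Print the layout chosen by the compiler for the current ABI. */
void print_layout(void)
{
    /* The C standard guarantees the first member sits at offset 0. */
    assert(offsetof(struct sample, tag) == 0);
    printf("sizeof(struct sample) = %zu\n", sizeof(struct sample));
    printf("offsetof(big)         = %zu\n", offsetof(struct sample, big));
    printf("offsetof(flag)        = %zu\n", offsetof(struct sample, flag));
    printf("padding bytes         = %zu\n", padding_bytes());
}
```

Two object files that disagree on these offsets cannot safely pass a `struct sample` to each other, even if both use instructions the processor understands.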

There are two main conventions for ARM, called ATPCS (old) and AAPCS (new). They are quite different on the subject of floating point values. For integer parameters, they are mostly identical (but AAPCS requires a stricter stack alignment). Of course, conventions vary depending on the instruction set, and the presence of Thumb interworking.

In some cases, it is possible to have some binary code which conforms to both ATPCS and AAPCS, but that is not reliable and there is no warning on mismatch. So the bottom line is: you cannot have true binary compatibility between systems which use distinct call conventions.

There are optional coprocessors.

The ARM architecture can be extended with optional elements, which add their own instructions to the core instruction set. The FPU is such an optional coprocessor (and it is very rarely encountered in practice). Another coprocessor is NEON, a SIMD instruction set found on some of the newer ARM processors.

Code which uses a coprocessor will not run on a processor which does not feature that coprocessor, unless the operating system traps the corresponding opcodes and emulates the coprocessor in software (this is more or less what happens with floating-point arguments when using the ATPCS call convention, and it is slow).

To sum up, if you have C code, then recompile it. Do not try to reuse code compiled for another architecture or system.

197

answered Oct 17 '22 09:10

Thomas Pornin

Think of this ARM vs ARM thing like a Wintel computer vs an Intel Mac. Assume you even have the same Intel chip (family) on both computers, so portions of your C code could be compiled once and run on both processors just fine. Where and why your programs vary has nothing to do with the Intel processor but everything to do with the chips and motherboard around it, plus the operating system in this case.

With ARM vs ARM, most of the differences are not in the core but in the vendor-specific logic that surrounds the core. So it is a loaded question. If your C code is some application calling standard API calls, then it should compile on ARM or Intel or PowerPC or whatever. If your application gets into talking to on-chip or on-board peripherals, then no matter what the processor type is, each board and each chip will vary, and as a result your C code has to be written for that chip or motherboard. If you compile a binary for ARMv6, it can and will have instructions deemed undefined on an ARMv4, and these will cause an exception. If you compile for ARMv4, the ARMv6 should run it just fine.

At best, if you are in this application space, then what you will likely see is just performance differences, some of which have to do with your choice of compiler options. And sometimes you can help with your code. I recommend avoiding divides and floating point wherever possible. I don't like multiplies, but will take a multiply instead of a divide if pushed. x86 has gotten us spoiled with unaligned accesses. If you start now with aligned I/O, it will save you down the road as you get into other chips that also prefer aligned accesses, or when you get bitten by the various ways operating systems and bootloaders configure the ARM to react, none of which is what you were used to on an x86. Likewise, keep this habit and your x86 code will run much faster.
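When data really does arrive at an arbitrary offset (in a network packet or a file buffer, say), one common way to stay out of trouble is to copy it into a properly aligned variable instead of dereferencing a cast pointer. This is a minimal sketch; `load_u32_unaligned` is a name chosen for the example, not a standard function.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Read a 32-bit value, in host byte order, from an address that may not
 * be 4-byte aligned. memcpy has no alignment requirement, so this is
 * well defined wherever src points. */
uint32_t load_u32_unaligned(const void *src)
{
    uint32_t v;
    assert(src != NULL);
    memcpy(&v, src, sizeof v);
    return v;
}
```

Compared with `*(const uint32_t *)src`, this does not rely on how a particular core, operating system or bootloader is set up to handle misaligned loads.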

Get a copy of the ARM ARM (google: ARM Architecture Reference Manual; you can download it for free in many places. I don't know what the current rev is, rev I or something maybe). Browse through the ARM instruction set and see that most instructions are supported on all cores, and some were added over time, like divide and byteswap and such. You will see there is nothing to fear between the cores.

Think from a systems perspective: the Wintel vs the Intel Mac. ARM does not make chips, they make and license cores. Most vendors that use an ARM in their chip have their own special sauce around it. So it is like the Wintel vs the Mac with the same processor in the middle, but completely different when it comes to all the stuff the processor touches and has to use. It doesn't stop with the ARM core; ARM sells peripherals, floating point units, caches, etc. So few if any ARMv4s are the same, for example. If your code touches the differences you will have problems; if it doesn't, you won't.

For the ARM portions of the chip, in addition to the ARM ARM there are TRMs (Technical Reference Manuals), but if you get the wrong TRM for the component you are using it may give you headaches. The TRM may have register descriptions and other such things that the ARM ARM doesn't, but if you are living in application space you likely won't need any of them, nor the ARM ARM. The ARM ARM is good for educational purposes if nothing else, such as understanding why you might not want to divide or use unaligned accesses.

answered Oct 17 '22 08:10

old_timer
