I'm assembling the following piece of assembler:
.syntax unified
.cpu cortex-m4
.thumb
.section .text
orr r1, #12800
orr r1, #12801
Essentially, just two OR instructions. If I look at the results with objdump, I get:
bla.o: file format elf32-littlearm
Disassembly of section .text:
00000000 <.text>:
0: f441 5148 orr.w r1, r1, #12800 ; 0x3200
4: f243 2101 movw r1, #12801 ; 0x3201
The second OR is silently changed into a MOVW! The assembler was run as follows:
arm-none-eabi-gcc -g -Wall -c bla.s
and it didn't show any warnings.
The version of as is GNU assembler version 2.29.51 (arm-none-eabi) using BFD version (GNU Tools for Arm Embedded Processors 7-2017-q4-major) 2.29.51.20171128, running on OSX.
Any idea why the second OR is changed into a MOV?
.syntax unified
.cpu cortex-m4
.thumb
.section .text
orr r1, #12800
orr r1, #12801
arm-none-eabi-as --version
GNU assembler (GNU Binutils) 2.29.1
Copyright (C) 2017 Free Software Foundation, Inc.
This program is free software; you may redistribute it under the terms of
the GNU General Public License version 3 or later.
This program has absolutely no warranty.
This assembler was configured for a target of `arm-none-eabi'.
build
arm-none-eabi-as so.s -o so.o
arm-none-eabi-objdump -D so.o
so.o: file format elf32-littlearm
Disassembly of section .text:
00000000 <.text>:
0: f441 5148 orr.w r1, r1, #12800 ; 0x3200
4: f243 2101 movw r1, #12801 ; 0x3201
Jester has the answer in a comment; you should upvote that.
2.30 was just released a couple of days ago. It also produces the same results.
Working backward, the issue started between 2.27.1 and 2.28. The tc-arm.c changes for that release were related to the addition of ARMv8-M (Cortex-M23 and Cortex-M33).
Here is the bug in gas
    /* MOV accepts both Thumb2 modified immediate (T2 encoding) and
       UINT16 (T3 encoding), MOVW only accepts UINT16.  When
       disassembling, MOV is preferred when there is no encoding
       overlap.
       NOTE: MOV is using ORR opcode under Thumb 2 mode.  */
    if (((newval >> T2_DATA_OP_SHIFT) & 0xf) == T2_OPCODE_ORR
        && ARM_CPU_HAS_FEATURE (cpu_variant, arm_ext_v6t2_v8m)
        && !((newval >> T2_SBIT_SHIFT) & 0x1)
        && value >= 0 && value <= 0xffff)
      {
        /* Toggle bit[25] to change encoding from T2 to T3.  */
        newval ^= 1 << 25;
        /* Clear bits[19:16].  */
        newval &= 0xfff0ffff;
        /* Encoding high 4bits imm.  Code below will encode the
           remaining low 12bits.  */
        newval |= (value & 0x0000f000) << 4;
        newimm = value & 0x00000fff;
      }
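To make the effect of that excerpt concrete, here is a minimal C sketch that replays the same bit operations outside of gas. Several things here are my assumptions and not gas code: the function names, the starting word 0xf0410100 (an ORR immediate with Rn = Rd = r1 and empty immediate fields), dropping the CPU feature test, and dropping the `value >= 0` test because the value is unsigned here. `is_mov_alias` is a hypothetical guard that only illustrates what distinguishes a real MOV from an ORR; it is not claimed to be the committed patch.

```c
#include <assert.h>
#include <stdint.h>

/* Field positions as used in the gas excerpt; the names mirror it. */
#define T2_DATA_OP_SHIFT 21
#define T2_SBIT_SHIFT    20
#define T2_OPCODE_ORR    0x2u

/* Scatter a 12-bit immediate into the Thumb-2 i:imm3:imm8 fields
   (bit 26, bits 14:12, bits 7:0). */
uint32_t place_imm12(uint32_t insn, uint32_t imm12)
{
    assert(imm12 <= 0xfffu);
    insn |= (imm12 & 0x800u) << 15;
    insn |= (imm12 & 0x700u) << 4;
    insn |= (imm12 & 0x0ffu);
    return insn;
}

/* The rewrite as in the excerpt (minus the CPU feature test): any
   non-flag-setting ORR with a 16-bit value becomes the MOVW encoding. */
uint32_t rewrite_as_excerpt(uint32_t insn, uint32_t value)
{
    if (((insn >> T2_DATA_OP_SHIFT) & 0xfu) == T2_OPCODE_ORR
        && !((insn >> T2_SBIT_SHIFT) & 0x1u)
        && value <= 0xffffu) {
        insn ^= 1u << 25;                 /* toggle bit 25: T2 -> T3 */
        insn &= 0xfff0ffffu;              /* clear bits 19:16 (Rn) */
        insn |= (value & 0xf000u) << 4;   /* imm4 */
        return place_imm12(insn, value & 0x0fffu);
    }
    return insn;
}

/* Hypothetical guard: true only when Rn (bits 19:16) is 0b1111,
   which is the ORR encoding that really is MOV. */
int is_mov_alias(uint32_t insn)
{
    return ((insn >> T2_DATA_OP_SHIFT) & 0xfu) == T2_OPCODE_ORR
        && ((insn >> 16) & 0xfu) == 0xfu;
}
```

Calling `rewrite_as_excerpt(0xf0410100u, 0x3201u)` gives 0xf2432101, the same f243 2101 that objdump shows. A genuine MOV to r1 (0xf04f0100, Rn = 0b1111) produces the same word, which is only correct in that case; `is_mov_alias` tells the two apart, and the condition in the excerpt never looks at Rn.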
The ARM documentation for these instructions is over 10 years old now, and no one has indicated that it is buggy with respect to them.
Yes, there is an otherwise unused ORR encoding (the one with Rn set to 0b1111) that is used as a MOV encoding; this is typical, not uncommon, in instruction set design. In no way, shape, or form does this mean a MOV is an ORR. Further, once the mistake was made of thinking a MOV was an ORR, the other MOV encoding was chosen. I am speechless.
Even worse, this has been present for almost a year in the released versions of gas. How is that possible?
Part of how it is possible is that GCC knows better: it encodes this as two separate instructions.
orr r1,#0x3200
orr r1,#0x0001
So, beyond the obvious lack of peer review in the GNU world, finding this would have required a human to write such an instruction by hand. The ARM immediate encoding rules are easier to remember than the Thumb rules. Folks are always struggling with immediates; it is the nature of the beast for RISC instruction sets. Someone should have hit this by now, and someone now has.
Trying it on hardware, a Cortex-M7:
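For anyone who wants to check an immediate before typing it, here is a rough C sketch of the Thumb-2 modified-immediate rule as I understand it from the ARM documentation: a plain byte, a byte replicated in one of three patterns, or an 8-bit value with its top bit set rotated right by 8 to 31. The function names are made up for illustration, and `split_for_orr` is just one simple way to get chunks like the two GCC emitted, not GCC's actual algorithm.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if v can be written as a Thumb-2 modified immediate constant. */
bool thumb2_modified_imm_ok(uint32_t v)
{
    uint32_t lo = v & 0xffu;         /* candidate byte in bits 7:0  */
    uint32_t hi = (v >> 8) & 0xffu;  /* candidate byte in bits 15:8 */

    if (v == lo)                                     /* 0x000000XY */
        return true;
    if (v == (lo | lo << 16))                        /* 0x00XY00XY */
        return true;
    if (v == (hi << 8 | hi << 24))                   /* 0xXY00XY00 */
        return true;
    if (v == (lo | lo << 8 | lo << 16 | lo << 24))   /* 0xXYXYXYXY */
        return true;

    /* 8-bit value with bit 7 set, rotated right by 8..31:
       undo the rotation and see whether a byte 0x80..0xff is left. */
    for (unsigned rot = 8; rot < 32; rot++) {
        uint32_t unrotated = (v << rot) | (v >> (32 - rot));
        if (unrotated >= 0x80u && unrotated <= 0xffu)
            return true;
    }
    return false;
}

/* Split v into chunks that each fit in an 8-bit window starting at the
   lowest remaining set bit; OR-ing the chunks together gives v back.
   Returns the number of chunks (at most 4). */
int split_for_orr(uint32_t v, uint32_t chunks[4])
{
    int n = 0;
    while (v != 0) {
        unsigned p = 0;
        while (((v >> p) & 1u) == 0)
            p++;
        uint32_t chunk = v & (0xffu << p);
        assert(n < 4);
        chunks[n++] = chunk;
        v &= ~chunk;
    }
    return n;
}
```

With this, 0x3200 (12800) passes the check because it is 0xC8 shifted left by 6, while 0x3201 (12801) fails because its set bits span 14 positions. Splitting 0x3201 gives 0x0001 and 0x3200, the same two constants as in the GCC output above.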
test.s
.cpu cortex-m7
.syntax unified
.thumb
.thumb_func
.globl test1
test1:
orr r0,#0x3200
bx lr
.thumb_func
.globl test2
test2:
orr r0,#0x3201
bx lr
run and print out the results
hexstring(test1(0x0000));
hexstring(test2(0x0000));
hexstring(test1(0x00FE));
hexstring(test2(0x00FE));
gas
arm-none-eabi-as --version
GNU assembler (GNU Binutils) 2.30
result
0800005c <test1>:
800005c: f440 5048 orr.w r0, r0, #12800 ; 0x3200
8000060: 4770 bx lr
08000062 <test2>:
8000062: f243 2001 movw r0, #12801 ; 0x3201
8000066: 4770 bx lr
output
00003200
00003201
000032FE
00003201
A MOV is a MOV not an ORR.
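The four lines of output can be reproduced with a tiny C model of what each routine does to r0. This is only a sketch of the instruction semantics, not the harness that ran on the board, and the function names are mine.

```c
#include <stdint.h>

/* test1 as assembled: orr.w r0, r0, #0x3200 keeps the bits already in r0. */
uint32_t test1_as_assembled(uint32_t r0)
{
    return r0 | 0x3200u;
}

/* test2 as assembled by the affected gas: movw r0, #0x3201 discards r0. */
uint32_t test2_as_assembled(uint32_t r0)
{
    (void)r0;
    return 0x3201u;
}

/* test2 as written in the source: orr r0, #0x3201. */
uint32_t test2_as_written(uint32_t r0)
{
    return r0 | 0x3201u;
}
```

For an input of 0x00FE, the model of the assembled test2 returns 0x3201, matching the last line of the output, whereas the instruction as written should have produced 0x32FF.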
You have found a very nasty bug in the GNU assembler; I recommend that you file it. Despite how obvious this bug is, I am very curious to see what happens. I have filed other bugs in the past, and they have made excuses rather than fixes and left the bugs in place. Please post the link to the ticket as a comment if you choose to file this, so we can all see what they do about it.
bada43421274615d0d5f629a61a60b7daa71bc15 tc-arm.c:23596 is the correct commit and location.
The gas team has confirmed that this is a bug, and has checked in a patch. The Bugzilla entry can be found at https://sourceware.org/bugzilla/show_bug.cgi?id=22773