Inline NOPs not optimized out in LLVM

Question

I'm working through an example in this overview of compiling inline ARM assembly using GCC. Rather than GCC, I'm using llvm-gcc 4.2.1, and I'm compiling the following C code:

#include <stdio.h>
int main(void) {
    printf("Volatile NOP
");
    asm volatile("mov r0, r0");
    printf("Non-volatile NOP
");
    asm("mov r0, r0");
    return 0;
}

Using the following commands:

llvm-gcc -emit-llvm -c -o compiled.bc input.c
llc -O3 -march=arm -o output.s compiled.bc

My output.s ARM ASM file looks like this:

    .syntax unified
    .eabi_attribute 20, 1
    .eabi_attribute 21, 1
    .eabi_attribute 23, 3
    .eabi_attribute 24, 1
    .eabi_attribute 25, 1
    .file   "compiled.bc"
    .text
    .globl  main
    .align  2
    .type   main,%function
main:                                   @ @main
@ BB#0:                                 @ %entry
    str lr, [sp, #-4]!
    sub sp, sp, #16
    str r0, [sp, #12]
    ldr r0, .LCPI0_0
    str r1, [sp, #8]
    bl  puts
    @APP
    mov r0, r0
    @NO_APP
    ldr r0, .LCPI0_1
    bl  puts
    @APP
    mov r0, r0
    @NO_APP
    mov r0, #0
    str r0, [sp, #4]
    str r0, [sp]
    ldr r0, [sp, #4]
    add sp, sp, #16
    ldr lr, [sp], #4
    bx  lr
@ BB#1:
    .align  2
.LCPI0_0:
    .long   .L.str

    .align  2
.LCPI0_1:
    .long   .L.str1

.Ltmp0:
    .size   main, .Ltmp0-main

    .type   .L.str,%object          @ @.str
    .section    .rodata.str1.1,"aMS",%progbits,1
.L.str:
    .asciz   "Volatile NOP"
    .size   .L.str, 13

    .type   .L.str1,%object         @ @.str1
    .section    .rodata.str1.16,"aMS",%progbits,1
    .align  4
.L.str1:
    .asciz   "Non-volatile NOP"
    .size   .L.str1, 17

The two NOPs are between their respective @APP/@NO_APP pairs. My expectation is that the asm() statement without the volatile keyword will be optimized out of existence due to the -O3 flag, but clearly both inline assembly statements survive.

Why does the asm("mov r0, r0") line not get recognized and removed as a NOP?

5 revs · Accepted Answer

As Mystical and Mārtiņš Možeiko have describe the compiler does not optimize the code; ie, change the instructions. What the compiler does optimize is when the instruction is scheduled. When you use volatile, then the compiler will not re-schedule. In your example, re-scheduling would be moving before or after the printf.

The other optimization the compiler might make is to get C values to register for you. Register allocation is very important to optimization. This doesn't optimize the assembler, but allow the compiler to do sensible things with other code with-in the function.

To see the effect of volatile, here is some sample code,

int example(int test, int add)
{
  int v1=5, v2=0;
  int i=0;
  if(test) {
    asm volatile("add %0, %1, #7" : "=r" (v2) : "r" (v2));
    i+= add * v1;
    i+= v2;
  } else {
    asm ("add %0, %1, #7" : "=r" (v2) : "r" (v2));
    i+= add * v1;
    i+= v2;
  }
  return i;
}

The two branches have identical code except for the volatile. gcc 4.7.2 generates the following code for an ARM926,

example:
   cmp  r0, #0
   bne  1f           /* branch if test set? */
   add  r1, r1, r1, lsl #2
   add  r0, r0, #7   /* add seven delayed */
   add  r0, r0, r1
   bx   lr
1: mov  r0, #0       /* test set */
   add  r0, r0, #7   /* add seven immediate */
   add  r1, r1, r1, lsl #2
   add  r0, r0, r1
   bx   lr

Note: The assembler branches are reversed to the 'C' code. The 2nd branch is slower on some processors due to pipe lining. The compiler prefers that

   add  r1, r1, r1, lsl #2
   add  r0, r0, r1

do not execute sequentially.

The Ethernut ARM Tutorial is an excellent resource. However, optimize is a bit of an overloaded word. The compiler doesn't analyze the assembler, only the arguments and where the code will be emitted.

Timothy Baldwin · Answer

volatile is implied if the asm statement has no outputs declared.

Inline NOPs not optimized out in LLVM

Tags:

gcc

assembly

llvm

arm

Zeke

2 Answers

5 revs

Timothy Baldwin

Recent Activity

Donate For Us

Inline NOPs not optimized out in LLVM

Tags:

gcc

assembly

llvm

arm

Zeke

2 Answers

5 revs

Timothy Baldwin

Related questions

Recent Activity

Donate For Us