
LLVM vs GCC MIPS code generation, any benchmarks?

Which free/OSS compiler generates the "best" MIPS code: GCC, LLVM, or something better than either?

I care more about how fast the generated assembly runs, and how well it suits memory-constrained targets, than about code size.

In other words, does llvm-opt do the job better than gcc -O3?

Paulo Lopes asked Jan 08 '09 11:01


2 Answers

From Phoronix, http://www.phoronix.com/scan.php?page=news_item&px=OTI1MA

"LLVM 2.9 Release Candidate 2 Is Here

Posted by Michael Larabel on March 25, 2011 no LLVM ARM benchmarks due to lack of hardware..."

Perhaps someone with a fast dual- or quad-core ARM Cortex and LLVM ARM can run an http://openbenchmarking.org/ benchmark before Monday, so that Michael can add the results to his others.

techU answered Oct 05 '22 03:10


I don't know about MIPS. I tried ARM, and the llvm code was around 10-20% slower than the current gcc. The tests in question were zlib based: a decompression by itself, and a compression followed by a decompression. I used both clang and llvm-gcc, and preferred clang because -m32 actually works on a 64-bit host.

For the test in question I found that NOT using -O2 (or -O3) produced the fastest code. I linked the bytecode modules into one big module and ran opt once, with the standard optimisations, to get the fastest code. llc defaulted to -O2, and that did help performance.

EDIT:

An interesting test comparing gcc and llvm/clang for MIPS:

void dummy ( unsigned int );
void dowait ( void )
{
    unsigned int ra;
    for(ra=0x80000;ra;ra--) dummy(ra);
}

gcc produced:

9d006034 <dowait>:
9d006034:   27bdffe8    addiu   sp,sp,-24
9d006038:   afb00010    sw  s0,16(sp)
9d00603c:   afbf0014    sw  ra,20(sp)
9d006040:   3c100008    lui s0,0x8
9d006044:   02002021    move    a0,s0
9d006048:   0f40180a    jal 9d006028 <dummy>
9d00604c:   2610ffff    addiu   s0,s0,-1
9d006050:   1600fffd    bnez    s0,9d006048 <dowait+0x14>
9d006054:   02002021    move    a0,s0
9d006058:   8fbf0014    lw  ra,20(sp)
9d00605c:   8fb00010    lw  s0,16(sp)
9d006060:   03e00008    jr  ra
9d006064:   27bd0018    addiu   sp,sp,24

And llvm, after assembling:

9d006034 <dowait>:
9d006034:   27bdffe8    addiu   sp,sp,-24
9d006038:   afbf0014    sw  ra,20(sp)
9d00603c:   afb00010    sw  s0,16(sp)
9d006040:   3c020008    lui v0,0x8
9d006044:   34440000    ori a0,v0,0x0
9d006048:   2490ffff    addiu   s0,a0,-1
9d00604c:   0f40180a    jal 9d006028 <dummy>
9d006050:   00000000    nop
9d006054:   00102021    addu    a0,zero,s0
9d006058:   1600fffb    bnez    s0,9d006048 <dowait+0x14>
9d00605c:   00000000    nop
9d006060:   8fb00010    lw  s0,16(sp)
9d006064:   8fbf0014    lw  ra,20(sp)
9d006068:   27bd0018    addiu   sp,sp,24
9d00606c:   03e00008    jr  ra
9d006070:   00000000    nop

I say "after assembling" because I have seen GNU as take input like this

.globl PUT32
PUT32:
    sw $a1,0($a0)
    jr $ra
    nop

and re-arrange the assembly for me:

9d00601c <PUT32>:
9d00601c:   03e00008    jr  ra
9d006020:   ac850000    sw  a1,0(a0)
9d006024:   00000000    nop

The difference between the llvm- and gcc-produced code is which instructions end up in the branch delay slots. I used clang and llc to produce assembly output, then used binutils (GNU as) to create the binary. So it is curious that for my hand-written assembly:

ori $sp,$sp,0x2000
jal notmain
nop

it optimized for me:

9d006004:   0f401820    jal 9d006080 <notmain>
9d006008:   37bd2000    ori sp,sp,0x2000
9d00600c:   00000000    nop

but the llc generated code

addiu   $16, $4, -1
jal dummy
nop

was not

9d006048:   2490ffff    addiu   s0,a0,-1
9d00604c:   0f40180a    jal 9d006028 <dummy>
9d006050:   00000000    nop
old_timer answered Oct 05 '22 03:10