Which is the "best" free/OSS compiler for MIPS code: GCC, LLVM, or something even better than those? I care more about fast, memory-constrained generated assembly code than about code size. In other words, does llvm-opt do the job better than gcc -O3?


LLVM vs GCC MIPS code generation, any benchmarks?

2 Answers

http://www.phoronix.com/scan.php?page=news_item&px=OTI1MA "LLVM 2.9 Release Candidate 2 Is Here", posted by Michael Larabel on March 25, 2011: no LLVM ARM benchmarks due to lack of hardware...

Perhaps someone with a fast dual/quad-core ARM Cortex and LLVM ARM etc. can run an http://openbenchmarking.org/ benchmark before Monday, so that Michael can add the results to his others.

answered Oct 05 '22 03:10

techU

I don't know about MIPS. I tried ARM, and the llvm code was around 10-20% slower than the current gcc. The tests in question were zlib based: a decompression by itself, and a compression followed by a decompression. I used both clang and llvm-gcc. I preferred clang because -m32 actually works on a 64-bit host.

For the test in question I found that NOT using -O2 (or -O3) produced the fastest code. Instead, I linked the bytecode modules into one big module and ran opt once with the standard optimisations. llc defaults to -O2, and that did help performance.

EDIT:

An interesting comparison between gcc and llvm/clang for MIPS:

void dummy ( unsigned int );
void dowait ( void )
{
    unsigned int ra;
    for(ra=0x80000;ra;ra--) dummy(ra);
}

gcc produced:

9d006034 <dowait>:
9d006034:   27bdffe8    addiu   sp,sp,-24
9d006038:   afb00010    sw  s0,16(sp)
9d00603c:   afbf0014    sw  ra,20(sp)
9d006040:   3c100008    lui s0,0x8
9d006044:   02002021    move    a0,s0
9d006048:   0f40180a    jal 9d006028 <dummy>
9d00604c:   2610ffff    addiu   s0,s0,-1
9d006050:   1600fffd    bnez    s0,9d006048 <dowait+0x14>
9d006054:   02002021    move    a0,s0
9d006058:   8fbf0014    lw  ra,20(sp)
9d00605c:   8fb00010    lw  s0,16(sp)
9d006060:   03e00008    jr  ra
9d006064:   27bd0018    addiu   sp,sp,24

And llvm, after assembling:

9d006034 <dowait>:
9d006034:   27bdffe8    addiu   sp,sp,-24
9d006038:   afbf0014    sw  ra,20(sp)
9d00603c:   afb00010    sw  s0,16(sp)
9d006040:   3c020008    lui v0,0x8
9d006044:   34440000    ori a0,v0,0x0
9d006048:   2490ffff    addiu   s0,a0,-1
9d00604c:   0f40180a    jal 9d006028 <dummy>
9d006050:   00000000    nop
9d006054:   00102021    addu    a0,zero,s0
9d006058:   1600fffb    bnez    s0,9d006048 <dowait+0x14>
9d00605c:   00000000    nop
9d006060:   8fb00010    lw  s0,16(sp)
9d006064:   8fbf0014    lw  ra,20(sp)
9d006068:   27bd0018    addiu   sp,sp,24
9d00606c:   03e00008    jr  ra
9d006070:   00000000    nop

I say "after assembling" because I saw GNU as take input like this

.globl PUT32
PUT32:
    sw $a1,0($a0)
    jr $ra
    nop

and rearrange the assembly for me:

9d00601c <PUT32>:
9d00601c:   03e00008    jr  ra
9d006020:   ac850000    sw  a1,0(a0)
9d006024:   00000000    nop

The difference between the llvm- and gcc-produced code is which instructions are placed in the branch delay slots. I used clang and llc to produce assembly output, then used binutils (GNU as) to create the binary. So it is a curiosity that for my hand-written assembly:

ori $sp,$sp,0x2000
jal notmain
nop

the assembler optimized it for me:

9d006004:   0f401820    jal 9d006080 <notmain>
9d006008:   37bd2000    ori sp,sp,0x2000
9d00600c:   00000000    nop

but the llc-generated code

addiu   $16, $4, -1
jal dummy
nop

was not:

9d006048:   2490ffff    addiu   s0,a0,-1
9d00604c:   0f40180a    jal 9d006028 <dummy>
9d006050:   00000000    nop

answered Oct 05 '22 03:10

old_timer


Tags: gcc, assembly, benchmarking, llvm, mips

Paulo Lopes
