
In GCC-style extended inline asm, is it possible to output a "virtualized" boolean value, e.g. the carry flag?

If I have the following C++ code to compare two 128-bit unsigned integers, with inline x86-64 (AMD64) asm:

struct uint128_t {
    uint64_t lo, hi;
};
inline bool operator< (const uint128_t &a, const uint128_t &b)
{
    uint64_t temp;
    bool result;
    __asm__(
        "cmpq %3, %2;"
        "sbbq %4, %1;"
        "setc %0;"
        : // outputs:
        /*0*/"=r,1,2"(result),
        /*1*/"=r,r,r"(temp)
        : // inputs:
        /*2*/"r,r,r"(a.lo),
        /*3*/"emr,emr,emr"(b.lo),
        /*4*/"emr,emr,emr"(b.hi),
        "1"(a.hi));
    return result;
}

Then it will be inlined quite efficiently, but with one flaw: the return value passes through the "interface" of a general-purpose register holding 0 or 1. This adds two or three unnecessary instructions and detracts from a compare operation that would otherwise be fully optimized. The generated code will look something like this:

    mov    r10, [r14]
    mov    r11, [r14+8]
    cmp    r10, [r15]
    sbb    r11, [r15+8]
    setc   al
    movzx  eax, al
    test   eax, eax
    jnz    is_lessthan

If I use "sbb %0,%0" with an "int" return value instead of "setc %0" with a "bool" return value, there are still two extra instructions:

    mov    r10, [r14]
    mov    r11, [r14+8]
    cmp    r10, [r15]
    sbb    r11, [r15+8]
    sbb    eax, eax
    test   eax, eax
    jnz    is_lessthan

What I want is this:

    mov    r10, [r14]
    mov    r11, [r14+8]
    cmp    r10, [r15]
    sbb    r11, [r15+8]
    jc     is_lessthan

GCC extended inline asm is wonderful, otherwise. But I want it to be just as good as an intrinsic function would be, in every way. I want to be able to directly return a boolean value in the form of the state of a CPU flag or flags, without having to "render" it into a general register.

Is this possible, or would GCC (and the Intel C++ compiler, which also allows this form of inline asm to be used) have to be modified or even refactored to make it possible?

Also, while I'm at it — is there any other way my formulation of the compare operator could be improved?

asked Feb 20 '10 08:02 by Deadcode


2 Answers

Here we are almost 7 years later, and YES, GCC finally added support for "outputting flags" (added in 6.1.0, released ~April 2016). The detailed docs are in the GCC manual under "Flag Output Operands", but in short, it looks like this:

/* Test if bit 0 is set in 'value' */
char a;

asm("bt $0, %1"
    : "=@ccc" (a)
    : "r" (value) );

if (a)
   blah;

To understand =@ccc: The output constraint (which requires =) is of type @cc followed by the condition code to use (in this case c to reference the carry flag).
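
Applied to the 128-bit comparison from the question, the flag output might look like the sketch below. It assumes GCC 6.1 or later (or another compiler that defines __GCC_ASM_FLAG_OUTPUTS__) on x86-64, and falls back to plain C++ otherwise. The function name less_than and the named operands are illustrative choices, not part of the original code.

```cpp
#include <cstdint>

struct uint128_t {
    uint64_t lo, hi;
};

// Sketch: returns a < b, reading the carry flag directly via "=@ccc".
inline bool less_than(const uint128_t &a, const uint128_t &b)
{
#if defined(__GCC_ASM_FLAG_OUTPUTS__) && defined(__x86_64__)
    uint64_t temp = a.hi;  // overwritten by sbbq, hence a read-write operand
    bool carry;
    __asm__(
        "cmpq %[blo], %[alo]\n\t"  // CF = borrow from a.lo - b.lo
        "sbbq %[bhi], %[tmp]"      // a.hi - b.hi - CF; final CF = (a < b)
        : "=@ccc"(carry), [tmp] "+r"(temp)
        : [alo] "r"(a.lo), [blo] "erm"(b.lo), [bhi] "erm"(b.hi));
    return carry;
#else
    // Assumed fallback for compilers or targets without flag outputs.
    return a.hi < b.hi || (a.hi == b.hi && a.lo < b.lo);
#endif
}
```

Because the result is read straight from the carry flag, the compiler is free to branch on it with jc instead of materializing it with setc first, though whether it does so depends on the optimization level and the surrounding code.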

OK, this may not be an issue for your specific case anymore (since GCC now supports comparing 128-bit data types directly), but (currently) 1,326 people have viewed this question. Apparently there's some interest in this feature.
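
For comparison, here is a minimal sketch of the same test with no inline asm, assuming a 64-bit target where GCC (or Clang) provides the unsigned __int128 extension. The helper names make_u128 and less_than_builtin are hypothetical.

```cpp
#include <cstdint>

// Build a 128-bit value from two 64-bit halves.
inline unsigned __int128 make_u128(uint64_t hi, uint64_t lo)
{
    return (static_cast<unsigned __int128>(hi) << 64) | lo;
}

// The compiler chooses the instruction sequence for the comparison itself.
inline bool less_than_builtin(unsigned __int128 a, unsigned __int128 b)
{
    return a < b;
}
```

On x86-64 the compiler may lower this to a cmp/sbb pair much like the hand-written version, but the exact output depends on the compiler version and flags.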

Now I personally favor the school of thought that says don't use inline asm at all. But if you must, yes you can (now) 'output' flags.

FWIW.

answered Nov 08 '22 04:11 by David Wohlferd


I don't know a way to do this. You may or may not consider this an improvement:

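inline bool operator< (const uint128_t &a, const uint128_t &b)
{
    uint64_t temp = a.hi;
    __asm__(
        "cmpq %2, %1;"   // CF = borrow from a.lo - b.lo
        "sbbq $0, %0;"   // temp = a.hi - CF
        : // outputs:
        /*0*/"=r"(temp)
        : // inputs:
        /*1*/"r"(a.lo),
        /*2*/"mr"(b.lo),
        "0"(temp));

    // Caveat: when a.hi == 0 and a.lo < b.lo, temp wraps around to 2^64-1,
    // so this comparison yields false even though a < b.
    return temp < b.hi;
}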

It produces something like:

    mov    rdx, [r14]
    mov    rax, [r14+8]
    cmp    rdx, [r15]
    sbb    rax, 0
    cmp    rax, [r15+8]
    jc     is_lessthan

answered Nov 08 '22 05:11 by andrewffff