
In assembly, how to add integers without destroying either operand?

Using AT&T syntax on x86-64, I wish to assemble c = a + b; as

add %[a], %[b], %[c]

Unfortunately, GNU's assembler will not do it. Why not?

DETAILS

According to Intel's Software Developer's Manual, rev. 75 (June 2021), vol. 2, section 2.5,

VEX-encoded general-purpose-register instructions have ... instruction syntax support for three encodable operands.

The VEX prefix is an AVX feature, so x86-64 CPUs from Sandy Bridge/Bulldozer onward implement it. That's ten years ago, so GNU's assembler ought to assemble my three-operand instruction, oughtn't it?

To clarify, I am aware that one can write it in the old style as

mov %[a], %[c]
add %[b], %[c]

However, I wish to write it in the new, VEX style. Incidentally, I have informed the assembler that I have a modern CPU by issuing GCC the -march=skylake command-line option.

What is my mistake, please?

SAMPLE CODE

In a C++ wrapper,

#include <cstddef>
#include <iostream>

int main()
{
    volatile int a{8};
    volatile int b{5};
    volatile int c{0};
    //c = a + b;
    asm volatile (
        //"mov %[a], %[c]\n\t"
        //"add %[b], %[c]\n\t"
        "add %[a], %[b], %[c]\n\t"
        : [c] "=&r" (c)
        : [a] "r" (a), [b] "r" (b)
        : "cc"
    );
    std::cout << c << "\n";
}
thb asked Nov 16 '21 19:11





1 Answer

Only a few specific GPR instructions have VEX encodings, primarily the BMI1/BMI2 instructions that were added after AVX already existed. See the list in Table 2-28, which has ANDN, BEXTR, BLSI, BLSMSK, BLSR, BZHI, MULX, PDEP, PEXT, RORX, SARX, SHLX, SHRX, as well as the same list in 5.1.16.1. For example, the manual entry for andn lists only a VEX encoding, whereas the entry for and doesn't list any VEX encoding.

So Intel (unfortunately) didn't introduce a brand new three-operand alternate encoding for the entire instruction set. They just introduced a few specific instructions that take three operands and use VEX for it. In some cases these have similar or equivalent functionality to an existing instruction, e.g. SHLX for SHL with a variable count, and so effectively provide a three-operand version of the previous two-operand instruction, but only in those special cases. There are not equivalent instructions across the board.

The "old style" two-operand form remains the only version of the add instruction. However, as fuz points out in comments, lea can be a good way to add two registers and write the result to a third, subject to some restrictions on operand size.
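As a minimal sketch of the lea approach (not code from the answer itself): the hypothetical helper add_lea below adds two 64-bit values with a single lea and leaves both inputs intact. The function name, the choice of std::uint64_t, and the portable fallback for non-x86-64 builds are assumptions made for illustration.

```cpp
#include <cstdint>

// Hypothetical helper: returns a + b without modifying either input register.
std::uint64_t add_lea(std::uint64_t a, std::uint64_t b)
{
#if defined(__x86_64__) && defined(__GNUC__)
    std::uint64_t c;
    // lea computes the "address" a + b and writes it to a third register.
    // It reads both inputs before writing its output, so a plain "=r"
    // output is enough, and it does not modify the flags.
    asm ("lea (%[a], %[b]), %[c]"
         : [c] "=r" (c)
         : [a] "r" (a), [b] "r" (b));
    return c;
#else
    return a + b;  // fallback so the sketch still builds on other targets
#endif
}
```

With this sketch, add_lea(8, 5) evaluates to 13. Using 64-bit operands keeps the registers in the address expression at the natural address size for x86-64.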

See Using LEA on values that aren't addresses / pointers? for more general things LEA can do, like copy-and-add a constant to a register, or shift-and-add. Compilers already know this and will use lea where appropriate, any time it saves instructions. (Or with some tune options like -mtune=atom for old in-order Atom, will use lea even when they could have used add.)
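The other lea forms mentioned above can be sketched the same way. The three helpers below are hypothetical names chosen for illustration, as is the USE_LEA_ASM macro; they show copy-and-add of a constant, and shift-and-add through the scaled-index addressing mode (the scale must be 1, 2, 4, or 8).

```cpp
#include <cstdint>

#if defined(__x86_64__) && defined(__GNUC__)
#define USE_LEA_ASM 1
#else
#define USE_LEA_ASM 0
#endif

// Copy-and-add a constant: c = a + 5, with a left intact.
std::uint64_t add_five(std::uint64_t a)
{
#if USE_LEA_ASM
    std::uint64_t c;
    asm ("lea 5(%[a]), %[c]" : [c] "=r" (c) : [a] "r" (a));
    return c;
#else
    return a + 5;
#endif
}

// Shift-and-add: c = a + a*4, that is, a*5.
std::uint64_t times_five(std::uint64_t a)
{
#if USE_LEA_ASM
    std::uint64_t c;
    asm ("lea (%[a], %[a], 4), %[c]" : [c] "=r" (c) : [a] "r" (a));
    return c;
#else
    return a * 5;
#endif
}

// Base + scaled index + displacement: c = a + b*8 + 1.
std::uint64_t scaled_sum(std::uint64_t a, std::uint64_t b)
{
#if USE_LEA_ASM
    std::uint64_t c;
    asm ("lea 1(%[a], %[b], 8), %[c]"
         : [c] "=r" (c)
         : [a] "r" (a), [b] "r" (b));
    return c;
#else
    return a + b * 8 + 1;
#endif
}
```

In ordinary code there is usually no reason to write these by hand; the inline asm is only there to make the addressing modes visible.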

If more flexible encodings existed for other common integer instructions, like and/xor/sub, gcc -O3 -march=skylake would already be using them in its own asm output, without needing inline asm. And where an alternative instruction can get the job done, like lea for add, the compiler already uses it, so it makes sense to look at compiler output to see what tricks it knows. Experimenting by hand makes more sense in a stand-alone .s file that just makes an exit system call, or that you just single-step in a debugger, which removes the complexity of using inline asm. (GAS by default doesn't restrict instruction sets, and gcc -march=skylake doesn't pass that option on to the assembler, as.)
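One way to look at compiler output is to compile a few tiny functions to assembly and read the result. The sketch below is plain C++ with no inline asm; the function names and the file name in the comment are hypothetical, and the exact instructions chosen depend on the compiler version and options.

```cpp
// Meant to be compiled to assembly and read, for example with:
//   g++ -O3 -march=skylake -S sum.cpp      (sum.cpp is a hypothetical name)

// The result lands in a different register than either argument, so an
// optimizing compiler targeting x86-64 will typically emit a single lea.
int add_copy(int a, int b) { return a + b; }

// There is no lea-like trick for a bitwise and, so the typical output is
// a mov followed by a two-operand and.
int and_copy(int a, int b) { return a & b; }
```

Comparing the two functions in the generated .s file shows where the compiler finds a non-destructive single instruction and where it falls back to mov plus a two-operand instruction.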


In your inline asm, your c operand should be output-only: =r instead of +r. The old value is overwritten, so there's no need to tell the compiler to produce it as an input. (Like you said, you want c = a+b not c += a+b.)

Using a single lea as the asm template means you don't need a =&r early-clobber output, because your asm will read all its inputs before writing that output. In your case, having it as an input/output was probably stopping the compiler from choosing the same register as one of the inputs, which could have broken with mov; add.
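Putting that advice together, here is a hedged sketch of how the question's int example might look with a single lea and a plain =r output. The function name add_ints is hypothetical. Because the operands are 32-bit, the sketch uses GCC's q operand modifier so that the address expression names the 64-bit registers, while the destination stays 32-bit; any garbage in the upper halves of the inputs is discarded when the result is truncated to 32 bits.

```cpp
// Hypothetical wrapper around the question's c = a + b, using one lea.
int add_ints(int a, int b)
{
#if defined(__x86_64__) && defined(__GNUC__)
    int c;
    // %q[a] and %q[b] print the 64-bit names of the input registers;
    // %[c] prints the 32-bit name of the output register.
    // No early-clobber is needed: lea reads its inputs before writing c.
    asm ("lea (%q[a], %q[b]), %[c]"
         : [c] "=r" (c)
         : [a] "r" (a), [b] "r" (b));
    return c;
#else
    return a + b;  // fallback so the sketch still builds on other targets
#endif
}
```

With this sketch, add_ints(8, 5) evaluates to 13, matching the c = a + b the question asked for.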

Nate Eldredge answered Nov 11 '22 05:11
