I want to write the following loop using GCC extended inline ASM:
long* arr = new long[ARR_LEN]();
long* act_ptr = arr;
long* end_ptr = arr + ARR_LEN;
while (act_ptr < end_ptr)
{
*act_ptr = SOME_VALUE;
act_ptr += STEP_SIZE;
}
delete[] arr;
An array of type long with length ARR_LEN is allocated and zero-initialized. The loop walks through the array with an increment of STEP_SIZE. Every touched element is set to SOME_VALUE.
Well, this was my first attempt in GAS:
long* arr = new long[ARR_LEN]();
asm volatile
(
"loop:"
"movl %[sval], (%[aptr]);"
"leal (%[aptr], %[incr], 4), %[aptr];"
"cmpl %[eptr], %[aptr];"
"jl loop;"
: // no output
: [aptr] "r" (arr),
[eptr] "r" (arr + ARR_LEN),
[incr] "r" (STEP_SIZE),
[sval] "i" (SOME_VALUE)
: "cc", "memory"
);
delete[] arr;
As mentioned in the comments, it is true that this assembler code is more of a do {...} while loop, but it does in fact do the same work.
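To make the while versus do {...} while remark concrete, here is a small sketch in plain C++. The helper names fill_while and fill_do_while are made up for illustration and do not appear in the original code. Both return the number of stores they perform, and both assume step > 0.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: same control flow as the C++ loop in the question.
static std::size_t fill_while(long* arr, std::size_t len, std::size_t step, long value) {
    assert(step > 0);  // a zero step would never terminate
    std::size_t i = 0, stores = 0;
    while (i < len) {
        arr[i] = value;
        ++stores;
        i += step;
    }
    return stores;
}

// Hypothetical helper: same control flow as the asm, which stores first
// and checks the bound afterwards.
static std::size_t fill_do_while(long* arr, std::size_t len, std::size_t step, long value) {
    assert(step > 0);
    std::size_t i = 0, stores = 0;
    do {
        arr[i] = value;  // runs once even when len == 0
        ++stores;
        i += step;
    } while (i < len);
    return stores;
}
```

For any non-empty array the two forms touch the same elements. They only differ when len is 0, where the do-while form still writes arr[0], so the asm version should only be entered with a non-empty array.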
The strange thing about that piece of code is that it worked fine for me at first. But when I later tried to make it work in another project, it seemed as if it didn't do anything. I even made some 1:1 copies of the working project, compiled again and... still the result is random.
Maybe I chose the wrong constraints for the input operands, but I've tried nearly all of them by now and I have no real idea left. What puzzles me in particular is that it still works in some cases.
I am not an expert at ASM whatsoever, although I learned it when I was still at university. Please note that I am not looking for optimization - I am just trying to understand how inline assembly works. So here is my question: Is there anything fundamentally wrong with my attempt or did I make a more subtle mistake here? Thanks in advance.
(Working with g++ MinGW Win32 x86 v.4.8.1)
Update
I have already tried out every single suggestion that has been contributed here so far. In particular I tried
... : [aptr] "=r" (arr) : "0" (arr) ...
instead, with the same result, and
... : [aptr] "+r" (arr) : ...
which is still the same. Meanwhile I know the official documentation pretty much by heart, but I still can't see my error.
You are modifying an input operand (aptr), which is not allowed. Either constrain it to match an output operand or change it to an input/output operand.
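As a minimal sketch of the second option, assuming an x86-64 target where long is 8 bytes (for example 64-bit Linux), the pointer can be declared as a "+r" operand on a working copy. The helper name fill_strided and its parameters are made up for illustration and are not from the question. On any other target the sketch falls back to a plain C++ loop. It also uses a local numeric label (1:) instead of a named one, so the assembler does not see a duplicate symbol if the compiler emits the asm more than once.

```cpp
#include <cassert>

// Hypothetical helper: writes `value` to every `step`-th element of [begin, end).
// Like the asm in the question it stores before checking the bound, so it
// requires a non-empty range and a positive step.
static void fill_strided(long* begin, long* end, long step, long value) {
    assert(begin < end && step > 0);
#if defined(__x86_64__) && defined(__LP64__)
    long* p = begin;  // working copy that the asm is allowed to modify
    __asm__ __volatile__(
        "1:\n\t"                                // local label, safe if the asm is duplicated
        "movq %[val], (%[ptr])\n\t"             // *p = value
        "leaq (%[ptr], %[inc], 8), %[ptr]\n\t"  // p += step (8-byte elements)
        "cmpq %[end], %[ptr]\n\t"               // flags from p - end
        "jb 1b"                                 // repeat while p < end (unsigned, these are addresses)
        : [ptr] "+r"(p)                         // read and written, hence "+r"
        : [end] "r"(end), [inc] "r"(step), [val] "r"(value)
        : "cc", "memory");
#else
    // Assumed fallback for targets where the asm above does not apply.
    for (long i = 0; i < end - begin; i += step) begin[i] = value;
#endif
}
```

Calling fill_strided(arr, arr + ARR_LEN, STEP_SIZE, SOME_VALUE) leaves arr itself untouched, so it can still be passed to delete[] afterwards.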
Here is a complete example that has the intended behavior. It targets a 64-bit machine, which is why %%rbx is used instead of %%ebx as the base address for the array. For the same reason leaq and cmpq should be used instead of leal and cmpl. movq should be used since the array is of type long, and long is 8 bytes, not 4, on a 64-bit machine with an LP64 data model such as 64-bit Linux (on 64-bit Windows long stays 4 bytes). jl in the question becomes jg here because the operands of cmpq are swapped. [...] ebx). Constraint "r" can not be used. "r" means any register can be used, however not any combination of registers is acceptable for leaq. Look here: x86 addressing modes
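#include <iostream>
using namespace std;
// Assumes x86-64 with an 8-byte long (LP64, e.g. 64-bit Linux).
int main(){
    int ARR_LEN=20;
    long STEP_SIZE=2;     // long, so that all 64 bits of %rcx are defined
    long SOME_VALUE=100;
    long* arr = new long[ARR_LEN];
    long* act_ptr = arr;  // working copy: the asm advances it, arr stays valid for delete[]
    int i;
    for (i=0; i<ARR_LEN; i++){
        arr[i] = 0;
    }
    __asm__ __volatile__
    (
        "1:"                               // local label, safe if the asm is emitted twice
        "movq %%rdx, (%%rbx);"             // *act_ptr = SOME_VALUE
        "leaq (%%rbx, %%rcx, 8), %%rbx;"   // act_ptr += STEP_SIZE
        "cmpq %%rbx, %%rax;"               // flags from end_ptr - act_ptr
        "jg 1b;"                           // repeat while end_ptr > act_ptr
        : "+b" (act_ptr)                   // modified by the asm, so it is an in/out operand
        : "a" (arr+ARR_LEN),
          "c" (STEP_SIZE),
          "d" (SOME_VALUE)
        : "cc", "memory"
    );
    for (i=0; i<ARR_LEN; i++){
        cout << "element " << i << " is " << arr[i] << endl;
    }
    delete[] arr;
    return 0;
}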
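The operand-size remarks in this answer come down to what the scaled addressing mode computes. The sketch below models it in plain C++; lea_model and scale_matches_long are hypothetical names used only for illustration. It shows that the scale factor has to equal sizeof(long) for the pointer to advance by whole elements, which means 8 on LP64 targets and 4 on 32-bit x86 or 64-bit Windows.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of `lea (base, index, scale), dst`: dst = base + index * scale.
// x86 only encodes the scale factors 1, 2, 4 and 8.
static std::uintptr_t lea_model(std::uintptr_t base, std::uintptr_t index, unsigned scale) {
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);
    return base + index * scale;
}

// True when `scale` moves a long* by exactly `index` elements on this target.
static bool scale_matches_long(unsigned scale) {
    return scale == sizeof(long);
}
```

With a scale of sizeof(long), lea_model applied to the address of arr and an index of STEP_SIZE yields the address of arr[STEP_SIZE]. Using a scale of 4 together with an 8-byte long would land in the middle of an element.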