I am writing some assembly code for ARM8 (aarch64). I want to do a division and use the remainder obtained for further calculations. In x86 when I use 'div', and I know my remainder is kept in RDX. My question is - is there an equivalent to that in the aarch64 instruction set? I know 'udiv' and 'sdiv' do unsigned and signed divisions and gets me the quotient. Is there a single instruction which would give me remainder? (I want % modulo operator in c). I understand I can obtain it using algebra, just wanted to confirm I am not missing out on a simpler method.
Barring constant power-of-two divisors which can be optimised down to an and
, there is no instruction that will calculate the remainder of a division. You can, however do it pretty neatly in two:
// input: x0=dividend, x1=divisor
udiv x2, x0, x1
msub x3, x2, x1, x0
// result: x2=quotient, x3=remainder
The Clang C compiler produced the following code for a modulo calculation:
udiv x10, x0, x9
msub x10, x10, x9, x0
While the x86 does this in a single instruction, that doesn't make it faster.
On the Apple M-1, the above instruction pair is executed in about the same time as a single step. Possibly this is due to instruction macro-fusion that decodes multiple instructions into a single µ-op. It could also be due to parallelism in multiple execution units. Possibly, it is done in one EU where the remainder from the division calculation is cached and returned immediately.
Whatever the implementation, it seems to be about as fast as Intel's single instruction form.
Timing:
$ time ./a.out 12345678901
Total: 301123495054
real 0m10.036s
user 0m9.668s
sys 0m0.031s
Instruction generated:
udiv x10, x0, x9
Timing:
$ time ./a.out 12345678901
Total: 8612082846779832640
real 0m10.190s
user 0m9.768s
sys 0m0.070s
Instructions generated:
udiv x10, x0, x9
msub x10, x10, x9, x0
Timing:
$ time ./a.out 12345678901
Total: 8612083123211969892
real 0m10.103s
user 0m9.752s
sys 0m0.019s
Instructions generated:
udiv x10, x0, x9
msub x11, x10, x9, x0
The following C code can be run with either q = n / d
or r = n % d
commented out:
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char *argv[])
{
unsigned long long n, d, q=1, r=1, total=0;
n = strtoull(argv[1], NULL, 10);
total = 0;
for (d=1 ; d<=n ; d++) {
q = n / d;
r = n % d;
total += q + r;
}
printf("Total: %llu", total);
return 0;
}
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With