Often in my inner loops I need to index an array in a "wrap-around" way, so that (for example) if the array size is 100 and my code asks for element -2, it should be given element 98. In many high level languages such as Python, one can do this simply with my_array[index % array_size]
, but for some reason C's integer arithmetic (usually) rounds toward zero instead of consistently rounding down, and consequently its modulo operator returns a negative result when given a negative first argument.
Often I know that index
will not be less than -array_size
, and in these cases I just do my_array[(index + array_size) % array_size]
. However, sometimes this can't be guaranteed, and for those cases I would like to know the fastest way to implement an always-positive modulo function. There are several "clever" ways to do it without branching, such as
inline int positive_modulo(int i, int n) { return (n + (i % n)) % n; }
or
inline int positive_modulo(int i, int n) { return (i % n) + (n * (i < 0)); }
Of course I can profile these to find out which is the fastest on my system, but I can't help worrying that I might have missed a better one, or that what's fast on my machine might be slow on a different one.
So is there a standard way to do this, or some clever trick that I've missed that's likely to be the fastest possible way?
Also, I know it's probably wishful thinking, but if there's a way of doing this that can be auto-vectorised, that would be amazing.
We use the % to denote this type of operator (it's the percentile operator). The modulus operator is added in the arithmetic operators in C, and it works between two available operands. It divides the given numerator by the denominator to find a result.
C library function - fmod()The C library function double fmod(double x, double y) returns the remainder of x divided by y.
a = -0.340515; to convert this into positive number I used the abs() method as: a = abs(a);
The standard way I learned is
inline int positive_modulo(int i, int n) { return (i % n + n) % n; }
This function is essentially your first variant without the abs
(which, in fact, makes it return the wrong result). I wouldn't be surprised if an optimizing compiler could recognize this pattern and compile it to machine code that computes an "unsigned modulo".
Edit:
Moving on to your second variant: First of all, it contains a bug, too -- the n < 0
should be i < 0
.
This variant may not look as if it branches, but on a lot of architectures, the i < 0
will compile into a conditional jump. In any case, it will be at least as fast to replace (n * (i < 0))
with i < 0? n: 0
, which avoids the multiplication; in addition, it's "cleaner" because it avoids reinterpreting the bool as an int.
As to which of these two variants is faster, that probably depends on the compiler and processor architecture -- time the two variants and see. I don't think there's a faster way than either of these two variants, though.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With