It is told that modulo operator "%" and divide operator "/" are very inefficient in embedded C++.
How can I alternatively achieve the following expression:
a = b % c;
I understand that this can be achieved using the following logic:
a = b - c;
while (a >= c) {
a = a - c;
}
But my question is, is this code involving while loops efficient enough, compared to % operator?
Thanks, Kirti
That code will almost certainly be slower than however your processor/compiler decides to perform the divide/mod. Generally, shortcuts are pretty hard to come by for basic arithmetic operators, since the mcu/cpu designers and compiler programmers are pretty good at optimizing this for almost all applications.
One common shortcut in embedded devices (where every cycle/byte can make a difference) is to keep everything in terms of base-2 to use the bit shift operators to perform multiplication and division, and the bitwise and (&) to perform modulo.
Examples:
unsigned int x = 100;
unsigned int y1 = x << 4; // same as x * 2^4 = x*16
unsigned int y2 = x >> 6; // same as x / 2^6 = x/64
unsigned int y3 = x & 0x07; // same as x % 8
Nothing is going to be considerably more efficient than the %
operator. If there was a better way to do it, then any reasonable compiler would automatically convert it. When you're told that %
and /
are inefficient, that's just because those are difficult operations - if you need to perform a modulo, then do that.
There may be special cases when there are better ways - for example, mod a power of two can be written as a binary or - but those are probably optimized by your compiler.
Division and modulus are indeed costly hardware operations, whatever you do (this is more related to hardware architecture than to languages or compilers), perhaps ten times slower than addition.
However, on current laptops or servers, and on high-end microcontrollers, cache misses are often much slower than divisions!
The GCC compiler is often able to optimize them, when the divisor is a constant.
Your naive loop is usually much more slower than using the hardware division instruction (or the library routine doing it, if not provided by hardware). I believe you are wrong in avoiding the division & replacing it with your loop.
You might tune your algorithms -e.g. by having power of twos- but I don't recommend using your code. Remember that premature optimization is evil so first try to get your program correct, then profile it to find the trouble spots.
If the divisor is known at compile time, the operation can be transformed into a multiplication by a reciprocal, with some shifts, adds, and other fast operations. This will be faster on any modern processor, even if it implements division in hardware. Embedded targets usually have highly optimized routines for divide / modulo, since these operations are required by the standard.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With