I am searching for a faster method of accomplishing this:
int is_empty(char * buf, int size)
{
int i;
for(i = 0; i < size; i++) {
if(buf[i] != 0) return 0;
}
return 1;
}
I realize I'm searching for a micro optimization unnecessary except in extreme cases, but I know a faster method exists, and I'm curious what it is.
To check if an array is empty or not, you can use the .length property. The length property sets or returns the number of elements in an array. By knowing the number of elements in the array, you can tell if it is empty or not. An empty array will have 0 elements inside of it.
Explanation. Zero Buffer Credit Rate measures the percentage of time that the port was unable to send frames due to a lack of buffer credits at the port over a particular time interval.
On many architectures, comparing 1 byte takes the same amount of time as 4 or 8, or sometimes even 16. 4 bytes is normally easy (either int or long), and 8 is too (long or long long). 16 or higher probably requires inline assembly to e.g., use a vector unit.
Also, a branch mis-predictions really hurt, it may help to eliminate branches. For example, if the buffer is almost always empty, instead of testing each block against 0, bit-or them together and test the final result.
Expressing this is difficult in portable C: casting a char*
to long*
violates strict aliasing. But fortunately you can use memcpy
to portably express an unaligned multi-byte load that can alias anything. Compilers will optimize it to the asm you want.
For example, this work-in-progress implementation (https://godbolt.org/z/3hXQe7) on the Godbolt compiler explorer shows that you can get a good inner loop (with some startup overhead) from loading two consecutive uint_fast32_t
vars (often 64-bit) with memcpy and then checking tmp1 | tmp2
, because many CPUs will set flags according to an OR result, so this lets you check two words for the price of one.
Getting it to compile efficiently for targets without efficient unaligned loads requires some manual alignment in the startup code, and even then gcc may not inline the memcpy
for loads where it can't prove alignment.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With