I have a C array like:
char byte_array[10];
And another one that acts as a mask:
char byte_mask[10];
I would like to do get another array that is the result from the first one plus the second one using a bitwise operation, on each byte.
What's the most efficient way to do this?
thanks for your answers.
Approach: Precompute the prefix and suffix OR arrays. In one iteration, multiply an element with x^k and do Bitwise OR it with prefix OR i.e. Bitwise OR of all previous elements and suffix OR i.e. Bitwise OR of all next elements and return the maximum value after all iterations.
Given a range [L, R], the task is to find a pair (X, Y) such that L ≤ X < Y ≤ R and X & Y is the maximum among all the possible pairs then print the bitwise AND of the found pair. In all the possible pairs, pair (8, 9) gives the maximum value for bitwise AND.
Yes, Bitwise operations are alot faster than any arithmetic operations because these operations are performed directly on the bits that is 0 and 1.
On simple low-cost processors, typically, bitwise operations are substantially faster than division, several times faster than multiplication, and sometimes significantly faster than addition.
for ( i = 10 ; i-- > 0 ; )
result_array[i] = byte_array[i] & byte_mask[i];
This will work for all arrays and processors. However, if you know your arrays are word-aligned, a faster method is to cast to a larger type and do the same calculation.
For example, let's say n=16
instead of n=10
. Then this would be much faster:
uint32_t* input32 = (uint32_t*)byte_array;
uint32_t* mask32 = (uint32_t*)byte_mask;
uint32_t* result32 = (uint32_t*)result_array;
for ( i = 4 ; i-- > 0 ; )
result32[i] = input32[i] & mask32[i];
(Of course you need a proper type for uint32_t
, and if n
is not a power of 2 you need to clean up the beginning and/or ending so that the 32-bit stuff is aligned.)
Variation: The question specifically calls for the results to be placed in a separate array, however it would almost certainly be faster to modify the input array in-place.
If you want to make it faster, make sure that byte_array has length that is multiple of 4 (8 on 64-bit machines), and then:
char byte_array[12];
char byte_mask[12];
/* Checks for proper alignment */
assert(((unsigned int)(void *)byte_array) & 3 == 0);
assert(((unsigned int)(void *)byte_mask) & 3 == 0);
for (i = 0; i < (10+3)/4; i++) {
((unsigned int *)(byte_array))[i] &= ((unsigned int *)(byte_mask))[i];
}
This is much faster than doing it byte per byte.
(Note that this is in-place mutation; if you want to keep the original byte_array also, then you obviously need to store the results in another array instead.)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With