What is the best way to implement a bitwise memmove
? The method should take an additional destination and source bit-offset and the count should be in bits too.
_membitmove
, which does exactly what I need, but I couldn't find its source.isc_bitstring_copy
, but it's not efficientmemmove is safer. memcpy can be faster, and usually is. There are less restrictions on it's implementation, so more can be done to optimize it. But not necessarily a lot more — in fact, it could even be slower then memmove , and sometimes this is the case.
"memcpy is more efficient than memmove." In your case, you most probably are not doing the exact same thing while you run the two functions. In general, USE memmove only if you have to. USE it when there is a very reasonable chance that the source and destination regions are over-lapping.
memmove doesn't zero the original memory block though. If you want to do that, you'll have to explicitly do it yourself with memset. As a rule, C routines don't waste cycles doing things that may not be necessary, such as zeroing memory. Compare with malloc , which likewise does not zero the memory block.
That memmove might be slower than memcpy is because it is able to handle overlapping memory, but memmove still only copies the data once. profile it on the platform you're interested in the timings for. However, the chances of you writing a better memmove than memmove seems unlikely.
Assuming "best" means "easiest", you can copy bits one by one. Conceptually, an address of a bit is an object (struct) that has a pointer to a byte in memory and an index of a bit in the byte.
struct pointer_to_bit
{
uint8_t* p;
int b;
};
void membitmovebl(
void *dest,
const void *src,
int dest_offset,
int src_offset,
size_t nbits)
{
// Create pointers to bits
struct pointer_to_bit d = {dest, dest_offset};
struct pointer_to_bit s = {src, src_offset};
// Bring the bit offsets to range (0...7)
d.p += d.b / 8; // replace division by right-shift if bit offset can be negative
d.b %= 8; // replace "%=8" by "&=7" if bit offset can be negative
s.p += s.b / 8;
s.b %= 8;
// Determine whether it's OK to loop forward
if (d.p < s.p || d.p == s.p && d.b <= s.b)
{
// Copy bits one by one
for (size_t i = 0; i < nbits; i++)
{
// Read 1 bit
int bit = (*s.p >> s.b) & 1;
// Write 1 bit
*d.p &= ~(1 << d.b);
*d.p |= bit << d.b;
// Advance pointers
if (++s.b == 8)
{
s.b = 0;
++s.p;
}
if (++d.b == 8)
{
d.b = 0;
++d.p;
}
}
}
else
{
// Copy stuff backwards - essentially the same code but ++ replaced by --
}
}
If you want to write a version optimized for speed, you will have to do copying by bytes (or, better, words), unroll loops, and handle a number of special cases (memmove
does that; you will have to do more because your function is more complicated).
P.S. Oh, seeing that you call isc_bitstring_copy
inefficient, you probably want the speed optimization. You can use the following idea:
Start copying bits individually until the destination is byte-aligned (d.b == 0
). Then, it is easy to copy 8 bits at once, doing some bit twiddling. Do this until there are less than 8 bits left to copy; then continue copying bits one by one.
// Copy 8 bits from s to d and advance pointers
*d.p = *s.p++ >> s.b;
*d.p++ |= *s.p << (8 - s.b);
P.P.S Oh, and seeing your comment on what you are going to use the code for, you don't really need to implement all the versions (byte/halfword/word, big/little-endian); you only want the easiest one - the one working with words (uint32_t
).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With