The key difference between memcpy() and memmove() is that memmove() works correctly when the source and destination overlap. When the buffers certainly don't overlap, memcpy() is preferable since it's potentially faster.
What bothers me is this "potentially". Is it a micro-optimization, or are there real, significant cases where memcpy() is faster, so that we really need to use memcpy() rather than sticking to memmove() everywhere?
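To make the overlap case concrete, here is a minimal sketch (my own illustration, not part of the original question): shifting a buffer's contents left by one byte within the same array is only well-defined with memmove(); with memcpy() the behavior is undefined because the source and destination regions overlap.

#include <stdio.h>
#include <string.h>

int main(void)
{
    char buf[] = "abcdef";

    /* Shift "bcdef" (plus its terminator) one position to the left within
     * the same buffer. Source (buf + 1) and destination (buf) overlap, so
     * memmove() is required; memcpy() would be undefined behavior here. */
    memmove(buf, buf + 1, strlen(buf + 1) + 1);

    printf("%s\n", buf); /* prints "bcdef" */
    return 0;
}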
"memcpy is more efficient than memmove." In your case, you most probably are not doing the exact same thing while you run the two functions. In general, USE memmove only if you have to. USE it when there is a very reasonable chance that the source and destination regions are over-lapping.
memmove might be slower than memcpy because it has to handle overlapping memory, but it still only copies the data once. If the timings matter to you, profile it on the platform you're interested in. However, the chances of you writing a better memmove than the library's memmove seem slim.
memmove() is similar to memcpy() in that it also copies data from a source to a destination. memcpy() leads to problems when the source and destination addresses overlap, because memcpy() simply copies bytes one by one from one location to the other.
If you know the length of a string, you can use the mem functions instead of the str functions. For example, memcpy is faster than strcpy because it does not have to search for the end of the string. If you are certain that the source and target do not overlap, use memcpy instead of memmove.
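For instance, here is a minimal sketch (the buffer and length are made-up assumptions, not from the original answer) of copying a string whose length is already known: memcpy copies length + 1 bytes, including the terminator, without rescanning for '\0' the way strcpy would.

#include <stdio.h>
#include <string.h>

int main(void)
{
    const char src[] = "hello, world";
    char dst[sizeof src];
    size_t len = sizeof src - 1;   /* length already known */

    /* strcpy(dst, src) would have to find the terminator itself; with the
     * length in hand, memcpy can copy len + 1 bytes (including '\0'). */
    memcpy(dst, src, len + 1);

    printf("%s\n", dst);
    return 0;
}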
There's at least an implicit branch to copy either forwards or backwards for memmove() if the compiler is not able to deduce that an overlap is impossible. This means that, without the ability to optimize in favor of memcpy(), memmove() is at least one branch slower, plus whatever additional space is occupied by inlined instructions to handle each case (if inlining is possible).
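To illustrate that branch, here is a minimal, byte-at-a-time sketch of how a memmove-style copy chooses a direction (my_memmove is a hypothetical name, not the libc implementation, which copies word-at-a-time and is far more elaborate):

#include <stddef.h>

static void *my_memmove(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    if (d < s) {
        /* Destination starts before the source: copy forwards. */
        while (n--)
            *d++ = *s++;
    } else if (d > s) {
        /* Destination starts after the source: copy backwards so the
         * not-yet-copied tail of the source is never overwritten. */
        d += n;
        s += n;
        while (n--)
            *--d = *--s;
    }
    return dst;
}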
Reading the eglibc-2.11.1 code for both memcpy() and memmove() confirms this suspicion. Furthermore, page copying is not possible during a backward copy, a significant speedup that is only available when there is no chance of overlap.
In summary, this means: if you can guarantee the regions do not overlap, then selecting memcpy() over memmove() avoids a branch. If the source and destination contain corresponding page-aligned, page-sized regions and don't overlap, some architectures can employ hardware-accelerated copies for those regions, regardless of whether you called memmove() or memcpy().
There is actually one more difference beyond the assumptions and observations I've listed above. As of C99, the following prototypes exist for the two functions:
void *memcpy(void * restrict s1, const void * restrict s2, size_t n);
void *memmove(void *s1, const void *s2, size_t n);
Due to the ability to assume that the two pointers s1 and s2 do not point at overlapping memory, straightforward C implementations of memcpy are able to leverage this to generate more efficient code without resorting to assembler; see here for more. I'm sure that memmove can do this too, but additional checks would be required beyond those I saw present in eglibc, meaning the performance cost may be slightly more than a single branch for C implementations of these functions.
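As a rough illustration (hypothetical function names, not the libc implementations), a restrict-qualified copy loop tells the compiler the regions are independent, so it is free to vectorize and reorder the loads and stores; without restrict it has to be more conservative unless it can prove the regions don't overlap:

#include <stddef.h>

/* memcpy-like: the compiler may assume dst and src never alias, so it can
 * vectorize this loop aggressively. */
void copy_restrict(unsigned char * restrict dst,
                   const unsigned char * restrict src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i];
}

/* Without restrict, the compiler cannot assume dst and src are independent,
 * so it typically has to add a runtime overlap check or emit more cautious
 * code before vectorizing this loop. */
void copy_may_alias(unsigned char *dst, const unsigned char *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i];
}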
At best, calling memcpy rather than memmove will save a pointer comparison and a conditional branch. For a large copy, this is completely insignificant. If you are doing many small copies, then it might be worth measuring the difference; that is the only way you can tell whether it's significant or not.
It is definitely a micro-optimisation, but that doesn't mean you shouldn't use memcpy when you can easily prove that it is safe. Premature pessimisation is the root of much evil.
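If you do want to measure, a minimal micro-benchmark sketch for many small copies might look like the following (the iteration count and chunk size are arbitrary assumptions; real measurements should repeat runs and inspect the generated code, since a compiler may inline or partially elide these calls):

#include <stdio.h>
#include <string.h>
#include <time.h>

#define ITERATIONS 10000000L
#define CHUNK 16

int main(void)
{
    static unsigned char src[CHUNK], dst[CHUNK];
    volatile unsigned char sink = 0;
    clock_t t0, t1, t2;

    t0 = clock();
    for (long i = 0; i < ITERATIONS; i++)
        memcpy(dst, src, CHUNK);
    t1 = clock();
    for (long i = 0; i < ITERATIONS; i++)
        memmove(dst, src, CHUNK);
    t2 = clock();

    sink ^= dst[0]; /* discourage the copies from being optimized away */

    printf("memcpy:  %.3f s\n", (double)(t1 - t0) / CLOCKS_PER_SEC);
    printf("memmove: %.3f s\n", (double)(t2 - t1) / CLOCKS_PER_SEC);
    return (int)sink;
}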