I understand that the difference between memmove and memcpy is that memmove handles the case where the source and destination memory overlap. I have checked the implementation in libgcc and got this article [memcpy performance] from the Intel website. In libgcc, memmove is similar to memcpy: both just copy one byte at a time, so the performance should be almost the same, even after optimization.
Someone has measured this and wrote the article memcopy, memmove, and Speed over Safety. I don't expect memmove to be faster than memcpy, but there should be no big difference, at least on the Intel platform.
So on what platform, and how, can memcpy be significantly faster than memmove? And if there is no such platform, why provide two similar functions instead of just memmove, leading to lots of bugs?
Edit: I'm not asking about the difference between memmove and memcpy; I know memmove can handle overlap. The question is: is there really any platform where memcpy is faster than memmove?
Copying the data twice would be slower. The reason memmove might be slower than memcpy is that it has to handle overlapping memory, but memmove still copies the data only once. Profile it on the platform whose timings you're interested in. However, the chances of you writing a better memmove than the library's memmove seem unlikely.
"memcpy is more efficient than memmove." In your case, you are most probably not doing exactly the same thing while running the two functions. In general, use memmove only if you have to: use it when there is a reasonable chance that the source and destination regions overlap.
memmove() is similar to memcpy() in that it also copies data from a source to a destination. memcpy() leads to problems when the source and destination addresses overlap, because memcpy() simply copies data byte by byte from one location to the other. For example, consider the program below.
The memmove function has defined behavior in the case of overlapping buffers, so whenever in doubt, it is safer to use memmove in place of memcpy.
There is at least one recent case where the constraint of non-overlapping memory is used to generate faster code:
In Visual Studio, memcpy can be compiled using intrinsics, while memmove cannot. This results in memcpy being much faster for small regions of a known size, because the function-call and setup overhead are removed. The implementation using movsd/movsw/movsb is not suitable for overlapping blocks, as it starts copying at the lowest address and increments edi/esi during the copy.
See also Make compiler copy characters using movsd.
GCC also lists memcpy among the functions implemented as built-ins; the implementation and motivation are likely similar to Visual Studio's.