The key difference between memcpy() and memmove() is that memmove() works correctly when the source and destination overlap. When the buffers certainly don't overlap, memcpy() is preferable since it's potentially faster.
What bothers me is this "potentially". Is it a micro-optimization, or are there real, significant cases where memcpy() is faster, so that we really need to use memcpy() rather than sticking to memmove() everywhere?
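To make the overlap case concrete, here is a minimal sketch (my own illustration, not part of the original question): shifting a buffer's contents left by one byte within the same array is only well-defined with memmove(); with memcpy() the behavior is undefined because the source and destination regions overlap.

#include <stdio.h>
#include <string.h>

int main(void)
{
    char buf[] = "abcdef";

    /* Shift "bcdef" (plus its terminator) one position to the left within
     * the same buffer. Source (buf + 1) and destination (buf) overlap, so
     * memmove() is required; memcpy() would be undefined behavior here. */
    memmove(buf, buf + 1, strlen(buf + 1) + 1);

    printf("%s\n", buf); /* prints "bcdef" */
    return 0;
}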
"memcpy is more efficient than memmove." In your case, you most probably are not doing the exact same thing while you run the two functions. In general, USE memmove only if you have to. USE it when there is a very reasonable chance that the source and destination regions are over-lapping.
memmove might be slower than memcpy because it has to handle overlapping memory, but it still only copies the data once. If the timings matter to you, profile it on the platform you're interested in. However, the chances of you writing a better memmove than the library's memmove seem slim.
memmove() is similar to memcpy() in that it also copies data from a source to a destination. memcpy() leads to problems when the source and destination addresses overlap, because memcpy() simply copies bytes one by one from one location to the other.
If you know the length of a string, you can use the mem functions instead of the str functions. For example, memcpy is faster than strcpy because it does not have to search for the end of the string. If you are certain that the source and target do not overlap, use memcpy instead of memmove.
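For instance, here is a minimal sketch (the buffer and length are made-up assumptions, not from the original answer) of copying a string whose length is already known: memcpy copies length + 1 bytes, including the terminator, without rescanning for '\0' the way strcpy would.

#include <stdio.h>
#include <string.h>

int main(void)
{
    const char src[] = "hello, world";
    char dst[sizeof src];
    size_t len = sizeof src - 1;   /* length already known */

    /* strcpy(dst, src) would have to find the terminator itself; with the
     * length in hand, memcpy can copy len + 1 bytes (including '\0'). */
    memcpy(dst, src, len + 1);

    printf("%s\n", dst);
    return 0;
}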
There's at least an implicit branch to copy either forwards or backwards for memmove() if the compiler is not able to deduce that an overlap is impossible. This means that, without the ability to optimize in favor of memcpy(), memmove() is at least one branch slower, plus whatever additional space is occupied by inlined instructions to handle each case (if inlining is possible).
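To illustrate that branch, here is a minimal, byte-at-a-time sketch of how a memmove-style copy chooses a direction (my_memmove is a hypothetical name, not the libc implementation, which copies word-at-a-time and is far more elaborate):

#include <stddef.h>

static void *my_memmove(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    if (d < s) {
        /* Destination starts before the source: copy forwards. */
        while (n--)
            *d++ = *s++;
    } else if (d > s) {
        /* Destination starts after the source: copy backwards so the
         * not-yet-copied tail of the source is never overwritten. */
        d += n;
        s += n;
        while (n--)
            *--d = *--s;
    }
    return dst;
}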
Reading the eglibc-2.11.1 code for both memcpy() and memmove() confirms this suspicion. Furthermore, page copying is not possible during a backward copy, a significant speedup that is only available when there is no chance of overlap.
In summary, this means: if you can guarantee the regions do not overlap, then selecting memcpy() over memmove() avoids a branch. If the source and destination contain corresponding page-aligned, page-sized regions and don't overlap, some architectures can employ hardware-accelerated copies for those regions, regardless of whether you called memmove() or memcpy().
There is actually one more difference beyond the assumptions and observations I've listed above. As of C99, the following prototypes exist for the two functions:
void *memcpy(void * restrict s1, const void * restrict s2, size_t n);
void *memmove(void *s1, const void *s2, size_t n);
Due to the ability to assume that the two pointers s1 and s2 do not point at overlapping memory, straightforward C implementations of memcpy are able to leverage this to generate more efficient code without resorting to assembler; see here for more. I'm sure that memmove can do this too, but additional checks would be required beyond those I saw present in eglibc, meaning the performance cost may be slightly more than a single branch for C implementations of these functions.
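As a rough illustration (hypothetical function names, not the libc implementations), a restrict-qualified copy loop tells the compiler the regions are independent, so it is free to vectorize and reorder the loads and stores; without restrict it has to be more conservative unless it can prove the regions don't overlap:

#include <stddef.h>

/* memcpy-like: the compiler may assume dst and src never alias, so it can
 * vectorize this loop aggressively. */
void copy_restrict(unsigned char * restrict dst,
                   const unsigned char * restrict src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i];
}

/* Without restrict, the compiler cannot assume dst and src are independent,
 * so it typically has to add a runtime overlap check or emit more cautious
 * code before vectorizing this loop. */
void copy_may_alias(unsigned char *dst, const unsigned char *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i];
}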
At best, calling memcpy rather than memmove will save a pointer comparison and a conditional branch. For a large copy, this is completely insignificant. If you are doing many small copies, then it might be worth measuring the difference; that is the only way you can tell whether it's significant or not.
It is definitely a micro-optimisation, but that doesn't mean you shouldn't use memcpy when you can easily prove that it is safe. Premature pessimisation is the root of much evil.
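If you do want to measure, a minimal micro-benchmark sketch for many small copies might look like the following (the iteration count and chunk size are arbitrary assumptions; real measurements should repeat runs and inspect the generated code, since a compiler may inline or partially elide these calls):

#include <stdio.h>
#include <string.h>
#include <time.h>

#define ITERATIONS 10000000L
#define CHUNK 16

int main(void)
{
    static unsigned char src[CHUNK], dst[CHUNK];
    volatile unsigned char sink = 0;
    clock_t t0, t1, t2;

    t0 = clock();
    for (long i = 0; i < ITERATIONS; i++)
        memcpy(dst, src, CHUNK);
    t1 = clock();
    for (long i = 0; i < ITERATIONS; i++)
        memmove(dst, src, CHUNK);
    t2 = clock();

    sink ^= dst[0]; /* discourage the copies from being optimized away */

    printf("memcpy:  %.3f s\n", (double)(t1 - t0) / CLOCKS_PER_SEC);
    printf("memmove: %.3f s\n", (double)(t2 - t1) / CLOCKS_PER_SEC);
    return (int)sink;
}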