State of "memset" functionality in C++ with modern compilers

Context:

A while ago, I stumbled upon this 2001 DDJ article by Alexandrescu: http://www.ddj.com/cpp/184403799

It's about comparing various ways to initialize a buffer to some value, like what "memset" does for single-byte values. He compared various implementations (memcpy, explicit "for" loop, Duff's device) and did not really find a single best candidate across all dataset sizes and all compilers.

Quote:

There is a very deep, and sad, realization underlying all this. We are in 2001, the year of the Spatial Odyssey. (...) Just step out of the box and look at us — after 50 years, we're still not terribly good at filling and copying memory.

Question:

Does anyone have more recent information about this problem? Do recent GCC and Visual C++ implementations perform significantly better than 7 years ago?
I'm writing code that has a lifetime of 5+ (probably 10+) years and that will process arrays ranging in size from a few bytes to hundreds of megabytes. I can't assume that my choices now will still be optimal in 5 years. What should I do?
- a) Use the system's memset (or equivalent) and forget about optimal performance, or assume the runtime and compiler will handle this for me.
- b) Benchmark once and for all on various array sizes and compilers, and switch at runtime between several routines.
- c) Run the benchmark at program initialization and switch at runtime based on accurate (?) data.
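
To make options b) and c) concrete, here is a minimal sketch of timing a few candidate fill routines and picking the fastest one for a given size. The names `FillFn`, `fill_memset`, `fill_std`, `fill_loop`, `time_fill` and `pick_fastest` are made up for illustration, and the timing is deliberately naive (a single run, no warm-up, no outlier handling), so treat the result as a rough hint at best.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical common signature for all candidate fill routines.
using FillFn = void (*)(unsigned char*, unsigned char, std::size_t);

void fill_memset(unsigned char* p, unsigned char v, std::size_t n) {
    if (n != 0) std::memset(p, v, n);  // avoid passing a null pointer
}

void fill_std(unsigned char* p, unsigned char v, std::size_t n) {
    std::fill_n(p, n, v);
}

void fill_loop(unsigned char* p, unsigned char v, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) p[i] = v;
}

// Time `repeats` calls of fn on a buffer of n bytes; returns elapsed seconds.
double time_fill(FillFn fn, std::size_t n, int repeats) {
    std::vector<unsigned char> buf(n);
    volatile unsigned char sink = 0;
    const auto start = std::chrono::steady_clock::now();
    for (int r = 0; r < repeats; ++r) {
        fn(buf.data(), static_cast<unsigned char>(r), n);
        if (n != 0) sink = buf[n / 2];  // read back so the fill is observable
    }
    const auto stop = std::chrono::steady_clock::now();
    (void)sink;
    return std::chrono::duration<double>(stop - start).count();
}

// Return the candidate with the lowest measured time for buffers of n bytes.
FillFn pick_fastest(std::size_t n, int repeats) {
    const FillFn candidates[] = {fill_memset, fill_std, fill_loop};
    FillFn best = candidates[0];
    double best_time = time_fill(best, n, repeats);
    for (std::size_t i = 1; i < sizeof candidates / sizeof candidates[0]; ++i) {
        const double t = time_fill(candidates[i], n, repeats);
        if (t < best_time) {
            best_time = t;
            best = candidates[i];
        }
    }
    return best;
}
```

For option c) one could call something like `pick_fastest(size, repeats)` once per size class at startup and cache the returned pointer. Whether the measurement is stable enough to justify the added complexity is exactly the open question.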

Edit: I'm working on image processing software. My array items are PODs and every millisecond counts!

Edit 2: Thanks for the first answers. Here is some additional information:

- Buffer initialization may represent 20%-40% of the total runtime of some algorithms.
- The platform may vary in the next 5+ years, although it will stay in the "fastest CPU money can buy from DELL" category. Compilers will be some form of GCC and Visual C++. No embedded stuff or exotic architectures are on the radar.
- I'd like to hear from people who had to update their software when MMX and SSE appeared, since I'll have to do the same when "SSE2015" becomes available... :)

asked Oct 05 '08 12:10

rlerallut

1 Answer

The DDJ article acknowledges that memset is the best answer, and much faster than what he was trying to achieve:

There is something sacrosanct about C's memory manipulation functions memset, memcpy, and memcmp. They are likely to be highly optimized by the compiler vendor, to the extent that the compiler might detect calls to these functions and replace them with inline assembler instructions — this is the case with MSVC.

So, if memset works for you (i.e. you are initializing with a single byte), then use it.
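
As a small illustration of the "single byte" caveat, the sketch below contrasts memset on a byte buffer with std::fill on wider elements. The helper names (`fill_bytes`, `fill_pixels`, `memset_pitfall`) and the 16-bit pixel type are assumptions made for the example, not anything taken from the question.

```cpp
#include <algorithm>
#include <cstdint>
#include <cstring>
#include <vector>

// Byte buffer: memset is the natural fit, since every element is one byte.
inline void fill_bytes(std::vector<std::uint8_t>& buf, std::uint8_t value) {
    if (!buf.empty()) std::memset(buf.data(), value, buf.size());
}

// Wider PODs: memset would copy one byte into every byte of each element,
// so use std::fill, which assigns whole elements.
inline void fill_pixels(std::vector<std::uint16_t>& buf, std::uint16_t value) {
    std::fill(buf.begin(), buf.end(), value);
}

// The pitfall itself: memset with 0x12 on a 16-bit element gives 0x1212,
// because both bytes of the element receive the same value.
inline std::uint16_t memset_pitfall() {
    std::uint16_t px = 0;
    std::memset(&px, 0x12, sizeof px);
    return px;
}
```

Filling with zero is the one multi-byte case where memset and std::fill agree for integer types, since every byte of the element is the same.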

Whilst every millisecond may count, you should establish what percentage of your execution time is lost to setting memory. It is likely very low (1 or 2%?) given that you have useful work to do as well. If so, the optimization effort would likely have a much better rate of return elsewhere.
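
A profiler is the usual tool for this, but if none is at hand, a rough figure can be obtained by accumulating the time spent in fill calls and dividing by the total. The sketch below assumes a hypothetical `ScopedTimer` helper and a `fill_percentage` function; both are illustrative names, and wall-clock timing of very short fills will be noisy.

```cpp
#include <chrono>

// Hypothetical helper: adds the lifetime of the object, in seconds,
// to the accumulator it was constructed with.
class ScopedTimer {
public:
    explicit ScopedTimer(double& total_seconds)
        : total_(total_seconds), start_(std::chrono::steady_clock::now()) {}

    ~ScopedTimer() {
        total_ += std::chrono::duration<double>(
                      std::chrono::steady_clock::now() - start_).count();
    }

    ScopedTimer(const ScopedTimer&) = delete;
    ScopedTimer& operator=(const ScopedTimer&) = delete;

private:
    double& total_;
    std::chrono::steady_clock::time_point start_;
};

// Share of the total runtime spent filling, in percent; 0 if nothing ran.
inline double fill_percentage(double fill_seconds, double total_seconds) {
    return total_seconds > 0.0 ? 100.0 * fill_seconds / total_seconds : 0.0;
}
```

Wrapping each buffer initialization in a `ScopedTimer` that adds to one accumulator, and the whole algorithm in another, gives the two numbers to feed into `fill_percentage`.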

answered Nov 05 '22 05:11

Rob Walker

Tags:

c++

c

optimization

memory