
Which is faster, string.empty() or string.size() == 0?

Tags: c++, performance, string, c++11

Recently, during a discussion I was asked by a fellow programmer to do some code changes. I had something like:

if( mystring.size() == 0)
    // do something
else
    // do something else

The discussion was about using mystring.empty() to check whether the string is empty. I agree that string.empty() is arguably more expressive and readable, but does it offer any performance benefit?

I did some digging and found these 2 answers pertaining to my question:

Implementation from basic_string.h
SO answer that points to the ISO Standard

Both answers support my claim that string.empty() is just more readable and offers no performance benefit compared to string.size() == 0.

I still want to be sure: are there any implementations of string that keep an internal boolean flag to record whether the string is empty?

Or do some implementations use other techniques that would invalidate my claim?

403

asked Jul 09 '13 03:07

Jatin Ganhotra

2 Answers

The standard defines empty() like this:

bool empty() const noexcept;
Returns: size() == 0.

You'd be hard-pressed to find an implementation that doesn't do that, and any performance difference would be negligible because both are constant-time operations. I would expect both to compile to the exact same assembly on any reasonable implementation.

That said, empty() is clear and explicit. You should prefer it over size() == 0 (or !size()) for readability.
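As a minimal sketch of the equivalence the standard wording implies, the two spellings can be put side by side. The helper names is_empty_via_size and is_empty_via_empty are hypothetical, invented only for this illustration:

```cpp
#include <cassert>
#include <string>

// Hypothetical helpers, named only for this illustration.

// The spelling from the question: compare the length against zero.
bool is_empty_via_size(const std::string& s) {
    return s.size() == 0;
}

// The spelling the reviewer asked for: ask the string directly.
bool is_empty_via_empty(const std::string& s) {
    return s.empty();
}
```

For any string the two helpers should return the same value, since empty() is specified to return size() == 0. Whether they compile to identical machine code depends on the implementation and the optimisation level.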

162

answered Sep 22 '22 16:09

chris

Now, this is a pretty trivial matter, but I'll try to cover it exhaustively so that whatever arguments colleagues raise are unlikely to take you by surprise.

As usual, if profiling proves you really have to care, measure: there could be a difference (see below). But in a general code review, for code that hasn't been shown to be problematically slow, the outstanding issues are:

In some other containers (e.g. std::list in C++03, but not in C++11), size() could be less efficient than empty(). That led to coding tips to prefer empty() over size() in general, so that if someone later switches the container (or generalises the processing into a template where the container type may vary), no change is needed to retain efficiency.
Does either reflect a more natural way to conceive of the test? Not just whichever you happened to think of first, or size() because you're less used to empty(), but when you're fully focused on the surrounding code logic, which fits in better? For example, size() may fit better because it's one of several tests of size() and you like consistency, or because you're implementing a famous algorithm or formula that's traditionally expressed in terms of size: being consistent might reduce the mental effort of verifying the implementation against the formula.
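To illustrate the first point, here is a small sketch of a generic helper that relies only on empty(). The name has_elements is hypothetical. Because it never calls size(), it also works with std::forward_list, which has no size() member at all:

```cpp
#include <cassert>
#include <forward_list>
#include <list>
#include <string>

// Hypothetical generic helper: it depends only on empty(),
// which every standard container (and std::string) provides.
template <typename Container>
bool has_elements(const Container& c) {
    return !c.empty();
}
```

The same call then works unchanged for std::string, std::list<int> or std::forward_list<int>, so swapping the container type later does not require touching the test.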

Most of the time the factors above are insignificant, and raising the issue in a code review is really a waste of time.

Possible performance difference

While the Standard requires functional equivalence, implementations may implement the two functions differently, though I've struggled, and so far failed, to document a particularly plausible reason for doing so.

C++11 places more constraints than C++03 on the behaviour of other functions that affect implementation choices: data() must return a NUL-terminated buffer (previously only c_str() had to), and [size()] is now a valid index that must return a reference to a NUL character. For various subtle reasons, these restrictions make it even more likely that empty() will be no faster than size().
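The two C++11 guarantees mentioned here can be checked directly. This is only a sketch, assuming a C++11 or later compiler; the helper name nul_terminated_at_size is hypothetical:

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: checks the two C++11 guarantees for a given string.
//  - s[s.size()] is a valid index and yields a NUL character
//  - data() points at a buffer with a NUL at position size()
bool nul_terminated_at_size(const std::string& s) {
    return s[s.size()] == '\0' && s.data()[s.size()] == '\0';
}
```

On a conforming C++11 implementation this should hold for every string, including an empty one.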

Anyway - measure if you have to care.
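If you do end up measuring, a rough timing harness might look like the sketch below. Everything in it (the Timing struct, the time_predicate function, its parameters) is hypothetical scaffolding rather than an established benchmarking API, and a profiler or a dedicated benchmarking library would give more trustworthy numbers:

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical result type: how many strings tested empty, and how long it took.
struct Timing {
    std::size_t empties;
    long long nanoseconds;
};

// Hypothetical harness: applies is_empty to every input, `rounds` times,
// and counts the hits so the loop has an observable result.
template <typename Predicate>
Timing time_predicate(const std::vector<std::string>& inputs, int rounds,
                      Predicate is_empty) {
    std::size_t empties = 0;
    const auto start = std::chrono::steady_clock::now();
    for (int r = 0; r < rounds; ++r) {
        for (const auto& s : inputs) {
            if (is_empty(s)) {
                ++empties;
            }
        }
    }
    const auto stop = std::chrono::steady_clock::now();
    const auto elapsed =
        std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start);
    return Timing{empties, static_cast<long long>(elapsed.count())};
}
```

You would call it once with a lambda that returns s.empty() and once with a lambda that returns s.size() == 0, then compare the nanoseconds fields. Treat the numbers with caution: the optimiser may simplify either loop, and differences at this scale are easily lost in noise.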

answered Sep 26 '22 16:09

Tony Delroy
