Why '==' is slow on std::string?

Tags:

While profiling my application I realized that a lot of time is spent on string comparisons. So I wrote a simple benchmark and I was surprised that '==' is much slower than string::compare and strcmp! here is the code, can anyone explain why is that? or what's wrong with my code? because according to the standard '==' is just an operator overload and simply returnes !lhs.compare(rhs).

#include <iostream>
#include <vector>
#include <string>
#include <stdint.h>
#include "Timer.h"
#include <random>
#include <time.h>
#include <string.h>
using namespace std;
uint64_t itr  = 10000000000;//10 Billion
int len = 100;
int main() {
  srand(time(0));
  string s1(len,random()%128);
  string s2(len,random()%128);

uint64_t a = 0;
  Timer t;
  t.begin();
  for(uint64_t i =0;i<itr;i++){
    if(s1 == s2)
      a = i;
  }
  t.end();

  cout<<"==       took:"<<t.elapsedMillis()<<endl;

  t.begin();
  for(uint64_t i =0;i<itr;i++){
    if(s1.compare(s2)==0)
      a = i;
  }
  t.end();

  cout<<".compare took:"<<t.elapsedMillis()<<endl;

  t.begin();
  for(uint64_t i =0;i<itr;i++){
    if(strcmp(s1.c_str(),s2.c_str()))
      a = i;
  }
  t.end();

  cout<<"strcmp   took:"<<t.elapsedMillis()<<endl;

  return a;
}

And here is the result:

==       took:5986.74
.compare took:0.000349
strcmp   took:0.000778

And my compile flags:

CXXFLAGS = -O3 -Wall -fmessage-length=0 -std=c++1y

I use gcc 4.9 on a x86_64 linux machine.

Obviously using -o3 does some optimizations which I guess rolls out the last two loops totally; however, using -o2 still the results are weird:

for 1 billion iterations:

==       took:19591
.compare took:8318.01
strcmp   took:6480.35

P.S. Timer is just a wrapper class to measure spent time; I am absolutely sure about it :D

Code for Timer class:

#include <chrono>

#ifndef SRC_TIMER_H_
#define SRC_TIMER_H_


class Timer {
  std::chrono::steady_clock::time_point start;
  std::chrono::steady_clock::time_point stop;
public:
  Timer(){
    start = std::chrono::steady_clock::now();
    stop = std::chrono::steady_clock::now();
  }
  virtual ~Timer() {}

  inline void begin() {
    start = std::chrono::steady_clock::now();
  }

  inline void end() {
    stop = std::chrono::steady_clock::now();
  }

  inline double elapsedMillis() {
    auto diff = stop - start;
    return  std::chrono::duration<double, std::milli> (diff).count();
  }

  inline double elapsedMicro() {
    auto diff = stop - start;
    return  std::chrono::duration<double, std::micro> (diff).count();
  }

  inline double elapsedNano() {
    auto diff = stop - start;
    return  std::chrono::duration<double, std::nano> (diff).count();
  }

  inline double elapsedSec() {
    auto diff = stop - start;
    return std::chrono::duration<double> (diff).count();
  }
};

#endif /* SRC_TIMER_H_ */

555

asked Feb 26 '15 04:02

Behrooz

1 Answers

UPDATE: output of improved benchmark at http://ideone.com/rGc36a

==       took:21
.compare took:21
strcmp   took:14
==       took:21
.compare took:25
strcmp   took:14

The thing that proved crucial to get it working meaningfully was "outwitting" the compiler's ability to predict the strings being compared at compile time:

// more strings that might be used...
string s[] = { {len,argc+'A'}, {len,argc+'A'}, {len, argc+'B'}, {len, argc+'B'} };

if(s[i&3].compare(s[(i+1)&3])==0)  // trickier to optimise
  a += i;  // cumulative observable side effects

Note that in general, strcmp is not functionally equivalent to == or .compare when the text may embed NULs, as the former will get to "exit early". (That's not the reason it's "faster" above, but do read below for comments re possible variations with string length/content etc..)

Discussion / Earlier answer

Just have a look at your implementation - e.g.

echo '#include <string>' > stringE.cc
g++ -E stringE.cc | less

Search for the basic_string template, then for the operator== working on two string instances - mine is:

template<class _Elem,
    class _Traits,
    class _Alloc> inline
    bool __cdecl operator==(
            const basic_string<_Elem, _Traits, _Alloc>& _Left,
            const basic_string<_Elem, _Traits, _Alloc>& _Right)
    {
    return (_Left.compare(_Right) == 0);
    }

Notice that operator== is inline and simply calls compare. There's no way it's consistently significantly slower with normal optimisation levels enabled, though the optimiser might occasionally happen to optimise one loop better than another due to subtle side effects of surrounding code.

Your ostensible problem will have been caused by e.g. your code being optimised beyond the point of doing the intended work, for loops arbitrarily unrolled to different degrees, or other quirks or bugs in the optimisation or your timing. That's not unusual when you have unvarying inputs and loops that don't have any cumulative side-effects (i.e. the compiler can work out that intermediate values of a are not used, so only the last a = i need take affect).

So, learn to write better benchmarks. In this case, that's a bit tricky as having lots of distinct strings in memory ready to invoke the comparisons on, and selecting them in a way that the optimiser can't predict at compile time that's still fast enough not to overwhelm and obscure the impact of the string comparison code, is not an easy task. Further, beyond a point - comparing things spread across more memory makes cache affects more relevant to the benchmark, which further obscures the real comparison performance.

Still, if I were you I'd read some strings from a file - pushing each to a vector, then loop over the vector doing each of the three comparison operations between adjacent elements. Then the compiler can't possibly predict any pattern in the outcomes. You might find compare/== faster/slower than strcmp for strings often differing in the first character or three, but the other way around for long strings that are equal or only differing near the end, so make sure you try different kinds of input before you conclude you understand the performance profile.

answered Oct 11 '22 16:10

Tony Delroy

Related questions
                            
                                memory allocation and 0 size: can I get the memory leaks?
                            
                                Implementation Inside a Declaration C++ [duplicate]
                            
                                What other useful casts can be used in C++
                            
                                what does the operator string() { some code } do?
                            
                                C++: STL troubles with const class members
                            
                                Limit floating point precision?
                            
                                Why isn't main defined `main(std::vector<std::string> args)`?
                            
                                Can I add numbers with the C/C++ preprocessor?
                            
                                C++ template friend operator overloading
                            
                                Should one always check, if the new operator worked?
                            
                                use getline and while loop to split a string
                            
                                What is wrong with my For loops? i get warnings: comparison between signed and unsigned integer expressions [-Wsign-compare]
                            
                                How to use C++11 features with Autoconf? [duplicate]
                            
                                Convert negative image to positive [closed]
                            
                                Pick a unique random subset from a set of unique values
                            
                                Can a memory block allocated by using operator new/malloc persist beyond end of program execution? [duplicate]
                            
                                How do I cast a pointer to an int
                            
                                How to avoid namespace collision in C and C++
                            
                                How to use openGL with gcc on Mac?
                            
                                Reverse String C++ using char array

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Why '==' is slow on std::string?

Tags:

c++

string

std

compare

Behrooz

People also ask

1 Answers

Tony Delroy

Recent Activity

Donate For Us