C++ faster way to do string addition?

I'm finding standard string addition to be very slow so I'm looking for some tips/hacks that can speed up some code I have.

My code is basically structured as follows:

inline void add_to_string(string data, string &added_data) {
   if(added_data.length()<1) added_data = added_data + "{";
   added_data = added_data+data;
}

int main()
{
   int some_int = 100;
   float some_float = 100.0;
   string some_string = "test";

   string added_data;
   added_data.reserve(1000*64);

   for(int ii=0;ii<1000;ii++)
   {
      //variables manipulated here
      some_int = ii;  
      some_float += ii;
      some_string.assign(ii%20,'A');
      //then we concatenate the strings!
      stringstream fragment;
      fragment<<some_int <<","<<some_float<<","<<some_string;
      add_to_string(fragment.str(),added_data);
   }
   return 0;
}

Basic profiling shows that a lot of time is spent in the for loop. Is there anything I can do to speed this up significantly? Would it help to use C strings instead of C++ strings?

user788171 asked Dec 13 '12 01:12


4 Answers

String addition is not the problem you are facing. std::stringstream is known to be slow due to its design. On every iteration of your for loop, the stringstream is responsible for at least 2 allocations and 2 deallocations. The cost of each of these 4 operations is likely higher than that of the string addition.

Profile the following and measure the difference:

std::string stringBuffer;
for(int ii=0;ii<1000;ii++)
{
  //variables manipulated here
  some_int = ii;  
  some_float += ii;
  some_string.assign(ii%20,'A');
  //then we concatenate the strings!
  char buffer[128];
  sprintf(buffer, "%i,%f,%s",some_int,some_float,some_string.c_str());
  stringBuffer = buffer;
  add_to_string(stringBuffer ,added_data);
}

Ideally, replace sprintf with snprintf (standard since C++11), or with _snprintf or whatever equivalent your compiler supports if it is not available.
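As a minimal sketch of the bounded variant, assuming a C++11 compiler: the helper name append_record is hypothetical and not from the original post. It formats into a stack buffer and appends straight to the output string, which also skips the intermediate stringBuffer copy. Note that %f prints six decimals, so the text will not match what the stream's default float formatting produces.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper (not from the original post): formats one record with
// snprintf and appends it to `out` in place.
void append_record(std::string &out, int some_int, float some_float,
                   const std::string &some_string) {
    char buffer[128];
    // snprintf writes at most sizeof(buffer) bytes, terminator included.
    // It returns the length the full output would have had, or a negative
    // value if formatting failed.
    int n = std::snprintf(buffer, sizeof(buffer), "%d,%f,%s",
                          some_int, some_float, some_string.c_str());
    if (n < 0) return;  // formatting failed: append nothing
    std::size_t len = static_cast<std::size_t>(n);
    if (len >= sizeof(buffer)) len = sizeof(buffer) - 1;  // output was truncated
    out.append(buffer, len);
}
```

Inside the loop this would be called as append_record(added_data, some_int, some_float, some_string). Overlong records are silently truncated here; whether that is acceptable depends on the data.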

As a rule of thumb, use stringstream for formatting by default, and switch to faster but less safe functions such as sprintf or itoa only when performance matters.

Edit: that, and what didierc said: added_data += data;

Peter answered Nov 03 '22 17:11


You can save lots of string operations if you do not call add_to_string in your loop.

I believe this does the same (although I am not a C++ expert and do not know exactly what stringstream does):

stringstream fragment;
for(int ii=0;ii<1000;ii++)
{
  //variables manipulated here
  some_int = ii;  
  some_float += ii;
  some_string.assign(ii%20,'A');
  //then we concatenate the strings!
  fragment<<some_int<<","<<some_float<<","<<some_string;
}

// inlined add_to_string call without the if-statement ;)
added_data = "{" + fragment.str();

Veger answered Nov 03 '22 16:11


I see you used the reserve method on added_data, which should help by avoiding multiple reallocations of the string as it grows.

You should also use the += string operator where possible:

added_data += data;

I think the above should save significant time by avoiding unnecessary copies of added_data to and from a temporary string during the concatenation.

The += operator is a simpler version of the string::append method: it just copies data directly to the end of added_data. Since you called reserve, that operation alone should be very fast (almost equivalent to a strcpy).
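A small sketch of the reserve plus += pattern, assuming C++11. The function join_fragments and its vector parameter are hypothetical, introduced only to make the example self-contained.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: joins the fragments behind a leading "{", using
// reserve plus += so the result is built in a single buffer.
std::string join_fragments(const std::vector<std::string> &fragments) {
    std::size_t total = 1;  // room for the leading "{"
    for (const std::string &f : fragments) total += f.size();

    std::string added_data;
    added_data.reserve(total);  // capacity for the whole result up front
    added_data += '{';
    for (const std::string &f : fragments) {
        // += appends at the end of the existing buffer; added_data + f
        // would build a temporary string and then assign it back.
        added_data += f;
    }
    return added_data;
}
```

Because the capacity is reserved before the loop, none of the appends should need to reallocate.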

But why go through all this, when you are already using a stringstream to handle input? Keep it all in there to begin with!

The stringstream class is indeed not very efficient.

See the documentation of the stringstream class for more on how to use it, if necessary, but your approach of using a string as a buffer seems to sidestep that class's speed issue.

At any rate, stay away from any attempt at reimplementing the speed-critical code in pure C unless you really know what you are doing. Some other SO posts support the idea of doing it, but I think it's best (read: safer) to rely as much as possible on the standard library, which will be enhanced over time and takes care of many corner cases you (or I) wouldn't think of. If your input data format is set in stone, then you might start thinking about taking that road, but otherwise it's premature optimization.

didierc answered Nov 03 '22 16:11


If you start added_data with a "{", you can remove the if from your add_to_string function: the branch is taken exactly once, when the string is empty, so you might as well make it non-empty right away.

In addition, your add_to_string makes a copy of the data; this is not necessary, because it does not get modified. Accepting the data by const reference should speed things up for you.
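The two changes so far might look like the sketch below. The helper make_added_data is a hypothetical name, and the use of += instead of + is borrowed from another answer rather than from this one.

```cpp
#include <string>

// `data` is passed by const reference, so the call no longer copies it.
// The emptiness check is gone because the accumulator starts with "{".
inline void add_to_string(const std::string &data, std::string &added_data) {
    added_data += data;
}

// Hypothetical helper: creates the accumulator already holding "{".
inline std::string make_added_data(std::size_t expected_size) {
    std::string added_data;
    added_data.reserve(expected_size);
    added_data += '{';
    return added_data;
}
```

In main, string added_data = make_added_data(1000*64); would replace the declaration and the reserve call.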

Finally, changing your added_data from string to stringstream should let you append to it in the loop, without the intermediate stringstream that gets created, copied, and thrown away on each iteration.
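A sketch of that last suggestion, with assumptions: build_added_data is a hypothetical function wrapping the question's loop, and std::ostringstream is used because only output is needed here.

```cpp
#include <sstream>
#include <string>

// Hypothetical function reproducing the question's loop with a single
// output stream as the accumulator.
std::string build_added_data(int count) {
    std::ostringstream added_data;
    added_data << '{';
    float some_float = 100.0f;
    std::string some_string;
    for (int ii = 0; ii < count; ++ii) {
        some_float += ii;
        some_string.assign(ii % 20, 'A');
        // Written straight into the accumulator: no per-iteration stream.
        added_data << ii << ',' << some_float << ',' << some_string;
    }
    return added_data.str();  // one copy out of the stream at the end
}
```

Whether this beats appending to a std::string depends on the standard library implementation, so it is worth profiling both.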

Sergey Kalinichenko answered Nov 03 '22 18:11