
C-Style strings versus library string performance

Tags:

c++

Question

Is it true that C-style string operations, on average, execute 5 times slower than library string class operations, as C++ Primer, 4th Edition would have me believe?

Why ask?

Because when I actually performance test, it turns out that C-style strings are about 50% faster for a particular example (one used in the book).


Setup

I am reading C++ Primer, 4th Edition, which (on page 138) lists this code:

//  C-style character string implementation
const char *pc = "a very long literal string";
const size_t  len = strlen(pc +1);    //  space to allocate

//  performance test on string allocation and copy
for (size_t ix = 0; ix != 1000000; ++ix) {
    char *pc2 = new char[len + 1];  //  allocate the space
    strcpy(pc2, pc);                //  do the copy
    if (strcmp(pc2, pc))            //  use the new string
        ;    //  do nothing
    delete [] pc2;                  //  free the memory
}

//  string implementation
string str("a very long literal string");

//  performance test on string allocation and copy
for(int ix = 0; ix != 1000000; ++ix) {
    string str2 = str;  //  do the copy, automatically allocated
    if (str != str2)    //  use the new string
        ;   //  do nothing
}    //  str2 is automatically freed

Now bear in mind that I'm aware of the strlen(pc +1) on line 2, and that the first for loop uses size_t but doesn't subscript the array, so it might as well have been int, but this is exactly how it is written in the book.

When I test this code (with strlen(pc) + 1, which I presume was intended), the first block executes about 50% faster than the second block, which leads to the conclusion that C-style strings are faster than the library string class for this particular example.
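To make the strlen(pc +1) issue concrete, here is a minimal sketch. The helper names bytesNeeded and bytesAllocatedAsPrinted are made up for illustration and do not appear in the book. It assumes a non-empty string, so that pc + 1 still points inside the literal.

```cpp
#include <cstddef>
#include <cstring>

// Bytes required to hold a copy of s, including the terminating '\0'.
inline std::size_t bytesNeeded(const char* s) {
    return std::strlen(s) + 1;
}

// Bytes the listing as printed ends up requesting: strlen(pc + 1) skips the
// first character, and the loop then asks for new char[len + 1].
// Assumes s is non-empty.
inline std::size_t bytesAllocatedAsPrinted(const char* s) {
    const std::size_t len = std::strlen(s + 1);
    return len + 1;
}
```

For the 26-character literal used here, the copy needs 27 bytes while the printed code requests only 26, so strcpy would write one byte past the end of the buffer.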

However, I bet I'm missing something (probably obvious), because of what is written in the book (page 139) relating to the code above:

As it happens, on average, the string class implementation executes considerably faster than the C-style string functions. The relative average execution times on our more than five-year-old PC are as follows:

 user    0.47  # string class 
 user    2.55  # C-style character string

So which one is it? Should I have used a longer string literal? Maybe it was because they used the GNU C Compiler and I used the Microsoft one? Is it because I have a faster computer?

Or is the book just wrong on this one?

Edit

Microsoft (R) 32-bit C/C++ Optimizing Compiler version 16.00.40219.01 for 80x86

neeKo asked May 27 '26 14:05


2 Answers

Your conclusion that C-style strings are faster for this example, with your compiler and machine, almost certainly arises because (one must presume) you

  • forgot to turn on optimization,
  • forgot to make the string length "unknown" to the compiler (this is tricky) so as to prevent it from optimizing away strlen calls, and
  • forgot to turn off safety range checking (if applicable), which would slow down std::string.

Here's the code I tested with:

#include <assert.h>
#include <iostream>
#include <time.h>
#include <string>
#include <string.h>
using namespace std;

extern void doNothing( char const* );

class StopWatch
{
private:
    clock_t     start_;
    clock_t     end_;
    bool        isRunning_;
public:
    void start()
    {
        assert( !isRunning_ );
        start_ = clock();
        end_ = 0;
        isRunning_ = true;
    }

    void stop()
    {
        if( isRunning_ )
        {
            end_ = clock();
            isRunning_ = false;
        }
    }

    double seconds() const
    {
        return double( end_ - start_ )/CLOCKS_PER_SEC;
    }

    StopWatch(): start_(), end_(), isRunning_() {}
};

inline void testCStr( int const argc, char const* const argv0 )
{
    //  C-style character string implementation
    //const char *pc = "a very long literal string";
    const char *pc = (argc == 10000? argv0 : "a very long literal string");
    //const size_t  len = strlen(pc +1);    //  space to allocate
    const size_t  len = strlen(pc)+1;    //  space to allocate

    //  performance test on string allocation and copy
    for (size_t ix = 0; ix != 1000000; ++ix) {
        char *pc2 = new char[len + 1];  //  allocate the space
        strcpy(pc2, pc);                //  do the copy
        if (strcmp(pc2, pc))            //  use the new string
            //;   //  do nothing
            doNothing( pc2 );
        delete [] pc2;                  //  free the memory
    }
}

inline void testCppStr( int const argc, char const* const argv0 )
{
    //  string implementation
    //string str("a very long literal string");
    string str( argc == 10000? argv0 : "a very long literal string" );

    //  performance test on string allocation and copy
    for(int ix = 0; ix != 1000000; ++ix) {
        string str2 = str;  //  do the copy, automatically allocated
        if (str != str2)    //  use the new string
            //;   //  do nothing
            doNothing( &str2[0] );
    }    //  str2 is automatically freed
}

int main( int argc, char* argv[] )
{
    StopWatch   timer;

    timer.start();  testCStr( argc, argv[0] );  timer.stop();
    cout << "C strings: " << timer.seconds() << " seconds." << endl;

    timer.start();  testCppStr( argc, argv[0] );  timer.stop();
    cout << "C++ strings: " << timer.seconds() << " seconds." << endl;
}
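
The second source file on the command line below, doNothing.cpp, was not posted. A plausible version, offered purely as an assumption, is an empty function defined in its own translation unit, so that without link-time optimization the compiler cannot see that the call has no effect:

```cpp
// doNothing.cpp: hypothetical contents, the original file was not shown.
// Keeping the definition out of foo.cpp stops the optimizer from proving
// that the call (and the comparison guarding it) can be dropped.
void doNothing( char const* )
{
}
```

Any function the optimizer cannot see into would serve the same purpose.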

Typical result:

[d:\dev\test]
> g++ foo.cpp doNothing.cpp -O2

[d:\dev\test]
> a
C strings: 0.417 seconds.
C++ strings: 0.084 seconds.

[d:\dev\test]
> a
C strings: 0.398 seconds.
C++ strings: 0.082 seconds.

[d:\dev\test]
> a
C strings: 0.4 seconds.
C++ strings: 0.083 seconds.

[d:\dev\test]
> _

That said, C++ strings are not generally the fastest possible implementation of strings.

Generally, immutable strings (reference counted) beat C++ strings by a good margin, and, surprising to me when I learned that, a string implementation that simply copies the string data is faster still, when it uses an appropriate, fast custom allocator. However, don't ask me how to implement the latter. I only saw the code and test results in another forum, which someone graciously provided after I'd pointed out the general superiority of immutable strings in a discussion with STL and there was some disagreement. ;-)
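
To illustrate why reference-counted immutable strings can copy cheaply, here is a minimal sketch. SharedString is a made-up name, not the code from that forum discussion, and building it on std::shared_ptr (C++11) is an assumption chosen for brevity. A copy shares the existing buffer and only bumps a counter, with no allocation and no character copying.

```cpp
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical immutable string handle: all copies share one buffer.
class SharedString {
public:
    explicit SharedString(const std::string& s)
        : data_(std::make_shared<const std::string>(s)) {}

    const char* c_str() const { return data_->c_str(); }
    std::size_t size() const { return data_->size(); }
    long useCount() const { return data_.use_count(); }

    // Handles that share a buffer are equal without comparing characters.
    friend bool operator==(const SharedString& a, const SharedString& b) {
        return a.data_ == b.data_ || *a.data_ == *b.data_;
    }

private:
    std::shared_ptr<const std::string> data_;
};
```

Note that std::shared_ptr updates its count atomically, which has a cost of its own, so this sketch shows the idea rather than a tuned implementation.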

Cheers and hth. - Alf answered May 30 '26 04:05


First of all: there is no definitive answer to this question.

The reason is that the performance depends on the library implementation, the compiler and the options you use, the operating system you use and the CPU architecture you use.

The book is somewhat old (2005; hardware and software have evolved), and its code was tested on old compilers, old library implementations and old hardware. Whatever it says about performance is based on its authors' observations, which would definitely vary between different people trying out the code with different compiler, library and hardware combinations.

The best you can do is try it yourself. Simple "benchmarks" like these won't tell you much about the performance of C-style strings versus std::string in common real-world situations unless they cover as many ways of testing and comparing performance as possible, which would be quite a big project in itself.

Note that compiler optimizations can deceive you with code like that shown in the book. For example, because the if-blocks are empty, the whole if-statement and the expression within it (in this case, the call to strcmp) can be removed(*). It can be very hard to write meaningful benchmarks that apply to the real world using code blocks like those in the book.

Also note that whatever the results of these micro-benchmarks turn out to be, they apply only to the operations being benchmarked. In other words, just because string allocation, copy and comparison seem to be x times faster with either std::string or C-style strings does not mean that one is x times faster than the other in general!

*: Tested the C-style string code with GCC 4.7.1 with -Ofast and there is no reference to strcmp in the compiled executable, suggesting that the string comparison was eliminated as unnecessary in the code - which it indeed is - because the if-block is empty so there's no reason to even have the whole if there in the first place!
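
One common way to keep such a comparison from being discarded is to give its result an observable side effect. The following is a minimal sketch under that assumption. The names g_sink and copyAndCompare are made up for illustration, and this is not the code measured here.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical sink: stores to a volatile object count as observable
// behavior, so the compiler has to perform each one.
volatile int g_sink = 0;

// Runs the book's C-style loop body `iterations` times and returns the
// number of copies that did not compare equal to the original.
inline std::size_t copyAndCompare(const char* pc, std::size_t iterations) {
    const std::size_t len = std::strlen(pc);
    std::size_t mismatches = 0;
    for (std::size_t ix = 0; ix != iterations; ++ix) {
        char* pc2 = new char[len + 1];        // allocate the space
        std::strcpy(pc2, pc);                 // do the copy
        const int cmp = std::strcmp(pc2, pc); // use the new string
        g_sink = cmp;                         // make the result observable
        if (cmp != 0)
            ++mismatches;
        delete [] pc2;                        // free the memory
    }
    return mismatches;
}
```

This keeps the result alive, but it does not by itself stop a compiler from working out the answer at compile time when it can see the string contents, so the input should still come from somewhere the optimizer cannot predict.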

To add my own observations: I split the two pieces of code into distinct functions, made 100 repeated calls (with a for-loop) to one of them, and measured the running time with the Unix time utility. Compiled with GCC 4.7.1 and -Ofast.

100 calls to the C-style string function took about 7.05 seconds (3 runs, varying between 7 and 7.1 seconds), while 100 calls to the std::string version took only around 1.4 seconds on average over 3 runs! Indeed, this would suggest that std::string far outperforms C-style strings.
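
The timings above came from the external time utility. An in-process alternative, sketched here on the assumption that C++11's <chrono> is available (with GCC 4.7.1 that means compiling with -std=c++11), is a small helper that times repeated calls. The name secondsFor is made up for illustration.

```cpp
#include <chrono>

// Hypothetical helper: calls fn `repetitions` times and returns the
// elapsed wall-clock time in seconds, measured with a monotonic clock.
template <typename Fn>
double secondsFor(Fn fn, int repetitions) {
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i != repetitions; ++i)
        fn();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

Passing each of the two test functions in turn, for example secondsFor(testFn, 100) where testFn stands for one of them, would give per-function figures from a single run of the program.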

zxcdw answered May 30 '26 03:05