Small is beautiful, but is it also fast?

Tags: c++, c, parsing, coding-style

I had an argument with a co-worker about the implementation of a simple string parser. One version is "small": 10 lines of code using C++ streams. The other is 70 lines of code, using switch cases and iterating over the string char by char. We ran each one for 1 million iterations and measured the speed with the time command. It appears that the long and ugly approach is about 1 second faster on average.

The problem. Input: the string

"v=spf1 mx include:_spf-a.microsoft.com include:_spf-b.microsoft.com include:_spf-c.microsoft.com include:_spf-ssg-a.microsoft.com ip4:131.107.115.212 ip4:131.107.115.215 ip4:131.107.115.214 ip4:205.248.106.64 ip4:205.248.106.30 ip4:205.248.106.32 ~all a:1.2.3.4"

Output: a map<string, list<string> > holding all the values for each key, such as ip4, include, and a.

Example output of one iteration on the input string given above:

key:a

1.2.3.4,

key:include

_spf-a.microsoft.com, _spf-b.microsoft.com, _spf-c.microsoft.com, _spf-ssg-a.microsoft.com,

key:ip4

131.107.115.212, 131.107.115.215, 131.107.115.214, 205.248.106.64, 205.248.106.30, 205.248.106.32,

The "small is beautiful" parser:

        istringstream iss(input);
        map<string, list<string> > data;
        string item;
        string key;
        string value;

        size_t pos;
        while (iss.good()) {
                iss >> item;
                pos = item.find(":");
                key = item.substr(0,pos);
                data[key].push_back(item.substr(pos+1));
        }

The second, faster approach:

  typedef enum {I,Include,IP,A,Other} State;
  State state = Other;
  string line = input;
  string value;
  map<string, list<string> > data;
  bool end = false;
  size_t pos = 0;
  while (pos < line.length()) {
   switch (state) {
    case Other:
     value.clear();
     switch (line[pos]) {
      case 'i':
       state = I;
       break;
      case 'a':
       state = A;
       break;
      default:
       while(pos < line.length() && line[pos]!=' ') // bounds check first
        pos++;
     }
     pos++;
     break;
    case I:
     switch (line[pos]) {
      case 'p':
       state = IP;
       break;
      case 'n':
       state = Include;
       break;
     }
     pos++;
     break;
    case IP:
     pos+=2;
     for (;pos<line.length() && line[pos]!=' '; pos++) {
      value+=line[pos];
     }
     data["ip4"].push_back(value);
     state = Other;
     pos++;
     break;
    case Include:
     pos+=6;
     for (;pos<line.length() && line[pos]!=' '; pos++) {
      value+=line[pos];
     }
     data["include"].push_back(value);
     state = Other;
     pos++;
     break;
    case A:
     if (line[pos]==' ')
      data["a"].push_back("a"); // bare "a" with no value
     else {
      pos++; // skip the ':'
      for (;pos<line.length() && line[pos]!=' '; pos++) {
       value+=line[pos];
      }
      data["a"].push_back(value); // only push a value that was actually read
     }
     state = Other;
     pos++;
     break;
   }
  }

I truly believe that "small is beautiful" is the way to go, and I dislike the longer code presented here, but it's hard to argue against it when the code runs faster.

Can you suggest ways to optimize or completely rewrite the small approach so that it stays small and beautiful but also runs faster?

Update: Added state definition and initialization. Context: the longer approach completes 1 million iterations on the same string in 15.2 seconds, the smaller code does the same in 16.5 seconds on average.

Both versions were compiled with g++ -O3 (g++-4.4) and run on an Intel(R) Core(TM)2 Duo CPU E8200 @ 2.66GHz under Linux Mint 10.
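For anyone who wants to repeat this kind of measurement inside the program rather than with the time command, here is a rough sketch of a timing loop. It is not what was used for the numbers above: time_iterations is a made-up helper name, and it assumes a C++11 compiler (newer than the g++ 4.4 mentioned here) because it relies on <chrono>.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <string>

// Hypothetical helper: calls parse(input) `iterations` times and returns the
// elapsed wall-clock time in seconds. `parse` can be any callable that takes a
// const std::string& and returns something with a size() member.
template <typename Parser>
double time_iterations(Parser parse, const std::string& input, std::size_t iterations) {
    assert(iterations > 0);
    std::size_t sink = 0;  // use each result so the call is not optimized away
    const std::chrono::steady_clock::time_point start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < iterations; ++i) {
        sink += parse(input).size();
    }
    const std::chrono::duration<double> elapsed = std::chrono::steady_clock::now() - start;
    volatile std::size_t observed = sink;  // make the accumulated value observable
    (void)observed;
    return elapsed.count();
}
```

Timing both parsers in the same process with the same input and iteration count keeps process start-up cost out of the comparison, which the time command cannot do.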

The good side has won this battle :) I found a small bug in the small program: it added even invalid values to the map, the ones that did not have a ":" in the string. After adding an "if" statement to check for the presence of the colon, the smaller code runs faster, much faster. Now the timings are: "small and beautiful": 12.3 seconds, "long and ugly": 15.2 seconds.
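The changed code is not shown above, so the following is only a sketch of what the small parser might look like with such a check. The function name parse_small and the SpfData typedef are made up for illustration, and the loop condition is written as while (iss >> item), which is an assumption about style rather than part of the described fix.

```cpp
#include <cassert>
#include <list>
#include <map>
#include <sstream>
#include <string>

typedef std::map<std::string, std::list<std::string> > SpfData;

// Hypothetical wrapper around the small parser, with the colon check added.
SpfData parse_small(const std::string& input) {
    std::istringstream iss(input);
    SpfData data;
    std::string item;
    // Testing the extraction itself means a failed read never reuses a stale item.
    while (iss >> item) {
        const std::string::size_type pos = item.find(':');
        if (pos == std::string::npos) {
            continue;  // no colon: skip tokens such as "v=spf1", "mx", "~all"
        }
        assert(pos < item.size());  // find() returned a valid index
        data[item.substr(0, pos)].push_back(item.substr(pos + 1));
    }
    return data;
}
```

Skipping tokens without a colon also avoids creating map entries and list nodes for them, which is one plausible reason the fixed version got faster.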

Small is beautiful :)

486

asked Nov 24 '10 09:11

Vladimir

2 Answers

Smaller is not necessarily faster. One example: bubble sort is very short, but it is O(n * n). Quicksort and merge sort are longer and seem more complicated, but they are O(n log n) (on average, in the case of quicksort).

But having said that... always make sure the code is readable, or if the logic is complicated, add good comments to it so that other people can follow.
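To make the bubble sort comparison concrete, here is a small sketch that counts comparisons. The function names are made up for this example, std::sort stands in for the O(n log n) algorithms, and it assumes a C++11 compiler (lambdas, std::is_sorted).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Plain bubble sort with no early exit; returns the number of comparisons made.
// For n elements that is always n * (n - 1) / 2.
std::size_t bubble_sort(std::vector<int>& v) {
    std::size_t comparisons = 0;
    for (std::size_t pass = 0; pass + 1 < v.size(); ++pass) {
        for (std::size_t i = 0; i + 1 < v.size() - pass; ++i) {
            ++comparisons;
            if (v[i + 1] < v[i]) {
                std::swap(v[i], v[i + 1]);
            }
        }
    }
    assert(std::is_sorted(v.begin(), v.end()));
    return comparisons;
}

// std::sort with a comparator that counts how often it is called.
std::size_t std_sort_counted(std::vector<int>& v) {
    std::size_t comparisons = 0;
    std::sort(v.begin(), v.end(), [&comparisons](int a, int b) {
        ++comparisons;
        return a < b;
    });
    return comparisons;
}
```

For 1000 elements the bubble sort above always makes 1000 * 999 / 2 = 499500 comparisons, whatever the input order. The count for std::sort depends on the library implementation, but it grows roughly as n log n.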

102

answered Sep 27 '22 19:09

nonopolarity

The fewer lines of code you have, the better. Don't add 60 more lines if you don't really need them. If it's slow, profile. Then optimize. Don't optimize before you need to. If it runs fine, leave it as it is. Adding more code will add more bugs. You don't want that. Keep it short. Really.

Read this wiki post.

"Premature optimization is the root of all evil" - Donald Knuth, a pretty smart guy.

It is possible to write faster code by writing less of it, just more intelligently. One way to aid speed: do less.

Quoting Raymond Chen:

"One of the questions I get is, 'My app is slow to start up. What are the super secret evil tricks you guys at Microsoft are using to get your apps to start up faster?' The answer is, 'The super evil trick is to do less stuff.'" -- "Five Things Every Win32 Programmer Needs to Know" (16 Sept. 2005)

Also, check out why GNU grep is fast.
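Applied to the parser in the question, "doing less" could mean dropping the istringstream and scanning the input string directly. This is only a sketch under some assumptions: parse_no_stream and SpfData are made-up names, tokens are assumed to be separated by single spaces (operator>> also accepts tabs and newlines), and whether it is actually faster would have to be measured.

```cpp
#include <algorithm>
#include <cassert>
#include <list>
#include <map>
#include <string>

typedef std::map<std::string, std::list<std::string> > SpfData;

// Hypothetical stream-free variant: walks the input once, token by token.
SpfData parse_no_stream(const std::string& input) {
    SpfData data;
    std::string::const_iterator tok = input.begin();
    const std::string::const_iterator last = input.end();
    while (tok != last) {
        // A token ends at the next space or at the end of the input.
        const std::string::const_iterator tok_end = std::find(tok, last, ' ');
        // Look for the colon inside this token only.
        const std::string::const_iterator colon = std::find(tok, tok_end, ':');
        assert(tok <= colon && colon <= tok_end);
        if (colon != tok_end) {
            data[std::string(tok, colon)].push_back(std::string(colon + 1, tok_end));
        }
        // Step over the separating space, if there is one.
        tok = (tok_end == last) ? last : tok_end + 1;
    }
    return data;
}
```

It is still about ten lines of logic, so it stays in the "small" camp. The design choice is to skip the stream's locale and state handling and to build strings only for tokens that are actually stored.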

answered Sep 27 '22 18:09

darioo

