Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Small is beautiful, but is it also fast?

I had an argument with a co-worker about implementation of simple string parser. One is "small", 10 lines of code, using c++ and streams, the other is 70 lines of code, using switch cases and iterating string char by char. We tested it over 1 million of iterations, and measured speed using time command. It appears that the long and ugly approach is 1 second faster on average.

The problem: Input: string

"v=spf1 mx include:_spf-a.microsoft.com include:_spf-b.microsoft.com include:_spf-c.microsoft.com include:_spf-ssg-a.microsoft.com ip4:131.107.115.212 ip4:131.107.115.215 ip4:131.107.115.214 ip4:205.248.106.64 ip4:205.248.106.30 ip4:205.248.106.32 ~all a:1.2.3.4"

Output: map<string, list<string>> with all the values for each key such as: ip4, include,a

example output of one iteration, on the input string given above:

key:a

1.2.3.4,

key:include

_spf-a.microsoft.com, _spf-b.microsoft.com, _spf-c.microsoft.com, _spf-ssg-a.microsoft.com,

key:ip4

131.107.115.212, 131.107.115.215, 131.107.115.214, 205.248.106.64, 205.248.106.30, 205.248.106.32,

The "small is beautiful" parser:

        istringstream iss(input);
        map<string, list<string> > data;
        string item;
        string key;
        string value;

        size_t pos;
        while (iss.good()) {
                iss >> item;
                pos = item.find(":");
                key = item.substr(0,pos);
                data[key].push_back(item.substr(pos+1));
        }

The second faster approach:

  typedef enum {I,Include,IP,A,Other} State;
  State state = Other;
  string line = input;
  string value;
  map<string, list<string> > data;
  bool end = false;
  size_t pos = 0;
  while (pos < line.length()) {
   switch (state) {
    case Other:
     value.clear();
     switch (line[pos]) {
      case 'i':
       state = I;
       break;
      case 'a':
       state = A;
       break;
      default:
       while(line[pos]!=' ' && pos < line.length())
        pos++;
     }
     pos++;
     break;
    case I:
     switch (line[pos]) {
      case 'p':
       state = IP;
       break;
      case 'n':
       state = Include;
       break;
     }
     pos++;
     break;
    case IP:
     pos+=2;
     for (;line[pos]!=' ' && pos<line.length(); pos++) {
      value+=line[pos];
     }
     data["ip4"].push_back(value);
     state = Other;
     pos++;
     break;
    case Include:
     pos+=6;
     for (;line[pos]!=' ' && pos<line.length(); pos++) {
      value+=line[pos];
     }
     data["include"].push_back(value);
     state = Other;
     pos++;
     break;
    case A:
     if (line[pos]==' ')
      data["a"].push_back("a");
     else {
      pos++;
      for (;line[pos]!=' ' && pos<line.length(); pos++) {
       value+=line[pos];
      }
     }
     data["a"].push_back(value);
     state = Other;
     pos++;
     break;
   }
  }

I truly believe that "small is beautiful" is the way to go, and i dislike the longer code presented here, but it's hard to argue about it, when the code runs faster.

Can you suggest a ways to optimize or completely rewrite the small approach, in a way, where it stays small and beautiful but also runs faster?

Update: Added state definition and initialization. Context: the longer approach completes 1 million iterations on the same string in 15.2 seconds, the smaller code does the same in 16.5 seconds on average.

both versions compiled with g++ -O3, g++-4.4, ran on Intel(R) Core(TM)2 Duo CPU E8200 @ 2.66GHz, Linux Mint 10

The good side have won this battle :) I found small bug in the small program, it added even invalid values to the map, the ones that did not had the ":" colon in the string. After adding an "if" statement to check for the presence of colon, the smaller code runs faster, much faster. Now the timings are: "small and beautiful":12.3 and long and ugly: 15.2.

Small is beautiful :)

like image 486
Vladimir Avatar asked Nov 24 '10 09:11

Vladimir


People also ask

What is the main point of small is beautiful?

In his book Small Is Beautiful (1973), he argued that capitalism brought higher living standards at the cost of deteriorating culture. His belief that natural resources should be conserved led him to conclude that bigness—in particular, large industries and large cities—would lead to the depletion of those resources.

Is small is beautiful relevant today?

The emergence of buddhist economics In his book, Schumacher discusses the principles of Buddhist economics and addresses how modern economic thinking causes much of the emotional distress we experience in our 21st-century lives. Yes, the book published in 1973, but it is more relevant today than it was in its time.

What is the importance of the book small is beautiful?

Answer: The book 'small is beautiful' is published by schumacher in 1974 which explain the Gandhian model of environmental protection and resource conservation in an international platform.

Who wrote the book small is beautiful and what is it based on?

About the author Ernst Friedrich "Fritz" Schumacher was an internationally influential economic thinker, statistician and economist in Britain, serving as Chief Economic Advisor to the UK National Coal Board for two decades.


2 Answers

Smaller may not be faster. One example: bubble sort is very short, but it is O(n * n). QuickSort and MergeSort is longer and seems more complicated, but it is O(n log n).

But having said that... always make sure the code is readable, or if the logic is complicated, add good comments to it so that other people can follow.

like image 102
nonopolarity Avatar answered Sep 27 '22 19:09

nonopolarity


Less lines of code you have; the better. Don't add 60 lines more if you really don't need to. If it's slow, profile. Then optimize. Don't optimize before you need it. If it runs fine, leave it as it is. Adding more code will add more bugs. You don't want that. Keep it short. Really.

Read this wiki post.

"Premature optimization is the root of all evil" - Donald Knuth, a pretty smart guy.

It is possible to write faster code by writing less of it, just more intelligently. One way to aid speed: do less.

Quoting Raymond Chen:

"One of the questions I get is, "My app is slow to start up. What are the super secret evil tricks you guys at Microsoft are using to get your apps to start up faster?" The answer is, "The super evil trick is to do less stuff." -- "Five Things Every Win32 Programmer Needs to Know" (16 Sept. 2005)

Also, check out why GNU grep is fast.

like image 44
darioo Avatar answered Sep 27 '22 18:09

darioo