Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why is this OCaml program faster than my C program?

I wrote a basic Hippity Hop program in C, Python, and OCaml. Granted, this is probably not a very good benchmark of these three languages. But the results I got were something like this:

  • Python: .350 seconds
  • C: .050 seconds
  • interpreted OCaml: .040 seconds
  • compiled OCaml: .010

The python performance doesn't really surprise me, but I'm rather shocked at how fast the OCaml is (especially the interpreted version). For comparison, I'll post the C version and the OCaml version.

C

#include <stdio.h>
#include <stdlib.h>
#include <assert.h>

long get_count(char *name);

int main(int argc, char *argv[])
{
  if (argc != 2){
    printf("Filename must be specified as a positional argument.\n");
    exit(EXIT_FAILURE);
  }

  long count_no = get_count(argv[1]);

  int i;
  for (i = 1; i <= count_no; i++){
    if (((i % 3) == 0) && ((i % 5) == 0)){
      printf("Hop\n");
      continue;
    }
    if ((i % 3) == 0){
      printf("Hoppity\n");
    }
    if ((i % 5) == 0){
      printf("Hophop\n");
    }
  }
  return 0;
}

long get_count(char *name){
  FILE *fileptr = fopen(name, "r");
  if (!fileptr){
    printf("Unable to open file %s.\n", name);
    exit(EXIT_FAILURE);
  }
  size_t text_len = 20;
  char *file_text = calloc(text_len, sizeof(char));
  while (!feof(fileptr)){
    fread(file_text, sizeof(char), text_len, fileptr);
    assert(!ferror(fileptr));
    text_len += 20;
    file_text = realloc(file_text, text_len * sizeof(char));
  }
  long file_as_int = strtol(file_text, NULL, 10);

  free(file_text);
  return file_as_int;
}

OCaml

open String;;

let trim str =
  if str = "" then "" else
  let search_pos init p next =
    let rec search i =
      if p i then raise(Failure "empty") else
      match str.[i] with
      | ' ' | '\n' | '\r' | '\t' -> search (next i)
      | _ -> i
    in
    search init
  in
  let len = String.length str in
  try
    let left = search_pos 0 (fun i -> i >= len) (succ)
    and right = search_pos (len - 1) (fun i -> i < 0) (pred)
    in
    String.sub str left (right - left + 1)
  with
  | Failure "empty" -> ""
;;

let rec iterate_over_numbers curr_num max_num =
  (
   if curr_num <= max_num then (
     if ((curr_num mod 3) == 0) && ((curr_num mod 5) == 0) then 
       print_endline "Hop"
     else if (curr_num mod 3) == 0 then 
       print_endline "Hoppity"
     else if (curr_num mod 5) == 0 then
       print_endline "Hophop";
     iterate_over_numbers (curr_num + 1) max_num
   ))
;;


let fname = Sys.argv.(1);;
let infile = open_in fname;;
let file_text = trim (input_line infile);;
close_in infile;;
let input_number = int_of_string file_text;;
iterate_over_numbers 1 input_number;;

But I'm curious to know why I'm getting these results. Am I doing something dumb in my C program, or is this just something OCaml is faster at? It seems to me a bit strange that an interpreted program is running a little faster than the C version, and the compiled program is running 5 times as fast.

like image 841
Jason Baker Avatar asked Oct 19 '09 00:10

Jason Baker


2 Answers

Time under 0.05 can be a simple noise. Repeat the main program enough times to actually get ~1s of execution time in C. (I mean repeating it in a loop in the program itself, not by running it again)

Did you compile your code with optimisations turned on? Did you try reducing the number of branches? (and comparisons)

if (i % 3 == 0) {
  if (i % 5 == 0) {
    printf("Hop\n");
    continue;
  }
  printf("Hoppity\n");
} else if (i % 5 == 0){
  printf("Hophop\n");
}

Did you try looking at the assembler output?

Also printf is pretty slow. Try puts("Hop") instead, since you don't use the formating anyways.

like image 182
viraptor Avatar answered Oct 30 '22 03:10

viraptor


Your C code isn't the equivalent of the OCaml code - you used 'else if' in the OCaml to avoid having to recompute moduli quite so much.

There's an awful lot of code in that 'read the long integer'. Why not just use fscanf(); it skips blanks and all that automatically, and avoids you doing the malloc() etc. I don't often recommend using fscanf(), but this looks like a setup for it - a single line, possibly with spaces either side, no funny stuff.


Curiosity killed the cat - but in this case, not the Leopard.

I downloaded OCaml 3.11.1 for MacOS X Intel and copied the OCaml code from the question into xxx.ml (OCaml), and compiled that into an object file xxx (using "ocamlc -o xxx xxx.ml"); I copied the C code verbatim into yyy.c and created a variant zzz.c using fscanf() and fclose(), and compiled them using "gcc -O -o yyy yyy.c" and "gcc -O -o zzz zzz.c". I created a file 'file3' containing: "987654" plus a newline. I created a shell script runthem.sh as shown. Note that 'time' is a pesky command that thinks its output must go to stderr even when you'd rather it didn't - you have to work quite hard to get the output where you want it to go. (The range command generates numbers in the given range, inclusive - hence 11 values per program.)

Osiris JL: cat runthem.sh
for prog in "ocaml xxx.ml" ./xxx ./yyy ./zzz
do
    for iter in $(range 0 10)
    do
        r=$(sh -c "time $prog file3 >/dev/null" 2>&1)
        echo $prog: $r
    done
done
Osiris JL: 

I ran all this on a modern MacBook Pro (3 GHz Core 2 Duo etc, 4GB RAM) running Leopard (10.5.8). The timings I got are shown by:

Osiris JL: sh runthem.sh
ocaml xxx.ml: real 0m0.961s user 0m0.524s sys 0m0.432s
ocaml xxx.ml: real 0m0.953s user 0m0.516s sys 0m0.430s
ocaml xxx.ml: real 0m0.959s user 0m0.517s sys 0m0.431s
ocaml xxx.ml: real 0m0.951s user 0m0.517s sys 0m0.430s
ocaml xxx.ml: real 0m0.952s user 0m0.516s sys 0m0.431s
ocaml xxx.ml: real 0m0.952s user 0m0.514s sys 0m0.431s
ocaml xxx.ml: real 0m0.951s user 0m0.515s sys 0m0.431s
ocaml xxx.ml: real 0m0.959s user 0m0.515s sys 0m0.431s
ocaml xxx.ml: real 0m0.950s user 0m0.515s sys 0m0.431s
ocaml xxx.ml: real 0m0.956s user 0m0.516s sys 0m0.431s
ocaml xxx.ml: real 0m0.952s user 0m0.514s sys 0m0.432s
./xxx: real 0m0.928s user 0m0.494s sys 0m0.430s
./xxx: real 0m0.938s user 0m0.494s sys 0m0.430s
./xxx: real 0m0.927s user 0m0.494s sys 0m0.430s
./xxx: real 0m0.928s user 0m0.492s sys 0m0.430s
./xxx: real 0m0.928s user 0m0.493s sys 0m0.430s
./xxx: real 0m0.927s user 0m0.493s sys 0m0.430s
./xxx: real 0m0.928s user 0m0.492s sys 0m0.430s
./xxx: real 0m0.933s user 0m0.497s sys 0m0.428s
./xxx: real 0m0.926s user 0m0.494s sys 0m0.429s
./xxx: real 0m0.921s user 0m0.492s sys 0m0.428s
./xxx: real 0m0.925s user 0m0.494s sys 0m0.428s
./yyy: real 0m0.027s user 0m0.026s sys 0m0.001s
./yyy: real 0m0.031s user 0m0.026s sys 0m0.002s
./yyy: real 0m0.028s user 0m0.026s sys 0m0.001s
./yyy: real 0m0.029s user 0m0.026s sys 0m0.002s
./yyy: real 0m0.028s user 0m0.026s sys 0m0.001s
./yyy: real 0m0.029s user 0m0.026s sys 0m0.002s
./yyy: real 0m0.028s user 0m0.026s sys 0m0.001s
./yyy: real 0m0.031s user 0m0.026s sys 0m0.002s
./yyy: real 0m0.028s user 0m0.026s sys 0m0.001s
./yyy: real 0m0.030s user 0m0.026s sys 0m0.002s
./yyy: real 0m0.028s user 0m0.026s sys 0m0.001s
./zzz: real 0m0.030s user 0m0.027s sys 0m0.002s
./zzz: real 0m0.029s user 0m0.027s sys 0m0.001s
./zzz: real 0m0.030s user 0m0.027s sys 0m0.002s
./zzz: real 0m0.029s user 0m0.027s sys 0m0.001s
./zzz: real 0m0.030s user 0m0.027s sys 0m0.002s
./zzz: real 0m0.029s user 0m0.027s sys 0m0.001s
./zzz: real 0m0.030s user 0m0.027s sys 0m0.002s
./zzz: real 0m0.029s user 0m0.027s sys 0m0.001s
./zzz: real 0m0.030s user 0m0.027s sys 0m0.002s
./zzz: real 0m0.029s user 0m0.027s sys 0m0.001s
./zzz: real 0m0.030s user 0m0.027s sys 0m0.002s
Osiris JL:

I don't see the OCaml code running faster than the C code. I ran the tests with smaller numbers in the file that was read, and the results were similarly in favour of the C code:

Stop number: 345

ocaml xxx.ml: real 0m0.027s user 0m0.020s sys 0m0.005s
ocaml xxx.ml: real 0m0.021s user 0m0.016s sys 0m0.005s
ocaml xxx.ml: real 0m0.025s user 0m0.016s sys 0m0.004s
ocaml xxx.ml: real 0m0.020s user 0m0.015s sys 0m0.003s
ocaml xxx.ml: real 0m0.022s user 0m0.016s sys 0m0.004s
ocaml xxx.ml: real 0m0.019s user 0m0.015s sys 0m0.003s
ocaml xxx.ml: real 0m0.021s user 0m0.016s sys 0m0.004s
ocaml xxx.ml: real 0m0.020s user 0m0.015s sys 0m0.004s
ocaml xxx.ml: real 0m0.021s user 0m0.016s sys 0m0.004s
ocaml xxx.ml: real 0m0.020s user 0m0.015s sys 0m0.004s
ocaml xxx.ml: real 0m0.021s user 0m0.016s sys 0m0.004s
./xxx: real 0m0.003s user 0m0.001s sys 0m0.002s
./xxx: real 0m0.003s user 0m0.001s sys 0m0.001s
./xxx: real 0m0.003s user 0m0.001s sys 0m0.001s
./xxx: real 0m0.002s user 0m0.001s sys 0m0.001s
./xxx: real 0m0.003s user 0m0.001s sys 0m0.001s
./xxx: real 0m0.005s user 0m0.001s sys 0m0.002s
./xxx: real 0m0.003s user 0m0.001s sys 0m0.002s
./xxx: real 0m0.003s user 0m0.001s sys 0m0.001s
./xxx: real 0m0.003s user 0m0.001s sys 0m0.001s
./xxx: real 0m0.003s user 0m0.001s sys 0m0.001s
./xxx: real 0m0.003s user 0m0.001s sys 0m0.001s
./yyy: real 0m0.002s user 0m0.000s sys 0m0.001s
./yyy: real 0m0.003s user 0m0.000s sys 0m0.001s
./yyy: real 0m0.002s user 0m0.000s sys 0m0.001s
./yyy: real 0m0.002s user 0m0.000s sys 0m0.001s
./yyy: real 0m0.002s user 0m0.000s sys 0m0.001s
./yyy: real 0m0.002s user 0m0.000s sys 0m0.001s
./yyy: real 0m0.002s user 0m0.000s sys 0m0.001s
./yyy: real 0m0.001s user 0m0.000s sys 0m0.001s
./yyy: real 0m0.002s user 0m0.000s sys 0m0.001s
./yyy: real 0m0.002s user 0m0.000s sys 0m0.001s
./yyy: real 0m0.003s user 0m0.000s sys 0m0.002s
./zzz: real 0m0.002s user 0m0.000s sys 0m0.001s
./zzz: real 0m0.002s user 0m0.000s sys 0m0.001s
./zzz: real 0m0.002s user 0m0.000s sys 0m0.001s
./zzz: real 0m0.002s user 0m0.000s sys 0m0.001s
./zzz: real 0m0.002s user 0m0.000s sys 0m0.001s
./zzz: real 0m0.002s user 0m0.000s sys 0m0.001s
./zzz: real 0m0.001s user 0m0.000s sys 0m0.001s
./zzz: real 0m0.002s user 0m0.000s sys 0m0.001s
./zzz: real 0m0.003s user 0m0.000s sys 0m0.002s
./zzz: real 0m0.002s user 0m0.000s sys 0m0.001s
./zzz: real 0m0.002s user 0m0.000s sys 0m0.001s

Stop number: 87654

ocaml xxx.ml: real 0m0.102s user 0m0.059s sys 0m0.041s
ocaml xxx.ml: real 0m0.102s user 0m0.059s sys 0m0.040s
ocaml xxx.ml: real 0m0.101s user 0m0.060s sys 0m0.040s
ocaml xxx.ml: real 0m0.103s user 0m0.059s sys 0m0.041s
ocaml xxx.ml: real 0m0.102s user 0m0.059s sys 0m0.041s
ocaml xxx.ml: real 0m0.101s user 0m0.059s sys 0m0.041s
ocaml xxx.ml: real 0m0.102s user 0m0.059s sys 0m0.040s
ocaml xxx.ml: real 0m0.103s user 0m0.059s sys 0m0.040s
ocaml xxx.ml: real 0m0.101s user 0m0.059s sys 0m0.040s
ocaml xxx.ml: real 0m0.102s user 0m0.059s sys 0m0.040s
ocaml xxx.ml: real 0m0.105s user 0m0.059s sys 0m0.041s
./xxx: real 0m0.092s user 0m0.044s sys 0m0.038s
./xxx: real 0m0.087s user 0m0.044s sys 0m0.039s
./xxx: real 0m0.085s user 0m0.044s sys 0m0.038s
./xxx: real 0m0.084s user 0m0.044s sys 0m0.038s
./xxx: real 0m0.085s user 0m0.044s sys 0m0.039s
./xxx: real 0m0.086s user 0m0.045s sys 0m0.039s
./xxx: real 0m0.085s user 0m0.044s sys 0m0.039s
./xxx: real 0m0.085s user 0m0.044s sys 0m0.038s
./xxx: real 0m0.084s user 0m0.044s sys 0m0.038s
./xxx: real 0m0.084s user 0m0.044s sys 0m0.039s
./xxx: real 0m0.083s user 0m0.044s sys 0m0.038s
./yyy: real 0m0.004s user 0m0.003s sys 0m0.001s
./yyy: real 0m0.004s user 0m0.003s sys 0m0.001s
./yyy: real 0m0.004s user 0m0.003s sys 0m0.001s
./yyy: real 0m0.004s user 0m0.003s sys 0m0.001s
./yyy: real 0m0.005s user 0m0.003s sys 0m0.001s
./yyy: real 0m0.005s user 0m0.003s sys 0m0.001s
./yyy: real 0m0.004s user 0m0.003s sys 0m0.001s
./yyy: real 0m0.004s user 0m0.003s sys 0m0.001s
./yyy: real 0m0.004s user 0m0.003s sys 0m0.001s
./yyy: real 0m0.004s user 0m0.003s sys 0m0.001s
./yyy: real 0m0.006s user 0m0.003s sys 0m0.002s
./zzz: real 0m0.004s user 0m0.003s sys 0m0.001s
./zzz: real 0m0.004s user 0m0.003s sys 0m0.001s
./zzz: real 0m0.004s user 0m0.003s sys 0m0.001s
./zzz: real 0m0.004s user 0m0.003s sys 0m0.001s
./zzz: real 0m0.004s user 0m0.003s sys 0m0.001s
./zzz: real 0m0.005s user 0m0.003s sys 0m0.002s
./zzz: real 0m0.004s user 0m0.003s sys 0m0.001s
./zzz: real 0m0.004s user 0m0.003s sys 0m0.001s
./zzz: real 0m0.004s user 0m0.003s sys 0m0.001s
./zzz: real 0m0.004s user 0m0.003s sys 0m0.001s
./zzz: real 0m0.005s user 0m0.003s sys 0m0.001s

Obviously, YMMV - but it appears that OCaml is slower than C by a considerable margin, but that if the number in the given file is small enough, then start up and file reading dominate the process time.

The C timings, especially at the smaller numbers, are so fast that they are not all that reliable.

like image 21
Jonathan Leffler Avatar answered Oct 30 '22 04:10

Jonathan Leffler