Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Node.js / coffeescript performance on a math-intensive algorithm

I am experimenting with node.js to build some server-side logic, and have implemented a version of the diamond-square algorithm described here in coffeescript and Java. Given all the praise I have heard for node.js and V8 performance, I was hoping that node.js would not lag too far behind the java version.

However on a 4096x4096 map, Java finishes in under 1s but node.js/coffeescript takes over 20s on my machine...

These are my full results. x-axis is grid size. Log and linear charts:

linear

log

Is this because there is something wrong with my coffeescript implementation, or is this just the nature of node.js still?

Coffeescript

genHeightField = (sz) ->
    timeStart = new Date()

    DATA_SIZE = sz
    SEED = 1000.0
    data = new Array()
    iters = 0

    # warm up the arrays to tell the js engine these are dense arrays
    # seems to have neligible effect when running on node.js though
    for rows in [0...DATA_SIZE]
        data[rows] = new Array();
        for cols in [0...DATA_SIZE]
            data[rows][cols] = 0

    data[0][0] = data[0][DATA_SIZE-1] = data[DATA_SIZE-1][0] = 
      data[DATA_SIZE-1][DATA_SIZE-1] = SEED;

    h = 500.0
    sideLength = DATA_SIZE-1
    while sideLength >= 2
      halfSide = sideLength / 2

      for x in [0...DATA_SIZE-1] by sideLength
          for y in [0...DATA_SIZE-1] by sideLength
              avg = data[x][y] +
                  data[x + sideLength][y] +
                  data[x][y + sideLength] +
                  data[x + sideLength][y + sideLength]
              avg /= 4.0;

              data[x + halfSide][y + halfSide] = 
                  avg + Math.random() * (2 * h) - h;
              iters++
              #console.log "A:" + x + "," + y

      for x in [0...DATA_SIZE-1] by halfSide
        y = (x + halfSide) % sideLength
        while y < DATA_SIZE-1
          avg = 
            data[(x-halfSide+DATA_SIZE-1)%(DATA_SIZE-1)][y]
            data[(x+halfSide)%(DATA_SIZE-1)][y]
            data[x][(y+halfSide)%(DATA_SIZE-1)]
            data[x][(y-halfSide+DATA_SIZE-1)%(DATA_SIZE-1)]
          avg /= 4.0;

          avg = avg + Math.random() * (2 * h) - h;
          data[x][y] = avg;

          if x is 0
            data[DATA_SIZE-1][y] = avg;
          if y is 0  
            data[x][DATA_SIZE-1] = avg;
          #console.log "B: " + x + "," + y
          y += sideLength
          iters++
      sideLength /= 2
      h /= 2.0

    #console.log iters
    console.log (new Date() - timeStart)

genHeightField(256+1)
genHeightField(512+1)
genHeightField(1024+1)
genHeightField(2048+1)
genHeightField(4096+1)

Java

import java.util.Random;

class Gen {

    public static void main(String args[]) {
        genHeight(256+1);
        genHeight(512+1);
        genHeight(1024+1);
        genHeight(2048+1);
        genHeight(4096+1);
    }

    public static void genHeight(int sz) {
        long timeStart = System.currentTimeMillis();
        int iters = 0;

        final int DATA_SIZE = sz;
        final double SEED = 1000.0;
        double[][] data = new double[DATA_SIZE][DATA_SIZE];
        data[0][0] = data[0][DATA_SIZE-1] = data[DATA_SIZE-1][0] = 
                data[DATA_SIZE-1][DATA_SIZE-1] = SEED;

        double h = 500.0;
        Random r = new Random();
        for(int sideLength = DATA_SIZE-1;
            sideLength >= 2;
            sideLength /=2, h/= 2.0){
          int halfSide = sideLength/2;

          for(int x=0;x<DATA_SIZE-1;x+=sideLength){
            for(int y=0;y<DATA_SIZE-1;y+=sideLength){
              double avg = data[x][y] + 
                      data[x+sideLength][y] +
                      data[x][y+sideLength] + 
                      data[x+sideLength][y+sideLength];
              avg /= 4.0;

              data[x+halfSide][y+halfSide] = 
                  avg + (r.nextDouble()*2*h) - h;

              iters++;
              //System.out.println("A:" + x + "," + y);
            }
          }

          for(int x=0;x<DATA_SIZE-1;x+=halfSide){
            for(int y=(x+halfSide)%sideLength;y<DATA_SIZE-1;y+=sideLength){
              double avg = 
                    data[(x-halfSide+DATA_SIZE-1)%(DATA_SIZE-1)][y] + 
                    data[(x+halfSide)%(DATA_SIZE-1)][y] + 
                    data[x][(y+halfSide)%(DATA_SIZE-1)] + 
                    data[x][(y-halfSide+DATA_SIZE-1)%(DATA_SIZE-1)]; 
              avg /= 4.0;

              avg = avg + (r.nextDouble()*2*h) - h;
              data[x][y] = avg;

              if(x == 0)  data[DATA_SIZE-1][y] = avg;
              if(y == 0)  data[x][DATA_SIZE-1] = avg;

              iters++;
              //System.out.println("B:" + x + "," + y);
            }
          }
        }
        //System.out.print(iters +" ");
        System.out.println(System.currentTimeMillis() - timeStart);
    }

}
like image 488
matt b Avatar asked Aug 12 '11 21:08

matt b


2 Answers

As other answerers have pointed out, JavaScript's arrays are a major performance bottleneck for the type of operations you're doing. Because they're dynamic, it's naturally much slower to access elements than it is with Java's static arrays.

The good news is that there is an emerging standard for statically typed arrays in JavaScript, already supported in some browsers. Though not yet supported in Node proper, you can easily add them with a library: https://github.com/tlrobinson/v8-typed-array

After installing typed-array via npm, here's my modified version of your code:

{Float32Array} = require 'typed-array'

genHeightField = (sz) ->
    timeStart = new Date()
    DATA_SIZE = sz
    SEED = 1000.0
    iters = 0

    # Initialize 2D array of floats
    data = new Array(DATA_SIZE)
    for rows in [0...DATA_SIZE]
      data[rows] = new Float32Array(DATA_SIZE)
      for cols in [0...DATA_SIZE]
          data[rows][cols] = 0

    # The rest is the same...

The key line in there is the declaration of data[rows].

With the line data[rows] = new Array(DATA_SIZE) (essentially equivalent to the original), I get the benchmark numbers:

17
75
417
1376
5461

And with the line data[rows] = new Float32Array(DATA_SIZE), I get

19
47
215
855
3452

So that one small change cuts the running time down by about 1/3, i.e. a 50% speed increase!

It's still not Java, but it's a pretty substantial improvement. Expect future versions of Node/V8 to narrow the performance gap further.

Caveat: It's got to be mentioned that normal JS numbers are double-precision, i.e. 64-bit floats. Using Float32Array will thus reduce precision, making this a bit of an apples-and-oranges comparison—I don't know how much of the performance improvement is from using 32-bit math, and how much is from faster array access. A Float64Array is part of the V8 spec, but isn't yet implemented in the v8-typed-array library.)

like image 166
Trevor Burnham Avatar answered Nov 16 '22 04:11

Trevor Burnham


If you're looking for performance in algorithms like this, both coffee/js and Java are the wrong languages to be using. Javascript is especially poor for problems like this because it does not have an array type - arrays are just hash maps where keys must be integers, which obviously will not be as quick as a real array. What you want is to write this algorithm in C and call that from node (see http://nodejs.org/docs/v0.4.10/api/addons.html). Unless you're really good at hand-optimizing machine code, good C will easily outstrip any other language.

like image 24
Aaron Dufour Avatar answered Nov 16 '22 02:11

Aaron Dufour