Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

handwriting asm.js - how can you track javascript objects in the heap?

I am writing priority queues and octrees in the asm.js subset of Javascript in order to squeeze the last possible performance out of them.

However, how do you store references to Javascript objects in the asm.js function's heap buffer?

Right now, my structs in the heap have to have an integer ID for the Javascript object they are referencing, and I need a classic Javascript object to act as a dict between these ints and the Javascript objects.

For example, I have an asm.js octree with which exposes an add function like add(x1,y1,z1,x2,y2,z2,object_id) where object_id is integer. And the find(x1,y1,z1,x2,y2,z2) function returns a list of all the object_ids that are within the bounds. This means I have to maintain a dictionary of objects to object_ids in Javascript so I can determine the actual objects that are in that box; the mapping of object_ids to objects.

This feels wrong. The idea of casting an int to a string to do a lookup in the Javascript world is just wrong. One key point of writing inner-loop data-structures in asm.js is to avoid creating garbage.

(I am targeting Chrome as much as Firefox; I am hoping that asm.js strict code runs faster on both. Yes I will be profiling.)

No matter how many properties you can materialise into the asm.js heap - an object's position and dimensions, for example - you usually need to associate some Javascript objects with the item too; strings and webGL objects and DOM objects and so on.

Is there a better way for asm.js heap to contain pointers to Javascript objects? And if using integer ID mappings, is it better it use arrays or objects-as-dictionaries, for example?

like image 645
Will Avatar asked Jul 10 '13 07:07

Will


3 Answers

After reading the asm.js specs several times over and experimenting with it in Firefox I agree with bks:

an asm.js program can only interact indirectly with external data via numeric handles

However this doesn't pose a major problem. Since asm.js is a subset of JavaScript you will not be able to use a lot of JavaScript constructs in asm.js, including:

  1. JavaScript Objects
  2. Dynamic Arrays
  3. Higher Order Functions

Nevertheless, asm.js does provide a way to call JavaScript functions using the Foreign Function Interface (FFI). This is a very powerful mechanism as it allows you to callback JavaScript from asm.js (allowing you to create procedures partially written in asm.js and partially written in JavaScript).

It's important to distinguish which parts of your code can be converted to asm.js and will benefit from using asm.js. For example asm.js is great for graphics processing since its requires a lot of calculations. However it's not suitable for string manipulation. Plain JavaScript would be better for that purpose.

Getting back to the topic, the problem you're facing is that you need to reference JavaScript objects from within asm.js code. Since the only way to do this is to use numeric handles (which you don't want), there's only one other solution I see:

Instead of referencing JavaScript objects from within asm.js, reference asm.js structures from within JavaScript.

There are many reasons why this method is better:

  1. Since JavaScript is a superset of asm.js you can already use asm.js structures in JavaScript as is.
  2. Since JavaScript is more powerful than asm.js it's easier to make asm.js structures behave like JavaScript objects.
  3. By importing asm.js structures into JavaScript your asm.js code becomes simpler, more cohesive and less tightly coupled.

Enough of talk, let's see an example. Let's take Dijkstra's shortest path algorithm. Luckily I already have a working demo (I had to implement Dijkstra's algorithm for a college assignment):

http://jsfiddle.net/3fsMn/

The code linked to above is fully implemented in plain old JavaScript. Let's take some parts of this code and convert it to asm.js (keeping in mind that the data structures will be implemented in asm.js and then exported to JavaScript).

To start with something concrete, this is the way I'm creating a graph in JavaScript:

var graph = new Graph(6)
    .addEdge(0, 1, 7)
    .addEdge(0, 2, 9)
    .addEdge(0, 3, 14)
    .addEdge(1, 2, 10)
    .addEdge(1, 4, 15)
    .addEdge(2, 3, 2)
    .addEdge(2, 4, 11)
    .addEdge(3, 5, 9)
    .addEdge(4, 5, 6);

We want to keep the same interface. Hence the first thing to modify is the Graph constructor. This is how it's currently implemented:

function Graph(v) {
    this.v = --v;

    var vertices = new Array(v);

    for (var i = 0, e; e = v - i; i++) {
        var edges = new Array(e);
        for (var j = 0; j < e; j++)
            edges[j] = Infinity;
        vertices[i] = edges;
    }

    this.vertices = vertices;
}

I won't bother explaining all the code in depth, but a general understanding is required:

  1. The first thing to note is that suppose I'm creating a graph consisting of 4 vertices, then I only create an array of 3 vertices. The last vertex is not required.
  2. Next, for each vertex I create a new array (representing the edges) between two vertices. For a graph with 4 vertices:
    1. The first vertex has 3 edges.
    2. The second vertex has 2 new edges.
    3. The third vertex has 1 new edge.
    4. The fourth vertex has 0 new edges (which is the reason we only need an array of 3 vertices).

In general a graph of n vertices has n * (n - 1) / 2 edges. So we can represent the graph in a tabular format as follows (the table below is for the graph in the demo above):

+-----+-----+-----+-----+-----+-----+
|     |  f  |  e  |  d  |  c  |  b  |
+-----+-----+-----+-----+-----+-----+
|  a  |     |     |  14 |  9  |  7  |
+-----+-----+-----+-----+-----+-----+
|  b  |     |  15 |     |  10 |
+-----+-----+-----+-----+-----+
|  c  |     |  11 |  2  |
+-----+-----+-----+-----+
|  d  |  9  |     |
+-----+-----+-----+
|  e  |  6  |
+-----+-----+

This is the data structure we need to implement in the asm.js module. Now that we know what it looks like let's get down to implementing it:

var Graph = (function (constant) {
    function Graph(stdlib, foreign, heap) { /* asm.js module implementation */ }

    return function (v) {
        this.v = --v;
        var heap = new ArrayBuffer(4096);
        var doubleArray = this.doubleArray = new Float62Array(heap);
        var graph = this.graph = Graph(window, {}, heap);
        graph.init(v);

        var vertices = { length: v };

        for (var i = 0, index = 0, e; e = v - i; i++) {
            var edges = { length: e };

            for (var j = 0; j < e; j++) Object.defineProperty(edges, j, {
                get: element(index++)
            });

            Object.defineProperty(vertices, i, {
                get: constant(edges)
            });
        }

        this.vertices = vertices;

        function element(i) {
            return function () {
                return doubleArray[i];
            };
        }
    };
}(constant));

As you can see our Graph constructor has become a lot more complicated. In addition to v and vertices we have two new public properties, doubleArray and graph, which are required to expose the data structure and the data operations from the asm.js module respectively.

The vertices property is particular is now implemented as an object instead of an array, and it uses getters to expose the asm.js data structure. This is how we reference asm.js data structures from within JavaScript.

The heap is simply an ArrayBuffer and it can be operated on by either asm.js code or plain old JavaScript. This allows you to share data structures between asm.js code and JavaScript. On the JavaScript side you can wrap up this data structure in an object and use getters and setters to dynamically update the heap. In my humble opinion this is better than using numeric handles.

Conclusion: Since I've already answered your question and demonstrated how to import asm.js data structures into JavaScript I would conclude that this answer is complete. Nevertheless I would like to leave behind a working demo as a proof of concept. However this answer is already getting too big. I'll write a blog post on this topic and post a link to it here as soon as possible.

JSFiddle for Dijkstra's shortest algorithm path algorithm implemented in asm.js coming up soon.

like image 62
Aadit M Shah Avatar answered Nov 18 '22 01:11

Aadit M Shah


As I read the asm.js spec at http://asmjs.org/spec/latest/ and the FAQ at http://asmjs.org/faq.html, the short answer is that you can't store JS object references in the asmjs heap. Quoting from the FAQ:

Q. Can asm.js serve as a VM for managed languages, like the JVM or CLR?

A. Right now, asm.js has no direct access to garbage-collected data; an asm.js program can only interact indirectly with external data via numeric handles. In future versions we intend to introduce garbage collection and structured data based on the ES6 structured binary data API, which will make asm.js an even better target for managed languages.

So your current method of storing an external id-to-object map seems to be the current recommended way to solve your problem, as long as you care about the object instances rather than just their contents. Otherwise, I think the idea is that you dematerialize the stored objects: store the complete contents of each object at its slot in the priority queue, and turn it back into a true JS object only when it is fetched. But that only works if your objects are safe to recreate on demand.

like image 27
bks Avatar answered Nov 18 '22 02:11

bks


This feels wrong. The idea of casting an int to a string to do a lookup in the Javascript world is just wrong. One key point of writing inner-loop data-structures in asm.js is to avoid creating garbage.

There is no need to cast an int to a string here. You should have a JS array that maps indexes to JS objects, then indexing it with an integer should be optimized in JS engines to be a direct use of that integer. They will know when the lookup table is an array, and when the values flowing in are integers.

This is how emscripten (in both asm.js output mode and non-asm.js output mode) handles things like function pointers. You have an integer ID, and there is a JS array mapping those IDs to the relevant objects. For example,

var FUNCTION_TABLE = [function zero() {}, function one() {}];

later called with

FUNCTION_TABLE[i]();

It is important to keep the array properly optimized, which basically means starting its values at 0 and not having holes. Otherwise, it can be implemented as a dictionary instead of a fast flat list.

like image 4
Alon Zakai Avatar answered Nov 18 '22 02:11

Alon Zakai