
Using emscripten how to get C++ uint8_t array to JS Blob or UInt8Array

In emscripten C++, I have

class MyClass {
public:
   MyClass() {}
   std::shared_ptr<std::vector<uint8_t>> buffer;
   int getPtr() {
      return (int)(buffer->data());
   }
   int getLength() {
      return buffer->size();
   }
};
EMSCRIPTEN_BINDINGS() {
    class_<MyClass>("MyClass").constructor()
      .function("getLength",&MyClass::getLength)
      .function("getPtr",&MyClass::getPtr,
                allow_raw_pointers());
}

I can invoke getLength() and getPtr() from JS, but I don't know how to get JS to treat the result as an ArrayBuffer so it can be downloaded as a Blob.

How can I get the buffer data into JS in a form where I can then download it using code similar to https://github.com/kennethjiang/js-file-download/blob/master/file-download.js?

Glenn asked Dec 24 '22 02:12


1 Answer

WebAssembly currently defines only basic numeric types for passing values between JS and WASM; there are no object or array types. This is by design. Emscripten adds its own layer to bind C++ classes to JS, but that layer is not part of the WASM standard.

WebAssembly.Memory()

However, there is a way around this to get the array. JS has direct access to a WASM module's internal memory, even without an API. WASM uses a linear memory model, and that linear memory is exposed through a WebAssembly.Memory object. Its buffer property, WebAssembly.Memory.buffer, is a single ArrayBuffer that your WASM module uses as its heap, which is where memory allocations (e.g. malloc()) happen.

1. Accessing it as Uint8Array

What does this mean? It means that the pointer you get from getPtr() (an integer on the JS side) is actually a byte offset into WebAssembly.Memory.buffer.
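To make the "pointer is an offset" idea concrete, here is a minimal sketch that runs without Emscripten. It creates a standalone WebAssembly.Memory, writes a few bytes at a made-up offset, and then views them through a Uint8Array. The names fakePtr and fakeSize are illustrative stand-ins for whatever getPtr() and getLength() would return in a real module.

```javascript
// One 64 KiB page of linear memory, standing in for the module's heap.
const memory = new WebAssembly.Memory({ initial: 1 });

// Hypothetical values; in real code these come from getPtr() and getLength().
const fakePtr = 1024;
const fakeSize = 4;

// Simulate the C++ side filling the vector's storage.
new Uint8Array(memory.buffer).set([0xde, 0xad, 0xbe, 0xef], fakePtr);

// The "pointer" is just a byte offset into memory.buffer.
const view = new Uint8Array(memory.buffer, fakePtr, fakeSize);
console.log(view);
```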

Emscripten automatically generates the JS code that creates the WebAssembly.Memory (it comes from a template called preamble.js). If you search the Emscripten-generated code, you should be able to find a line similar to this:

Module['wasmMemory'] = new WebAssembly.Memory({ 'initial': TOTAL_MEMORY / WASM_PAGE_SIZE, 'maximum': TOTAL_MEMORY / WASM_PAGE_SIZE });

So you can access the ArrayBuffer used by your WASM module through Module['wasmMemory'].buffer:

let instance = new Module.MyClass();

// ... Do something

let ptr = instance.getPtr();
let size = instance.getLength();
// You can use Module['env']['memory'].buffer instead. They are the same.
let my_uint8_buffer = new Uint8Array(Module['wasmMemory'].buffer, ptr, size);

2. Emscripten HEAPU8

Alternatively, Emscripten offers an official way to access the heap memory region as typed arrays: HEAPU8, HEAPU16, HEAPU32, and so on, as described in the Emscripten documentation. So you can do this:

let instance = new Module.MyClass();

// ... Do something

let ptr = instance.getPtr();
let size = instance.getLength();
let my_uint8_buffer = new Uint8Array(Module.HEAPU8.buffer, ptr, size);

Using HEAPU8 is safer, since HEAPU8 is documented whereas the Module['wasmMemory'] attribute name is undocumented and may be subject to change; both do the same thing.
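Either way, the Uint8Array above is a view into the module's heap, not a copy. For the download use case in the question, one possible approach is to copy the bytes out first and wrap the copy in a Blob. The sketch below uses a plain Uint8Array as a stand-in for Module.HEAPU8 so that it can run on its own; copyFromHeap and toBlob are hypothetical helper names, not Emscripten APIs, and the MIME type is an assumption.

```javascript
// Copy `size` bytes starting at `ptr` out of a heap view.
// slice() allocates a new buffer, so the result no longer aliases the heap.
function copyFromHeap(heapU8, ptr, size) {
  return heapU8.slice(ptr, ptr + size);
}

// Wrap the copied bytes in a Blob (the default MIME type is an assumption).
function toBlob(bytes, mimeType = "application/octet-stream") {
  return new Blob([bytes], { type: mimeType });
}

// Stand-in for Module.HEAPU8; real code would pass Module.HEAPU8 itself.
const fakeHeap = new Uint8Array(64);
fakeHeap.set([1, 2, 3, 4], 16);

const copy = copyFromHeap(fakeHeap, 16, 4);
fakeHeap[16] = 99; // a later write to the heap does not affect the copy
console.log(copy);
```

The Blob returned by toBlob(copy) can then be handed to a download helper such as the one linked in the question.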

3. Using emscripten::val (C++ only)

Emscripten also provides a class called emscripten::val that lets C++ code interact with JS values. It abstracts over JS and C++ types for convenience, and you can use it to return the array.

This is the example taken from the documentation and Glenn's comment:

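#include <cstdint>
#include <memory>
#include <vector>
#include <emscripten/bind.h>
#include <emscripten/val.h>

// Assumed to be allocated and filled elsewhere; stands in for MyClass::buffer.
std::shared_ptr<std::vector<uint8_t>> buffer;

// Returns a typed-array view (not a copy) over the vector's storage.
emscripten::val getInt8Array() {
    return emscripten::val(
       emscripten::typed_memory_view(buffer->size(),
                                     buffer->data()));
}

EMSCRIPTEN_BINDINGS() {
    emscripten::function("getInt8Array", &getInt8Array);
}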

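Then you can call getInt8Array() on the JS side to get the typed array. Despite the function's name, the result is a Uint8Array here, because the elements are uint8_t.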

Conclusion

Three options are suggested here for getting the array out of WASM. Whichever you choose, you should understand WebAssembly.Memory and the ideas behind option 1, because that is the lowest-level way to get an array from WASM. Most importantly, it is unmanaged and unsafe memory access: the JS view only points at memory owned by the C/C++ side, so the data is easily corrupted if the object is freed or modified there. This case requires understanding those low-level implications.
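One concrete instance of this hazard: if the module's memory grows, the ArrayBuffer behind any existing view is detached. Below is a minimal sketch using a standalone WebAssembly.Memory rather than a real Emscripten module. Whether your build can grow its memory depends on its link settings, so treat that part as an assumption.

```javascript
const memory = new WebAssembly.Memory({ initial: 1, maximum: 2 });

// A view created before the memory grows.
const staleView = new Uint8Array(memory.buffer, 0, 16);
staleView[0] = 42;

// Growing detaches the old ArrayBuffer; views created on it become empty.
memory.grow(1);

console.log(staleView.length); // 0

// The contents survive, but must be read through a view on the new buffer.
const freshView = new Uint8Array(memory.buffer, 0, 16);
console.log(freshView[0]); // 42
```

Re-creating the view after any call that may allocate, or copying the bytes out right away, avoids holding on to a stale view.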

Bumsik Kim answered Dec 25 '22 15:12