
Passing client files to webassembly from the front-end

I'm looking to pass user-submitted data to a c++ function which I've compiled to wasm. The data is a file which the user submits on the front end via an input tag, like so:

<input type="file" onChange={this.handleFile.bind(this)} />

The onChange callback currently looks like this:

handleFile(e){
    const file = e.currentTarget.files[0];
    const reader = new FileReader();
    reader.onloadend = evt => {
        window.Module.readFile(evt.target.result);
    }
    reader.readAsArrayBuffer(file);
}

Finally, the .cpp file containing the readFile function looks like this:

void readFile(const std::string & rawString){
  std::vector<uint8_t> data(rawString.begin(), rawString.end());
  //...
}

EMSCRIPTEN_BINDINGS(my_module) {
  emscripten::function("readFile", &readFile);
}

I've spent my afternoon reading various docs, so I'm aware that I'm supposed to allocate memory for these files on the heap and then pass a pointer from js to readFile instead of passing all of the data. My problem is that I don't really understand how all of that is supposed to work. Could someone explain?

Mogadishu asked Nov 15 '17 17:11


2 Answers

This is a partial answer. It's superior to what I originally did and I feel like it might be closer to what the creators intended. However, I'm still creating more than one copy of the file. Credit to this post for making it click for me.

This is now my handleFile callback, commented with things I learned.

handleFile(e){

    const file = e.currentTarget.files[0];
    if(!(file instanceof Blob)) return;
    const reader = new FileReader();
    reader.onloadend = evt => {

        //evt.target.result is an ArrayBuffer. In js, you can't read or
        //write an ArrayBuffer directly, so we wrap it in a Uint8Array,
        //which is a view onto the same bytes (no copy is made here).
        const uint8_t_arr = new Uint8Array(evt.target.result);

        //Right now, we have the file as a Uint8Array in javascript memory. 
        //As far as I understand, wasm can't directly access javascript memory. 
        //Which is why we need to allocate special wasm memory and then
        //copy the file from javascript memory into wasm memory so our wasm functions 
        //can work on it.

        //First we need to allocate the wasm memory. 
        //_malloc returns the address of the new block as a plain number
        //(a byte offset into the wasm heap).
        //This call is similar to 
        //uint8_t * ptr = static_cast<uint8_t *>(std::malloc(len))
        const uint8_t_ptr = window.Module._malloc(uint8_t_arr.length);

        //Now that we have a block of memory we can copy the file data into that block.
        //This is similar to 
        //std::memcpy(ptr, src, len)
        window.Module.HEAPU8.set(uint8_t_arr, uint8_t_ptr);

        //The only thing that's now left to do is pass 
        //the address of the wasm memory we just allocated
        //to our function as well as the size of our memory.
        window.Module.readFile(uint8_t_ptr, uint8_t_arr.length);

        //At this point we're forced to wait until wasm is done with the memory. 
        //The call is synchronous, so the page will freeze if the data is big. 
        //Running the wasm module in a Web Worker would avoid blocking the page.

        //Retrieving our (modified) memory is also straightforward. 
        //First we get some javascript memory and then we copy the 
        //relevant chunk of the wasm memory into our javascript object.
        const returnArr = new Uint8Array(uint8_t_arr.length);
        //If returnArr were a std::vector<uint8_t>, this would be similar to 
        //returnArr.assign(ptr, ptr + dataSize)
        returnArr.set(window.Module.HEAPU8.subarray(uint8_t_ptr, uint8_t_ptr + uint8_t_arr.length));

        //Lastly, according to the docs, we should call ._free here.
        //No garbage collector is involved: memory from _malloc stays
        //allocated until we release it with _free.
        window.Module._free(uint8_t_ptr);

    }
    reader.readAsArrayBuffer(file);
}
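
The malloc / copy / call / copy back / free sequence above could be wrapped in a small helper so the memory is released even if the wasm call throws. This is only a sketch: withWasmBuffer is a hypothetical name, and the module argument is assumed to expose _malloc, _free and HEAPU8 the way an Emscripten-generated Module does. It treats a 0 return from _malloc as a failure, although depending on build settings a failed allocation may abort instead.

```javascript
// Hypothetical helper, not part of Emscripten. `module` is assumed to
// expose _malloc, _free and HEAPU8 like an Emscripten-generated Module.
function withWasmBuffer(module, bytes, fn) {
    const ptr = module._malloc(bytes.length);
    if (ptr === 0 && bytes.length > 0) {
        throw new Error('wasm allocation failed');
    }
    try {
        // Copy the javascript bytes into wasm memory, starting at ptr.
        module.HEAPU8.set(bytes, ptr);
        // Let the caller invoke the wasm function with (address, length).
        fn(ptr, bytes.length);
        // slice() makes a copy in javascript memory. subarray() would only
        // be a view into wasm memory, which is invalid once we call _free.
        // HEAPU8 is re-read from the module here in case the heap was
        // replaced while fn ran.
        return module.HEAPU8.slice(ptr, ptr + bytes.length);
    } finally {
        // Always release the wasm memory, even if fn threw.
        module._free(ptr);
    }
}
```

Inside the onloadend callback this might be used as: const returnArr = withWasmBuffer(window.Module, uint8_t_arr, (ptr, len) => window.Module.readFile(ptr, len));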

Here is readFile.cpp.

#include <cstddef>
#include <cstdint>
#include <emscripten/bind.h>

//We get our pointer as a plain integer from javascript
void readFile(std::uintptr_t addr, std::size_t len){
  //We use a reinterpret_cast to turn our plain integer into a uint8_t pointer. After
  //which we can work with the data just like we would normally.
  std::uint8_t * data = reinterpret_cast<std::uint8_t *>(addr);
  for(std::size_t i = 0; i < len; ++i){
    data[i] += 1;
  }
}

//I'm using this command to compile:
//  emcc --bind -O3 readFile.cpp -s WASM=1 -s TOTAL_MEMORY=268435456 -o api.js --std=c++11
//Note that you need to make sure that there's enough memory available to begin with.
//I got only 16 MB without passing the TOTAL_MEMORY setting
//(newer Emscripten releases call this setting INITIAL_MEMORY).
EMSCRIPTEN_BINDINGS(my_module) {
  emscripten::function("readFile", &readFile);
}
Mogadishu answered Nov 19 '22 05:11


With Emscripten you can use a virtual file system for WASM. First, compile your C/C++ code with the -s FORCE_FILESYSTEM=1 option. Inside the C/C++ code, you work with files as usual, using standard library functions. On the HTML page you have an input type=file element.

Sample JS code to get the file from the input element and pass it into the WASM:

function useFileInput(fileInput) {
    if (fileInput.files.length == 0)
        return;
    var file = fileInput.files[0];

    var fr = new FileReader();
    fr.onload = function () {
        var data = new Uint8Array(fr.result);

        Module['FS_createDataFile']('/', 'filename', data, true, true, true);
        Module.ccall('YourCppFunctionToUtilizeTheFile', null, [], null);

        fileInput.value = '';
    };
    fr.readAsArrayBuffer(file);
}
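
If the file is only needed for a single call, a variation is to write it into the virtual file system, run the C++ function, and remove it again so repeated uploads don't pile up in memory. This is a rough sketch, not the code from the linked project: runWithVirtualFile is a hypothetical name, and it assumes the FS object is reachable as module.FS (depending on the Emscripten version this may require exporting it at build time).

```javascript
// Hypothetical helper. `module.FS` is assumed to be Emscripten's FS object,
// of which only writeFile(path, data) and unlink(path) are used here.
function runWithVirtualFile(module, path, bytes, fn) {
    // Create (or overwrite) the file in the in-memory file system.
    module.FS.writeFile(path, bytes);
    try {
        // fn is expected to call the C++ code that opens `path`.
        return fn(path);
    } finally {
        // Remove the file so its contents don't stay in memory.
        module.FS.unlink(path);
    }
}
```

In the fr.onload handler above this could replace the FS_createDataFile line, for example: runWithVirtualFile(Module, '/filename', data, () => Module.ccall('YourCppFunctionToUtilizeTheFile', null, [], null));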

Links:

  1. Emscripten - File System Overview
  2. Here I use the approach, see emulatorAttachFileInput() function
nzeemin answered Nov 19 '22 06:11