Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Passing a JavaScript string to a Rust function compiled to WebAssembly

I have this simple Rust function:

#[no_mangle]
pub fn compute(operator: &str, n1: i32, n2: i32) -> i32 {
    match operator {
        "SUM" => n1 + n2,
        "DIFF" => n1 - n2,
        "MULT" => n1 * n2,
        "DIV" => n1 / n2,
        _ => 0
    }
}

I am compiling this to WebAssembly successfully, but don't manage to pass the operator parameter from JS to Rust.

The JS line which calls the Rust function looks like this:

instance.exports.compute(operator, n1, n2);

operator is a JS String and n1, n2 are JS Numbers.

n1 and n2 are passed properly and can be read inside the compiled function so I guess the problem is how I pass the string around. I imagine it is passed as a pointer from JS to WebAssembly but can't find evidence or material about how this works.

I am not using Emscripten and would like to keep it standalone (compilation target wasm32-unknown-unknown), but I see they wrap their compiled functions in Module.cwrap, maybe that could help?

like image 972
vinzdef Avatar asked Feb 27 '18 17:02

vinzdef


People also ask

Does Rust compile to WebAssembly?

Building the packageCompiles your Rust code to WebAssembly. Runs wasm-bindgen on that WebAssembly, generating a JavaScript file that wraps up that WebAssembly file into a module the browser can understand. Creates a pkg directory and moves that JavaScript file and your WebAssembly code into it. Reads your Cargo.

How do I use JavaScript with Rust?

To call Rust from JavaScript, you need to compile the Rust code to Wasm and provide the thin JavaScript wrapper. The template project already has it configured. You only need to use the wasm-bindgen macro on the Rust functions you want to make available.

Is Rust wasm faster than JS?

Rust is 2x (200%) faster, but uses only 1% of the memory compared with Java. Rust is 150x (15,000%) faster, and uses about the same amount of memory compared with Python.

What is Wasm_bindgen?

The goal of wasm-bindgen is to provide a bridge between the types of JS and Rust. It allows JS to call a Rust API with a string, or a Rust function to catch a JS exception.


1 Answers

Easiest and most idiomatic solution

Most people should use wasm-bindgen, which makes this whole process much simpler!

Low-level manual implementation

To transfer string data between JavaScript and Rust, you need to decide

  1. The encoding of the text: UTF-8 (Rust native) or UTF-16 (JS native).
  2. Who will own the memory buffer: the JS (caller) or Rust (callee).
  3. How to represent the strings data and length: NUL-terminated (C-style) or distinct length (Rust-style).
  4. How to communicate the data and length, if they are separate.

Common setup

It's important to build C dylibs for WASM to help them be smaller in size.

Cargo.toml

[package]
name = "quick-maths"
version = "0.1.0"
authors = ["An Devloper <[email protected]>"]

[lib]
crate-type = ["cdylib"]

.cargo/config

[target.wasm32-unknown-unknown]
rustflags = [
    "-C", "link-args=--import-memory",
]

package.json

{
  "name": "quick-maths",
  "version": "0.1.0",
  "main": "index.js",
  "author": "An Devloper <[email protected]>",
  "license": "MIT",
  "scripts": {
    "example": "node ./index.js"
  },
  "dependencies": {
    "fs-extra": "^8.0.1",
    "text-encoding": "^0.7.0"
  }
}

I'm using NodeJS 12.1.0.

Execution

$ rustup component add rust-std --target wasm32-unknown-unknown
$ cargo build --release --target wasm32-unknown-unknown

Solution 1

I decided:

  1. To convert JS strings to UTF-8, which means that the TextEncoder JS API is the best fit.
  2. The caller should own the memory buffer.
  3. To have the length be a separate value.
  4. Another struct and allocation should be made to hold the pointer and length.

lib/src.rs

// A struct with a known memory layout that we can pass string information in
#[repr(C)]
pub struct JsInteropString {
    data: *const u8,
    len: usize,
}

// Our FFI shim function    
#[no_mangle]
pub unsafe extern "C" fn compute(s: *const JsInteropString, n1: i32, n2: i32) -> i32 {
    // Check for NULL (see corresponding comment in JS)
    let s = match s.as_ref() {
        Some(s) => s,
        None => return -1,
    };

    // Convert the pointer and length to a `&[u8]`.
    let data = std::slice::from_raw_parts(s.data, s.len);

    // Convert the `&[u8]` to a `&str`    
    match std::str::from_utf8(data) {
        Ok(s) => real_code::compute(s, n1, n2),
        Err(_) => -2,
    }
}

// I advocate that you keep your interesting code in a different
// crate for easy development and testing. Have a separate crate
// with the FFI shims.
mod real_code {
    pub fn compute(operator: &str, n1: i32, n2: i32) -> i32 {
        match operator {
            "SUM"  => n1 + n2,
            "DIFF" => n1 - n2,
            "MULT" => n1 * n2,
            "DIV"  => n1 / n2,
            _ => 0,
        }
    }
}

index.js

const fs = require('fs-extra');
const { TextEncoder } = require('text-encoding');

// Allocate some memory.
const memory = new WebAssembly.Memory({ initial: 20, maximum: 100 });

// Connect these memory regions to the imported module
const importObject = {
  env: { memory }
};

// Create an object that handles converting our strings for us
const memoryManager = (memory) => {
  var base = 0;

  // NULL is conventionally at address 0, so we "use up" the first 4
  // bytes of address space to make our lives a bit simpler.
  base += 4;

  return {
    encodeString: (jsString) => {
      // Convert the JS String to UTF-8 data
      const encoder = new TextEncoder();
      const encodedString = encoder.encode(jsString);

      // Organize memory with space for the JsInteropString at the
      // beginning, followed by the UTF-8 string bytes.
      const asU32 = new Uint32Array(memory.buffer, base, 2);
      const asBytes = new Uint8Array(memory.buffer, asU32.byteOffset + asU32.byteLength, encodedString.length);

      // Copy the UTF-8 into the WASM memory.
      asBytes.set(encodedString);

      // Assign the data pointer and length values.
      asU32[0] = asBytes.byteOffset;
      asU32[1] = asBytes.length;

      // Update our memory allocator base address for the next call
      const originalBase = base;
      base += asBytes.byteOffset + asBytes.byteLength;

      return originalBase;
    }
  };
};

const myMemory = memoryManager(memory);

fs.readFile('./target/wasm32-unknown-unknown/release/quick_maths.wasm')
  .then(bytes => WebAssembly.instantiate(bytes, importObject))
  .then(({ instance }) => {
    const argString = "MULT";
    const argN1 = 42;
    const argN2 = 100;

    const s = myMemory.encodeString(argString);
    const result = instance.exports.compute(s, argN1, argN2);

    console.log(result);
  });

Execution

$ yarn run example
4200

Solution 2

I decided:

  1. To convert JS strings to UTF-8, which means that the TextEncoder JS API is the best fit.
  2. The module should own the memory buffer.
  3. To have the length be a separate value.
  4. To use a Box<String> as the underlying data structure. This allows the allocation to be further used by Rust code.

src/lib.rs

// Very important to use `transparent` to prevent ABI issues
#[repr(transparent)]
pub struct JsInteropString(*mut String);

impl JsInteropString {
    // Unsafe because we create a string and say it's full of valid
    // UTF-8 data, but it isn't!
    unsafe fn with_capacity(cap: usize) -> Self {
        let mut d = Vec::with_capacity(cap);
        d.set_len(cap);
        let s = Box::new(String::from_utf8_unchecked(d));
        JsInteropString(Box::into_raw(s))
    }

    unsafe fn as_string(&self) -> &String {
        &*self.0
    }

    unsafe fn as_mut_string(&mut self) -> &mut String {
        &mut *self.0
    }

    unsafe fn into_boxed_string(self) -> Box<String> {
        Box::from_raw(self.0)
    }

    unsafe fn as_mut_ptr(&mut self) -> *mut u8 {
        self.as_mut_string().as_mut_vec().as_mut_ptr()
    }
}

#[no_mangle]
pub unsafe extern "C" fn stringPrepare(cap: usize) -> JsInteropString {
    JsInteropString::with_capacity(cap)
}

#[no_mangle]
pub unsafe extern "C" fn stringData(mut s: JsInteropString) -> *mut u8 {
    s.as_mut_ptr()
}

#[no_mangle]
pub unsafe extern "C" fn stringLen(s: JsInteropString) -> usize {
    s.as_string().len()
}

#[no_mangle]
pub unsafe extern "C" fn compute(s: JsInteropString, n1: i32, n2: i32) -> i32 {
    let s = s.into_boxed_string();
    real_code::compute(&s, n1, n2)
}

mod real_code {
    pub fn compute(operator: &str, n1: i32, n2: i32) -> i32 {
        match operator {
            "SUM"  => n1 + n2,
            "DIFF" => n1 - n2,
            "MULT" => n1 * n2,
            "DIV"  => n1 / n2,
            _ => 0,
        }
    }
}

index.js

const fs = require('fs-extra');
const { TextEncoder } = require('text-encoding');

class QuickMaths {
  constructor(instance) {
    this.instance = instance;
  }

  difference(n1, n2) {
    const { compute } = this.instance.exports;
    const op = this.copyJsStringToRust("DIFF");
    return compute(op, n1, n2);
  }

  copyJsStringToRust(jsString) {
    const { memory, stringPrepare, stringData, stringLen } = this.instance.exports;

    const encoder = new TextEncoder();
    const encodedString = encoder.encode(jsString);

    // Ask Rust code to allocate a string inside of the module's memory
    const rustString = stringPrepare(encodedString.length);

    // Get a JS view of the string data
    const rustStringData = stringData(rustString);
    const asBytes = new Uint8Array(memory.buffer, rustStringData, encodedString.length);

    // Copy the UTF-8 into the WASM memory.
    asBytes.set(encodedString);

    return rustString;
  }
}

async function main() {
  const bytes = await fs.readFile('./target/wasm32-unknown-unknown/release/quick_maths.wasm');
  const { instance } = await WebAssembly.instantiate(bytes);
  const maffs = new QuickMaths(instance);

  console.log(maffs.difference(100, 201));
}

main();

Execution

$ yarn run example
-101

Note that this process can be used for other types. You "just" have to decide how to represent data as a set of bytes that both sides agree on then send it across.

See also:

  • Using the WebAssembly JavaScript API
  • TextEncoder API
  • Uint8Array / Uint32Array / TypedArray
  • WebAssembly.Memory
  • Hello, Rust! — Import memory buffer
  • How to return a string (or similar) from Rust in WebAssembly?
like image 97
Shepmaster Avatar answered Nov 12 '22 09:11

Shepmaster