WebAssembly is a relatively new and rapidly evolving low-level language, designed as an assembly language that runs on the Web and supported by all major browsers. High-level languages like C++ and Rust can use WebAssembly as a compilation target for deployment to the web.

Set up the Toolchain

C/C++ source code can be compiled to WebAssembly with either clang or emscripten. Installing the emscripten toolchain is fairly straightforward:

  • Clone the emsdk,

      git clone https://github.com/emscripten-core/emsdk.git
      cd emsdk/
    
  • Install the latest release (./emsdk install latest), or, as done here, the tip-of-tree build,

      ./emsdk install tot
      ./emsdk activate tot
    
  • Finally, populate the environment variables,

      source ./emsdk_env.sh
    

Now emcc should work,

$ emcc --version
emcc (Emscripten gcc/clang-like replacement + linker emulating GNU ld) 3.1.6-git (e1bfc04f8234eff40739c7932402f2148487c192)
Copyright (C) 2014 the Emscripten authors (see AUTHORS.txt)
This is free and open source software under the MIT license.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Besides compiling C/C++ source code to WebAssembly, Emscripten also supports a set of polyfill libraries, e.g., SDL and glfw, that emulate native behaviors in web browsers.

To inspect and manipulate WebAssembly modules, we need to install the binaryen tools,

git clone https://github.com/WebAssembly/binaryen.git
cd binaryen/
git submodule update --init
mkdir build && cd build && cmake ..
make install -j`nproc`

Compiling with Emscripten

Let’s try a “hello world” to get a glance at the emscripten toolchain,

#include <stdio.h>

int main() {
  printf("hello world, webassembly!\n");
  return 0;
}

Compiling the C source code with emcc,

$ emcc hello.cc

This generates a.out.js and a.out.wasm, where a.out.js is the entry file,

$ node ./a.out.js
hello world, webassembly!

Compiling with LLVM

Emscripten is not required for compiling C/C++ source to WebAssembly, since LLVM natively supports WebAssembly code generation,

$ clang --print-targets
  Registered Targets:
    ...
    wasm32     - WebAssembly 32-bit
    wasm64     - WebAssembly 64-bit

We can indeed compile to WebAssembly using clang alone,

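// extern "C" keeps the exported symbol name as plain `add`;
// a .cc file is compiled as C++, which would otherwise mangle the name.
extern "C" int add(int first, int second) {
  return first + second;
}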

Compiling the code above to WebAssembly,

$ clang -O2 --target=wasm32 --no-standard-libraries -Wl,--export-all -Wl,--no-entry -o add.wasm add.cc

We use --target=wasm32 for the clang compiler, as wasm64 is not well supported by binaryen at the time of writing; see WebAssembly/binary#.

The wasm2js tool in binaryen can translate the wasm module into a JavaScript module (see the difference between .js and .mjs),

$ wasm2js add.wasm -o add.mjs

And we can import and evaluate the js module in node,

$ node --experimental-repl-await
Welcome to Node.js v14.18.2.
Type ".help" for more information.
>
> let addModule = await import("./add.mjs")

> addModule
[Module: null prototype] {
  __wasm_call_ctors: [Function: __wasm_call_ctors],
  add: [Function: add],
  memory: {}
}

> addModule.add(1, 2)
3

Alternatively, we can load the WebAssembly binary directly in node using WebAssembly.instantiate(),

$ node --experimental-repl-await
Welcome to Node.js v14.18.2.
Type ".help" for more information.
>
> let addModule = await WebAssembly.instantiate(fs.readFileSync("add.wasm"));
{
  instance: Instance [WebAssembly.Instance] {},
  module: Module [WebAssembly.Module] {}
}

> addModule.instance.exports
[Object: null prototype] {
  memory: Memory [WebAssembly.Memory] {},
  __wasm_call_ctors: [Function: 0],
  add: [Function: 1],
  __dso_handle: Global [WebAssembly.Global] {},
  __data_end: Global [WebAssembly.Global] {},
  __global_base: Global [WebAssembly.Global] {},
  __heap_base: Global [WebAssembly.Global] {},
  __memory_base: Global [WebAssembly.Global] {},
  __table_base: Global [WebAssembly.Global] {}
}

> addModule.instance.exports.add(1, 2)
3
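The same steps also work in a standalone script. The following is a minimal sketch rather than the build output from above: the byte array is a hand-assembled stand-in for add.wasm (an assumption, so that the script runs without the toolchain) that exports a single add function, and it uses the synchronous WebAssembly.Module and WebAssembly.Instance constructors in place of WebAssembly.instantiate().

```javascript
// Hand-assembled stand-in for add.wasm (assumption: not produced by clang).
// It defines one function, (i32, i32) -> i32, exported as "add".
const bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,       // "\0asm", version 1
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // type 0: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00,                               // function 0 uses type 0
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00, // export "add" = function 0
  0x0a, 0x09, 0x01, 0x07, 0x00,                         // code section, one body, no locals
  0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b,                   // local.get 0, local.get 1, i32.add, end
]);

// Compile and instantiate synchronously; fine for small modules in node.
const wasmModule = new WebAssembly.Module(bytes);
const instance = new WebAssembly.Instance(wasmModule, {});

// List the exports of the compiled module without instantiating it.
const exportsInfo = WebAssembly.Module.exports(wasmModule);
console.log(exportsInfo);

const sum = instance.exports.add(1, 2);
console.log(sum);

// Arguments and results are 32-bit integers, so the addition wraps around.
const wrapped = instance.exports.add(2147483647, 1);
console.log(wrapped);
```

The wrap-around in the last call is worth keeping in mind: values crossing the boundary as int are converted to 32-bit integers, not JavaScript doubles.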

We use --no-standard-libraries when compiling .cc to WebAssembly to avoid linking the standard libc/libc++ libraries (namely libc.a, libc++.a, libc++abi.a and libclang_rt.builtins-wasm32.a), which are usually not available in typical LLVM/Clang distributions, e.g., apt.llvm.org. Fortunately, the wasi-libc project provides a libc-compatible layer for WebAssembly, and the wasi-sdk project provides a pre-built, cross-compiled bundle of the above standard libraries with the sysroot correctly configured.

The wasi-sdk toolchain is available at https://github.com/WebAssembly/wasi-sdk/releases.

Once we have downloaded and unpacked the wasi-sdk, we can verify its WebAssembly support with

$ $WASM/bin/clang --print-targets
  Registered Targets:
    wasm32 - WebAssembly 32-bit
    wasm64 - WebAssembly 64-bit

where $WASM is the location of the wasi-sdk installation.

Let’s try compiling C/C++ source that uses the standard libraries to WebAssembly,

#include <math.h>

extern "C" double add(int first, int second) {
  return sin(first + second);
}

Compiling using $WASM/bin/clang++,

$ $WASM/bin/clang++ -O3 --target=wasm32-unknown-wasi -nostartfiles -Wl,--export-all -Wl,--no-entry -o add.wasm add.cc

Note that we use -nostartfiles because the module has no main entry function, and use the wasm32-unknown-wasi target triple to select the WASI ABI.

Evaluating a WebAssembly binary that depends on WASI requires an implementation of the WASI APIs (namely wasi_snapshot_preview1). Some polyfills provide this functionality, e.g., deno.std and lib/wasi.js.

$ node --experimental-repl-await --experimental-wasi-unstable-preview1
Welcome to Node.js v14.18.2.
Type ".help" for more information.
>
> const WASI = await import("wasi");

> let wasi = new WASI.WASI();

> const importObject = { wasi_snapshot_preview1: wasi.wasiImport };

> let addModule = await WebAssembly.instantiate(fs.readFileSync("add.wasm"), importObject);

> addModule.instance.exports.add(1, 2)
0.141120008059867

Memory Pointers in WASM

Consider the following case, where we need to share some memory pointers between WebAssembly (i.e., the C/C++ source code) and JavaScript [1],

#include <math.h>
#include <string.h>

extern "C" double add(int first, int second) {
  return sin(first + second);
}

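extern "C" int add_string(char *input, char *output) {
    int nlen = strlen(input);
    // strncpy with n == strlen(input) does not write the trailing NUL,
    // so terminate the copy explicitly.
    strncpy(output, input, nlen);
    output[nlen] = '\0';
    return nlen;
}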

Compile the above C++ source code with -Wl,--import-memory to make the memory of the WASM module shareable with the host environment [2],

$WASM/bin/clang++ -O3 --target=wasm32-unknown-wasi \
    -nostartfiles \
    -Wl,--export-all \
    -Wl,--no-entry \
    -Wl,--import-memory \
    -o add.wasm add.cc

Inspecting the generated add.wasm with wasm-dis from Binaryen shows an “import” statement for the memory in the wat,

(import "env" "memory" (memory $mimport$0 2))

We need to provide an env.memory entry in the importObject,

> let memory = new WebAssembly.Memory({ initial: 2 });

> let importObject = { wasi_snapshot_preview1: wasi.wasiImport, env: { memory: memory } };
> let addModule = await WebAssembly.instantiate(fs.readFileSync("add.wasm"), importObject);

> let view = new Uint8Array(memory.buffer);
> let base = addModule.instance.exports.__heap_base;

The block of memory can be accessed from both environments: as a typed array in JavaScript and through raw pointers in WebAssembly (i.e., C/C++).
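To make the sharing concrete without the compiler toolchain, here is a small sketch. The bytes are a hand-assembled stand-in module (an assumption, not the add.wasm built above): it imports env.memory and exports one hypothetical function, load8(ptr), that returns the byte stored at ptr. A value written through the JavaScript view is then read back from inside WebAssembly.

```javascript
// Hand-assembled stand-in module (assumption: not the add.wasm from the text).
// It imports "env"."memory" and exports load8(ptr), which returns the byte at ptr.
const bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // "\0asm", version 1
  0x01, 0x06, 0x01, 0x60, 0x01, 0x7f, 0x01, 0x7f, // type 0: (i32) -> i32
  0x02, 0x0f, 0x01, 0x03, 0x65, 0x6e, 0x76,       // one import, from module "env" ...
  0x06, 0x6d, 0x65, 0x6d, 0x6f, 0x72, 0x79,       // ... with the field name "memory" ...
  0x02, 0x00, 0x01,                               // ... a memory of at least 1 page
  0x03, 0x02, 0x01, 0x00,                         // function 0 uses type 0
  0x07, 0x09, 0x01, 0x05, 0x6c, 0x6f, 0x61, 0x64, 0x38, 0x00, 0x00, // export "load8"
  0x0a, 0x09, 0x01, 0x07, 0x00,                   // code section, one body, no locals
  0x20, 0x00, 0x2d, 0x00, 0x00, 0x0b,             // local.get 0, i32.load8_u, end
]);

const wasmModule = new WebAssembly.Module(bytes);

// Ask the compiled module which imports it expects.
const requiredImports = WebAssembly.Module.imports(wasmModule);
console.log(requiredImports);

// Instantiation fails when env.memory is not provided.
let missingImportError = null;
try {
  new WebAssembly.Instance(wasmModule, { env: {} });
} catch (err) {
  missingImportError = err;
}

// With the memory supplied, JavaScript and WebAssembly see the same bytes.
const memory = new WebAssembly.Memory({ initial: 2 });
const instance = new WebAssembly.Instance(wasmModule, { env: { memory } });

const view = new Uint8Array(memory.buffer);
view[16] = 0x68; // the character 'h', written from JavaScript
const seenByWasm = instance.exports.load8(16);
console.log(seenByWasm);
```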

Next, we define two functions: one to poke a JavaScript string into the memory view, and one to peek a JavaScript string back out of it,

> function fromJSString(memory, base, text) {
      for (let i = 0; i < text.length; ++i) {
          memory[base + i] = text.charCodeAt(i);
      }
      memory[base + text.length] = 0;
  }

> function toJSString(memory, base) {
      let p = base + 0;
      let result = '';
      while (memory[p] !== 0) {
          result += String.fromCharCode(memory[p++]);
      }
      return result;
  }
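The helpers above store one UTF-16 code unit per byte, which is only lossless for ASCII text. As a hedged alternative sketch, the same peek/poke pair can be written with the standard TextEncoder and TextDecoder so that strings cross the boundary as NUL-terminated UTF-8. The names writeCString and readCString are hypothetical, and a plain Uint8Array stands in for the view over the WebAssembly memory.

```javascript
// Hypothetical helpers; `view` is a Uint8Array over the WebAssembly memory.
const encoder = new TextEncoder();
const decoder = new TextDecoder("utf-8");

// Poke: write `text` as NUL-terminated UTF-8 at `ptr`, return its byte length.
function writeCString(view, ptr, text) {
  const bytes = encoder.encode(text);
  if (ptr + bytes.length + 1 > view.length) {
    throw new RangeError("string does not fit into the memory view");
  }
  view.set(bytes, ptr);
  view[ptr + bytes.length] = 0;
  return bytes.length;
}

// Peek: read a NUL-terminated UTF-8 string starting at `ptr`.
function readCString(view, ptr) {
  let end = view.indexOf(0, ptr);
  if (end === -1) end = view.length; // no terminator: read to the end of the view
  return decoder.decode(view.subarray(ptr, end));
}

// A plain buffer stands in for `new Uint8Array(memory.buffer)` here.
const view = new Uint8Array(64);
const written = writeCString(view, 8, "héllo, wasm");
const roundTrip = readCString(view, 8);
console.log(written, roundTrip);
```

Note that the returned length counts bytes, not characters, which is what a C function such as strlen reports on the other side.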

Putting the above pieces together, we can verify that the add_string function defined in C++ works as expected,

$ node --experimental-repl-await --experimental-wasi-unstable-preview1
Welcome to Node.js v14.18.2.
Type ".help" for more information.
>
> const WASI = await import("wasi");
> const wasi = new WASI.WASI();

> function fromJSString(memory, base, text) {
      for (let i = 0; i < text.length; ++i) {
          memory[base + i] = text.charCodeAt(i);
      }
      memory[base + text.length] = 0;
  }

> function toJSString(memory, base) {
      let p = base + 0;
      let result = '';
      while (memory[p] !== 0) {
          result += String.fromCharCode(memory[p++]);
      }
      return result;
  }

> let memory = new WebAssembly.Memory({ initial: 2 });

> let importObject = { wasi_snapshot_preview1: wasi.wasiImport, env: { memory: memory } };
> let addModule = await WebAssembly.instantiate(fs.readFileSync("add.wasm"), importObject);

> let view = new Uint8Array(memory.buffer);
> let base = addModule.instance.exports.__heap_base;

> let text = "hello world, webassembly!";
> fromJSString(view, base, text);
> toJSString(view, base);
'hello world, webassembly!'

> addModule.instance.exports.add_string(base, base + text.length);
25
> toJSString(view, base);
'hello world, webassembly!hello world, webassembly!'

Note that the view is invalidated once the underlying memory buffer changes, e.g., when a malloc inside a WebAssembly function grows the memory. In that case, a new view must be created after returning from the WebAssembly function.
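The invalidation is easy to observe from JavaScript alone. In this sketch, a direct memory.grow() call stands in for a malloc inside WebAssembly that grows the memory (an assumption made so that the example needs no compiled module). The old view becomes empty, and a fresh view sees the preserved contents.

```javascript
const PAGE_SIZE = 65536; // size of one WebAssembly page in bytes

const memory = new WebAssembly.Memory({ initial: 1 });
let view = new Uint8Array(memory.buffer);
view[0] = 42;

// Stand-in for a WebAssembly function whose malloc grows the memory.
const previousPages = memory.grow(1);

// The old ArrayBuffer is detached, so the stale view now has length 0.
const staleLength = view.length;

// Re-create the view from memory.buffer; existing contents are preserved.
view = new Uint8Array(memory.buffer);
const freshLength = view.length;
const firstByte = view[0];

console.log(previousPages, staleLength, freshLength, firstByte);
```

One simple policy is to re-create the view after every call into WebAssembly, or to compare view.buffer with memory.buffer before reusing a cached view.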

  1. https://hacks.mozilla.org/2017/07/memory-in-webassembly-and-why-its-safer-than-you-think/ 

  2. https://lld.llvm.org/WebAssembly.html#cmdoption-import-memory