 

Can WebAssembly compiled with Emscripten generate smaller file sizes

I'm very interested in WebAssembly, yet am dismayed that even a "Hello World" example, coded in C++ & compiled using Emscripten, produces a total of 396KB to load in the browser. What gives? How can this be made more size-efficient?

Jack asked Sep 15 '25 09:09


1 Answer

Summary

  • For larger projects like a game engine, the Emscripten generated code has proportionally less size overhead compared to a small Hello World example.
  • Emscripten has recently made large improvements in shrinking the code size. Make sure you use a recent Emscripten release.
  • Adding -Os --closure 1 to the emcc command line may reduce the size of the generated code by 10x.
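As a rough sketch of how the flags from the last bullet might be applied: the file name hello.cpp and the function name say_hello are placeholders of mine, and the reduction you actually see depends on your Emscripten version and on what the program pulls in.

```cpp
// hello.cpp (placeholder file name)
//
// Possible build commands, assuming a recent emcc is on the PATH:
//   emcc hello.cpp -o hello.html                   (unoptimized baseline)
//   emcc hello.cpp -Os --closure 1 -o hello.html   (size-optimized)
//
// The caller is left out so the snippet can be pasted into an existing file.
// Before building it as a program, add:
//   int main() { say_hello(); return 0; }
#include <cstdio>

// Prints the greeting and returns the number of characters written,
// which is what printf itself returns.
int say_hello() {
    return std::printf("Hello World\n");
}
```

Comparing the combined size of the generated .wasm and .js files for the two builds shows how much the flags help for your setup.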

The rest of this answer explains where the size comes from and how the output can be made more size-efficient.


Why is so much code generated?

The amount of WebAssembly generated is proportional to the amount of C++ code written and to the dependencies of that code. A C++ program that depends on the standard library depends on more code than you might expect. A simple add() function like this...

int add(int x, int y) {
    return x + y;
}

...will generate a short WebAssembly function like this:

(func $add (param $x i32) (param $y i32) (result i32)
  local.get $x
  local.get $y
  i32.add)

But a call to printf needs definitions for functions like strlen, flockfile, funlockfile, memcpy, fwrite, fputs and __stdio_write, i.e. all the standard library functions involved in making the printf call. A C++ program running in a native environment would just be linked against the proper libc for the platform, but a WebAssembly module needs to carry those library dependencies along.
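To make that dependency chain a bit more concrete, here is a much-simplified sketch. The names my_strlen and my_print are hypothetical stand-ins, not the real libc implementation, and the real printf does a lot more (format parsing, locking, buffering). The point is only that every helper on the path has to be shipped inside the module.

```cpp
#include <cstddef>
#include <cstdio>

// Hypothetical stand-in for the strlen that the output path relies on.
std::size_t my_strlen(const char* s) {
    std::size_t n = 0;
    while (s[n] != '\0') {
        ++n;
    }
    return n;
}

// Hypothetical, heavily reduced stand-in for the output path of printf:
// measure the string, then hand the bytes to the stream layer (fwrite).
// Returns the number of bytes written.
std::size_t my_print(const char* s, std::FILE* out = stdout) {
    return std::fwrite(s, 1, my_strlen(s), out);
}
```

Even this reduced version still depends on fwrite, and with it on the rest of the stream machinery underneath.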

In addition to the userspace library dependencies, the tool that generates WebAssembly also needs to provide a runtime environment that handles system calls. So a Hello World program needs definitions that override the system calls for allocating memory and for writing bytes.
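To give a feel for what "providing memory allocation" can mean, here is a toy model. It is not how Emscripten implements it: the buffer heap, the function toy_alloc and the 64 KiB size are all assumptions made for illustration. A fixed buffer plays the role of the module's linear memory, and a bump pointer hands out chunks of it.

```cpp
#include <cstddef>

// Toy model: a fixed buffer standing in for the module's linear memory.
alignas(16) static unsigned char heap[64 * 1024];
static std::size_t heap_top = 0;

// Hypothetical bump allocator: returns the next free 16-byte aligned chunk,
// or nullptr when the request does not fit. Nothing is ever freed.
void* toy_alloc(std::size_t size) {
    const std::size_t aligned = (size + 15) & ~static_cast<std::size_t>(15);
    if (aligned < size || aligned > sizeof(heap) - heap_top) {
        return nullptr;  // rounding overflowed, or the heap is exhausted
    }
    void* p = heap + heap_top;
    heap_top += aligned;
    return p;
}
```

A real runtime also has to grow memory, reuse freed blocks and route writes to the console, which is part of why the supporting code is not trivial in size.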


How can the compiler shrink the code size?

Alon Zakai, the creator and maintainer of Emscripten, has written the Mozilla Hacks article Shrinking WebAssembly and JavaScript code sizes in Emscripten. I'm gonna summarize the main points from that article here:

Emscripten initially focused on making it easy to port existing C and C++ programs by providing a POSIX environment, implementing a libc and a runtime for system calls. In the name of convenience, more code was often included than was needed.

A lot of the runtime was implemented as JavaScript code. Emscripten generates code that calls back and forth between the application/library WebAssembly code and the JavaScript runtime.

Code that is never called should be removed. In compilers that's handled by an optimization called Dead Code Elimination. Emscripten builds a graph of all functions and removes the parts that are never reachable from main. OK, this is not strictly correct, but it suffices for this explanation.

But the compiler wasn't previously capable of generating that sort of graph for calls that crossed the boundary between WebAssembly and JavaScript. That changed with the inclusion of the wasm-metadce tool. Now, Emscripten can create a graph that spans both the WebAssembly and the JavaScript code.
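The reachability idea can be sketched in a few lines. This is only an illustration of the principle, not Emscripten's actual implementation, and every name in it (CallGraph, reachable_from, example_graph, js_write_bytes, js_open_file, unused_helper) is made up for the example. Functions from both sides of the boundary are nodes in one graph, and everything not reachable from main is a candidate for removal.

```cpp
#include <map>
#include <queue>
#include <set>
#include <string>
#include <vector>

// Call graph: each function name maps to the functions it calls.
using CallGraph = std::map<std::string, std::vector<std::string>>;

// Breadth-first walk that returns every function reachable from `root`.
// Anything missing from the result is dead code.
std::set<std::string> reachable_from(const CallGraph& graph,
                                     const std::string& root) {
    std::set<std::string> seen{root};
    std::queue<std::string> pending;
    pending.push(root);
    while (!pending.empty()) {
        const std::string current = pending.front();
        pending.pop();
        const auto it = graph.find(current);
        if (it == graph.end()) {
            continue;  // leaf: calls nothing we know about
        }
        for (const std::string& callee : it->second) {
            if (seen.insert(callee).second) {
                pending.push(callee);
            }
        }
    }
    return seen;
}

// Hypothetical graph mixing WebAssembly functions and JavaScript helpers.
CallGraph example_graph() {
    return {
        {"main", {"printf"}},
        {"printf", {"fwrite"}},
        {"fwrite", {"js_write_bytes"}},       // crosses into the JS runtime
        {"unused_helper", {"js_open_file"}},  // never called from main
    };
}
```

Calling reachable_from(example_graph(), "main") keeps main, printf, fwrite and js_write_bytes. unused_helper and js_open_file drop out, and js_open_file can only be identified as dead because the graph covers both sides of the boundary.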


What is the limit of "shrinkage" for a Hello World program?

printf is a general-purpose function that operates on file descriptors and is thread-safe. Pretty much all of the code that is generated for a printf call has to be there.

If you want to experiment more with what code is being generated, I recommend the WebAssembly Studio online IDE. It provides an example Hello World project with a README that goes over what library code and runtime JavaScript code is generated.

Daniel Näslund answered Sep 18 '25 10:09