Part #1: Memory management in WASM

Divya Mohan
Published in Dev Genius
5 min read · Feb 14, 2022


This is the fourth post in my series on WebAssembly, and it can also be read independently. It is intended as a theoretical primer on memory management in WASM. If you’d like to check out the previous posts in this series, I’ve included them in the resources section below.

Motivations behind exploring memory management

Memory in WebAssembly is a challenging topic to explore when your frame of reference is programming languages with automatic memory management. With direct access to raw bytes and manual memory management, WebAssembly can feel pretty alien to anyone used to those languages.

What do I need to know before I start?

Nothing. Even if you’re entirely new to WebAssembly (and memory management), this post will explain the basics in a contrasting manner so that a person entirely new to the ecosystem can understand both sides of the coin.

What is memory management?

Simply put, it is the process of controlling & coordinating the way an application (i.e. code written in any programming language) accesses your computer’s memory (RAM). Now you might ask, why does my application need memory? When your code runs on any operating system, it requires access to RAM for the following reasons:

  • To load its own bytecode that will then be executed
  • To store the structures & data that will be used during execution
  • To load any runtime systems that are required for the program to execute

Over and above the space required to load the bytecode, the program uses two special regions of memory known as the heap & the stack. You can reference this link for in-depth coverage of stack vs. heap.

TL;DR if you didn’t read the link above: the stack vs. heap distinction is based on how memory is allocated. There are also differences in how these regions are accessed, the speed at which they are accessed, what can be stored in each, the sizes to which they can grow, and the errors you’ll encounter if these areas are mismanaged.

Per the v1 spec, WASM modules execute within a sandboxed environment that separates them from the host/OS runtime. This effectively means they cannot access data from the host or from other WebAssembly guests unless explicitly allowed. So given the above, what exactly does WebAssembly memory look like?

Memory and WebAssembly

WebAssembly modules were designed with the assumption that they do NOT control the entire process address space, unlike programs written in languages like C/C++. They can share memory with other instances via import/export, but each module is allocated only a small contiguous section of virtual memory, addressed by byte offsets from its start and known as its linear memory. In a very crude manner, this is what linear memory allocation looks like:

Linear memory: A small contiguous section of virtual memory

This memory is created with an initial size and is measured in pages of 64 KiB each. As of the WebAssembly v1 spec, it can be grown dynamically with the memory.grow instruction (named grow_memory in earlier drafts) to a maximum of 65536 pages, for a total of 2^32 bytes (4 gibibytes).
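As a rough illustration of the page arithmetic, here is a minimal sketch using the JavaScript API's WebAssembly.Memory object, whose grow() method is the host-side counterpart of the grow instruction. The initial and maximum page counts below are arbitrary values chosen for the example, not anything prescribed by the spec.

```typescript
// Size of one WebAssembly page in bytes (64 KiB).
const PAGE_SIZE = 64 * 1024;

// Create a linear memory of 1 page; the cap of 4 pages is an arbitrary example value.
const memory = new WebAssembly.Memory({ initial: 1, maximum: 4 });

// Keep a reference to the buffer as it was before growing.
const bufferBeforeGrow = memory.buffer;
const bytesBefore = bufferBeforeGrow.byteLength;

// grow() takes the number of pages to add and returns the previous size in pages.
const previousPages = memory.grow(2);
const bytesAfter = memory.buffer.byteLength;

// Growing detaches the old ArrayBuffer, so any views must be re-created
// from memory.buffer afterwards.
const staleBytes = bufferBeforeGrow.byteLength;

// 65536 pages of 64 KiB each give the 2^32-byte ceiling.
const maxAddressableBytes = 65536 * PAGE_SIZE;

console.log({ bytesBefore, previousPages, bytesAfter, staleBytes, maxAddressableBytes });
```

One practical consequence of the detached buffer: host code should not cache typed-array views across a call that might grow the memory.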

Why does any of this matter?

The spec ensures memory safety by guaranteeing the following:

  • Potentially malicious modules cannot access data outside the linear memory
  • Since the memory size is always known, the runtime can always check whether a module is accessing memory within the specified boundaries
  • Unless explicit access is given, the module may not access any other memory
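The bounds check described in the points above can be observed directly. The sketch below uses a tiny module assembled by hand purely for illustration (its bytes and the exported name load are assumptions of this example, not part of any toolchain): it declares one page of memory and exports a function that reads 4 bytes at a given address.

```typescript
// Hand-assembled, hypothetical module: a 1-page memory plus an exported
// function load(addr) that performs a 4-byte i32.load at addr.
const wasmBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // magic number + version 1
  0x01, 0x06, 0x01, 0x60, 0x01, 0x7f, 0x01, 0x7f, // type section: (i32) -> i32
  0x03, 0x02, 0x01, 0x00, // function section: function 0 has type 0
  0x05, 0x03, 0x01, 0x00, 0x01, // memory section: minimum 1 page, no maximum
  0x07, 0x08, 0x01, 0x04, 0x6c, 0x6f, 0x61, 0x64, 0x00, 0x00, // export "load"
  0x0a, 0x09, 0x01, 0x07, 0x00, 0x20, 0x00, 0x28, 0x02, 0x00, 0x0b, // code: local.get 0, i32.load, end
]);

// Synchronous compilation is fine for a module this small.
const instance = new WebAssembly.Instance(new WebAssembly.Module(wasmBytes));
const load = instance.exports.load as (addr: number) => number;

// The last valid 4-byte read in a 65536-byte page starts at 65532.
const inBounds = load(65532);

// One byte further and the read would cross the end of linear memory,
// so the runtime traps instead of touching anything outside it.
let trapped = false;
try {
  load(65533);
} catch (err) {
  trapped = err instanceof WebAssembly.RuntimeError;
}

console.log({ inBounds, trapped });
```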

But WASM modules can be written in and compiled from different programming languages, each with its own memory management model. How does the spec stay consistent across these languages? Simple. By understanding how allocations and deallocations are achieved when writing to or reading from the host’s runtime, and by taking into account how to simplify the exchange of complex data types across that boundary.

Given the above, there are two ways to go about managing memory in WASM:

  • Your module owns the data and is responsible for managing its lifetime
  • The data is copied from the host to the module

While you could do either of these manually by writing your own code, for the second option you could instead let tools like wasm-bindgen or the AssemblyScript loader do the magic for you.
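To give a feel for what the manual version of the copying approach involves, here is a minimal host-side sketch. It assumes a standalone WebAssembly.Memory and a made-up bump allocator (nextFree); in a real project the module would typically export its own allocation function, and the helper names copyStringIn and readStringOut are hypothetical.

```typescript
// A standalone linear memory standing in for a module's memory (assumption).
const memory = new WebAssembly.Memory({ initial: 1 });

// Hypothetical bump allocator: the next unused byte offset in linear memory.
let nextFree = 0;

// Encode a string as UTF-8 and copy its bytes into linear memory.
// Returns the (pointer, length) pair a module would need to find the data.
function copyStringIn(text: string): { ptr: number; len: number } {
  const bytes = new TextEncoder().encode(text);
  const ptr = nextFree;
  if (ptr + bytes.length > memory.buffer.byteLength) {
    throw new RangeError("not enough linear memory for this string");
  }
  new Uint8Array(memory.buffer, ptr, bytes.length).set(bytes);
  nextFree += bytes.length;
  return { ptr, len: bytes.length };
}

// Read a UTF-8 string back out of linear memory from a (pointer, length) pair.
function readStringOut(ptr: number, len: number): string {
  return new TextDecoder().decode(new Uint8Array(memory.buffer, ptr, len));
}

const greeting = copyStringIn("hello wasm");
const roundTrip = readStringOut(greeting.ptr, greeting.len);

console.log(greeting, roundTrip);
```

Passing a pointer and a length is needed because WebAssembly v1 functions only accept numbers, so anything richer has to travel through linear memory.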

What wasm-bindgen does is wrap the WebAssembly module in a JavaScript wrapper so that both JavaScript and WASM are reading from/writing to the same linear memory in a way that both understand. Kinda like this.

Credit: https://hacks.mozilla.org/2018/06/babys-first-rustwebassembly-module-say-hi-to-jsconf-eu/
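The idea of both sides working on the same linear memory can be sketched without any tooling. This is not what wasm-bindgen generates; it is a hand-assembled, hypothetical module (the import name env.mem and the export name store are assumptions of this example) that imports a memory created by the host and writes a number into it, which the host then reads back.

```typescript
// Hand-assembled, hypothetical module: imports a memory as env.mem and
// exports store(addr, value), which performs an i32.store.
const wasmBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // magic number + version 1
  0x01, 0x06, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x00, // type section: (i32, i32) -> ()
  0x02, 0x0c, 0x01, 0x03, 0x65, 0x6e, 0x76, 0x03, 0x6d, 0x65, 0x6d, 0x02, 0x00, 0x01, // import env.mem, min 1 page
  0x03, 0x02, 0x01, 0x00, // function section: function 0 has type 0
  0x07, 0x09, 0x01, 0x05, 0x73, 0x74, 0x6f, 0x72, 0x65, 0x00, 0x00, // export "store"
  0x0a, 0x0b, 0x01, 0x09, 0x00, 0x20, 0x00, 0x20, 0x01, 0x36, 0x02, 0x00, 0x0b, // code: local.get 0, local.get 1, i32.store, end
]);

// The host creates the memory and hands it to the module at instantiation.
const sharedMemory = new WebAssembly.Memory({ initial: 1 });
const instance = new WebAssembly.Instance(new WebAssembly.Module(wasmBytes), {
  env: { mem: sharedMemory },
});
const store = instance.exports.store as (addr: number, value: number) => void;

// The module writes 42 at byte offset 8 of the shared linear memory...
store(8, 42);

// ...and the host reads the same bytes (WebAssembly stores integers little-endian).
const seenByHost = new DataView(sharedMemory.buffer).getInt32(8, true);

console.log({ seenByHost });
```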

Similarly, in addition to loading Wasm modules and exposing them via the WebAssembly API, the AssemblyScript loader provides utilities to allocate and read strings, arrays, and classes. By providing the glue code, it does most of the heavy lifting and lets you play around with linear memory.

The aforementioned utilities are scoped to their respective languages, so there is definitely an appetite for more development in this space. At the time of writing this post, there are a few proposals in the works, namely interface types, multi-value, and garbage collection, that could potentially simplify memory management and the exchange of complex data types in WebAssembly.

So, that’s it for this post! In the next one, we shall look at a simple program and how we can leverage WASM’s memory management capabilities for a simple Web Application written in TypeScript.

Resources:

A primer on WebAssembly

How to read WASM — part #1

How to read WASM — part #2

Mozilla Hacks blogs by Lin Clark on WebAssembly

This GitHub repo of awesome WASM resources.

To stay updated with my latest tech shenanigans, do follow me on Twitter and LinkedIn.


Technical Evangelism @ Rancher by SUSE • SIG Docs co-chair @ Kubernetes