Important
The v2 from-scratch reimplementation is nearly complete. You can try it from the v2.x.x tags, though some instability and unimplemented items are still expected. Because development resources are limited, we are pausing acceptance of Issues and Pull Requests for now — if you hit a problem, please reach out via Discussions and mention @chaploud. On 2026-07-01 the zwasm-from-scratch branch will be merged into main and replace it. For the old version, refer to its tags and releases. See the v1 → v2 migration guide.
A small, full-featured WebAssembly runtime written in Zig. Library and CLI.
Supported host targets:
aarch64-macosx86_64-linux,aarch64-linuxx86_64-windows
| Runtime | Binary (stripped) | Memory (fib) | Execution | Wasm 3.0 |
|---|---|---|---|---|
| zwasm | 1.20–1.56 MB | ~3.5 MB | Interp + ARM64/x86_64 JIT | Full |
| wasmtime | ~56 MB | ~12 MB | Cranelift AOT/JIT | Full |
| wasmer | 30+ MB | ~15 MB | LLVM/Cranelift/Singlepass | Partial |
| wazero | 8–12 MB | ~6 MB | Pure Go interp + Compiler | Partial |
| wasm3 | ~0.3 MB | ~1 MB | Pure interpreter | Partial |
zwasm sits in the niche between "tiny but limited" runtimes (wasm3, WAMR) and "full-featured but large" ones (wasmtime, wasmer): full Wasm 3.0 with JIT and SIMD in roughly the same byte budget as a pure interpreter.
zwasm was extracted from ClojureWasm (a Zig reimplementation of Clojure) where keeping a Wasm subsystem inside the language runtime created a "runtime within runtime" layering problem. ClojureWasm remains the primary consumer.
- Full Wasm 3.0. Core MVP plus all 9 ratified 3.0 proposals (GC, exception handling, tail calls, function references, multi-memory, memory64, branch hinting, extended const, relaxed SIMD), plus threads (79 atomics) and wide arithmetic. 581+ opcodes total.
- 4-tier execution. Bytecode → predecoded IR → register IR → ARM64/x86_64 JIT. Hot functions promote automatically (HOT_THRESHOLD=3).
- SIMD JIT. ARM64 NEON 253/256 native, x86_64 SSE 244/256 native. Contiguous v128 register storage with Q-cache (Q16–Q31 / XMM6–XMM15).
- WASI Preview 1 + Component Model. 46/46 P1 syscalls (100%); P2 via component-model adapter, WIT parser, Canonical ABI.
- Spec conformance. 62,263 / 62,263 spec tests on Mac aarch64, Linux x86_64, Windows x86_64 (CI). 796 / 796 E2E tests on all three. 50 / 50 real-world programs (Rust + C + C++ + Go + TinyGo) on Mac and Linux; on Windows the GitHub-hosted CI runner exercises the C+C++ subset (25 / 25), while a local Windows checkout reaches the full 50 / 50 once
pwsh scripts/windows/install-tools.ps1has provisioned Rust + Go + TinyGo. CI parity tracked as W50 (CI Nix-ify). - WAT support. Run
.wattext files directly; build-optional via-Dwat=false. - Security. Deny-by-default WASI capabilities, fuel metering, wall-clock timeout, memory ceiling, JIT W^X pages, signal-handled traps.
- No libc. CLI / library / tests link
link_libc = false(Mac uses libSystem auto-link). C-API shared/static targets keeplink_libc = truebecausestd.heap.c_allocatoris exposed. - Allocator-parameterized. The library takes a
std.mem.Allocatorat load time; embedders own all allocation.
| Spec layer | Proposals included | Status |
|---|---|---|
| Wasm 1.0 | MVP (172 opcodes) | Complete |
| Wasm 2.0 | Sign extension, non-trapping float→int, bulk memory, reference types, multi-value, fixed-width SIMD (236) | Complete |
| Wasm 3.0 | Memory64, exception handling, tail calls, extended const, branch hinting, multi-memory, relaxed SIMD (20), function references, GC (31) | Complete |
| Phase 3 | Wide arithmetic (4), custom page sizes | Complete |
| Phase 4 | Threads (79 atomics) | Complete |
| Layer | Component Model (WIT, Canon ABI, WASI P2 adapter) | Complete |
18 / 18 proposals complete. 399 unit tests, 796 / 796 E2E tests, 50 / 50 real-world programs (Rust, C, C++, TinyGo compiler outputs). Per-proposal opcode and test counts are in the Spec Coverage chapter.
Apple M4 Pro, ReleaseSafe, hyperfine 5 runs / 3 warmup; vs wasmtime 41.0.1 (Cranelift JIT), Bun 1.3.8, Node v24.13.0:
| Benchmark | zwasm | wasmtime | Bun | Node |
|---|---|---|---|---|
| nqueens(8) | 2 ms | 5 ms | 14 ms | 23 ms |
| nbody(1M) | 22 ms | 22 ms | 32 ms | 36 ms |
| sieve(1M) | 5 ms | 7 ms | 17 ms | 29 ms |
| tak(24,16,8) | 5 ms | 9 ms | 17 ms | 29 ms |
| fib(35) | 46 ms | 51 ms | 36 ms | 52 ms |
| st_fib2 | 900 ms | 674 ms | 353 ms | 389 ms |
Of 29 benchmarks, the majority match or beat wasmtime; a few compute-heavy long-running ones (e.g. st_fib2) still trail Cranelift AOT. Memory usage is roughly 3–4× lower than wasmtime and 8–10× lower than Bun/Node. Full data: bench/runtime_comparison.yaml.
SIMD microbenchmarks (ARM64 NEON / x86_64 SSE JIT):
| Benchmark | zwasm scalar | zwasm SIMD | wasmtime SIMD |
|---|---|---|---|
| matrix_mul (16×16) | 10 ms | 6 ms | 8 ms |
| image_blend (128×128) | 73 ms | 16 ms | 12 ms |
| byte_search (64 KB) | 52 ms | 43 ms | 5 ms |
Hand-written SIMD kernels (matrix_mul, image_blend) are competitive with wasmtime; matrix_mul is faster, image_blend is within 1.4×. Compiler-generated SIMD code (e.g. C -msimd128 with heavy i16x8.replace_lane) still shows larger gaps; further work is tracked in .dev/checklist.md. Full data: bench/simd_comparison.yaml.
macOS / Linux:
zig build -Doptimize=ReleaseSafe
cp zig-out/bin/zwasm ~/.local/bin/
# or one-liner:
curl -fsSL https://raw.githubusercontent.com/clojurewasm/zwasm/main/install.sh | bashWindows (PowerShell):
zig build -Doptimize=ReleaseSafe
Copy-Item zig-out\bin\zwasm.exe "$env:LOCALAPPDATA\Microsoft\WindowsApps\zwasm.exe"
irm https://raw.githubusercontent.com/clojurewasm/zwasm/main/install.ps1 | iexzwasm module.wasm # Run a WASI module (run is implicit)
zwasm module.wat # Run a WAT text module directly
zwasm module.wasm -- arg1 arg2 # WASI args after `--`
zwasm module.wasm --invoke fib 35 # Call a specific exported function
zwasm run module.wasm --allow-all # Explicit `run` subcommand
zwasm inspect module.wasm # Show imports, exports, memory
zwasm validate module.wasm # Validate without executing
zwasm compile module.wasm # Pre-warm the IR cache to disk
zwasm features [--json] # List supported proposalsconst zwasm = @import("zwasm");
var module = try zwasm.WasmModule.load(allocator, wasm_bytes);
defer module.deinit();
var args = [_]u64{35};
var results = [_]u64{0};
try module.invoke("fib", &args, &results);
// results[0] == 9227465See docs/usage.md for fuel / timeout / memory limits, host functions, multi-module linking, and WASI configuration.
zig build lib # libzwasm.{dylib,so,dll} + .a + include/zwasm.h#include "zwasm.h"
zwasm_module_t *mod = zwasm_module_new(wasm_bytes, len);
uint64_t results[1] = {0};
zwasm_module_invoke(mod, "f", NULL, 0, results, 1);
zwasm_module_delete(mod);For execution limits use zwasm_config_t: zwasm_config_set_fuel, zwasm_config_set_timeout, zwasm_config_set_max_memory, zwasm_config_set_force_interpreter, zwasm_config_set_cancellable. Fuel applies to module startup (_start) as well as subsequent invocations.
Full reference: C API chapter. Working examples in examples/c/, examples/python/, examples/rust/ (same workflow from Python ctypes and Rust extern "C").
| # | Category | Examples |
|---|---|---|
| 01–09 | Basics | hello_add, if_else, loop, factorial, fibonacci, select, collatz, stack_machine, counter |
| 10–15 | Types | i64_math, float_math, bitwise, type_convert, sign_extend, saturating_trunc |
| 16–19 | Memory | memory, data_string, grow_memory, bulk_memory |
| 20–24 | Functions | multi_return, multi_value, br_table, mutual_recursion, call_indirect |
| 25–26 | Wasm 3.0 | return_call (tail calls), extended_const |
| 27–29 | Algorithms | bubble_sort, is_prime, simd_add |
| 30–33 | WASI | wasi_hello, wasi_echo, wasi_args, wasi_write_file |
zwasm examples/wat/01_hello_add.wat --invoke add 2 3 # → 5
zwasm examples/wat/05_fibonacci.wat --invoke fib 10 # → 55
zwasm examples/wat/30_wasi_hello.wat --allow-all # → Hi!Other languages: examples/zig/ (5 embedding examples), examples/c/, examples/python/, examples/rust/.
Requires Zig 0.16.0.
zig build # Debug build
zig build test # Run all unit tests (399 tests)
zig build c-test # Run C API tests
./zig-out/bin/zwasm run file.wasmOn Windows use zig-out\bin\zwasm.exe.
Strip features at compile time:
| Flag | Description | Default |
|---|---|---|
-Djit=false |
Disable JIT compiler | true |
-Dcomponent=false |
Disable Component Model | true |
-Dwat=false |
Disable WAT parser | true |
-Dsimd=false |
Disable SIMD opcodes | true |
-Dgc=false |
Disable GC proposal | true |
-Dthreads=false |
Disable threads/atomics | true |
Linux x86_64 ReleaseSafe stripped, measured on the current main:
| Variant | Flags | Size | Delta |
|---|---|---|---|
| Full (default) | (none) | 1.56 MB | — |
| No JIT | -Djit=false |
1.41 MB | −10% |
| No WAT | -Dwat=false |
1.41 MB | −10% |
| Minimal | -Djit=false -Dcomponent=false -Dwat=false |
1.26 MB | −19% |
(-Dcomponent=false alone is currently neutral — the Component Model code path is already dead-code-eliminated when not exercised; combining it with -Djit=false -Dwat=false is what produces the 300 KB saving.)
Mac aarch64 stripped is roughly 350 KB smaller than the Linux numbers (1.20 MB full / 0.92 MB minimal). CI enforces per-OS ceilings on the stripped binary: Mac 1.30 MB, Linux 1.60 MB, Windows 1.80 MB (PE overhead). Stripping is portable across ELF / Mach-O / PE via -Dstrip=true (LLD --strip-all); see D137 in .dev/decisions.md.
.wat text .wasm binary .wasm component
| | |
v | v
WAT Parser | Component Decoder
(optional) | (WIT + Canon ABI)
| | |
+------>-----+-----<---------+
|
v
Module (decode + validate)
|
v
Predecoded IR (fixed-width, cache-friendly)
|
v
Register IR (stack elimination, peephole opts)
| \
v v
RegIR Interpreter ARM64 / x86_64 JIT
(default) (HOT_THRESHOLD=3)
Hot functions are detected via call counting and back-edge counting, then compiled to native code. Functions using opcodes outside the JIT's coverage continue to run in the register-IR interpreter. JIT pages use W^X protection — code is RW during emit, then switched to RX before execution; signal handlers translate guard-page faults back into Wasm traps.
Small, full, fast — pick three. zwasm tries to keep the byte budget of an interpreter while delivering the feature set of a tier-1 runtime. The primary metric is performance per byte of binary; secondary metrics are spec fidelity and startup latency.
Spec fidelity over expedience. Every change runs the 62,263-test spec suite, the 796 E2E assertions, and the 50 real-world program suite. We don't keep "known limitations".
ARM64 + x86_64 first class. Apple Silicon is the primary optimization target; x86_64 Linux and Windows are equally supported. Both backends ship the same SIMD coverage.
zwasm follows Semantic Versioning. The public API surface is defined in docs/api-boundary.md.
- Stable types and functions (
WasmModule,WasmFn, etc.) won't break in minor/patch releases - Experimental types (
runtime.*, WIT) may change in minor releases - Deprecation: at least one minor version notice before removal
- Book (English) — getting started, architecture, embedding, CLI reference
- Book (日本語)
- API Boundary — stable vs experimental surface
- CHANGELOG
See CONTRIBUTING.md for build, workflow, and CI checks.
MIT.
Developed in spare time alongside a day job. Sponsorship via GitHub Sponsors is welcome and helps keep work going.