Wasm-R3: Record-Reduce-Replay for Realistic and Standalone WebAssembly Benchmarks
WebAssembly (Wasm for short) brings a new, powerful capability to the web as well as edge, IoT, and embedded systems. Wasm is a portable, compact binary code format with high performance and robust sandboxing properties. As Wasm applications grow in size and importance, the complex performance characteristics of diverse Wasm engines demand robust, representative benchmarks for proper tuning. Stopgap benchmark suites, such as PolyBenchC and libsodium, continue to be used in the literature, though they are known to be unrepresentative. Yet porting more complex suites is difficult because Wasm lacks many system APIs, and extracting real-world Wasm benchmarks from the web is difficult due to complex host interactions. To address this challenge, we introduce \emph{Wasm-R3}, the first record-and-replay technique for Wasm. Wasm-R3 transparently injects instrumentation into Wasm modules to \emph{record} an execution trace from inside the module, then \emph{reduces} the execution trace via several optimizations, and finally produces a \emph{replay} module that is executable standalone, without any host environment, on any engine. The benchmarks created by our approach are (i) realistic, because the approach records real-world web applications, (ii) faithful to the original execution, because the replay module includes the unmodified original code, only adding emulation of host interactions, and (iii) standalone, because the benchmarks run on any engine. Applying Wasm-R3 to web-based Wasm applications in the wild demonstrates the correctness of our approach as well as the effectiveness of our optimizations, which reduce the recorded traces by 99.53% and the size of the replay module by 9.98%. We release the resulting benchmark suite of 27 applications, called \emph{Wasm-R3-Bench}, to the community, to inspire a new generation of realistic and standalone Wasm benchmarks.
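To make the record, reduce, and replay phases concrete, the following is a minimal toy sketch in Python, not the Wasm-R3 implementation. It models a module as a plain function that calls imported host functions. All names (`record`, `reduce_trace`, `make_replay_host`, `app`) are hypothetical, and run-length encoding is only a stand-in for the trace optimizations, which the abstract does not detail.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical model: a host import is a zero-argument function returning an int.
Host = Dict[str, Callable[[], int]]
Event = Tuple[str, int]  # (import name, value the host returned)


def record(app: Callable[[Host], int], host: Host) -> Tuple[int, List[Event]]:
    """Run `app` against the real host and log the result of every host call."""
    trace: List[Event] = []

    def wrap(name: str, fn: Callable[[], int]) -> Callable[[], int]:
        def wrapped() -> int:
            value = fn()
            trace.append((name, value))
            return value
        return wrapped

    result = app({name: wrap(name, fn) for name, fn in host.items()})
    return result, trace


def reduce_trace(trace: List[Event]) -> List[Tuple[Event, int]]:
    """Stand-in reduction: run-length encode consecutive identical events."""
    reduced: List[Tuple[Event, int]] = []
    for event in trace:
        if reduced and reduced[-1][0] == event:
            reduced[-1] = (event, reduced[-1][1] + 1)
        else:
            reduced.append((event, 1))
    return reduced


def make_replay_host(reduced: List[Tuple[Event, int]]) -> Host:
    """Build stub imports that return the recorded values in recorded order."""
    queues: Dict[str, List[int]] = {}
    for (name, value), count in reduced:
        queues.setdefault(name, []).extend([value] * count)
    iterators = {name: iter(values) for name, values in queues.items()}
    return {name: (lambda it=it: next(it)) for name, it in iterators.items()}


def app(imports: Host) -> int:
    """Toy 'module': its result depends only on what the host returns."""
    total = 0
    for _ in range(3):
        total += imports["now"]()
    return total + imports["rand"]()


# Record against a "real" host, reduce the trace, then replay without that host.
real_host: Host = {"now": lambda: 100, "rand": lambda: 7}
recorded_result, trace = record(app, real_host)
reduced = reduce_trace(trace)
replayed_result = app(make_replay_host(reduced))
```

In this sketch the unmodified `app` runs in both phases and only the imports change, which mirrors the faithfulness argument above: the replay reproduces the original result from stubs alone, with no access to the original host.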