WebAssembly Text Format S-Expression Syntax Rules

This article provides a concise guide to the syntax rules for writing S-expressions in the WebAssembly text format (WAT). It covers the fundamental structure of S-expressions, including parenthesized lists, identifiers, keywords, literals, and comments, helping developers write and understand WAT code efficiently.

Parentheses and Tree Structure

The core of the WebAssembly text format is the S-expression (symbolic expression), which represents a nested tree structure.

* Every S-expression is enclosed in parentheses: ( ... ).
* The first element inside the parentheses is typically a keyword (such as a module declaration, function definition, or instruction), followed by its parameters, children, or arguments.
* Example of a simple module structure:

    (module
      (func $add (param $x i32) (param $y i32) (result i32)
        (i32.add (local.get $x) (local.get $y))
      )
    )

Keywords and Instructions

Keywords define the operations, types, and module components in WebAssembly.

* Keywords are case-sensitive and must always be written in lowercase (e.g., module, func, i32.const, br_if).
* Instructions can be written in a folded S-expression format (nested parentheses) or in a flat, stack-based instruction list format.
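
The two instruction styles can be compared side by side. The sketch below writes the same addition twice, once folded and once flat; the function names $add_folded and $add_flat are made up for illustration.

```wat
(module
  ;; Folded form: operands are nested inside the instruction that consumes them.
  (func $add_folded (param $x i32) (param $y i32) (result i32)
    (i32.add (local.get $x) (local.get $y)))

  ;; Flat form: the same instructions listed in stack order, without parentheses.
  (func $add_flat (param $x i32) (param $y i32) (result i32)
    local.get $x
    local.get $y
    i32.add))
```

Both functions should assemble to the same instruction sequence; the folded form is only a notational convenience in the text format.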

Identifiers (Names)

Identifiers are optional names used to reference functions, globals, locals, parameters, and other module components.

* Every identifier must begin with a dollar sign ($), followed by at least one character.
* Subsequent characters can be letters, digits, or symbols such as underscores, hyphens, and periods (e.g., $my_function, $x, $var.1).
* Identifiers cannot contain spaces, parentheses, double quotes, commas, or semicolons.
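
As an illustration of these naming rules, the following sketch declares a few named items and one unnamed function. All names ($counter.max, $my_function, $tmp-1) are hypothetical and chosen only to show the permitted characters.

```wat
(module
  ;; A period is allowed inside an identifier.
  (global $counter.max i32 (i32.const 10))

  (func $my_function (param $x i32) (result i32)
    ;; A hyphen is allowed as well.
    (local $tmp-1 i32)
    (local.set $tmp-1 (i32.add (local.get $x) (global.get $counter.max)))
    (local.get $tmp-1))

  ;; Identifiers are optional: this function has no name, and it refers to
  ;; $my_function by its numeric index (0) instead of by identifier.
  (func (result i32)
    (call 0 (i32.const 1))))
```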

Numeric Literals

The text format supports several types of numeric literals:

* Integers: Can be written in decimal (e.g., 42, -105) or hexadecimal (e.g., 0x2A, -0x69). Hexadecimal numbers must be prefixed with a lowercase 0x; the hexadecimal digits themselves may be upper- or lowercase.
* Floating-Point Numbers: Can be written in decimal (e.g., 3.1415, -0.5) or hexadecimal (e.g., 0x1.5p+3). Special values like inf, -inf, and nan are also valid.
* Separators: You can use underscores (_) between digits to improve readability (e.g., 1_000_000 or 0xFF_00_AA). An underscore cannot directly follow the 0x prefix or appear at the start or end of a number.
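
The literal forms above can be gathered in one place. This is a minimal sketch: the function name $literals is hypothetical, and each value is pushed and immediately dropped so that the function body stays valid.

```wat
(module
  (func $literals
    (drop (i32.const 42))          ;; decimal integer
    (drop (i32.const -105))        ;; negative decimal integer
    (drop (i32.const 0x2A))        ;; hexadecimal integer, equal to 42
    (drop (i32.const -0x69))       ;; negative hexadecimal integer, equal to -105
    (drop (i64.const 1_000_000))   ;; underscores between decimal digits
    (drop (i32.const 0xFF_00_AA))  ;; underscores between hexadecimal digits
    (drop (f64.const 3.1415))      ;; decimal float
    (drop (f32.const -0.5))        ;; negative decimal float
    (drop (f64.const 0x1.5p+3))    ;; hexadecimal float: 1.3125 * 2^3 = 10.5
    (drop (f64.const inf))         ;; positive infinity
    (drop (f64.const -inf))        ;; negative infinity
    (drop (f32.const nan))))       ;; not-a-number
```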

String Literals

Strings are used for names of imports, exports, and data segment payloads.

* Strings must be enclosed in double quotes (e.g., "Hello, WebAssembly").
* Strings can contain UTF-8 encoded text.
* Escape sequences can be used for special characters, such as \n for newline, \t for tab, and \\ for backslash. Hexadecimal escapes are also supported using the format \hh (e.g., \0a for a newline).
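
The sketch below shows strings in each of the three roles mentioned above. The import "env" "log" is an assumed host function, not something a real environment is guaranteed to provide, and the memory offsets are arbitrary.

```wat
(module
  ;; Import names are strings: module name "env", field name "log" (both assumed).
  (import "env" "log" (func $log (param i32 i32)))

  ;; Export names are strings too.
  (memory (export "memory") 1)

  ;; Data segment payloads: \n and \0a both encode a single newline byte.
  (data (i32.const 0) "Hello, WebAssembly\n")
  (data (i32.const 32) "tab:\t backslash:\\ newline:\0a")

  ;; Pass the offset (0) and byte length (19) of the first string to the host.
  (func (export "greet")
    (call $log (i32.const 0) (i32.const 19))))
```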

Comments

Comments allow you to document your WAT code and are ignored by the compiler.

* Single-line comments: Begin with double semicolons ;; and continue to the end of the line.
* Block comments: Begin with (; and end with ;). Block comments can span multiple lines and can be nested inside other block comments.
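
Both comment styles, including a nested block comment, might look like this in practice. The function name $answer is hypothetical.

```wat
(module
  ;; A line comment runs to the end of the line.
  (func $answer (result i32)
    (; a block comment can sit inline ;)
    (i32.const 42) ;; a line comment can trail an instruction
  )
  (;
    Block comments can span several lines
    (; and can be nested ;)
    so this whole region is skipped.
  ;)
)
```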

Whitespace and Formatting