WebAssembly Text Format S-Expression Syntax Rules
This article provides a concise guide to the syntax rules for writing S-expressions in the WebAssembly text format (WAT). It covers the fundamental structure of S-expressions, including parenthesized lists, identifiers, keywords, literals, and comments, helping developers write and understand WAT code efficiently.
Parentheses and Tree Structure
The core of the WebAssembly text format is the S-expression (symbolic expression), a notation for nested tree structures.
- Every structural element, from the module itself down to a folded instruction, is written as a parenthesized list: ( ... ).
- The first element inside the parentheses is a keyword (such as module, func, or an instruction name), followed by its parameters, children, or arguments.

Example of a simple module structure:
```wat
(module
  (func $add (param $x i32) (param $y i32) (result i32)
    (i32.add (local.get $x) (local.get $y))
  )
)
```
Keywords and Instructions
Keywords name the operations, types, and module components in WebAssembly.
- Keywords are case-sensitive, and every keyword defined by the format is lowercase (e.g., module, func, i32.const, br_if).
- Instructions can be written in folded form, where operands are nested in parentheses inside the instruction that consumes them, or in flat form, as a plain sequence of instructions in stack order.
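
The two instruction styles are easier to compare side by side. The following is a minimal sketch; the function names $add_folded and $add_flat are made up for illustration, and both bodies are intended to perform the same addition.

```wat
(module
  ;; Folded form: operands are nested inside the instruction that consumes them.
  (func $add_folded (param $x i32) (param $y i32) (result i32)
    (i32.add (local.get $x) (local.get $y)))

  ;; Flat form: the same instructions in stack order, without extra parentheses.
  (func $add_flat (param $x i32) (param $y i32) (result i32)
    local.get $x
    local.get $y
    i32.add)
)
```

In the folded form the operands are evaluated first, so reading a folded expression from the innermost parentheses outward gives the same order as the flat list.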
Identifiers (Names)
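Identifiers are optional names used to reference functions, globals, locals, parameters, and other module components. A component without an identifier is referenced by its numeric index instead.
- Every identifier must begin with a dollar sign ($), followed by at least one character.
- Subsequent characters can be letters, digits, or printable ASCII symbols such as underscores, hyphens, and periods (e.g., $my_function, $x, $var.1).
- Identifiers cannot contain spaces, parentheses, double quotes, commas, semicolons, square brackets, or curly braces.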
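
As an illustration, the sketch below mixes named and index-based references. All identifiers ($square, $n, $var.1-demo) are hypothetical, and the index comments assume functions are numbered from 0 in definition order, since this module has no imports.

```wat
(module
  ;; Function index 0, also reachable by the name $square.
  (func $square (param $n i32) (result i32)
    (i32.mul (local.get $n) (local.get $n)))

  ;; Function index 1 has no identifier, and neither does its parameter.
  (func (param i32) (result i32)
    (call $square (local.get 0)))   ;; parameter referenced by index 0

  ;; Identifiers may contain symbols such as periods and hyphens.
  (func $var.1-demo (result i32)
    (call 1 (i32.const 7)))         ;; function referenced by index 1
)
```

Names exist only in the text format for readability; each one resolves to the same numeric index that an unnamed reference would use.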
Numeric Literals
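WebAssembly supports several types of numeric literals:
- Integers: Can be written in decimal (e.g., 42, -105) or hexadecimal (e.g., 0x2A, -0x69). Hexadecimal numbers must be prefixed with a lowercase 0x; the uppercase form 0X is not accepted. The hexadecimal digits themselves may be upper- or lowercase.
- Floating-Point Numbers: Can be written in decimal (e.g., 3.1415, -0.5) or hexadecimal (e.g., 0x1.5p+3, where the exponent after p is a power of two). Special values like inf, -inf, and nan are also valid.
- Separators: You can use underscores (_) between digits to improve readability (e.g., 1_000_000 or 0xFF_00_AA). An underscore cannot appear directly after the 0x prefix or at the start or end of a number.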
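
The literal forms above can be tried out as constant initializers. This is a small sketch with hypothetical global names; note that underscores sit between digits, never directly after the 0x prefix.

```wat
(module
  (global $answer       i32 (i32.const 42))
  (global $answer_hex   i32 (i32.const 0x2A))        ;; same value as 42
  (global $negative     i32 (i32.const -0x69))       ;; same value as -105
  (global $million      i32 (i32.const 1_000_000))   ;; underscores between digits
  (global $color        i32 (i32.const 0xFF_00_AA))
  (global $pi           f64 (f64.const 3.1415))
  (global $hex_float    f64 (f64.const 0x1.5p+3))    ;; 1.3125 * 2^3 = 10.5
  (global $neg_infinity f32 (f32.const -inf))
  (global $not_a_number f32 (f32.const nan))
)
```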
String Literals
Strings are used for import and export names and for data segment payloads.
- Strings must be enclosed in double quotes (e.g., "Hello, WebAssembly").
- Strings can contain UTF-8 encoded text.
- Escape sequences can be used for special characters, such as \n for newline, \t for tab, \" for a double quote, and \\ for backslash. Hexadecimal escapes use the format \hh, with exactly two hexadecimal digits encoding one byte (e.g., \0a for a newline).
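
The sketch below shows strings in each of the roles mentioned above. The import names "env" and "log", the export names, and the memory offsets are assumptions chosen for illustration; a real host environment defines its own.

```wat
(module
  ;; Import names are strings: a module name and a field name.
  (import "env" "log" (func $log (param i32 i32)))

  ;; Export names are strings too.
  (memory (export "memory") 1)

  ;; Data segment payloads: 18 characters plus one newline byte = 19 bytes.
  (data (i32.const 0) "Hello, WebAssembly\n")

  ;; Escapes: tab, double quote, backslash, and a two-digit hex byte.
  (data (i32.const 32) "tab:\t quote:\" backslash:\\ hex:\0a")

  (func (export "greet")
    ;; Passes the offset and byte length of the first string to the host.
    (call $log (i32.const 0) (i32.const 19)))
)
```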
Comments
Comments allow you to document your WAT code and are ignored by the parser.
- Single-line comments: Begin with double semicolons (;;) and continue to the end of the line.
- Block comments: Begin with (; and end with ;). Block comments can span multiple lines and can be nested inside other block comments.
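
A short sketch of both comment styles, including an inline block comment and a nested one. The function name $noop and parameter name $unused are placeholders.

```wat
(module
  ;; A line comment runs to the end of the line.
  (func $noop (; an inline block comment ;) (param $unused i32)
    nop)

  (; A block comment can span several lines
     (; and can contain a nested block comment ;)
     before the outer comment closes. ;)
)
```

Because block comments nest, a region that already contains block comments can be commented out by wrapping it in another (; ... ;) pair.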
Whitespace and Formatting
- Whitespace characters (spaces, tabs, and newlines) serve as separators between tokens.
- Extra whitespace is ignored by the parser, meaning indentation is purely cosmetic but highly recommended to make the nested tree structure readable.
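
To illustrate, the sketch below writes the same function twice, once on a single line and once indented. The function names are hypothetical; the two bodies use the same sequence of tokens apart from the name, so a parser should treat them alike.

```wat
(module
  ;; Compact: valid, but the nesting is hard to follow.
  (func $inc_compact (param $x i32) (result i32) (i32.add (local.get $x) (i32.const 1)))

  ;; Indented: one nesting level per indentation step.
  (func $inc_indented
    (param $x i32)
    (result i32)
    (i32.add
      (local.get $x)
      (i32.const 1)))
)
```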