How does libaom parse the AV1 sequence header OBU?

The AOMedia Video 1 (AV1) bitstream is structured into Open Bitstream Units (OBUs), with the Sequence Header OBU serving as the foundational block that defines global coding parameters such as profile, level, tier, and color configuration. In libaom, the reference software implementation for AV1, parsing this header is primarily handled by the internal function aom_read_sequence_header_obu. This article explores the specific code path, data structures, and bitstream-reading mechanisms libaom employs to decode the sequence header and initialize the decoder state.

The Entry Point: Recognizing the OBU Type

Before parsing the sequence header itself, libaom processes the OBU header, which is present at the start of every OBU. The decoder extracts obu_type, a 4-bit field held in bits 6 to 3 of the first byte, with a shift and a mask. When obu_type equals OBU_SEQUENCE_HEADER (value 1), the decoder routes the payload to the dedicated sequence header parsing path. At this stage, the driver wraps the payload in a raw bit reader (struct aom_read_bit_buffer) so that the uncompressed header bits can be read sequentially with bounds checking. The arithmetic-decoding aom_reader is not involved here; it is used for entropy-coded tile data.
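The first-byte layout described above can be illustrated with a small standalone sketch. It follows the OBU header layout in the AV1 specification (forbidden bit, 4-bit type, extension flag, size-field flag, reserved bit); the SketchObuHeader type and the sketch_ function name are hypothetical and are not libaom identifiers.

```c
#include <stddef.h>
#include <stdint.h>

/* OBU type values from the AV1 specification (subset used here). */
enum { SKETCH_OBU_SEQUENCE_HEADER = 1, SKETCH_OBU_TEMPORAL_DELIMITER = 2 };

/* Hypothetical container for this sketch; libaom has its own header type. */
typedef struct {
  int type;           /* obu_type, 4 bits */
  int has_extension;  /* obu_extension_flag */
  int has_size_field; /* obu_has_size_field */
} SketchObuHeader;

/* Parses the first OBU header byte. Returns 0 on success, -1 on error. */
static int sketch_parse_obu_header_byte(const uint8_t *data, size_t size,
                                        SketchObuHeader *hdr) {
  if (data == NULL || hdr == NULL || size < 1) return -1;
  const uint8_t b = data[0];
  if (b & 0x80) return -1;            /* obu_forbidden_bit must be 0 */
  hdr->type = (b >> 3) & 0x0F;        /* bits 6..3 */
  hdr->has_extension = (b >> 2) & 1;  /* bit 2 */
  hdr->has_size_field = (b >> 1) & 1; /* bit 1; bit 0 is reserved */
  return 0;
}
```

For example, a first byte of 0x0A decodes to obu_type 1 (sequence header) with no extension and a size field present.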

Core Architecture and aom_read_sequence_header_obu

The function this article calls aom_read_sequence_header_obu corresponds, in recent libaom sources, to the static function read_sequence_header_obu in av1/decoder/obu.c, which delegates much of the field-by-field work to av1_read_sequence_header in av1/decoder/decodeframe.c. Together they map the raw bitstream syntax elements directly onto the internal C structure SequenceHeader.

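The parsing process follows the strict sequential order defined by the AV1 specification: seq_profile (3 bits), still_picture, and reduced_still_picture_header come first, followed by timing and operating-point information (including the level and tier of each operating point), the maximum frame width and height, frame-ID settings, and the superblock size.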
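As a rough illustration of that sequential reading, the sketch below pulls the first three syntax elements out of a byte buffer with a tiny MSB-first bit reader. The field widths and the two validity checks follow the AV1 specification; the SketchBitReader type and all sketch_ names are hypothetical stand-ins, not libaom's actual reader or structures.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical MSB-first bit reader used only for this sketch. */
typedef struct {
  const uint8_t *buf;
  size_t size_bytes;
  size_t bit_pos;
  int error; /* set when a read runs past the end of the buffer */
} SketchBitReader;

static int sketch_read_bit(SketchBitReader *br) {
  const size_t byte = br->bit_pos >> 3;
  if (byte >= br->size_bytes) {
    br->error = 1;
    return 0;
  }
  const int bit = (br->buf[byte] >> (7 - (br->bit_pos & 7))) & 1;
  br->bit_pos++;
  return bit;
}

static uint32_t sketch_read_literal(SketchBitReader *br, int n) {
  uint32_t v = 0;
  for (int i = 0; i < n; i++) v = (v << 1) | (uint32_t)sketch_read_bit(br);
  return v;
}

/* Only the leading fields of the sequence header. */
typedef struct {
  int seq_profile;
  int still_picture;
  int reduced_still_picture_header;
} SketchSeqHeaderStart;

/* Returns 0 on success, -1 on truncated or non-conforming input. */
static int sketch_read_seq_header_start(SketchBitReader *br,
                                        SketchSeqHeaderStart *out) {
  out->seq_profile = (int)sketch_read_literal(br, 3);
  out->still_picture = sketch_read_bit(br);
  out->reduced_still_picture_header = sketch_read_bit(br);
  if (br->error) return -1;
  if (out->seq_profile > 2) return -1; /* only profiles 0..2 are defined */
  /* A reduced still-picture header requires still_picture to be set. */
  if (out->reduced_still_picture_header && !out->still_picture) return -1;
  return 0;
}
```

Keeping the error flag inside the reader lets the caller read several fields in a row and check for truncation once, which is one common way to keep header parsers compact.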

Color Config and Tool Enablement Flags

Once the foundational properties are set, aom_read_sequence_header_obu calls a helper function (av1_read_color_config in recent libaom sources) to parse the chroma and luma parameters. This subroutine decodes the bit depth (8, 10, or 12 bits), the monochrome flag, color primaries, transfer characteristics, matrix coefficients, color range, chroma subsampling, and the chroma sample position.
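The bit depth is not stored as a number in the bitstream; it is derived from the profile and one or two flags. A minimal sketch of that derivation, following the color_config syntax in the AV1 specification, is shown here. The function name is hypothetical, and the flags are assumed to have been read already.

```c
/* Derives the bit depth from seq_profile, high_bitdepth, and twelve_bit.
 * twelve_bit is only present in the bitstream when seq_profile == 2 and
 * high_bitdepth == 1; it is ignored otherwise.
 * Returns 8, 10, or 12, or -1 for a profile outside 0..2. */
static int sketch_bit_depth(int seq_profile, int high_bitdepth,
                            int twelve_bit) {
  if (seq_profile < 0 || seq_profile > 2) return -1;
  if (seq_profile == 2 && high_bitdepth) return twelve_bit ? 12 : 10;
  return high_bitdepth ? 10 : 8;
}
```

Under this rule only profile 2 can signal 12-bit content, while profiles 0 and 1 are limited to 8 or 10 bits.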

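Around the color configuration, libaom parses a dense sequence of mostly single-bit flags that globally enable or disable advanced coding tools for the entire video sequence. In the specification's syntax order, most of these flags precede color_config, and film_grain_params_present follows it. These include enable_filter_intra, enable_intra_edge_filter, enable_interintra_compound, enable_masked_compound, enable_warped_motion, enable_dual_filter, enable_order_hint, enable_superres, enable_cdef, enable_restoration, and film_grain_params_present.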

By reading these as boolean values, libaom records which syntax elements the decoder should expect, or skip entirely, when it parses subsequent Frame Headers and Tile Groups.
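To make the gating concrete, the sketch below reads three consecutive sequence-level flags (enable_superres, enable_cdef, enable_restoration, in the order the AV1 specification gives them) and then shows how a frame-level parser might consult one of them. The CDEF condition mirrors the specification's cdef_params syntax; every type and function name here is hypothetical rather than taken from libaom.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical MSB-first bit reader used only for this sketch. */
typedef struct {
  const uint8_t *buf;
  size_t size_bytes;
  size_t bit_pos;
  int error;
} SketchBitReader;

static int sketch_read_bit(SketchBitReader *br) {
  const size_t byte = br->bit_pos >> 3;
  if (byte >= br->size_bytes) {
    br->error = 1;
    return 0;
  }
  const int bit = (br->buf[byte] >> (7 - (br->bit_pos & 7))) & 1;
  br->bit_pos++;
  return bit;
}

/* Three sequence-level tool flags that are adjacent in the syntax. */
typedef struct {
  int enable_superres;
  int enable_cdef;
  int enable_restoration;
} SketchPostFilterFlags;

/* Reads the three flags in bitstream order. Returns 0 on success. */
static int sketch_read_post_filter_flags(SketchBitReader *br,
                                         SketchPostFilterFlags *f) {
  f->enable_superres = sketch_read_bit(br);
  f->enable_cdef = sketch_read_bit(br);
  f->enable_restoration = sketch_read_bit(br);
  return br->error ? -1 : 0;
}

/* Frame-level gate: CDEF parameters are present in a frame header only if
 * the sequence enables CDEF and the frame is neither coded lossless nor
 * using intra block copy. */
static int sketch_frame_has_cdef_params(const SketchPostFilterFlags *f,
                                        int coded_lossless,
                                        int allow_intrabc) {
  return f->enable_cdef && !coded_lossless && !allow_intrabc;
}
```

When the sequence-level flag is 0, the frame header parser skips the CDEF syntax entirely, which is the "expect or skip" behavior described above.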

Finalizing the Sequence State

The parsing function concludes by checking the trailing bits, a single 1 bit followed by zero bits up to the byte boundary, to verify that the header was consumed as expected. Once aom_read_sequence_header_obu returns successfully, the populated SequenceHeader structure is cached in the main decoder context (AV1Decoder). This data remains the baseline for the decoder until a new Sequence Header OBU arrives in the stream, so that all subsequent frame buffers and decoding loops conform to the signaled profile, level, and frame-size limits.
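The trailing-bits check can be sketched as a small standalone function. It assumes the caller knows the bit position just past the last syntax element and verifies only the remainder of that byte (one 1 bit, then zeros); the function name and signature are hypothetical and do not reproduce libaom's own check.

```c
#include <stddef.h>
#include <stdint.h>

/* Verifies the trailing-bits pattern starting at bit_pos: a 1 bit followed
 * by 0 bits up to the next byte boundary.
 * Returns 0 if the pattern matches, -1 otherwise. */
static int sketch_check_trailing_bits(const uint8_t *buf, size_t size_bytes,
                                      size_t bit_pos) {
  const size_t byte = bit_pos >> 3;
  if (buf == NULL || byte >= size_bytes) return -1;
  /* Number of bits left in the current byte, from 1 to 8. */
  const unsigned bits_left = 8u - (unsigned)(bit_pos & 7);
  const unsigned mask = (1u << bits_left) - 1u;
  const unsigned tail = buf[byte] & mask;
  /* Expect exactly the top remaining bit set and all lower bits clear. */
  return tail == (1u << (bits_left - 1)) ? 0 : -1;
}
```

If the payload happens to end exactly on a byte boundary, the pattern occupies a whole extra byte (0x80), which is why the function treats a position at the end of the buffer as an error.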