How Does Libaom Handle Color Spaces?
This article provides a technical overview of how
libaom, the reference software library for the AV1 video
codec, defines and manages color spaces and pixel formats. It explores
the internal enumerations used for color primaries, transfer
characteristics, and matrix coefficients, alongside the library’s
approach to chroma subsampling and bit depth configuration.
Understanding these mechanisms is essential for developers looking to
ensure accurate color reproduction and efficient video encoding when
working with the AV1 ecosystem.
Internal Color Space Representations
Libaom relies on standard specifications defined by the ITU-T H.273 /
ISO/IEC 23091-2 standards to manage color representation. It categorizes
color properties into three distinct components, typically mapped via
the aom_color_primaries,
aom_transfer_characteristics, and
aom_matrix_coefficients enums in the API.
- Color Primaries: Defines the chromaticity coordinates of the source red, green, and blue color channels (e.g., BT.601, BT.709, or BT.2020/HDR).
- Transfer Characteristics: Dictates the opto-electronic transfer function (OETF) or “gamma curve” used to map linear light to digital values (e.g., sRGB, PQ/ST.2084, or HLG).
- Matrix Coefficients: Specifies how the red, green, and blue signals are matrixed into luma and chroma channels (e.g., YCbCr or YCoCg).
By separating these properties, libaom ensures full compatibility with a broad spectrum of color spaces, spanning standard definition (SD), high definition (HD), and high dynamic range (HDR) workflows.
Pixel Formats and Chroma Subsampling
To accommodate various compression and fidelity requirements, libaom
supports multiple pixel formats. These are primarily defined by their
chroma subsampling structures and are handled via the
aom_img_fmt enumeration.
The library natively handles the following configurations:
- YUV 4:2:0: The most common format for web delivery, where chroma channels are subsampled both horizontally and vertically to save bandwidth.
- YUV 4:2:2: Often used in broadcast and editing environments, offering horizontal subsampling but full vertical resolution.
- YUV 4:4:4: A high-fidelity format with no subsampling, preserving full color detail for complex graphics or professional mastering.
Additionally, libaom accounts for the specific alignment of chroma samples relative to luma samples (colocated versus disparately sampled), which is critical for accurate color decoding and preventing artifacting during playback.
Bit Depth Handling
Libaom is designed for both consumer-grade and professional video pipelines, supporting 8-bit, 10-bit, and 12-bit color depths. The library scales its internal processing pipelines based on the requested configuration:
- Standard Profiles (8-bit): Optimizes for performance and compatibility with standard displays.
- High Profiles (10-bit and 12-bit): Utilizes larger data structures internally (such as 16-bit integer containers) to maintain precision during complex mathematical encoding processes like motion estimation and intra-prediction, minimizing banding artifacts in gradients.
During initialization, developers pass configuration structures
(aom_codec_enc_cfg_t) to explicitly dictate the input and
output bit depths, ensuring the encoder allocates the appropriate memory
and processing pathways.