How Do Tile Counts Affect libaom Threading?
This article explores the direct relationship between tile configuration and multi-threading efficiency in the libaom AV1 encoder library. It covers how partitioning video frames into independent tile rows and columns allows the encoder to distribute workloads across multiple CPU cores, while also addressing the compression efficiency trade-offs associated with higher tile counts.
Understanding Tiles in AV1 and libaom
In the AV1 video coding format, a tile is a self-contained, rectangular region of a video frame that can be encoded and decoded independently of other regions. The libaom encoder leverages this architectural feature as its primary mechanism for multi-threading. By breaking a large frame into smaller grid pieces, libaom can assign individual tiles to separate threads, allowing a multi-core processor to work on different parts of the same frame simultaneously.
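As a rough illustration of that grid, the sketch below splits a frame into tile rectangles. It assumes 64x64 superblocks and uniformly spaced tiles specified as \(\log_2\) counts. The names `split_uniform` and `tile_grid` are hypothetical helpers, not libaom functions, and the real encoder may choose 128x128 superblocks or apply tile-size limits that this sketch ignores.

```python
import math

SB_SIZE = 64  # assumed superblock size in pixels (AV1 also allows 128)


def split_uniform(length_px, log2_count):
    """Split one frame dimension into tile spans of (start, size) in pixels."""
    sb_total = math.ceil(length_px / SB_SIZE)
    # Tile size in superblocks, rounded up so the tiles cover the frame.
    tile_sb = (sb_total + (1 << log2_count) - 1) >> log2_count
    spans = []
    for start_sb in range(0, sb_total, tile_sb):
        start = start_sb * SB_SIZE
        # The last tile is clipped to the frame edge.
        end = min((start_sb + tile_sb) * SB_SIZE, length_px)
        spans.append((start, end - start))
    return spans


def tile_grid(width, height, log2_cols, log2_rows):
    """Return (x, y, w, h) tile rectangles in raster order."""
    cols = split_uniform(width, log2_cols)
    rows = split_uniform(height, log2_rows)
    return [(x, y, w, h) for (y, h) in rows for (x, w) in cols]


if __name__ == "__main__":
    for rect in tile_grid(1920, 1080, log2_cols=1, log2_rows=0):
        print(rect)
```

Under these assumptions, a 1080p frame with one \(\log_2\) tile column is split into two 960x1080 tiles, each of which could be handed to its own thread.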
How Tile Counts Drive Threading Efficiency
The relationship between the number of tiles and threading performance boils down to a fundamental rule: with tile-based threading alone, the number of threads that can work on a single frame in parallel is bounded by the number of tiles.
- Tile Columns and Rows: You can configure tiles using the --tile-columns and --tile-rows parameters (usually expressed in \(\log_2\) units). For example, setting --tile-columns=2 creates \(2^2 = 4\) tile columns.
- Thread Utilization: If you set libaom to use 8 threads (--threads=8) but only specify 4 tiles, 4 of those threads will sit idle during the tile-encoding phase unless row-based multi-threading is enabled. To fully utilize 8 threads on a single frame through tiles alone, you need at least 8 tiles.
- Row-Based Multi-Threading: libaom also features row-based multi-threading (--row-mt=1). When enabled, it allows threads to work on different superblock rows within the same tile simultaneously, significantly boosting efficiency even when tile counts are low. Combining explicit tiles with row-mt yields the best scalability on high-core-count CPUs.
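The tile arithmetic in the list above can be sketched in a few lines of Python. This models only tile-level parallelism (no row-mt) and assumes the requested tile counts are actually honored for the frame size. The names `tile_count` and `idle_threads` are illustrative, not libaom APIs.

```python
def tile_count(log2_cols, log2_rows=0):
    """Total tiles for log2-encoded --tile-columns / --tile-rows values."""
    return (1 << log2_cols) * (1 << log2_rows)


def idle_threads(threads, log2_cols, log2_rows=0):
    """Threads left without a tile when only tile-level threading is used."""
    return max(0, threads - tile_count(log2_cols, log2_rows))


if __name__ == "__main__":
    # --threads=8 with --tile-columns=2: 4 tiles, so 4 threads have no tile.
    print(tile_count(2), idle_threads(8, 2))
    # Adding --tile-rows=1 doubles the grid to 8 tiles and occupies all 8 threads.
    print(tile_count(2, 1), idle_threads(8, 2, 1))
```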
The Trade-off: Threading Speed vs. Video Quality
While increasing tile counts unlocks higher encoding speeds, it introduces a penalty to compression efficiency. Because tiles are coded independently, the encoder cannot use intra prediction, motion vector prediction from neighboring blocks, or entropy-coding context across tile boundaries, so blocks near a boundary are coded with less context.
- Lower Compression Efficiency: More tiles mean more boundaries, which forces the encoder to spend more bits to achieve the same visual quality.
- Quality Loss at a Fixed Bitrate: At a fixed bitrate, a high tile count can lower overall visual fidelity or produce visible blocking artifacts along the tile edges.
Finding the Sweet Spot
Optimizing libaom requires balancing your target encoding speed against acceptable quality loss. For standard 1080p video, a common configuration is 2 tile columns (--tile-columns=1, resulting in 2 tiles), which pairs well with row-mt to saturate 4 to 8 threads efficiently without severely degrading the video quality. For 4K video, higher tile counts (e.g., --tile-columns=2) are generally required and acceptable, as the larger frame area reduces the relative overhead of the tile boundaries.
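As a minimal sketch of how these suggestions might translate into an aomenc invocation, the helper below assembles a command line from the flags discussed in this article plus aomenc's -o output option. The function name `aomenc_command`, the preset table, and the file names are placeholders chosen for illustration, and the thread count is an input you would pick for your own machine.

```python
import shlex

# log2 tile-column suggestions taken from the text above (not libaom defaults).
TILE_COLUMNS_LOG2 = {"1080p": 1, "4k": 2}


def aomenc_command(resolution, threads, src, dst):
    """Build an aomenc argument list for the given resolution preset."""
    log2_cols = TILE_COLUMNS_LOG2[resolution]
    return [
        "aomenc",
        f"--threads={threads}",
        f"--tile-columns={log2_cols}",
        "--row-mt=1",  # row-based multi-threading alongside tiles
        "-o", dst,
        src,
    ]


if __name__ == "__main__":
    # input.y4m and output.webm are placeholder file names.
    cmd = aomenc_command("1080p", 8, "input.y4m", "output.webm")
    print(shlex.join(cmd))
```

Returning a list rather than a single string keeps the arguments safe to pass to `subprocess.run` without shell quoting issues.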