How Does Libaom Implement Palette Prediction?

Libaom, the reference codec implementation for the AV1 video coding format, implements palette prediction as a specialized intra-coding tool that improves compression efficiency for screen content such as computer desktops, text, and user interfaces. Unlike natural video, screen content features sharp edges, flat regions, and a small number of unique colors. The libaom encoder handles palette prediction by identifying the distinct colors in a block, mapping each pixel to a color index, and using a palette cache built from neighboring blocks to reduce the bitrate needed to signal color tables.

1. Candidate Evaluation and Block Constraints

Libaom does not attempt palette prediction on every part of a video frame because the palette search is computationally expensive. It first runs a screening step to detect whether the frame contains screen content. When screen content tools are enabled, libaom restricts palette mode to blocks that meet specific size constraints: a block must be at least 8x8 and must not exceed 64 in either width or height. If these criteria are met, the encoder goes on to search for a palette for the block.
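The size gate described above can be sketched in a few lines. This is a minimal illustration, not libaom's code: the function name palette_allowed and its parameters are hypothetical, and "at least 8x8" is assumed to mean that both dimensions are 8 or larger.

```python
def palette_allowed(width: int, height: int, screen_content_tools: bool) -> bool:
    """Return True when a block may be tested with palette mode.

    Mirrors only the constraints stated in the text: screen content tools
    enabled, both dimensions at least 8 (an assumption about "8x8"),
    and neither dimension above 64.
    """
    if not screen_content_tools:
        return False
    return min(width, height) >= 8 and max(width, height) <= 64


# Try a few block sizes with screen content tools switched on.
for size in [(4, 4), (8, 8), (16, 64), (64, 64), (128, 64)]:
    print(size, palette_allowed(*size, screen_content_tools=True))
```

With the flag on, only the 4x4 and 128x64 blocks are rejected. With the flag off, every block is rejected regardless of size.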

2. Color Selection via Clustering and Histograms

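To establish the actual palette for a valid coding block, libaom evaluates candidate palettes of up to eight colors using two main search techniques:

Histogram search: the encoder counts how often each color occurs in the block and takes the most frequent colors as palette candidates.

Clustering search: the encoder runs k-means clustering on the pixel values, so that each cluster centroid becomes a palette color.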

Through these combined searches, libaom selects a palette of 2 to 8 colors. Luma (Y) gets its own palette, while the U and V planes share a single chroma palette whose entries are (U, V) pairs. Pixels inside the block are then converted into an index map that refers to the selected colors.
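The two search styles named in this section's heading, plus the index-map step, can be illustrated on a tiny luma block. This is a simplified sketch under stated assumptions, not libaom's implementation: the helper names (top_colors, kmeans_1d, index_map) are hypothetical, the k-means is a plain one-dimensional Lloyd iteration, and the rate-distortion comparison that the encoder uses to pick between candidates is left out.

```python
from collections import Counter


def top_colors(pixels, n):
    """Histogram search: the n most frequent values, returned sorted."""
    counts = Counter(pixels)
    # Rank by descending count, then by value so ties resolve deterministically.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return sorted(value for value, _ in ranked[:n])


def nearest(value, centroids):
    """Index of the centroid closest to value (lowest index wins ties)."""
    return min(range(len(centroids)), key=lambda i: abs(value - centroids[i]))


def kmeans_1d(pixels, init, iterations=10):
    """Clustering search: refine integer centroids with Lloyd's algorithm."""
    centroids = list(init)
    for _ in range(iterations):
        sums = [0] * len(centroids)
        counts = [0] * len(centroids)
        for p in pixels:
            i = nearest(p, centroids)
            sums[i] += p
            counts[i] += 1
        # An empty cluster keeps its previous centroid.
        updated = [
            round(sums[i] / counts[i]) if counts[i] else centroids[i]
            for i in range(len(centroids))
        ]
        if updated == centroids:
            break
        centroids = updated
    return sorted(set(centroids))


def index_map(pixels, palette):
    """Replace each pixel by the index of its closest palette color."""
    return [nearest(p, palette) for p in pixels]


# A 4x4 luma block in raster order: dark text on a light background,
# with a few off-value pixels along the edge.
block = [16, 16, 235, 235,
         16, 18, 233, 235,
         16, 16, 235, 235,
         120, 16, 235, 235]

hist_palette = top_colors(block, 2)                           # [16, 235]
km_palette = kmeans_1d(block, init=[min(block), max(block)])  # [29, 235]
indices = index_map(block, hist_palette)
```

The histogram search keeps the exact dominant values, while k-means lets the outlier pixels pull the dark centroid from 16 up to 29. An encoder would compare such candidates by cost rather than assume one is always better.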

3. The Palette Predictor Cache and Signaling

Transmitting a brand-new color table for every block would add so much overhead that it could cancel out the compression gains. To avoid this, the encoder and decoder each build the same palette predictor cache, so that colors already known to both sides do not have to be sent again.

The cache holds the colors used by previously coded neighboring blocks (the blocks above and to the left). When coding a new block, libaom compares its newly generated palette against the cache. For each cache entry, the encoder sends a flag that says whether the new palette reuses that color. Colors that are not in the cache are signaled explicitly, with later new colors coded as deltas from the previous one. By sending reuse flags for cached colors and delta values only for new entries, libaom reduces the number of bits needed to code palettes in screen content.
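The reuse-flag and delta idea can be sketched as a round trip between an encoder and a decoder. This is a minimal illustration under stated assumptions: the function names (build_cache, encode_palette, decode_palette) are hypothetical, palettes are assumed to be sorted lists of unique values, and the actual bitstream details (bit widths, entropy coding, and rules about which neighbors may contribute) are omitted.

```python
def build_cache(above, left):
    """Merge the neighbors' palettes into one sorted list without duplicates."""
    return sorted(set(above) | set(left))


def encode_palette(palette, cache):
    """Split a sorted palette into reuse flags plus delta-coded new colors."""
    wanted = set(palette)
    # One flag per cache entry: True means "this block reuses that color".
    flags = [color in wanted for color in cache]
    reused = {c for c, f in zip(cache, flags) if f}
    new_colors = [c for c in palette if c not in reused]
    # The first new color is sent as-is; each later one is sent as the
    # difference from the previous new color.
    first = new_colors[0] if new_colors else None
    deltas = [cur - prev for prev, cur in zip(new_colors, new_colors[1:])]
    return flags, first, deltas


def decode_palette(flags, first, deltas, cache):
    """Rebuild the palette from the flags, the first new color, and the deltas."""
    colors = [c for c, f in zip(cache, flags) if f]
    if first is not None:
        colors.append(first)
        for d in deltas:
            colors.append(colors[-1] + d)
    return sorted(colors)


above_palette = [16, 128, 235]
left_palette = [16, 200]
cache = build_cache(above_palette, left_palette)   # [16, 128, 200, 235]

palette = [16, 64, 90, 235]
flags, first, deltas = encode_palette(palette, cache)
# flags  -> [True, False, False, True]
# first  -> 64
# deltas -> [26]

restored = decode_palette(flags, first, deltas, cache)
```

In this example two of the four palette colors cost only a flag each, and the two new colors cost one full value plus one small delta. The decoder arrives at the same palette because it builds the same cache from the same neighbors.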