Why is libaom Slower Than libvpx?

This article analyzes the main reasons the libaom encoder runs slower than libvpx. libvpx is Google's encoder for the older VP8 and VP9 formats, whereas libaom is the reference encoder for AV1, the successor format developed by the Alliance for Open Media (AOMedia), of which Google is a founding member. The performance gap stems from AV1's larger and more elaborate set of coding tools, the much larger search space the encoder must explore to use them well, and heavier in-loop filtering.

Architectural Complexity and Coding Tools

The most significant driver of the speed gap is the complexity of the AV1 format relative to VP9. To reach its higher compression efficiency (often cited as 20-30% better than VP9), libaom implements more coding tools, and more elaborate ones, and each tool adds candidates for the encoder to evaluate.

Exponentially Larger Search Space

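During encoding, the encoder must choose a combination of block partitioning, prediction modes, motion vectors, and transform types that minimizes bitrate for a given quality. This process is known as Rate-Distortion Optimization (RDO). AV1 offers far more choices per block than VP9: superblocks of up to 128x128 pixels instead of 64x64, 10 partition types instead of 4, and 16 transform types instead of 4, along with more prediction modes. Because these choices multiply together and partitioning is recursive, the search space grows combinatorially. libaom prunes this search with heuristics rather than testing every permutation, but even a pruned search evaluates many more candidates than libvpx does, which takes considerably more time.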
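The decision rule behind RDO, and the way partition choices compound, can be sketched in a few lines of Python. This is a toy model and not libaom's actual search: the candidate labels, distortion values, and bit counts are made up for illustration, the helper names are hypothetical, and the recursive count ignores the block-size restrictions that real codecs place on each partition type. The only figures taken from the formats are the partition-type counts (4 for VP9, 10 for AV1).

```python
def rd_cost(distortion, rate_bits, lam):
    """Lagrangian rate-distortion cost J = D + lambda * R."""
    return distortion + lam * rate_bits


def pick_best(candidates, lam):
    """Return the (label, distortion, rate_bits) tuple with the lowest RD cost."""
    return min(candidates, key=lambda c: rd_cost(c[1], c[2], lam))


def count_partition_trees(partition_types, depth):
    """Count the ways to partition one block in a toy recursive model.

    At every level the encoder may pick one of (partition_types - 1)
    terminal shapes, or split the block into four quadrants that are each
    partitioned again. At depth 0 the block cannot be divided any further,
    so there is exactly one choice.
    """
    if depth == 0:
        return 1
    sub = count_partition_trees(partition_types, depth - 1)
    return (partition_types - 1) + sub ** 4


if __name__ == "__main__":
    # Hypothetical candidates: (label, distortion, rate in bits).
    candidates = [
        ("large block, DCT", 900.0, 40),
        ("split block, DCT", 500.0, 95),
        ("split block, ADST", 480.0, 110),
    ]
    print(pick_best(candidates, lam=5.0)[0])
    for name, types in (("VP9-like", 4), ("AV1-like", 10)):
        print(name, [count_partition_trees(types, d) for d in range(1, 4)])
```

In this model, two levels of recursion already give 259 possible layouts with 4 partition types and 10009 with 10, before any prediction mode or transform type is considered. Every extra candidate costs another evaluation of the RD cost, which is why real encoders rely on pruning heuristics and speed presets.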

Filter and Post-Processing Overhead

AV1 adds two in-loop filters that clean up compression artifacts before a frame is used as a reference for later frames: the Constrained Directional Enhancement Filter (CDEF) and the Loop Restoration Filter. Both run after the deblocking filter, which is the only in-loop filter in VP9. CDEF estimates the dominant edge direction of each block and filters along it, while loop restoration applies a Wiener or self-guided filter. libaom spends significant time not only applying these filters but also searching for the filter parameters to signal in the bitstream, work that has no counterpart in libvpx's VP9 pipeline.
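To give a sense of the per-block work that edge-direction analysis involves, here is a simplified Python sketch of a direction search in the spirit of CDEF. It is an assumption-laden illustration, not libaom's code: it tests only four directions where CDEF uses eight, it uses floating-point arithmetic, and the function and direction names are hypothetical. The idea it keeps is that pixels are grouped into lines along each candidate direction, and the direction whose lines are most uniform scores highest.

```python
def directional_score(block, line_key):
    """Sum over lines of (sum of centered pixels)^2 / (pixels in the line).

    Pixels that share the same line_key(r, c) lie on the same line. The
    score is highest when each line is close to constant.
    """
    sums, counts = {}, {}
    for r, row in enumerate(block):
        for c, pixel in enumerate(row):
            key = line_key(r, c)
            sums[key] = sums.get(key, 0) + (pixel - 128)  # center 8-bit samples
            counts[key] = counts.get(key, 0) + 1
    return sum(sums[k] ** 2 / counts[k] for k in sums)


# Four illustrative directions; each lambda maps a pixel to its line index.
DIRECTIONS = {
    "horizontal": lambda r, c: r,
    "vertical": lambda r, c: c,
    "diagonal_down": lambda r, c: r - c,
    "diagonal_up": lambda r, c: r + c,
}


def dominant_direction(block):
    """Return the name of the direction with the highest score."""
    scores = {name: directional_score(block, key)
              for name, key in DIRECTIONS.items()}
    return max(scores, key=scores.get)


if __name__ == "__main__":
    # An 8x8 block of horizontal stripes.
    stripes = [[255 if r % 2 else 0] * 8 for r in range(8)]
    print(dominant_direction(stripes))
```

Even this reduced version visits every pixel once per direction for every block, and the actual filtering and the encoder's search for filter strengths come on top of that.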

Codebase Maturity and Optimization

Age and optimization history also play a role. The libvpx codebase has had over a decade of real-world optimization, including mature SIMD code paths for x86 (such as AVX2) and ARM (Neon) that speed up its hot loops on modern hardware. libaom has also received substantial optimization in recent years, including its own SIMD code, but its baseline algorithmic complexity means that it still has to perform more operations per frame to complete an encode.