Speed Up libvpx-vp9 Encoding with Multi-Threading

VP9 is a highly efficient video codec, but libvpx-vp9 encoding is slow when it runs on only one or a few threads. By setting the encoder's multi-threading parameters, you can spread the work across a modern multi-core processor and substantially reduce encoding times. This article explains how to configure thread counts, tiles, and row-based multi-threading to speed up your encoding workflow with little or no loss in video quality.

Key Multi-Threading Parameters in libvpx-vp9

To enable parallel processing in libvpx-vp9, you must configure three primary settings in your FFmpeg command: -threads, -tile-columns (along with -tile-rows), and -row-mt.

1. The -threads Parameter

This setting defines the maximum number of threads the encoder is allowed to spawn.

* For modern multi-core CPUs, setting this to the number of your physical or logical CPU cores (e.g., -threads 8 or -threads 16) is ideal.
* Note that simply increasing this number will not speed up encoding unless you also configure tiles, as VP9 relies on “tiles” to split the frame for parallel processing.
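As a rough illustration of choosing the value for -threads, the sketch below asks the operating system for the number of logical CPUs. The helper name detect_threads is hypothetical, and the approach assumes getconf supports _NPROCESSORS_ONLN (as it does on Linux and macOS); it falls back to 1 otherwise.

```shell
# Print the number of logical CPUs, falling back to 1 if it cannot be determined.
# detect_threads is an illustrative helper name, not part of FFmpeg.
detect_threads() {
    n=$(getconf _NPROCESSORS_ONLN 2>/dev/null) || n=""
    case "$n" in
        ''|*[!0-9]*) n=1 ;;   # empty or non-numeric result: use 1
    esac
    echo "$n"
}

THREADS=$(detect_threads)
echo "-threads $THREADS"
```

The printed fragment can then be pasted into, or substituted into, the FFmpeg command line.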

2. Tiling (-tile-columns and -tile-rows)

VP9 divides each video frame into a grid of tiles. Each tile can be processed by a separate thread.

* -tile-columns: This parameter uses a \(\log_2\) scale.
  * 0 = 1 tile column (no split)
  * 1 = 2 tile columns
  * 2 = 4 tile columns (Recommended for 1080p)
  * 3 = 8 tile columns (Recommended for 4K)
* -tile-rows: This also uses a \(\log_2\) scale (usually set to 0 or 1). Setting -tile-rows 1 splits the frame horizontally into 2 rows.
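Because both options are exponents rather than counts, it can help to compute the resulting grid explicitly. The following is a small sketch of that arithmetic; tile_count is a hypothetical helper name used only for illustration.

```shell
# Number of tiles produced by log2-scaled tile settings:
# columns = 2^cols_log2, rows = 2^rows_log2, tiles = columns * rows.
tile_count() {
    cols_log2=$1
    rows_log2=$2
    echo $(( (1 << cols_log2) * (1 << rows_log2) ))
}

tile_count 2 0   # 4 columns x 1 row  -> prints 4
tile_count 2 1   # 4 columns x 2 rows -> prints 8
tile_count 3 0   # 8 columns x 1 row  -> prints 8
```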

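To use 4 threads effectively, you need at least 4 tiles (e.g., -tile-columns 2 -tile-rows 0). For 8 threads, use -tile-columns 2 -tile-rows 1 (yielding 8 tiles) or -tile-columns 3 -tile-rows 0. Two caveats apply. First, VP9 requires each tile to be at least 256 pixels wide, so a 1920-pixel-wide frame supports at most 4 tile columns, and a larger -tile-columns value is reduced to fit. Second, tile rows, unlike tile columns, are not coded independently of one another in VP9, so adding tile rows generally buys less parallelism than adding tile columns.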
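The largest useful -tile-columns value depends on frame width, because a VP9 tile must be at least 256 pixels (four 64x64 superblocks) wide. The sketch below estimates that upper bound from the width alone. The function name max_tile_columns_log2 is hypothetical, and the loop is an approximation of the limit written for illustration, not code taken from libvpx.

```shell
# Largest log2 tile-column value such that every tile column is still
# at least 4 superblocks (256 pixels) wide. Illustrative helper only.
max_tile_columns_log2() {
    width=$1
    sb_cols=$(( (width + 63) / 64 ))   # frame width in 64-pixel superblocks
    log2=0
    while [ $(( sb_cols >> (log2 + 1) )) -ge 4 ]; do
        log2=$(( log2 + 1 ))
    done
    echo "$log2"
}

max_tile_columns_log2 1280   # prints 2 (4 tile columns)
max_tile_columns_log2 1920   # prints 2 (4 tile columns)
max_tile_columns_log2 3840   # prints 3 (8 tile columns)
```

These results agree with the recommendations above: 2 for 1080p and 3 for 4K.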

3. Row-Based Multi-Threading (-row-mt 1)

Enabling row-based multi-threading is usually the most effective single change for speeding up VP9 encoding. Adding -row-mt 1 to your command lets the encoder process rows of blocks within a single tile in parallel. This improves thread utilization and lets libvpx-vp9 scale across CPUs with high core counts (16, 32, or more threads), even with a small number of tile columns.
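Row-based multi-threading is a comparatively recent addition, so older FFmpeg or libvpx builds may not expose it. One way to check is to look for the option in the encoder's help text, as in the sketch below. The helper name supports_row_mt is hypothetical, and the check assumes the help output lists the option as -row-mt.

```shell
# Succeeds if the encoder help text on stdin lists a -row-mt option.
# supports_row_mt is an illustrative helper name.
supports_row_mt() {
    grep -q -e '-row-mt'
}

# Only query FFmpeg if it is installed.
if command -v ffmpeg >/dev/null 2>&1; then
    if ffmpeg -hide_banner -h encoder=libvpx-vp9 2>/dev/null | supports_row_mt; then
        echo "row-mt available"
    else
        echo "row-mt not listed; check the FFmpeg/libvpx build"
    fi
fi
```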


Here is a practical example of an optimized FFmpeg command for a 1080p video using a CPU with 8 or more threads:

ffmpeg -i input.mp4 -c:v libvpx-vp9 -b:v 3M \
-threads 8 -tile-columns 2 -tile-rows 1 -row-mt 1 \
-g 240 -speed 1 -quality good -c:a libopus -b:a 128k output.webm

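Parameter Breakdown:

* -c:v libvpx-vp9: selects the libvpx VP9 video encoder.
* -b:v 3M: sets a target video bitrate of 3 Mbit/s.
* -threads 8: allows the encoder to use up to 8 threads.
* -tile-columns 2 -tile-rows 1: splits each frame into 4 tile columns and 2 tile rows (both values are on a \(\log_2\) scale).
* -row-mt 1: enables row-based multi-threading.
* -g 240: sets the maximum interval between keyframes to 240 frames.
* -speed 1 -quality good: FFmpeg aliases for -cpu-used 1 and -deadline good, which trade encoding speed against compression efficiency.
* -c:a libopus -b:a 128k: encodes the audio with Opus at 128 kbit/s.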

By combining -row-mt 1 with tile columns and rows suited to your frame size and CPU thread count, you can substantially reduce VP9 encoding times while maintaining high-quality visual output.