Implement VP9 Spatial Scalability with libvpx-vp9
This article provides a practical guide on implementing spatial
scalability (SVC) using the libvpx-vp9 encoder in FFmpeg.
You will learn the core concepts of VP9 spatial layers, the key
configuration parameters required by the encoder, and a step-by-step
command-line example to generate a scalable video bitstream.
Understanding VP9 Spatial Scalability
Spatial scalability allows you to encode a single video stream containing multiple resolution layers (such as 360p, 720p, and 1080p). Clients can decode only the layers they need or can support based on their network bandwidth and hardware capabilities.
In VP9, spatial scalability is typically implemented alongside
temporal scalability. To implement spatial scalability using
libvpx-vp9, you must define the number of spatial layers,
the number of temporal layers, and assign target bitrates to each layer
combination.
Key FFmpeg Parameters for libvpx-vp9 SVC
To configure spatial scalability, you must pass specific flags to the
libvpx-vp9 encoder:
-ss_number_layers: Sets the total number of spatial layers (e.g.,3).-ts_number_layers: Sets the total number of temporal layers (e.g.,1if you only want spatial scalability).-layer_target_bitrate: A comma-separated list specifying the target bitrate (in kilobits per second) for each layer. The list must follow the order of the layers, starting from the lowest spatial/temporal layer up to the highest.-speed: VP9 SVC encoding requires real-time speed settings, typically between6and9.-static-thresh: Helps the encoder optimize static areas. A value of0is common for real-time streaming.
Step-by-Step FFmpeg Implementation
Below is a standard FFmpeg command to encode a 1080p input video into 3 spatial layers (typically downscaled by a factor of 2 per layer: 270p, 540p, and 1080p) using a single temporal layer.
ffmpeg -i input.mp4 \
-c:v libvpx-vp9 \
-s 1920x1080 \
-b:v 3500k \
-minrate 3500k \
-maxrate 3500k \
-bufsize 7000k \
-g 150 \
-keyint_min 150 \
-ss_number_layers 3 \
-ts_number_layers 1 \
-layer_target_bitrate 400,1200,3500 \
-speed 7 \
-quality realtime \
-error-resilient 1 \
output_svc.webmExplanation of the Parameters
- Rate Control: VP9 SVC works best with Constant
Bitrate (CBR) mode. Setting
-b:v,-minrate, and-maxrateto the same value (e.g.,3500k) forces CBR. - GOP Size (
-gand-keyint_min): Keeps keyframe intervals consistent across all layers. For 30fps video,150ensures a keyframe every 5 seconds. - Layer Bitrate Allocation: The
-layer_target_bitrate 400,1200,3500argument allocates:- Layer 0 (270p): 400 kbps
- Layer 1 (540p): 1200 kbps (cumulative, including Layer 0)
- Layer 2 (1080p): 3500 kbps (cumulative, including lower layers)
- Error Resilience (
-error-resilient 1): Enables error resilience features, which are mandatory for SVC decoding to allow smooth frame drops and layer switching without stream corruption.