VP9 HDR10 Encoding with libvpx-vp9

To achieve HDR10 compatibility when encoding video using the libvpx-vp9 encoder, you must explicitly signal the SMPTE ST 2084 transfer characteristics alongside correct color primaries and matrix coefficients. This article explains how the VP9 codec and the WebM/MKV container format store this metadata, and provides the exact configuration parameters required to output a compliant HDR10 stream using FFmpeg.

The HDR10 Signaling Mechanism in VP9

HDR10 relies on specific color parameters to tell the display how to render high-dynamic-range content. For VP9, this signaling happens at two levels: the VP9 bitstream (in the uncompressed header of each frame, since VP9 has no separate sequence header) and the container (typically WebM or Matroska).

To define HDR10, three main video signal parameters must be set to specific values:

1. Color Primaries: BT.2020 (Value: 9)
2. Transfer Characteristics: SMPTE ST 2084 / PQ (Value: 16)
3. Matrix Coefficients: BT.2020 non-constant luminance (Value: 9)

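The numeric values above are the ITU-T H.273 code points. The VP9 bitstream itself carries only a coarse color_space field (set to BT.2020) and a color_range flag, which libvpx-vp9 writes into the color_config() syntax element of the uncompressed frame header. The primaries and the PQ transfer characteristic are recorded separately at the container level, in the Colour element of the WebM/Matroska video track.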
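The three values listed above can be kept in one place when scripting an encode. The following is a minimal sketch, assuming the FFmpeg option names bt2020, smpte2084, and bt2020nc; the function name h273_code_point is hypothetical, and only the three HDR10 values from this article are covered.

```shell
#!/bin/sh
# Hypothetical helper: look up the numeric code point for an HDR10 parameter.
# Arguments: <kind> <ffmpeg-name>, where kind is primaries, transfer, or matrix.
# Only the three HDR10 values discussed in this article are known to it.
h273_code_point() {
  case "$1:$2" in
    primaries:bt2020)   echo 9  ;;  # BT.2020 color primaries
    transfer:smpte2084) echo 16 ;;  # SMPTE ST 2084 (PQ)
    matrix:bt2020nc)    echo 9  ;;  # BT.2020 non-constant luminance
    *) echo "unknown $1 name: $2" >&2; return 1 ;;
  esac
}
```

Calling h273_code_point transfer smpte2084 prints the PQ code point, and any name outside the table is rejected with a non-zero status rather than guessed.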

Additionally, HDR10 requires 10-bit color depth. In VP9, bit depths above 8 are only available in Profile 2 (YUV 4:2:0) and Profile 3 (YUV 4:2:2, 4:4:0, or 4:4:4). HDR10 content is 4:2:0, so you must set the encoder to Profile 2 to enable 10-bit color representation.
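The profile rule can be expressed as a small lookup from pixel format to profile. This is a sketch only: the function name vp9_profile_for is hypothetical, the pixel format names are FFmpeg's, and just a handful of common formats are listed. Profiles 0 and 1 (the 8-bit profiles) are included for completeness.

```shell
#!/bin/sh
# Hypothetical helper: print the VP9 profile needed for an FFmpeg pixel format.
# Profile 0/1 are 8-bit; Profile 2/3 are 10- or 12-bit.
# Profile 0/2 are 4:2:0; Profile 1/3 cover the other chroma layouts.
vp9_profile_for() {
  case "$1" in
    yuv420p)                  echo 0 ;;  # 8-bit 4:2:0
    yuv422p|yuv440p|yuv444p)  echo 1 ;;  # 8-bit 4:2:2 / 4:4:0 / 4:4:4
    yuv420p10le|yuv420p12le)  echo 2 ;;  # 10/12-bit 4:2:0 (the HDR10 case)
    yuv422p10le|yuv444p10le|yuv422p12le|yuv444p12le)
                              echo 3 ;;  # 10/12-bit 4:2:2 / 4:4:4
    *) echo "unsupported pixel format: $1" >&2; return 1 ;;
  esac
}
```

For the HDR10 case, vp9_profile_for yuv420p10le selects Profile 2, which is the value passed to -profile:v.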

Configuring FFmpeg for libvpx-vp9 HDR10

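When using FFmpeg to encode with libvpx-vp9, you must pass the color and profile flags explicitly so that the output is tagged as BT.2020 with the SMPTE ST 2084 transfer characteristic.

Below is a minimal configuration command. Rate-control options (bitrate or quality targets) are omitted and should be added to suit the content: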

ffmpeg -i input_hdr.mkv \
  -c:v libvpx-vp9 \
  -profile:v 2 \
  -pix_fmt yuv420p10le \
  -color_primaries bt2020 \
  -color_trc smpte2084 \
  -colorspace bt2020nc \
  -color_range tv \
  output_hdr.webm
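To confirm that the output of the command above carries the expected tags, the stream fields reported by ffprobe can be checked in a script. The sketch below assumes ffprobe's key=value output format; the function name check_hdr10_signaling is hypothetical, and the expected values simply mirror the flags used in this article.

```shell
#!/bin/sh
# Hypothetical helper: read ffprobe "key=value" lines on stdin and check
# that each HDR10 signaling field has the value used in this article.
# Exits 0 when all fields match, 1 if any is missing or different.
#
# Usage (requires ffprobe on PATH):
#   ffprobe -v error -select_streams v:0 \
#     -show_entries stream=pix_fmt,color_range,color_space,color_transfer,color_primaries \
#     -of default=noprint_wrappers=1 output_hdr.webm | check_hdr10_signaling
check_hdr10_signaling() (
  input=$(cat)
  status=0
  for expected in \
      "color_primaries=bt2020" \
      "color_transfer=smpte2084" \
      "color_space=bt2020nc" \
      "color_range=tv" \
      "pix_fmt=yuv420p10le"; do
    # -F: fixed string, -x: whole line must match, -q: no output
    if printf '%s\n' "$input" | grep -qxF "$expected"; then
      echo "ok:      $expected"
    else
      echo "missing: $expected"
      status=1
    fi
  done
  exit "$status"
)
```

The function body runs in a subshell, so its variables do not leak into the calling script and the exit status can be used directly in an if statement.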

Parameter Breakdown

* -c:v libvpx-vp9: selects the libvpx VP9 encoder.
* -profile:v 2: selects VP9 Profile 2, which is required for 10-bit 4:2:0.
* -pix_fmt yuv420p10le: sets the pixel format to 10-bit planar YUV 4:2:0.
* -color_primaries bt2020: tags the stream with BT.2020 primaries (value 9).
* -color_trc smpte2084: tags the stream with the SMPTE ST 2084 (PQ) transfer characteristic (value 16).
* -colorspace bt2020nc: tags the stream with the BT.2020 non-constant luminance matrix (value 9).
* -color_range tv: tags the stream as limited (video) range.

The color flags only label the stream; they do not convert pixel data. The input must therefore already be PQ-encoded BT.2020 video.

Mastering Display Metadata and CLL

While the transfer characteristics and primaries define the color space, HDR10 also requires static metadata to describe the mastering display and content light levels. This includes:

* Mastering Display Color Volume (MDCV): The color primaries, white point, and luminance range of the monitor used to master the video.
* Content Light Level (CLL): The Maximum Content Light Level (MaxCLL) and Maximum Frame-Average Light Level (MaxFALL).

In a WebM/VP9 workflow, these values are written into the container rather than the raw VP9 video bitstream: Matroska and WebM store them in the video track's Colour element, as mastering metadata plus MaxCLL and MaxFALL. When the source file carries this metadata, FFmpeg exposes it as side data (mastering display metadata and content light level) and can write it into the output container, completing the HDR10 compatibility requirements. Whether the side data survives a re-encode can depend on the FFmpeg version, so inspect the output file rather than assuming the values were carried over.
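One way to check whether the static metadata reached the output is to look for the two side data entries in ffprobe's stream listing. The sketch below rests on two assumptions: that ffprobe prints stream side data as side_data_type=... lines, and that it labels the entries "Mastering display metadata" and "Content light level metadata". The function name has_hdr10_static_metadata is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read ffprobe output on stdin and report whether both
# HDR10 static metadata entries are listed as stream side data.
# The two labels below are an assumption about ffprobe's wording; adjust
# them if your build prints different names.
#
# Usage (requires ffprobe on PATH):
#   ffprobe -v error -select_streams v:0 -show_streams output_hdr.webm \
#     | has_hdr10_static_metadata
has_hdr10_static_metadata() (
  input=$(cat)
  status=0
  for wanted in "Mastering display metadata" "Content light level metadata"; do
    if printf '%s\n' "$input" | grep -qF "side_data_type=$wanted"; then
      echo "found:   $wanted"
    else
      echo "missing: $wanted"
      status=1
    fi
  done
  exit "$status"
)
```

A non-zero exit status means at least one of the two entries was not reported, in which case the metadata needs to be added to the container by other means.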