Opus Audio Resampling and Native Sample Rates
The Opus audio codec is a versatile format designed for interactive speech and music transmission over the internet. It operates at a small set of native sample rates up to 48 kHz. When input audio does not match one of these rates, it must be resampled before encoding. The libopus API accepts only the native rates, so this conversion is performed by the encoding tool or application (for example, opusenc). This article explains how non-native sample rates are handled in an Opus pipeline, how the resampler operates, and how the conversion affects audio quality and system performance.
Native Sample Rates in Opus
The Opus codec internally supports five specific native sample rates:
- 8 kHz (Narrowband)
- 12 kHz (Mediumband)
- 16 kHz (Wideband)
- 24 kHz (Super-wideband)
- 48 kHz (Fullband)
Regardless of the input sample rate, the Opus encoder operates internally at one of these rates. For most high-quality stereo and mono audio, the encoder defaults to 48 kHz.
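As an illustration of the list above, the following minimal sketch checks whether an input rate is native and, if not, picks a rate to resample to. The names `NATIVE_RATES`, `is_native_rate`, and `pick_coding_rate` are hypothetical, and the selection policy (smallest native rate at or above the input, capped at 48 kHz) is an assumption made for the example, not a documented libopus behavior.

```python
# The five native sample rates listed above, mapped to their bandwidth names.
NATIVE_RATES = {
    8000: "Narrowband",
    12000: "Mediumband",
    16000: "Wideband",
    24000: "Super-wideband",
    48000: "Fullband",
}


def is_native_rate(rate_hz: int) -> bool:
    """Return True if rate_hz is one of the five native Opus rates."""
    return rate_hz in NATIVE_RATES


def pick_coding_rate(rate_hz: int) -> int:
    """Assumed policy: smallest native rate >= the input rate, capped at 48 kHz."""
    for native in sorted(NATIVE_RATES):
        if rate_hz <= native:
            return native
    return 48000


if __name__ == "__main__":
    for rate in (16000, 44100, 96000):
        target = pick_coding_rate(rate)
        print(rate, is_native_rate(rate), target, NATIVE_RATES[target])
```

Under this assumed policy, 44.1 kHz input is not native and would be converted to 48 kHz, while 16 kHz input would be passed through unchanged.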
The Internal Resampling Process
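When you feed an audio signal with a non-native sample rate, most commonly 44.1 kHz (the standard CD sample rate), into an Opus encoding tool such as opusenc, the tool resamples the audio to 48 kHz before passing it to the encoder. The libopus encoder API does not perform this conversion itself: it accepts only the five native rates.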
In the reference opusenc tool, this conversion is handled by a resampler derived from the Speex DSP library. The resampler uses band-limited (windowed) sinc interpolation, a method that converts between sample rates while preserving the frequency response in the passband and suppressing aliasing.
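To make the idea of band-limited sinc interpolation concrete, here is a deliberately naive sketch in pure Python. It is not the Speex implementation: the function name `resample_sinc`, the Hann window, and the default `half_width` of 16 input samples are all assumptions chosen for readability, and a production resampler would use precomputed polyphase filter tables instead of evaluating the kernel for every sample.

```python
import math


def sinc(x: float) -> float:
    """Normalized sinc: sin(pi*x) / (pi*x), with sinc(0) = 1."""
    return 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)


def resample_sinc(samples, rate_in, rate_out, half_width=16):
    """Resample a list of floats with Hann-windowed sinc interpolation."""
    # When downsampling, lower the cutoff to the output Nyquist frequency
    # so that content above it is attenuated instead of aliasing.
    cutoff = min(1.0, rate_out / rate_in)
    n_out = len(samples) * rate_out // rate_in
    out = []
    for n in range(n_out):
        t = n * rate_in / rate_out      # output position in input-sample units
        center = math.floor(t)
        acc = 0.0
        for k in range(center - half_width + 1, center + half_width + 1):
            if 0 <= k < len(samples):   # treat samples outside the input as zero
                d = t - k
                # Hann window spanning +/- half_width input samples.
                window = 0.5 * (1.0 + math.cos(math.pi * d / half_width))
                acc += samples[k] * cutoff * sinc(cutoff * d) * window
        out.append(acc)
    return out


if __name__ == "__main__":
    # 10 ms of a 1 kHz tone at 44.1 kHz, converted to 48 kHz.
    tone = [math.sin(2 * math.pi * 1000 * n / 44100) for n in range(441)]
    converted = resample_sinc(tone, 44100, 48000)
    print(len(tone), "->", len(converted), "samples")
```

A longer window (larger `half_width`) gives a sharper transition band and better stopband rejection at the cost of more computation and more delay, which is the trade-off real resamplers expose as a quality setting.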
Encoder-Side Resampling
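- Rate Check: The libopus encoder API accepts only the five native sample rates (8000, 12000, 16000, 24000, or 48000 Hz). Input at any other rate must be resampled by the application or encoding tool first.
- Conversion to 48 kHz: If the input rate is not 48 kHz (or another native rate), the resampler converts the signal, typically to 48 kHz, before the core encoding process begins. Inside the encoder, the bandwidth decision logic may then code the signal at a lower native rate.
- Frame Size Calculation: Opus encodes audio in frames of fixed duration (2.5 to 60 ms, typically 20 ms). A 20 ms frame holds 960 samples at 48 kHz, which corresponds to 882 input samples at 44.1 kHz, so the resampler must consume input and produce output in that proportion.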
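The frame-size arithmetic can be checked with a few lines of Python. This is only a worked calculation; `samples_per_frame` is a hypothetical helper, and exact fractions are used so that non-integer results are visible instead of being silently rounded.

```python
from fractions import Fraction


def samples_per_frame(rate_hz: int, frame_ms) -> Fraction:
    """Number of samples in one frame of frame_ms milliseconds at rate_hz."""
    return Fraction(rate_hz) * Fraction(frame_ms) / 1000


if __name__ == "__main__":
    for rate in (44100, 48000):
        for ms in (Fraction(5, 2), 10, 20):
            print(rate, "Hz,", float(ms), "ms:", samples_per_frame(rate, ms))
```

At 44.1 kHz a 2.5 ms frame works out to a non-integer number of samples, which is one reason the signal is brought to a native rate before it is split into frames.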
Decoder-Side Resampling
On the decoding side, the process is reversed. The libopus decoder can output PCM at any of the five native rates, with 48 kHz being the usual choice. If the output device or application needs a different sample rate (such as 44.1 kHz):
1. The decoder decodes the bitstream to 48 kHz.
2. A resampler in the player or application downsamples or upsamples the decoded PCM audio to the requested rate.
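The decoder-side decision can be sketched as follows. The function `plan_output` and its return format are hypothetical; the sketch assumes that a native rate can be requested from the decoder directly, and that any other rate is reached by decoding at 48 kHz and resampling by a reduced integer ratio.

```python
from math import gcd

NATIVE_RATES = (8000, 12000, 16000, 24000, 48000)


def plan_output(requested_hz: int, decode_hz: int = 48000) -> dict:
    """Decide the decode rate and the resampling ratio for a requested output rate."""
    if requested_hz in NATIVE_RATES:
        # A native rate needs no separate resampling step.
        return {"decode_hz": requested_hz, "resample": None}
    g = gcd(requested_hz, decode_hz)
    # Ratio (up, down): out_samples = in_samples * up / down.
    return {"decode_hz": decode_hz, "resample": (requested_hz // g, decode_hz // g)}


if __name__ == "__main__":
    print(plan_output(16000))
    print(plan_output(44100))
```

For a 44.1 kHz request the ratio reduces to 147/160, so every 160 decoded samples at 48 kHz yield 147 output samples.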
Quality and Performance Implications
The resampling process in Opus is designed with specific trade-offs in mind:
- High Fidelity: With a sufficiently long filter, band-limited sinc interpolation keeps the passband frequency response essentially flat and attenuates aliasing artifacts to levels far below the threshold of human hearing, so the conversion introduces virtually no audible distortion.
- Low Latency: The internal resampler is optimized for low latency, which is critical for real-time communication applications like VoIP and gaming.
- CPU Efficiency: While resampling requires additional CPU cycles, the libopus implementation is highly optimized for modern processors, using SIMD (Single Instruction, Multiple Data) instructions to minimize the computational footprint.
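The latency and CPU trade-offs above can be estimated with simple arithmetic. The sketch below assumes a symmetric interpolation kernel spanning `half_width` input samples on each side and one multiply-accumulate per kernel tap per output sample; `resampler_cost` and the value of 16 are illustrative assumptions, not measurements of any real resampler.

```python
def resampler_cost(rate_in_hz: int, rate_out_hz: int, half_width: int, channels: int):
    """Rough cost model for a symmetric FIR interpolator (upsampling case)."""
    taps = 2 * half_width                          # kernel length per output sample
    macs_per_second = taps * rate_out_hz * channels
    delay_ms = 1000.0 * half_width / rate_in_hz    # filter look-ahead
    return macs_per_second, delay_ms


if __name__ == "__main__":
    macs, delay = resampler_cost(44100, 48000, half_width=16, channels=2)
    print(f"{macs} MAC/s, {delay:.3f} ms added delay")
```

Under these assumptions, doubling `half_width` doubles both the arithmetic cost and the added delay, which is why real-time applications tend to favor shorter filters.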