Opus Audio Format Supported Sample Rates
This article gives an overview of the sample rates supported by the Opus audio codec: the five rates the format works with, how other input and output rates are handled, and why the codec is designed around these specific frequencies.
The Five Native Sample Rates
The Opus audio format, standardized by the Internet Engineering Task Force (IETF) as RFC 6716, natively supports five sample rates. Each rate corresponds to a named audio bandwidth:
- 8 kHz (Narrowband)
- 12 kHz (Mediumband)
- 16 kHz (Wideband)
- 24 kHz (Super-wideband)
- 48 kHz (Fullband)
These rates allow Opus to dynamically scale from low-bitrate speech transmission to high-fidelity, full-range stereo music.
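As an illustration, the list above can be captured in a small lookup table. This is a minimal Python sketch: the names OPUS_BANDWIDTHS, is_native_rate, and describe_rate are hypothetical helpers and not part of any Opus library. The audio bandwidth figures are the ones given in the bandwidth table of RFC 6716.

```python
# Sample rate (Hz) -> (bandwidth name, abbreviation, audio bandwidth in Hz).
# Audio bandwidths follow the bandwidth table in RFC 6716; note that
# Fullband is specified as 20 kHz of audio bandwidth, not 24 kHz.
OPUS_BANDWIDTHS = {
    8000: ("Narrowband", "NB", 4000),
    12000: ("Mediumband", "MB", 6000),
    16000: ("Wideband", "WB", 8000),
    24000: ("Super-wideband", "SWB", 12000),
    48000: ("Fullband", "FB", 20000),
}


def is_native_rate(rate_hz: int) -> bool:
    """Return True if rate_hz is one of the five native Opus rates."""
    return rate_hz in OPUS_BANDWIDTHS


def describe_rate(rate_hz: int) -> str:
    """Return a one-line description of a native rate (hypothetical helper)."""
    if not is_native_rate(rate_hz):
        raise ValueError(f"{rate_hz} Hz is not a native Opus sample rate")
    name, abbr, bandwidth_hz = OPUS_BANDWIDTHS[rate_hz]
    return f"{rate_hz} Hz: {name} ({abbr}), {bandwidth_hz // 1000} kHz audio bandwidth"


if __name__ == "__main__":
    for rate in OPUS_BANDWIDTHS:
        print(describe_rate(rate))
    print("44100 native?", is_native_rate(44100))
```

A check like is_native_rate is the kind of validation an application might run before deciding whether its audio needs to be resampled.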
How Opus Handles Input and Output Sample Rates
While the Opus encoder and decoder operate only at the five native rates listed above, audio at other common sample rates, such as 44.1 kHz (the standard CD sample rate), can still be used with Opus by resampling it.
Encoding
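If you feed an audio file with a non-native sample rate (like 44.1 kHz or 96 kHz) into an Opus encoding tool such as opusenc, the tool resamples the input to one of the five native rates (48 kHz for these two examples) before compression. This resampling happens in the tool or application, not in the codec itself: the reference libopus encoder API is documented as accepting only the five native rates, so an application that calls it directly must resample its audio first.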
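The rate conversion involved can be sketched with a little arithmetic. The Python below is illustrative only: pick_native_rate encodes an assumed policy (the smallest native rate that is at least the input rate, capped at 48 kHz), and both function names are hypothetical rather than taken from any Opus tool or library. No actual audio is resampled here; the code only computes the target rate and the resampling ratio.

```python
from fractions import Fraction

NATIVE_RATES = (8000, 12000, 16000, 24000, 48000)


def pick_native_rate(input_rate_hz: int) -> int:
    """Assumed policy: smallest native rate >= the input rate, else 48000."""
    for rate in NATIVE_RATES:
        if rate >= input_rate_hz:
            return rate
    return NATIVE_RATES[-1]


def resample_ratio(input_rate_hz: int) -> Fraction:
    """Output-to-input sample ratio, reduced to lowest terms."""
    return Fraction(pick_native_rate(input_rate_hz), input_rate_hz)


if __name__ == "__main__":
    for rate in (44100, 96000, 22050, 16000):
        target = pick_native_rate(rate)
        print(f"{rate} Hz -> {target} Hz, ratio {resample_ratio(rate)}")
```

For CD audio the ratio reduces to 160/147, which means a resampler produces 160 output samples for every 147 input samples. An input that is already at a native rate has a ratio of 1 and needs no conversion.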
Decoding
When decoding an Opus stream, the decoder can be configured to output audio at any of the five native rates, regardless of the rate used when the stream was encoded. Conversion to any other rate (for example, 44.1 kHz for a particular output device) is performed by the player or the audio stack after decoding. The compressed stream itself is always defined in terms of the native rates, with 48 kHz being the usual choice for high-quality audio playback.
Why 44.1 kHz is Not Natively Supported
Opus does not natively support 44.1 kHz because it is a hybrid codec built from two different technologies: Skype’s SILK (optimized for voice) and Xiph.Org’s CELT (optimized for music).
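To ensure low latency and seamless transitions between these two technologies, the developers chose mathematically aligned sample rates. The rates 8, 12, 16, 24, and 48 kHz all divide evenly into 48 kHz (by factors of 6, 4, 3, 2, and 1), so every Opus frame duration, from 2.5 ms to 60 ms, contains a whole number of samples at each rate. At 44.1 kHz, by contrast, a 2.5 ms frame would contain 110.25 samples. These simple ratios also simplify the signal processing math and keep CPU usage low during real-time encoding and decoding.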
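One way to see this alignment is to count the samples in each Opus frame. RFC 6716 defines frame durations of 2.5, 5, 10, 20, 40, and 60 ms. The Python sketch below (function names are hypothetical) uses exact fractions to check whether each duration yields a whole number of samples at a given rate.

```python
from fractions import Fraction

# Opus frame durations in milliseconds, as defined in RFC 6716.
FRAME_DURATIONS_MS = (Fraction(5, 2), 5, 10, 20, 40, 60)


def samples_per_frame(rate_hz: int, duration_ms) -> Fraction:
    """Exact number of samples in one frame (may be fractional)."""
    return Fraction(rate_hz) * Fraction(duration_ms) / 1000


def all_frames_integral(rate_hz: int) -> bool:
    """True if every frame duration holds a whole number of samples."""
    return all(
        samples_per_frame(rate_hz, d).denominator == 1
        for d in FRAME_DURATIONS_MS
    )


if __name__ == "__main__":
    for rate in (8000, 12000, 16000, 24000, 48000, 44100):
        shortest = samples_per_frame(rate, FRAME_DURATIONS_MS[0])
        print(f"{rate} Hz: 2.5 ms = {float(shortest)} samples, "
              f"all frames integral: {all_frames_integral(rate)}")
```

All five native rates pass the check, while 44.1 kHz fails on the 2.5 ms frame, which works out to 441/4 samples.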