How Does MKV Handle Multiple Subtitle Tracks?
The Matroska (MKV) container format can store a practically unlimited number of video, audio, picture, and subtitle tracks in a single file. This article explains how MKV handles multiple subtitle tracks: its container architecture, the subtitle formats it supports, and how media players use track metadata and flags to manage and display these tracks.
The Container Structure and Muxing
Unlike workflows that rely on external subtitle files (such as .srt files sitting next to an MP4), MKV keeps subtitles inside the multimedia container itself. A process called “multiplexing” (or “muxing”) embeds multiple subtitle tracks directly into a single .mkv file.
Inside the container, each subtitle track is an independent data stream whose packets are interleaved with those of the video and audio streams. Because Matroska is built on EBML (Extensible Binary Meta Language), a binary format loosely modeled on XML, the track list is extensible: a file can carry dozens of language tracks, each identified by its own track number, without them interfering with one another.
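As a rough illustration of muxing, the sketch below assembles a command line for mkvmerge (the muxer from MKVToolNix) that embeds several subtitle files into one .mkv file. The SubtitleInput class and build_mkvmerge_command function are hypothetical helpers, not part of any library, and the two flag options use the names found in recent MKVToolNix releases (older releases spell them --default-track and --forced-track), so check mkvmerge --help before relying on them. The function only builds the argument list and runs nothing.

```python
from dataclasses import dataclass


@dataclass
class SubtitleInput:
    """Hypothetical description of one subtitle file to embed."""
    path: str              # e.g. "movie.en.srt"
    language: str          # language code, e.g. "eng"
    name: str              # human-readable track name
    default: bool = False
    forced: bool = False


def build_mkvmerge_command(video_path, subtitles, output_path):
    """Return an mkvmerge argument list; the command is not executed."""
    cmd = ["mkvmerge", "-o", output_path, video_path]
    for sub in subtitles:
        # mkvmerge options apply to the input file that follows them.
        # Track ID 0 is the single track inside a standalone subtitle file.
        cmd += [
            "--language", f"0:{sub.language}",
            "--track-name", f"0:{sub.name}",
            # Assumed option names from recent MKVToolNix releases.
            "--default-track-flag", f"0:{1 if sub.default else 0}",
            "--forced-display-flag", f"0:{1 if sub.forced else 0}",
            sub.path,
        ]
    return cmd


if __name__ == "__main__":
    command = build_mkvmerge_command(
        "movie.mp4",
        [
            SubtitleInput("movie.en.srt", "eng", "English", default=True),
            SubtitleInput("movie.es.srt", "spa", "Spanish"),
        ],
        "movie.mkv",
    )
    print(" ".join(command))
```

The resulting list can be passed to subprocess.run once mkvmerge is installed. Each subtitle file becomes its own track in the output, which is what keeps the tracks independent of one another.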
Support for Diverse Subtitle Formats
One of MKV’s greatest strengths is its broad compatibility with different subtitle technologies. It handles two primary categories of subtitle tracks:
- Text-Based Subtitles: These are stored as text strings with timing markers. MKV natively supports formats like SRT (SubRip), SSA/ASS (SubStation Alpha, which allows for advanced styling, positioning, and animations), and WebVTT. Because they are plain text, these tracks are tiny compared with the video and audio streams.
- Bitmapped (Image-Based) Subtitles: These are stored as a series of images (usually overlay graphics). MKV can contain PGS (Presentation Graphic Stream, used on Blu-ray Discs) and VobSub (used on DVDs).
Within the MKV container, these different formats can coexist. For example, a single MKV file can contain one SRT track, two ASS tracks, and a PGS track simultaneously.
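Each track declares its format through a codec ID string, which is how one file can mix formats. The sketch below maps the subtitle codec IDs from the Matroska codec specification onto the two categories above. The classify_subtitle_codec function and the two dictionaries are illustrative names, not part of any library.

```python
# Subtitle codec IDs as listed in the Matroska codec specification.
TEXT_CODECS = {
    "S_TEXT/UTF8": "SRT (SubRip)",
    "S_TEXT/SSA": "SSA",
    "S_TEXT/ASS": "ASS",
    "S_TEXT/WEBVTT": "WebVTT",
}
IMAGE_CODECS = {
    "S_HDMV/PGS": "PGS",
    "S_VOBSUB": "VobSub",
}


def classify_subtitle_codec(codec_id):
    """Return (category, format name) for a Matroska subtitle codec ID."""
    if codec_id in TEXT_CODECS:
        return ("text", TEXT_CODECS[codec_id])
    if codec_id in IMAGE_CODECS:
        return ("image", IMAGE_CODECS[codec_id])
    # Anything else is reported as-is rather than guessed at.
    return ("unknown", codec_id)


if __name__ == "__main__":
    # The mixed file from the example: one SRT, two ASS, and one PGS track.
    for codec_id in ["S_TEXT/UTF8", "S_TEXT/ASS", "S_TEXT/ASS", "S_HDMV/PGS"]:
        print(codec_id, classify_subtitle_codec(codec_id))
```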
Track Metadata and Flags
To help media players make sense of multiple subtitle tracks, the MKV format stores metadata for each track in the file’s track headers. When an MKV is created, the creator can set the following properties on each subtitle track:
- Language Tags: Every track can be tagged with a standard language code (such as the ISO 639-2 codes eng for English or spa for Spanish). This allows media players to automatically select the user’s preferred language.
- Track Name: Creators can give tracks descriptive titles, such as “English (SDH)” for deaf and hard-of-hearing viewers, or “French (Director’s Commentary).”
- The “Default” Flag: This flag marks the subtitle track that the media player should turn on automatically when the video starts playing, unless the user’s own preferences say otherwise.
- The “Forced” Flag: This is used for subtitles that must be shown even if the user has subtitles turned off, for example when a character speaks a foreign language in an otherwise English-language movie.
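How a player combines these properties is up to the player. The sketch below shows one plausible startup policy, not the behavior of any particular player: the SubtitleTrack class and choose_subtitle_track function are hypothetical, and real players usually add further rules (for example, matching forced tracks against the audio language).

```python
from dataclasses import dataclass


@dataclass
class SubtitleTrack:
    """Hypothetical in-memory view of one subtitle track's metadata."""
    number: int
    language: str          # e.g. "eng", "spa"
    name: str = ""
    default: bool = False
    forced: bool = False


def choose_subtitle_track(tracks, preferred_language, subtitles_enabled):
    """Return the track to show at startup, or None for no subtitles."""
    if not subtitles_enabled:
        # Subtitles are off: only a forced track in the user's language shows.
        for track in tracks:
            if track.forced and track.language == preferred_language:
                return track
        return None

    # Subtitles are on: prefer a full (non-forced) track in the user's language.
    candidates = [
        t for t in tracks
        if t.language == preferred_language and not t.forced
    ]
    if candidates:
        # Among matching tracks, one flagged as default wins.
        return next((t for t in candidates if t.default), candidates[0])

    # No language match: fall back to whatever the creator flagged as default.
    return next((t for t in tracks if t.default and not t.forced), None)


if __name__ == "__main__":
    tracks = [
        SubtitleTrack(3, "eng", "English (SDH)", default=True),
        SubtitleTrack(4, "eng", "English (Forced)", forced=True),
        SubtitleTrack(5, "spa", "Spanish"),
    ]
    print(choose_subtitle_track(tracks, "spa", subtitles_enabled=True))
    print(choose_subtitle_track(tracks, "eng", subtitles_enabled=False))
```

With this policy, a Spanish-preferring user gets track 5, while an English-preferring user who has subtitles off still sees the forced track 4.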
Player Demuxing and Rendering
When you open an MKV file in a media player (such as VLC, MPC-HC, or mpv), the player’s “demuxer” splits the single MKV file back into its constituent video, audio, and subtitle streams.
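To see what a demuxer finds in a file, one option is ffprobe from FFmpeg, which can report the streams as JSON (for example with ffprobe -v error -show_streams -of json movie.mkv, where movie.mkv is a placeholder file name). The sketch below parses that JSON and keeps only the subtitle streams. The list_subtitle_streams function is a hypothetical helper, and it takes the JSON text as input so that it can be tried without ffprobe installed.

```python
import json


def list_subtitle_streams(ffprobe_json):
    """Extract subtitle stream metadata from ffprobe's JSON stream listing."""
    data = json.loads(ffprobe_json)
    result = []
    for stream in data.get("streams", []):
        if stream.get("codec_type") != "subtitle":
            continue  # skip video, audio, and attachment streams
        tags = stream.get("tags", {})
        disposition = stream.get("disposition", {})
        result.append({
            "index": stream.get("index"),
            "codec": stream.get("codec_name"),
            "language": tags.get("language", "und"),  # "und" = undetermined
            "title": tags.get("title", ""),
            "default": bool(disposition.get("default", 0)),
            "forced": bool(disposition.get("forced", 0)),
        })
    return result


if __name__ == "__main__":
    # Hand-written sample shaped like ffprobe output, not real probe results.
    sample = json.dumps({"streams": [
        {"index": 0, "codec_type": "video", "codec_name": "h264"},
        {"index": 2, "codec_type": "subtitle", "codec_name": "subrip",
         "tags": {"language": "eng", "title": "English"},
         "disposition": {"default": 1, "forced": 0}},
    ]})
    for entry in list_subtitle_streams(sample):
        print(entry)
```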
Because the subtitle tracks are cleanly separated and flagged, the player can display a menu that lets the user switch between languages on the fly. The player reads the timestamps on the packets of the selected subtitle stream and renders the text or images on top of the video frame in real time, ignoring the unselected subtitle streams.
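The rendering step can be pictured with a small sketch. It assumes the demuxed subtitle streams are already held in memory as lists of timed cues keyed by track number; the Cue class and active_cues function are illustrative names, and a real player works on packets as they arrive rather than on complete lists.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    """One subtitle event with its display interval in milliseconds."""
    start_ms: int
    end_ms: int
    text: str


def active_cues(streams, selected_track, position_ms):
    """Return the text of cues from the selected track visible at position_ms.

    Cues belonging to every other track are never looked at.
    """
    cues = streams.get(selected_track, [])
    # A cue is visible from its start time up to, but not including, its end.
    return [c.text for c in cues if c.start_ms <= position_ms < c.end_ms]


if __name__ == "__main__":
    streams = {
        3: [Cue(1000, 3000, "Hello"), Cue(3500, 5000, "Goodbye")],
        5: [Cue(1000, 3000, "Hola"), Cue(3500, 5000, "Adiós")],
    }
    print(active_cues(streams, 3, 2000))
    print(active_cues(streams, 5, 2000))
```

Switching languages on the fly then amounts to changing selected_track: the next lookup reads from a different stream while playback position stays the same.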