Extracting subtitles or captions from a video can be helpful for things like accessibility and localization. Using FFmpeg you can easily extract subtitle tracks from a video file in a variety of formats. In this article we will demonstrate different methods for extracting subtitles and show how to work around subtle gotchas in text formatting.
Why Extract Subtitles?
There are several reasons you may want to extract subtitles:
- Localization: When translating a video, having the subtitles as a separate file is easier for editors to work with.
- Accessibility: Adding subtitles to different platforms or reformatting them for screen readers often requires them in a separate file.
- Automation: If you're feeding subtitles into AI tools for tasks like sentiment analysis or automatic translation, you'll likely want to automate extracting the subtitles or captions.
Types of Subtitles in Video Files
Videos can contain different types of subtitles and only some of them are easily extracted:
Embedded subtitles
Subtitles embedded as part of the video file. They are stored as their own track, usually a text track, alongside the other media tracks inside the video container. Because they are independent tracks, they can be extracted without altering or re-encoding the video at all, which usually makes extraction fast.
Soft subtitles
These are stored in a separate file alongside the video file (sometimes called external or sidecar subtitles), so they are already separate from the video itself and need no extraction.
Hardcoded subtitles
These are burned directly into the video and can’t be extracted as a separate file. They are literally part of the video image and don't exist as text in any parsable way. Extracting these types of subtitles would require analyzing the frames themselves and attempting to convert the image into text (optical character recognition). This is not something we'll cover in this article.
We'll be extracting embedded subtitles with the examples below.
Things to consider before extracting subtitles and captions
Videos can have multiple subtitle streams, each in a different language or format, so you’ll need to identify the stream you want to extract and its ID within the list of streams in the file. Running FFprobe against the file, for example ffprobe my-file.mp4, prints the available streams and their respective IDs.
Some subtitles also use special character sets, so it’s best to specify the encoding where needed, especially for non-Latin languages. We'll show an example of how to do this later on.
How to Extract Subtitles and Captions with FFmpeg
Identifying which streams to extract
To identify the subtitle streams in a video, run:
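A minimal sketch of that command, wrapped in a small shell function so the file name can be passed in. The function name list_streams is hypothetical, and -hide_banner is an optional flag used here only to suppress FFprobe's version banner:

```shell
# list_streams is a hypothetical helper name; $1 is the path to the video file.
list_streams() {
  # -hide_banner trims the version/build banner so the stream list is easier to read.
  ffprobe -hide_banner "$1"
}
```

For example: list_streams my-file.mp4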
This command lists all streams within the file, including video, audio, and subtitle streams. Every stream is shown as Stream #0:x, where x is the stream index, and subtitle streams are the ones labeled Subtitle.
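To narrow the listing to subtitle streams only, FFprobe's stream selection and output options can be combined. The sketch below assumes you want each stream's index, codec and language tag. The helper name list_subtitle_streams is hypothetical, and the language tag will be empty for streams that don't carry one:

```shell
# list_subtitle_streams is a hypothetical helper name; $1 is the path to the video file.
list_subtitle_streams() {
  # -v error            : print only errors, not the usual stream dump
  # -select_streams s   : consider subtitle streams only
  # -show_entries ...   : print the stream index, codec name and language tag
  # -of csv=p=0         : one comma-separated line per stream, no section prefix
  ffprobe -v error -select_streams s \
    -show_entries stream=index,codec_name:stream_tags=language \
    -of csv=p=0 "$1"
}
```

The index printed here is the stream's position among all streams in the file, the same number as the x in Stream #0:x.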
Extracting subtitles to an SRT File
Once you know the subtitle stream index, you can extract it like this:
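A minimal sketch, assuming the first subtitle stream is the one you want and that it is text-based. The function name extract_srt and the file names in the usage line are placeholders:

```shell
# extract_srt is a hypothetical helper name.
# $1 = input video, $2 = output .srt file
extract_srt() {
  # FFmpeg picks the SubRip format from the .srt extension of the output file.
  ffmpeg -i "$1" -map 0:s:0 "$2"
}
```

For example: extract_srt input.mkv output.srt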
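-map 0:s:0 selects the subtitle stream to extract. The first 0 identifies the input file, which is always 0 when working with a single input. s restricts the selection to subtitle streams, and the last 0 is the zero-based position of the stream among those subtitle streams, so 0:s:0 is the first subtitle track and 0:s:1 is the second. Note that this is not the same number as the x in Stream #0:x, which counts every stream in the file.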
Extracting subtitles to VTT (WebVTT)
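FFmpeg chooses the output subtitle format from the file extension, so producing WebVTT is mostly a matter of naming the output .vtt. A sketch under the same assumptions as before (first subtitle stream, text-based source), where extract_vtt is a hypothetical name:

```shell
# extract_vtt is a hypothetical helper name.
# $1 = input video, $2 = output .vtt file
extract_vtt() {
  # The .vtt extension tells FFmpeg to write WebVTT.
  ffmpeg -i "$1" -map 0:s:0 "$2"
}
```

For example: extract_vtt input.mkv output.vtt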
Extracting subtitles to ASS (Advanced SubStation Alpha)
For more complex styling and positioning:
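The same pattern applies, with an .ass output name. This is a sketch with a hypothetical helper name, extract_ass. Keep in mind that styling and positioning are only preserved if the source stream actually carries them. A plain-text source is written to ASS with default styling:

```shell
# extract_ass is a hypothetical helper name.
# $1 = input video, $2 = output .ass file
extract_ass() {
  # The .ass extension tells FFmpeg to write Advanced SubStation Alpha.
  ffmpeg -i "$1" -map 0:s:0 "$2"
}
```

For example: extract_ass input.mkv output.ass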
Dealing with Character Encoding
If the extracted subtitles don’t render correctly, tell FFmpeg which character set the source subtitles use with -sub_charenc. This is an input option, so it has to appear before -i. For example, to handle UTF-8:
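A sketch of how this might look, with the encoding passed as a parameter so other character sets can be substituted. The function name extract_srt_with_charset is hypothetical:

```shell
# extract_srt_with_charset is a hypothetical helper name.
# $1 = encoding of the source subtitles (e.g. UTF-8)
# $2 = input video, $3 = output .srt file
extract_srt_with_charset() {
  # -sub_charenc describes the INPUT subtitles, so it must come before -i.
  ffmpeg -sub_charenc "$1" -i "$2" -map 0:s:0 "$3"
}
```

For example: extract_srt_with_charset UTF-8 input.mkv output.srt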
Automating Subtitle Extraction for Multiple Streams
If a video has multiple subtitle streams, you can extract each with a loop in a bash script like this:
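One way to write such a loop is sketched below. It assumes every subtitle stream in the file is text-based (image-based formats can't be converted to SRT this way), and the function name extract_all_subtitles and the output naming scheme are illustrative choices:

```shell
# extract_all_subtitles is a hypothetical helper name; $1 is the path to the video file.
extract_all_subtitles() {
  input="$1"
  base="${input%.*}"   # input path without its extension

  # Ask ffprobe for the index of every subtitle stream, one per line.
  ffprobe -v error -select_streams s \
    -show_entries stream=index -of csv=p=0 "$input" |
  while IFS= read -r idx; do
    # -nostdin stops ffmpeg from reading the piped list meant for the loop.
    # 0:$idx addresses the stream by its index among all streams in the file.
    ffmpeg -nostdin -i "$input" -map "0:$idx" "${base}_${idx}.srt"
  done
}
```

For example, extract_all_subtitles movie.mkv would write files such as movie_2.srt and movie_3.srt if the subtitle streams are at indexes 2 and 3.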
This script finds each subtitle stream and extracts them sequentially into separate .srt files.
Taking it further
Here are some more articles that you may find helpful for common FFmpeg tasks: