How do you quantify the performance of a video stream? What metrics should you use?
Decades after Andy Grove stressed the importance of metrics for decision-making, software companies have become more and more data-driven. SaaS businesses live or die on CAC and MAU. Applications have Apdex; servers have load average; networks have latency and throughput.
But video streaming is complicated. Like network traffic, a video stream has latency (loading/seeking time) and throughput (bitrate). But it also has a target speed (frame rate); two different size dimensions (bitrate and resolution); variable sizing, when doing adaptive bitrate streaming; and subjective elements, like audio and video quality.
While there are dozens of metrics that can be used to understand video streaming performance, it's important to keep metrics simple. Remember, the goal is not to have a lot of data, but to extract insight from data. Too much complexity is counterproductive. Fortunately, all of these metrics can be distilled into four basic categories, which represent four major pain points for viewers¹: playback failures, startup time, rebuffering, and video quality.
1. Playback failures are easy to understand. Did a video fail to play due to an error? If so, it is also important to track the underlying errors themselves. HTML5 video has four predefined error codes, for example. Other parts of the playback chain (like a DRM library or an MPEG-DASH playback technology) can throw errors as well.
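As a sketch of tracking those underlying errors, the four HTML5 MediaError codes can be translated into readable names before being reported (the helper name and the reporting hook in the comment are illustrative, not from any particular analytics library):

```javascript
// The four error codes defined by the HTML5 MediaError interface.
const MEDIA_ERROR_NAMES = {
  1: "MEDIA_ERR_ABORTED",            // fetching was aborted by the user
  2: "MEDIA_ERR_NETWORK",            // a network error interrupted the download
  3: "MEDIA_ERR_DECODE",             // the media could not be decoded
  4: "MEDIA_ERR_SRC_NOT_SUPPORTED",  // the source format is not supported
};

// Translate a MediaError code into a name suitable for an analytics event.
function mediaErrorName(code) {
  return MEDIA_ERROR_NAMES[code] || "UNKNOWN_ERROR";
}

// In a browser this would typically be wired to the video element's
// "error" event, e.g.:
//   video.addEventListener("error", () =>
//     report({ error: mediaErrorName(video.error.code) }));
```

Errors thrown by other layers (DRM, adaptive streaming libraries) would need their own mappings, but the same pattern applies: normalize codes into named events before aggregating.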
2. Startup time measures how long a video takes to start playing. The longer a video takes to load, the more likely a viewer is to leave. Startup time can be measured in several ways: the time it takes for the first frame of video to appear after a user hits "play", the time it takes to load a video player, or the time it takes to load an entire page. Various metrics related to advertising fall into this category as well, like the wait time between an ad and video content.
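One sketch of the first measurement above, the time from the viewer's "play" click to the first rendered frame, written as a pure helper so the timestamp source stays pluggable (the function name and fields are hypothetical):

```javascript
// Compute time-to-first-frame in milliseconds from two timestamps:
// when the viewer hit "play" and when the first frame was rendered.
// Returns null if either event never happened (e.g. the viewer left early).
function startupTimeMs(playClickedAt, firstFrameAt) {
  if (playClickedAt == null || firstFrameAt == null) return null;
  return firstFrameAt - playClickedAt;
}

// In a browser, the two timestamps could come from performance.now():
//   playButton.addEventListener("click", () => (playClickedAt = performance.now()));
//   video.addEventListener("playing", () => (firstFrameAt = performance.now()));
```

Player load time and full page load time can be measured the same way, just with different start and end events.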
3. Rebuffering is a stall in the middle of playback caused by a buffer underrun. In other words, the video stream is downloading more slowly than playback consumes it, so playback has to pause while the buffer refills. Rebuffering can be measured in several ways:
- Rebuffering count is the number of times that playback stalled.
- Rebuffering duration is the total time that playback was stalled.
- Rebuffering frequency is how often rebuffering events occur (like rebuffering count / minutes of watching time).
- Rebuffering percentage is the percentage of the viewer's total watch time during which playback was stalled (rebuffering duration / watching time).
- Rebuffering ratio is the ratio between the rebuffering duration and the actual duration of video that played (rebuffering duration / playback duration).
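Given a list of stall intervals and the session's total watch time, all five definitions above can be computed in one pass. This is a minimal sketch; the function and field names are illustrative:

```javascript
// Compute the rebuffering metrics defined above from a list of stall
// intervals ([startMs, endMs] pairs) and the session's total watch time.
// "Watch time" includes stalls; "playback duration" excludes them.
function rebufferingMetrics(stalls, watchTimeMs) {
  const count = stalls.length;
  const durationMs = stalls.reduce((sum, [start, end]) => sum + (end - start), 0);
  const playbackMs = watchTimeMs - durationMs;
  return {
    count,                                          // rebuffering count
    durationMs,                                     // rebuffering duration
    frequencyPerMinute: count / (watchTimeMs / 60000), // rebuffering frequency
    percentage: (durationMs / watchTimeMs) * 100,   // rebuffering percentage
    ratio: durationMs / playbackMs,                 // rebuffering ratio
  };
}
```

For example, two stalls totaling 6 seconds in a 60-second session give a rebuffering percentage of 10% but a rebuffering ratio of 6/54 ≈ 0.111, which is why it matters to state which definition a dashboard is using.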
4. Video quality is easy to understand, but probably the hardest element to measure. Objectively measuring the quality of a standalone video stream is basically an unsolved problem.² So the metrics are forced to take a simplified approach, such as looking at video bitrate or resolution (or both).
- Video bitrate is an important contributor to video quality, and an important metric to track. Unfortunately, a bitrate number alone doesn't communicate much; 1,000 kbps might look good or bad depending on the resolution and content type.
- The ratio between the video stream resolution and the viewing device resolution also correlates to video quality. If a video is upscaled significantly - like watching a 320p movie full-screen on a 1080p display - quality suffers.
In practice, since video quality is not easily measured by any single metric, some combination of bitrate, resolution, and other factors (like the underlying content) may be needed to accurately measure quality.³
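As one illustrative proxy for the resolution-ratio factor, the upscaling in the 320p-on-1080p example can be quantified by comparing pixel areas (the function name is hypothetical, and the exact 320p width depends on the aspect ratio):

```javascript
// Ratio between the display's pixel area and the video stream's encoded
// pixel area. A value of 1 means pixel-for-pixel rendering; values much
// larger than 1 mean heavy upscaling, where quality likely suffers.
function upscaleFactor(streamWidth, streamHeight, displayWidth, displayHeight) {
  return (displayWidth * displayHeight) / (streamWidth * streamHeight);
}

// The example from the text: a 16:9 320p stream (assumed 568x320 here)
// shown full-screen on a 1080p display is roughly an 11x upscale by area.
// In a browser, the stream dimensions come from video.videoWidth and
// video.videoHeight; the display size would be the element's rendered
// size multiplied by devicePixelRatio.
```

A metric like this still says nothing about the content itself, which is why it is only one input among several for estimating perceived quality.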
These four categories cover the four critical dimensions of video streaming. Did the video play? How quickly did it start? Was playback seamless? And how did it look and sound? Each dimension matters, and you need to track at least one metric from each of the four categories to understand video performance.
1. We believe this "viewer-centric" approach to video performance is important. The goal of video performance analytics is to improve a user's actual quality of experience (QoE), and so publishers should focus on performance dimensions that actually affect user behavior or happiness.
2. Metrics like SSIM, PSNR, and VMAF measure the quality of an encoded video by comparing it to the original (the "reference"). "Full reference" quality metrics aren't perfect, but are accurate enough to be useful. In the context of performance analytics, however, the reference video isn't available, and the field of "no reference" quality metrics has a long way to go to be generally useful.
3. Stay tuned for a longer discussion of video quality metrics on this blog.