Mux receives over a billion video views each month through Mux Data, and we recently spent some time looking at the most important metrics, such as rebuffering percentage, time to first frame, and video upscaling. Many video analytics reports distill the impact of such metrics into simple round numbers, such as claims that an X% increase in a QoE metric reduces watch time by XX%.
However, in reality, viewing behavior is far more complex than that, as is seen in the lack of consensus between multiple QoE studies. One oft-cited study from Akamai claims that a 1% rebuffering rate reduces user engagement by 5.02%, while a more recent study reported that a 1% rebuffering rate reduces user engagement by 67.6%. These results are often widely divergent because it's difficult to control for all the factors that can affect user behavior such as user device, different viewer cohorts, or the length and popularity of the video.
Trying to Measure the Impact of QoE Metrics
Imagine perusing your Facebook/Instagram feed and enjoying an endless supply of 20-second cat videos.
Suddenly, a five-second rebuffering event occurs.
Those five seconds can feel like an eternity; it might even be enough time for you to reflect on what you're doing and abandon the cat videos altogether.
On the other hand, if your video was a 1-hour show on Netflix, a five-second rebuffering event might not be so bad, and your Jessica Jones binge could continue unabated.
Few people would abandon a movie because of five seconds of rebuffering halfway through. Similarly, viewers might have a higher tolerance for poor QoE if the video content is really interesting, and this could dramatically skew or mask the results if video views are not evenly distributed across content.
For example, Big Video publishes 100 equal-length videos with three levels of popularity. The first video gets one million views, 9 get 100,000 views, and 90 get 10,000 views.
Viewers in the highest popularity level will tolerate .75% rebuffering before abandoning the video, the second tier will tolerate .5% of rebuffering, and the least popular videos can have up to .25% of rebuffering. From a top-level view, it appears that every .25% increase in rebuffer percentage increases abandonment by 33%.
Big Video might think this level of abandonment is acceptable (after years at the top, Big Video has developed some low standards). But they don't realize that 90% of their videos are actually being prematurely abandoned. If Big Video adds more third-tier videos, they would continue to see an increase in abandonment rate due to rebuffering, even though the rebuffering metrics stay exactly the same.
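To make the Big Video arithmetic concrete, here is a minimal Python sketch. It assumes a viewer abandons as soon as rebuffering reaches the tier's tolerance, which is our reading of the example; the names TIERS and abandoned_share are hypothetical and exist only for illustration.

```python
# Hypothetical catalog from the Big Video example:
# (number of videos, views per video, tolerated rebuffering %)
TIERS = [
    (1, 1_000_000, 0.75),
    (9, 100_000, 0.50),
    (90, 10_000, 0.25),
]


def abandoned_share(rebuffer_pct, tiers=TIERS):
    """Return (share of views abandoned, share of videos abandoned).

    Assumption: a tier's viewers abandon once rebuffering reaches
    that tier's tolerance.
    """
    total_views = sum(n * views for n, views, _ in tiers)
    total_videos = sum(n for n, _, _ in tiers)
    lost_views = sum(n * views for n, views, tol in tiers if rebuffer_pct >= tol)
    lost_videos = sum(n for n, _, tol in tiers if rebuffer_pct >= tol)
    return lost_views / total_views, lost_videos / total_videos


for pct in (0.25, 0.50, 0.75):
    view_share, video_share = abandoned_share(pct)
    print(f"{pct:.2f}% rebuffering: {view_share:.0%} of views, {video_share:.0%} of videos")
```

At .25% rebuffering, roughly a third of all views are abandoned, yet those views belong to 90 of the 100 videos. The view-weighted number hides how much of the catalog is affected.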
These differences in content are rarely considered when companies look at improving their QoE metrics. For this blog post, Mux has decided to take a closer look into how these complex video interactions work.
This blog post will be the first in a series to deeply analyze QoE metrics, and how they measurably impact viewer experiences. We will be looking at rebuffering in this first part, while future parts will focus on other QoE metrics like startup time and video quality.
To start, we will look at how rebuffering affects videos of different lengths. First, we took a sample consisting of several million video views across diverse content platforms, and bucketed those views across different length categories:
For each bucket, we then looked at the cumulative change in average watch time as rebuffering increases. The plot below shows the change in watch time as rebuffering percentages increase from .1% to .5%.
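One way to compute this kind of per-bucket comparison is sketched below. This is not our production pipeline; it assumes each view is a (video_seconds, rebuffer_pct, watch_seconds) tuple, groups rebuffering to the nearest tenth of a percent, and measures change against views with no rebuffering in the same bucket. The bucket boundaries and function names are hypothetical.

```python
from collections import defaultdict


def length_bucket(video_seconds):
    """Assign a video to a length bucket (boundaries are illustrative)."""
    minutes = video_seconds / 60
    if minutes < 5:
        return "0-5 min"
    if minutes < 10:
        return "5-10 min"
    return "10+ min"


def watch_time_change(views):
    """Return {bucket: {rebuffer_level: fractional change in mean watch time}}.

    The change is relative to the mean watch time of views in the same
    bucket whose rebuffering rounds to 0.0%.
    """
    totals = defaultdict(lambda: [0.0, 0])  # (bucket, level) -> [sum, count]
    for video_seconds, rebuffer_pct, watch_seconds in views:
        level = round(rebuffer_pct, 1)  # group to the nearest 0.1%
        entry = totals[(length_bucket(video_seconds), level)]
        entry[0] += watch_seconds
        entry[1] += 1

    means = {key: total / count for key, (total, count) in totals.items()}

    result = defaultdict(dict)
    for (bucket, level), mean in means.items():
        baseline = means.get((bucket, 0.0))
        if baseline:  # skip buckets with no usable zero-rebuffering baseline
            result[bucket][level] = mean / baseline - 1
    return dict(result)
```

A value of -0.4 for a given bucket and level would correspond to a 40% drop in average watch time relative to smooth playback.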
The graph shows widely varying impacts on watch time, but it still roughly matches our expectations. Users are more tolerant of rebuffering when it comes to shorter videos vs longer videos.
For 0 to 5 minute videos, .1% of rebuffering has almost no impact (since that translates to at most about 0.3 seconds of rebuffering), but we see a 40% drop in watch times as the rebuffering percentage increases to .3%.
As the length of video increases, we see greater changes in watch time at the .1% level. Videos over 10 minutes all experience steep drop-offs in watch time with minimal rebuffering percentages, and the decrease continues at a steep rate with each additional tenth of a percent.
Strangely, 5-10 min videos seem to be fairly tolerant of rebuffering, as they show only half of the watch time reduction at higher rebuffering percentages.
This could be related to the length of the video, or it could be explained by a confounding factor, such as the content type or popularity. In order to investigate further, we took a closer look at the top videos of one of our high volume customers.
For this customer, there were 25,000 videos which received a total of 8 million views across a 7-day period. The most popular video contributed 4% of all views, and the top five videos represented 12.5% of the total views and watch time.
We start by plotting the playback time distribution for views that experience rebuffering vs no rebuffering. We exclude views that are watched for less than 5 seconds, since those are affected more by startup time than by rebuffering. We've adjusted the playback time here to exclude the time spent rebuffering (all playback times given in seconds). Additionally, we cleaned up the data to exclude significant outliers and rebuffering events lasting less than .5 seconds.
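A rough sketch of this cleanup step is shown below. The dictionary keys, the function name, and the order of operations are assumptions for illustration, not the actual Mux Data schema. Outlier removal is left out because the criteria are not spelled out here.

```python
def clean_views(views, min_playback=5.0, min_rebuffer_event=0.5):
    """Filter and adjust raw views before plotting.

    Each view is a dict with hypothetical keys:
      'playback_seconds' - total time in the player, rebuffering included
      'rebuffer_events'  - list of rebuffering durations in seconds
    """
    cleaned = []
    for view in views:
        events = view["rebuffer_events"]
        # Remove all time spent rebuffering from the playback time.
        playback = view["playback_seconds"] - sum(events)
        # Drop very short views, which mostly reflect startup time.
        if playback < min_playback:
            continue
        # Assumption: events shorter than the cutoff are ignored when
        # deciding whether a view counts as having rebuffered.
        significant = [d for d in events if d >= min_rebuffer_event]
        cleaned.append({
            "playback_seconds": playback,
            "rebuffer_seconds": sum(significant),
            "rebuffered": bool(significant),
        })
    return cleaned
```

The two resulting groups (rebuffered vs not) can then be fed to any density plotting tool.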
In this density plot, we see some surprising results. It doesn't appear that views with rebuffering events change the distribution of playback times! Users who experience rebuffering track fairly closely with users who have a completely smooth experience.
To investigate why this might be, let's look at the density plot for our most popular video. The playback time on the bottom axis includes the time spent watching the video plus the video startup time.
This video's length is around 200 seconds, which is why there is a spike at that point (some views extend past this time due to variable startup times). We can see that viewers without a rebuffering event are slightly more likely to reach the end of the video, but the difference doesn't look very large. A quick regression analysis confirms that, while rebuffering is a statistically significant predictor of watch time (given the low p-value), it explains a very low percentage of the variance (given by the R squared).
Linear regression for rebuffering on playback time
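For readers who want to reproduce this kind of check, here is a minimal single-predictor least-squares sketch in plain Python. It is illustrative rather than the exact analysis we ran: x could be a 0/1 rebuffering indicator or rebuffering seconds (an assumption on our part), y is playback time, and the function name simple_ols is hypothetical. It returns the standard error of the slope instead of a p-value, since a p-value requires a t-distribution from a statistics library.

```python
import math


def simple_ols(x, y):
    """Fit y = intercept + slope * x by ordinary least squares.

    Returns (slope, intercept, r_squared, slope_standard_error).
    Requires at least three points and some variation in x and y.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Residual and total sums of squares give R squared.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1 - ss_res / ss_tot

    # The standard error of the slope shrinks as n grows, which is why a
    # tiny effect can still be statistically significant on large samples.
    slope_se = math.sqrt(ss_res / (n - 2) / sxx)
    return slope, intercept, r_squared, slope_se
```

With hundreds of thousands of views, the slope's standard error becomes very small, so even a weak relationship yields a low p-value while R squared stays near zero.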
It could be that most rebuffering events are not significant enough to cause a change in user behavior. To test this, we break out the views into buckets with over or under 5% rebuffering (e.g. for a 100 second video, 5 seconds of total rebuffering time would be classified as a High Rebuffer view) to see if the rebuffering percentage has an impact:
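The bucketing rule can be sketched as follows. It assumes the percentage is total rebuffering time divided by video length, as in the 100 second example, and that exactly 5% counts as High Rebuffer. The "Low Rebuffer" and "No Rebuffer" labels and the function name are ours for illustration.

```python
def rebuffer_bucket(rebuffer_seconds, video_seconds, threshold_pct=5.0):
    """Classify a view by its rebuffering percentage."""
    if rebuffer_seconds <= 0:
        return "No Rebuffer"
    pct = 100.0 * rebuffer_seconds / video_seconds
    return "High Rebuffer" if pct >= threshold_pct else "Low Rebuffer"


# The example from the text: 5 seconds of rebuffering on a 100 second video.
print(rebuffer_bucket(5, 100))
```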
Now we can see that the previous chart was masking the effect of views with high levels of rebuffering. Most viewers seem to tolerate low levels of rebuffering (in this case, about three to eight seconds). However, once rebuffering rises above that level, users begin abandoning the video at a much higher rate.
That was the density plot for the most popular video in a three-day period, so what does a less popular video look like? Here's the 10th most popular video in our sample, which had a third of the views of the most popular video.
This video is about 100 seconds long, and it appears to have multiple drop-off points. The viewers who experience no rebuffering are much more likely to make it to the end, which is what we would expect. A fair number of viewers with high rebuffering still make it to the first drop-off point, but at a much lower rate.
Interestingly, it looks like if a viewer makes it past 25 seconds, they are likely to stick around until the 70-second point, regardless of rebuffering percentage. This implies a stickiness of popular content, in which users that want to watch the content will keep watching despite a poor experience.
What happens when we look at the 1000th most popular video?
Here, we again see that there is little difference between groups who experience no rebuffering vs a bit of rebuffering. However, there is a steady rate of drop-off for the high rebuffer group since their density distribution is fairly constant throughout. While some viewers were able to tolerate higher rebuffering for the more popular videos, the rate of abandonment increases much more easily when it comes to less popular videos.
From the examples above, we can see that QoE metrics are a tricky business. Even if you do have all the data available at your fingertips, it still takes a considerable amount of analysis to understand what to improve, how much impact that improvement will have, and how to improve it.
Let us do this for you
If you're interested in learning more about video analytics, check back on our blog for the next post in the series, where we will be looking at the impact of video startup time.