Device detection is a journey, not a destination

I’ll admit, I'm new to the technical aspects of device detection. I've been involved in multiple analytics products that use device detection for reporting, but it was never something I worried too much about. Just throw the user agent at a device detection library, and off you go.

Boy, was I wrong. I had to learn about device detection the hard way. Let’s hope my experience makes things easier for you.

First, let's talk about Mux’s products and what we’re trying to accomplish with device detection. Mux Data is a video analytics platform that lets you measure the viewership and quality of your viewers’ experience, so that you can accelerate your development and operate your video applications.

We don’t collect personally identifiable data on viewers, profile viewers across customers, or share data with advertising companies. We do try to collect the device type that viewers use to access your application, because video can behave very differently based on the device. For example, there can be playback failures or poor performance that occur only on one model of Google Pixel phones due to chipset bugs (seriously, it's happened!), or only on the Safari browser. These failures can be difficult to track down without knowing the type of device, operating system version, or browser version on which issues occur. Mux provides analytics SDKs for many different platforms and video players, which means we have to detect a wide variety of devices.

We also want to support Mux Video, which provides an API for applications that include video. Ideally, we’d be able to tailor the video we send to viewers based on the device they’re using. Users on newer devices could stream using more modern codecs, with higher quality video or smaller video files. But in order to do that, we need to know what device is being used and what it can do.

Now let’s talk about how device information is often detected on the Internet. When web browsers or applications connect to a service (e.g., opening www.google.com from your browser, or an application loading a video from stream.mux.com on your mobile device), the application request usually includes some metadata called a User Agent.

The user agent looks like this:

Mozilla/5.0 (Macintosh; Intel Mac OS X 10.16; rv:86.0) Gecko/20100101 Firefox/86.0

This bulky, awkward string is for Firefox 86 on my Mac but all browsers provide something similar (and similarly inscrutable). You can parse these strings to get some basic information about the device and browser/application that’s accessing the service, including browser and version, operating system and version, and device model. Unfortunately, these get very complicated, so an entire industry of services and libraries has evolved to help you make sense of the user agents for visitors to your service.
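To make the parsing step concrete, here is a minimal sketch that pulls a few fields out of the Firefox string above. The function name `parse_user_agent` and the returned field names are made up for illustration, and the patterns only cover this one example; real detection libraries handle many more formats and edge cases.

```python
import re

UA = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.16; rv:86.0) Gecko/20100101 Firefox/86.0"


def parse_user_agent(ua: str) -> dict:
    """Pull a few coarse fields out of a user agent string (illustrative only)."""
    result = {"browser": None, "browser_version": None, "os": None, "os_version": None}

    # The first parenthesized group holds platform details separated by semicolons.
    platform = re.search(r"\(([^)]*)\)", ua)
    if platform:
        for token in platform.group(1).split(";"):
            mac = re.match(r"Intel Mac OS X ([\d._]+)", token.strip())
            if mac:
                result["os"] = "Mac OS X"
                # Some browsers write versions with underscores (10_15_7).
                result["os_version"] = mac.group(1).replace("_", ".")

    # Product tokens look like Name/Version; only Firefox is handled here.
    firefox = re.search(r"Firefox/([\d.]+)", ua)
    if firefox:
        result["browser"] = "Firefox"
        result["browser_version"] = firefox.group(1)
    return result


print(parse_user_agent(UA))
```

Even this toy version needs special cases for a single browser on a single operating system, which hints at why dedicated services exist.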

We have an existing component that we use for device detection, but recently it started having problems. It isn’t updated very often, so we sometimes misidentify new devices. We wanted to update our device detection libraries to improve accuracy and make the detection more consistent across web and native platforms.

So I dove into the topic and came out… if not an expert, at least wiser.

Lesson 1: Native devices don’t know as much as you might think

Going in, I assumed that mobile devices would at least know the important details about themselves - their name, category (phone? tablet? something else?) and operating system. They do know some of that, but not nearly as much as I expected.

Android devices, for example, know their model numbers, but (usually) not the marketing names, and when they do, those names are in lowercase or other unfamiliar formats. As for categories, you’re on your own.

Things are a little better on iOS, but you still can’t get the device’s marketing name. Instead, you get device name values like “iPhone10,1” - which, inconveniently, is better known as the iPhone 8.

In order for most device detection services to give you the missing information about a device, you first have to generate a user agent string from the device data. The format of this string depends on which library you’re using. The libraries that are easiest to use let you specify basic information in the user agent such as: (; iOS 13.1.2; iPhone9,3). From that, you’ll get the device marketing name (iPhone 7), the category (Smartphone), and other operating system information.
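As a rough sketch of that flow, the snippet below builds a string in the (; iOS 13.1.2; iPhone9,3) shape from native device data and then resolves the model identifier. The two-entry `MARKETING_NAMES` table and both function names are hypothetical stand-ins for what a detection service does with a far larger database.

```python
# Hypothetical lookup table; a real service maintains thousands of entries.
MARKETING_NAMES = {
    "iPhone9,3": ("iPhone 7", "Smartphone"),
    "iPhone10,1": ("iPhone 8", "Smartphone"),
}


def build_user_agent(os_name: str, os_version: str, model: str) -> str:
    """Format native device data into a minimal user agent string."""
    return f"(; {os_name} {os_version}; {model})"


def describe_device(model: str) -> dict:
    """Resolve a model identifier to a marketing name and category."""
    # Unknown models fall back to the raw identifier.
    name, category = MARKETING_NAMES.get(model, (model, "Unknown"))
    return {"model": model, "name": name, "category": category}


print(build_user_agent("iOS", "13.1.2", "iPhone9,3"))
print(describe_device("iPhone9,3"))
```

The fallback for unknown models matters in practice: a newly released device will not be in the table until the data is updated.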

Lesson 2: I contain multitudes

I assumed that, for the most part, the data returned would be largely the same from one device detection service to the next.

For example, the OS version. I examined six different providers, and found four different ways to report versions. The same iOS version could be called: 13.3.1, iOS 13, 13, or 13_3_1. That might not look like it makes a big difference, but trust me, it does. This happens again and again: if someone is using Safari on mobile, that could be "Mobile Safari"... or "Safari Mobile". They could be using "Edge"... or "Microsoft Edge". They could be using Mac OS X... or Mac, or OS X, or macOS.
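If you ever need to compare or merge output from more than one provider, a small normalization layer helps. The sketch below is one possible approach, not any provider's API: the alias tables and function names are assumptions, and the canonical labels ("Safari Mobile", "Edge", "macOS") are arbitrary choices.

```python
import re

# Assumed alias tables; extend them with whatever labels your providers emit.
BROWSER_ALIASES = {
    "mobile safari": "Safari Mobile",
    "safari mobile": "Safari Mobile",
    "edge": "Edge",
    "microsoft edge": "Edge",
}
OS_ALIASES = {"mac os x": "macOS", "mac": "macOS", "os x": "macOS", "macos": "macOS"}


def normalize_name(raw: str, aliases: dict) -> str:
    """Map a provider-specific label to one canonical label."""
    cleaned = raw.strip()
    return aliases.get(cleaned.lower(), cleaned)


def normalize_version(raw: str) -> str:
    """Drop any leading OS name and use dots as the only separator."""
    match = re.search(r"\d+(?:[._]\d+)*", raw)
    return match.group(0).replace("_", ".") if match else ""


def major_version(raw: str) -> str:
    """Return only the major component, the one part every format shares."""
    return normalize_version(raw).split(".")[0]


for value in ["13.3.1", "iOS 13", "13", "13_3_1"]:
    print(value, "->", normalize_version(value), "| major:", major_version(value))
```

Note that "iOS 13" and "13" have already lost the minor and patch numbers, so the major version is the only level at which all four formats can be compared.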

A bigger issue is how some libraries treat webviews - the web browser interfaces that are built into operating systems, and that a lot of applications use to show media. The Facebook app, for example, uses webviews to play videos in the feed. For our purposes, since we’re trying to help identify failures and playback problems, our customers care more about which webview is being used, and less about the fact that it’s a webview in the Facebook app.

These applications will often add to the user agent string to identify themselves, and detection libraries will handle those strings differently. Some will report the webview (Safari Mobile) and others will report the application that’s hosting the webview (Facebook App). As before, this can make a big difference, depending on what your users want to know. If you - or they - are using the data for marketing purposes, for instance, knowing the webview might be less helpful than knowing the application.

The differences are bad enough when you’re trying to compare two detection services. It gets worse when you realize that they can also be inconsistent within a service. For example, one service we tested would identify a view from the Facebook App as coming from the Facebook App on Android, but as a Safari Mobile webview on iOS.
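One way to smooth over this kind of inconsistency is a post-processing rule that always reports the underlying webview as the browser and keeps the hosting application in a separate field. The sketch below is illustrative only: the user agent markers, the webview labels, and the record shape are all assumptions that you would need to check against your own traffic and your provider's output.

```python
import re

# Assumed markers for Facebook's in-app browser; verify against real traffic.
FACEBOOK_MARKER = re.compile(r"FBAN/|FBAV/|FB_IAB/")

# Hypothetical labels for the platform webview on each operating system.
WEBVIEW_BY_OS = {"iOS": "Safari Mobile", "Android": "Chrome Mobile WebView"}


def apply_webview_rule(record: dict, user_agent: str) -> dict:
    """Report the webview as the browser on every platform and
    record the hosting application in its own field."""
    result = dict(record, host_app=None)
    if FACEBOOK_MARKER.search(user_agent):
        result["host_app"] = "Facebook App"
        # Fall back to the provider's value if the OS has no known webview.
        result["browser"] = WEBVIEW_BY_OS.get(record.get("os"), record.get("browser"))
    return result


# A made-up, shortened user agent used only to exercise the rule.
sample_ua = "Mozilla/5.0 (Linux; Android 10) Mobile Safari/537.36 [FB_IAB/FB4A;FBAV/250.0;]"
print(apply_webview_rule({"os": "Android", "browser": "Facebook App"}, sample_ua))
```

Keeping both fields means the same record can answer the playback question (which webview?) and the marketing question (which app?).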

Lesson 3: Business models make things complicated

These technical and data consistency hurdles can be complicated, but they are manageable if you choose the service that most closely matches your customers’ needs. Now comes the challenging part... money.

Many of the device detection services charge quite a bit to access their libraries, but it's a valuable service and it’s worth it to get updates for new devices. That said, you need to be sure that the pricing model matches your business model.

A lot of the detection libraries charge a base rate and then an additional fee per customer. For some services, that's great, since cost is directly correlated to the number of paying customers you have. On the other hand, a per-customer charge can be a problem for companies that (like Mux) offer free or usage-based services: the incremental costs become harder to manage as you scale up to a larger number of customers - who might barely be using the service.

As always, your negotiating power depends on how much money you bring to the table. We'd be spending six figures a year on these services, and they'd refuse to consider a flat rate. One vendor even stopped talking to us after we asked. You might have better luck!

Fortunately, other services have other pricing models - a few even charge a flat rate based on the number of queries you make and where you run the service. Shop around and find a device detection provider that's right for you.

There are also a few free options, like DeviceDetector. These are good but tend to focus just on basic device information such as name, brand, browser, etc. and don’t provide information on extended device capabilities.

Lesson 4: This space is always evolving

Like most things in application development, it’s most important to take what you can get from a tool and apply it to your context. In our case, we weighed all these factors, chose a vendor, and created some post-processing rules to massage the data into a state that better aligned with our customers' needs. That approach requires a bit more maintenance and upkeep, but it provides us with the flexibility and quality we need.

Although we only recently rolled out these changes, the technology continues to evolve. For a number of reasons - most notably, privacy concerns, and browsers that send misleading information in user agents - device detection methodologies are undergoing some big changes. Browsers are starting to implement User Agent Client Hints in an attempt to work around some of the worst aspects of user agents. In addition, some browsers regularly represent themselves as a different browser in order to get a different experience - like how the iPad can identify as a Mac in order to get the desktop browser experience.
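For a feel of what Client Hints look like on the server side, here is a minimal sketch that reads the three hints a supporting browser may send by default: Sec-CH-UA, Sec-CH-UA-Mobile, and Sec-CH-UA-Platform. The function names and the returned shape are illustrative, the regular expression is a simplification rather than a full structured-header parser, and the check for placeholder brands is a heuristic assumption.

```python
import re

# Matches one `"brand";v="version"` entry in a Sec-CH-UA header value.
BRAND_ENTRY = re.compile(r'"([^"]*)"\s*;\s*v="([^"]*)"')


def parse_sec_ch_ua(header: str) -> dict:
    """Return {brand: major_version}, skipping placeholder brands."""
    brands = {}
    for brand, version in BRAND_ENTRY.findall(header):
        # Heuristic assumption: placeholder brands look like "Not A(Brand".
        lowered = brand.lower()
        if "not" in lowered and "brand" in lowered:
            continue
        brands[brand] = version
    return brands


def parse_client_hints(headers: dict) -> dict:
    """Collect the default hints from a dict of request headers.
    Header names are matched exactly, so normalize their case first."""
    return {
        "brands": parse_sec_ch_ua(headers.get("Sec-CH-UA", "")),
        "mobile": headers.get("Sec-CH-UA-Mobile") == "?1",
        "platform": headers.get("Sec-CH-UA-Platform", "").strip('"') or None,
    }


example = {
    "Sec-CH-UA": '"Chromium";v="110", "Not A(Brand";v="24", "Google Chrome";v="110"',
    "Sec-CH-UA-Mobile": "?0",
    "Sec-CH-UA-Platform": '"macOS"',
}
print(parse_client_hints(example))
```

Browsers that don't implement Client Hints send none of these headers, so a fallback to user agent parsing is still needed.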

All of these changes just add to the complexity of device detection, and are left as an exercise for the reader.

How does this impact Mux Data customers?

If you’re a Mux Data customer, you’ll see these improvements get applied automatically to some of your reporting, but other platforms will need some work on your part.

If you’re using the Mux SDK for AVPlayer, ExoPlayer, or Roku, you’ll get the new detection automatically without needing to change anything, because we can make the changes on the server side. Updated SDKs are available, and we recommend moving to them.

For web-based SDKs, you need to update to the latest versions. The new versions are about 15% smaller in size, and include other optimizations to make Mux SDKs more efficient.

As always, let us know how things are going. If you run into problems or have questions, we’re here to help.


Written By

Steven Lyons – VP, Product
