Has the JavaScript ecosystem ever been simple? It used to be JavaScript in the browsers that had wild inconsistencies. We'd have to detect which browser we're in and write custom code. Thankfully, those browser inconsistencies seem to mostly be papered over, but the JavaScript server ecosystem is the new challenge. Only a few short years ago "Node" was just "Node", but now it's TypeScript and a bunch of other runtimes like Cloudflare Workers, Bun, Deno, Vercel Edge, and Nitro. I call these “Node-ish” environments, and there's even a group called the Web-interoperable Runtimes Community Group to keep track of all these environments.
At Mux, we have to meet developers where they are. If developers are using these platforms, we have to support them. We have a 6-year-old Node SDK for making server-side API requests to Mux. It was written in JavaScript (not TypeScript), and over the years we have added types (by hand) and layered on code as we've tried to keep up with Mux's evolving and advancing platform.
Over the past few years, the Mux-Node SDK hasn't been living up to our standards. It's been difficult to maintain, had TypeScript inconsistencies, and lacked support for the new "Node-ish" runtimes. It's time we solve this.
Big breaking changes suck, and you should never do them
This is basically my stance. I push back very hard on the idea that we should introduce a breaking change into our SDK, particularly if that breaking change is going to require everyone or almost everyone using the SDK to update their code.
Sometimes I feel that engineers can be a bit flippant about breaking changes in SDKs. They'll say things like "Well, it's a major version release and we follow SemVer, so it’s no big deal." Let's just say: I disagree. It IS a big deal. And it should be a big deal. Sure, you have not technically violated your contract with the developers using your SDK. Your agreement is: if there are breaking changes, we will do major version bumps. If you, as the developer, see a major version bump, it's your responsibility to do your diligence and make sure you are taking all the steps to upgrade.
But I think that approach and that mindset lack empathy for the people using your SDK. Let’s all try not to be so casual about breaking changes. Let’s make sure that when we introduce breaking changes we’re doing so with care and caution, understanding that we are requiring actual developers using our SDKs to go and change code in their codebase, then test that code, put in a pull request, get someone from their team to review the pull request, merge the pull request, deploy it, and monitor the new code to make sure nothing is broken. All of this is time they could have spent building their product! If you're doing your job right, you should be maximizing the time developers get to spend on their product and minimizing the amount of time they have to spend interacting with your SDK. That’s the goal, anyway.
There is a kind of breaking change that is simple: dropping support for legacy systems and removing code that is not used by many people and has previously been clearly deprecated. That’s not what I’m talking about here. What I’m talking about is breaking changes that require a large share of the developers using your SDK to change their application code. These kinds of changes should be considered hostile and only done with great consideration.
So then why are we doing this really big breaking change?
Well, based on what I just said you can call me a hypocrite. And you wouldn’t be wrong. I had extremely mixed feelings about this, and it was with great pain and suffering that we pulled the trigger on this change. I promise you it was carefully considered and in this particular case, I believe the benefits outweigh the costs (and the costs are enormous). Let’s get into it and you can decide for yourself.
Iterating on legacy SDKs
The thing that makes this difficult in our situation is that none of the things that backend Node developers care about today really existed 6 years ago. It was more or less only "Node" and we didn’t have to worry about all these other complexities.
Up until version 8, our major releases contained breaking changes usually in the form of "dropping support for Node < X" because that version is considered end-of-life and no longer gets security updates. At least once, a major release came from refactoring how the package is bundled to support both CJS and ESM. In those cases, nothing should have broken, but it was impossible to be 100% sure so we erred on the side of caution and marked it as a major release.
But the technical debt was adding up. Among many things, what we lacked was:
- Reliable types for TypeScript users. We were hand-rolling types which means something was always wrong.
- Manually maintaining all the code. We have a well-maintained OpenAPI specification. Our API reference docs are automatically generated from this spec, and we have a number of other SDKs that we generate from this spec. There are pros and cons to how we generate those SDKs, but more on code generation later.
- Lacking support for new runtimes. Cloudflare Workers, Deno, Bun, Vercel Edge, Nitro, etc. None of these were supported because our SDK took a dependency on node:crypto, a built-in Node module that doesn't exist in the newer runtimes. The newer runtimes have Web Crypto APIs, which aren't exactly a drop-in replacement, particularly when it comes to supporting PKCS#8 vs. PKCS#1 key formats.
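To make the "not a drop-in replacement" point concrete, here is a minimal sketch of computing an HMAC-SHA256 with the Web Crypto API instead of node:crypto. It is not the Mux SDK's implementation; the function names `toHex` and `hmacSha256Hex` are made up for illustration, and it assumes a runtime that exposes `globalThis.crypto.subtle` (recent Node versions and the newer runtimes do). Where node:crypto offers a synchronous `createHmac`, Web Crypto requires importing the key first and every step is asynchronous.

```typescript
// Hypothetical helper: hex-encode the bytes of an ArrayBuffer.
function toHex(buffer: ArrayBuffer): string {
  return Array.from(new Uint8Array(buffer))
    .map((byte) => byte.toString(16).padStart(2, "0"))
    .join("");
}

// Hypothetical helper: HMAC-SHA256 using only Web Crypto, no node:crypto import.
async function hmacSha256Hex(secret: string, payload: string): Promise<string> {
  const encoder = new TextEncoder();

  // Web Crypto needs the raw secret imported as a CryptoKey before use.
  const key = await globalThis.crypto.subtle.importKey(
    "raw",
    encoder.encode(secret),
    { name: "HMAC", hash: "SHA-256" },
    false, // the key cannot be exported again
    ["sign"],
  );

  // sign() resolves to an ArrayBuffer holding the MAC.
  const signature = await globalThis.crypto.subtle.sign(
    "HMAC",
    key,
    encoder.encode(payload),
  );
  return toHex(signature);
}
```

Code written against the synchronous node:crypto style has to become async all the way up the call chain, which is part of why swapping one for the other touches more than a single import.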
How we did Mux-Node 8
The pain we were feeling with Mux-Node before version 8 was getting to the point that we had to re-assess everything. We have been shipping features on the Mux Platform rapidly this year (metrics filtering, auto-generated captions, and resolution modifiers to name a few). Many of these features required API-level changes. Updating the OpenAPI spec is part of our normal process (and the spec updates flow through to the docs). But then we have to go in and manually update the Node SDK, and update the types for the Node SDK. This was starting to be cumbersome and we would always find ourselves lagging behind or making mistakes with the types. It felt natural to say: we have the OpenAPI spec, can’t we just use that for the types?
Similar to how we generate other server-side SDKs with OpenAPI, we looked to see if we could do the same for TypeScript. We found out we could, but we weren’t totally happy with the output. It didn’t feel idiomatic, it lacked custom functionality not represented in the spec like signing JWTs or verifying webhook signatures, and overall it would be a step back from the current Node SDK.
One thing was for certain: if we were going to do a big breaking change, we sure as hell were going to make it worth it. We were not willing to lose things like the JWT generation for the sake of auto-generation. That would be a win for us in terms of maintainability, but it would be a huge loss for our users.
The only acceptable solution to going down the auto-generation route would be if it resulted in an SDK that was much better than our current one. This was a hard line we weren’t willing to compromise on.
Enter Stainless
Lucky for us, we got in touch with the folks from Stainless. Based on our previous experience with code generation we were skeptical that it would give us everything we needed, but we were willing to give it a shot. The Mux-Node version 8 SDK that you see now is the output of what Stainless generated based on our OpenAPI spec. And it is MUCH better than what we had before. All the syntax is different, it is a large departure from what we had, but it is so much better in every way. It gives us:
- Automatically generated code based on our OpenAPI spec
- Utilities that are not represented in the spec, like JWT generation and webhook signature verification
- Support for all the non-Node runtimes
- Real TypeScript types based on our OpenAPI spec
- Idiomatic code. Unlike other attempts at generated SDKs, the code actually looks and feels like something a developer would design
And other nice quality-of-life things that we were missing before:
- Retry configurations for failed requests
- Auto pagination with for await
- A better pattern for accessing raw responses
- Ability to make custom requests to undocumented endpoints (useful against staging or when testing things in beta)
- Configuring HTTP proxies
- Logging & middleware to customize the fetch client
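For readers who haven't used the pattern, here is a rough sketch of what auto pagination with for await can look like in general. This is not the Mux-Node API: `Page`, `paginate`, `fetchFakeAssets`, and `collectAll` are hypothetical names, and the in-memory array stands in for a real paginated endpoint. The idea is that an async generator fetches pages lazily, so the caller loops over items and never handles page numbers.

```typescript
// Hypothetical page shape; a real API's response format will differ.
interface Page<T> {
  items: T[];
  nextPage: number | null; // null means there are no more pages
}

type FetchPage<T> = (page: number) => Promise<Page<T>>;

// Async generator that requests pages on demand and yields items one by one.
async function* paginate<T>(
  fetchPage: FetchPage<T>,
  firstPage = 1,
): AsyncGenerator<T> {
  let page: number | null = firstPage;
  while (page !== null) {
    const result: Page<T> = await fetchPage(page);
    for (const item of result.items) {
      yield item;
    }
    page = result.nextPage;
  }
}

// Stand-in for a network call: serves five fake IDs, two per page.
const fakeAssets = ["asset-1", "asset-2", "asset-3", "asset-4", "asset-5"];

async function fetchFakeAssets(page: number): Promise<Page<string>> {
  const pageSize = 2;
  const start = (page - 1) * pageSize;
  const items = fakeAssets.slice(start, start + pageSize);
  const nextPage = start + pageSize < fakeAssets.length ? page + 1 : null;
  return { items, nextPage };
}

// The caller iterates over items; page boundaries are invisible here.
async function collectAll(): Promise<string[]> {
  const seen: string[] = [];
  for await (const id of paginate(fetchFakeAssets)) {
    seen.push(id);
  }
  return seen;
}
```

Because pages are fetched only as the loop advances, breaking out of the loop early also stops further requests.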
And, to smooth things over, there is a migration tool that will perform a codemod to help developers upgrade from 7.x to 8.
Running the migration tool on your project will modify your existing code and update it to the version 8 syntax.
Is the tradeoff worth it?
Overall, we think so. With ~100k downloads per week, Mux-Node is our most popular server-side SDK, so we did not take this lightly. But the issues we were experiencing were starting to really cause pain for developers using it. And we felt like the more we grew, the more we were going to run up against those pains.
Now we feel like we’re in a place where we can scale the usage of this SDK to 1 million and 10 million weekly downloads.