You don’t need breaking changes

Leo Sjöberg • October 18, 2022

If you take a look at the API versions (which denote breaking changes) in the Stripe API upgrades documentation, you’ll see something remarkable: between August 2020 and August 2022, Stripe made zero breaking changes. If you read the changes, they’re remarkably minor for two years’ worth of work: a few renamed values, some removed properties, and one or two behaviours that changed. Now contrast that with Stripe’s API Changelog. You’ll quickly realise that Stripe did not do one massive release for two years of change. Rather, they’re releasing several times a week. But how do you manage to not make breaking changes for two years, and why is that something you should strive for? And let’s not forget the most important question: what do you do when you do need to make a breaking change (something Stripe has done over 100 times)?

The argument against breaking changes

I’ll be blunt, I have strong opinions on this matter. Let me tell you why.

Firstly, my opinions are grounded in one core belief: any team should be able to release multiple times a day, every day. This is the premise for all that is to follow. It matters because frequent releases make scaling engineering teams easier, increase agility, and allow you to test changes with minimal friction. Releasing once a sprint, while traditionally a core part of Scrum and many agile teams, is a terrible idea (I’ll write another post on why soon).

With the goal of releasing multiple times a day, it quickly becomes apparent that traditional workflows that release once per cycle/sprint/increment are not sufficient. By now, the industry has therefore settled on trunk-based development.

With trunk-based development, we no longer coordinate releases. Neither do we fix things on a release branch. Changes are in main or they’re not. Working with trunk-based development is the only way we can release daily, but it also means things cannot be allowed to break.

Let’s take an example. If you develop a feature that makes a breaking change and push it to main, it’ll go into your staging/test environment. You made that breaking change in coordination with another developer, who is making a change elsewhere that depends on yours. Until they release onto the same environment, you’ve broken the environment.

If someone needs to release a hotfix to prod, you’ve now broken prod (because main was released to prod, and remember, your change is in main). By making breaking changes, you’ve also shot yourself in the foot. Why? You’ve prevented yourself from using stability-enhancing features like canary rollouts and automatic rollback, because those only work if your changes are backwards compatible (in a canary rollout, the same API call from any client may hit either of your two releases, so both releases need to handle it).
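To make the canary point concrete, here is a minimal sketch in Python. Everything in it is hypothetical (the handler names, the `name` and `full_name` fields, the sample data); it only illustrates that a client written against the old release keeps working no matter which of the two releases answers.

```python
def handle_old_release(request: dict) -> dict:
    """The release currently in production (hypothetical)."""
    return {"id": request["id"], "name": "Ada Lovelace"}


def handle_canary_release(request: dict) -> dict:
    """The canary: adds `full_name` but keeps `name`, so it stays compatible."""
    return {"id": request["id"], "name": "Ada Lovelace", "full_name": "Ada Lovelace"}


def existing_client(handler) -> str:
    # A client written against the old release. During a canary rollout it
    # cannot know which release will answer, so it must work against both.
    return handler({"id": 1})["name"]
```

Had the canary renamed `name` to `full_name` outright, `existing_client` would fail on whatever share of traffic the canary receives, and rolling back would break any client that had already switched to the new field.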

If you want to move fast, you must avoid breaking changes wherever possible.

How to avoid breaking changes

I have three straightforward ways to avoid the pain of breaking changes.

Firstly, don’t modify API tests when you add functionality. You want your tests to ensure your old functionality still works, so add new ones instead. This reduces the risk that you or others may introduce breaking changes. API tests should be added or removed, not modified.
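As a rough illustration of "add, don't modify", here is a small Python sketch. The `get_user` handler and its `include_email` flag are made up for the example; the point is only that the original test stays byte-for-byte the same while the new behaviour gets its own test.

```python
def get_user(user_id: int, include_email: bool = False) -> dict:
    """Hypothetical API handler; `include_email` is the newly added feature."""
    user = {"id": user_id, "name": "Ada"}
    if include_email:
        user["email"] = "ada@example.com"
    return user


def test_get_user_returns_id_and_name():
    # Existing test: left exactly as it was before the feature was added.
    # If this starts failing, the change broke current usage.
    assert get_user(1) == {"id": 1, "name": "Ada"}


def test_get_user_can_include_email():
    # New test for the new behaviour, added alongside the old one.
    assert get_user(1, include_email=True)["email"] == "ada@example.com"
```

If adding the feature had forced an edit to the first test, that would have been a strong hint that the change is breaking.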

Secondly, and most importantly, think about your change. You need to design your change properly. It’s shocking how easy it is to make changes non-breaking if you have it as a goal. You move away from “what do I want this API to look like”, towards “how can I add or change this without breaking the API”. And personally, I think the latter also leads to better, more considered API design.

Finally, understand what you mean by “breaking”. A traditional breaking change can often be architected into two non-breaking steps. This may be confusing, so let me clarify.

Making breaking changes non-breaking

Unless you’re maintaining a public API, the only consumers you have to worry about are other developers in the same company. The whole reason you’re making breaking changes in the first place is that someone else needs them. Therefore, it’s relatively easy to get people onto your new “version” of the API. So let’s define what we consider “breaking” here. A breaking change is a change that, when released, will break current usage. What this means is that once everyone has migrated onto your new API (whether that’s a different endpoint, a new required parameter, or a changed value), you’re free to remove the old one!

For example, if you need to introduce a required parameter, all you have to do to make this “non-breaking” is to split the work into two steps. First, release it as an optional parameter, with either a default or nullable value (whatever you intend to do with all existing records). Then, if you really think you need to make the parameter required, verify that all incoming requests are already sending that parameter. Once they are, you can drop the optionality, and it’s not a breaking change, because it doesn’t break any current use.
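Here is one way the two steps could look, sketched in Python. The endpoint, the `currency` parameter, the `"USD"` default, and the in-memory counter are all assumptions made for illustration; in a real service the counter would more likely be a metric or a log line.

```python
from collections import Counter
from typing import Optional

# Hypothetical record of which callers still omit the new parameter.
missing_currency = Counter()

DEFAULT_CURRENCY = "USD"  # assumed default applied for existing callers


def create_order_step1(amount: int, currency: Optional[str] = None,
                       client: str = "unknown") -> dict:
    """Step 1: `currency` is optional, so existing callers keep working."""
    if currency is None:
        missing_currency[client] += 1  # note who still needs to migrate
        currency = DEFAULT_CURRENCY
    return {"amount": amount, "currency": currency}


def create_order_step2(amount: int, currency: str) -> dict:
    """Step 2: `currency` is required. Released only after
    `missing_currency` has stayed empty for long enough."""
    return {"amount": amount, "currency": currency}
```

The counter is what turns "make sure everyone has migrated" from a guess into something you can check before releasing the second step.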

To conclude, breaking changes are not necessary for good API design. Design your changes with intent, and split breaking changes into steps that on their own are non-breaking. This way, you become a better software architect, a better API designer, and a coworker highly appreciated by other teams that no longer need to coordinate releases with you!