Dec 1, 2022 · 5 Min Read

The One-Step Routine: Why API Monitoring Is the Bug-Finding Solution You Need

by Jean Yang

There are a lot of people out there telling you how to write better software.

Use types to catch bugs before testing. Write unit tests. Write end-to-end tests to catch more bugs. Fuzz! Oh, and instrument your code so you can test in prod to find even more bugs.

Read the Internet and you’d think that most companies have a seventeen-step bug prevention routine that starts with types and ends with re-instrumenting all of your services every time you make any code changes so you can precisely trace every request to the ultimate response fifty services away. (This is not all that different from skincare routines, which often end up being many more steps than a person can count on one hand. See my Twitter thread for a fuller analogy.)

An example of how many steps can end up in a skincare routine. Not entirely different from how people end up stacking lots of tools in their dev-to-prod routine!

But what if you don’t need to have the glass skin equivalent of software quality? You’re working on a web app with maybe millions of users, but you don’t need to scale to every human on the planet. And because you’re not in banking or healthcare, buggy interactions in many user flows can get corrected by refreshing, eventual consistency, or customer service representatives. What if you’re looking for… a one-step routine?

The Tweet that made me an accidental skincare influencer on developer Twitter.

In this post, I posit that if you want just one tool to tell you about the bugs that matter, an API monitoring tool is a good candidate. But not the API monitoring tools of today: we need to see some innovation in this space!

Finding bugs that matter means understanding prod bugs

Here’s a dirty secret that most of us already know: bugs don’t matter unless they show up in prod. And only if they show up in the critical prod flows.

Back in the day, when code was small and self-contained, it may have been possible to identify and fix all possible bugs. With even average phone apps running to thousands of lines of code, the number of bugs is long past the point where we can reasonably expect to fix them all.

Sure, many bugs are bad because they could happen in prod. And there’s no denying that types, static analysis, and other code-focused techniques help with whole classes of bugs.

But there are a whole lot of possible bugs that are unlikely to show up in prod because there would have to be a pretty unlikely sequence of events for the bug to occur. And yes, some of them are pretty bad if they happen. For those, we should take the necessary precautions. But the majority of software teams are not going to prioritize fixing bugs that are recoverable and infrequent.

Because of this, understanding how, when, and how often bugs occur in production is the best way to prioritize bugs in a modern system.

Everything is now APIs

When people think of prod, they often think of performance optimization and error tracking. They think of lots of logs, traces, and dashboards. They think of technical power users running around wielding their technical power tools.

When people think of APIs, they think of automation and good developer experience: Stripe, Twilio, Okta. Or they might think of design, governance, and management.

But more and more, prod is APIs. The rise of SaaS and microservices means that, increasingly, software is made up of services calling other services. Understanding production behavior means understanding the web of who is calling what, when, and how.

Trying to root cause an incident? Debug high latency? The picture is a lot clearer if you know whether a service that your service depends on, that you’re calling through an API, is contributing to your issues.
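To make the "web of who is calling what" concrete, here is a minimal sketch in Python. It assumes you already have a list of observed (caller, callee) pairs from production traffic. The service names, the `unhealthy` set, and both helper functions are hypothetical and invented for illustration. They are not taken from any particular monitoring tool.

```python
from collections import defaultdict


def build_call_graph(calls):
    """Map each caller to the set of services it was observed calling."""
    graph = defaultdict(set)
    for caller, callee in calls:
        graph[caller].add(callee)
    return graph


def downstream_of(graph, service):
    """Return every service reachable from `service`, directly or transitively."""
    seen = set()
    stack = [service]
    while stack:
        current = stack.pop()
        # .get avoids adding empty entries to the defaultdict for leaf services
        for callee in graph.get(current, ()):
            if callee not in seen:
                seen.add(callee)
                stack.append(callee)
    return seen


# Hypothetical observed API calls between services: (caller, callee)
observed_calls = [
    ("web", "checkout"),
    ("web", "search"),
    ("checkout", "payments"),
    ("checkout", "inventory"),
    ("payments", "fraud-check"),
]

# Hypothetical set of services currently showing elevated errors or latency
unhealthy = {"fraud-check"}

graph = build_call_graph(observed_calls)
dependencies = downstream_of(graph, "checkout")

# Dependencies of "checkout" that might be contributing to its issues
suspects = dependencies & unhealthy
print(sorted(suspects))
```

If "checkout" is slow, the intersection points at "fraud-check" even though "checkout" never calls it directly. That is the kind of answer a picture of API dependencies can give you during an incident.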

Why it matters that finding prod bugs means finding API bugs

Okay, you might now be saying. But we have tools to understand prod. What about Prometheus, Grafana, and all of those technical power tools that we see technical power users running around using?

The problem isn’t that you can’t monitor prod. It’s that it is not so easy to monitor all of prod. With application performance monitoring and observability tools, it’s easy to end up with either too little or too much information about prod. Monitor a small set of known issues and you don’t have the coverage you’d ideally like. Turn on monitoring for everything and you get a fire hose of information, requiring you to sift through lots of logs and/or look at lots of dashboards to see what’s going wrong. This requires some understanding of the system under monitoring, some understanding of how to read low-level dashboards, and time.

Looking at prod through the lens of APIs is like looking at biology through cell theory. You could try to understand how animals and plants function by looking at pH levels, or looking at how well they’re doing overall. But the minute you realize they’re made up of cells and start looking at how things are flowing in and out of cells, you get a whole abstraction that unlocks all kinds of understanding and predictive power about the system. It’s the same with APIs: you can look at your entire system in terms of low-level metrics like “how many errors am I getting overall,” but being able to associate high latency, errors, and more with specific APIs, especially once you understand how the APIs are interconnected, is incredibly powerful.
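As a rough illustration of the difference, here is a minimal Python sketch that takes a flat list of request records and breaks errors and latency down per API endpoint instead of reporting one overall number. The `RequestRecord` shape, the endpoint names, and the choice to count only 5xx responses as errors are all assumptions made for this example. A real tool would make its own choices.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import median


@dataclass
class RequestRecord:
    """One observed API call (hypothetical shape, for illustration only)."""
    endpoint: str      # e.g. "GET /users/{id}"
    status: int        # HTTP status code
    latency_ms: float  # time to respond, in milliseconds


def summarize_by_endpoint(records):
    """Group records by endpoint; compute call count, error rate, median latency."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record.endpoint].append(record)

    summary = {}
    for endpoint, calls in grouped.items():
        # Assumption: only 5xx responses count as errors
        errors = sum(1 for call in calls if call.status >= 500)
        summary[endpoint] = {
            "calls": len(calls),
            "error_rate": errors / len(calls),
            "median_latency_ms": median(call.latency_ms for call in calls),
        }
    return summary


# Hypothetical sample of production traffic
records = [
    RequestRecord("GET /users/{id}", 200, 40.0),
    RequestRecord("GET /users/{id}", 200, 60.0),
    RequestRecord("POST /checkout", 500, 900.0),
    RequestRecord("POST /checkout", 200, 300.0),
    RequestRecord("POST /checkout", 500, 1100.0),
    RequestRecord("POST /checkout", 200, 500.0),
]

summary = summarize_by_endpoint(records)

# List the endpoints with the highest error rate first
ranked = sorted(summary.items(), key=lambda item: item[1]["error_rate"], reverse=True)
for endpoint, stats in ranked:
    print(endpoint, stats)
```

Across the whole sample, the error rate is two in six. Broken down by endpoint, every error comes from "POST /checkout", which tells you much more clearly where to look.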

The next generation of developer tools: API tools for production

In the last ten years, production behavior has surpassed code to become the source of truth about software. In the next ten years, there will be major innovations around production tooling for developers. Not just production tooling for ops and infra teams: production tooling for app development teams.

Since finding production bugs increasingly means finding API bugs, I predict major innovations in API tools for production in the next decade. This means:

  • API tools will branch out from just covering external APIs.
  • API tools will branch out from being design, testing, and governance tools.
  • API tools will become more integrated with other tools that app developers use.

It’s time to have a higher-level framework for understanding what’s going on in the complex organisms that are our modern software systems. Like cell theory, API-based tooling provides a much-needed framework on top of the logs, metrics, and traces for understanding production today.

For those of us looking for a one-step bug-finding routine, API monitoring tools are a good bet for the future. I’m looking forward to seeing how the industry builds more structured tools for prod.

The future update to my skincare products as dev tools Twitter thread.

And if you’re curious about what we’re building at Akita, check out our beta.

Photo by Chris Ried on Unsplash.
