“How can you be an observability tool that doesn’t focus on logs, metrics, or traces?”
At Akita, we’ve gotten this question a lot. You see, we’re an observability company that didn’t start out focusing on logs, metrics, or traces. Instead, we’re watching API traffic and building models to understand API behavior.
It’s been hard to answer this question because most people think about observability in terms of logs, metrics, and traces. But as I said in this tweet, here’s how I see things: saying observability is about logs, metrics, and traces is like saying programming is about manipulating assembly instructions. Observability is about building models of system behavior and how that behavior changes. Today, state-of-the-art tools give you logs, metrics, and traces, and then people build those models in their heads.
Especially as people have begun generalizing observability to other domains, data science for instance, it’s important to understand observability in terms of its goals rather than its implementation. In this post, we’ll talk about how observability tools are really about system understanding, how logs, metrics, and traces are an implementation strategy, and what it might look like to raise the level of abstraction.
To understand the high-level goals of observability, let’s look at the websites of two companies that are at the forefront of the devops observability movement, Lightstep and Honeycomb. Here’s what they have to say about the benefits of observability:
To generalize, observability is about helping people build models of their systems so they can:
Given these goals, the less tools are about logs, metrics, and traces, the better. I like what Michael Hibay says here in response to my tweet: the less a tool requires its users to build models in their heads, the more observability is possible.
But today, people everywhere are still thinking about observability in terms of logs, metrics, and traces. For instance, here’s an article about how observability has become a critical concern for data teams as well as devops teams, with an excerpt below:
And talking about logs, metrics, and traces as the “three pillars of observability” is consistent with the messaging from observability products. For instance, this is from Datadog’s observability page:
And saying that observability’s key elements are logs, metrics, and traces is not wrong. Today, observability tools make it possible to get channels of visibility into system function in exchange for code instrumentation. People who know how they want to understand their systems can get the support they need to collect and query the information to do it.
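To make that trade concrete, here’s a minimal sketch of the kind of hand-written instrumentation this involves (the service and field names are hypothetical, and real tools would ship these events to a backend rather than print them):

```python
import json
import logging
import time

logging.basicConfig(level=logging.INFO, format="%(message)s")
logger = logging.getLogger("checkout-service")

def handle_request(user_id: str) -> dict:
    # The developer decides up front which fields to record and
    # which timings to measure.
    start = time.perf_counter()
    response = {"user_id": user_id, "status": "ok"}  # stand-in for real work
    duration_ms = (time.perf_counter() - start) * 1000

    # One structured log line per request: the channel of visibility
    # we get in exchange for writing this code by hand.
    logger.info(json.dumps({
        "event": "request_handled",
        "endpoint": "/checkout",
        "user_id": user_id,
        "duration_ms": round(duration_ms, 3),
    }))
    return response

handle_request("user-123")
```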
But saying observability’s “pillars” are metrics, traces, and logs is like saying programming’s key elements are storing data, moving data, and arithmetic operations on data. To build on this analogy: today, compilers and interpreters do most of the job of expressing computations in terms of stores, loads, and arithmetic operations, while programmers get to work with the languages and paradigms that have gotten built on top. I believe this is where observability is headed as well. This leads us to the question: what are the appropriate abstractions on top of logs, metrics, and traces?
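Python itself makes the analogy easy to see in miniature: its built-in `dis` module shows the loads, calls, and arithmetic operations that a single high-level line gets compiled down to.

```python
import dis

def total(prices, tax_rate):
    # One line of high-level code...
    return sum(prices) * (1 + tax_rate)

# ...disassembles into the loads, calls, and arithmetic instructions
# the interpreter executes, a level nobody wants to program at directly.
dis.dis(total)
```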
Just as different programming languages have evolved for different communities and tasks, different tools are going to get built on top of logs, metrics, and traces, each making its own set of observability tasks easier.
At Akita, we’ve set out the following challenge for ourselves: what would it look like to build a tool that abstracts over logs, metrics, and traces the way Python abstracts over assembly? Our strategy has been to take an API-centric view, supporting the following use cases:
The approach we’re taking at Akita combines passively watching API traffic with building API behavior models that capture endpoints, data types, and per-endpoint performance. Passively watching API traffic means our solution doesn’t require code instrumentation: you don’t get every single log or trace you might want, but you can drop our solution anywhere there’s network traffic and get results quickly. Automatically modeling API behavior makes it possible for users to quickly answer endpoint-centric questions. You can’t answer every question you might have about your API behavior, but the idea is that many common questions now become easy, as the sketch below illustrates.
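Here’s a toy sketch of that idea (this is not Akita’s implementation, just an illustration of the shape of the model): given a handful of observed requests, group them into endpoints and record the field types seen at each one.

```python
from collections import defaultdict

# A few observed API calls, as a passive watcher might see them on the
# wire (simplified to method, path, and the response body's fields).
observed = [
    ("GET",  "/users/42", {"id": 42, "name": "Ada"}),
    ("GET",  "/users/7",  {"id": 7,  "name": "Grace"}),
    ("POST", "/users",    {"id": 99, "name": "Alan"}),
]

def template(path: str) -> str:
    # Collapse numeric path segments into a parameter, so /users/42
    # and /users/7 land on the same endpoint.
    return "/".join("{id}" if part.isdigit() else part
                    for part in path.split("/"))

# The model: endpoint -> field -> set of observed types.
model = defaultdict(lambda: defaultdict(set))
for method, path, body in observed:
    endpoint = f"{method} {template(path)}"
    for field, value in body.items():
        model[endpoint][field].add(type(value).__name__)

for endpoint, fields in model.items():
    print(endpoint, {f: sorted(t) for f, t in fields.items()})
# Prints something like:
#   GET /users/{id} {'id': ['int'], 'name': ['str']}
#   POST /users {'id': ['int'], 'name': ['str']}
```

The point isn’t the thirty lines of Python; it’s that the endpoint-and-types view is the abstraction users get to ask questions in, the way Python programmers think in functions rather than loads and stores.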
With Akita, as with a high-level programming language like Python, you get a 90% solution on Day One. You might not have all the control of the full logs, metrics, and traces that today’s observability tools give you, but you get a drop-in solution that takes you a lot of the way quickly, centered on the questions you’re likely to have about your APIs.
Especially for those of you out there thinking about how observability applies outside of devops, here’s the main takeaway: observability isn’t about logs, metrics, and traces. It’s really about getting the tools you need to understand your systems and how they change. Logs, metrics, and traces are implementation strategies, and a good place to start, but you might want to build for higher-level concepts.
What we’re doing at Akita is just one kind of abstraction you can build on top of logs, metrics, and traces. Just as programming languages evolved over the last few decades to take different tasks from expert-only to very easy, I believe we’re going to see the same with system understanding. People are only now getting comfortable with logs, metrics, and traces for observability, so we’re really at the beginning of this conversation.
These are exciting times! We’d love to hear your thoughts on what abstractions would be most helpful to build on top. And, if you’re interested in API-centric observability, we’d love to have you try out our beta.
Photo by Dennis Kummer on Unsplash.