Seeing API Endpoint Performance in Akita

by Mark Gritter

For most modern applications, the performance of the system is the performance of its APIs. A single web page may call dozens of APIs, and the implementations of those APIs may call dozens more. The user experience is often determined by the slowest API. Today, testing and observability tools leave it up to the developer to untangle the dependencies and provide clear answers.

I recently joined Akita, a startup that automatically learns API behavior by observing requests and responses, automatically generating specs and diffs. Up to now, Akita’s focus has been on the structure of those API calls: for instance, path arguments and the request and response data types. But the trace of requests and responses also contains the latency information, which can be associated with the structural information that Akita detects. For my starter project, I extended the current Akita tool with per-endpoint performance measurements.

In this blog post, I’ll talk about why API performance matters, how this looks with Akita’s new endpoint performance feature, and how tracking performance at the endpoint level can help.

Understanding API Performance is Hard Today

I was particularly excited to work on per-endpoint performance reporting because I have previously run into several challenges involving APIs, such as:

API costs can be opaque before code hits production. In one system I worked on, a CPU profile showed that a lot of time was being spent processing a “status” API. The engineer who built the monitoring front-end simply didn’t know it was expensive to call, as this was not documented anywhere, so they did so every 30 seconds. Had they known that it took several seconds for the API to respond, the engineer would have made a different decision, or pushed for API improvements in the back-end.

The details of an API call matter. A call to get the current version of an object can often be answered out of memory, giving good performance for the common case. But, if the same API receives a request for multiple object IDs, or an old version, or uses filtering criteria, the request might instead be sent to a database to handle the more complex query. API implementations may branch to quite different behavior based on the contents of the request. This can be hard to diagnose if tools report only the path.

Major performance problems can be created by small changes. In my previous work, I recall a case where my team added an index to a database to help with query performance for a particular API. But, that index must necessarily be updated on every write, which affected a different set of APIs than the one being tuned. Because the system was write-heavy, the overall system performance suffered, which we only discovered later during pre-release load testing.

Collecting and Understanding API Performance in Akita

There are several components to API latency. I have focused on measuring “processing latency” or “wait time”, which is the time the server spends between receiving a request and sending a response. This is the measure most tightly coupled to the API itself, rather than to the network or the client.

An HTTP request’s total latency can be broken down into the following components:

Wait for a network connection to be available (if one must be allocated from a connection pool)
Send and receive a DNS request to resolve the host name
Create a TCP connection and receive the initial ACK
Conclude a TLS handshake (if applicable)
Send the HTTP request over the network
Wait for the server to process the request and send the first byte of the response (the processing latency)
Send the HTTP response over the network

If you capture a browser HTTP Archive (HAR) trace, all of these measurements will be present. For example, Chrome provides a nice graphical breakdown of the various pieces of a request:

Chrome's graphical breakdown of a request
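The same breakdown can be read out of a HAR file directly. The sketch below is not part of Akita; it relies only on the `timings` object defined by the HAR 1.2 format, where each phase is given in milliseconds and `-1` marks a phase that does not apply. The helper names and the sample entry are made up for illustration.

```python
import json

# Phase names from the HAR 1.2 "timings" object (values are milliseconds).
# Per the HAR format, "ssl" is already included in "connect" when present.
HAR_PHASES = ("blocked", "dns", "connect", "ssl", "send", "wait", "receive")


def timing_breakdown(entry):
    """Return {phase: milliseconds} for one HAR entry, skipping unused phases."""
    timings = entry.get("timings", {})
    breakdown = {}
    for phase in HAR_PHASES:
        value = timings.get(phase, -1)
        # -1 (or a missing/null value) means the phase did not apply.
        if isinstance(value, (int, float)) and value >= 0:
            breakdown[phase] = value
    return breakdown


def load_entries(har_path):
    """Load the list of request/response entries from a HAR file on disk."""
    with open(har_path, encoding="utf-8") as f:
        return json.load(f)["log"]["entries"]


# A made-up entry for illustration: DNS was cached, so "dns" is -1.
sample_entry = {
    "time": 171.5,
    "timings": {
        "blocked": 2.0, "dns": -1, "connect": 30.0, "ssl": 18.0,
        "send": 0.5, "wait": 120.0, "receive": 19.0,
    },
}
sample_breakdown = timing_breakdown(sample_entry)
```

In this breakdown, `wait` is the phase that corresponds to the processing latency discussed above.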

We’re able to collect the same measurement in the Akita agent, which uses pcap filters to passively monitor API traffic in your system. When Akita is operating in “learn mode,” we observe the timestamps of the first and last packets of the HTTP request, and of the first and last packets of the HTTP response. This gives us a measurement of the processing latency on the server, which corresponds to the “waiting” time in the graph above.

The latency that we measure within Akita
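As a rough illustration of how four packet timestamps turn into a latency number, here is a minimal sketch. It is not the Akita agent's code. The `ExchangeTimestamps` record is hypothetical, and it assumes that processing latency is the gap between the last request packet and the first response packet.

```python
from dataclasses import dataclass


@dataclass
class ExchangeTimestamps:
    """Hypothetical capture times (seconds since epoch) for one HTTP exchange."""
    request_first: float
    request_last: float
    response_first: float
    response_last: float


def processing_latency_ms(ts):
    """Time between the end of the request and the start of the response.

    Assumption: the server can only begin work once the last request packet
    has arrived, and its "answer" starts with the first response packet.
    """
    return (ts.response_first - ts.request_last) * 1000.0


def transfer_times_ms(ts):
    """Time spent sending the request and the response over the network."""
    request_ms = (ts.request_last - ts.request_first) * 1000.0
    response_ms = (ts.response_last - ts.response_first) * 1000.0
    return request_ms, response_ms


# Made-up timestamps for illustration only.
example = ExchangeTimestamps(
    request_first=10.000, request_last=10.002,
    response_first=10.127, response_last=10.140,
)
example_wait_ms = processing_latency_ms(example)
```

Because the timestamps come from passive capture, a measurement like this needs no changes to the client or the server.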

As Akita processes the collected trace and turns it into a specification, it associates each latency measurement with an API endpoint.
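To make the idea of "associating a measurement with an endpoint" concrete, here is a deliberately simplified sketch. Akita infers path arguments from observed traffic. The heuristic below (treat numeric and UUID-shaped path segments as arguments) is only a stand-in for that inference, and all names are hypothetical.

```python
import re
from collections import defaultdict

# Stand-in heuristic: segments that look like UUIDs are treated as arguments.
UUID_RE = re.compile(
    r"^[0-9a-fA-F]{8}-(?:[0-9a-fA-F]{4}-){3}[0-9a-fA-F]{12}$"
)


def endpoint_template(path):
    """Replace argument-like path segments with a "{arg}" placeholder."""
    segments = []
    for segment in path.split("/"):
        if segment.isdigit() or UUID_RE.match(segment):
            segments.append("{arg}")
        else:
            segments.append(segment)
    return "/".join(segments)


def group_by_endpoint(measurements):
    """Group (method, path, latency_ms) tuples by (method, path template)."""
    grouped = defaultdict(list)
    for method, path, latency_ms in measurements:
        grouped[(method, endpoint_template(path))].append(latency_ms)
    return dict(grouped)


# Made-up measurements for illustration only.
observed = [
    ("GET", "/v1/users/42/files", 3.1),
    ("GET", "/v1/users/57/files", 2.8),
    ("POST", "/v1/users", 950.0),
]
latencies_by_endpoint = group_by_endpoint(observed)
```

With the grouping in place, every request to `/v1/users/42/files` and `/v1/users/57/files` contributes to the same endpoint's latency distribution.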

Seeing API Endpoint Performance with Akita Software

In the Akita UI, you can now see the median latency observed during the training session, next to each API endpoint. In our akibox demo, we observe that creating users (via POST) is more than 300x as expensive as any other operation:

Akita API information with latency measurements

In the details of each API endpoint, we show the 50th percentile (median), 90th percentile, and 99th percentile of the observed processing latencies.
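For readers who want to reproduce these summary numbers from their own latency samples, here is a minimal sketch using the nearest-rank definition of a percentile. This is one common definition, and I am not claiming it is the exact method used in the Akita UI. The function names are made up for illustration.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of
    the samples at or below it. pct is a number in the range (0, 100]."""
    if not samples:
        raise ValueError("need at least one latency sample")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def latency_summary(samples):
    """Return the p50, p90, and p99 of a list of latencies."""
    return {
        "p50": percentile(samples, 50),
        "p90": percentile(samples, 90),
        "p99": percentile(samples, 99),
    }


# Made-up latencies in milliseconds: mostly fast, with one slow outlier.
example_latencies = [12.0, 15.0, 11.0, 340.0, 14.0]
example_summary = latency_summary(example_latencies)
```

With only a handful of samples, the 90th and 99th percentiles both land on the slowest observation, which is worth keeping in mind when reading tail latencies from a short training session.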

To see performance measurements for your API endpoints, either collect traces with the Akita CLI (version 0.12 or later) or upload HAR files from a browser session. You can now collect per-endpoint information in CI on every pull request, or in your staging and production integrations. (More information on the latter coming soon.)

How Associating Performance with API Models Helps

Admittedly, measuring the time taken for an HTTP request is sort of a “101”-level observability feature. But starting to associate performance with spec creation and comparison has the following advantages.

Performance is part of the interface: A developer using an API needs to know not only what types to use, but whether an endpoint is “cheap” or “expensive”. Even well-documented APIs typically omit this information. Showing the expected or relative cost of an API up front can help developers make better decisions. Knowing the expected performance of individual APIs can also give insight into how APIs compose.

Break down performance by arguments, not just paths: Many observability tools offer latency breakdowns by “route”. But often the performance of an API can differ based on details of the request payload: the state of a flag, or the presence of a particular field. For now, we provide only a top-level summary, but Akita’s API visibility will let us highlight performance differences between variations on a single API path.
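As a sketch of what a per-variation breakdown could look like, the code below groups latencies for a single endpoint by which optional fields appear in the request. This is an illustration of the idea, not a feature of the current tool. The field names and data are hypothetical.

```python
from collections import defaultdict
from statistics import median


def variant_key(request_body, fields_of_interest):
    """Describe a request by which of the optional fields it carries."""
    return tuple(sorted(f for f in fields_of_interest if f in request_body))


def median_latency_by_variant(observations, fields_of_interest):
    """Map each request variant to the median latency observed for it.

    observations is a list of (request_body_dict, latency_ms) pairs that
    all belong to the same API endpoint.
    """
    grouped = defaultdict(list)
    for body, latency_ms in observations:
        grouped[variant_key(body, fields_of_interest)].append(latency_ms)
    return {key: median(values) for key, values in grouped.items()}


# Made-up observations: plain lookups are fast, while requests for an old
# version or with a filter take a slower path.
example_observations = [
    ({"id": 1}, 5.0),
    ({"id": 2}, 7.0),
    ({"id": 3, "version": 4}, 80.0),
    ({"id": 5, "version": 1}, 100.0),
    ({"id": 4, "filter": "x"}, 120.0),
]
by_variant = median_latency_by_variant(
    example_observations, ("version", "filter")
)
```

A per-path report would blend all five requests into one number, while the per-variant view separates the fast common case from the slower ones.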

Warn about performance changes: Our users currently run Akita to get notified about potentially breaking changes such as added/removed endpoints, added/removed fields, and modified data types and formats. Just as Akita can warn you of differences in the response format of an API, we should also be able to warn you of differences in the performance of the API, when they appear in your test or staging environment. An API that suddenly takes 10x as long to execute can increase queue depths, leading to upstream problems and a less reliable system. Latency is both a leading and trailing indicator of problems and Akita can now report on this as early as test.
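A performance warning of this kind could be as simple as comparing per-endpoint medians between a baseline run and the current run. The sketch below shows that comparison under assumptions of my own: the function name, the input shape (endpoint name mapped to median latency in milliseconds), and the 2x default threshold are all hypothetical.

```python
def latency_regressions(baseline_ms, current_ms, ratio_threshold=2.0):
    """Return {endpoint: slowdown ratio} for endpoints that got much slower.

    baseline_ms and current_ms map an endpoint name to its median latency
    in milliseconds. The threshold is an arbitrary illustrative default.
    """
    flagged = {}
    for endpoint, current in current_ms.items():
        base = baseline_ms.get(endpoint)
        if base is None or base <= 0:
            # New endpoint or unusable baseline: nothing to compare against.
            continue
        ratio = current / base
        if ratio >= ratio_threshold:
            flagged[endpoint] = ratio
    return flagged


# Made-up medians for illustration only.
baseline = {"GET /users/{arg}": 4.0, "POST /users": 100.0}
current = {"GET /users/{arg}": 40.0, "POST /users": 110.0, "GET /health": 1.0}
regressions = latency_regressions(baseline, current)
```

A ratio works better than an absolute difference here because endpoints differ widely in their normal cost: a 10 ms increase is noise for a slow endpoint and a large regression for a fast one.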

Building API-based Observability

Akita’s vision is a new generation of observability tools that help you discover what’s really going on with your API. The performance of APIs will be an important part of that, and we’re eager for your feedback on what you’d like to see—particularly things that your existing observability tools can’t do!

As we improve our ability to monitor systems throughout their lifecycle, from test to staging to production, there are many things we can build to compare the observed API behaviors with the expected or historical behavior. For example, does the staging environment accurately reflect the challenges of production? Which methods are historically the slowest-performing, and how has that changed over time? Are changes in your software correlated with changes in API behavior and performance?

I’m excited to be part of the team at Akita, and I’d love to hear more about the challenges you face in understanding your APIs.

With thanks to Nelson Elhage and Jean Yang for comments. Cover photo by James Ting on Unsplash.