May 4, 2021
April 26, 2021

Programmatically Analyze Packet Captures with GoPacket

by Kevin Ku

tcpdump and Wireshark are great, but what if you want to programmatically analyze network traffic?

At Akita, we've built a tool that analyzes API traffic to build API models. One of the ways Akita collects API traffic is by passively watching packets on the network. To watch network traffic in a minimally invasive way, we’ve built a custom packet processor in Golang using the GoPacket library.

In this post, I'll walk through how to use GoPacket to capture and analyze network traffic, explaining the key concepts of the library, along with working examples you can try at home. You can also see this in action by checking out the Akita CLI on GitHub.

tcpdump + Wireshark: Good for One-Off Debugging

You may be familiar with using the command-line packet analyzer tcpdump to create a packet capture.

tcpdump makes it possible to inspect network traffic by letting you print the contents of packets on a network interface that match a given filter.

You can then load the output in Wireshark, which provides a nice GUI to perform all sorts of analysis on the packets.

This setup is great for one-off debugging sessions. But it’s not easy to programmatically run the analysis, let alone package up custom analyses to ship off to your users.

Using GoPacket for General Purpose Packet Processing

Now I’ll show how to automate packet processing with GoPacket, a general-purpose packet processing library. You can perform all sorts of analysis on packets using the awesome Go programming language.

But it’s a little more complicated than a couple of lines of code, so I’ll walk you through what you’ll need to do. In this post, we’ll focus on using GoPacket to:

  1. Collect packets from network interfaces (replace tcpdump)
  2. Reassemble TCP streams (a Wireshark feature)

Collecting Packets

GoPacket provides a nice mechanism to interface with libpcap, the underlying library powering tcpdump. This means you can capture packets right from your Go program!

The GoPacket/pcap interface is fairly straightforward. For example, here’s an equivalent of `tcpdump -i lo "port 3030"` with GoPacket:

GoPacket also allows you to import a packet capture file that you’ve previously collected with tcpdump. For example:

You can see our code for calling into GoPacket on GitHub here.

Reassemble TCP Streams from Packets

Once you’ve collected your packets, the next thing you might want to do is reconstruct TCP streams from those packets. This is useful for examining higher-level protocols, such as HTTP, that run over TCP. Now I’ll show how to do that. In addition to this guide, you can check out the full code for how Akita reconstructs TCP streams and parses HTTP here.

As a quick refresher, a TCP stream is a sequential flow of data exchanged between two hosts on the network. Because a single packet can only carry a limited amount of data, the networking stack splits each TCP stream into multiple packets. Since the underlying IP network does not guarantee in-order delivery, and TCP retransmits packets it believes were lost, your packet capture may contain duplicate or out-of-order packets for each stream. 😱

But worry not, GoPacket can help you remove the noise and reassemble those TCP streams from packets.
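To make the problem concrete, here is a toy sketch that uses only the standard library. It is not how GoPacket works internally. The `segment` type and `reassemble` function are made up for illustration, and the sketch ignores sequence-number wraparound and connection state, which a real reassembler has to handle.

```go
package main

import (
	"fmt"
	"sort"
)

// segment is a simplified stand-in for a TCP segment: the sequence number of
// its first payload byte plus the payload itself.
type segment struct {
	seq     uint32
	payload []byte
}

// reassemble orders segments by sequence number, skips bytes that were
// already delivered, and stops at the first gap.
func reassemble(initialSeq uint32, segs []segment) []byte {
	sorted := append([]segment(nil), segs...)
	sort.SliceStable(sorted, func(i, j int) bool { return sorted[i].seq < sorted[j].seq })

	var out []byte
	next := initialSeq // sequence number of the next byte we expect
	for _, s := range sorted {
		end := s.seq + uint32(len(s.payload))
		if end <= next {
			continue // fully duplicated data, e.g. a retransmission
		}
		if s.seq > next {
			break // gap: stop rather than deliver bytes out of order
		}
		out = append(out, s.payload[next-s.seq:]...) // keep only the new bytes
		next = end
	}
	return out
}

func main() {
	segs := []segment{
		{seq: 106, payload: []byte("world")},  // arrived early
		{seq: 100, payload: []byte("hello ")},
		{seq: 100, payload: []byte("hello ")}, // retransmission
	}
	fmt.Printf("%s\n", reassemble(100, segs))
}
```

A real capture adds many more complications (interleaved connections, missing packets, connections that never close), which is why it pays to lean on a library.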

To use the reassembly package, you need to implement two interfaces:

  1. Stream - each stream represents a reassembled TCP stream and is the mechanism through which the reassembly package passes data from TCP packets to you.
  2. StreamFactory - a wrapper for constructing a new Stream for each TCP stream.

See our implementation of Stream below—and the full context here.

And here is our implementation of StreamFactory, which relies on the definition of newTCPStream here.

You will then need to wrap StreamFactory in a StreamPool. When data arrives, the StreamPool either creates a new Stream with the factory (if the data belongs to a TCP stream it hasn’t seen before) or passes the data to the existing Stream. The StreamPool in turn is used by an Assembler, which contains all the fancy logic for reconstructing TCP streams from packets and handling the associated edge cases (out-of-order packets, early connection termination, etc.). To process packets, your program simply hands them to the Assembler. I summarize the interaction in this figure:

How the reassembly components interact.

See how we do it at Akita here, with the relevant code below:

What’s next?

If you want to see more of how we listen to and parse network traffic, check out the Akita CLI on GitHub. If you’re interested in how Akita listens to API traffic that doesn’t go over the network, you may be interested in reading this blog post about how Akita understands Flask APIs. We’ll also have a few more blog posts coming out soon about how our tool works.

If you get excited about automatically hooking into all parts of a system to watch API traffic in the least invasive way, Akita is hiring. 😉

With thanks to Nelson Elhage, Mark Gritter, Cole Schlesinger, and Jean Yang for comments. Photo by Ruby Schmank on Unsplash.
