Advanced Fun and Profit with Packet Capture
With the rise of microservice architectures and containerization, programs increasingly communicate with each other over the network. This makes tcpdump a very powerful debugging tool. At Akita, we make liberal programmatic use of tcpdump to watch API traffic and build API models, which we use to catch breaking code changes and more.
For better or worse, getting tcpdump to work programmatically with the right filters and in your desired environments takes a bit of work. First of all, tcpdump captures all network traffic, including a lot of noise, so you need filters to narrow the capture down to API-related traffic. Also, using tcpdump with Docker containers is not as straightforward as capturing packets sent from a process running on your local machine.
In a previous post, I talked about how to watch network packets using GoPacket. In this post, I first talk about how to filter those packets with packet capture filters (cBPF). Then I provide a quick start on how to use tcpdump in the common scenarios you might encounter with Docker containers.
If you’re interested in how we do this at Akita, check out our CLI on GitHub. If you’re interested in using Akita to model API traffic, sign up for our beta!
Using Packet Capture Filters (cBPF)
Pcap filters (pcap-filter(7)), also known as classic Berkeley Packet Filter (cBPF) filters, offer a powerful way to filter packets captured by tcpdump. At Akita, we use cBPF filters under the hood to let users filter out noise and focus the analysis on only API-related network traffic.
In the Akita CLI, we expose custom packet filters using the `--filter` option (see docs). For context, the Akita code passes the filter directly into the pcap library (see here).
One of the most basic filters is filtering by port.
Example: only capture HTTP traffic (most servers use port 80):
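A minimal sketch of passing this filter to tcpdump, wrapped in a small shell function so it can be reused from scripts. The function name `capture_http` and the interface argument are illustrative and not part of the Akita CLI; `-i` selects the interface and `-n` turns off name resolution.

```shell
# Hypothetical helper: capture only packets to or from port 80.
# $1 is the interface to capture on (for example lo or eth0).
capture_http() {
  sudo tcpdump -i "$1" -n 'port 80'
}
```

Calling `capture_http eth0` would then run the capture on eth0 until interrupted.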
In an Akita command, the same expression is passed as the value of the `--filter` option, for example `--filter "port 80"`.
The most common type of host filtering is by IP.
Example: only capture packets sent/received by a specific host:
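As a hedged sketch, the `host` primitive from pcap-filter(7) matches packets whose source or destination is the given address. The helper name `capture_host` and its arguments are made up for illustration.

```shell
# Hypothetical helper: capture packets sent or received by one IP address.
# $1 = interface, $2 = IP address of the host to watch.
capture_host() {
  sudo tcpdump -i "$1" -n "host $2"
}
```

For instance, `capture_host eth0 172.16.0.1` would watch a single host; the address here is only an example.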
Conjunctions and Disjunctions
All conditions can be joined by “and” or “or” to create more powerful filters.
Example: only capture HTTP traffic sent/received by a specific host:
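One way to sketch this is to build the combined expression from its parts and hand it to tcpdump as a single argument. Both function names below are hypothetical, as is the choice of arguments.

```shell
# Hypothetical helper: build "port P and host H" from two arguments.
port_and_host_filter() {
  printf 'port %s and host %s' "$1" "$2"
}

# $1 = interface, $2 = port, $3 = host IP.
# The filter is quoted so tcpdump receives it as one expression.
capture_port_and_host() {
  sudo tcpdump -i "$1" -n "$(port_and_host_filter "$2" "$3")"
}
```

Swapping `and` for `or` in the format string would widen the filter to traffic that matches either condition.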
You can also filter on direction with the src and dst qualifiers, which match the sending and receiving end of a packet respectively.
Example: only capture inbound HTTP traffic sent from 172.16.0.1. This means the destination port is 80 (the receiving end) and the source IP is 172.16.0.1 (the sending end):
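A sketch of that filter, using the `dst port` and `src host` primitives from pcap-filter(7). The helper name and its argument order are assumptions made for this example.

```shell
# Hypothetical helper: capture traffic sent from one host to a given port.
# $1 = interface, $2 = source IP, $3 = destination port (defaults to 80).
capture_inbound_from() {
  sudo tcpdump -i "$1" -n "dst port ${3:-80} and src host $2"
}
```

`capture_inbound_from eth0 172.16.0.1` would match the example above.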
Using tcpdump with Docker
Now I’ll talk about how to use tcpdump with Docker containers. As I mentioned, this is more complicated than capturing packets sent from a process running on your local machine, because each Docker container has its own set of network interfaces. Even the out-of-the-box Docker network configuration is tricky when it comes to packet capture. (You can find a deeper reference on Docker networking in the Docker docs here.)
For simplicity, for the rest of this post we’ll use the example of two copies of your program communicating with each other over the loopback interface or Docker’s default bridge network.
Traditional Setup: Capture Packets from a Local Process
Traditionally, your programs run as processes on your machine (the host) and send/receive packets directly through your machine’s network interfaces. Running tcpdump in this case is quite straightforward: you just need to specify the interface you want to capture from. For example, the following command captures packets from the loopback interface, lo (see diagram below):
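The loopback capture can be sketched as a small shell function. The name `capture_loopback` is illustrative, and the interface name assumes Linux.

```shell
# Capture on the loopback interface. Extra arguments (for example a
# filter expression) are passed through to tcpdump unchanged.
# Assumes Linux, where loopback is named lo; on macOS it is lo0.
capture_loopback() {
  sudo tcpdump -i lo -n "$@"
}
```

With no arguments this captures all loopback traffic; `capture_loopback 'port 8080'` would narrow it to one port.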
Capturing Packets from Docker Containers
This section describes the default behavior of Docker networking; custom setups are not covered in this post.
Unlike programs running natively on your host, each Docker container has its own set of network interfaces that are distinct from the host’s and from each other’s. To allow containers to communicate with each other, Docker creates a bridge interface to connect them. Figure 2 illustrates this setup.
As an example, here are the hops needed for container 1 to send a packet to container 2:
- The process running in container 1 sends a packet through the container1:eth0 interface. Note that it does not use the container1:lo interface, since that loopback only carries traffic internal to container 1.
- The packet goes to the docker0 interface on the host.
- The packet travels to container2:eth0, where it is forwarded to the process in container 2.
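To see which containers sit on the default bridge, a sketch like the following may help. It assumes the default network is named `bridge` (Docker's out-of-the-box name) and uses a Go template to print each attached container's name and address; the function name is made up.

```shell
# List the containers attached to Docker's default bridge network,
# one "name address" pair per line.
list_bridge_containers() {
  docker network inspect \
    -f '{{range .Containers}}{{.Name}} {{.IPv4Address}}{{println}}{{end}}' \
    bridge
}
```

The addresses printed are the ones the containers use on their eth0 interfaces in the hops above.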
The hops above suggest two options for capturing traffic between two containers: capture from inside a container’s network, or capture on the Docker bridge on the host. It is possible to use the Akita CLI with either approach; we recommend the second one.
The first method allows you to capture all packets going in and out of a single container. It works by running a separate tcpdump container that shares its network interfaces with your program’s container.
Setting this up involves two steps:
- Run a tcpdump container attached to your container’s network
- Set up Docker volumes to store the pcap files on your host’s filesystem.
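The two steps can be sketched together in one `docker run` invocation. The image name (`nicolaka/netshoot`), the `./pcap` directory, and the output file name are assumptions made for this example; any image that ships tcpdump should work the same way. `--network container:NAME` makes the new container share the target container's network interfaces, and `-v` mounts a host directory for the capture file.

```shell
# Hypothetical helper. $1 = name or ID of the container running your program.
capture_in_container_network() {
  # Host directory that will receive the pcap file (assumed location).
  mkdir -p "$PWD/pcap"
  docker run --rm \
    --network "container:$1" \
    -v "$PWD/pcap:/pcap" \
    nicolaka/netshoot \
    tcpdump -i eth0 -w /pcap/capture.pcap
}
```

After stopping the capture, the file should be available on the host under `./pcap/capture.pcap`.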
The second method captures on the host instead. As seen in Figure 2, Docker creates a bridge interface on the host. All inter-container traffic goes through this interface, so you can simply run tcpdump on it. Note that you won’t be able to observe loopback traffic within each container using this setup.
Note: this currently only works on Linux systems, where the Docker bridge interface is easily accessible from the host.
To filter for packets by container, you can look up each container’s IP address on the docker bridge network and use BPF to filter packets by IP. For example:
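A sketch of that lookup and capture, assuming the container is attached only to the default bridge network and that the bridge interface is named docker0. Both function names are hypothetical.

```shell
# Print the container's IP address. With a single network attached,
# this is its address on the default bridge.
container_ip() {
  docker inspect \
    -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' "$1"
}

# Capture traffic to or from that container on the host's bridge interface.
# $1 = container name or ID.
capture_container_on_bridge() {
  sudo tcpdump -i docker0 -n "host $(container_ip "$1")"
}
```

If a container is attached to several networks, the template above prints all of its addresses run together, so the filter would need to be built per network instead.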
I hope this post has shown you that it’s possible to programmatically do a lot of things with packet capture.
As I mentioned, you can check out our CLI on GitHub if you’re interested in seeing some of these ideas in action. If you’re interested in trying out Akita to learn more about your APIs and catch regressions, sign up for our beta!
With thanks to Nelson Elhage, Mark Gritter, and Jean Yang for comments. Photo by Braydon Anderson on Unsplash.