Why I Chose vsock Over TCP for a Firecracker Serverless Runtime

When I was building my own serverless runtime, the first major decision I had to make was: how will the host and the VM communicate? It looks simple on the surface, but every choice has a tradeoff.

My invocation flow looks something like this — a request arrives at the control plane, it gets sent to a warm VM, executes inside it, and the output is sent back. This is the hot path. It happens for every single request, and a wrong decision here could blow up latency significantly.

So I chose vsock. Here’s why.

The Obvious Alternative: TCP/IP

When you think about networking, the first thing that comes to mind is TCP/IP — and rightfully so. It’s universal, well-documented, and has solid support in every language.

But for microVMs, it would not have been a great choice. TCP/IP brings both setup overhead and per-packet processing overhead, and both show up as latency.

To use TCP/IP between host and guest you need a virtual network interface inside the VM, an IP address assigned to it, and a TAP device on the host with routing configured (the guest NIC exchanges Ethernet frames, and the TAP device is what delivers them to the host). For a long-running VM this is a one-time cost. But for a serverless platform spinning up thousands of short-lived VMs, that setup cost is paid over and over.

On top of that, every request travels through the Linux networking stack twice — once in the host and once in the VM. For traffic that never leaves the physical machine, you’re paying a real cost for a problem that doesn’t exist.

My Choice: vsock

vsock is a socket address family designed specifically for host-guest communication in virtualized environments. Instead of IP addresses and ports, vsock uses Context Identifiers (CIDs) to address VMs. The host always gets CID 2; VMs are assigned unique CIDs at boot.

vsock bypasses the IP networking stack entirely. Communication goes through the hypervisor layer: in Firecracker’s case, via a virtio-vsock device it implements natively. No IP stack, no routing table. Just a byte stream between two endpoints, mediated by the VMM.

Firecracker exposes vsock on the host side as a Unix domain socket. The guest connects over AF_VSOCK. Firecracker bridges the two. From the application’s perspective on both sides, it looks like a normal socket — you read and write bytes. The transport underneath is just much simpler.

[Host Process] ←→ [UDS] ←→ [Firecracker VMM] ←→ [virtio-vsock] ←→ [Guest Runtime]
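
To make the host side of that diagram concrete: Firecracker's vsock documentation describes a small text handshake for host-initiated connections. The host connects to the Unix domain socket, sends `CONNECT <port>\n`, and receives `OK <host_port>\n` once the guest listener has accepted. The sketch below assumes that host-initiated flow; the function names, the socket path, and the port number are illustrative and not taken from this project.

```javascript
const net = require('node:net');

// Text command Firecracker expects on the UDS for a host-initiated connection.
function connectCommand(guestPort) {
  if (!Number.isInteger(guestPort) || guestPort < 0 || guestPort > 0xffffffff) {
    throw new RangeError(`invalid vsock port: ${guestPort}`);
  }
  return `CONNECT ${guestPort}\n`;
}

// Parse the acknowledgement line (without its trailing newline).
// Returns the host-side port Firecracker assigned, or null if it is not an ack.
function parseAck(line) {
  const match = /^OK (\d+)$/.exec(line);
  return match ? Number(match[1]) : null;
}

// Connect to the vsock UDS, perform the handshake, and resolve with the socket.
// Bytes that arrive after the ack in the same chunk are returned as `rest`.
function openVsock(udsPath, guestPort) {
  return new Promise((resolve, reject) => {
    const socket = net.createConnection(udsPath);
    let buffer = Buffer.alloc(0);

    const onData = (chunk) => {
      buffer = Buffer.concat([buffer, chunk]);
      const newline = buffer.indexOf(0x0a);
      if (newline === -1) return; // ack line not complete yet
      socket.off('data', onData);
      const hostPort = parseAck(buffer.subarray(0, newline).toString('ascii'));
      if (hostPort === null) {
        socket.destroy();
        reject(new Error('unexpected handshake reply'));
        return;
      }
      resolve({ socket, hostPort, rest: buffer.subarray(newline + 1) });
    };

    socket.once('connect', () => socket.write(connectCommand(guestPort)));
    socket.on('data', onData);
    socket.once('error', reject);
    socket.once('close', () => reject(new Error('closed before handshake completed')));
  });
}
```

A caller would use it along the lines of `const { socket } = await openVsock('/tmp/vm-1.vsock', 52)`, where both the path and the port are placeholders. After the handshake, the socket behaves like any other byte stream.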

The Tradeoffs

TCP/IP gives you familiarity and ecosystem support. vsock gives you lower overhead and higher throughput, but sparse documentation and a harder implementation.

For my use case — thousands of requests per second, host and guest always on the same physical machine, strong isolation required — vsock was the clear winner on every dimension that mattered for performance. The tooling gap was real but manageable.

What the Numbers Looked Like

I benchmarked a deployed Node.js handler with autocannon, using 10 concurrent connections over 30 seconds:

Throughput: ~3,500 req/s
p50 latency: ~2 ms
p99 latency: ~10 ms

Benchmarks were executed on an Intel Core i5-11400H (6C/12T, up to 4.5 GHz), 8 GB RAM, 512 GB NVMe SSD, Ubuntu 24.04 (Linux 6.17), KVM-enabled with Intel VT-x hardware virtualization.

Several things contributed to those numbers: snapshot reuse, persistent runtime reuse, and warm VM pooling. But the IPC layer matters too. Each invocation requires at least two trips across the host-guest boundary: request in, response out. At 3,500 requests per second, that’s at least 7,000 boundary crossings per second on a single host. Running that through a full TCP stack would have added measurable overhead on both sides.

The Parts That Were Actually Hard

To be honest, vsock is not a drop-in replacement for TCP. The API looks similar, but the implementation is genuinely difficult, and Firecracker’s quirks, combined with sparse documentation, made it significantly harder. A few things were especially tricky:

Race conditions at connection time

The guest runtime would start listening on a vsock port before the host-side bridge was ready, or vice versa. When that happened, the connection would either fail silently or hang indefinitely. No retry logic, no handshake that naturally surfaces the problem. I had to build explicit lifecycle ordering: host bridge initializes first, then the guest listener starts, then a readiness signal is sent before any IPC traffic flows. Fragile, but it worked.
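
One way to make that ordering less fragile is to wrap the connection attempt in a bounded retry with a per-attempt deadline, so a silent failure or a hang turns into an explicit error. This is a sketch of the general technique, not the project's code. `connect` stands for any hypothetical async function that opens the channel and waits for the guest's readiness signal, and the option names and defaults are assumptions.

```javascript
const { setTimeout: sleep } = require('node:timers/promises');

// Reject if `promise` has not settled within `ms` milliseconds.
function withTimeout(promise, ms) {
  let timer;
  const deadline = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`timed out after ${ms} ms`)), ms);
  });
  return Promise.race([promise, deadline]).finally(() => clearTimeout(timer));
}

// Call `connect` until it succeeds, with capped exponential backoff between
// attempts. Each attempt is abandoned after `attemptTimeoutMs`.
async function connectWithRetry(connect, options = {}) {
  const {
    attempts = 5,
    baseDelayMs = 10,
    maxDelayMs = 200,
    attemptTimeoutMs = 500,
  } = options;
  let lastError;
  for (let attempt = 0; attempt < attempts; attempt++) {
    try {
      return await withTimeout(connect(), attemptTimeoutMs);
    } catch (err) {
      lastError = err;
      if (attempt < attempts - 1) {
        await sleep(Math.min(baseDelayMs * 2 ** attempt, maxDelayMs));
      }
    }
  }
  throw new Error(`guest not ready after ${attempts} attempts: ${lastError?.message}`);
}
```

Note that a timed-out attempt is only abandoned here, not cancelled. A fuller version would also destroy the half-open socket before retrying.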

Snapshot restore corrupts vsock state

This one cost me a few days. When you take a Firecracker snapshot, it captures the full VM memory state — including any open vsock connections. When you restore multiple VMs from the same snapshot, they all wake up with identical vsock state and try to resume the same connection. They collide, hang, or fail in ways that are completely unexpected.

There’s a second issue too: the vsock device is tied to a host UDS path. If every restored VM points at the same UDS, you get equally confusing failures even when no requests are in flight. Each restored microVM needs its own host-side UDS endpoint.
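
A small helper can guarantee that no two restored VMs share a host-side path. The sketch below is an assumption about how such paths might be allocated: the run directory, the `.vsock` suffix, and the function name are all hypothetical. It also checks the path against the 108-byte `sun_path` limit that Linux places on Unix socket addresses, which is easy to hit with long IDs.

```javascript
const path = require('node:path');
const crypto = require('node:crypto');

// Linux caps sockaddr_un.sun_path at 108 bytes, including the terminating NUL.
const SUN_PATH_MAX = 108;

// Return a UDS path unique to one microVM. If no ID is given, a random UUID
// is generated so two restores from the same snapshot never collide.
function udsPathFor(runDir, vmId = crypto.randomUUID()) {
  const socketPath = path.posix.join(runDir, `${vmId}.vsock`);
  const length = Buffer.byteLength(socketPath);
  if (length >= SUN_PATH_MAX) {
    throw new RangeError(`UDS path too long (${length} bytes): ${socketPath}`);
  }
  return socketPath;
}
```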

Neither of these was obvious at first. The first request would go through fine. Everything looked good. But as soon as I started testing concurrently, things fell apart: requests failing, getting stuck, no clear pattern.

The fix was using a newer Firecracker dev build (the changes are now merged into the main build) with improved vsock snapshot handling. Restored VMs now start with a clean slate, get a fresh UDS endpoint, and establish a new vsock connection on first invocation.

vsock is just a byte stream

This sounds obvious but it has practical consequences. There are no built-in message boundaries. If you write 200 bytes from the guest, you might read 200 bytes on the host in one call — or 47 bytes and then 153 bytes in two separate reads. Partial reads happen constantly, and more frequently under load. I had to design an explicit framing protocol on top: newline-delimited JSON, with reassembly handled on both sides.

The Framing Protocol

The fix is straightforward, but you have to actually build it: newline-delimited JSON. Every message ends with \n, and both sides accumulate chunks into a buffer until they see one.

On the guest side (the runtime running inside the VM), the server listens on a Unix socket and processes incoming messages:
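
A minimal sketch of what such a listener could look like follows; it is not the project's actual code. The socket path and the `handler` callback are placeholders, and the helper names `drainLines` and `handleLine` are made up for illustration.

```javascript
const net = require('node:net');

// Pull every complete newline-terminated line out of `buffer`.
// Returns the complete lines plus whatever partial data is left over.
function drainLines(buffer) {
  const lines = [];
  let idx;
  while ((idx = buffer.indexOf('\n')) !== -1) {
    const line = buffer.slice(0, idx);
    buffer = buffer.slice(idx + 1);
    if (line.length > 0) lines.push(line);
  }
  return { lines, rest: buffer };
}

// Parse one request line, run the handler, and write one framed reply.
async function handleLine(socket, line, handler) {
  let reply;
  try {
    const request = JSON.parse(line);
    reply = { type: 'response', data: await handler(request) };
  } catch (err) {
    reply = { type: 'error', data: err instanceof Error ? err.message : String(err) };
  }
  socket.write(JSON.stringify(reply) + '\n');
}

// Hypothetical guest-side server; `socketPath` and `handler` are placeholders.
function startGuestServer(socketPath, handler) {
  const server = net.createServer((socket) => {
    socket.setEncoding('utf8'); // keeps multi-byte characters intact across chunks
    let buffer = '';
    let queue = Promise.resolve(); // replies go out in request order
    socket.on('data', (chunk) => {
      buffer += chunk;
      const { lines, rest } = drainLines(buffer);
      buffer = rest;
      for (const line of lines) {
        queue = queue.then(() => handleLine(socket, line, handler));
      }
    });
    socket.on('error', () => socket.destroy());
  });
  server.listen(socketPath);
  return server;
}
```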

The while loop is the important part. A single data event can contain multiple complete messages, or half a message, or a message and a half. You keep slicing until there’s no newline left, then leave the remainder in the buffer for the next chunk.

The host side mirrors this exactly — buffer incoming chunks, scan for newlines, parse complete lines:
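
A rough sketch of the host side, under the same caveats: `invoke` is a hypothetical name, the socket is assumed to be already connected, and the 5-second default deadline is an arbitrary choice. It writes one request, buffers chunks until the first newline, and enforces a deadline on the whole read.

```javascript
const { StringDecoder } = require('node:string_decoder');

// Send one request and resolve with the first complete NDJSON reply.
// If nothing arrives within `timeoutMs`, destroy the socket and reject.
function invoke(socket, request, timeoutMs = 5000) {
  return new Promise((resolve, reject) => {
    const decoder = new StringDecoder('utf8'); // safe across split multi-byte chars
    let buffer = '';

    const cleanup = () => {
      clearTimeout(timer);
      socket.off('data', onData);
      socket.off('error', onError);
      socket.off('close', onClose);
    };

    const timer = setTimeout(() => {
      cleanup();
      socket.destroy();
      reject(new Error(`no reply within ${timeoutMs} ms`));
    }, timeoutMs);

    function onData(chunk) {
      buffer += decoder.write(chunk);
      const idx = buffer.indexOf('\n');
      if (idx === -1) return; // partial message, wait for more chunks
      cleanup();
      try {
        resolve(JSON.parse(buffer.slice(0, idx)));
      } catch (err) {
        reject(err);
      }
    }
    function onError(err) {
      cleanup();
      reject(err);
    }
    function onClose() {
      cleanup();
      reject(new Error('socket closed before reply'));
    }

    socket.on('data', onData);
    socket.on('error', onError);
    socket.on('close', onClose);
    socket.write(JSON.stringify(request) + '\n');
  });
}
```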

The host also sets a timeout on the whole read. If the VM doesn’t respond within the deadline, the socket gets destroyed and the request fails fast rather than hanging. Without this, a crashed runtime would stall the entire connection pool.

The protocol itself is minimal — two message types, response and error, both with a data field. That’s all you need when the only thing crossing the boundary is a function invocation and its result.
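
As an illustration of how small that protocol is, the two message types can be encoded and validated in a few lines. The helper names below are hypothetical, and the validation rules are an assumption about what a receiver would reasonably check.

```javascript
// The only two message types that cross the host-guest boundary.
const MESSAGE_TYPES = new Set(['response', 'error']);

// Serialize one message as a single newline-terminated JSON line.
// JSON.stringify escapes newlines inside strings, so the only raw '\n'
// in the output is the frame delimiter at the end.
function encodeMessage(type, data) {
  if (!MESSAGE_TYPES.has(type)) {
    throw new TypeError(`unknown message type: ${type}`);
  }
  return JSON.stringify({ type, data }) + '\n';
}

// Parse one line (without its trailing newline) and check its shape.
function decodeMessage(line) {
  const message = JSON.parse(line);
  const valid =
    message !== null &&
    typeof message === 'object' &&
    MESSAGE_TYPES.has(message.type) &&
    'data' in message;
  if (!valid) {
    throw new TypeError('malformed message');
  }
  return message;
}
```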

My Takeaway

The reason vsock makes sense in a Firecracker context is that it’s the right primitive for the actual communication pattern: two processes on the same physical machine, where one happens to be inside a VM, exchanging structured messages at high frequency.

TCP was designed for unreliable networks. It does a remarkable job of making packet loss, reordering, and congestion invisible to applications. Inside a microVM on a single host, none of those problems exist. Choosing TCP here is overkill — it works, but you’re carrying a lot of infrastructure for a problem you don’t have.

vsock removes the abstraction layers that don’t apply to this environment and gives you a direct channel to the hypervisor. For a hot IPC path in a serverless runtime, that directness translates into latency and throughput that would be harder to achieve otherwise.

The full project is on GitHub.