5 Comments
Stephen Warr's avatar

Thank you for the interesting insights. I’m intrigued, though: if the architecture rewrites and forwards the IP header, how does it handle return traffic? The host chosen to receive a given request gets its inbound traffic over one Ethernet path (from the balancer), but that isn’t the outbound path it needs to take back to the original client. This seems to create some interesting constraints on the network topology.

Lorenzo Bradanini's avatar

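This is actually a great question! There are two common ways for a layer-4 balancer to handle return traffic, and as you probably know, Katran uses the first. With DSR (Direct Server Return), the balancer touches only inbound packets: it wraps each one in an outer header addressed to the chosen backend (IP-in-IP encapsulation) and leaves the original packet, still addressed to the VIP, intact inside. The backend, which has the VIP configured locally, unwraps the packet and replies directly to the client with the VIP as the source address, bypassing the balancer entirely. Because responses are usually much larger than requests, this takes most of the bandwidth and packet processing off the balancer, though it requires a bit of tunnel and routing setup on the backends. With NAT mode, by contrast, the balancer rewrites both inbound and outbound packets, so responses flow back through the balancer, which swaps the source address back to the VIP.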
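To make the DSR return path concrete, here is a minimal sketch, assuming IP-in-IP style encapsulation. The `Packet` class, the helper functions, and all addresses are hypothetical illustrations, not Katran's code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    inner: Optional["Packet"] = None  # set only on an encapsulating outer header


# Hypothetical addresses, chosen for illustration only.
VIP = "203.0.113.10"
BALANCER = "10.0.0.1"
BACKEND = "10.0.0.21"
CLIENT = "198.51.100.7"


def balancer_forward(pkt: Packet, backend_ip: str) -> Packet:
    """Wrap the client's packet, unmodified, in an outer header for the backend."""
    return Packet(src=BALANCER, dst=backend_ip, inner=pkt)


def backend_reply(received: Packet) -> Packet:
    """Decapsulate, then answer the client directly with the VIP as source."""
    original = received.inner
    assert original is not None, "expected an encapsulated packet"
    return Packet(src=original.dst, dst=original.src)


request = Packet(src=CLIENT, dst=VIP)
tunnelled = balancer_forward(request, BACKEND)
reply = backend_reply(tunnelled)
print(reply)
```

The reply never names the balancer, so it follows the backend's ordinary route towards the client. That is the topology constraint raised in the question: the backend must accept traffic for the VIP and must have its own route back to the client.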

With DSR, the data plane does not need per-connection state to forward correctly: no flow table or session memory is required. Each packet's backend is computed independently from a deterministic hash of its headers, while the control plane quietly updates the eBPF maps and monitors backend health. It’s like a perfectly choreographed dance: the balancer doesn’t have to remember the steps; it just ensures every packet reaches the right partner.
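As a rough illustration of picking a backend with "deterministic math" and no stored state, here is a sketch using rendezvous (highest-random-weight) hashing over the connection 5-tuple. This is a simpler stand-in and not Katran's actual algorithm (its documentation describes a Maglev-style hash); the function name and addresses are assumptions.

```python
import hashlib


def pick_backend(flow, backends):
    """Return the backend with the highest hash score for this flow.

    Nothing is stored between calls: the same flow and backend set
    always produce the same answer.
    """
    def score(backend):
        key = "|".join(map(str, flow)) + "|" + backend
        digest = hashlib.sha256(key.encode()).digest()
        return int.from_bytes(digest[:8], "big")

    return max(backends, key=score)


# Hypothetical flow: (client IP, client port, VIP, VIP port, protocol).
flow = ("198.51.100.7", 51514, "203.0.113.10", 443, "tcp")
backends = ["10.0.0.21", "10.0.0.22", "10.0.0.23"]

first = pick_backend(flow, backends)
again = pick_backend(flow, backends)
print(first == again)
```

One property of this choice is that removing a backend other than the selected one does not move the flow, which is why hashes of this family are popular when the backend set changes under health checks.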

Lorenzo Bradanini's avatar

I sincerely hope this helps clarify your doubts.

Stephen Warr's avatar

NAT mode requires tracking session state (remembering the client IP address/port), does it not?

Lorenzo Bradanini's avatar

Yes, it does actually.

NAT mode can’t be stateless, because the moment the load balancer rewrites the packet, it has to remember what it did. Otherwise, when the response comes back, it wouldn’t know where to send it.

So every new connection creates a small piece of state inside the load balancer: a mapping between the client and the backend. Not because it wants to, but because it has to. It changed the packet’s identity, and now it’s responsible for restoring it on the way back.

This is what makes NAT stateful. It’s not just forwarding traffic. It’s keeping track of relationships.
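A minimal sketch of that bookkeeping, assuming a full-NAT design in which the balancer substitutes its own address and a fresh port on the way in. The `NatBalancer` class and all addresses are hypothetical, and the sketch leaves out entry expiry and port reuse.

```python
class NatBalancer:
    """Toy full-NAT balancer: one table entry per live connection."""

    def __init__(self, vip, balancer_ip, first_port=40000):
        self.vip = vip
        self.balancer_ip = balancer_ip
        self.next_port = first_port
        # (backend_ip, balancer_port) -> (client_ip, client_port)
        self.table = {}

    def inbound(self, client_ip, client_port, backend_ip):
        """Rewrite a client packet and remember who it came from."""
        port = self.next_port
        self.next_port += 1
        self.table[(backend_ip, port)] = (client_ip, client_port)
        # New (source IP, source port, destination IP) sent to the backend.
        return (self.balancer_ip, port, backend_ip)

    def outbound(self, backend_ip, balancer_port):
        """Restore the original client address; impossible without the table."""
        client_ip, client_port = self.table[(backend_ip, balancer_port)]
        # Reply leaves with the VIP as source, addressed to the real client.
        return (self.vip, client_ip, client_port)


nat = NatBalancer(vip="203.0.113.10", balancer_ip="10.0.0.1")
to_backend = nat.inbound("198.51.100.7", 51514, "10.0.0.21")
to_client = nat.outbound("10.0.0.21", to_backend[1])
print(to_client, len(nat.table))
```

The table grows by one entry per connection and stays that size until something removes entries, which is the memory pressure at issue once connections number in the millions.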

At small scale, this is perfectly fine. But as connections grow into the millions, that memory becomes real work. Real pressure. Real limits.

That’s why systems like those developed at Meta Platforms try to avoid NAT when possible. Not because NAT is wrong, but because avoiding state removes an entire class of constraints.

It’s the difference between remembering every conversation, and simply knowing where to point the next one.