Module 13 of 21 · Applied

How NAT uses state, and why it breaks assumptions

17 min read · 3 outcomes · Scenario quiz

By the end of this module you will be able to:

  • Explain what NAT changes, why it requires state, and how return traffic is matched
  • Describe the end-to-end principle and how NAT breaks it
  • Explain NAT traversal techniques (STUN, TURN, ICE) and why peer-to-peer applications need them

Real-world problem · Ongoing

Carrier-grade NAT: when your ISP gives you a shared IP address you cannot control

Carrier-grade NAT (CGN or CGNAT) is NAT performed by an Internet Service Provider (ISP) or mobile carrier rather than a home router. The ISP assigns a single public IP address to multiple customers simultaneously. All traffic from those customers reaches the internet from the same source IP, with different port numbers distinguishing each customer's connections.

RFC 6888 documents the requirements and implications of CGNAT. The practical consequences for developers include broken IP geolocation (the source IP is the ISP's NAT device in a data centre, not the customer's location), broken rate limiting by IP (one IP serves thousands of customers), and broken services that assume a persistent 1:1 mapping between user and IP address.
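The rate-limiting consequence can be sketched in a few lines. This is a minimal illustration, not a real limiter: the function name `allowed_requests`, the `LIMIT_PER_IP` value, and the customer labels are all hypothetical, and the IP address is a documentation address standing in for an ISP's shared public IP.

```python
from collections import Counter

LIMIT_PER_IP = 3  # hypothetical limit: requests allowed per source IP per window


def allowed_requests(requests):
    """Apply a naive per-IP limit and return the customers whose requests were served."""
    seen = Counter()
    served = []
    for source_ip, customer in requests:
        seen[source_ip] += 1
        if seen[source_ip] <= LIMIT_PER_IP:
            served.append(customer)
    return served


# Five different customers behind one CGNAT address each send a single request.
cgnat_ip = "198.51.100.7"  # documentation address, standing in for the shared public IP
requests = [(cgnat_ip, f"customer-{n}") for n in range(1, 6)]
served = allowed_requests(requests)
print(served)  # customers 4 and 5 are rejected although each sent only one request
```

Keying the limit on something other than the source IP alone (an account, a session token) avoids punishing customers for sharing an address.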

For end users, CGNAT rules out unsolicited inbound connections, breaks certain gaming services, and can make port forwarding unavailable, because the customer has no control over the ISP's NAT device. IPv6 addresses this problem by providing enough addresses for every device to have its own public address, removing the need for address sharing. Even so, IPv4 CGNAT remains standard on most mobile networks and many residential broadband connections globally.

A developer reports that their application's IP geolocation feature shows wrong results for customers on certain mobile carriers. All those customers show the same IP address, and it is located in a data centre, not their city. What is happening?

13.1 Why NAT exists: the IPv4 address exhaustion problem

IPv4 uses 32-bit addresses, providing approximately 4.3 billion unique addresses. That sounded sufficient in 1981. By the early 1990s, it was clear it would not be enough for the growing internet. NAT (Network Address Translation) was one solution: allow many devices to share a single public IP address.

RFC 3022 defines traditional NAT. The idea is straightforward. An organisation with one public IP address can have hundreds of internal devices using private address ranges (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16 as defined in RFC 1918). The NAT device rewrites source addresses on outgoing packets and reverses the process for incoming traffic.
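As a rough illustration of the numbers involved, the sketch below uses Python's standard `ipaddress` module to test whether an address falls in one of the RFC 1918 ranges and to count how many addresses those ranges contain. The helper name `is_rfc1918` is hypothetical.

```python
import ipaddress

# The three private ranges named in RFC 1918.
RFC1918_NETWORKS = [
    ipaddress.ip_network(cidr)
    for cidr in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def is_rfc1918(address: str) -> bool:
    """Return True if the IPv4 address lies in one of the RFC 1918 ranges."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in RFC1918_NETWORKS)


total_ipv4 = 2 ** 32  # size of the whole 32-bit address space
private_total = sum(network.num_addresses for network in RFC1918_NETWORKS)

print(total_ipv4)                 # 4294967296
print(private_total)              # 17891328
print(is_rfc1918("192.168.1.5"))  # True
print(is_rfc1918("203.0.113.1"))  # False
```

Because private addresses are reused in every network that wants them, they cannot be routed on the public internet, which is why a translation step is needed at the boundary.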

The rewriting creates an obligation: the NAT device must track every active translation so it knows where to send the return traffic. This is the state requirement. NAT is not stateless translation. It is translation plus a per-connection mapping table.

13.2 SNAT, DNAT, and port address translation

SNAT (Source NAT) rewrites the source address of outgoing packets. A device at 192.168.1.5 sends a packet; the NAT device rewrites the source to 203.0.113.1 (the public IP) before forwarding it. Return traffic to 203.0.113.1 is rewritten back to 192.168.1.5.

DNAT (Destination NAT) rewrites the destination address of incoming packets. Traffic arriving for 203.0.113.1:80 is rewritten to 192.168.1.10:80 to reach a web server on the internal network. Port forwarding is DNAT. Load balancers often use DNAT to distribute traffic across backend servers.
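Port forwarding can be pictured as a static lookup table. The sketch below is a simplified model rather than how any particular router is configured; the `PORT_FORWARDS` table and the `apply_dnat` function are hypothetical names, and the addresses are the ones used in the paragraph above.

```python
# Hypothetical static rules: (public IP, public port) -> (internal IP, internal port)
PORT_FORWARDS = {
    ("203.0.113.1", 80): ("192.168.1.10", 80),
}


def apply_dnat(dst_ip, dst_port):
    """Return the rewritten destination, or None when no forwarding rule exists."""
    return PORT_FORWARDS.get((dst_ip, dst_port))


print(apply_dnat("203.0.113.1", 80))   # ('192.168.1.10', 80)
print(apply_dnat("203.0.113.1", 22))   # None: nothing is forwarded for this port
```

Unlike the mappings created by outbound traffic, these rules are configured in advance, which is what lets an inbound connection arrive without the internal host speaking first.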

PAT (Port Address Translation), also called NAPT (Network Address Port Translation), allows many internal addresses to share one public IP by using different source port numbers to distinguish connections. This is what home routers do. Device A at 192.168.1.2 and Device B at 192.168.1.3 both reach the internet from the same public IP, but their connections use different source port numbers so the NAT device can tell return traffic apart.
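A minimal sketch of the port-allocation idea follows. It assumes a single public IP and hands out public ports sequentially from an arbitrary starting value; real devices choose ports differently and also track protocol and destination. The class name `PortAddressTranslator` and its methods are hypothetical.

```python
from itertools import count

PUBLIC_IP = "203.0.113.1"


class PortAddressTranslator:
    """Toy PAT: maps (internal IP, internal port) to a unique public port."""

    def __init__(self, first_port=40000):
        self._next_port = count(first_port)  # assumption: sequential allocation
        self.outbound = {}  # (internal IP, internal port) -> public port
        self.inbound = {}   # public port -> (internal IP, internal port)

    def translate_out(self, src_ip, src_port):
        """Rewrite the source of an outgoing packet, creating a mapping if needed."""
        key = (src_ip, src_port)
        if key not in self.outbound:
            public_port = next(self._next_port)
            self.outbound[key] = public_port
            self.inbound[public_port] = key
        return PUBLIC_IP, self.outbound[key]

    def translate_in(self, dst_port):
        """Find the internal endpoint for return traffic, or None if unmapped."""
        return self.inbound.get(dst_port)


nat = PortAddressTranslator()
# Device A and Device B happen to pick the same local source port.
print(nat.translate_out("192.168.1.2", 51000))  # ('203.0.113.1', 40000)
print(nat.translate_out("192.168.1.3", 51000))  # ('203.0.113.1', 40001)
print(nat.translate_in(40001))                  # ('192.168.1.3', 51000)
```

Even though both devices used the same local port, the public ports differ, so return traffic can be steered to the right device.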

Traditional NAT allows hosts within a private network to transparently access hosts in the external network, in most cases. In traditional NAT, sessions are uni-directional, outbound from the private network.

RFC 3022 - Section 4, Traditional NAT

The phrase 'uni-directional, outbound' is the key limitation. Traditional NAT creates mappings on demand when an internal device initiates a connection. Unsolicited inbound connections have no mapping to follow. This is why a home device behind NAT cannot receive inbound connections unless port forwarding is configured.

13.3 The NAT state table

For every active connection through the NAT device, a table entry records the original source IP and port, the translated source IP and port, and the destination IP and port. Return traffic is matched against this table: incoming packets with a destination matching a translated entry are rewritten back to the original source.
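The matching step can be sketched as follows. This is a simplified model that records only the three fields listed above; the `NatEntry` type and the `match_return` function are hypothetical, and the server address is a documentation address.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NatEntry:
    original_src: tuple    # (internal IP, internal port)
    translated_src: tuple  # (public IP, public port)
    destination: tuple     # (remote IP, remote port)


def match_return(table, packet_src, packet_dst):
    """Return the internal endpoint for an incoming packet, or None to drop it."""
    for entry in table:
        if entry.translated_src == packet_dst and entry.destination == packet_src:
            return entry.original_src
    return None


table = [
    NatEntry(("192.168.1.5", 51000), ("203.0.113.1", 40000), ("198.51.100.20", 443)),
]

# A reply from the server the internal host contacted matches the entry.
print(match_return(table, ("198.51.100.20", 443), ("203.0.113.1", 40000)))
# A packet to a public port with no entry has nothing to follow and is dropped.
print(match_return(table, ("198.51.100.99", 443), ("203.0.113.1", 40001)))
```

The second lookup returns `None`, which is the same situation an unsolicited inbound connection finds itself in.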

Each entry has an idle timer. When no matching traffic is seen for the timeout period, the entry is removed. The timeout varies by protocol and by device: established TCP connections typically get longer timeouts than UDP flows. The result is intermittent failure: a long-lived TCP connection that stays idle for longer than the NAT timeout loses its state entry. When the connection becomes active again, return traffic has no entry to match and is dropped.
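The expiry behaviour can be modelled with explicit timestamps. In the sketch below the timeout values are illustrative assumptions, not values taken from any standard or device, and the `NatStateTable` class is hypothetical.

```python
# Illustrative idle timeouts in seconds. Real devices vary; these are assumptions.
IDLE_TIMEOUT = {"tcp": 3600, "udp": 60}


class NatStateTable:
    """Toy state table whose entries expire after a per-protocol idle period."""

    def __init__(self):
        self._entries = {}  # flow key -> (protocol, time the last packet was seen)

    def refresh(self, flow, protocol, now):
        """Record that a packet for this flow was seen at time `now`."""
        self._entries[flow] = (protocol, now)

    def is_active(self, flow, now):
        """Return True if the flow still has an unexpired entry at time `now`."""
        entry = self._entries.get(flow)
        if entry is None:
            return False
        protocol, last_seen = entry
        if now - last_seen >= IDLE_TIMEOUT[protocol]:
            del self._entries[flow]  # idle for too long: the mapping is removed
            return False
        return True


state = NatStateTable()
ssh_flow = ("192.168.1.5", 51000, "198.51.100.20", 22)
state.refresh(ssh_flow, "tcp", now=0)

still_open_at_30_min = state.is_active(ssh_flow, now=1800)
still_open_at_2_hours = state.is_active(ssh_flow, now=7200)
print(still_open_at_30_min, still_open_at_2_hours)  # True False
```

Neither endpoint is told when the entry is removed, which is why the failure only shows up when traffic resumes.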

This is why applications that maintain long-lived idle connections, such as SSH sessions, sometimes drop silently after a period of inactivity. The connection appears open on both ends but the NAT state has expired. TCP keepalives or application-level heartbeats prevent this by sending periodic small packets to refresh the NAT entry.
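Enabling TCP keepalives from application code might look like the sketch below. `SO_KEEPALIVE` is widely available; the three tuning options are platform-specific (they exist on Linux, for example), so the code checks for each before using it. The helper name `enable_keepalive` and the timing values are assumptions chosen for illustration and should be set below the idle timeout of the NAT devices on the path.

```python
import socket


def enable_keepalive(sock, idle=60, interval=15, probes=4):
    """Turn on TCP keepalives and, where the platform allows, tune their timing."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The options below are not defined on every platform, so guard each one.
    if hasattr(socket, "TCP_KEEPIDLE"):
        # Seconds of idleness before the first keepalive probe is sent.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)
    if hasattr(socket, "TCP_KEEPINTVL"):
        # Seconds between probes once probing has started.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)
    if hasattr(socket, "TCP_KEEPCNT"):
        # Unanswered probes before the connection is declared dead.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probes)


sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
enable_keepalive(sock)
keepalive_on = sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
sock.close()
print(keepalive_on)
```

An application-level heartbeat achieves the same refresh without depending on operating system settings, at the cost of defining a message for it in the protocol.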

13.4 How NAT breaks the end-to-end principle

The end-to-end principle (set out by Saltzer, Reed, and Clark in their 1984 paper "End-to-End Arguments in System Design") holds that intelligence should be placed at the endpoints of a network, not in the middle. Each device should be directly addressable and reachable. The network should move packets, not make decisions about them.

NAT violates this. An internal device's address is not globally routable, so the device cannot receive unsolicited connections. Two devices behind different NATs cannot communicate directly unless at least one NAT already holds a mapping that the other peer can reach; otherwise they need a third-party relay.

Peer-to-peer applications (video calls, file sharing, gaming) break under NAT because both peers may be behind NAT and neither can directly receive a connection from the other. This is the NAT traversal problem.

13.5 NAT traversal: STUN, TURN, and ICE

STUN (Session Traversal Utilities for NAT), defined in RFC 5389 (since obsoleted by RFC 8489), is a protocol that lets a client discover its public IP address and port as seen from outside its NAT. The client sends a request to a public STUN server; the response reports the source IP and port the server observed. The client can then share this server-reflexive address with peers.
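To make the exchange concrete, the sketch below builds a STUN Binding request and decodes the XOR-MAPPED-ADDRESS attribute of a success response, following the message layout in the STUN specification (a 20-byte header carrying the magic cookie `0x2112A442`, then type-length-value attributes). It handles IPv4 only and does no networking; the helper `fake_success_response` is a hypothetical stand-in for a server so the example can run offline.

```python
import os
import socket
import struct

MAGIC_COOKIE = 0x2112A442
BINDING_REQUEST = 0x0001
BINDING_SUCCESS = 0x0101
XOR_MAPPED_ADDRESS = 0x0020


def build_binding_request(transaction_id=None):
    """Header only: type, attribute length (0), magic cookie, 12-byte transaction ID."""
    transaction_id = transaction_id or os.urandom(12)
    return struct.pack("!HHI12s", BINDING_REQUEST, 0, MAGIC_COOKIE, transaction_id)


def parse_xor_mapped_address(message):
    """Return (ip, port) from a Binding success response, or None if absent."""
    msg_type, length, cookie = struct.unpack("!HHI", message[:8])
    if msg_type != BINDING_SUCCESS or cookie != MAGIC_COOKIE:
        return None
    offset, end = 20, 20 + length
    while offset + 4 <= end:
        attr_type, attr_len = struct.unpack("!HH", message[offset:offset + 4])
        value = message[offset + 4:offset + 4 + attr_len]
        # value[1] is the address family; 0x01 means IPv4.
        if attr_type == XOR_MAPPED_ADDRESS and len(value) >= 8 and value[1] == 0x01:
            x_port, x_addr = struct.unpack("!HI", value[2:8])
            port = x_port ^ (MAGIC_COOKIE >> 16)
            ip = socket.inet_ntoa(struct.pack("!I", x_addr ^ MAGIC_COOKIE))
            return ip, port
        offset += 4 + attr_len + (-attr_len % 4)  # attributes are padded to 4 bytes
    return None


def fake_success_response(transaction_id, ip, port):
    """Hypothetical stand-in for a STUN server's reply, used only for this demo."""
    x_port = port ^ (MAGIC_COOKIE >> 16)
    x_addr = struct.unpack("!I", socket.inet_aton(ip))[0] ^ MAGIC_COOKIE
    attr = struct.pack("!HHBBHI", XOR_MAPPED_ADDRESS, 8, 0, 0x01, x_port, x_addr)
    header = struct.pack("!HHI12s", BINDING_SUCCESS, len(attr), MAGIC_COOKIE, transaction_id)
    return header + attr


request = build_binding_request()
response = fake_success_response(request[8:20], "203.0.113.1", 40000)
print(parse_xor_mapped_address(response))  # ('203.0.113.1', 40000)
```

In real use the request would be sent over UDP to a STUN server and the transaction ID in the reply checked against the one sent; both steps are omitted here.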

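TURN (Traversal Using Relays around NAT) is the fallback when a direct connection is impossible. Hole punching, in which both peers send outbound packets to each other's reflexive address so that each NAT creates a mapping the other can use, fails against some NAT behaviours. In that case a TURN server relays all traffic between the peers. TURN works in nearly every NAT configuration, because each peer only needs an outbound connection to the relay, but it adds latency and bandwidth cost.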

ICE (Interactive Connectivity Establishment), defined in RFC 8445, is the framework that coordinates candidate gathering and connectivity checks. An ICE agent gathers multiple candidate addresses (local, STUN-discovered, TURN relay), shares them with the peer, then tests each pair to find the best working path. WebRTC, which powers most web-based video calling, uses ICE.
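How ICE decides which path is "best" can be sketched with the priority formulas from RFC 8445. The type preferences below are the values that RFC recommends (126 for host, 100 for server-reflexive, 0 for relayed); the function names are hypothetical, and the sketch leaves out everything else ICE does, such as the connectivity checks themselves.

```python
# Recommended type preferences from RFC 8445: host > server-reflexive > relayed.
TYPE_PREFERENCE = {"host": 126, "srflx": 100, "relay": 0}


def candidate_priority(cand_type, local_pref=65535, component_id=1):
    """priority = 2^24 * type preference + 2^8 * local preference + (256 - component ID)"""
    return (TYPE_PREFERENCE[cand_type] << 24) + (local_pref << 8) + (256 - component_id)


def pair_priority(controlling, controlled):
    """pair priority = 2^32 * min(G, D) + 2 * max(G, D) + (1 if G > D else 0)"""
    g, d = controlling, controlled
    return (min(g, d) << 32) + 2 * max(g, d) + (1 if g > d else 0)


priorities = {kind: candidate_priority(kind) for kind in TYPE_PREFERENCE}
print(priorities)

# Rank same-type pairs: direct paths are tried before the TURN relay.
pairs = sorted(
    TYPE_PREFERENCE,
    key=lambda kind: pair_priority(priorities[kind], priorities[kind]),
    reverse=True,
)
print(pairs)  # ['host', 'srflx', 'relay']
```

The ordering reflects the cost trade-off described above: a relayed path is the most likely to work and the most expensive, so it is ranked last.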

Common misconception

NAT is a security feature.

NAT restricts unsolicited inbound connections as a side effect of its stateful translation design, not as a deliberate security mechanism. It provides no authentication, no access control, and no filtering of connection content. A firewall with a deny-inbound policy provides the same reachability restriction while also applying explicit security policy. Relying on NAT as your primary security boundary is a design error.

13.6 Check your understanding

Two devices behind the same NAT both connect to the same web server on port 443. How does the NAT device distinguish the return traffic?

An SSH session to a remote server stops responding after 30 minutes of inactivity. The connection appears still open on both sides. What is the most likely cause?

Key takeaways

  • NAT rewrites IP addresses (and ports in PAT) in packets and maintains a state table to match return traffic. State entries expire when idle.
  • SNAT rewrites source addresses on outbound traffic. DNAT rewrites destination addresses for inbound traffic. PAT allows many devices to share one IP using different port numbers.
  • NAT breaks the end-to-end principle. Internal devices cannot receive unsolicited inbound connections. Peer-to-peer applications require NAT traversal techniques (STUN, TURN, ICE).
  • NAT is not a security feature. It restricts inbound reachability as a side effect of stateful translation, not through explicit access control policy.

Standards and sources cited in this module

  1. RFC 3022, Traditional IP Network Address Translator (Traditional NAT)

    Section 2, Terminology; Section 4, Traditional NAT

    Defines traditional NAT types and the stateful mapping requirement. Quoted in Section 13.2 for the uni-directional session design.

  2. RFC 5389, Session Traversal Utilities for NAT (STUN)

    Section 1, Introduction; Section 5, Definitions

    Defines STUN and the reflexive address discovery mechanism. Referenced in Section 13.5 for the NAT traversal description.

  3. RFC 6888, Common Requirements for Carrier-Grade NATs (CGNs)

    Section 3, Requirements; Section 4, Logging Requirements

    Defines CGNAT requirements and documents operational problems. Used in the opening case study for the geolocation and rate limiting consequences.

  4. RFC 8445, Interactive Connectivity Establishment (ICE)

    Section 2, Overview; Section 5, Gathering Candidates

    Defines the ICE framework used in WebRTC. Referenced in Section 13.5 for the candidate gathering and connectivity check description.

NAT handles addressing at the edge. Module 14 covers what protects the data itself: TLS encryption, certificate chain validation, forward secrecy, and the common errors you will encounter when something in the trust chain breaks.
