Firewalls & VPNs

Protecting the network

Paul Krzyzanowski

November 13, 2024

Firewalls

A firewall protects the junction between different network segments, most typically between an untrusted network (e.g., external Internet) and a trusted network (e.g., internal network). Two approaches to firewalling are packet filtering and proxies.

A normal router has the task of determining how to route a packet. That is, a router is connected to two or more networks, each attached to a different port on the router. An IP packet is received on one port, and the router must determine which port to forward it to.

A packet filter, also known as a screening router, is a router that not only selects the route for a packet but also determines whether the packet should be routed or dropped based on specific rules. These rules are applied to the packet’s IP header, TCP/UDP header, and the network interface (port) where the packet was received. Packet filtering is typically performed by a border router (also called a gateway router), which controls the flow of traffic between an internal network and an external network, such as the Internet. The border router determines whether a packet should be forwarded to its destination or rejected, helping protect the internal network from unauthorized access.

The basic principle of firewalls is to never allow a direct inbound connection from an originating host on the Internet to an internal host; all traffic must flow through a firewall and be inspected.

The packet filter evaluates a set of rules to determine whether to drop or accept a packet. This set of rules forms an access control list, often called a chain. Strong security follows a default deny model, where packets are dropped unless some rule in the chain specifically permits them.

First-generation packet filters implemented stateless inspection: each packet is examined on its own, with no context from previously seen packets.
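
As an illustration, here is a minimal, hypothetical sketch of stateless rule evaluation with a default-deny policy (the rule format and packet representation are simplifications, not any real firewall’s syntax):

    # Minimal sketch of a stateless packet filter chain (hypothetical format).
    RULES = [
        # (action, protocol, destination port)
        ("accept", "tcp", 25),   # allow inbound mail to the SMTP server
        ("accept", "tcp", 80),   # allow inbound web traffic
    ]

    def filter_packet(packet: dict) -> str:
        """Evaluate the chain top to bottom; the first matching rule wins."""
        for action, proto, dport in RULES:
            if packet["proto"] == proto and packet["dport"] == dport:
                return action
        return "drop"   # default deny: drop anything no rule explicitly permits

    print(filter_packet({"proto": "tcp", "dport": 80}))   # accept
    print(filter_packet({"proto": "tcp", "dport": 22}))   # drop

Each packet is evaluated in isolation; nothing about earlier packets influences the decision, which is exactly the limitation that stateful inspection addresses.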

Second-generation packet filters

Second-generation packet filters, also known as stateful packet inspection (SPI) firewalls, improve upon first-generation firewalls by tracking the state of active connections and making decisions based on the context of previously seen packets. These firewalls can monitor sessions, understand the relationship between packets, and apply rules based on the state of a connection. By maintaining a state table of active sessions and their statuses, SPI firewalls enhance security and functionality, ensuring that traffic is allowed only when it corresponds to a legitimate, established connection.

Here are key features of SPI firewalls (a code sketch follows the list):

  • TCP Connection Tracking: SPI firewalls ensure that TCP data traffic is only allowed if the proper connection setup (via the three-way handshake) has occurred. This prevents attacks like sequence number prediction, where an attacker might try to inject malicious packets into a session.

  • Allowing Return Traffic: SPI firewalls can track when a client inside the network initiates a connection to a remote server and allow the corresponding return traffic. This is crucial for enabling responses to requests made by users within the network, such as loading web pages or retrieving email.

  • UDP and ICMP Support: For connectionless protocols like UDP and ICMP, SPI firewalls can track outgoing messages (e.g., DNS queries or ping requests) and permit the corresponding responses (e.g., DNS replies or ICMP echo-replies) back to the internal network.

  • Understanding Related Connections: SPI firewalls recognize relationships between primary and secondary connections. For instance:

    • When a client initiates an FTP session to a server on port 21, the server may open a secondary connection back to the client on a different port to transfer data.

    • Similarly, media servers often use a TCP connection for control commands and return media streams on separate UDP ports. SPI firewalls can recognize these patterns and allow such related connections.
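
Below is a minimal sketch of the state table behind connection tracking. The representation is hypothetical and heavily simplified; a real SPI firewall tracks full TCP state transitions, timeouts, and related connections:

    # Hypothetical sketch of stateful packet inspection.
    state_table = {}   # (src, sport, dst, dport) -> connection state

    def handle_packet(src, sport, dst, dport, flags, outbound):
        key = (src, sport, dst, dport)
        reverse = (dst, dport, src, sport)
        if outbound and "SYN" in flags:           # internal client opens a connection
            state_table[key] = "SYN_SENT"
            return "accept"
        if not outbound and reverse in state_table:
            state_table[reverse] = "ESTABLISHED"  # return traffic for a tracked session
            return "accept"
        if outbound and key in state_table:       # more traffic on a tracked session
            return "accept"
        return "drop"                             # default deny: unrelated inbound traffic

    print(handle_packet("10.0.0.5", 40000, "198.51.100.2", 443, {"SYN"}, True))          # accept
    print(handle_packet("198.51.100.2", 443, "10.0.0.5", 40000, {"SYN", "ACK"}, False))  # accept
    print(handle_packet("198.51.100.9", 443, "10.0.0.5", 41000, {"SYN"}, False))         # drop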

Third-Generation Packet Filters

Traditional packet filters primarily inspect packet headers up to the transport layer (e.g., examining TCP/UDP protocols and port numbers) to make routing or filtering decisions. Third-generation packet filters add deep packet inspection (DPI), enabling firewalls to go beyond examining network and transport-layer headers and analyze the actual application data within the packets. This capability allows these firewalls to make decisions based on the specific contents of network traffic.

Deep Packet Inspection (DPI)

Deep packet inspection examines the application-layer data within packets to validate protocols, enforce policies, and detect malicious content. For instance, DPI firewalls can:

  • Identify application-layer protocols (e.g., HTTP, FTP) regardless of the port they are running on.
  • Apply application-specific rules, such as checking for malformed URLs or blocking certain types of content, like suspicious Java applets or ActiveX controls.
  • Detect security threats, such as malicious payloads or protocol anomalies, making DPI a core feature of modern Intrusion Prevention Systems (IPS).

DPI focuses on analyzing individual packets in real time, providing protocol validation and some content filtering without reassembling large chunks of data.
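
As a sketch of the idea, the following hypothetical classifier identifies an application protocol from packet content rather than the port number (real DPI engines use large, maintained signature sets and full protocol parsers):

    # Hypothetical DPI-style protocol identification by content.
    import re

    HTTP_REQUEST = re.compile(rb"^(GET|POST|HEAD|PUT|DELETE) \S+ HTTP/1\.[01]\r\n")

    def classify_payload(payload: bytes) -> str:
        """Identify the protocol from what the data looks like, not the port."""
        if HTTP_REQUEST.match(payload):
            return "http"
        if payload.startswith(b"SSH-2.0-"):
            return "ssh"
        return "unknown"

    # HTTP is recognized even if it runs on a non-standard port such as 8081:
    print(classify_payload(b"GET /index.html HTTP/1.1\r\nHost: example.com\r\n"))  # http
    print(classify_payload(b"SSH-2.0-OpenSSH_9.6\r\n"))                            # ssh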

Deep Content Inspection (DCI)

Deep content inspection (DCI) builds on the principles of DPI but goes further by buffering and analyzing large chunks of data from multiple packets. This allows the firewall to handle complete objects (like files or encoded messages) rather than individual packets. For example, DCI can:

  • Reassemble file downloads or email attachments and scan them for malware.
  • Decode base64-encoded content (commonly used in web and email traffic) to reveal its actual payload for analysis.
  • Perform signature-based or heuristic analysis on the full object to detect advanced threats, such as embedded malware or hidden exploits.

Key Distinction Between DPI and DCI

  • DPI works in real time by analyzing the contents of individual packets to enforce policies or detect protocol violations, focusing on data in transit. For example, DPI might block a suspicious URL or identify unusual HTTP methods.
  • DCI processes and examines full data objects by assembling data from multiple packets. For instance, DCI might analyze an entire email attachment for malware or reconstruct a file transfer to identify harmful content within the file.

Application proxies

An application proxy is software that presents the same protocol to the outside network as the application for which it is a proxy. For example, a mail server proxy will listen on port 25 and understand SMTP, the Simple Mail Transfer Protocol. The primary job of the proxy is to validate the application protocol and thus guard against protocol attacks (extra commands, bad arguments) that may exploit bugs in the service. Valid requests are then regenerated by the proxy to the real application that is running on another server and is not accessible from the outside network.

Application proxies are usually installed on dual-homed hosts. This is a term for a system that has two “homes,” or network interfaces: one on the external network and one on the internal network. Traffic never passes directly between the two networks; the proxy is the only process that can communicate with both. Unlike DPI, a proxy may modify the data stream, such as stripping headers, modifying machine names, or even restructuring the commands in the protocol used to communicate with the actual servers (that is, it does not have to relay everything that it receives).
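
The following sketch shows the protocol-validation core of a hypothetical SMTP proxy. A real proxy would also validate command arguments and relay accepted commands to the internal mail server:

    # Hypothetical protocol validation for an SMTP proxy.
    ALLOWED_COMMANDS = {b"HELO", b"EHLO", b"MAIL", b"RCPT", b"DATA", b"RSET", b"NOOP", b"QUIT"}
    MAX_LINE = 512   # RFC 5321 limits SMTP command lines to 512 bytes

    def validate_command(line: bytes) -> bool:
        if len(line) > MAX_LINE or not line.endswith(b"\r\n"):
            return False          # oversized or malformed lines may target parser bugs
        verb = line.split(b" ", 1)[0].strip().upper()
        return verb in ALLOWED_COMMANDS

    print(validate_command(b"MAIL FROM:<paul@example.com>\r\n"))   # True: valid command
    print(validate_command(b"XEXPLOIT " + b"A" * 600 + b"\r\n"))   # False: unknown verb, too long

Only lines that pass validation would be regenerated and sent to the real mail server; everything else is dropped before it can reach the protected service.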

DMZs and Micro-Segmentation

Each network connected to a router can be thought of as a security zone, representing a group of systems with a similar level of trust. Two basic zones are the internal zone and the external zone. The internal zone includes the organization’s systems, which are generally trusted, while the external zone represents untrusted systems on the Internet. A gateway router with a packet filter is used to control the flow of traffic between these zones, screening and enforcing security rules.

The primary danger in this design is that all internal systems are on the same local area network and share the same range of addresses. If attackers successfully compromise one system, they could gain full access to the rest of the internal network. Additionally, improperly configured firewall rules may allow unauthorized external requests to reach internal systems.

To address these risks, many organizations implement a screened subnet architecture, which introduces a separate network segment known as the DMZ (demilitarized zone). The DMZ contains systems that offer externally accessible services, such as web servers, email servers, or DNS servers, isolating them from the internal network. This separation ensures that systems in the internal network, which do not provide external-facing services, are not directly accessible from the Internet. In this architecture:

  • The DMZ is protected by a screening router, which controls packet flow based on the interface where packets arrive and their header values.
  • Internal systems are placed in a separate, more secure subnet that is shielded from direct Internet access. A packet filter controls access between these systems and those in the DMZ as well as the external network.
  • A single firewall can protect both networks, since the origin of a packet can be identified by the interface on which it arrives and filtering decisions can be made based on that.

Router Access Control Policies

The router enforces strict access control policies to manage traffic between the external network, the DMZ, and the internal network (a code sketch of such rules follows the list):

  1. From the Internet (External Network):
  • No direct traffic is allowed into the internal network unless it is a response to a request initiated internally (e.g., a DNS query or TCP return traffic).
  • Only packets destined for valid services in the DMZ (specific IP addresses, protocols, and ports) are allowed.
  • Packets masquerading as internal traffic are rejected.
  2. From the DMZ:
  • Only designated systems in the DMZ that require access to internal network services are allowed through, and their access is limited to specific services.
  • Outbound traffic from the DMZ to the Internet may be restricted to prevent attackers from downloading tools or malware if a DMZ system is compromised.
  • This segmentation limits the ability of an intruder who compromises a DMZ system to further attack internal systems.
  3. From the Internal Network:
  • Internal traffic is typically allowed to flow to the DMZ or the Internet, though restrictions may be applied to block specific external services or prevent certain activities (e.g., torrenting or accessing prohibited websites).
  • Internal users generally have unrestricted access to DMZ systems, including public-facing services and potentially additional internal-facing services, such as login portals.
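
Here is a hypothetical sketch of such zone-based rules (the zone names, services, and rule format are illustrative):

    # Hypothetical zone-based access control policy; first match wins.
    POLICY = [
        # (from_zone, to_zone, protocol, dest_port, action); None matches anything
        ("external", "dmz",      "tcp", 25,   "accept"),  # mail to the DMZ mail server
        ("external", "dmz",      "tcp", 443,  "accept"),  # HTTPS to the DMZ web server
        ("external", "internal", None,  None, "drop"),    # no direct inbound traffic
        ("dmz",      "internal", "tcp", 5432, "accept"),  # web server to internal database
        ("dmz",      None,       None,  None, "drop"),    # limit compromised DMZ hosts
        ("internal", None,       None,  None, "accept"),  # internal hosts may go anywhere
    ]

    def decide(from_zone, to_zone, proto, dport):
        for fz, tz, p, dp, action in POLICY:
            if fz == from_zone and tz in (to_zone, None) \
               and p in (proto, None) and dp in (dport, None):
                return action
        return "drop"   # default deny

    print(decide("external", "dmz", "tcp", 443))      # accept
    print(decide("external", "internal", "tcp", 22))  # drop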

Micro-Segmentation

The separation of internal and DMZ networks can be further enhanced with micro-segmentation, where additional subnets or VLANs (Virtual Local Area Networks) are created to isolate different groups or functions within an organization. Examples include separate subnets for departments like Development, Human Resources, Legal, and Marketing. Traffic between these subnets passes through firewalls, allowing organizations to enforce strict access controls and limit exposure. For instance:

  • Developers’ systems may only access specific development servers or repositories.
  • Marketing systems may be isolated from legal systems, minimizing the risk of lateral movement during an attack.

Micro-segmentation can also be applied within the DMZ itself. For example, each system or service in the DMZ (e.g., a web server, a mail server) can be placed in its own subnet, restricting the ability of a compromised DMZ system to attack others. This additional layer of defense reduces the attack surface and provides granular control over inter-system communication.

With micro-segmentation, organizations can implement a layered security model, ensuring that even if one segment of the network is compromised, the damage is contained, and the risk of further breaches is minimized.

Deperimeterization and Zero Trust

Deperimeterization refers to the diminishing effectiveness of traditional network boundaries (perimeters) in protecting systems and their resources. Traditionally, networks relied on perimeter security models that assumed systems inside the network could be trusted and focused on securing the boundary between the internal network and the external world (e.g., the Internet). This model became less effective due to several factors:

  • The rise of mobile devices, where users move a device (such as a laptop or phone) between the home, office, and possibly various Wi-Fi hotspots (e.g., Starbucks, airport lounge). The same device connects to trusted as well as untrusted networks at different times.
  • The rise of remote work, which requires users to access internal resources from outside the corporate network.
  • The widespread use of cloud services, which store sensitive data outside traditional perimeters.
  • The increased use of unvetted software and deceptive downloads, where trusted users may inadvertently download malware.
  • Cyber threats that exploit compromised internal devices or privileged users to move laterally within a network.

As a result, the concept of a “secure perimeter” began to break down, leading to the development of the Zero Trust Security Model.

The Zero Trust Security Model

The Zero Trust model is a modern approach to security that eliminates the assumption that systems, users, or devices inside a network are inherently trustworthy. Instead, Zero Trust operates on the principle of “never trust, always verify.” Under this model:

  1. Verification: Every connection to a service, whether from inside or outside the network, is thoroughly authenticated, authorized, and encrypted.
  2. Least Privilege Access: Users and devices are granted the minimum level of access necessary to perform their functions, reducing the attack surface.

Zero Trust assumes that breaches are inevitable or may have already occurred and focuses on minimizing damage and maintaining security through strict access controls.

Zero Trust Network Access (ZTNA) is a key component of the Zero Trust model, specifically focused on securing network access. It replaces the role of traditional Virtual Private Networks (VPNs) by providing more granular and secure access controls that focus on connecting to specific systems hosting services rather than providing access to an entire subnet. Key characteristics of ZTNA include:

  1. Granular Access Control: Users and devices are granted access only to specific applications, services, or resources they are authorized to use, rather than broad network access.

  2. Dynamic Authentication and Authorization: Access requests are verified through mechanisms like multi-factor authentication (MFA), device security checks, and behavioral analysis.

  3. No Implicit Trust: ZTNA treats all access requests as originating from an untrusted network, whether the user is inside or outside the organization’s physical network.

  4. Application-Centric Security: Instead of granting access to an entire network, ZTNA connects users directly to specific applications or services, hiding the rest of the network from view. This reduces the risk of lateral movement by attackers. Think of it as a host-to-host VPN with a firewall that restricts what a host can access on the other host.

Host-based firewalls

Firewalls intercept all packets entering or leaving a local area network. A host-based firewall, on the other hand, runs on a user’s computer. Unlike network-based firewalls, a host-based firewall can associate network traffic with individual applications. Its goal is to prevent malware from accessing the network. Only approved applications will be allowed to send or receive network data. Host-based firewalls are particularly useful in light of deperimeterization: the boundaries of external and internal networks have become fuzzy as people connect their mobile devices to different networks and import data on flash drives. A concern with host-based firewalls is that if malware manages to get elevated privileges, it may be able to shut off the firewall or change its rules.

Intrusion detection/prevention systems

An enhancement to screening routers is the use of intrusion detection systems (IDS). Intrusion detection systems are often part of DPI firewalls and try to identify malicious behavior. There are three forms of IDS:

  1. A protocol-based IDS validates specific network protocols for conformance. For example, it can implement a state machine to ensure that messages are sent in the proper sequence, that only valid commands are sent, and that replies match requests.

  2. A signature-based IDS is similar to a PC-based virus checker. It scans the bits of application data in incoming packets to try to discern if there is evidence of “bad data”, which may include malformed URLs, extra-long strings that may trigger buffer overflows, or bit patterns that match known viruses.

  3. An anomaly-based IDS looks for statistical aberrations in network activity. Instead of having predefined patterns, normal behavior is first measured and used as a baseline. An unexpected use of certain protocols, ports, or even amount of data sent to a specific service may trigger a warning.

Anomaly-based detection implies that we know normal behavior and flag any unusual activity as bad. This is difficult since it is hard to characterize what normal behavior is, particularly since normal behavior can change over time and may exhibit random network accesses (e.g., people web surfing to different places). Too many false positives will annoy administrators and lead them to disregard alarms.

A signature-based system employs misuse-based detection. It knows bad behavior: the rules that define invalid packets or invalid application layer data (e.g., ssh root login attempts). Anything else is considered good.

Intrusion Detection Systems (IDS) monitor traffic entering and leaving the network and report any discovered problems. Intrusion Prevention Systems (IPS) serve the same function but are positioned to sit between two networks like a firewall and can actively block traffic that is considered to be a threat or policy violation.

Firewall (screening router)
1st-generation packet filter that filters packets between networks. Blocks/accepts traffic based on IP addresses, ports, and protocols.

Stateful inspection firewall
2nd-generation packet filter. Like a screening router but also takes into account TCP connection state and information from previous connections (e.g., related ports for TCP).

Deep packet inspection firewall
3rd-generation packet filter. Examines application-layer protocols.

Application proxy
Gateway between two networks for a specific application. Prevents direct connections to the application from outside the network. Responsible for validating the protocol.

IDS/IPS
Can usually do what a stateful inspection firewall does, plus examine application-layer data for protocol attacks or malicious content.

Host-based firewall
Typically a screening router with per-application awareness. Sometimes includes anti-virus software for application-layer signature checking.

Host-based IPS
Typically allows real-time blocking of remote hosts performing suspicious operations (port scanning, ssh login attempts).

Virtual Private Networks (VPNs)

Suppose we want to connect two local area networks in geographically separated areas. For instance, we might have a company with locations in New York and in San Francisco. One way of doing this is to get a dedicated private network link between the two points. Many phone companies and network providers offer a private line service, but it can be extremely expensive and is not feasible in many circumstances, such as when one of your endpoints is in the Amazon cloud rather than at your physical location.

Instead, we can use the public Internet to communicate between the two locations. Our two subnets will often have private IP addresses (such as 192.168.x.x), which are not routable over the public Internet. To overcome this, we can use a technique called tunneling. Tunneling is the process of encapsulating an IP datagram within another IP datagram. An IP datagram in one subnet (a local area network in one of our locations) that is destined to an address on the remote subnet will be directed to a gateway router. There, it will be treated as payload (data) and packaged within an IP datagram whose destination is the IP address of the gateway router at our other location. This datagram is now routed over the public Internet. The source and destination addresses of this outer datagram are the gateway routers at both sides.
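
As a sketch of this encapsulation, the following uses the scapy library (assumed to be installed); all addresses are illustrative documentation addresses:

    # IP-in-IP encapsulation sketch using scapy.
    from scapy.all import IP, TCP

    # Original datagram on the private subnet at one location:
    inner = IP(src="192.168.1.10", dst="192.168.2.20") / TCP(dport=80)

    # The gateway router wraps it in an outer datagram addressed to the remote gateway:
    outer = IP(src="203.0.113.1", dst="198.51.100.1") / inner

    outer.show()   # to routers on the public Internet, the inner datagram is just payload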

IP networking relies on store-and-forward routing. Network data passes through routers, which are often unknown and may be untrustworthy. We have seen that routes may be altered to pass data through malicious hosts or directed to malicious hosts that accept packets destined for the legitimate host. Even with TCP connections, data can be modified or redirected and sessions can be hijacked. We also saw that there is no source authentication on IP packets: a host can place any address it would like as the source. What we would like is the ability to communicate securely, with the assurance that our traffic cannot be modified and that we are truly communicating with the correct endpoints.

Virtual private networks (VPNs) take the concept of tunneling and safeguard the encapsulated data by adding a MAC (message authentication code), so that we can detect if the data is modified, and encryption, so that others cannot read the data. This way, VPNs allow separate local area networks to communicate securely over the public Internet.

IPsec is a popular VPN protocol that is really a set of two protocols.

  1. The IPsec Authentication Header (AH) is an IPsec protocol that does not encrypt data but simply affixes a message authentication code to each datagram. It ensures the integrity of each datagram.

  2. The Encapsulating Security Payload (ESP), which provides integrity checks and also encrypts the payload, ensuring secrecy.

IPsec can operate in tunnel mode or transport mode. In both cases, IPsec communicates at the same layer as the Internet Protocol. That is, it is not used by applications to communicate with one another but rather by routers or operating systems to direct an entire stream of traffic.

Tunnel mode VPNs provide network-to-network or host-to-network communication. The communication takes place between either two VPN-aware gateway routers or from a host to a VPN-aware router. The entire datagram is treated like payload and encapsulated within a datagram that is sent over the Internet to the remote gateway. That gateway receives this VPN datagram, extracts the payload, and routes it on the internal network where it makes its way to the target system.

Transport mode VPNs provide communication between two hosts. In this case, the IP header is not modified but data is protected. Note that, unlike transport layer security (TLS), which we examine later, setting up a transport mode VPN will protect all data streams between the two hosts. Applications are unaware that a VPN is in place.

Authentication Header (AH)

The Authentication Header (AH) protocol guarantees the integrity and authenticity of IP packets. AH adds an extra chunk of data (the authentication header) with a MAC to the IP datagram. Anyone with knowledge of the key can create the MAC or verify it. This ensures message integrity since an attacker will not be able to modify message contents and have the HMAC remain valid. Attackers will also not be able to forge messages because they will not know the key needed to create a valid MAC. Every AH also has a sequence number that is incremented for each datagram that is transmitted, ensuring that messages are not inserted, deleted, or replayed.

Hence, IPsec AH protects messages from tampering, forged addresses, and replay attacks.
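
The following sketch illustrates the AH idea of a keyed MAC plus an anti-replay sequence number. The format is hypothetical and far simpler than real AH (defined in RFC 4302):

    # Hypothetical sketch of MAC-plus-sequence-number protection.
    import hmac, hashlib

    KEY = b"shared-secret-key"
    last_seq = 0

    def protect(seq: int, datagram: bytes) -> bytes:
        header = seq.to_bytes(4, "big")
        mac = hmac.new(KEY, header + datagram, hashlib.sha256).digest()
        return header + mac + datagram

    def verify(message: bytes):
        global last_seq
        seq = int.from_bytes(message[:4], "big")
        mac, datagram = message[4:36], message[36:]   # SHA-256 MAC is 32 bytes
        expected = hmac.new(KEY, message[:4] + datagram, hashlib.sha256).digest()
        if not hmac.compare_digest(mac, expected):
            return None        # tampered or forged
        if seq <= last_seq:
            return None        # replayed or duplicated
        last_seq = seq
        return datagram

    msg = protect(1, b"payload")
    print(verify(msg))   # b'payload'
    print(verify(msg))   # None: replay detected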

Encapsulating Security Payload (ESP)

The Encapsulating Security Payload (ESP) provides the same integrity assurance but also adds encryption to the payload to ensure confidentiality. Data is encrypted with a symmetric cipher (usually AES).

IPsec cryptographic algorithms

Authentication

An IPsec session begins with authenticating the endpoints. IPsec supports the use of X.509 digital certificates or the use of pre-shared keys. Digital certificates contain the site’s public key and allow us to validate the identity of the certificate if we trust the issuer (the certification authority, or CA). We authenticate by proving that we can take a nonce that the other side encrypted with our public key and decrypt it using our private key. A pre-shared key means that both sides configured a static shared secret key ahead of time. We prove that we have the key in a similar manner: one side creates a nonce and asks the other side to encrypt it and send the result. Then the other side does the same thing.

Key exchange

HMAC message authentication codes and encryption algorithms both require the use of secret keys. IPsec uses Diffie-Hellman to create random shared session keys. Diffie-Hellman makes it quick to generate the public-private key pair needed to derive a common key, so there is no dependence on long-term keys, assuring forward secrecy.
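
As an illustration, this sketch performs an ephemeral Diffie-Hellman exchange using X25519 (a modern elliptic-curve form of Diffie-Hellman) from the cryptography package, assumed to be installed:

    # Ephemeral Diffie-Hellman: both sides derive the same secret.
    from cryptography.hazmat.primitives.asymmetric.x25519 import X25519PrivateKey

    # Each side generates a fresh key pair for this session (no long-term keys):
    alice_priv = X25519PrivateKey.generate()
    bob_priv = X25519PrivateKey.generate()

    # Public keys are exchanged; each side combines its private key with the
    # peer's public key to compute the same shared secret:
    alice_shared = alice_priv.exchange(bob_priv.public_key())
    bob_shared = bob_priv.exchange(alice_priv.public_key())

    assert alice_shared == bob_shared   # both sides now hold the same session secret

Because the key pairs are discarded after the session, compromise of one session's keys reveals nothing about past or future sessions.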

Confidentiality

In IPsec ESP, the payload is encrypted using either AES-CBC or 3DES-CBC. CBC is cipher-block chaining, which has the property that each block of ciphertext depends on all previous blocks, ensuring that portions of old messages cannot be substituted into new ones.

Integrity

IPsec uses HMAC, a form of a message authentication code that uses a cryptographic hash function and a shared secret key. It supports either SHA-1 or SHA-2 hash functions.

IPsec Authentication Header mode is rarely used since the overhead of encrypting data these days is quite low and ESP provides encryption in addition to authentication and integrity.

Transport Layer Security (TLS)

Virtual Private Networks were designed to operate at the network layer. They were designed to connect networks together. Even with transport mode connectivity, they tunnel all IP traffic between two systems and do not differentiate one data stream from another. They do not solve the problem of an application being able to communicate with another application over a network via an authenticated, tamper-proof, and encrypted channel.

Secure Sockets Layer (SSL) was created as a layer of software above TCP that provides authentication, integrity, and encrypted communication while preserving the abstraction of a sockets interface to applications. An application sets up an SSL session to a service. After that, it simply sends and receives data over a socket just like it would with the normal sockets-based API that operating systems provide. The programmer does not have to think about network security.

As SSL evolved, it morphed into a new version called TLS, Transport Layer Security. While SSL is commonly used in conversation and names of APIs, all current implementations are TLS.

Any TCP-based application that may not have addressed network security can be security-enhanced by simply using TLS. For example, the standard email protocols, SMTP, POP, and IMAP, all have TLS-secured interfaces. Web browsers use HTTP, the Hypertext Transfer Protocol, and also support HTTPS, which is the exact same protocol but uses a TLS connection.
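
For example, a Python client can layer TLS onto an ordinary TCP connection with the standard ssl module and then use the result like any other socket:

    # Wrapping a TCP socket with TLS using Python's standard library.
    import socket, ssl

    context = ssl.create_default_context()   # default CA trust store and settings

    with socket.create_connection(("example.com", 443)) as raw_sock:
        with context.wrap_socket(raw_sock, server_hostname="example.com") as tls_sock:
            tls_sock.sendall(b"GET / HTTP/1.1\r\nHost: example.com\r\n\r\n")
            print(tls_sock.recv(200))         # ordinary socket I/O over a secure channel

The handshake, certificate validation, encryption, and integrity checking all happen beneath the familiar sockets interface.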

TLS has been designed to provide:

Data confidentiality
Symmetric cryptography is used to encrypt data.
Key exchange
During the authentication sequence, TLS performs a Diffie-Hellman key exchange so that both sides can obtain random shared session keys. From the common key, TLS uses a pseudorandom generator to create any number of keys, creating separate keys for each direction of communication and separate keys for data integrity in each direction (MAC keys). Since version 1.3, TLS derives a new key for every message sent.
Data integrity
Ensure that we can detect whether data in transit has been modified or new data injected. TLS includes an HMAC function based on the SHA-256 hash for each message.
Authentication
TLS authenticates the endpoints prior to sending data. Authentication can be unidirectional (the client may just authenticate the server) or bidirectional (each side authenticates the other). TLS uses public key cryptography and X.509 digital certificates as a trusted binding between a user’s public key and their identity.
Interoperability & evolution
TLS was designed to support different key exchange, encryption, integrity, & authentication protocols. At the start of each session, the two sides negotiate which protocols to use for the session.

TLS 1.3

As of this writing, the current version of the TLS protocol is TLS 1.3. TLS 1.1 and older versions are deprecated; major browsers dropped support for them in 2020.

TLS 1.3 is significant because it simplified the TLS protocol to make it more efficient and removed the ability to choose algorithms that may be cryptographically weaker than other options.

A few key improvements in TLS 1.3 are:

Removed support for older ciphers & hashes
TLS 1.3 reduced the set of acceptable algorithms as well as choices for parameters that drive encryption algorithms (such as the choice of the modulus for Diffie-Hellman key exchange). The motivation for this was to remove weaker algorithms so that attackers would not have the opportunity to perform downgrade attacks. A downgrade attack is one where the client and server are tricked into renegotiating the protocol to use a different, weaker algorithm for encryption or message authentication, or even to disable it.
Require the use of Diffie-Hellman for key exchange
Older versions of TLS (and SSL) allowed using RSA public key cryptography to transmit a session key (e.g., the client encrypts a random session key with the server’s public key). TLS 1.3 no longer supports RSA public keys for key exchange since they were invariably long-term keys. Generating strong RSA keys is computationally costly, and systems that use them would reuse the same key for each session. A past attack (Heartbleed) enabled an attacker to grab memory contents from a server that included the server’s private key. Diffie-Hellman allows keys to be generated efficiently so that new key pairs (used to produce a common key) can be created spontaneously. This assures Perfect Forward Secrecy: knowledge of a past key will yield no information for decrypting future sessions.
Reduce handshake complexity
Earlier versions of TLS (and SSL) involved several back-and-forth messages to agree on a suite of ciphers for encryption, key exchange, and message authentication codes, for sending certificates, sending nonces (random data), and authenticating. TLS 1.3 optimized this initialization so that the most common usage of the protocol will involve only one set of back-and-forth messages. This dramatically reduces the delay between connecting to a server and commencing secure communication.
TLS 1.3 also authenticates all data starting from the first response from the server, which reduces opportunities for attackers to inject unauthenticated data that can alter how data is treated throughout the session.
0-RTT: zero round-trip time for connection restart
TLS 1.3 added support for near-instantaneous connection resumption in the event that the client needs to re-establish a connection to the server. After the initial setup handshake phase, both sides generate a Resumption Master Key. If the TCP connection between the client and server terminates, the client can re-establish it and send the server a session ticket (to identify itself) along with the first set of data encrypted with this Resumption Master Key. If the server cached the session ID and the resumption key (this is optional for servers), it can start processing client data with the very first message it received on this new connection.

TLS sub-protocols

TLS operates in two phases:

(1) Handshake Protocol: Authentication and key exchange
Authentication uses public key cryptography with X.509 certificates to authenticate a system. The server presents the client with an X.509 digital certificate. The server can, optionally, also ask the client for a certificate. Either RSA or Elliptic Curve (the Elliptic Curve Digital Signature Algorithm) public keys are supported for this phase. TLS validates the signature of the certificate. An endpoint authenticates by signing a hash of a set of messages with their private key.
Key exchange used to offer several options. With TLS 1.3, only ephemeral Diffie-Hellman key exchange is supported, since it allows efficient generation of shared keys and requires no long-term key storage, providing forward secrecy.
(2) Record Protocol: Communication
The record protocol is used for sending data back and forth between the two systems. Each message is encrypted and contains a message authentication code. Data encryption uses symmetric cryptography and supports a variety of algorithms, including 128- and 256-bit AES-GCM and ChaCha20-Poly1305. AES is the Advanced Encryption Standard. GCM is Galois/Counter Mode, an alternative to cipher-block chaining (CBC) that encrypts an incrementing counter and exclusive-ors it with a block of plaintext. ChaCha20 is an encryption algorithm that is generally more efficient than AES on low-end processors.
Data integrity is provided by a hashed message authentication code (HMAC) that is attached to each block of data. Previous versions of TLS allowed the client and server to negotiate which MAC to use but TLS 1.3 requires the use of a specific MAC based on the chosen cipher to ensure the cryptographically strongest choice. It supports either HMAC-SHA256 or HMAC-SHA384 or, for ChaCha20 encryption, the Poly1305 message authentication code.

TLS 1.3 Handshake

What follows is a high-level overview of the TLS 1.3 handshake. I’m leaving out descriptions of handshake secrets, traffic secrets, handshake keys, and initialization vectors so that the details will not obscure the basic mechanisms. Please consult additional references for actual details if you’re interested in learning about this.

Client Hello

The client connects to the server via TCP. It generates a public-private pair of Diffie-Hellman keys and sends the server a block of information that includes:

  1. TLS version number
  2. Client random data
  3. Diffie-Hellman public key

There’s additional data that includes the list of ciphers, versions, and signature algorithms it supports.

Server Hello

When the server receives the “client hello” message, the server generates its own public-private Diffie-Hellman key pair and responds back with a “server hello” block of information that includes:

  1. TLS version (the lesser of the maximum version the server supports and the maximum version the client supports; confirmation that it can use version 1.3)

  2. Server random

  3. Selected cipher suite (the set of algorithms it agrees to use)

  4. Server’s Diffie-Hellman public key

  5. Server certificate (the X.509 certificate containing the server’s public key)

  6. Certificate verification (a signature over the handshake messages, proving the server holds the private key corresponding to the certificate)

With the Diffie-Hellman public key it received from the client and the private key that it generated, the server has all the information it needs to compute a common key. The client will be able to compute the same value when it receives the “server hello” message.

The server authenticates itself to the client with a digital certificate. The entire “server hello” message is signed to prevent tampering and the client has all the information it needs to validate the signature.

Key Derivation

After the handshake, both sides can compute the Diffie-Hellman common key.

The initial key is derived from the common key and the SHA-384 hash of the client_hello and the server_hello messages. When the client receives the message from the server, it will be able to compute the same key. This is used to derive all the keys that will be needed for the session.

TLS 1.3 derives all the keys it needs from this initial secret key by using HKDF, the HMAC-based Extract-and-Expand Key Derivation Function. This is an IETF standard (see RFC 5869) for deriving any number of keys starting from one initial secret. Conceptually, it is similar to the technique used to derive one-time passwords with the S/key algorithm, where each key was a one-way function of the previous key.

With HKDF, we first create an initial fixed-length pseudorandom key, K, from the initial secret:

K = HMAC(non_secret_salt, initial_secret)

Then, each successive key, Kₙ, is generated as:

Kₙ = HMAC(K, Kₙ₋₁, n)
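
A minimal sketch of this chained derivation follows; it is simplified from the real HKDF-Expand of RFC 5869, which also mixes a context ("info") string into each step:

    # Simplified HKDF-style extract-and-expand key derivation.
    import hmac, hashlib

    def extract(salt: bytes, initial_secret: bytes) -> bytes:
        """Create the fixed-length pseudorandom key K."""
        return hmac.new(salt, initial_secret, hashlib.sha256).digest()

    def derive_keys(k: bytes, count: int) -> list:
        """Generate count keys, each an HMAC of the previous key and a counter."""
        keys, prev = [], b""
        for n in range(1, count + 1):
            prev = hmac.new(k, prev + bytes([n]), hashlib.sha256).digest()
            keys.append(prev)
        return keys

    K = extract(b"non-secret-salt", b"initial-secret-from-key-exchange")
    client_write, server_write, client_mac, server_mac = derive_keys(K, 4)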

TLS 1.3 Communication

In previous versions of TLS, the client and server could mix and match any of a variety of encryption and MAC algorithms. In TLS 1.3, the choices are restricted. Data is sent using a mechanism called AEAD, which stands for Authenticated Encryption with Additional Data. AEAD uses a different key for each message that is sent.

We start with:

  • Message (data to be sent)
  • Initial secret key
  • Initialization value (IV), used for counter-mode encryption
  • Optional additional non-secret data (AD)

The key and IV are then used to encrypt the message using the chosen cipher (either AES or ChaCha20). A secondary key is used to create the HMAC message authentication code over the ciphertext and additional data:

HMAC(ciphertext, AD, length(ciphertext), length(AD))

The encrypted message and associated HMAC are sent to the other side.
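
As a sketch of AEAD in practice, the following uses AES-GCM from the cryptography package (assumed installed). Note that GCM computes the authentication tag itself as part of encryption rather than attaching a separate HMAC:

    # Authenticated encryption with additional data using AES-GCM.
    import os
    from cryptography.hazmat.primitives.ciphers.aead import AESGCM

    key = AESGCM.generate_key(bit_length=128)
    aesgcm = AESGCM(key)

    nonce = os.urandom(12)      # per-record initialization value; never reuse with a key
    ad = b"record header"       # additional data: authenticated but not encrypted
    ciphertext = aesgcm.encrypt(nonce, b"application data", ad)

    # The receiver decrypts and verifies integrity in one step; any modification
    # of the ciphertext or the additional data raises an exception.
    print(aesgcm.decrypt(nonce, ciphertext, ad))   # b'application data'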

Unidirectional vs. mutual authentication

To implement authentication, the server sends the client its X.509 digital certificate so the client can authenticate the server by having the server prove it knows the private key. TLS also supports mutual authentication: the client will send its X.509 certificate to the server so the server can authenticate the client.

One notable aspect of TLS sessions is that, in most cases, only the server will present a certificate. Hence, the server will not authenticate or know the identity of the client.

Client-side certificates have been problematic. Generating keys and obtaining trustworthy certificates is not an easy process for users. A user would have to install the certificate and the corresponding private key on every system she uses. This would not be practical for shared systems. Moreover, if a client did have a certificate, any server can request it during TLS connection setup, thus obtaining the identity of the client. This could be desirable for legitimate banking transactions but not for sites where a user would like to remain anonymous.

We generally rely on other authentication mechanisms, such as the password authentication protocol, but carry them out over TLS’s secure communication channel.


Last modified November 27, 2024.