pk.org: CS 419/Lecture Notes

Command Injection and Input Validation Attacks

Terms and concepts you should know

Paul Krzyzanowski – 2025-10-26

Core Concepts

Command injection
A class of vulnerabilities where untrusted input is interpreted as executable commands rather than as data.
Untrusted input
Data from external sources that may contain malicious content and must be validated before use.
Input validation
The process of checking and constraining external input to ensure it meets expected formats, types, and ranges.
Allowlist
A validation strategy that accepts only explicitly permitted values, characters, or patterns.
Denylist
A validation strategy that rejects explicitly forbidden values or patterns; it is prone to bypass because the list rarely anticipates every dangerous input.
Canonicalization
Converting data to a single, standard representation before further processing or validation.
Normalization
Decoding input and converting it to a standard encoding and form before applying security checks.
Trust boundary
A point in a system where data crosses from an untrusted domain into a trusted domain.
Separating commands from data
Security principle where command structure is fixed and user input is treated only as data, never as executable code.
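
As an illustration of how canonicalization and allowlist validation fit together, here is a minimal Python sketch. The username policy, the name USERNAME_PATTERN, and the function validate_username are hypothetical choices made for this example, not part of the lecture material. The input is normalized first, and the allowlist check is applied to the normalized string that later code will actually use.

```python
import re
import unicodedata

# Hypothetical policy for this example: 3 to 32 characters drawn from
# lowercase ASCII letters, digits, and underscore.
USERNAME_PATTERN = re.compile(r"[a-z0-9_]{3,32}")


def validate_username(raw: str) -> str:
    """Return the canonical username, or raise ValueError if it is not allowed."""
    # Canonicalize first: fold compatibility forms (such as fullwidth letters)
    # and case so the check sees the same string that later code will use.
    canonical = unicodedata.normalize("NFKC", raw).strip().lower()
    # Allowlist: accept only if the entire string matches the permitted pattern.
    if not USERNAME_PATTERN.fullmatch(canonical):
        raise ValueError("username contains characters outside the allowlist")
    return canonical
```

A denylist version of the same check would have to enumerate every dangerous character for every destination (shell, SQL, file system), which is why the allowlist form is generally preferred.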

Injection Types

SQL injection
Injection of SQL syntax through user input that alters query structure or behavior.
NoSQL injection
Injection targeting nonrelational databases by supplying unexpected operators or executable code.
Shell command injection
Injection that leverages shell metacharacters or substitution to execute unintended system commands.
Shell metacharacters
Characters (such as ;, |, $(), and backticks) that command shells interpret specially and that attackers can exploit to chain or substitute commands.
Path traversal
Using relative path elements (like ..) or encodings to access files outside a permitted directory.
Path equivalence
A vulnerability where multiple different path strings reference the same file or directory, allowing attackers to bypass validation through alternate representations such as redundant slashes, alternative separators, or case variations.
Character encoding attack
An attack that exploits multiple representations of the same character to bypass validation that checks input before decoding it.
Overlong UTF-8 encoding
Invalid character encoding that represents a character using more bytes than necessary, used to bypass validation filters.
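
One common way to defend against path traversal and path equivalence tricks is to resolve the requested path to its canonical form and only then check that it still lies inside the permitted directory. The sketch below assumes a POSIX-style file system; the function name resolve_inside is hypothetical and chosen for this example.

```python
import os


def resolve_inside(base_dir: str, user_path: str) -> str:
    """Return the canonical path for user_path, or raise if it leaves base_dir."""
    # Canonicalize the base: resolves symbolic links, "..", and redundant slashes.
    base = os.path.realpath(base_dir)
    # Join and canonicalize the candidate. An absolute user_path replaces the
    # base in os.path.join, which the containment check below then rejects.
    candidate = os.path.realpath(os.path.join(base, user_path))
    # Compare whole path components, not string prefixes, so that
    # "/srv/files-private" is not mistaken for a child of "/srv/files".
    if os.path.commonpath([base, candidate]) != base:
        raise PermissionError("path escapes the permitted directory")
    return candidate
```

The check runs on the resolved path rather than on the raw string, so encodings or ".." sequences that survive earlier filtering are still caught. It does not by itself close the race between this check and a later open of the file.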

Environment and Operating System Attacks

Environment variable manipulation
Attacks that alter environment variables to change program behavior or influence code loading.
PATH hijacking
Inserting attacker-controlled directories earlier in PATH so that named executables resolve to malicious binaries.
ENV and BASH_ENV
Environment variables that specify initialization scripts to execute when shells start, exploitable to run arbitrary commands.
Symbolic link
File system reference that points to another file or directory, which can be exploited to redirect file access.
File descriptor
Integer representing an open file or I/O stream in Unix-like systems, where 0, 1, and 2 represent standard input, output, and error.
File descriptor reuse
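A class of attacks that exploit the reuse of closed file descriptor numbers, often affecting privileged programs; for example, if descriptor 1 is closed before a privileged program runs, the next file that program opens receives descriptor 1 and its standard output is written into that file.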
TOCTTOU
Time-of-check to time-of-use: a race condition in which state changes between a security check and the use of the resource that was checked.
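
A TOCTTOU race typically arises when a program first tests a path (for example, "does this file exist?") and then opens it in a separate step. The minimal Python sketch below shows one way to avoid that window when creating a file: the existence check and the creation happen in a single open call. The function name create_new_file is hypothetical, and the O_NOFOLLOW flag is assumed to be available only on Unix-like systems.

```python
import os


def create_new_file(path: str, data: bytes) -> None:
    """Create path and write data, failing if anything already exists there."""
    # O_CREAT | O_EXCL makes "check that it does not exist" and "create it"
    # one indivisible step, and also fails if path is a symbolic link.
    flags = os.O_WRONLY | os.O_CREAT | os.O_EXCL
    # O_NOFOLLOW is added where the platform defines it (assumption: Unix-like).
    flags |= getattr(os, "O_NOFOLLOW", 0)
    fd = os.open(path, flags, 0o600)
    # Keep working with the descriptor rather than reopening by name, so the
    # object being written is the one that was just created.
    with os.fdopen(fd, "wb") as f:
        f.write(data)
```

If an attacker has already planted a file or symbolic link at that name, the call raises an OSError instead of silently writing through it.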

Library Loading Attacks

Shared library hijacking
Attacks that redirect programs to load malicious libraries instead of legitimate ones, enabling function-level code injection.
LD_PRELOAD
A Unix environment variable that forces the dynamic loader to load specified shared libraries before others, allowing function override.
LD_LIBRARY_PATH
A Unix environment variable that lists additional directories to search for shared libraries before the system directories.
Function interposition
Technique where an attacker's function wraps the original library function, calling it after modifying parameters or adding malicious behavior while maintaining normal operation.
DLL sideloading
A Windows attack where a program loads a malicious DLL due to insecure search order or unspecified paths.
DLL hijacking
Another term for DLL sideloading; exploiting Windows DLL search order to load attacker-controlled libraries.
Dynamic linker
Operating system component that loads shared libraries into programs at runtime and resolves function addresses.
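
Several of the attacks above (PATH hijacking, ENV and BASH_ENV, LD_PRELOAD, LD_LIBRARY_PATH) depend on a program passing its inherited environment to a child process. A minimal Python sketch of one mitigation follows: build the child's environment from a short allowlist instead of copying the parent's. The names SAFE_PATH, ALLOWED_VARS, minimal_env, and run_tool, and the specific directories and variables chosen, are assumptions made for this example.

```python
import subprocess

# Hypothetical fixed search path containing only system directories.
SAFE_PATH = "/usr/bin:/bin"
# Hypothetical allowlist of variables considered harmless to pass through.
ALLOWED_VARS = ("LANG", "TZ")


def minimal_env(parent_env):
    """Build a child environment from an allowlist of variables."""
    # Anything not listed (LD_PRELOAD, LD_LIBRARY_PATH, ENV, BASH_ENV, ...)
    # is dropped because it is never copied in the first place.
    env = {name: parent_env[name] for name in ALLOWED_VARS if name in parent_env}
    # Replace PATH outright instead of trusting the inherited value.
    env["PATH"] = SAFE_PATH
    return env


def run_tool(argv, parent_env):
    """Run argv (a list, ideally starting with an absolute path) without a shell."""
    return subprocess.run(
        argv,
        env=minimal_env(parent_env),
        shell=False,          # arguments are passed as data, not parsed by a shell
        capture_output=True,
        text=True,
        check=True,
    )
```

A caller might invoke run_tool(["/usr/bin/id"], os.environ) after importing os; passing the command as a list with an absolute path also avoids shell metacharacter interpretation and PATH lookup for the executable itself.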

Supply Chain and Package Attacks

Typosquatting
Publishing malicious packages with names similar to popular libraries to trick developers into installing them.
Dependency confusion
Exploiting mixed private and public registries so that package managers retrieve attacker-controlled public packages that share a name with internal private packages.
Installation hook
A script or action that runs during package installation and can execute code with the installing user's privileges.
Supply chain attack
Attack that compromises software dependencies or development tools to inject malicious code into applications.

Defenses and Principles

Parameterized query
A database query mechanism that separates SQL structure from data, preventing data from being interpreted as code.
Stored procedure
A fixed database routine that accepts parameters, keeping query structure constant and data separate.
Escaping
Transforming special characters in input so they are treated as literal data by a specific interpreter or context.
Sanitization
A general term for modifying input to remove or neutralize potentially dangerous content; ambiguous and should be paired with context-aware design or safe APIs.
Context-aware validation
Validation that considers how input will be used, as what is dangerous depends on the destination system.
Least privilege
A principle of running processes with the minimum permissions necessary to limit the impact of a compromise.
Sandboxing
Isolating code execution to restrict access to system resources and reduce attack surface.
Defense in depth
Security strategy that uses multiple layers of protection so that bypassing one layer still leaves others in place.
Atomic operation
An operation that completes indivisibly, eliminating windows where race conditions can occur.
mkstemp
A POSIX function that creates and opens a temporary file atomically to prevent temporary file races.
Dependency scanning
Automated analysis of dependencies to detect known vulnerabilities or suspicious package behavior.
Monitoring and logging
Observability practices that detect anomalous input patterns, failed validations, or unexpected execution paths.
Comprehension error
Programming mistake caused by misunderstanding how systems parse input or how APIs behave.
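
To make the parameterized query defense concrete, here is a minimal Python sketch using the standard sqlite3 module with an in-memory database. The table layout and the function names demo and find_user are hypothetical. The SQL text is fixed, and the user-supplied value is bound to the ? placeholder, so it is only ever treated as data.

```python
import sqlite3


def demo():
    """Create a throwaway in-memory database with a hypothetical users table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO users (name) VALUES (?)",
                     [("alice",), ("bob",)])
    return conn


def find_user(conn, name):
    """Look up users by exact name using a parameterized query."""
    # The query structure is a constant string; name is passed separately and
    # bound by the driver, so quotes or SQL keywords in it cannot alter the query.
    cur = conn.execute("SELECT id, name FROM users WHERE name = ?", (name,))
    return cur.fetchall()
```

With this form, a classic injection string such as ' OR '1'='1 is compared literally against the name column and matches no rows, whereas the same string concatenated into the SQL text would have changed the WHERE clause.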