Exam 2 study guide

The one-hour study guide for exam 2

Paul Krzyzanowski

Latest update: Thu Apr 26 16:05:30 EDT 2018

Disclaimer: This study guide attempts to touch upon the most important topics that may be covered on the exam but does not claim to necessarily cover everything that one needs to know for the exam. Finally, don't take the one hour time window in the title literally.

App Confinement

We examined several mechanism created to handle compromised applications and limit the amount of damage they can do. The earliest of these, the chroot mechanism, constrained the namespace (the directory tree it could access) but made it easy for an application to bypass that if it was able to get elevated privileges and run as root. FreeBSD Jails were a substantial improvement in that they not only restricted the namespace but enabled an administrator to limit what system calls a process can invoke even with root privileges. Linux followed up with three distinct mechanisms:

  • Control groups: allowed processes to be grouped together and control the amount of system resources (e.g., CPU percentage, file system space) that the group could use.

  • Capabilities: restricted the system calls a process could execute as root.

  • Namespaces: restricted what parts of the file system, process IDs, user IDs. mount points, and network that a process group could see.


Software rarely lives as an isolated application. Some software requires multiple applications and most software relies on the installation of other libraries, utilities, and packages. Keeping track of these dependencies can be difficult. Worse yet, updating one shared component can sometimes cause another application to break. What was needed was a way to isolate the installation, execution, and management of multiple software packages that run on the same system.

Various attempts were undertaken to address these problems.

  1. The most basic was to fix problems when they occurred. This required carefully following instructions for installation, update, and configuration of software and extensive testing of all services on the system when anything changed. Should something break, the service would be unavailable until the problems were fixed.

  2. A drastic, but thorough, approach to isolation was to simply run each service on its own computer. That avoids conflicts in library versions and other dependencies. However, it is an expensive solution, is cumbersome, and is often overkill in most environments.

  3. Finally, administrators could deploy virtual machines. This is a technology that allows one to run multiple operating systems on one computer and gives the illusion of services running on distinct systems. However, this is a heavyweight solution. Every service needs its own instance of the operating system and all supporting software for the service as well as standard services (networking, device management, shell, etc.). It is not efficient in terms of CPU, disk, or memory resources.

Containers are a mechanism created not for security but for making it easy to package, distribute, relocate, and deploy collections of software. The focus of containers is not to enable end users to install their favorite apps bur rather for administrators to be able to deploy services on a system. A container encapsulates all the necessary software for a service, all of its dependencies, and its configuration into one package that can be easily passed around, installed, and removed.

In many ways, a container feels like a virtual machine. Containers provide a service with a private process namespace, its own network interface, and its own set of libraries to avoid problems with incompatible versions used by other software. Containers also allow an administrator to give the service restricted powers even if it runs with root (administrator) privileges. Unlike a virtual machine, however, multiple containers on one system all share the same operating system and kernel modules.

Containers are not a new mechanism. They are implemented using Linux’s control groups, namespaces, and capabilities to provide resource control, isolation, and privilege control, respectively. They also make use of a copy on write file system. This makes it easy to create new containers where the file system can track the changes made by that container over a clean base version of a file system. AppArmor software in the container framework provides a basic form of mandatory access controls based on the pathnames of files. It allows an administrator to restrict the ability of a program to access specific files even within its file system namespace.

The best-known and first truly popular container framework is Docker. A Docker Image is a file format that creates a package of applications, their supporting libraries, and other needed files. This image can be stored and deployed on many environments. Docker made it easy to deploy containers using git-like commands (docker push, docker commit) and also to perform incremental updates. By using a copy on write file system, Docker images can be kept immutable (read-only) while any changes to the container during its execution are stored separately.

As people found Docker to be useful, the next design goal was to make it easier to manage containers across a network of many computers. This is called container orchestration. There are many solutions for this, including Apache Mesos, Kubernetes, Nomad, and Docker Swarm. The best known of these is
kubernetes, which was designed by Google. It coordinates storage of containers, failure of hardware and containers, and dynamic scaling: deploying the container on more machines to handle increased load. Kubernetes is coordination software, not a container system; it uses the Docker framework to run the actual container.

Even though containers were designed to simplify software deployment rather than provide security to services, they do offer several benefits in the area of security:

  • They make use of namespaces, cgroups, and capabilities with restricted capabilities configured by default. This provides isolation among containers.

  • Containers provide a strong separation of policy (defined by the container configuration) from the enforcement mechanism (handled by the operating system).

  • They improve availability by providing the ability to have a watchdog timer monitor the running of applications and restarting them if necessary. With orchestration systems such as Kubernetes, containers can be re-deployed on another system if a computer fails.

  • The environment created by a container is reproducible. The same container can be deployed on multiple systems and tested in different environments. This provides consistency and aids in testing and ensuring that the production deployment matches the one used for development and test. Moreover, it is easy to inspect exactly how a container is configured. This avoids problems encountered by manual installation of components where an administrator may forget to configure something or may install different versions of a required library.

  • While containers add nothing new to security, they help avoid comprehension errors. Even default configurations will provide improved security over the defaults in the operating system and configuring containers is easier than learning and defining the rules for capabilities, control groups, and namespaces. Administrators are more likely to get this right or import containers that are already configured with reasonable restrictions.

Containers are not a security panacea. Because all containers run under the same operating system, any kernel exploits can affect the security of all containers. Similarly, any denial of service attacks, whether affecting the network or monopolizing the processor, will impact all containers on the system. If implemented and configured properly, capabilities, namespaces, and control groups should ensure that privilege escalation cannot take place. However, bugs in the implementation or configuration may create a vulnerability. Finally, one has to be concerned with the integrity of the container itself. Who configured it, who validated the software inside of it, and is there a chance that it may have been modified by an adversary either at the server or in transit?


The goal of an application sandbox is to provide a controlled and restricted environment for code execution. Sandboxes were created to allow users to download and execute untrusted or semi-trusted applications with minimal risk of causing widespread damage to the system. The sandbox can define what an individual application is allowed to do while executing in its sandbox. FreeBSD Jails and containers are a form of sandboxing that were designed primarily for server software deployment (by system administrators). Sandboxes were designed with the focus of allowing normal users to run their apps in a restricted environment.

A sandbox provides the ability to set finer-grain policies on what an application cannot do than those provided by mechanisms such as Linux’s namespaces or access control lists. For example, an application may be disallowed from reading the file /etc/passwd; be disallowed from writing any files; or be allowed to establish a TCP/IP connection but not send UDP messages. Sandboxing is currently supported on a wide variety of platforms at either the kernel or application level.

Applications interact with their environment via system calls to the operating system. Anything that an application needs to do, whether legitimate or via an attack, must be done through system calls: accessing files or devices, changing permissions, accessing the network, talking with other processes, etc.

Application sandboxing works through system call interposition. This means that an application’s system calls are intercepted, examined, and validated before they are allowed to be processed by the operating system kernel. The validation may be handled outside the kernel or within it. Alternatively, applications may be compiled in such a way that they do not have direct access to system calls but instead make requests to libraries that perform the validation.

User-level sandboxing


One example of doing validation at the user level is the Janus sandboxing system, developed at UC Berkeley, originally for SunOS but later ported to Linux. Janus uses a loadable kernel module. It then sets up system call hooks to redirect system call requests to to the Janus kernel module. A hook is simply a mechanism that redirects an API request somewhere else and allows it to return back for normal processing. For example, a function can be hooked to log the fact that it has been called.

In the case of Janus, when a user starts an application, a policy engine is started instead. This is a normal process that reads the policy file for that application, which defines allowable file accesses and network operations. It then forks the process (creates a clone of itself) and the child process executes the actual application.

Whenever the application makes a system call, it is redirected by the hook in the kernel to the Janus kernel module. The module blocks the thread (it is still waiting for the return from the system call) and sends the query to the policy engine. The policy engine determines whether, based on the policy, the process should be permitted to make the system call. If so, the system call is directed back to the operating system. If not, an error code is returned to the application.

The biggest shortcoming of Janus is that it needs to mirror the state of the operating system’s environment. For example, if the application forks its process, Janus’ security engine must do the same. It needs to keep track of not just network operations but the proper sequencing of the steps in the protocol. Since it hands off approved system calls to the kernel, it does not know if any of these calls failed. This may enable attack vectors such as trying to send data on an unconnected socket. Keeping track of duplicated file descriptors is also tricky as an operation on one file descriptor may mimic that on another one. We saw how validating file pathnames can be error-prone, yet Janus must do this as well. Part of keeping state involves tracking changes to the current directory or even changes to the root of the file system. In addition to the difficulty of doing so accurately, this state mirroring also opens up risks of TOCTTOU (time of check to time of use) vulnerabilities.

Chromium Native Client (NaCl)

Another example of sandboxing is the Chromium Native Client, called NaCl. Chromium is the open source project behind Google Chrome browser and Chrome OS. NaCl is also a user-level sandbox and works by restricting the type of code it can sandbox. It is designed for the safe execution of platform-independent, untrusted native code inside a browser. The motivation was that some browser-based applications will be so compute-intensive that writing them in JavaScript will not be sufficient. These native applications may be interactive and may use various client resources but will need to do so in a controlled and monitored manner.

NaCl supports two categories of code: trusted and untrusted. Trusted code can run without a sandbox. Untrusted code must run inside a sandbox. This code has to be compiled using the NaCl SDK or any compiler that adheres to NaCl’s data alignment rules and instruction restrictions (not all machine instructions can be used). Since applications cannot access resources directly, the code is also linked with special NaCl libraries that provide access to system services, including the file system and network. Before executing an application, NaCl statically verifies the code of the application to check whether it has any privileged instructions encoded in it. NaCl executes with two sandboxes in place. The inner sandbox uses Intel’s IA–32 architecture’s segmentation capabilities to isolate memory regions among apps so that even if multiple apps run in the same process space, their memory is still isolated. The outer sandbox uses system call interposition to restrict the capabilities of apps at the system call level.

OS-level sandboxing

Some operating systems provide kernel support for sandboxing. These include the Android Application Sandbox, the iOS App Sandbox, the macOS sandbox, and AppArmor on Linux. Let’s take a cursory look at the Apple Sandbox on macOS. It uses sandboxing at the operating system level and allows a detailed set of policies to be enumerated that govern networking and file system reads and writes. Policies can include patterns to allow one to restrict file operations to specific directories or to files matching certain names. These policies are parsed, converted into a compact binary format, and loaded into the kernel, avoiding the need for future upcalls to a user process as Janus does. The kernel runs the TrustedBSD subsystem. This subsystem was originally designed to enforce mandatory access control policies. It hooks system calls and passes requests to a kernel extension that goes through the list of rules to determine whether a specific instance of a system call should be allowed or not. These rules give users the ability to define regular expression patterns to match names of resources.

Process virtual machine sandboxes

A different type of sandbox is the Java Virtual Machine. The Java language was originally designed as a language for web applets, compiled Java programs that would get download and run dynamically upon fetching a web page. As such, confining how those applications run and what they can do was extremely important. Because the author of the application would not know what operating system or hardware architecture a client had, Java would compile to a hypothetical architecture called the Java Virtual Machine (JVM). An interpreter on the client would simulate the JVM and process the instructions in the application. The Java sandbox has three parts to it:

The bytecode verifier verifies Java bytecodes before they are executed. It tries to ensure that the code looks like valid Java byte code with no attempts to circumvent access restrictions, convert data illegally, bypass array bounds, or forge pointers.

The class loader enforces restrictions on whether a program is allowed to load additional classes and that key parts of the runtime environment are not overwritten. It implements ASLR (Address Space Layout Randomization) by randomly laying out Runtime data areas (stacks, bytecodes, heap).

The security manager enforces the protection domain. It defines the boundaries of the sandbox and is consulted before any access to a resource is permitted. It is invoked at the time an application makes a call to specific methods to provide run-time verification of whether a program has been given rights to invoke the method, such as file I/O or network access.

Java security is deceptively complex. After over twenty years of bugs one hopes that the truly dangerous ones have been fixed. Even though the Java language itself is pretty secure and provides dynamic memory management and array bounds checking, buffer overflows have been found in the underlying C support library, which has been buggy in general. Varying implementations of the JVM environment on different platforms make it unclear how secure any specific client will be. Moreover, Java supports the use of native methods, libraries that you can write in compiled languages such as C that interact with the operating system directly. These bypass the Java sandbox.

Virtual Machines

As a general concept, virtualization is the addition of a layer of abstraction to physical devices. With virtual memory, for example, a process has the impression that it owns the entire memory address space. Different processes can all access the same virtual memory location and the memory management unit (MMU) on the processor maps each access to the unique physical memory locations that are assigned to the process.

Process virtual machines present a virtual CPU that allows programs to execute on a processor that does not physically exist. The instructions are interpreted by a program that simulates the architecture of the pseudo machine. Early pseudo-machines included o-code for BCPL and P-code for Pascal. The most popular pseudo-machine today is the Java Virtual Machine (JVM). This simulated hardware does not even pretend to access the underlying system at a hardware level. Process virtual machines will often allow “special” calls to invoke system functions or provide a simulation of some generic hardware platform.

System virtual machines allow a physical computer to act like several real machines with each machine running its own operating system (on a virtual machine) and applications that interact with that operating system. The key to this machine virtualization is to not allow each operating system to have direct access to certain privileged instructions in the processor. These instructions would allow an operating system to directly access I/O ports, MMU settings, the task register, the halt instruction and other parts of the processor that could interfere with the processor’s behavior and with the other operating systems on the system.

Instead, a trap and emulate approach is used. Privileged instructions, as well as system interrupts, are caught by the Virtual Machine Monitor (VMM), also known as a hypervisor. The hypervisor arbitrates access to physical resources and presents a set of virtual device interfaces to each guest operating system (including the memory management unit, I/O ports, disks, and network interfaces). The hypervisor also handles preemption. Just as an operating system may suspend a process to allow another process to run, the hypervisor will suspend an operating system to give other operating systems a chance to run.

The two configurations of virtual machines are hosted virtual machines and native virtual machines. With a hosted virtual machine (also called a type 2 hypervisor), the computer has a primary operating system installed that has access to the raw machine (all devices, memory, and file system). This host operating system does not run in a virtual environment. One or more guest operating systems can then be run on virtual machines. The VMM serves as a proxy, converting requests from the virtual machine into operations that get sent to and executed on the host operating system. A native virtual machine (also called a type 1 hypervisor) is one where there is no “primary” operating system that owns the system hardware. The hypervisor is in charge of access to the devices and provides each operating system drivers for an abstract view of all the devices.

Earlier Intel and AMD processors (prior to 2006), as well as ARM processors (prior to 2011) did not generate a trap when privileged were executed by non-privileged users; the instructions were simply ignored. This made implementing a virtual machine challenging. Two mechanisms were employed:

Binary Translation
Kernel code is translated to replace non-virtualizable privileged instructions with new sequences of instructions that act on the virtual hardware. Everything else is executed directly. This is transparent to the operating system; it does not know it’s being virtualized. VMware used this for Intel platforms.
With paravirtualization, the operating system is written in a way that it does not use non-virtualizable instructions. Any privileged operations are invoked as direct calls to the hypervisor. While this is straightforward, it does require modifying the operating system and precludes virtualizing closed-source systems, such as Windows. Xen is an example of a system that uses paravirtualization.

The latest processors from Intel and AMD support the concept of a virtual machine layer and the ability to intercept privileged instructions. Prior to that, one of two approaches was used to implement virtualization.

Security implications

Unlike app confinement mechanisms such as jails, containers, or sandboxes, virtual machines enable isolation all the way through the operating system. A compromised application, even with escalated privileges, can wreak havoc only within the virtual machine. Even compromises to the operating system kernel are limited to that virtual machine. However, a compromised virtual machine is not much different form having a compromised physical machine sitting inside your organization: not desirable and capable of attacking other systems in your environment.

Multiple virtual machines are usually deployed on one physical system. In cases such cloud services (e.g., such as those provided by Amazon), a single physical system may host virtual machines from different organizations or running applications with different security requirements. If a malicious application on a highly secure system can detect that it is co-resident on a computer that is hosting another operating system and that operating system provides fewer restrictions, the malware may be able to create a covert channel to communicate between the highly secure system with classified data and the more open system. A covert channel is a general term to describe the the ability for processes to communicate via some hidden mechanism when they are forbidden by policy to do so. In this case, the channel can be created via a side channel attack. A side channel is the ability to get or transmit information using some aspects of a system’s behavior, such as changes in power consumption, radio emissions, acoustics, or performance. For example, processes on both systems, even though they are not allowed to send network messages, may create a means of communicating by altering and monitoring system load. The malware on the classified VM can create CPU-intensive task at specific times. Listener software on the unclassified VM can do CPU-intensive tasks at a constant rate and periodically measure their completion times. These completion times may vary based on whether the classified system is doing CPU-intensive work. The variation in completion times creates a means of sending 1s and 0s and hence transmitting a message.


Malware is a generic term for malicious software. Malware can be distributed in various ways: viruses, worms, or trojan horses. It may spy on user actions and collect information on them (spyware), or present unwanted ads (adware). It may disable components of the system or encrypt files, undoing its damage if the owner pays money (ransomware). The software may sit dormant and wait for directives from some coordinator, who assembled an arsenal of hundreds of thousands of computers ready to do his bidding (for example, launch a distributed denial of service, DDoS, attack). Some software might be legitimate but may contain backdoors – undocumented ways to allow an outsider to use that software to perform other operations on your system.

Worms and viruses

A virus is software that attaches itself to another piece of software. It may also be content, such as scripts inside a Microsoft Word document, that will be accessed and executed by some software. It might even be a modification of the boot loader of a computer or the firmware on a flash drive. The key point is that it does not run as an independent process. A virus is executed because something else ran. Viruses are usually spread by sharing files or software. On a computer, a virus may replicate itself onto other files or software to maximize its chance of spreading.

A worm is conceptually similar in that it can do the same damage to the computer as a virus can. The distinction from a virus is that a worm runs as a standalone process while a virus requires a host program. Like virus, some worms require human intervention to spread. In other cases, worms can replicate themselves and spread to other systems automatically, exploiting weaknesses in software on those computers to allow themselves to infiltrate those machines. The popular use of both terms, worm and virus has often blurred the distinctions between them.

When using non-legitimate ways of getting into a system or elevating their privileges, attackers often try to find zero-day vulnerabilities. These are vulnerabilities (bugs or configuration errors) that have not been publicly reported and hence are unpatched.

Virus components

A virus contains three components:

Infection mechanism
The purpose of the infection mechanism is to enable a virus to spread. The software searches for infection targets, which may be other programs, specific files, or disk regions.
This is the malicious part of the virus and contains the code that does the actual harm to the system such as uploading personal information or deleting files.
The trigger, also called a logic bomb, is code that is run whenever a file containing the virus is run and it makes the decision whether the payload should be executed. For example, some viruses may stay dormant for some time.

File infector virus

A file infector virus is a virus that adds itself to an executable program. The virus patches the program so that, upon running, control will flow to the the virus code. Ideally, the code will install itself in some unused area of the file so that the file length will remain unchanged. A comparison of file sizes with the same programs on other systems will not reveal anything suspicious. When the virus runs, it will run the infector to decide whether to install itself on other files. The trigger will then decide whether the payload should be executed. If not, the program will appear to run normally.

Boot sector virus

Boot sector viruses had an infector that would install itself in the Master Boot Record (MBR) of a disk. In BIOS-based PC systems, the first sector of the bootable storage device is read into memory and executed when the system boots, Normally, the code that is loaded is the boot loader that then loads the operating system. By infecting the master boot record, the virus can repeatedly re-infiltrate the operating system or files on the disk even if any malware on the system was detected and removed previously.

Boot sector viruses were popular in the early days of PCs when users often booted off floppy disks and shared these disks. The virus would often use DOS commands to install itself onto other disks that it detects. These viruses were have largely diminished as attackers found more appealing targets. However, there is no reason that malware that attacks the bootloader should not be considered to be a continued threat. 2011 saw the emergence of ransomware that modified the boot loader to prevent the operating system from booting unless a ransom was paid. In 2016, the Petya Trojan appeared, which also infects the MBR and encrypts disk contents.

Infected flash drives

In the early days of PCs, people would share content by passing around floppy disks. This became a means for viruses to spread, which could be planted in either the boot sector or in files. These days, people share USB flash drives the way they used to share floppies.


In earlier Windows systems, Microsoft provided a feature called AutoRun. It was designed to make the CD (and, later, DVD and flash drive) experience better for users, particularly when using CDs for software installation. If the CD contained a file called autorun.inf, Windows would automatically execute a program identified within that file. While this made the experience of figuring out what to do after a PC is installed easier for most users, it created a horrific security vulnerability: all that an adversary had to do was to get you to insert the media. Moreover, this functionality worked with any removable storage so that inserting a flash drive would automatically run a program defined by autorun.inf on the drive. Microsoft eventually removed this capability from flash drives but some manufacturers created USB drives that emulated a CD drive to offer the “convenience” of AutoRun. Microsoft ultimately removed this functionality altogether in Windows 7. However, there are still old, unpatched versions of Windows out there that can be exploited with this vulnerability.


The more insidious problem with USB flash drives now is unprotected firmware. A USB flash drive is a bunch of memory as well as firmware – embedded software on the chip. The firmware runs when you plug the drive into your computer. It identifies the drive as a USB storage device and manages the transferring of data. You don’t see this firmware and cannot tell if it has been changed. Because the firmware defines what the USB device is, modified firmware on the flash drive could present the drive as a keyboard and send a set of keyboard commands to the host system (for example, commands to open the terminal window and delete files). A USB device can have multiple profiles associated with it and thus present itself as multiple devices, so the flash drive can tell the computer it is a keyboard but also a flash drive, so the user will still be able to use the device as a storage device. The firmware could also modify file contents as they pass between the USB storage device and host computer. The same attack can be user on other USB devices. For example, an ethernet adapter can redirect network messages to an attacker’s site. Reprogramming the firmware has not been exploited by malware thus far but the vulnerability has been demonstrated.

Data leakage

The most common problem with flash drives is their portability and small size: they are easy to lose and easy to borrow. This makes them vulnerable to data leakage, which is just a fancy term that means some adversary may access your data simply by borrowing your flash drive.

Inadvertent program execution

The portability of flash drives makes them a distribution mechanism. Experiments of scattering a number of them in parking lots revealed that many people are all too willing to plug a random drive into their system.

Even without automatic execution capabilities enabled, attackers can use flash drives as a distribution mechanism for malware. The Stuxnet attack exploited a windows bug in rendering shortcut icons where just viewing them in Windows Explorer enabled the execution of arbitrary code. Others have exploited a bug in video playback that allowed code execution. Even something as simple as an HTML file on a drive may direct the target to a website that can launch an attack.

Macro viruses

Some applications have support for macros, which allow the user to define a set of commands to avoid repetitive tasks and improve productivity. They are particularly common in text editors but are present in other applications as well, such as Photoshop and Microsoft Word and Excel. In some cases, as with Microsoft Office applications, macros are embedded in the document, which means they can be passed on to other users who access that document. Some macro capabilities are far more powerful than defining repetitive commands. Microsoft Excel and Word, for example, provide Visual Basic scripting, which effectively allows users to embed complete programs into their documents. These programs, in turn, can modify the default template file, normal.dot, which will in turn affect every other document on the system. This is a ripe area for attack. If you can convince somebody to open a document, they will run your program on their machine.

The challenge, of course, is to get a file with a malicious macro to target users and get them to open it. One of the most common techniques is to send it as an email attachment with some inducement to get the user to click on the document. One hugely-successful virus that did this was the ILOVEYOU virus from 2000. The subject of the message stated that it is a letter from a secret admirer. The attachment wasn’t even a document; it was a visual basic script. To provide a better user experience, Microsoft would hide file extensions by default (macOS does this too). The file was named LOVE-LETTER-FOR-YOU.TXT.vbs but the .vbs suffix, which indicated that the file was a visual basic script, was hidden from users, so they only saw LOVE-LETTER-FOR-YOU.TXT. Not being aware of when extensions are hidden and when they are not, millions of users assumed they received an innocuous text file and clicked on it. Upon execution, the script would copy itself into various folders, modify and add new entries to the system registry, replace various types of files with copies of itself (targeting music and video files), and try to propagate itself through Internet relay Chat clients as well as email. If that wasn’t enough, it would download a file called WIN-BUGFIX.EXE and execute it. This was not a bug fixing program but rather a program that extracted user passwords and mailed them to the hacker.

Social engineering plays a role in getting users to take the bait. In general, social engineering refers to any techniques used by an adversary to trick you into disclosing information, opening a file, downloading an attachment, reading a message, or running a program. The ILOVEYOU virus transmitted itself largely through email to contacts in infected computers, so your “secret admirer” message came from someone you knew and hence you were more likely to click on it. An earlier highly successful virus, Melissa, spread by offering a list of passwords for X-rated web sites. Email-based virus transmission is still a dominant mechanism. Sender headers and links are often disguised to make it look like the content is from a legitimate party.

JavaScript and PDF

JavaScript, like Visual Basic, has evolved into a full programming language. Most browsers have security holes that involve Javascript. JavaScript can not only modify the content and structure of a web page but can connect to other sites. This allows any malicious site to leverage your machine. For example, systems can perform port scans on a range of IP addresses and report any detected unsecured services.

PDF (Portable Document Format) files, would seem to be innocent printable documents, incapable of harboring executable code. However, PDF is a complex format that can contain a mix of static and dynamic elements. Dynamic elements may contain Javascript, dynamic action triggers (e.g., “on open”), and the ability to retrieve “live” data via embedded URLs. As with Visual Basic scripts, PDF readers warn users of dynamic content but, depending on the social engineering around the file, the user may choose to trust the file … or not even pay attention to the warning in yet-another-dialog-box.

Trojan horses

A Trojan horse is a program with two purposes: an overt purpose and a covert one. The overt purpose is what compels the user to get and run the program in the first place. The covert purpose is unknown to the user and is the malicious part of the program.

For example, a script with the name of a common Linux command might be added to a target user’s search path. When the user runs the command, the script is run. That script may, in turn, execute the proper command, leading the user to believe that all is well. As a side effect, the script may create a setuid shell to allow the attacker to impersonate that user or mail copy over some critical data. Users install trojan horses because they believe they are installing useful software, such as an anti-virus tool (BTW, a lot of downloadable hacker tools contain trojan horses: hackers hacking wannabe hackers). The side-effect of this software can activate cameras, enable key loggers, or deploy bots for anonymization servers, DDoS attacks, or spam attacks.

Trojans may include programs (games, utilities, anti-malware programs), downloading services, rootkits (see next) and backdoors (see next). They appear to perform a useful task that does not raise suspicion on the part of the victim.


A backdoor is software that is designed with some undocumented mechanism to allow someone who knows about the mechanism to be able to access the system or specific functions in a way that bypasses proper authentication mechanisms. In many cases, they are not designed for malicious use: they may allow a manufacturer to troubleshoot a device or a software author to push an update. However, if adversarial parties discover the presence of a backdoor, they can use it for malicious purposes.

An old example of a backdoor is the sendmail mail delivery server. The author of sendmail wanted to have development access on a production system that had the program installed so that he can continue to improve it. The system administrator refused such access. His next release of sendmail contained a password-protected backdoor that gave him access to the system via the sendmail server. The password was hard-coded in the program and soon became well-known. Robert Morris used the knowledge of this backdoor as one of the mechanisms for his worm to propagate to other systems. More recently, in 2014, some Samsung Galaxy phones were delivered with backdoors that provide remote access to the data on the phone.


A rootkit is software that is designed to allow an attacker to access a computer and hide the existence of the software … and sometimes hide the presence of the user on the system.

Historically, a basic rootkit would replace common administration commands (such as ps, ls, find, top, netstat, etc.) with commands that mimic their operation but hide the presence of intruding users, intruding processes, and intruding files. The idea is that a system administrator should be able to examine the system and believe that all is fine and the system is free of malware (or of unknown user accounts).

User mode rootkits

The rootkit just described is a user mode rootkit and involves replacing commands, intercepting messages, and patching commonly-used APIs that may divulge the presence of the malware. A skilled administrator may find unmodified commands or import software to detect the intruding software.

Kernel mode rootkits

A kernel mode rootkit is installed as a kernel module. Being in the kernel gives the rootkit unrestricted access to all system resources and the ability to patch kernel structures and system calls. For example, directory listings from the getdents64 system call may not report any names that match the malware. Commands and libraries can be replaced and not give any indication that malicious software is resident in the system.

Hypervisor rootkits

The most insidious rootkits are hypervisor rootkits. A hypervisor sits below the operating system and is responsible for translating between virtual device operations from operating systems and the underlying hardware. All I/O flows through the hypervisor. Most computer systems do not run virtual machines and hence have no hypervisor. These systems are prime targets for a hypervisor-based rootkit. Now you can have an environment where the entire operating system can run unmodified - or even be reinstalled - and be unaware that its operations are being intercepted at a lower level. The hypervisor does not have to virtualize all hardware interactions: just the ones it cares about. For example, it might want to grab keyboard events to record passwords and messages.

Hypervisor attacks have not been deployed but have been demonstrated as a proof of concept. Detection is difficult and often relies on measuring completion times of system calls. If they go through a hypervisor, they will take a longer time and the on-chip Time Stamp Counter (TSC), which counts CPU cycles, will show a longer value with a hypervisor in place. An alternative, and far more obscure, method of detection, is the use of an instruction that stores the interrupt descriptor table register (IDTR) into a memory location (the SIDT instruction). The hypervisor changes the register’s value and the instruction can detect that. However, this does not have to take place on a system with only one operating system, so measuring timing differences may still be the more foolproof approach.

Gathering information

Malware has varying goals. These goals may include spying on user activity, destroying content, assembling a collection of servers, or extracting money from a victim. One common goal is to gather information … or get the user to provide information. Your computer might not have anything of direct value to an adversary, but your PayPal, bank, Amazon, or eBay credentials might be useful.


Phishing is a social engineering attack to get personal information from someone, usually login credentials to some service. These are often carried out vie email with similar techniques that are used to spread infected files. A message announcing that your PayPal account is being canceled, that your bank detected a fraudulent transaction, or that FedEx could not deliver a package may prompt the receiver to panic and immediately click on a link in the message, which may result in the browser displaying a site crafted to look like PayPal, the bank, or FedEx and prompt the user for login and password information.

Spear phishing is a targeted form of phishing. A phishing attack sends the same message to a large set of users, hoping that some percentage of them will be fooled. A spear phishing attack sends a customized message that demonstrates some knowledge of the target, which will usually lead the target to think that the message is legitimate. For example, the 2016 Democratic National Committee (DNC) was facilitated by spear phishing. Targets were sent a message containing bit.ly links, which hid the actual underlying URLs. Once clicked, the web site would display what looked like a legitimate Google accounts login page, already pre-populated with the victim’s GMail address.

Recent GMail spear phishing attacks send email to contacts of compromised accounts. The email contains an innocent-looking attachment: a thumbnail image of a document. When the victim clicks on the attachment, a web page that looks like a legitimate Google sign-in page is presented. As soon as the victim enters a name and password, the attackers get the credentials, log into the account, and target people in the victim’s contact list. They use an image of an actual attachment in the victim’s email and an actual subject line to make the email look more legitimate.


Another way of obtaining information is to snoop on a user’s actions. Keyloggers record everything a victim types and allow a user to extract login names, passwords, and entire messages.

Keyloggers can be implemented in several ways:

Malicious hypervisor
Since a hypervisor provides virtual interfaces for all the resources of a computer, it can capture all keyboard, mouse, and even video data. These attacks are difficult since they rely on the ability to install a hypervisor.
Kernel-based rootkit
All input/output operations go through the operating system kernel. Modifying the kernel allows malicious software to log and upload keystroke data.
System call hooking
Some operating systems provide a system call hooking mechanism that allows data to and from system calls to be intercepted. We saw how this was used to implement sandboxing. Windows enables this without having to install any kernel-level drivers. The SetWindowsHookEx system call can be used to report WH_KEYBOARD and WH_MOUSE events, capturing keyboard and mouse activity.
Browser-based logging
JavaScript can be used to capture onKeyUp() events. The restriction is that the events will be captured for one page but other hacks can be used to create a broader context with embedded pages. Form submission can also be intercepted to get populated form data without having to reassemble key presses into coherent account credentials.
Hardware loggers
Although visible to the user, hardware key loggers can be used for USB-connected keyboards. Some of these have embedded Wi-Fi transceivers that enable an attacker to collect the data from a distance.


If we consider the goals of malware agin, one common goal was to extract money: even hackers need to monetize their efforts. An indirect way of accomplishing this was by collecting information to gain access to bank account data, PayPal data, or modifying accounts that may take money, such as eBay accounts. A more direct way of getting money is to demand it from the victim. Ransomware is a relatively new form of malware that locks a computer, keeps it from booting, or encrypts all files on the system. It then asks the suer to pay a ransom (usually via bitcoin) to get a decryption program.


Malware was particularly easy to spread on older Windows systems since user processes ran with full administrative rights, which made it easy to modify any files on the system and even install kernel drivers. Adding file protection mechanisms, such as a distinction between user and administrator accounts added a significant layer of protection. However, malware installed by the user would run with that user’s privileges and would have full access to all of a user’s files. If any files are read or write protected, the malware can change DAC permissions.

Systems took the approach of warning users if software wanted to install software or asked for elevated privileges. In the case of Trojans, social engineering leads to users believing tha they actually want to install the software (or view the document). They will happily grant permissions and install the malware. MAC permissions can stop some viruses as they will not be able override permissions on, say, executable files but macro viruses and the user files are still a problem.

In general, however, studies have shown that by simply taking away admin rights (avoiding privilege escalation) from users, 94% of the 530 Microsoft vulnerabilities that were reported in 2016 could be mitigated and 100% of vulnerabilities in Office 2016 could be mitigated.

There is no way to recognize all possible viruses. Anti-virus programs look for bit patterns that match known viruses, Each bit pattern is an excerpt of code from the virus and is called a signature (not to be confused with digital signatures, discussed later) . The scanning process is called signature scanning. Lists of signatures have to be updated by the anti-virus software vendor as new viruses are discovered. A virus signature is simply a set of bytes that make up a portion of the virus and allow scanning software to see whether that virus is inside a file. The hope is that the signature is long enough and unique enough that the byte pattern will not occur in legitimate programs.

Some viruses try to defend themselves from anti-virus software. They encrypt most of the virus and decrypt it only upon execution. The only non-encrypted part is the decryption software and the key. A virus scanner will need to match the code for the decryption component since the key and the encrypted components can change each time the virus propagates itself.

Polymorphic viruses mutate their code each time they run while keeping the algorithm the same. This involves replacing sequences of instructions with functionally-identical ones. For example, one can change additions to subtractions of negative numbers, invert conditional tests and branches, and insert or remove no-op instructions. It thwarts signature scanning software because the the byte pattern of the virus is different each time.

Some virus detecting software will try to apply sandboxing: run suspect code in a sandbox or in an interpreted environment and see what the software tries to do. This is not foolproof since a trigger may keep the virus from immediately performing malicious actions.

A more sophisticated, and also unreliable, approach is anomaly detection. The virus-detection software would examine system behavior and look for abnormal-looking patterns of file or network activity.

Ultimately, access controls help but do not stop the problem. Limiting privilege escalation works much better but user files remain at risk. Containment mechanisms such as containers work well for server software but are usually impractical for user software (e.g., you want Microsoft Word to be able to read documents anywhere in a user’s directories).

Trojans and phishing attacks are insidiously difficult to defend against since we are dealing with human nature: users want to install the software or provide the data. Users are conditioned to accepting pop-up messages and entering a password. Better detection in browsers & mail clients against suspicious content or URLs helps.


Cryptography deals with encrypting plaintext using a cipher, also known as an encryption algorithm, to create ciphertext, which is unintelligible to anyone unless they can decrypt the ciphertext. It is a tool that helps build protocols that address:

Showing that the user really is that user.
Validating that the message has not been modified.
Binding the origin of a message to a user so that she cannot deny creating it.
Hiding the contents of a message.

A restricted cipher is one where the workings of the cipher must be kept secret. There is no reliance on any key and the secrecy of the cipher is crucial to the value of the algorithm. This has obvious flaws: people in the know leaking the secret, designers coming up with a poor algorithm, and reverse engineering.

For any serious use of encryption, we use well-tested, non-secret algorithms that rely on secret keys. A key is a parameter to a cipher that alters the resulting ciphertext. Knowledge of the key is needed to decrypt the ciphertext. Kerckhoff’s Principle states that a cryptosystem should be secure even if everything about the system, except the key, is public knowledge. We expect algorithms to be publicly known and all security to rest entirely on the secrecy of the key.

A symmetric encryption algorithm uses the same secret key for encryption and decryption.

An alternative to symmetric ciphers are asymmetric ciphers. An asymmetric, or public key cipher uses two related keys. Data encrypted with one key can only be decrypted with the other key.

Properties of good ciphers

For a cipher to be considered good, ciphertext should be indistinguishable from random values. Given ciphertext, there should be no way to extract the original plaintext or the key that was used to create it except by of enumerating over all possible keys. This is called a brute-force attack. The keys used for encryption should be large enough that a brute force attack is not feasible. Each additional bit in a key doubles the number of possible keys and hence doubles the search time.

Classic cryptography

Monoalphabetic substitution ciphers

The earliest form of cryptography was the monoalphabetic substitution cipher. In this cipher, each character of plaintext is substituted with a character of ciphertext based on a substitution alphabet (a lookup table). The simplest of these is the Cæsar cipher, known as a shift cipher, in which a plaintext character is replaced with a character that is n positions away in the alphabet. The key is the simply the the shift value: the number n. Substitution ciphers are vulnerable to frequency analysis attacks, in which an analyst analyzes letter frequencies in ciphertext and substitutes characters with those that occur with the same frequency in natural language text (e.g., if “x” occurs 12% of the time, it’s likely to really be an “e” since “e” occurs in English text approximately 12% of the time while “x” occurs only 0.1% of the time).

Polyalphabetic substitution ciphers

Polyalphabetic substitution ciphers were designed to increase resiliency against frequency analysis attacks. Instead of using a single plaintext-to-ciphertext mapping for the entire message, the substitution alphabet may change periodically. In the Alberti cipher (essentially a secret decoder ring), the substitution alphabet changes every n characters as the ring is rotated one position every n characters. The Vigenère cipher is a grid of Cæsar ciphers that uses a repeating key. A repeating key is a key that repeats itself for as long as the message. Each character of the key determines which Cæsar cipher (which row of the grid) will be used for the next character of plaintext. The position of the plaintext character identifies the column of the grid.

One-time Pads

The one-time pad is the only provably secure cipher. It uses a random key that is as long as the plaintext. Each character of plaintext is permuted by a character of ciphertext (e.g., add the characters modulo the size of the alphabet or, in the case of binary data, exclusive-or the next byte of the text with the next byte of the key). The reason this cryptosystem is not particularly useful is because the key has to be as long as the message, so transporting the key securely becomes a problem. The challenge of sending a message securely is now replaced with the challenge of sending the key securely. The position in the key (pad) must by synchronized at all times. Error recovery from unsynchronized keys is not possible. Finally, for the cipher to be secure, a key must be composed of truly random characters, not ones derived by an algorithmic pseudorandom number generator. The key can never be reused.

The one-time pad provides perfect secrecy (not to be confused with forward secrecy, also called perfect forward secrecy, which will be discussed later), which means that the ciphertext conveys no information about the content of the plaintext. It has been proved that perfect secrecy can be achieved only if there are as many possible keys as the plaintext, meaning the key has to be as long as the message.

Stream ciphers

A stream cipher simulates a one-time pad by using a keystream generator to create a set of key bytes that is as long as the message. A keystream generator is a pseudorandom number generator that is seeded, or initialized, with a key that drives the output of all the bytes that the generator spits out. The keystream generator is fully deterministic: the same key will produce the same stream of output bytes each time. Because of this, receivers only need to have the key to be able to decipher a message. However, because the keystream generator does not generate true random numbers, the stream cipher is not a true substitute for a one-time pad. Its strength rests on the strength of the key. A keystream generator will, at some point, will reach an internal state that is identical to some previous internal state and produce output that is a repetition of previous output. This also limits the security of a stream cipher but the repetition may not occur for a long time, so stream ciphers can still be useful for many purposes.

Rotor machines

A rotor machine is an electromechanical device that implements a polyalphabetic substitution cipher. It uses a set of disks (rotors), each of which implements a substitution cipher. The rotors rotate with each character in the style of an odometer: after a complete rotation of one rotor, the next rotor advances one position. Each successive character gets a new substitution alphabet applied to it. The multi-rotor mechanism allows for a huge number of substitution alphabets to be employed before they start repeating when the rotors all reach their starting position. The number of alphabets is cr, where c is the number of characters in the alphabet and r is the number of rotors.

Transposition ciphers

Instead of substituting one character of plaintext for a character of ciphertext, a transposition cipher scrambles the position of the plaintext characters. Decryption is the knowledge of how to unscramble them.

A skytale is an ancient implementation of a transposition cipher where text written along a strip of paper is wrapped around a rod and the resulting sequences of text are read horizontally. This is equivalent to entering characters in a two-dimensional matrix horizontally and reading them vertically. Because the number of characters might not be a multiple of the width of the matrix, extra characters might need to be added at the end. This is called padding and is essential for block ciphers, which encrypt chunks of data at a time.

Block ciphers

Most modern ciphers are block ciphers, meaning that they encrypt a chunk of bits, or block, of plaintext at a time. The same key is used to encrypt each successive block of plaintext.

AES and DES are two popular symmetric block ciphers. Symmetric block ciphers are usually implemented as iterative ciphers. The encryption of each block of plaintext iterates over several rounds. Each round uses a subkey, which is a key generated from the main key via a specific set of bit replications, inversions, and transpositions. The subkey is also known as a round key since it is applied to only one round, or iteration. This subkey determines what happens to the block of plaintext as it goes through a substitution-permutation (SP) network. The SP network, guided by the subkey, flips some bits by doing a substitution, which is a table lookup of an input bit pattern to get an output bit pattern and a permutation, which is a scrambling of bits in a specific order. The output bytes are fed into the next round, which applies a substitution-permutation step onto a different subkey. The process continues for several rounds (16 rounds for DES, 10–14 rounds for AES). and the resulting bytes are the ciphertext for the input block.

Feistel ciphers

A Feistel cipher is a form of block cipher where a block plaintext is split into two parts. The substitution-permutation round is applied to only one part. That output is then XORed with the other part and the two halves are swapped. At each round, half of the input block remains unchanged. DES, the Data Encryption Standard, is an example of a Feistel cipher. AES, the Advanced Encryption Standard, is not.


Two popular symmetric block ciphers are DES, the Data Encryption Standard, and AES, the Advanced Encryption Standard. DES was adopted as a federal standard in 1976 and is a block cipher based on the Feistel cipher that encrypts 64-bit blocks using a 56-bit key.

DES has been shown to have some minor weaknesses against cryptanalysis. Key can be recovered using 247 chosen plaintexts or 243 known plaintexts. Note that this is not a practical amount of data to get for a real attack. The real weakness of DES is not the algorithm but but its 56-bit key. An exhaustive search requires 255 iterations on average (we assume that, on average, the plaintext is recovered halfway through the search). This was a lot for computers in the 1970s but is not much for today’s dedicated hardware or distributed efforts.


Triple-DES (3DES) solves the key size problem of DES and allows DES to use keys up to 168 bits. It does this by applying three layers of encryption:

  1. C’ = Encrypt M with key K1
  2. C’’ = Decrypt C’ with key K2
  3. C = Encrypt C’’ with key K3

If K1, K2, and K3 are identical, we have the original DES algorithm since the decryption in the second step cancels out the encryption in the first step. If K1 and K3 are the same, we effectively have a 112-bit key and if all three keys are different, we have a 168-bit key.

Cryptanalysis is not effective with 3DES: the three layers of encryption use 48 rounds instead of 16 making it infeasible to reconstruct the substitutions and permutations that take place. DES is relatively slow compared with other symmetric ciphers, such as AES. It was designed with hardware encryption in mind 3DES is, of course, three times slower than DES.


AES, the Advanced Encryption Standard, was designed as a successor to DES and became a federal government standard in 2002. It uses a larger block size than DES: 128 bits versus DES’s 64 bits and supports larger key sizes: 128, 192, and 256 bits. Even 128 bits is complex enough to prevent brute-force searches.

No significant academic attacks have been found thus far beyond brute force search. AES is also typically 5–10 times faster in software than 3DES.

Block cipher modes

Electronic codebook (ECB)

When data is encrypted with a block cipher, it is broken into blocks and each block is encrypted separately. This leads to two problems. If different encrypted messages contain the same substrings and use the same key, an intruder can see that the same data is encrypted. Secondly, a malicious party can delete, add, or replace blocks (perhaps with random junk or perhaps with blocks that were captured from previous messages). This basic form of a block cipher is called an electronic codebook (ECB). Think of the codebook as a database of encrypted content. You can look up a block of plaintext and find the corresponding ciphertext.

Cipher block chaining (CBC)

Cipher block chaining (CBC) addresses these problems. Every block of data is still encrypted with the same key. However, prior to being encrypted, the data block is exclusive-ored with the previous block of ciphertext. The receiver does the process in reverse: a block of received data is decrypted and then exclusive-ored with the previously-received block of ciphertext to obtain the original data. The very first block is exclusive-ored with a random initialization vector, which must be transmitted to the remote side. Note that CBC does not make the encryption more secure; it simply makes the result of each block of data dependent on all previous previous blocks so that data cannot be inserted or deleted in the message stream.

Counter mode (CTR)

Counter mode (CTR) also addresses these problems but in a different way. The ciphertext of each block is a function of its position in the message. Encryption starts with a message counter. The counter is incremented for each block of input. Only the counter is encrypted. The resulting ciphertext is then exclusive-ored with the corresponding block of plaintext, producing a block of message ciphertext. To decrypt, the receiver does the same thing and needs to know the starting value of the counter as well as the key. An advantage of CTR mode is that each block has no dependance on other blocks and encryption on multiple blocks can be done in parallel.


The goal of cryptanalysis is break codes. Most often, it is to identify some non-random behavior of an algorithm that will give the analyst an advantage over an exhaustive search of the key space.

Differential cryptanalysis seeks to identify non-random behavior by examining how changes in plaintext input affect changes in the output ciphertext. It tries to find whether certain bit patterns are unlikely for certain keys or whether the change in plaintext results in likely changes in the output.

Linear cryptanalysis tries to create equations that attempt to predict the relationships between ciphertext, plaintext, and the key. An equation will never be equivalent to a cipher but any correlation of bit patterns give the analyst an advantage.

Neither of these methods will break a code directly but may help find keys or data that are more likely are that are unlikely. It reduces the keys that need to be searched.

Public key cryptography

Public key algorithm, also known as asymmetric ciphers, use one key for encryption and another key for decryption. One of these keys is kept private (known only to the creator) and is known as the private key. The corresponding key is generally made visible to others and is known as the public key.

Anything encrypted with the private key can only be decrypted with the public key. This is the basis for digital signatures. Anything that is encrypted with a public key can be encrypted only with the corresponding private key. This is the basis for authentication and covert communication.

Public and private keys are related but, given one of the keys, there is no feasible way of computing the other. They are based on trapdoor functions, which are one-way functions – there is no known way to compute the inverse unless you have extra data: the other key.

RSA public key cryptography

The RSA algorithm is the most popular algorithm for asymmetric cryptography. Its security is based on the difficulty of finding the factors of the product of two large prime numbers. Unlike symmetric ciphers, RSA encryption is a matter of performing arithmetic on large numbers. It is also a block cipher and plaintext is converted to ciphertext by the formula:

c = me mod n

Where m is a block of plaintext, e is the encryption key, and n is an agreed-upon modulus that is the product of two primes. Given the ciphertext c, e, and n, there is no efficient way to compute the inverse to obtain m. To decrypt the ciphertext, you need the decryption key, d:

m = cd mod n

Elliptic curve cryptography (ECC)

Elliptic curve cryptography (ECC) is a more recent public key algorithm that is an alternative to RSA. It is based on finding points along a prescribed elliptic curve, which is an equation of the form:

y2 = x3 + ax + b

Elliptic curves have nothing to do with ellipses or conic sections. Here, the security rests not our inability to factor numbers but our inability to perform discrete logarithms in a finite field.

The RSA algorithm is still the most widely used public key algorithm, but ECC has some advantages:

  • ECC can use far shorter keys for the same degree of security. Security comparable to 256 bit AES encryption requires a 512-bit ECC key but a 15,360-bit RSA key

  • ECC requires less CPU consumption and uses less memory than RSA.

  • Generating ECC keys is faster than RSA (but much slower than AES, where a key is just a random number).

On the downside, ECC is more complex to implement and encryption is slower than with RSA.

Secure communication

Symmetric cryptography

Communicating securely with symmetric cryptography is easy. All communicating parties must share the same secret key. Plaintext is encrypted with the secret key to create ciphertext and then transmitted or stored. It can be decrypted by anyone who has the secret key.

Asymmetric cryptography

Communicating securely with asymmetric cryptography is a bit different. Anything encrypted with one key can be decrypted only by the other related key. For Alice to encrypt a message for Bob, she encrypts it with Bob’s public key. Only Bob has the corresponding key that can decrypt the message: Bob’s private key.

Hybrid cryptography

Asymmetric cryptography, however, is considerably slower than symmetric cryptography. AES, for example, is approximately 1,500 times faster for decryption than RSA and 40 times faster for encryption. AES is also much faster than ECC. Key generation is far slower with RSA or ECC than it is with symmetric algorithms, where the key is just a random number rather than a set of carefully chosen numbers with specific properties. Moreover, certain keys with RSA may be weaker than others. Because of these factors, RSA (and ECC) is rarely used to encrypt large chunks of information. Instead, it is common to use hybrid cryptography, where a public key algorithm is used to encrypt a randomly-generated key that will encrypt the message with a symmetric algorithm. This randomly-generated key is called a session key, since it is generally used for one communication session and then discarded.

Key Exchange

The biggest problem with symmetric cryptography is key distribution. For Alice and Bob to communicate, they must share a secret key that no adversaries can get. However, Alice cannot send the key to Bob since it would be visible to adversaries. She cannot encrypt it because Alice and Bob do not share a key yet.

Key exchange using a trusted third party

For two parties to communicate using symmetric ciphers they need to share the same key. The ways of doing this are:

  1. Share the key via some trusted mechanism outside of the network, such are reading it over the phone or sending a flash drive via FedEx.

  2. Send the key using a public key algorithm.

  3. Use a trusted third party.

We will first examine the use of a trusted third party. A trusted third party is a trusted system that has everyone’s key. Hence, only Alice and the trusted party (whom we will call Trent) have Alice’s secret key. Only Bob and Trent have Bob’s secret key.

The simplest way of using a trusted third party is to ask it to come up with a session key and send it to the parties that wish to communicate. Foe example, Alice sends a message to Trent requesting a session key to communicate with Bob. This message is encrypted with Alice’s secret key so that Trent knows the message could have only come from Alice.

Trent generates a random session key and encrypts it with Alice’s secret key. He also encrypts the same key with Bob’s secret key. Alice gets both keys and passes the one encrypted for Bob to Bob. Now Alice and Bob have a session key that was encrypted with each of their secret keys and they can communicate by encrypting messages with that session key.

This simple scheme is vulnerable to replay attacks. An eavesdropper, Eve, can record messages from Alice to Bob and replay them at a later time. Eve might not be able to decode the messages but she can confuse Bob by sending him seemingly valid encrypted messages.

The second problem is that Alice sends Trent an encrypted session key but Trent has no idea that Alice is requesting to communicate with him. While Trent authenticated Alice (simply by being able to decrypt her request) and authorized her to talk with Bob (by generating the session key), that information has not been conveyed to Bob.

Needham-Schroeder: nonces

The Needham-Schroeder protocol improves the basic key exchange protocol by adding nonces to messages. A nonce is simply a random string – a random bunch of bits. Alice sends a request to Trent, asking to talk to Bob. This time, it doesn’t have to even be encrypted. As part of the request she sends a nonce.

Trent responds with a message that contains:

  • Alice’s ID
  • Bob’s ID
  • the nonce
  • the session key
  • a ticket: a message encrypted for Bob containing Alice’s ID and the same session key

This entire message is encrypted with Alice’s secret key. Alice can validate that the message is a response to her message because:

  • It is encrypted for her: nobody but Alice and Trent has Alice’s secret key.
  • It contains the same nonce as in her request, so it is not a replay of some earlier message.

Alice sends the ticket (the message encrypted with Bob’s key) to Bob. He can decrypt it and knows:

  • The message must have been generated by Trent since only Trent and Bob know Bob’s key.
  • That he will be communicating with Alice because Trent placed Alice’s ID in that ticket.
  • The session key since Trent placed that in the ticket too.

Bob can now communicate with Alice but he will first authenticate Alice to be sure that he’s really communicating with her. He’ll believe it’s Alice if she can prove that she has the session key. To do this, Trent creates another nonce, encrypts it with the session key, and sends it to Alice. Alice decrypts the message, subtracts one from the nonce, encrypts the result, and sends it back to Trent. She just demonstrated that she could decrypt a message using the session key and return back a known modification of the message. Needham-Schroeder is a combined authentication and key exchange protocol.

Denning-Sacco modification: timestamps to avoid key replay

One flaw in the Needham-Schroeder algorithm is when Alice sends the ticket to Bob. The ticket is encrypted with Bob’s secret key and contains Alice’s ID as well as the session key. If an attacker grabbed a communication session and managed to decrypt the session key, she can replay the transmission of the ticket to Bob. Bob won’t know that he received that same session key in the past. He will proceed to validate “Alice” by asking her to prove that she indeed knows the session key. In this case, Eve, our eavesdropper, does know it; that’s why she sent the ticket to Bob. Bob completes the authentication and thinks he is talking with Alice when in reality he is talking to Eve.

A fix for this is to add a timestamp to the ticket. When Trent creates the ticket that Alice will give to Bob, it is a message encrypted for Bob and contains Alice’s ID, the session key, and a timestamp.

When Bob receives a ticket, he checks the timestamp. If it is older than some recent time (e.g., a few seconds), Bob will simply discard the ticket, assuming that he is getting a replay attack.

Otway-Rees protocol: session IDs instead of timestamps

A problem with timestamps is that their use relies on all entities having synchronized clocks. If Bob’s clock is significantly off from Trent’s, he may falsely accept or falsely reject a ticket that Alice presents to him. Time synchronization becomes an attack vector for this protocol. If an attacker can change Bob’s concept of time, she may be able to convince Bob to accept an older ticket. To do this, she can create fake NTP (network time protocol) responses to force Bob’s clock to synchronize to a different value or, if Bob is paranoid and uses a GPS receiver to synchronize time, create fake GPS signals.

A way to avoid the replay of the ticket without using timestamps is to use a session ID with each message. The Otway-Rees protocol differs a bit from Needham-Schroeder but is conceptually very similar.

  1. Alice sends a message to Bob that contains a session ID, both of their IDs, and a message encrypted with Alice’s secret key. This message contains Alice and Bob’s IDs as well as the session ID.

  2. Bob sends Trent a request to communicate with Alice, containing Alice’s message as well as a message encrypted with his secret key that also contains the session ID.

  3. Trent now knows that Alice wants to talk to Bob since the session ID is inside her encrypted message and that Bob agrees to talk to Alice since that same session ID is inside his encrypted message.

  4. Trent creates a random session key encrypted for Bob and the same key encrypted for Alice and sends both of those to Bob, along with the session key.

The protocol also incorporates nonces to ensure that there is no replay attack on Trent’s response even if an attacker sends a message to Bob with a new session ID and old encrypted session keys (that were cracked by the attacker).


Kerberos is a trusted third party authentication, authorization, and key exchange protocol using symmetric cryptography and based closely on the Needham-Schroeder protocol with the Denning Sacco modification (the use of timestamps).

When Alice wands to talk with Bob (they can be users and services), she first needs to ask Kerberos. If access is authorized, Kerberos will send her two messages. One is encrypted with Alice’s secret key and contains the session key for her communication with Bob. The other message is encrypted with Bob’s secret key. Alice cannot read or decode this second message. It a ticket, also known as a sealed envelope. It contains the same session key that Alice received but is encrypted for Bob. Alice will send that to Bob. When Bob decrypts it, he knows that the message must have been generated by an entity that knows its secret key: Kerberos. Now that Alice and Bob both have the session key, they can communicate securely by encrypting all traffic with that session key.

To avoid replay attacks, Kerberos places a timestamp in Alice’s response and in the ticket. For Alice to authenticate herself to Bob, she needs to prove that she was able to extract the session key from the encrypted message Kerberos sent her. She proves this by generating a new timestamp, encrypting it with the session key, and sending it to Bob. Bob now needs to prove to Alice that he can decode messages encrypted with the session key. He takes Alice’s timestamp, adds one (just to permute the value), and sends it back to Alice, encrypted with their session key.

Since your secret key is needed to decrypt every service request you make of Kerberos, you’ll end up typing your password each time you want to access a service. Storing the key in a file to cache it is not a good idea. Kerberos handles this by splitting itself into two components that run the same protocol: the authentication server (AS) and the ticket granting server (TGS). The authentication server handles the initial user request and provides a session key to access the TGS. This session key can be cached for the user’s login session and allows the user to send requests to the TGS without re-entering a password. The TGS is the part of Kerberos that handles requests for services. It also returns two messages to the user: a different session key for the desired service and a ticket that must be provided to that service.

Diffie-Hellman key exchange

The Diffie-Hellman key exchange algorithm allows two parties to establish a common key without disclosing any information that would allow any other party to compute the same key. Each party generates a private key and a public key. Despite their name, these are not encryption keys; they are just numbers. Diffie-Hellman does not implement public key cryptography. Alice can compute a common key using her private key and Bob’s public key. Bob can compute the same common key by using his private key and Alice’s public key.

Diffie-Hellman uses the one-way function abmod c. Its one-wayness is due to our inability to compute the inverse: a discrete logarithm. Anyone may see Alice and Bob’s public keys but will be unable to compute their common key. Although Diffie-Hellman is not a public key encryption algorithm, it behaves like one in the sense that it allows us to exchange keys without having to use a trusted third party.

Key exchange using public key cryptography

With public key cryptography, there generally isn’t a need for key exchange. As long as both sides can get each other’s public keys from a trusted source, they can encrypt messages using those keys. However, we rarely use public key cryptography for large messages. It can, however, be used to transmit a session key. This use of public key cryptography to transmit a session key that will be used to apply symmetric cryptography to messages is called hybrid cryptography. For Alice to send a key to Bob:

  1. Alice generates a random session key.
  2. She encrypts it with Bob’s public key & sends it to Bob.
  3. Bob decrypts the message using his private key and now has the session key.

Bob is the only one who has Bob’s private key to be able to decrypt that message and extract the session key. A problem with this is that anybody can do this. Charles can generate a random session key, encrypt it with Bob’s public key, and send it to Bob. For Bob to be convinced that it came from Alice, she can encrypt it with her private key (this is signing the message).

  1. Alice generates a random session key.
  2. She signs it by encrypting the key with her private key.
  3. She encrypts the result with Bob’s public key & sends it to Bob.
  4. Bob decrypts the message using his private key.
  5. Bob decrypts the resulting message with Alice’s public key and gets the session key.

If anybody other than Alice created the message, the result that Bob gets by decrypting it with Alice’s public key will not result in a valid key for anyone. We can enhance the protocol by using a standalone signature (encrypted hash) so Bob can identify a valid key from a bogus one.

Forward secrecy

If an attacker steals, for example, Bob’s private key, he will be able to go through old messages and decrypt old session keys (the start of every message to Bob contained a session key encrypted with his public key). Forward secrecy, also called perfect forward secrecy, is the use of keys and key exchange protocols where the compromise of a key does not compromise past session keys. There is no secret that one can steal that will allow the attacker to decrypt multiple past messages.

Diffie-Hellman enables forward secrecy. Alice and Bob can each generate a key pair and send their public key to each other. They can then compute a common key that nobody else will know and use that to communicate. Achieving forward secrecy requires single-use (ephemeral) keys. Next time Alice and Bob want to communicate, they will generate a new set of keys and compute a new common key. At no time do we rely on long-term keys, such as Alice’s secret key or RSA private key. Encrypting a session key with a long-term key, such as Bob’s public key, will not achieve forward secrecy. If an attacker ever finds Bob’s private key, she will be able to extract the session key.

Difie-Hellman is good for for achieving forward secrecy because it is efficient to create new new key pairs on the fly. RSA or ECC keys can be used as well but key generation is far less efficient. Because of this, RSA keys tend to be used only for long-term keys (e.g., for authentication).

Message Integrity

One-way functions

A one-way function is one that can be computed relatively easily in one direction but there is no known way of computing the inverse function. One-way functions are crucial in a number of cryptographic algorithms, including digital signatures, Diffie-Hellman key exchange, and both RSA and elliptic curve public key cryptography. For Diffie-Hellman and public key cryptography, they ensure that someone cannot generate the corresponding private key when presented with a public key.

Hash functions

A particularly useful form of a one-way function is the cryptographic hash function. This is a one-way function whose output is always a fixed number of bits for any input. Hash functions are commonly used in programming to construct hash tables, which provide O(1) lookups of keys.

Cryptographic hash functions produce far longer results than those used for hash tables. Common lengths are 224, 256, 384, or 512 bits. Good cryptographic hash functions (e.g., SHA–1, SHA–2, SHA–3) have several properties:

  1. Like all hash functions, take arbitrary-length input and produce fixed-length output

  2. Also like all hash functions, they are deterministic; they produce the same result each time when given identical input.

  3. They exhibit pre-image resistance, or hiding. Given a hash H, it should not be feasible to find a message M where H=hash(M).

  4. The output of a hash function should not give any information about any of the input. For example, changing a byte in the message should not cause any predictable change in the hash value.

  5. They are collision resistant. While hash collisions can exist (the number of possible hashes is smaller than than number of possible messages; see the pigeonhole principle), it is not feasible to find any two different messages that hash to the same value. Similarly, it is not feasible to modify the plaintext without changing its resultant hash.

  6. They should be relatively efficient to compute. We would like to use hash functions as message integrity checks and generate them for each message without incurring signifficant overhead.

The cryptographic hash function is the basis for message authentication codes and digital signatures.

Because of these properties, we have high assurance that a message would no longer hash to the same value if it is modified in any way. The holy grail for an attacker is to be able to construct a message that hashes to the same value as another message. That would allow the attacker to substitute a new message for some original one (for example, redirecting a money transfer). Searching for a collision with a pre-image (known message) is much harder than searching for any two messages that produce the same hash. The birthday paradox tells us that the search for a collision of any two messages is approximately the square root of the complexity of searching for a collision on a specific message. This means that the strength of a hash function for a brute-force collision attack is approximately half the number of bits of the hash. A 256-bit hash function has a strength of approximately 128 bits.

Popular hash functions include SHA–1 (160 bits), SHA–2 (commonly 256 and 512 bits), and SHA–3 (256 and 512 bits).

Message Integrity and Hash Pointers

A cryptographic hash serves as a checksum for a message. If a message has been modified, it will yield a different hash. By associating a hash with a message, we have a basis for managing the integrity of that message: being able to detect if the message gets changed.

Tamper-resistant linked-lists: blockchains

One way of associating a hash with a message is via the use of hash pointers. Pointers are used in data structures to allow one data element to refer to another. In processes, a pointer is a memory location. In distributed systems, a pointer may be an IP address and object identifier. A hash pointer is a tuple that contains a traditional pointer along with the hash of the data element that is being pointed to. It allows us ot validate that the information being pointed to has not been modified.

The same structures that use pointers can be modified to use hash pointers and create tamper-evident structures. For example, a linked list can be constructed with each element containing a hash pointer to the next element instead of a pointer.


Adding a new block is easy. You allocate the block, copy the head hash pointer into it (the next pointer), and update the head hash pointer to point to the new block and contain a hash of that block.

If an adversary modifies, say, data block 1, we can detect that. The hash pointer in Data–2 will point to Data–1 but the hash of Data–1 will no longer match the hash in the pointer. For a successful attack, the adversary will also need to modify the hash value in the hash pointer in block 2. That will make the hash pointer in block 3 invalid, so that will need to be changed. The adversary will need to change all the hash pointers leading up to the head of the list. If we’re holding on to the head of the list (e.g., in a variable) so that the adversary cannot modify it, then we will always be able to detect tampering. A linked list using hash pointers is called a blockchain.

Merkle Trees

Another useful structure using hash pointers in place of conventional pointers is a binary tree, called a Merkle tree when implemented with hash pointers.

Merkle Tree
Merkle Tree

Leaf nodes of a Merkle tree contain conventional hash pointers: pointers to the data blocks and the hashes of those data blocks. Non-leaf child nodes contain left and right pointers along with the hash of the two hashes they point to. As with binary trees, Merkle trees give us the advantage of being able to locate data in O(log n) time instead of linear time. Similarly, we can validate any data in O(log n) time by traversing from the root down to the last hash pointer at the leaf.

Applications of blockchains and Merkle trees

With Merkle trees, the information in a leaf does not generally contain information to help you traverse the tree to search for data; we’re not building a search tree. The entire purpose of the tree is to make it efficient to manage and validate the integrity of the underlying data. That is, the hash pointer strucute is there just to allow you to validate the underlying data rather than search for stuff. If you want to search, you can add extra information to each node – in general, we’re not concerned with secrecy and you can build whatever search structures an application needs. Hash pointers are all about helping assess the integrity of data. Structures such as hash-pointer-based linked lists and Merkle trees were designed with peer-to-peer systems in mind where data can come from various untrusted peers. You just need to get the root hash from a trusted place.

The top-level pointer (the root in the case of a tree; the head in the case of linked lists) represents the integrity of the entire set of data. If any data block changes, that top level pointer will allow the user to detect that there has been a change. Therefore, it is important that this value be stored securely and obtained via a trustworthy mechanism.

What a Merkle tree allows you to do is check the integrity of replicated data on a branch-by-branch basis in an efficient manner. Merkle trees are designed for environments where data is replicated among multiple systems and you want each system to be able to validate the integrity of the entire file. This helps in two cases:

  1. You can validate downloaded data without having to wait for the entire set of data to be downloaded.

  2. You can efficiently compare your data with that on another system.

Suppose you have a file and want to check whether any blocks in your version are corrupted with respect to a version on another server. Both you and another system assembled your own structure of hash pointers.

With a linked list of hash pointers, you’d start at the head of the list and compare hashes. If the hashes match, you are confident that your files match. If you have a mismatch, you need to compare the next hash. If it matches what you have then you know that first block has been modified. If it doesn’t then you need to get the hash after that. Ultimately, you may need to traverse the entire list linearly.

With Merkle trees, it becomes easier to find the block (or blocks) that have changed. If the root hash matches, you know that your entire data set matches. If not, you request the left & right hashes and compare those with your tree. If one doesn’t match then you can compare the hashes under that subtree, iterating down the tree until you find the mismatched data block. You do not need to iterate through the entire list of blocks. This is attractive for replicated data sets where we have tens of millions of data blocks, for example, and sending a hash list is not efficient. It is essentially a tree search to find the block that is inconsistent.

Message Authentication Codes (MACs)

A cryptographic hash helps us ensure message integrity: it serves as a checksum that allows us to determine if a message has been modified. If the message is modified, it no longer hashes to the same value as before. However, if an attacker modifies a message, she may be able to modify the hash value as well. To prevent this, we need a hash that relies on a key for validation. This is a message authentication code, or MAC. Two forms of MACs are hash-based ones and block cipher based ones:

Hash-based MAC (HMAC):
A hash-based MAC uses a cryptographic hash function to hash the message and the key. Anyone who does not know the key will not be able to recreate the hash.
Block cipher-based MAC (CBC-MAC):
Recall that cipher block chaining assures us that every encrypted block is a function of all previous blocks. CBC-MAC uses a zero initialization vector and runs through a cipher block chained encryption, discarding all output blocks except for the last one, which becomes the MAC. Any changes to the message will be propagated to that final block and the same encryption cannot be performed by someone without the key.

Digital signatures

Message authentication codes rely on a shared key. Anybody who posesses the key can modify and re-sign a message. There is no assurance that the action was done by the author of the message. Digital signatures have stronger properties than MACs:

  1. Only you can sign a message but anybody should be able to validate it.
  2. You cannot copy the signature from one message and have it be valid on another message.
  3. An adversary cannot forge a signature, even after inspecting an arbitrary number of signed messages.

Digital signatures require three operations:

  1. Key generation: {private_key, verification_key } := gen_keys(keysize)
  2. Signing: signature := sign(message, private_key)
  3. Validation: isvalid := verify(message, signature, verification_key)

Since we trust hashes to be collision-free, it makes sense to apply the signature to the hash of a message instead of the message itself. This ensures that the signature will be a small, fixed size and makes it easy to embed in hash pointers and other structures.

There are several commonly-used digital signature algorithms:

DSA, the Digital Signature Algorithm
The current NIST standard that generates key pairs that are secure because of the difficulty of computing discrete logarithms.
ECDSA, Elliptic Curve Digital Signature Algorithm
A variant of DSA that uses elliptic curve cryptography
Public key cryptographic algorithms
RSA or Elliptic Curve Cryptography applied to message hashes.

All these algorithms generate public and private key pairs. The first two are not general-purpose encryption algorithms but are designed solely for digital signatures.

We saw how public key cryptography can be used to encrypt messages: Alice encrypts a message using Bob’s public key to ensure that only Bob could decrypt it with his private key. We can use public key backwards: Alice can encrypt a message using her private key. Anyone can decrypt the message using her public key but, in doing so, would know that the message was encrypted by Alice.

A digital signature can be constructed by simply encrypting the hash of a message with the creator’s (signer’s) private key. Anyone who has the message signer’s public key can decrypt the hash and thus validate the hash against the message. Other parties cannot recreate the signature.

Note that, with a MAC, the recipient or anyone in possession of the shared key can create the same MAC. With a digital signature, the signature can only be created by the owner of the private key. Unlike MACs, digital signatures provide non-repudiation – proof of identity. Alice cannot claim that she did not create a signature because nobody but Alice has her private key. Also unlike MACs, anyone can validate a signature since public keys are generally freely distributed. as with MACs, digital signatures also provide proof of integrity, assurance that the original message has not been modified.

Covert and authenticated messaging

We ignored the encryption of a message in the preceding discussion; our interest was assuring integrity. However, there are times when we may want to keep the message secret and validate that it has not been modified. Doing this involves sneding a signature of the message along with the encrypted message.

A basic way for Alice to send a signed and encrypted message to Bob is for her to use hybrid cryptography and:

1. Create a signature of the message. This is a hash of the message encrypted with her private key.
2. Create a session key for encrypting the message. This is a throw-away key that will not be needed beyond the communication session.
3. Encrypt the message using the session key. She will use a fast symmetric algorithm to encrypt this message.
4. Package up the session key for Bob: she encrypts it with Bob's public key. Since only Bob has the corresponding private key, only Bob will be able to decrypt the session key.
5. She sends Bob: the encrypted message, encrypted session key, and signature.

Anonymous identities

A signature verification key (e.g., a public key) can be treated as an identity. You posess the corresponding private key and therefore only you can create valid signatures that can be verified with the public key. This identity is anonymous; it is just a bunch of bits. There is nothing that identifies you as the holder of the key. You can simply assert your identity by being the sole person who can generate valid signatures.

Since you can generate an arbitrary number of key pairs, you can create a new identity at any time and create as many different identities as you want. When you no longer need an identity, you can discard your private key for that corresponding public key.

Identity binding: digital certificates

While public keys provide a mechanism for asserting integrity via digital signatures, they are themselves anonymous. We’ve discussed a scenario where Alice uses Bob’s public key but never explained how she can assert that the key really belongs to Bob and was not planted by an adversary. Some form of identity binding of the public key must be implemented for you to know that you really have my public key instead of someone else’s. How does Alice really know that she has Bob’s public key?

X.509 digital certificates provide a way to do this. A certificate is a data structure that contains user information (called a distinguished name) and the user’s public key. This data structure also contains a signature of the certification authority. The signature is created by taking a hash of the rest of the data in the structure and encrypting it with the private key of the certification authority. The certification authority (CA) is responsible for setting policies of how they validate the identity of the person who presents the public key for encapsulation in a certificate.

To validate a certificate, you would hash all the certificate data except for the signature. Then you would decrypt the signature using the public key of the issuer. If the two values match, then you know that the certificate data has not been modified since it has been signed. The challenge is how to get the public key of the issuer. Public keys are stored in certificates, so the issuer would have a certificate containing its public key. This certificate can be signed by yet another issuer. This kind of process is called certificate chaining. For example, Alice can have a certificate issued by the Rutgers CS Department. The Rutgers CS Department’s certificate may be issued by Rutgers University. Rutgers University’s certificate could be issued by the State of New Jersey Certification Authority, and so on. At the very top level, we will have a certificate that is not signed by any higher-level certification authority. A certification authority that is not underneath any other CA is called a root CA. In practice, this type of chaining is rarely used. More commonly, there are hundreds of autonomous certification authorities acting as root CAs that issue certificates to companies, users, and services. The certificates for many of the trusted root CAs are preloaded into operating systems or, in some cases, browsers. See here for Microsoft’s trusted root certificate participants and here for Apple’s trusted root certificates.

Every certificate has an expiration time (often a year or more in the future). This provides some assurance that even if there is a concerted attack to find a corresponding private key to the public key in the certificate, such a key will not be found until long after the certificate expires. There might be cases where a private key might be leaked or the owner may no longer be trustworthy (for example, an employee leaves a company). In this case, a certificate can be revoked. Each CA publishes a certificate revocation list, or CRL, containing lists of certificates that they have previously issued that should no longer be considered valid. To prevent spoofing the CRL, the list is, of course, signed by the CA. Each certificate contains information on where to obtain revocation information.

The challenge with CRLs is the TOCTTOU problem: not everyone may check the certificate revocation list in a timely manner and some systems may accept a certificate not knowing that it was revoked. Some systems, particularly embedded systems, may not even be configured to handle CRLs.