Introduction to Distributed Systems (CS 417)
This is not a class about learning the hottest new JavaScript framework or building a single-page app. This is about the systems behind the systems: the invisible machinery that keeps Amazon’s site up when servers crash, that lets you watch Netflix without caring where in the world the video is stored, and that makes multiplayer games work without everyone shouting “lag!”
If you’ve ever sat at a party and pretended you knew what “sharding” meant, or nodded along when someone dropped “protocol buffers” into a sentence, this course is your chance to stop faking it. We’re going to talk about the hard stuff: fault tolerance, consensus algorithms, distributed file systems, and how to keep data consistent when your machines can’t even agree on the time.
The class
Distributed systems are the foundation of modern computing. Search engines, social networks, online banking, multiplayer games, and cloud services all rely on multiple computers working together to provide a seamless experience. Behind the scenes, these systems must operate across unreliable networks, handle failures gracefully, and remain secure, scalable, and consistent.
In this course, we will explore how to design, build, and reason about distributed systems. We will cover system architectures, software abstractions, distributed algorithms, and the unique challenges of running software in distributed environments. Topics will range from core principles like networking, fault tolerance, and consensus to practical technologies such as distributed file systems, peer-to-peer networks, distributed databases, and cloud computing frameworks. Security, scalability, and real-world failure models will be recurring themes throughout the semester.
The work will be both theoretical and hands-on. You will implement components of distributed systems, experiment with communication and coordination mechanisms, and analyze system behavior. The syllabus includes topics such as inter-process communication, leader election, consensus algorithms like Raft, consistency models, MapReduce, Apache Spark, event streaming with Kafka, and security protocols like TLS and OAuth.
There is no required textbook; detailed lecture notes will be posted online. You are expected to have solid programming skills in Java and/or Python. Exams are spaced throughout the semester, with the lowest score dropped, and quizzes will help reinforce lecture material.
This class is not about using a specific application framework; it is about understanding the fundamental ideas that make distributed computing possible and applying them to real systems. The goal is that by the end of the semester, you will be able to design robust distributed architectures, anticipate the trade-offs in your design decisions, and understand why distributed systems fail in the ways they do.