Schedule of topics and readings

This schedule is tentative, and it is neither sound (that is, if a particular reading or topic is listed here, that doesn’t mean we’ll cover it!) nor complete (that is, if a particular reading or topic is not listed here, that doesn’t mean we won’t cover it!).

The course will alternate between Discussion and Lecture days. For days marked Discussion, you must turn in your reading response for that day’s reading assignment on Canvas by 5pm Pacific time on the day prior to the day of class. For example, for the paper to be discussed on Wednesday, September 29, you must turn in your reading response on Canvas by 5pm Pacific time on Tuesday, September 28.

Date Topic Notes
Friday, 9/24 Course overview  
Monday, 9/27 Lecture: distributed systems: what and why?; time and clocks; Lamport diagrams Start-of-course survey due by start of class today
Wednesday, 9/29 Discussion: Jim Waldo et al., “A Note on Distributed Computing” (1994) Reading response due 5pm PT on Tuesday, 9/28; scribes’ wiki write-up due 11:59:59pm PT on Wednesday, 10/6
Friday, 10/1 Lecture: causality and happens-before; network models; state and events; partial orders; total orders; Lamport clocks; vector clocks  
Monday, 10/4 Discussion: Leslie Lamport, “Time, Clocks, and the Ordering of Events in a Distributed System” (CACM 1978) (skip the section “Physical Clocks”) Reading response due 5pm PT on Sunday, 10/3; scribes’ wiki write-up due 11:59:59pm PT on Monday, 10/11
Wednesday, 10/6 Lecture: delivery vs. receiving; properties of executions: FIFO delivery, causal delivery, totally-ordered delivery; implementing FIFO delivery; implementing causal broadcast  
Friday, 10/8 Discussion: Reinhard Schwarz and Friedemann Mattern, “Detecting Causal Relationships in Distributed Computations: In Search of the Holy Grail” (Distributed Computing, 1994) (sections 1-3 only) Reading response due 5pm PT on Thursday, 10/7; scribes’ wiki write-up due 11:59:59pm PT on Friday, 10/15
Monday, 10/11 Lecture: causal histories, cuts and consistent cuts; recap of causal broadcast; safety and liveness  
Wednesday, 10/13 Discussion: Kenneth Birman, André Schiper, and Pat Stephenson, “Lightweight Causal and Atomic Group Multicast” (TOCS, 1991) (sections 1-5 only) Reading response due 5pm PT on Tuesday, 10/12; scribes’ wiki write-up due 11:59:59pm PT on Wednesday, 10/20
Friday, 10/15 Lecture: introduction to distributed snapshots; Chandy-Lamport snapshot algorithm  
Monday, 10/18 Discussion: K. Mani Chandy and Leslie Lamport, “Distributed Snapshots: Determining Global States of Distributed Systems” (TOCS, 1985) Reading response due 5pm PT on Sunday, 10/17; scribes’ wiki write-up due 11:59:59pm PT on Monday, 10/25
Wednesday, 10/20 Lecture: Chandy-Lamport wrap-up: limitations, assumptions, properties; uses of snapshots; centralized vs. decentralized algorithms; more on safety and liveness  
Friday, 10/22 Discussion: Bowen Alpern and Fred B. Schneider, “Defining Liveness” (Information Processing Letters, 1985) Reading response due 5pm PT on Thursday, 10/21; scribes’ wiki write-up due 11:59:59pm PT on Friday, 10/29
Monday, 10/25 Lecture: fault classification and fault models; two generals problem; reliable delivery; idempotence; so-called “exactly-once” delivery  
Wednesday, 10/27 Discussion: Joseph Y. Halpern and Yoram Moses, “Knowledge and Common Knowledge in a Distributed Environment” (JACM, 1990) (sections 1-4 only) Reading response due 5pm PT on Tuesday, 10/26; scribes’ wiki write-up due 11:59:59pm PT on Wednesday, 11/3
Friday, 10/29 Lecture: forms of fault tolerance; reliable broadcast; implementing reliable broadcast; redundancy and fault tolerance; preview of replication  
Monday, 11/1 Discussion: Felix C. Gärtner, “Fundamentals of Fault-Tolerant Distributed Computing in Asynchronous Environments” (ACM Computing Surveys, 1999) Reading response due 5pm PT on Sunday, 10/31; scribes’ wiki write-up due 11:59:59pm PT on Monday, 11/8
Wednesday, 11/3 Lecture: total order vs. determinism; consistency models (informally); primary-backup replication; chain replication; latency and throughput  
Friday, 11/5 Discussion: Robbert van Renesse and Fred B. Schneider, “Chain Replication for Supporting High Throughput and Availability” (OSDI 2004) Reading response due 5pm PT on Thursday, 11/4; scribes’ wiki write-up due 11:59:59pm PT on Friday, 11/12
Monday, 11/8 Lecture: introduction to consensus; problems equivalent to consensus; the FLP result; introduction to Paxos  
Wednesday, 11/10 Discussion: Michael J. Fischer, Nancy A. Lynch, and Michael S. Paterson, “Impossibility of Distributed Consensus with One Faulty Process” (JACM 1985) Reading response due 5pm PT on Tuesday, 11/9; scribes’ wiki write-up due 11:59:59pm PT on Wednesday, 11/17
Friday, 11/12 Lecture: Paxos: the interesting parts; Multi-Paxos  
Monday, 11/15 Discussion: Leslie Lamport, “Paxos Made Simple” (2001) Reading response due 5pm PT on Sunday, 11/14; scribes’ wiki write-up due 11:59:59pm PT on Monday, 11/22
Wednesday, 11/17 Lecture: fault tolerance of Paxos; brief mention of other consensus protocols: Viewstamped Replication, Zab, Raft; eventual consistency; strong convergence and strong eventual consistency; intro to application-specific conflict resolution  
Friday, 11/19 Discussion: Tushar Chandra, Robert Griesemer, and Joshua Redstone, “Paxos Made Live: An Engineering Perspective” (invited talk, PODC 2007) Reading response due 5pm PT on Thursday, 11/18; scribes’ wiki write-up due 11:59:59pm PT on Friday, 11/26
Monday, 11/22 Lecture: network partitions; availability; the consistency/availability trade-off; anti-entropy with Merkle trees; gossip; quorum consistency  
Wednesday, 11/24 Lecture: sharding; consistent hashing  
Friday, 11/26 No class (Thanksgiving break)  
Monday, 11/29 Discussion: Giuseppe DeCandia et al., “Dynamo: Amazon’s Highly Available Key-value Store” (SOSP 2007) Reading response due 5pm PT on Sunday, 11/28; scribes’ wiki write-up due 11:59:59pm PT on Monday, 12/6
Wednesday, 12/1 Lecture: online systems vs. offline systems, raw data vs. derived data; MapReduce  
Friday, 12/3 Discussion: Jeffrey Dean and Sanjay Ghemawat, “MapReduce: Simplified Data Processing on Large Clusters” (OSDI 2004) Reading response due 5pm PT on Thursday, 12/2; scribes’ wiki write-up due 11:59:59pm PT on Friday, 12/10
Thursday, 12/9 No final exam. Enjoy your winter break!