Schedule of topics and readings

This schedule is tentative, and it is neither sound (that is, if a particular reading or topic is listed here, that doesn’t mean we’ll cover it!) nor complete (that is, if a particular reading or topic is not listed here, that doesn’t mean we won’t cover it!).

The course will alternate between Discussion and Lecture days. For days marked Discussion, you must turn in your reading response for that day’s reading assignment on Canvas by 11:59:59pm Pacific time on the day prior to the day of class. For example, for the paper to be discussed on Wednesday, October 4, you must turn in your reading response on Canvas by 11:59:59pm Pacific time on Tuesday, October 3.

Date Topic Notes
Friday, 9/29 Course overview  
Monday, 10/2 Lecture: distributed systems: what and why?; time and clocks; Lamport diagrams Start-of-course survey due by start of class today
Wednesday, 10/4 Discussion: Jim Waldo et al., “A Note on Distributed Computing” (1994) Reading response due 11:59:59pm PT on 10/3; scribes’ wiki write-up due 11:59:59pm PT on 10/11
Friday, 10/6 Lecture: causality and happens-before; network models; partial orders; total orders; Lamport clocks  
Monday, 10/9 Discussion: Leslie Lamport, “Time, Clocks, and the Ordering of Events in a Distributed System” (CACM 1978) (skip the section “Physical Clocks”) Reading response due 11:59:59pm PT on 10/8; scribes’ wiki write-up due 11:59:59pm PT on 10/16
Wednesday, 10/11 Lecture: vector clocks; properties of executions; delivery vs. receiving; FIFO delivery; causal delivery  
Friday, 10/13 Discussion: Reinhard Schwarz and Friedemann Mattern, “Detecting Causal Relationships in Distributed Computations: In Search of the Holy Grail” (Distributed Computing, 1994) (sections 1-3 only) Reading response due 11:59:59pm PT on 10/12; scribes’ wiki write-up due 11:59:59pm PT on 10/20
Monday, 10/16 Lecture: totally-ordered delivery; implementing FIFO delivery; implementing causal broadcast  
Wednesday, 10/18 Discussion: Kenneth Birman, André Schiper, and Pat Stephenson, “Lightweight Causal and Atomic Group Multicast” (TOCS, 1991) (sections 1-5 only) Reading response due 11:59:59pm PT on 10/17; scribes’ wiki write-up due 11:59:59pm PT on 10/25
Friday, 10/20 Lecture: uses of causality in distributed systems; consistent snapshots; Chandy-Lamport snapshot algorithm  
Monday, 10/23 Discussion: K. Mani Chandy and Leslie Lamport, “Distributed Snapshots: Determining Global States of Distributed Systems” (TOCS, 1985) Reading response due 11:59:59pm PT on 10/22; scribes’ wiki write-up due 11:59:59pm PT on 10/30
Wednesday, 10/25 Lecture: Chandy-Lamport wrap-up: limitations, assumptions, properties; uses of snapshots; centralized vs. decentralized algorithms; recap of delivery guarantees and protocols; safety and liveness; reliable delivery  
Friday, 10/27 Discussion: Seth Gilbert and Nancy A. Lynch, “Perspectives on the CAP Theorem” (IEEE Computer, 2012) Reading response due 11:59:59pm PT on 10/26; scribes’ wiki write-up due 11:59:59pm PT on 11/3
Monday, 10/30 Lecture: fault classification and fault models; two generals problem  
Wednesday, 11/1 Discussion: Joseph Y. Halpern and Yoram Moses, “Knowledge and Common Knowledge in a Distributed Environment” (JACM, 1990) (sections 1-4 only) Reading response due 11:59:59pm PT on 10/31; scribes’ wiki write-up due 11:59:59pm PT on 11/8
Friday, 11/3 Lecture: implementing reliable delivery; idempotence; at-least-once/at-most-once/exactly-once delivery; unicast/broadcast/multicast; reliable broadcast; implementing reliable broadcast; preview of replication  
Monday, 11/6 Discussion: Felix C. Gärtner, “Fundamentals of Fault-Tolerant Distributed Computing in Asynchronous Environments” (ACM Computing Surveys, 1999) Reading response due 11:59:59pm PT on 11/5; scribes’ wiki write-up due 11:59:59pm PT on 11/13
Wednesday, 11/8 Lecture: replication; total order vs. determinism; consistency models: FIFO, causal, “strong”; primary-backup replication; chain replication; latency and throughput  
Friday, 11/10 No class (Veterans Day)  
Monday, 11/13 Discussion: Robbert van Renesse and Fred B. Schneider, “Chain Replication for Supporting High Throughput and Availability” (OSDI 2004) Reading response due 11:59:59pm PT on 11/12; scribes’ wiki write-up due 11:59:59pm PT on 11/20
Wednesday, 11/15 Lecture: handling node failure in replication protocols; introduction to consensus; problems equivalent to consensus; the FLP result; introduction to Paxos  
Friday, 11/17 Discussion: Michael J. Fischer, Nancy A. Lynch, and Michael S. Paterson, “Impossibility of Distributed Consensus with One Faulty Process” (JACM 1985) Reading response due 11:59:59pm PT on 11/16; scribes’ wiki write-up due 11:59:59pm PT on 11/24
Monday, 11/20 Lecture: Paxos: the interesting parts; Multi-Paxos  
Wednesday, 11/22 Lecture (optional): MapReduce  
Friday, 11/24 No class (Thanksgiving break)  
Monday, 11/27 Discussion: Leslie Lamport, “Paxos Made Simple” (2001) Reading response due 11:59:59pm PT on 11/26; scribes’ wiki write-up due 11:59:59pm PT on 12/4
Wednesday, 11/29 Lecture: fault tolerance of Paxos; brief mention of other consensus protocols: Viewstamped Replication, Zab, Raft; passive vs. active (state machine) replication  
Friday, 12/1 Discussion: Tushar Chandra, Robert Griesemer, and Joshua Redstone, “Paxos Made Live: An Engineering Perspective” (invited talk, PODC 2007) Reading response due 11:59:59pm PT on 11/30; scribes’ wiki write-up due 11:59:59pm PT on 12/8
Monday, 12/4 Lecture: eventual consistency; strong convergence and strong eventual consistency; intro to application-specific conflict resolution; network partitions; availability; the consistency/availability trade-off; anti-entropy with Merkle trees; gossip; quorum consistency  
Wednesday, 12/6 Discussion: Giuseppe DeCandia et al., “Dynamo: Amazon’s Highly Available Key-value Store” (SOSP 2007) Reading response due 11:59:59pm PT on 12/5; scribes’ wiki write-up due 11:59:59pm PT on 12/13
Friday, 12/8 End-of-course wrap-up and AMA; overview of Lindsey’s group’s research  
Thursday, 12/14 No final exam. Enjoy your winter break!