“Can you tell us about your distrusted system course?” – a CSE30 student

Welcome! This is an overview of the winter 2024 edition of CSE138 (Distributed Systems), an upper-division undergraduate course in the Computer Science and Engineering Department at the UC Santa Cruz Baskin School of Engineering.

Course staff

Instructor

Lindsey Kuper

Hi, I’m Lindsey Kuper! You should call me by my first name.

  • Email: for anything CSE138-related, DM me on the course Zulip instead. Otherwise: lkuper@ucsc.edu
  • Office hours: Mondays, 4:30-5:30pm, on Zoom (link shared on Zulip)
  • Research areas: programming languages, distributed systems, software verification, concurrency, parallelism (see my academic web page and my blog for more information about my research)
  • Pronouns: she/her/hers


Teaching assistants

These friendly folks lead sections, hold office hours, help write the assignments, and generally help run things around here.

Shun Kashiwa


Yan Tong


Tutors

These friendly folks are here to talk through course concepts and homework with you.

Albert Lee


Joey Ma


Ishaan Singh

  • Email: isingh15@ucsc.edu
  • Tutoring hours: Thursdays, 12:30-2:30pm, McHenry Room 0345


Joost “Will” Vonk


You

  • All ten colleges represented!
  • Several majors represented!

A few essential details about the course

  • Upper-division undergraduate course
  • Significant “programming-in-the-large” project component
  • Class meets Tuesdays and Thursdays, 3:20-4:55pm Pacific time, in Rachel Carson 240
  • Two sections (optional, but encouraged) meet weekly (see above for times and locations)
  • Final exam: 4-7pm Pacific time on Monday, March 18
  • Zulip (for almost everything: announcements, live chat during lecture, Q&A, communicating with course staff, and socializing): https://ucsc-cse138.zulipchat.com (all enrolled and waitlisted students will receive an invitation by email)
  • Canvas (for grades): https://canvas.ucsc.edu/courses/68015
  • Course web page (the page you’re looking at right now): https://decomposition.al/CSE138-2024-01/ (URLs are case-sensitive after the domain name, so be sure to capitalize “CSE”)

What’s this course about?

The field of distributed systems studies the design, implementation, and behavior of systems that involve independent components that communicate by passing messages to one another over a network. In addition to the usual challenges of concurrency, distributed systems may be characterized by unbounded latency between components and independent failure of components, making them challenging to reason about and debug.

Some of the foundational distributed systems concepts we’ll explore in this course are:

  • Time and asynchrony. No two computers can reason about each other’s perception of time. What does it mean to talk about time when we don’t share a clock?
  • Fault tolerance and replication. Given that computers crash and messages get lost, how can we write protocols and algorithms that have adequate redundancy to tolerate failure? Maybe if I think a computer will crash, it’s a good idea to run the same computation on more than one! Maybe if I think messages will be lost, I should send the same message more than once!
  • Consistency and consensus. Is our system storing the right data and providing the right responses? I might have two “replicas” that aren’t actually replicas! If replicas disagree, how do we know which one is right?
  • Parallelism. Why deal with all the pain of distributed systems? Sometimes, if you throw a bunch of computers at a problem, you can do things faster – much faster.

The schedule has more details!

Official course description

From the course catalog:

Covers topics in distributed computing including communication, naming, synchronization, consistency and replication, fault tolerance, and security. Examples drawn from peer-to-peer systems, online gaming, the World Wide Web; other systems also used to illustrate approaches to these topics. Students implement simple distributed systems over the course of the quarter. (Formerly CMPS 128, Distributed Systems: File Sharing, Online Gaming, and More.)

Background you’ll need

CSE138 has CSE130 (Principles of Computer Systems Design) as a prerequisite.

You should be comfortable working in at least one programming language at a reasonably large scale (i.e., programs that span multiple files).

You should be willing and able to write code with a team. This will most likely involve using tools intended for programming-in-the-large, such as version control systems and continuous integration tools. It’s up to you to know (or learn) how to use these tools effectively, in a way that works for your team.

You should be willing and able to jump in, read documentation, and try things out to become familiar with tools (such as, say, Docker) that you or your teammates may not already be familiar with.

Regardless of your background, the course staff wants you to succeed and is here to help.

Course project

This course has a significant project component. For the course project, you will work together with a team to implement a distributed, sharded key-value store. You will do the project in your team’s programming language and tools of choice, running on the Docker containerization platform. The project will be done in a series of cumulative assignments, with each assignment building on the previous one.

You will work together with a team on the project. (The exception is the first assignment, which must be completed individually.) Teams must have a minimum of two and a maximum of three students; we recommend having three. This arrangement is meant to be “fault-tolerant”: if someone on your team drops the class, the remaining people should still be able to handle the workload. (If everyone else on your team drops the class, talk to the course staff ASAP.)

Assignment descriptions will become available over the course of the quarter. If you want to get a head start on planning when to work on what, the deadlines for all the assignments are already on the schedule of topics. Here’s a rough sketch of what the assignments will involve (subject to change):

  • Assignment 1: You (individually) will learn to use Docker. (This is the only assignment that will be done individually.)
  • Assignment 2: You (with your team) will build a RESTful key-value store that will go in a Docker container.
  • Assignment 3: You (with your team) will make your key-value store fault-tolerant, by replicating it across multiple nodes in communication with each other, each with its own container.
  • Assignment 4: You (with your team) will make your key-value store scalable, by sharding the data across multiple partitions.
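
To give a flavor of what “sharding” in Assignment 4 could mean, here is a minimal sketch in Python. It is not the assignment’s required design: it assumes a fixed number of shards and assigns each key to a shard by hashing it. The names NUM_SHARDS, shard_for, and partition are hypothetical, and the actual assignment spec may call for a different scheme.

```python
import hashlib

NUM_SHARDS = 3  # hypothetical value; the real shard count would come from the assignment spec


def shard_for(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a key to a shard index in [0, num_shards) using a stable hash."""
    # Python's built-in hash() is randomized per process for strings, so a
    # digest is used here to get the same answer on every node.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def partition(keys, num_shards: int = NUM_SHARDS):
    """Group keys by the shard that would store them."""
    shards = {i: [] for i in range(num_shards)}
    for key in keys:
        shards[shard_for(key, num_shards)].append(key)
    return shards
```

Every node that runs shard_for on the same key gets the same shard index, so any node can work out where a key lives. One known drawback of plain modulo hashing is that changing the number of shards moves most keys to a different shard.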

It’s going to be a challenge! Remember: We’re here to help.

Optional creative project

This course will have an optional creative project component. The creative project, which must be done independently (i.e., not with a team), gives you the opportunity to share your knowledge of the course material with the world in a fun way. I’ll provide more information about the creative project later in the quarter.

The optional creative project is not extra credit; rather, it is “alternative credit”, taking the place of another part of your grade.

Sections

Section attendance is optional, but the sections are the place to go if you want to review lecture material or discuss the homework assignments. Sections are run by the TAs.

You should feel free to attend any section, not just one you happen to be enrolled in.

Reading assignments

This course doesn’t have a textbook. We will, however, have a couple of assigned readings that will supplement the lecture material from class.

The best material to study from will be the notes you take in class (so it’s essential that you take good notes), and the conversations you have with fellow students and course staff (e.g., on Zulip).

Grading

We aim for grade transparency – your final grade in this course should be no surprise.

You can earn 500 points over the course of the quarter.

Your grade has four components:

  • Surveys and lecture quizzes: 10% (50 points)
  • Midterm: 20% (100 points)
  • Final: 24% (120 points)
  • Course project: 46% (230 points: 20 points for assignment 1, 70 points each for assignments 2-4)

However, students who choose to do the optional creative project can opt to have it account for 10% of their grade in the course, replacing part of the exam grade, in which case the grade percentage allocation will look like this:

  • Creative project: 10% (50 points)
  • Surveys and lecture quizzes: 10% (50 points)
  • Midterm: 15% (75 points)
  • Final: 19% (95 points)
  • Course project: 46% (230 points: 20 points for assignment 1, 70 points each for assignments 2-4)

Notice that with the creative project, the course project and the surveys and lecture quizzes are worth the same amount as they were before, so the creative project is not a way to get out of having to work hard on the course project or participate in class. On the other hand, if taking exams is not your strong point, a killer creative project could help make up for not-so-good exam grades. However, you do still have to take both of the exams in their entirety; they’re just worth a smaller percentage of the grade.

Furthermore, the creative project is a lot of work to do well, and it is entirely possible to ignore the creative project and still earn an A+ in the class. So you should truly only do it if it sounds like fun to you.
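
To compare the two allocations concretely, here is a small Python sketch that totals the points a student would earn under each one. The point values come from the two lists above; the function name total_points and the example student are hypothetical.

```python
# Maximum points per component under each allocation, taken from the lists above.
STANDARD = {"quizzes": 50, "midterm": 100, "final": 120, "project": 230}
WITH_CREATIVE = {"creative": 50, "quizzes": 50, "midterm": 75, "final": 95, "project": 230}


def total_points(fractions, scheme):
    """Sum points earned, given the fraction (0.0 to 1.0) earned on each component."""
    return sum(max_points * fractions.get(name, 0.0) for name, max_points in scheme.items())


# Hypothetical student: strong project, weaker exams, excellent creative project.
student = {"quizzes": 1.0, "midterm": 0.7, "final": 0.75, "project": 0.95, "creative": 1.0}

print(total_points(student, STANDARD))
print(total_points(student, WITH_CREATIVE))
```

For this made-up student, the totals come to about 428.5 points without the creative project and about 442.25 points with it. Both allocations add up to 500 points, and the course project and the surveys and lecture quizzes carry the same weight in each.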

Grades are assigned as follows. It’s pretty standard, except that we don’t give C- grades:

  • A+: 485-500 points
  • A: 465-484 points
  • A-: 450-464 points
  • B+: 435-449 points
  • B: 415-434 points
  • B-: 400-414 points
  • C+: 385-399 points
  • C: 350-384 points
  • D+: 335-349 points
  • D: 315-334 points
  • D-: 300-314 points
  • F: <=299 points
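
The scale above can be read as a simple lookup, sketched here in Python. The name letter_grade is hypothetical. The scale lists whole-number ranges only, so treating each range’s lower bound as a threshold for fractional totals is an assumption of this sketch, not a stated course policy.

```python
# (minimum points, letter) pairs from the scale above, highest first.
# There is no C- entry, matching the scale.
GRADE_FLOORS = [
    (485, "A+"), (465, "A"), (450, "A-"),
    (435, "B+"), (415, "B"), (400, "B-"),
    (385, "C+"), (350, "C"),
    (335, "D+"), (315, "D"), (300, "D-"),
]


def letter_grade(points):
    """Return the letter for a point total out of 500."""
    # Assumption: a total at or above a range's lower bound earns that letter.
    for floor, letter in GRADE_FLOORS:
        if points >= floor:
            return letter
    return "F"
```

For example, letter_grade(350) returns "C" and letter_grade(349) returns "D+", which reflects the missing C- band.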

We will grade surveys and lecture quizzes based only on your participation, not on the content of your answers. Think of them as a test of us, not as a test of you: if lots of people are confused about something, it’s a sign that we may need to spend more time on a concept or explain it more thoroughly.

Your team can earn extra credit points on assignments by being the first to find and report any typos and mistakes in the assignment specs. (So it’s in your interest to read the assignment specs closely as soon as they’re available, to look for mistakes! Several teams have gotten extra credit for this in past iterations of the course.)

Needless to say, the above grading approach assumes no violations of academic integrity.

Academic integrity on exams

Academic integrity is one of our core principles at UCSC. Cheating in any form harms everyone in our community. When cheating is suspected, the academic records of all UCSC students become suspect, and those records are worth much less when the students graduate.

With this in mind:

  • You must work alone on the exams in this course, and not communicate with anyone other than the course staff about the exams while the exams are ongoing.
  • You may use the following resources:
    • Your own notes
    • The lecture videos, materials provided by the course staff, or other materials that are freely available on the web (i.e., findable with a web search and not password-protected), unless using those materials would violate the “work alone” rule
    • Pen/pencil and paper
  • You are required to cite any sources you use on exams. If you need to look something up to answer a question, cite it in your answer. You are not allowed to copy and paste from sources. If your answer to an exam question contains material that is copied from a source, at a minimum, you will get no credit for your answer to that question, and you risk further penalties, e.g., failing the exam.
  • Any evidence that you collaborated with other students during an exam, had outside help on an exam, or used resources other than the ones listed above will result in your exam being nullified and a failing grade in the class. (This happened to two students in the spring 2020 edition of this course. Don’t let it happen to you.)

The “no copying and pasting” policy merits further explanation. Students have unintentionally run afoul of this policy before, because they copied and pasted something from an online source into their notes while studying, and then, later on, copied and pasted from their notes into an exam. My advice to avoid this problem is to avoid copying and pasting. I recommend watching this video on good citation practice from Prof. Kevin Karplus, which gives the same advice. If you must copy and paste something from a source into your notes, always put quotes around it and add a pointer to the source so you remember that the writing isn’t your own.

Academic integrity on assignments

You’re expected and encouraged to discuss your work on assignments with others. That said, all the work you turn in for this course must be your own, independent work (for assignment 1) or the independent work of your team (for subsequent assignments). Students who do otherwise risk failing the course.

You can ask the TAs, the tutors, and classmates for advice, but you cannot copy from anyone else: once you understand the concepts, you must write your own code. While you work on your own homework solution, you can:

  • Discuss with others the general techniques involved, without sharing your code with them.
  • Use publicly available resources such as online documentation.

In the README.md file you include with each assignment, you are required to include the following sections:

  • Team Contributions lists each member of the team and what they contributed to the assignment. (There’s no need to include this for assignment 1, since assignment 1 is done independently.)
  • Acknowledgments lists people you discussed the assignment with and got help from. List each person you talked to and the concept that they helped with.
  • Citations is for citing sources you used. For anything you needed to look up, document where you looked it up.

Thorough citation is the way to avoid running afoul of accusations of misconduct. We adopt the following policy in this course (suggested by Prof. Ethan Miller):

If you copied material but cited its source, it won’t be considered academic misconduct. The course staff may decide that your assignment loses a few points if you copied a few lines from Stack Overflow, but it won’t be considered cheating. Even copying your entire program isn’t cheating, though you won’t get any credit for someone else’s code. But if you copied and didn’t cite your source, that’s misconduct.

Policy on the use of LLM-based tools like ChatGPT

In the work you do for this course, you are completely welcome to use tools based on large language models (LLMs), such as ChatGPT. Such tools can be incredibly useful, and it may be worth your while to learn how to use them. However, it’s important to be aware of the limits of LLM-based tools, and of how to use and cite them properly. In particular, here’s what you need to know for this class:

  • You must acknowledge your use of LLM-based tools. If you use LLM-based tools for any of the work you submit for this class, you must cite your sources as described above, explaining what tool you used, what you used it for, and exactly what prompts you used to get the results. Failure to do so is academic misconduct.
  • If you provide low-effort prompts, you will get low-quality results. You will need to refine your prompts in order to get good outcomes. This will take work.
  • Don’t trust anything that an LLM-based tool says. It will often make up fake citations or hallucinate plausible-but-wrong answers to questions. If it gives you a number or fact, assume it is wrong unless you either know the answer or can check with another source. You will be responsible for any errors or omissions provided by the tool. It works best for topics you understand.

(This policy is based on the article “Why All Our Classes Suddenly Became AI Classes” by Ethan Mollick and Lilach Mollick.)

All that said, I will now make a case for why you might not want to use LLM-based tools in this course.

If you’re enrolled in this elective course, it’s presumably because you are at least a little bit interested in understanding the subject matter. The knowledge and experiences you’ve gained in your life so far will inform that understanding. By building on those experiences and putting them together with what you glean from this course, you will probably think of things that I’ve never thought of before. You may even think of things that no one has thought of before! I am excited to see those ideas emerge in the work you do for this class. However, as Rob Ricci puts it, “I can’t tell you exactly what this should look like, because the point is that I want to see things that are different from what everyone else is writing.”

LLM-based tools do the opposite of that: they produce statistically likely responses to prompts. If you ask ChatGPT a question, the best you can typically hope for is that it will produce – with great confidence, and in impeccable English prose – “a thoroughly mediocre response that is indistinguishable from all the other text it has ingested on the topic”, to quote Rob again. And that’s the best case. Often, something worse will happen: it will produce output that contains information that is plausible but false.

So, please understand that while you are welcome to use LLM-based tools in this course, you should be aware of their limitations.

(Of course, I haven’t even begun to address the topic of the myriad environmental, social, and ethical problems involved in the training, deployment, and use of LLM-based tools, but that’s a topic for another course.)

What should I do if I’m struggling in this course?

Talk to the course staff immediately. Tell us what you’re having trouble with; it’s our job to help. Ask questions in lecture, in discussion sections, in office hours, in tutoring sessions, and on Zulip. Keep coming to class, keep turning in the homework even if it’s only partially done, keep doing the lecture quizzes, and continue to make a good-faith effort to succeed in the course.

Do not wait until the last minute to come talk to us. Do not assume that you can catch up later if you fall behind. Later course material builds on earlier course material, so if you become lost, it will be hard to catch up unless you come talk to us.

Acknowledgment

This course was originally based on Peter Alvaro’s course design, and it borrows some material from his previous editions of the course. That said, I take full responsibility for my own course, so if you don’t like the design of this course, you should complain to me, not Peter!

Disability accommodations

If you have a disability and you require accommodations to achieve equal access in this course, please submit your Accommodation Authorization Letter from the Disability Resource Center (DRC) to me via the Accommodate system, preferably within the first two weeks of the quarter. I am eager to discuss ways we can ensure your full participation in the course.

I encourage all students who may benefit from learning more about DRC services to contact the DRC.