Last August, I wrote about a classic protocol published by Raynal et al. in 1991 for ensuring that unicast messages in a distributed system are delivered in causal order. The Raynal et al. protocol works by queuing received messages on the recipient’s end until they’re deemed deliverable. It determines the deliverability of a message by inspecting the causal metadata attached to the message. This kind of receiver-side enforcement of causal message delivery seems to be the standard approach – for instance, Birman et al.’s classic causal broadcast protocol does something similar in the setting of broadcast messages.
In a 1995 article1, Mattern and Fünfrocken propose an alternative approach. Instead of eagerly sending messages and delaying delivery on the recipient’s side, their protocol delays messages on the sender’s side until they’re safely sendable, then unconditionally delivers messages on the recipient’s side as soon as they’re received. This sender-side approach to enforcement of causally-ordered delivery seems to be rather less widely used than the receiver-side approach. Let’s take a closer look at how the sender-side approach works and and consider the pros and cons of each.
Motivating example: an awkward work situation
I’ll begin by briefly recapping the motivating example from my post about the Raynal et al. algorithm. Suppose that Alice, Bob, and Carol work together. One day, Alice messages Carol, saying “Let’s meet at 3pm to talk about the Foo project.” Alice then decides that Bob should be in the meeting, too, so she messages Bob, asking, “Can you join a meeting with Carol and me at 3pm?” Both messages go out over an asynchronous network, meaning that there’s no bound on how long they may take to arrive at their destination. In this case, it so happens that Bob sees the message from Alice pretty quickly, and he then messages Carol: “What’s the 3pm meeting with Alice about?” But Carol hasn’t yet heard anything from Alice about a meeting, so she’s confused.
What went wrong here? Alice sent a message to Carol before she sent one to Bob, so we know that her message to Carol happened before, or causally preceded, her message to Bob.2 Likewise, we know that Alice’s message to Bob happened before Bob’s message to Carol, because Bob received Alice’s message before sending one to Carol. Because the happens-before relation is transitive, we can put those two relationships together to see that Alice’s message to Carol happened before Bob’s message to Carol – but that’s not the order Carol saw them in. No wonder Carol is confused.
We can solve this problem by using a messaging protocol that ensures causal message delivery: if a message \(m\) happens before a message \(m'\), then any process delivering both messages should deliver \(m\) before delivering \(m'\). (In distributed systems jargon, delivering a message is something that the process receiving the message can do. It means taking the received message and processing it in some way, such as by handing it off to whatever application might be waiting for it, like Carol’s chat client.)
Receiver-side and sender-side enforcement of causal message delivery
The Raynal et al. protocol uses the receiver-side approach. It decouples message reception from message delivery: when Bob’s message arrives at Carol, the protocol delays delivery of that message until Alice’s message to Carol has been both received and delivered on Carol’s end. Here’s a simplified visualization:
I won’t go into details here about how the Raynal et al. algorithm works, since that’s already discussed in detail in my post from last August. In a nutshell, though, the decision about whether a message is deliverable is made based on causal metadata attached to the message, which requires each participant in the system to do some extra bookkeeping. Worse, it increases the size of each message. In particular, in the Raynal et al. algorithm, the size of the causal metadata is \(O(n^2)\) in a system of \(n\) participants, which could quickly become prohibitively large if we need to scale up to lots of participants.
Mattern and Fünfrocken’s protocol takes a different approach. Instead of delaying message delivery on the receiver’s end, it works by delaying message sending. In the case of Alice, Bob, and Carol’s interaction, Alice’s message to Bob goes into an outgoing message queue, where it waits to be sent until an acknowledgment comes from Carol. There’s therefore no chance of Bob’s message to Carol overtaking Alice’s message to Carol. Here’s a simplified visualization:
Sender-side enforcement of causal message delivery has some advantages. We don’t need to worry about whether or not to deliver a received message – if a process gets to the point of receiving it, it’s always okay to deliver! As a result, there’s no need to keep track of causal metadata or attach it to messages, so there’s no issue with scaling the protocol to lots of participants. An obvious downside, though, is that having to wait for acknowledgements from message recipients could make things slow compared to the receiver-side approach. We’ll consider this trade-off further in a bit.
The sender-side protocol
In Mattern and Fünfrocken’s protocol, the key idea is that after sending a message, a process must wait for an acknowledgment before sending another message. If we treat acknowledgments just like ordinary messages, though, this approach can lead to a deadlock, as Mattern and Fünfrocken explain. In particular, if two processes concurrently send messages to each other, each can get stuck waiting for the other’s acknowledgment, as we can see in the following example with Alice and Bob.
To avoid deadlocks like this, Mattern and Fünfrocken suggest adding an output buffer to each process. The output buffer is a FIFO queue where ordinary outgoing messages (that is, not acknowledgment messages) hang out. Whenever a process has an ordinary message to send, the process enqueues the message in its output buffer. The output buffer waits for messages to appear. Whenever the buffer isn’t empty, it dequeues the next message and transmits it to its destination, then waits for an acknowledgment from the destination process before dequeuing and transmitting another message.
Crucially, acknowledgment messages don’t go through the output buffer of their sender. They’re sent immediately, as soon as an ordinary message is received. On the receiving end, acknowledgment messages go right to the output buffer of the original message’s sender, letting it know that it’s cleared to dequeue and transmit more messages, if there are any.
Adding this level of indirection lets us avoid deadlocks, as illustrated below:
In Mattern and Fünfrocken’s paper, the setup is a bit more complicated than this: each process also has an explicit input buffer, another FIFO queue that hands incoming messages off to the process and is responsible for transmitting acknowledgments back to their senders. In other words, the “main thread” of the process doesn’t do any external communication; the input and output buffers handle it all. In real life, something like Mattern and Fünfrocken’s input buffer would certainly exist at some level of the networking stack. In this post, though, I’m leaving the input buffers out of the diagrams to keep the discussion simple, essentially collapsing their functionality into the main thread. (I’m not pulling any sneaky tricks here; keep in mind that the input buffer doesn’t need to do anything smart to check whether a message’s causal dependencies have been satisfied, as would be necessary in receiver-side enforcement of causal delivery. Rather, it’s merely a FIFO queue that also sends an ack whenever a message gets dequeued.)
With output buffers in place, we can see how Mattern and Fünfrocken’s protocol avoids the awkward work situation we saw earlier. Alice’s outgoing messages go into her output buffer, and her message to Carol can be transmitted over the network right away. However, Alice’s message to Bob has to sit in her output buffer until she’s gotten an acknowledgment that Carol has received (and delivered) her first message. Only then can Alice’s second message be transmitted to Bob. Because Bob’s “What’s the meeting about?” message to Carol is caused by Alice’s message to Bob, that message is naturally also sent later, and so there’s no risk of it overtaking Alice’s message to Carol. Everyone’s messages can always be delivered as soon as they show up on the receiving end. Here’s what the whole execution looks like:
And that’s all there is to it! Just by instituting a simple “don’t send a message until you get an acknowledgment of the last one you sent” policy, we get causal order, without any need to track causal metadata and attach it to messages.
But isn’t this kind of…slow?
The reason the sender-side approach does the job is because it approximates synchronous communication on top of an asynchronous network. It’s well known that synchronous communication is stronger than causal communication (that is, synchronous executions are a subset of causal executions)3, and so any approach that gives you synchronous communication gives you causal communication for free.
That’s also the weakness of the sender-side approach, though – it’s more conservative than necessary. In particular, the protocol is “pessimistic” in the sense that Alice’s message to Bob gets delayed, even though it wouldn’t be a violation of causal delivery for it to be delivered earlier. The receiver-side approach is “optimistic” in the sense that it goes ahead and sends messages whenever, and leaves it to the receving end to sort it out.
Lately, my students and I have been thinking about ways to modify the sender-side approach to regain some of the optimism of the receiver-side approach, while keeping the lack of metadata overhead that the sender-side approach enjoys. One reason to do this: suppose Bob has a time-consuming task to do as a result of receiving a message from Alice, and which needs to be done before he can take further actions (like talking to Carol). If that’s the case, then the delay imposed by the sender-side approach especially sucks, because it prevents Bob from getting started on the time-consuming task.
It’d be nice if Bob could get started on any time-consuming tasks optimistically, with a solemn promise to not tell anyone about the result until Alice has told him that it’s okay.
A “can you keep a secret?” protocol
To that end, here’s a slight tweak to the Mattern and Fünfrocken protocol. In our example execution, things start out the same as before: Alice queues up her message to Carol, and it gets transmitted right away.
Alice then queues up her message to Bob. This is where things differ from the original protocol: because the destination of this message differs from that of the previous, thus far unacknowledged message, Alice does what we’ll call an eager send: she sends the message with a caveat that the receiver cannot take any externally observable actions, such as sending messages, until it gets a follow-up from Alice that says it’s okay. (Messages back to Alice, however, are fine.)
When Bob gets Alice’s message, he can get started on any time-consuming tasks right away. When Alice has received both the acknowledgment of her eager send and the acknowledgment of her previous normal send, she sends another message to Bob that says, essentially, “OK, now you can tell.” At that point, Bob can send messages or do whatever he wants.
In a nutshell: if a message in the output buffer has a different destination than preceding messages in the output, then it can be “eagerly” sent and later followed up with a “now you can tell” message. The receiver cannot send any messages (except back to the eager sender) until receiving the “now you can tell” message, but it may take internal actions and receive messages.
Is this a done thing? If there are existing protocols that use this kind of “here you go, but don’t tell yet”/”now you can tell” approach, I’d be interested to hear about it!
Some design considerations
Here are a couple of things to consider when designing a “can you keep a secret?” protocol:
- Should a recipient acknowledge eagerly sent messages right away, or wait until it gets the “now you can tell” message to send an ack? My answer: I think it has to be immediate, because we need to make sure the now-you-can-tell message doesn’t overtake the eagerly sent message, so we should wait until getting an ack for the eagerly sent message before sending the now-you-can-tell message.
- Can you do an eager send as a result of an eager send? Consider what would happen here when Bob gets done processing Alice’s message and then queues up his own to send. If Bob could eagerly send a message to Carol, it could overtake Alice’s message on the way to Carol, which is the violation of causal delivery that we wanted to prevent in the first place. So I’m inclined to say that no, you can’t do an eager send as a result of an eager send. (On the other hand: if Carol does get an eager send from Bob too soon, consider what she’s allowed to do as a result. She can’t take any externally observable actions! So it’s not like anyone can find out that she got the message out of causal order. If a message violates causal delivery in the forest and nobody hears it, does it make a sound?)
- Okay, let’s say you can’t do an eager send as a result of an eager send. When a process wants to send a message as a result of an eager send it previously received, should the new message go into the output buffer while waiting for the now-you-can-tell message, or should it not even be buffered yet? This I’m not sure about.
- A downside of the “can you keep a secret?” approach seems to be that it creates more message traffic. As a PL person, I can’t help but think that some sort of language-based information flow analysis could be helpful here. If Alice is sending two messages and an analysis can determine that they are in fact not related, then it seems that she ought to be able to send the second without regard to whether she’s heard back about the first. The bigger issue here is that the standard happens-before relation is quite coarse, and relates a lot of messages that in reality are not causally related, and just happened to be sent in a given order for some arbitrary reason. We use happens-before because it’s a reasonable overapproximation of actual causality that’s relatively easy to compute. But if we can get some input from the language level on what secrets actually need to be kept, maybe we can use a “can you keep a secret?” protocol judiciously. (If you’re interested in working on this stuff, let me know!)
We’re glibly throwing around the phrase “causally preceded” here, but don’t read too much into it: Lamport’s happens-before relation doesn’t actually tell us that a message caused another message, only that it could have caused that message. For instance, Alice might happen to send two unrelated messages in sequence, and then they would be related by happens-before even though they have no meaningful causal relationship. Moreover, in real life, causal relationships could be created via a side channel, in which case we don’t even know that messages that are causally related will be related by happens-before! For instance, Alice could message Carol about the 3pm meeting and then run into Bob in the hallway and verbally ask him to join the meeting, and Bob could then message Carol asking what the meeting is about. Alas, the causal message delivery protocols we’re discussing in this post have no way to prevent Carol’s confusion in such a side-channel scenario. ↩
A cool paper appearing this week at POPL puts several flavors of message-passing communication in a strict hierarchy, with asynchronous the weakest, synchronous the strongest, and causal in the middle. (That particular result isn’t new, but the paper has other novel results.) ↩