This year at SNAPL, I’ll be presenting a new paper, co-authored with my colleague Peter Alvaro, that collects our recent thinking on consistency-aware solvers. Here’s the abstract:

To guard against machine failures, modern internet services store multiple replicas of the same application data within and across data centers, which introduces the problem of keeping geo-distributed replicas consistent with one another in the face of network partitions and unpredictable message latency. To avoid costly and conservative synchronization protocols, many real-world systems provide only weak consistency guarantees (e.g., eventual, causal, or PRAM consistency), which permit certain kinds of disagreement among replicas.

There has been much recent interest in language support for specifying and verifying such consistency properties. Although these properties are usually beyond the scope of what traditional type checkers or compiler analyses can guarantee, solver-aided languages are up to the task. Inspired by systems like Liquid Haskell and Rosette, we believe that close integration between a language and a solver is the right path to consistent-by-construction distributed applications. Unfortunately, verifying distributed consistency properties requires reasoning about transitive relations (e.g., causality or happens-before), partial orders (e.g., the lattice of replica states under a convergent merge operation), and properties relevant to message processing or API invocation (e.g., commutativity and idempotence) that cannot be easily or efficiently carried out by general-purpose SMT solvers that lack native support for this kind of reasoning.

We argue that domain-specific SMT-based tools that exploit the mathematical foundations of distributed consistency would enable both more efficient verification and improved ease of use for domain experts. The principle of exploiting domain knowledge for efficiency and expressivity that has borne fruit elsewhere – such as in the development of high-performance domain-specific languages that trade off generality to gain both performance and productivity – also applies here. Languages augmented with domain-specific, consistency-aware solvers would support the rapid implementation of formally verified programming abstractions that guarantee distributed consistency. In the long run, we aim to democratize the development of such domain-specific solvers by creating a framework for domain-specific solver development that brings new theory solver implementation within the reach of programmers who are not necessarily SMT solver internals experts.

I’m excited about this paper because it brings together a number of themes in my recent research life. We PL folk like to talk about how choosing the right high-level abstractions for a given domain can make possible particular optimizations or smart scheduling choices that result in great performance; that’s what my colleagues and I were after a couple years ago at Intel Labs with our work on ParallelAccelerator, for instance. But PL people don’t have a monopoly on this way of thinking; SMT solver developers have known it for a long time. For me, having the opportunity to work with the Reluplex team opened my eyes to the power of domain-specific SMT solvers to exploit domain knowledge for efficiency. In that case, the solver knows about the ReLU activation functions used in many neural networks and has a clever way of encoding them in SMT formulas. Having this extra piece of domain knowledge as part of the solver makes it much faster to verify certain properties of neural networks than it could have been with a general-purpose solver. The same principle should apply to all kinds of domains where we want to make solvers practical – including and especially reasoning about distributed systems, which is the domain I’m most interested in these days! In our paper, we give a few examples of ways in which we think a consistency-aware solver could help us build better systems.

Regardless of the domain of interest, though, the most ambitious part of what we’re proposing has to do with democratizing solver development. In principle, SMT solvers like Z3 should be easily extensible, allowing anyone to write their own domain-specific theory solver (ideally in a high-level language!), plug it in, and reap the efficiency benefits. The design of modern “lazy” SMT solvers would seem to lend itself to a modular style of development. In practice, though, it seems that SMT solvers are somewhat monolithic and tricky to extend, and theory solvers have to be written in low-level C++, by people who are already SMT solver internals experts.

For implementing domain-specific languages, we have things like the Delite framework, which aimed to make it possible to rapidly implement high-performance DSLs embedded in Scala without necessarily having to be an expert on the guts of compilers. Could we have something like Delite, but for implementing high-performance, domain-specific solvers, including (but not limited to) solvers that can reason about distributed consistency? (Come do a Ph.D. with me here at UCSC, and let’s find out!)

Comments