Research projects
My research work draws on the traditions of several subfields of computer science: programming languages, software verification, and concurrent and distributed computing. The programming languages research tradition is where I’m most at home, and it informs my approach to solving computing problems: by developing general abstractions for defining problems and solutions, and by considering software behavior in a rigorous and general way.
This page summarizes research projects I’ve contributed to in some way over the years, in more or less reverse chronological order. I try to keep it up to date, but the most recent stuff I’m thinking about might not be here – if you’re curious, ask me about it in person sometime.
- Formal foundations of the Entity-Component-System pattern
- Emulation and representation independence of replicated data structures
- Monolift: automating distribution with the tools you have at home
- Inductive diagrams for causal reasoning
- Encoding actors with exceptions
- Library-level choreographic programming
- Surveying bugs in serverless runtime systems
- Verifying distributed systems with refinement types
- Integrating interactive and automated theorem proving
- Verification for safety-critical neural networks
- Parallelizing Julia with a non-invasive DSL
- LVars: lattice-based data structures for deterministic parallel and distributed programming
- River Trail: parallel programming in JavaScript
- Various contributions to the Rust programming language
- Parametric polymorphism through run-time sealing
- Testing abstract representations of CPU semantics
- A pattern matcher for miniKanren
Formal foundations of the Entity-Component-System pattern
My PhD student Patrick Redmond is leading our group’s exploration of the Entity-Component-System (ECS) software design pattern, long used in the game development industry. Software written using the ECS pattern is naturally concurrent, and the pattern offers modularity, flexibility, and performance benefits that have led to a proliferation of ECS frameworks, yet it is little-known and poorly understood outside of the game development domain. In our OOPSLA ‘25 paper, we present Core ECS, the first formal model of the ECS pattern. Core ECS abstracts away the details of specific ECS frameworks to reveal the essence of software using the pattern. We identified a class of Core ECS programs that behave deterministically regardless of scheduling, enabling use of the ECS pattern as a general-purpose deterministic-by-construction concurrent programming model. Using Core ECS as a point of comparison, we then surveyed several real-world ECS frameworks and found that they all leave opportunities for deterministic concurrency unexploited. Our findings point toward an exciting new implementation approach for ECS frameworks, which we are currently pursuing.
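To give a flavor of the pattern itself (a minimal Haskell sketch for illustration only, not Core ECS or any real framework; the World, Position, and physics names are made up), entities are bare identifiers, components are per-entity data stored in separate tables, and systems are functions over those tables:

```haskell
import           Data.Map (Map)
import qualified Data.Map as Map

-- Entities are bare identifiers; components are data attached to entities.
type Entity   = Int
type Position = (Double, Double)
type Velocity = (Double, Double)

-- The "world" is a collection of component tables keyed by entity.
data World = World
  { positions  :: Map Entity Position
  , velocities :: Map Entity Velocity
  }

-- A "system" is a function over component tables; this one reads velocities
-- and updates positions.  Systems that touch disjoint components can run
-- concurrently, which is where the determinism questions come in.
physics :: Double -> World -> World
physics dt w = w { positions = Map.mapWithKey step (positions w) }
  where
    step e (x, y) = case Map.lookup e (velocities w) of
      Just (vx, vy) -> (x + vx * dt, y + vy * dt)
      Nothing       -> (x, y)
```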
Emulation and representation independence of replicated data structures
Conflict-free replicated data types (CRDTs) are distributed data structures designed for fault tolerance and high availability. CRDTs can be implemented either by propagating state updates among replicas (known as the state-based approach) or by replicating the entire operation stream (known as the op-based approach). In the literature, state-based and op-based CRDTs have been considered equivalent due to the existence of algorithms that let them emulate each other. Verification techniques and results that apply to one kind of CRDT are said to apply to the other by appealing to this equivalence. However, what exactly it means for state-based and op-based CRDTs to emulate each other had never been made fully precise. Emulation is nontrivial since state-based and op-based CRDTs place different requirements on the underlying network with regard to both the causal ordering of message delivery, and the granularity of the messages themselves. In our ICFP ‘25 paper, led by my PhD student Nathan Liittschwager, we specify and formalize CRDT emulation in terms of simulation by modeling CRDTs and their interactions with the network as transition systems. We show that emulation can be understood as weak simulations between the transition systems of the original and emulating CRDT systems. We precisely characterize which properties of CRDT systems are preserved by our weak simulations, and therefore which properties can be said to be preserved by emulation algorithms, thus closing a gap in the CRDT literature. Finally, we leverage our emulation results to obtain a general representation independence result for CRDTs, enabling modular reasoning about CRDTs and their clients.
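For a concrete sense of the two styles (a toy Haskell sketch, not code from the paper; the GCounter and CounterOp names are invented), here is a grow-only counter in both flavors; the emulation algorithms that translate between them are not shown:

```haskell
import           Data.Map (Map)
import qualified Data.Map as Map

type ReplicaId = Int

-- State-based style: replicas exchange whole states and merge them with a
-- join (pointwise max), which is commutative, associative, and idempotent.
newtype GCounter = GCounter (Map ReplicaId Int)

increment :: ReplicaId -> GCounter -> GCounter
increment r (GCounter m) = GCounter (Map.insertWith (+) r 1 m)

merge :: GCounter -> GCounter -> GCounter
merge (GCounter a) (GCounter b) = GCounter (Map.unionWith max a b)

value :: GCounter -> Int
value (GCounter m) = sum (Map.elems m)

-- Op-based style: replicas broadcast operations rather than states, and
-- every replica applies every operation it receives.  The two styles make
-- different demands on the network, which is what makes emulation subtle.
data CounterOp = Inc ReplicaId

apply :: CounterOp -> GCounter -> GCounter
apply (Inc r) = increment r
```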
Monolift: automating distribution with the tools you have at home
The Monolift project, led by my PhD student Tim Goodwin, seeks to develop new techniques for building and deploying distributed applications that prioritize incremental adoption and support for legacy code. The key idea of Monolift is to treat distribution as a compiler pass for general-purpose languages, where users guide the process via lightweight comment directives. Additionally, Monolift embeds a runtime monitor to intelligently manage the compiled application’s distribution in response to live performance metrics. Check out our PLOS ‘25 paper on Monolift, or watch Tim’s KubeCon ‘24 talk with our collaborator David Morrison, “What if Kubernetes Was a Compiler Target?”!
Inductive diagrams for causal reasoning
Causality – the principle that an effect cannot precede its cause – is of central importance in concurrent and distributed systems. Verification of properties related to causality requires modeling some notion of the history of how a system runs. System designers and implementors often use intuitive (but informal) diagrams to describe such histories; meanwhile, verification engineers use formal (but unintuitive and error-prone) mechanisms to represent histories, leading to tedious and ad-hoc proofs.
To bring together the strengths of these two approaches, my PhD student Jonathan Castello is leading our group’s work on causal separation diagrams, a formal diagrammatic language for representing execution histories of concurrent systems. In a CSD, “happens-before” relationships in an execution are modeled as a dependent type of paths between events, letting us follow the software design principle of “make illegal states unrepresentable”: causally impossible executions are unrepresentable as CSDs! Moreover, CSDs are an inductively-defined data type, making it natural to give compositional interpretations of CSDs into other data types. CSDs’ interpretability enables an elegant approach to verification in which the building blocks of executions are interpreted as proof steps, then composed along their causal structure. In our OOPSLA ‘24 paper, we used this approach to carry out a generic mechanized proof of Lamport’s clock condition that is parametric in a choice of logical clock (i.e., scalar, vector, and matrix clocks), yielding verified implementations of these widely-used mechanisms.
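As a rough, Haskell-flavored approximation of the idea (the real CSDs are dependently typed and considerably richer than this toy), one can index a path type by its endpoints so that composition only typechecks when the causal structure lines up, and then interpret paths compositionally by structural recursion:

```haskell
{-# LANGUAGE GADTs #-}

-- A toy "path" type indexed by its endpoints: a value of type Path a b is
-- evidence that a happens before b, and composition only typechecks when
-- the endpoints match, so some ill-formed histories are unrepresentable.
data Path a b where
  Edge :: a -> b -> Path a b                 -- a single happens-before edge
  Comp :: Path a b -> Path b c -> Path a c   -- composition along causality

-- A compositional interpretation, defined by structural recursion:
-- here, just the number of edges on a path.
pathLength :: Path a b -> Int
pathLength (Edge _ _) = 1
pathLength (Comp p q) = pathLength p + pathLength q
```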
Encoding actors with exceptions
Asynchronous exceptions are a powerful feature of modern runtime systems such as that of GHC Haskell. Unlike typical (synchronous) exceptions, which come from the current thread, asynchronous exceptions can be thrown to the current thread by a different thread or by the runtime system itself. With tongue firmly planted in cheek, in our Haskell Symposium ‘23 paper led by my PhD student Patrick Redmond, we demonstrated that asynchronous exceptions can therefore be repurposed for general inter-thread communication. We used asynchronous exceptions to implement actors, a computational paradigm characterized by message passing, and used the resulting actor framework to implement a classic leader election protocol, demonstrating that asynchronous exceptions are at least as general as actors.
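Here's a tiny sketch of the underlying trick (a toy, not the paper's actor framework, which deals carefully with masking and delivery order): a thread "receives" a message when some other thread throws it an exception carrying the payload.

```haskell
import Control.Concurrent (forkIO, threadDelay)
import Control.Exception  (Exception, catch, throwTo)

-- A message type, delivered by (ab)using asynchronous exceptions as transport.
newtype Msg = Msg String deriving Show
instance Exception Msg

-- A toy "actor": loop forever, handling any Msg thrown at this thread.
actor :: IO ()
actor = body `catch` handler
  where
    body            = threadDelay 1000000 >> body
    handler (Msg s) = putStrLn ("received: " ++ s) >> actor

main :: IO ()
main = do
  tid <- forkIO actor
  threadDelay 100000          -- give the actor time to install its handler
  throwTo tid (Msg "hello")   -- "send" a message by throwing an exception
  threadDelay 100000
```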
“This is horrifying, so…well done.” – Stephen Diehl
Library-level choreographic programming
Distributed software systems require global coordination between multiple parties to accomplish tasks (for example, an e-commerce transaction involving a buyer, seller, payment processor, and shipping service), yet each party in the system can only take local actions. Choreographic programming (CP) is an emerging paradigm for programming distributed systems that addresses this mismatch between global tasks and local actions by letting the programmer describe the behavior of the entire system as a unified program called a choreography, which has benefits both for correctness and for understandability. A choreography can then be automatically translated to multiple programs run by distinct parties, via a technique called endpoint projection.
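As a rough illustration of the idea (a toy model in Haskell, not the API of HasChor or any other CP library), a choreography is a single global description of the interaction, and endpoint projection extracts each party's local program from it:

```haskell
-- Parties and a tiny language of global choreographies.
data Party = Buyer | Seller deriving (Eq, Show)

data Choreo
  = Comm Party Party String Choreo   -- sender, receiver, message label, rest
  | Done
  deriving Show

-- Each party's local program: only the actions it participates in.
data Local
  = SendTo   Party String Local
  | RecvFrom Party String Local
  | Skip
  deriving Show

-- Endpoint projection: keep sends where we are the sender, receives where
-- we are the receiver, and drop interactions we are not involved in.
project :: Party -> Choreo -> Local
project _ Done = Skip
project p (Comm from to msg rest)
  | p == from = SendTo   to   msg (project p rest)
  | p == to   = RecvFrom from msg (project p rest)
  | otherwise = project p rest

-- Example: the buyer sends a title; the seller replies with a price.
bookshop :: Choreo
bookshop = Comm Buyer Seller "title" (Comm Seller Buyer "price" Done)
```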
Traditional implementations of CP required a purpose-built language and compiler, hindering widespread adoption of the technique. In our ICFP ‘23 paper led by my PhD student Gan Shen, we introduced the HasChor framework, which pioneered library-level choreographic programming. Library-level CP brings CP to existing, general-purpose programming languages via a novel run-time approach to endpoint projection, enabling seamless use of the host language’s ecosystem.
Our PLDI ‘25 paper with our collaborators at the University of Vermont advances the state of the art of library-level CP in three ways. First, one challenge of CP is ensuring knowledge of choice, which is run-time information that tells each party in a choreography what branch to take in a conditional (e.g., “if-then-else”) expression; we proposed and formalized new mechanisms that enable efficient knowledge of choice in library-level CP frameworks. Second, we introduced a novel technique for abstracting over the number of participants in a choreography. Third, we proposed endpoint projection as dependency injection, a design pattern for library-level CP usable in any host language that supports higher-order functions. We demonstrated these contributions with three open-source software libraries for the programming languages Haskell, Rust, and TypeScript.
After years of work on their theoretical underpinnings, choreographies are now poised to enter the mainstream programmer’s toolbox via libraries for general-purpose programming languages. Library-level CP has now spread beyond our research group, and as of this writing in 2026, the last two years have seen the release of CP libraries for programming languages including Rust, TypeScript, OCaml, Clojure, Elixir, Racket, Unison, and more, actively developed by both researchers and practitioners.
To learn more about choreographic programming, check out our zine, “Communicating Chorrectly with a Choreography”!
Surveying bugs in serverless runtime systems
So-called “serverless” computing is a cloud computing abstraction in which programmers specify functions to run, and the platform provides everything else (such as an operating system, resource allocation, load balancing, and fault tolerance). Though serverless computing promises to drastically simplify distributed system deployments, serverless runtime systems must attain complex design goals and are prone to bugs. To investigate the challenges faced by designers of serverless runtime systems, in our SESAME ‘23 paper led by my PhD student Tim Goodwin, we carried out an empirical study of bugs in one such system. We investigated and taxonomized all bugs publicly reported against Knative Serving, a popular open-source serverless computing platform, over a three-year period, resulting in a public dataset of over 100 bugs.
Verifying distributed systems with refinement types
To guard against machine failures and keep data close to clients who need it, modern internet services store multiple replicas of the same application data within and across data centers. Replication introduces the problem of keeping geo-distributed copies of data consistent with one another in the face of network partitions and unpredictable message latency. To avoid costly and conservative synchronization protocols, many real-world systems provide only weak consistency guarantees (e.g., causal consistency or PRAM consistency), which permit certain kinds of disagreement among replicas. There has been much recent interest in programming-language support for specifying and verifying such consistency properties. Although these properties are typically beyond the scope of what traditional type checkers or compiler analyses can guarantee, solver-aided programming languages – languages that have integrated SAT or SMT solving capabilities, at compile time, at run time, or both – are up to the task.
This line of work aims to enable the development of correct and reliable distributed systems by leveraging cutting-edge verification tools integrated into general-purpose programming languages. Refinement types are a programming language feature that augments standard data types with logical predicates that restrict the set of values described by a type. Refinement types have traditionally been used to express relatively simple properties of individual functions that can be checked by an off-the-shelf SMT solver. However, recent advancements such as the reflection mechanism in the Liquid Haskell programming language let programmers use refinement types to specify rich properties that can relate multiple functions, and then write SMT-aided mechanized proofs of those properties.
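For a small taste of what refinement types look like (a standalone illustrative example in Liquid Haskell's annotation syntax, not code from our papers), the predicates live in special comments and are checked by an SMT solver alongside ordinary type checking:

```haskell
-- Refinement types attach logical predicates to ordinary Haskell types;
-- Liquid Haskell checks them with an SMT solver.

{-@ type Nat = {v:Int | 0 <= v} @-}

-- The result is guaranteed to be non-negative.
{-@ abs' :: Int -> Nat @-}
abs' :: Int -> Int
abs' n
  | n < 0     = negate n
  | otherwise = n

-- A precondition expressed as a refinement: the divisor must be nonzero.
{-@ safeDiv :: Int -> {d:Int | d /= 0} -> Int @-}
safeDiv :: Int -> Int -> Int
safeDiv n d = n `div` d
```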
In our OOPSLA ‘20 paper, together with our collaborators at the University of Maryland, we used the Liquid Haskell solver-aided language to prove that replicated data structures are strongly convergent – the property that, if replicas have applied the same set of update operations, in any order, then they have equivalent state. We were able to state and prove the strong convergence property at the level of a generic interface in Liquid Haskell, and then plug in a data-structure-specific commutativity proof for each data structure we wanted to verify. Being able to do this proof in this modular fashion required extending Liquid Haskell to add the ability to state and prove properties at the typeclass level, a feature with many applications beyond our work.
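Schematically, the shape of the argument is something like the following (a plain-Haskell sketch under simplifying assumptions, not the paper's actual development): a generic interface fixes an apply function, the generic proof derives strong convergence from commutativity, and each data structure discharges its own commutativity obligation.

```haskell
{-# LANGUAGE TypeFamilies #-}

-- A plain-Haskell sketch of the shape of the interface (the real development
-- states these laws as refinement types and proves them in Liquid Haskell).
class Replicated s where
  type Op s
  apply :: s -> Op s -> s

-- The per-data-structure obligation: concurrent operations commute, so
-- replicas that apply the same set of operations in any order converge.
commutes :: (Replicated s, Eq s) => s -> Op s -> Op s -> Bool
commutes s op1 op2 =
  apply (apply s op1) op2 == apply (apply s op2) op1
```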
In my group’s IFL ‘22 paper led by my PhD student Patrick Redmond, we then used Liquid Haskell to develop a novel mechanically verified implementation of a causal message broadcast protocol. Causal broadcast is a widely used building block of distributed systems (and itself a prerequisite for implementing many CRDTs), motivating the need for verified implementations. Our proof development uses refinement types to express the safety property that messages are never delivered to a process in an order that violates the causal order.
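For intuition, here is the textbook vector-clock deliverability check that protocols of this kind revolve around (a sketch for exposition, not our verified implementation): a message is deliverable at a process once it is the next expected message from its sender and everything it causally depends on has already been delivered.

```haskell
-- One vector clock entry per process in the system.
type VC  = [Int]
type PID = Int

-- msgVC is the vector clock attached to the message; procVC is the
-- receiving process's current vector clock.
deliverable :: PID -> VC -> VC -> Bool
deliverable sender msgVC procVC =
  and [ ok k | k <- [0 .. length msgVC - 1] ]
  where
    ok k
      | k == sender = msgVC !! k == procVC !! k + 1  -- next from the sender
      | otherwise   = msgVC !! k <= procVC !! k      -- no missing dependencies
```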
My FLOPS 2022 keynote talk, “Adventures in Building Reliable Distributed Systems with Liquid Haskell”, summarizes these projects.
More broadly, in our SNAPL ‘19 paper, my colleague Peter Alvaro and I made a case for domain-specific SMT-based tools that exploit the mathematical foundations of distributed consistency. Such consistency-aware solver-aided tools would support the rapid implementation of formally verified programming abstractions that guarantee distributed consistency. What if we could democratize the development of such domain-specific solvers by creating a framework that brings new theory solver implementation within the reach of programmers who are not experts in SMT solver internals?
Integrating interactive and automated theorem proving
Mechanized theorem-proving systems have historically fallen into two camps, with different strengths and weaknesses: interactive theorem proving, exemplified by proof assistants based on dependent types such as Agda, and automated theorem proving, exemplified by SMT solvers and tools that build on them, such as Liquid Haskell’s refinement types. Experience with large-scale proof mechanization highlights the need for usable hybrid verification tools that integrate solver automation, à la Liquid Haskell, with visibility into the in-progress proof state, à la Agda. Two complementary workshop publications from my research group suggest paths toward such a hybrid approach. In one direction, we propose integrating SMT-powered refinement types into Agda to ease the burden of programming with dependent types; in the other direction, we propose improving the usability of Liquid Haskell by extending it with support for Agda-style typed holes and interactive editing commands that take advantage of them.
Verification for safety-critical neural networks
In 2017, my group at Intel Labs began sponsoring and collaborating with a team of researchers at Stanford and HUJI who are extending the capabilities of automated verification tools, such as SMT solvers, to formally verify properties of deep neural networks used in safety-critical systems. One of the findings from this work was that in order to scale up SMT-based verification to realistic networks, we need to develop domain-specific solvers that incorporate domain knowledge about neural network internals. The Reluplex solver developed by our collaborators exemplifies this idea of exploiting high-level domain-specific abstractions for efficiency. In the longer term, I’m interested in creating a framework that makes it easy to develop such high-performance domain-specific solvers, analogously to how, for instance, the Delite framework aimed to make it easy to develop high-performance DSLs.
For more on this line of work, aside from the posts in the “neural network verification” category on this blog, see our short paper that appeared at SysML ‘18 and the accompanying poster.
Parallelizing Julia with a non-invasive DSL
Computational scientists often prototype their software using a “productivity” language such as MATLAB, Python, R, or Julia. When it comes time to scale up such prototypes to run on a high-performance computing system, though, the code typically has to be rewritten in an “efficiency” language like C++. This rewriting process takes considerable time and expense and often results in brittle code, with which it is harder to experiment. Domain-specific languages (DSLs) are one well-regarded solution to the efficiency/productivity trade-off, but it is difficult to design a DSL that can anticipate any use case that the programmer will eventually encounter.
With my colleagues at Intel Labs, I worked on ParallelAccelerator, a compiler for the Julia programming language that attempts to bridge this productivity-efficiency gap in scientific computing. ParallelAccelerator discovers and exploits the implicit parallelism in source programs that use parallel programming patterns. For the most part, these patterns are already present in standard Julia, so programmers can use ParallelAccelerator to compile Julia programs with only minor changes to source code. Our results show that ParallelAccelerator enables orders-of-magnitude speedups over plain Julia on a variety of scientific computing workloads, without requiring the programmer to extensively rewrite code or add lots of type annotations to guide the compiler.
For more on ParallelAccelerator, aside from the posts in the “ParallelAccelerator” category on this blog, see:
- Our ECOOP ‘17 paper.
- The ParallelAccelerator.jl Julia library on GitHub.
- My March 2016 guest post on the Julia blog and accompanying talk at JuliaCon 2016. I also have an older talk from SPLASH-I 2015.
LVars: lattice-based data structures for deterministic parallel and distributed programming
Parallel programming is notoriously difficult. Programs can yield inconsistent answers, or even crash, due to unpredictable interactions between parallel tasks. But it doesn’t have to be this way: deterministic-by-construction programming models offer programmers the promise of freedom from subtle, hard-to-reproduce nondeterministic bugs in parallel code. To make that promise a reality, though, deterministic-by-construction parallel programming models must sharply restrict sharing of state between parallel tasks, usually either by disallowing sharing entirely or by restricting it to one type of data structure, such as single-assignment locations.
A common theme that emerges in the study of such deterministic-by-construction programming models – from venerable ones like Kahn process networks, to modern single-assignment-based models like Haskell’s monad-par library and Intel’s Concurrent Collections system – is that the determinism of the system hinges on some notion of monotonicity. By taking monotonicity as a starting point, then, we can generalize and unify existing models. For my dissertation research with Ryan Newton, I developed LVars, a new model for deterministic-by-construction parallel programming that generalizes existing single-assignment models to allow multiple assignments that are monotonically increasing with respect to a specified lattice. LVars maintain determinism by allowing only monotonic writes and “threshold” reads to and from shared data.
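As a toy illustration of those two ingredients (an STM-based sketch; the real LVish library is far more general and enforces the discipline by construction), writes can only move the state up, and reads block until the state crosses a threshold, returning only the threshold itself:

```haskell
import Control.Concurrent.STM

-- A counter-shaped "LVar": shared state that only grows.
type LVarNat = TVar Int

-- Monotonic write: take the join (here, max) of the old and new states.
putLV :: LVarNat -> Int -> STM ()
putLV lv n = modifyTVar' lv (max n)

-- Threshold read: block until the state reaches the threshold, then return
-- the threshold itself rather than the exact current state, so the result
-- does not depend on scheduling.
getLV :: LVarNat -> Int -> STM Int
getLV lv threshold = do
  n <- readTVar lv
  if n >= threshold then pure threshold else retry
```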
LVar-style threshold reads also apply to the setting of convergent replicated data types (CvRDTs), which specify the behavior of eventually consistent replicated objects in a distributed system. In follow-on work, I extended the CvRDT model to support deterministic, strongly consistent threshold queries. The technique generalizes to any lattice, and hence any CvRDT, and allows deterministic observations to be made of replicated objects before the replicas’ states converge.
For more on LVars, aside from the posts in the “LVars” category on this blog, see:
- My 2015 dissertation, Lattice-based Data Structures for Deterministic Parallel and Distributed Programming.
- Our FHPC ‘13, POPL ‘14 (with Aaron Turon and Neel Krishnaswami), and PLDI ‘14 (with Aaron Todd and Sam Tobin-Hochstadt) papers.
- A draft paper on deterministic threshold queries.
- LVish, our Haskell library for parallel programming with LVars (with Aaron Turon and other contributors) (also on GitHub).
- My PLT Redex models of LVar calculi.
River Trail: parallel programming in JavaScript
In 2014, I joined Intel Labs and began contributing to River Trail, a library, JIT compiler, and Firefox extension to enable data-parallel programming in JavaScript. The River Trail library provides a ParallelArray array type and a variety of potentially-parallel operations on it, like map, reduce, and filter; it works in conjunction with the Firefox extension, which enables programs written using the library to run in parallel, via OpenCL. Together with my colleagues at Intel, I rewrote the River Trail Firefox extension in JavaScript, replacing the old binary XPCOM extension. This was my first major JavaScript project, and I learned a lot about browser extensions.
For more on River Trail, see the tutorial, as well as the OOPSLA 2013 paper by Stephan Herhut, Richard Hudson, Tatiana Shpeisman, and Jaswanth Sreeram.
Various contributions to the Rust programming language
Rust is a systems programming language, originally developed at Mozilla Research, with the goals of safety, concurrency, and speed. During the summers of 2011 and 2012, I worked on the Rust team at Mozilla, making various contributions to the self-hosted compiler. Rust remains under active development by a team of paid and volunteer contributors; follow the progress on GitHub!
During summer 2011, I added self-dispatch, object extension, and method overriding to the prototype-based object system that Rust had at the time. Prototype-based object systems are usually only found in dynamic languages such as Self and JavaScript, and adding these features to Rust posed an unusual challenge, since dynamic object extension and method overriding cause the type of the current object to change at run-time. I gave a talk about this project to my research group in September 2011. (Although the prototype-based object system was scrapped soon afterward, a vestige of my work lives on: the self keyword in Rust!)
During summer 2012, I made various contributions to Rust, including implementing integer-literal suffix inference and some work on extending the Rust trait system to support Haskell-like default methods. I gave a talk about the latter project at Mozilla in August 2012. (When I started this work, traits were still called ifaces; I changed the keyword to trait!)
Parametric polymorphism through run-time sealing
Languages with parametric polymorphism provide the compile-time guarantee that programs behave uniformly regardless of the types they are instantiated with. In such languages, this parametricity guarantee forms the basis of data abstraction and makes possible Wadler’s “theorems for free!” Ordinarily, embedding untyped code fragments in a statically typed language would break parametricity. However, it’s possible to preserve parametricity in such a setting by the use of dynamic sealing, which enforces data abstraction by using uniquely generated keys to seal values that should be kept opaque.
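As a small illustration of the sealing mechanism itself (a Haskell toy, not the multi-language formalism from the paper), a sealed value can be unsealed only by whoever holds the freshly generated key it was sealed under:

```haskell
import Data.Unique (Unique, newUnique)

-- A value sealed under a key.
data Sealed a = Sealed Unique a

-- Keys are freshly generated and support only equality comparison.
newKey :: IO Unique
newKey = newUnique

seal :: Unique -> a -> Sealed a
seal = Sealed

-- Unsealing succeeds only with the key the value was sealed under.
unseal :: Unique -> Sealed a -> Maybe a
unseal k (Sealed k' v)
  | k == k'   = Just v
  | otherwise = Nothing
```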
I worked with Amal Ahmed and Jacob Matthews on an update to their previous work on proving parametricity for a multi-language system that combines System F with a Scheme-like language. Our updated proof employs a cross-language, step-indexed logical relation that uses possible worlds to model the semantics of key generation. Using this possible-worlds semantic model, we can show two results: first, that System F’s parametricity guarantee is preserved when interoperating with Scheme, and, second, that parametricity is preserved in a Scheme-only setting that interacts with System F only to implement a polymorphic contract system. I gave a talk about this project at the Northeastern University Programming Languages Seminar in February 2011.
Testing abstract representations of CPU semantics
Junghee Lim and Thomas Reps’ Transformer Specification Language (TSL) system comprises a domain-specific programming language for specifying the semantics of a processor at the level of the instruction set architecture, together with a run-time system for generating static analysis engines for executables compiled for that instruction set. TSL’s ability to generate static analysis engines from ISA specifications makes it a powerful tool for analyzing executables, but it’s of limited value if the ISA specifications are wrong in the first place. We are therefore interested in verifying the correctness of TSL’s ISA specifications.
During summer 2010, I worked with David Melski and the research team at GrammaTech to implement a prototype system for testing TSL specifications. Given a CPU emulator generated from a TSL spec, our system attaches the emulator to a Linux process running on a physical CPU, initializes the emulator’s state from that of the process, and compares the states of the emulated and physical CPUs after the execution of each instruction. If the states differ, that suggests a possible bug in the specification from which the emulator was generated. Our emulator-testing framework is flexible enough that it could be extended to support testing of abstract emulators that have an abstract interpreter at their core, in addition to the concrete CPU emulators it supports now. This work was done as part of the Safety in Numbers research project and appears in the project report; I also gave a talk about my work on the project to my research group in October 2010.
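In outline, the comparison loop looks something like this (a simplified sketch for exposition; the real system drove a Linux process and a TSL-generated emulator rather than these made-up step functions):

```haskell
-- An abstract machine state; the real system compared registers, flags,
-- and relevant memory.
data CpuState = CpuState { regs :: [Integer], pc :: Integer }
  deriving (Eq, Show)

-- Run the emulator and the physical CPU in lockstep for n instructions,
-- returning the first pair of states at which they diverge (a likely spec bug).
lockstep
  :: Monad m
  => (CpuState -> m CpuState)   -- step the generated emulator
  -> (CpuState -> m CpuState)   -- step the process on the physical CPU
  -> CpuState -> Int -> m (Maybe (CpuState, CpuState))
lockstep _ _ _ 0 = pure Nothing
lockstep stepEmu stepReal s n = do
  sEmu  <- stepEmu s
  sReal <- stepReal s
  if sEmu /= sReal
    then pure (Just (sEmu, sReal))
    else lockstep stepEmu stepReal sReal (n - 1)
```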
A pattern matcher for miniKanren
miniKanren is a declarative logic programming system implemented in a pure functional subset of Scheme. In 2009, I worked with Andy Keep, Michael Adams, Will Byrd, and Dan Friedman on implementing a pattern matcher for miniKanren. Our self-imposed challenge was to implement the pattern matcher using Scheme’s syntax-rules macro system, as opposed to a more powerful macro system such as syntax-case. Our pattern matcher implementation uses CPS macros to control the order in which macro expansion takes place, delaying expansion of a macro that generates variable bindings until after other macro expansions have happened. The ability to control the order in which our macros expand allows us to generate more concise code. However, we discovered a situation in which delayed expansion caused incorrect code to be generated, leaving variables unbound that should have been bound. The problem can be worked around either by using syntax-case, or by using syntax-rules but not (as much) CPS. Our paper on this project appeared at the 2009 Scheme Workshop, giving me an upper bound of 4 on my Erdős number.