What do people mean when they say “transpiler”?
Around 2013 or so, it started to become fashionable to use the word “transpiler” for certain kinds of compilers. When people say “transpiler”, what kind of compiler do they mean?
Here’s a longer excerpt from the Wikipedia “source-to-source compiler” page (to which the “transpiler” page redirects):
A source-to-source compiler, transcompiler or transpiler is a type of compiler that takes the source code of a program written in one programming language as its input and produces the equivalent source code in another programming language. A source-to-source compiler translates between programming languages that operate at approximately the same level of abstraction, while a traditional compiler translates from a higher level programming language to a lower level programming language.
Interestingly, this definition says that “transpiler” and “source-to-source compiler” are synonyms: it says that they both mean a compiler with input and target languages that are approximately the same level of abstraction. But a “source-to-source compiler” sounds like it ought to be something more specific than that. It sounds like a compiler for which the input and target languages not only operate at similar levels of abstraction, but also both operate at high levels of abstraction (because a language that we’d call a “source” language is presumably high-level). This definition coincides with “translates between programming languages that operate at approximately the same level of abstraction” only if we assume that all compilers have a high-level input language.2
So, we’ve got several overlapping-but-different definitions for “transpiler”:
- compiler that translates between languages that operate at similar levels of abstraction (the Wikipedia definition)
- compiler with high-level languages as its input and target languages (which is what I think “source-to-source compiler” sounds like)
- compiler that targets a high-level language (examples include Emscripten, TypeScript, and CoffeeScript)
- compiler that produces readable output in a high-level target language (examples include TypeScript and CoffeeScript, but not Emscripten)
In a recent conversation with a friend who writes compilers, I learned about yet another connotation that “transpiler” might have. Compiling from a high-level input language all the way to assembly removes certain opportunities for language interoperability; if you want to link together components that are written in different input languages, then it can be easier to do that if the input languages’ compilers both target the same intermediate language. (Microsoft’s Common Intermediate Language is an example of an intermediate language that compilers for various input languages target, and that was designed with language interoperability in mind.) It’s useful to have a word for compilers that make a point of making interoperability easy in this way, and for my friend, the word “transpiler” serves that purpose. So we have yet another overlapping-but-not-quite-coinciding definition:
- compiler that targets a relatively high-level language for the purpose of facilitating interoperability between different input languages
I don’t think many people use “transpiler” to mean this, but at least this one language implementer that I know does!
By now, I hope it’s clear that “transpiler” means different things to different people, and that the definitions are fuzzy and have weird edge cases. Furthermore, even if two people agree that “transpiler” means, say, “compiler that targets a high-level language” or “compiler with input and target languages at similar levels of abstraction”, there’s still room for misunderstanding: “high-level” and “low-level” are relative, and whether or not any given two languages operate at “similar levels of abstraction” isn’t necessarily a cut-and-dried matter. There isn’t a total ordering on languages by level of abstraction. Language constructs that operate at different levels of abstraction can coexist in the same language, making it simultaneously “higher-level” and “lower-level” than another language that has only language constructs that are somewhere in between.
When I wrote the first, unpublished draft of this post a few years ago, it was called “Stop saying ‘transpiler’”. I decided not to call it that, because I’m trying to lay off the prescriptivism (and I have a policy against writing blog posts with “stop doing X” titles now, anyway); I understand that people are using “transpiler” because, for them, it describes something that it’s convenient to have a word for. I do think, though, that before saying “transpiler”, it’s useful to think about what exactly one means by it, and whether the people one is communicating with are going to agree with that meaning. For me, any convenience that might be gained by saying “transpiler” is rarely worth the potential for confusion.
Compiling from one assembly language to another assembly language is sometimes called binary translation. ↩