Build Automation and Programming Languages

March 02, 2020

One of my long-time side interests has been software building and deployment. Not primarily out of (scientific) curiosity, but rather out of the pragmatic need to make the tools I built available to others.

When I first worked on creating a distribution of the Stratego/XT transformation tool suite in the early 2000s, I learned that creating a distribution that is usable by others is quite different from writing a piece of code that I could use myself.

Building a portable C code base required a tower of hacks including make, automake, and configure. And then making binaries that worked on a variety of platforms required things like RPMs and Debs. While I investigated these tools to make things work, I was not exactly enthused. I had the distinct impression that it should be possible to provide better and more reliable abstractions to specify the construction of a software system from a bunch of code.

I learned to call this area software deployment, and it became the topic of Eelco Dolstra’s PhD thesis work on the Nix software deployment system, and of his postdoc work on the NixOS Linux distribution.

Nix makes software builds and deployments reproducible by turning a software build into a pure function whose dependencies are all explicit arguments. Thus, it is impossible to forget a dependency, since an undeclared dependency is simply not available at build time or at run time. Furthermore, a software build should not update arbitrary areas of a user’s hard disk. My take on this was reflected in the title of our ICSE 2004 paper: Imposing a Memory Management Discipline on Software Deployment. Just like an operating system imposes a discipline on the memory used by software, a software deployment system should impose a discipline on the extended memory hierarchy.
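To make the "build as a pure function" idea concrete, here is a minimal sketch in Python. It is a toy model and not Nix's actual hashing scheme or store layout: the store_path function, the /store prefix, and the builder script names are all made up for illustration. The point is only that the output location is computed from the complete description of the build, so the same inputs always yield the same location and any change to a dependency yields a different one.

```python
import hashlib
import json


def store_path(name, builder, deps):
    """Derive an output location from everything the build depends on.

    Hypothetical helper: `deps` maps a dependency name to the store path
    of that dependency, so a change anywhere upstream changes the hash.
    """
    description = json.dumps(
        {"name": name, "builder": builder, "deps": deps}, sort_keys=True
    )
    digest = hashlib.sha256(description.encode("utf-8")).hexdigest()[:16]
    return f"/store/{digest}-{name}"


# The same description always maps to the same location.
libc = store_path("libc", "build-libc.sh", {})
hello_v1 = store_path("hello", "build-hello.sh", {"libc": libc})
hello_again = store_path("hello", "build-hello.sh", {"libc": libc})

# Changing a dependency changes the location of everything built on top of it.
patched_libc = store_path("libc", "build-libc-patched.sh", {})
hello_v2 = store_path("hello", "build-hello.sh", {"libc": patched_libc})
```

Because the two variants of hello end up in different locations, they can coexist, and installing one never overwrites files belonging to the other.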

After the work on Stratego/XT, I abstracted from software builds. Sadly, that was abstraction by delegation to others, not because the problem had been solved by proper abstraction and automation.

In the Spoofax language workbench project we work on high-level declarative meta-languages for various aspects of the definition of programming languages. Using several meta-languages, a language designer can specify a language and generate a range of tools from such a specification. While we invested much research in developing useful language definition abstractions, the tooling around these meta-languages was glued together by a bunch of (relatively low-level) code and build systems. At some point at least five different build systems/languages were in play. While these are somewhat better behaved than the make/automake/configure stack, the result is still a loosely coupled combination of scripts and code.

Currently, we are working on Spoofax3, the next generation of Spoofax. The key difference from the current Spoofax2 (aka Spoofax-Core) is the use of a build system as the glue between all components of the workbench. The PIE build system is a successor of the pluto build system and provides a scalable algorithm for precise builds with dynamic task dependencies. This should enable fast incremental builds for all kinds of pipelines in a language workbench, from micro pipelines for updating the syntax highlighting in an editor, to compiler pipelines for languages built with the workbench, through building a language project in the workbench, to macro pipelines for bootstrapping the entire workbench.
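To illustrate what dynamic task dependencies mean, here is a small Python sketch of an incremental build. It is not PIE's algorithm or API: the Build and Context classes, the task names, and the in-memory dictionary standing in for the file system are all assumptions made for this example, and the naive scheme does no cycle detection. Tasks record the files they read and the tasks they require while they run, and a task is re-executed only when one of its recorded dependencies has changed.

```python
class Context:
    """Handed to a running task; records dependencies as they are discovered."""

    def __init__(self, build):
        self.build = build
        self.deps = []  # ordered list of (kind, key, observed value)

    def read(self, path):
        content = self.build.files.get(path)
        self.deps.append(("file", path, content))
        return content

    def require(self, name):
        result = self.build.require(name)
        self.deps.append(("task", name, result))
        return result


class Build:
    def __init__(self, files, tasks):
        self.files = files    # in-memory stand-in for the file system
        self.tasks = tasks    # task name -> function taking a Context
        self.cache = {}       # task name -> (result, recorded deps)
        self.executed = []    # log of tasks that actually ran

    def require(self, name):
        if name in self.cache and self._up_to_date(name):
            return self.cache[name][0]
        ctx = Context(self)
        self.executed.append(name)
        result = self.tasks[name](ctx)
        self.cache[name] = (result, ctx.deps)
        return result

    def _up_to_date(self, name):
        # Check in recording order: later dependencies may only exist
        # because of the values that earlier ones produced.
        for kind, key, old in self.cache[name][1]:
            new = self.files.get(key) if kind == "file" else self.require(key)
            if new != old:
                return False
        return True


def pick_source(ctx):
    return ctx.read("config.txt").strip()


def compile_main(ctx):
    source = ctx.require("pick_source")  # which file to read is known only now
    return ctx.read(source).upper()      # stand-in for real compilation


files = {"config.txt": "a.src\n", "a.src": "hello", "b.src": "world"}
build = Build(files, {"pick_source": pick_source, "compile": compile_main})

first = build.require("compile")
ran_first = list(build.executed)

build.executed.clear()
files["b.src"] = "changed"               # not a recorded dependency
second = build.require("compile")
ran_second = list(build.executed)

build.executed.clear()
files["config.txt"] = "b.src\n"          # now b.src becomes a dependency
third = build.require("compile")
ran_third = list(build.executed)
```

In this run, editing b.src while the configuration still points at a.src triggers no work at all, whereas editing config.txt re-runs both tasks and makes b.src a dependency of the compile task. A static dependency graph could not express this, since the set of files the compile task reads depends on the result of another task.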

Ultimately, I believe that realizing more reliable and efficient build automation requires integration with programming languages: The compiler is the build system! We have explored this recently in a project turning the whole-program Stratego compiler into an incremental compiler. Much more needs to be done before we can produce build-system-integrated languages as easily as we produce parsers from syntax definitions.

Together with Andrey Mokhov, I am organizing a workshop at PLDI 2020 on Build Automation and Programming Languages to explore this topic. The goal of this workshop is to bring together build automation experts and language designers and implementers to explore the interaction of build automation and programming languages in systems for incremental analysis, building, testing, packaging, and deployment of software. Deadline for submissions of extended abstracts (2-4 pages) is March 15, 2020. Consider submitting and/or attending the workshop!