Fast and Safe Linguistic Abstraction for the Masses

September 29, 2019

In September 2018, Marieke Huisman and I organized a workshop in the Lorentz Center in Leiden titled “A Research Agenda for Formal Methods in the Netherlands”. The extended abstracts contributed to the workshop were bundled by Wouter Swierstra in a Utrecht University Technical Report. Below is my contribution.

Fast and Safe Linguistic Abstraction for the Masses

Abstract

Language workbenches support the high-level definition of (domain-specific) programming languages and the automatic derivation of implementations from such definitions. The mission of language workbench research is to increase the level of abstraction of language definitions and expand the range of tools that can be generated automatically from language definitions. In this note, I give an overview of research into language workbenches at TU Delft and the perspective of future research.

Linguistic Abstraction

Software engineering is the act of encoding intent into programs. A mismatch between the concepts being encoded and the intent to be expressed leads to unnecessary software complexity. Linguistic abstraction is the process of turning software design patterns into language constructs in order to support expressing intent at the right level of abstraction [15]. A linguistic abstraction provides notation, static checking, and implementation strategies that take the understanding of the domain into account. For example, in the IceDust language for data modeling, a programmer directly expresses how derived values are computed from base values, without being concerned about implementation techniques for caching computed values and incrementally recomputing values upon data changes; an implementation of an efficient incremental re-computation strategy is generated automatically from an IceDust program [3–6]. While many DSLs exist, the process of developing DSLs should become more systematic and requires further industrial case studies.

Linguistic Abstraction for the Masses

A domain-specific programming language can be beneficial for its programmers, but can be expensive to develop. Language workbenches are tools to reduce the cost of DSL development by generating language implementation from language definitions [1, 2]. Spoofax is a language workbench developed in the Programming Languages group at TU Delft [7, 16]. Spoofax has been applied to DSLs in industry (Oracle, Océ) and academia. Linguistic Abstraction for Linguistic Abstraction Rather than implememting languages in a general purpose language, language workbenches provide domain-specific abstractions for language development, i.e. linguistic abstractions for linguistic abstraction. Spoofax is a test bed for the exploration of the design of high-level declarative meta-languages for various aspects of language definition. A recent example is the development of the Statix DSL for the specification of type systems [11, 12]. The DSL is based on a novel approach to the formalization of name binding in programming languages using scope graphs [8]. A scope graph is an abstraction of a program that represents its binding facts. This abstraction allows the definition of a language independent calculus to define name resolution in programs, giving rise to a range of language independent tools.

Fast Linguistic Abstraction

Traditionally, language workbenches define code generators to realize the implementation of DSLs programs. The disadvantage of this approach is that the semantics of the language is defined through a translation relation in terms of a target language. We are investigating whether it is possible to define directly the semantics of languages and derive an implementation from such definitions. The DynSem meta-language is designed for the high-level declarative specification of the operational semantics of programming languages [13]. DynSem specifications can be executed directly using a meta-interpreter that interprets a DynSem specification with respect to an object program, which gives rise to two levels of interpretation and significant overhead. A promising direction of research is to use partial evaluation techniques to specialize the application of a meta-interpreter [14] to provide both a fast turn around time for language design experiments and at the same time get a fast run time for the language under specification.

Safe Linguistic Abstraction

In addition to producing implementations, the proper design of a programming language requires proofs that a design satisfies properties such as type soundness and semantics preservation of transformations. Traditionally, proving such properties is separated from language implementation, leading to proofs for only subsets of a language and a divergence of specification and implementation. We are investigating meta languages that enable the automatic verification of such properties. A first proof of concept automates the verification of type soundness. A complication in type soundness proofs is the alignment of the binding of names statically and at run time. We have developed a systematic approach to formalizing this alignment by relating scopes in scope graphs to frames in frame heaps [9]. By encoding a scope graph based type system as an intrinsically typed abstract syntax signature, type checking the evaluation rules of a definition interpreter against such a signature entails type soundness of the language under definition [10].

Future Work

In this abstract I have described several proof of concept components of a development environment for creating new (domain-specific) programming languages from declarative specifications that are safe by construction and that have fast runtimes. Fully realizing the promise of these proofs of concept requires scaling up the techniques to more advanced programming languages and large code bases.

References

[1] Sebastian Erdweg, Tijs van der Storm, Markus Völter, Meinte Boersma, Remi Bosman, William R. Cook, Albert Gerritsen, Angelo Hulshout, Steven Kelly, Alex Loh, Gabriël Konat, Pedro J. Molina, Martin Palatnik, Risto Pohjonen, Eugen Schindler, Klemens Schindler, Riccardo Solmi, Vlad A. Vergu, Eelco Visser, Kevin van der Vlist, Guido Wachsmuth, and Jimi van der Woning. The state of the art in language workbenches

  • conclusions from the language workbench challenge. In Martin Erwig, Richard F. Paige, and Eric Van Wyk, editors, Software Language Engineering - 6th International Conference, SLE 2013, Indianapolis, IN, USA, October 26-28, 2013. Proceedings, volume 8225 of Lecture Notes in Computer Science, pages 197–217. Springer, 2013.

[2] Sebastian Erdweg, Tijs van der Storm, Markus VÃűlter, Laurence Tratt, Remi Bosman, William R. Cook, Albert Gerritsen, Angelo Hulshout, Steven Kelly, Alex Loh, Gabriël Konat, Pedro J. Molina, Martin Palatnik, Risto Pohjonen, Eugen Schindler, Klemens Schindler, Riccardo Solmi, Vlad A. Vergu, Eelco Visser, Kevin van der Vlist, Guido Wachsmuth, and Jimi van der Woning. Evaluating and comparing language workbenches: Existing results and benchmarks for the future. Computer Languages, Systems & Structures, 44:24–47, 2015.

[3] Daco Harkes, Danny M. Groenewegen, and Eelco Visser. Icedust: Incremental and eventual computation of derived values in persistent object graphs. In Shriram Krishnamurthi and Benjamin S. Lerner, editors, 30th European Conference on Object-Oriented Programming, ECOOP 2016, July 18-22, 2016, Rome, Italy, volume 56 of LIPIcs. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, 2016.

[4] Daco Harkes and Eelco Visser. Unifying and generalizing relations in role-based data modeling and navigation. In Benoît Combemale, David J. Pearce, Olivier Barais, and Jurgen J. Vinju, editors, Software Language Engineering - 7th International Conference, SLE 2014, Västeras, Sweden, September 15-16, 2014. Proceedings, volume 8706 of Lecture Notes in Computer Science, pages 241–260. Springer, 2014.

[5] Daco Harkes and Eelco Visser. Icedust 2: Derived bidirectional relations and calculation strategy composition. In Peter Müller, editor, 31st European Conference on Object-Oriented Programming, ECOOP 2017, June 19-23, 2017, Barcelona, Spain, volume 74 of LIPIcs. Schloss Dagstuhl

  • Leibniz-Zentrum fuer Informatik, 2017.

[6] Daco C. Harkes, Elmer van Chastelet, and Eelco Visser. Migrating business logic to an incremental computing dsl: a case study. In David Pearce 0005, Tanja Mayerhofer, and Friedrich Steimann, editors, Proceedings of the 11th ACM SIGPLAN International Conference on Software Language Engineering, SLE 2018, Boston, MA, USA, November 05-06, 2018, pages 83–96. ACM, 2018.

[7] Lennart C. L. Kats and Eelco Visser. The Spoofax language workbench: rules for declarative specification of languages and IDEs. In William R. Cook, SiobhÃąn Clarke, and Martin C. Rinard, editors, Proceedings of the 25th Annual ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications, OOPSLA 2010, pages 444–463, Reno/Tahoe, Nevada, 2010. ACM.

[8] Pierre NÃľron, Andrew P. Tolmach, Eelco Visser, and Guido Wachsmuth. A theory of name resolution. In Jan Vitek, editor, Programming Languages and Systems - 24th European Symposium on Programming, ESOP 2015, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2015, London, UK, April 11-18,

  1. Proceedings, volume 9032 of Lecture Notes in Computer Science, pages 205–231. Springer, 2015.

[9] Casper Bach Poulsen, Pierre Néron, Andrew P. Tolmach, and Eelco Visser. Scopes describe frames: A uniform model for memory layout in dynamic semantics. In Shriram Krishnamurthi and Benjamin S. Lerner, editors, 30th European Conference on Object-Oriented Programming, ECOOP 2016, July 18-22, 2016, Rome, Italy, volume 56 of LIPIcs. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, 2016.

[10] Casper Bach Poulsen, Arjen Rouvoet, Andrew P. Tolmach, Robbert Krebbers, and Eelco Visser. Intrinsically-typed definitional interpreters for imperative languages. Proceedings of the ACM on Programming Languages, 2(POPL), 2018.

[11] Hendrik van Antwerpen, Pierre Néron, Andrew P. Tolmach, Eelco Visser, and Guido Wachsmuth. A constraint language for static semantic analysis based on scope graphs. In Martin Erwig and Tiark Rompf, editors, Proceedings of the 2016 ACM SIGPLAN Workshop on Partial Evaluation and Program Manipulation, PEPM 2016, St. Petersburg, FL, USA, January 20 - 22, 2016, pages 49–60. ACM, 2016.

[12] Hendrik van Antwerpen, Casper Bach Poulsen, Arjen Rouvoet, and Eelco Visser. Scopes as types. Proceedings of the ACM on Programming Languages, 2(OOPSLA), 2018.

[13] Vlad A. Vergu, Pierre Néron, and Eelco Visser. Dynsem: A dsl for dynamic semantics specification. In Maribel FernÃąndez, editor, 26th International Conference on Rewriting Techniques and Applications, RTA 2015, June 29 to July 1, 2015, Warsaw, Poland, volume 36 of LIPIcs, pages 365–378. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, 2015.

[14] Vlad A. Vergu and Eelco Visser. Specializing a meta-interpreter: Jit compilation of Dynsem specifications on the Graal vm. In Eli Tilevich and Hanspeter MÃűssenbÃűck, editors, Proceedings of the 15th International Conference on Managed Languages & Runtimes, ManLang 2018, Linz, Austria, September 12-14, 2018. ACM, 2018.

[15] Eelco Visser. Understanding software through linguistic abstraction. Science of Computer Programming, 97:11–16, 2015.

[16] Guido Wachsmuth, Gabriël Konat, and Eelco Visser. Language design with the spoofax language workbench. IEEE Software, 31(5):35–43, 2014.