At PEPM 2016 we presented the design of a constraint language for type checking based on scope graphs. That design was the basis for the NaBL2 name and type analysis language that is currently integrated into the Spoofax language workbench. While the approach deals elegantly with a large variety of name binding patterns, including type-dependent name resolution, it was limited to `simple’ type systems. In particular, NaBL2 does not support type systems with structural (anonymous) types and generic (parameterized) types. In our OOPSLA 2018 paper we show that these limitations are not intrinsic to the scope graph model. We show that by using scopes as types, we can model such types. The picture shows the scope graph for a program with structural record types.

To be presented next week at SPLASH 2018 in Boston.

Hendrik van Antwerpen, Casper Bach Poulsen, Arjen Rouvoet, Eelco Visser. Scopes as Types. Proceedings of the ACM on Programming Languages, Volume 2 Issue OOPSLA, November 2018, Article No. 114. 10.1145/3276484

Abstract: Scope graphs are a promising generic framework for modeling the binding structures of programming languages, briding formalization and implementation, supporting the definition of type checkers and the automation of type safety proofs. However, previous work on scope graphs has been limited to simple, nominal type systems. In this paper, we show that viewing scopes as types enables us to model the internal structure of types in a range of non-simple type systems (including structural records and generic classes) using the generic representation of scopes. Further, we show that relations between such types can be expressed in terms of generalized scope graph queries. We extend scope graphs with scoped relations and queries. We introduce Statix, a new domain-specific meta-language for the specification of static semantics, based on scope graphs and constraints. We evaluate the scopes as types approach and the Statix design in case studies of STLCREC, System F, and Featherweight Generic Java.

Following up on our paper about the design of the PIE build DSL, a paper at the ASE 2018 conference, to be presented by Gabriël Konat next week, deals with the poor scalability of the Pluto algorithm we used for the implementation of PIE. Since the algorithm inspects the entire dependency graph, even if few nodes are affected by a change, there is a large overhead that becomes significant in large projects. This paper extends the Pluto algorithm with a bottom-up traversal of the dependency graph that only visits nodes affected by a change, dramatically reducing build times in most scenarios.

Gabriël Konat, Sebastian Erdweg, and Eelco Visser. 2018. Scalable incremental building with dynamic task dependencies. In Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering (ASE 2018). ACM, New York, NY, USA, 76-86. DOI: https://doi.org/10.1145/3238147.3238196

Abstract:

Incremental build systems are essential for fast, reproducible software builds. Incremental build systems enable short feedback cycles when they capture dependencies precisely and selectively execute build tasks efficiently. A much overlooked feature of build systems is the expressiveness of the scripting language, which directly influences the maintainability of build scripts. In this paper, we present a new incremental build algorithm that allows build engineers to use a full-fledged programming language with explicit task invocation, value and file inspection facilities, and conditional and iterative language constructs. In contrast to prior work on incrementality for such programmable builds, our algorithm scales with the number of tasks affected by a change and is independent of the size of the software project being built. Specifically, our algorithm accepts a set of changed files, transitively detects and re-executes affected build tasks, but also accounts for new task dependencies discovered during building. We have evaluated the performance of our algorithm in a real-world case study and confirm its scalability.

Programming language design requires many iterations to get right. Building the tools to use and evaluate a design is an important cost factor. In this talk I discuss and demonstrate the Spoofax language workbench, its high-level declarative meta-languages for the definition of syntax and semantics of programming languages, and the live generation of programming environments from those definitions.

PIE is a new DSL we are developing to model all sorts of build process in the Spoofax language workbench. A paper about the design of the DSL was published in the Programming journal and presented last week at the Programming 2018 conference:

Gabriël Konat, Michael J. Steindorfer, Sebastian Erdweg, and Eelco Visser. PIE: A Domain-Specific Language for Interactive Software Development Pipelines. The Art, Science, and Engineering of Programming, 2018, Vol. 2, Issue 3, Article 9 [doi, PDF]

Abstract

Context. Software development pipelines are used for automating essential parts of software engineering processes, such as build automation and continuous integration testing. In particular, interactive pipelines, which process events in a live environment such as an IDE, require timely results for low-latency feedback, and persistence to retain low-latency feedback between restarts.

Inquiry. Developing an incrementalized and persistent version of a pipeline is one way to reduce feedback latency, but requires implementation of dependency tracking, cache invalidation, and other complicated and error-prone techniques. Therefore, interactivity complicates pipeline development if timeliness and persistence become responsibilities of the pipeline programmer, rather than being supported by the underlying system. Systems for programming incremental and persistent pipelines exist, but do not focus on ease of development, requiring a high degree of boilerplate, increasing development and maintenance effort.

Approach. We develop Pipelines for Interactive Environments (PIE), a Domain-Specific Language (DSL), API, and runtime for developing interactive software development pipelines, where ease of development is a focus. The PIE DSL is a statically typed and lexically scoped language. PIE programs are compiled to programs implementing the API, which the PIE runtime executes in an incremental and persistent way.

Knowledge. PIE provides a straightforward programming model that enables direct and concise expression of pipelines without boilerplate, reducing the development and maintenance effort of pipelines. Compiled pipeline programs can be embedded into interactive environments such as code editors and IDEs, enabling timely feedback at a low cost.

Grounding. Compared to the state of the art, PIE reduces the code required to express an interactive pipeline by a factor of 6 in a case study on syntax-aware editors. Furthermore, we evaluate PIE in two case studies of complex interactive software development scenarios, demonstrating that PIE can handle complex interactive pipelines in a straightforward and concise way.

Importance. Interactive pipelines are complicated software artifacts that power many important systems such as continuous feedback cycles in IDEs and code editors, and live language development in language workbenches. New pipelines, and evolution of existing pipelines, is frequently necessary. Therefore, a system for easily developing and maintaining interactive pipelines, such as PIE, is important.

Last week, Eduardo presented our paper on fast parsing for grammars with deep conflicts at the Programming 2018 conference:

Luís Eduardo de Souza Amorim, Michael J. Steindorfer, and Eelco Visser. Towards Zero-Overhead Disambiguation of Deep Priority Conflicts. The Art, Science, and Engineering of Programming, 2018, Vol. 2, Issue 3, Article 13. [doi, PDF]

Abstract

Context Context-free grammars are widely used for language prototyping and implementation. They allow formalizing the syntax of domain-specific or general-purpose programming languages concisely and declaratively. However, the natural and concise way of writing a context-free grammar is often ambiguous. Therefore, grammar formalisms support extensions in the form of declarative disambiguation rules to specify operator precedence and associativity, solving ambiguities that are caused by the subset of the grammar that corresponds to expressions.

Inquiry Implementing support for declarative disambiguation within a parser typically comes with one or more of the following limitations in practice: a lack of parsing performance, or a lack of modularity (i.e., disallowing the composition of grammar fragments of potentially different languages). The latter subject is generally addressed by scannerless generalized parsers. We aim to equip scannerless generalized parsers with novel disambiguation methods that are inherently performant, without compromising the concerns of modularity and language composition.

Approach In this paper, we present a novel low-overhead implementation technique for disambiguating deep associativity and priority conflicts in scannerless generalized parsers with lightweight data-dependency.

Knowledge Ambiguities with respect to operator precedence and associativity arise from combining the various operators of a language. While shallow conflicts can be resolved efficiently by one-level tree patterns, deep conflicts require more elaborate techniques, because they can occur arbitrarily nested in a tree. Current state-of-the-art approaches to solving deep priority conflicts come with a severe performance overhead.

Grounding We evaluated our new approach against state-of-the-art declarative disambiguation mechanisms. By parsing a corpus of popular open-source repositories written in Java and OCaml, we found that our approach yields speedups of up to 1.73x over a grammar rewriting technique when parsing programs with deep priority conflicts—with a modest overhead of 1–2 % when parsing programs without deep conflicts.

Importance A recent empirical study shows that deep priority conflicts are indeed wide-spread in real-world programs. The study shows that in a corpus of popular OCaml projects on Github, up to 17 % of the source files contain deep priority conflicts. However, there is no solution in the literature that addresses efficient disambiguation of deep priority conflicts, with support for modular and composable syntax definitions.