Research challenge for 2009: trust.
As mentionted before, we've been doing some real parsing research to better support parsers for extensible languages. Parse table composition provides separate compilation for syntax components such that syntax extensions can be provided as plugins to a compiler for a base language. Due to various distractions last Summer I seem to have forgotten to blog about the paper that Martin Bravenboer and I got accepted at the first international conference on Software Language Engineering (which Martin was looking forward too).
M. Bravenboer and E. Visser. Parse Table Composition. Separate Compilation and Binary Extensibility of Grammars. In D. Gasevic and E. van Wyk, editors, First International Conference on Software Language Engineering (SLE 2008). To appear in Lecture Notes in Computer Science, Heidelberg, 2009. Springer. [pdf]
Abstract: Module systems, separate compilation, deployment of binary components, and dynamic linking have enjoyed wide acceptance in programming languages and systems. In contrast, the syntax of languages is usually defined in a non-modular way, cannot be compiled separately, cannot easily be combined with the syntax of other languages, and cannot be deployed as a component for later composition. Grammar formalisms that do support modules use whole program compilation.
Current extensible compilers focus on source-level extensibility, which requires users to compile the compiler with a specific configuration of extensions. A compound parser needs to be generated for every combination of extensions. The generation of parse tables is expensive, which is a particular problem when the composition configuration is not fixed to enable users to choose language extensions.
In this paper we introduce an algorithm for parse table composition to support separate compilation of grammars to parse table components. Parse table components can be composed (linked) efficiently at runtime, i.e. just before parsing. While the worst-case time complexity of parse table composition is exponential (like the complexity of parse table generation itself), for realistic language combination scenarios involving grammars for real languages, our parse table composition algorithm is an order of magnitude faster than computation of the parse table for the combined grammars.
The experimental parser generator is available online.
Last Summer I attended the Code Generation 2008 conference in Cambridge to give a tutorial on WebDSL, as case study in domain-specific language engineering. The conference was an interesting change from the usual academic conferences I visit, in that the majority of the audience were from industry. It was good to see the interest in code generation in industry, but also disconcerting to observe the gap between academic research and industrial practice; but more about that some other time.
It was a long time ago (1997) that I defended my PhD thesis, which was mostly about syntax definition and parsing. In particular, I introduced SDF2, which radically integrates lexical and context-free syntax, and the SGLR parsing algorithm for parsing arbitrary 'character-level' context-free grammars. Since finishing my thesis I have done quite a bit of `applied parsing research', using SDF and SGLR for applications such as meta-programming with concrete object syntax and DSL embedding, but I don't consider myself a hard-core parsing researcher any more. So I had to dig deep in my memory to talk about Noam Chomsky's language hierarchy, grammars as string rewrite systems, and parsing algorithms. I find the result a bit awkward to listen to, but people assure me that is because it is my own voice I'm listening too.
In the meantime my relation to parsing is changing again. While SDF/SGLR still provides the best approach to declarative definition of composite languages (in my opinion at least), it has some fundamental limitations which have never been addressed. A first step in addressing these limitations was taken in the SLE 2008 paper with Martin Bravenboer on parse table composition (see upcoming blog) to provide separate compilation for grammars. With a new PhD student starting in the new year, I hope to address other limitations such as the lack of error recovery.
The paper "Decorated Attribute Grammars" by Lennart Kats, Tony Sloane and Eelco Visser has been accepted for presentation at the International Conference on Compiler Construction (CC 2009) to be held in March 2009 in York (UK). [pdf]
Abstract Attribute grammars are a powerful specification formalism for tree-based computation, particularly for software language processing. Various extensions have been proposed to abstract over common patterns in attribute grammar specifications. These include various forms of copy rules to support non-local dependencies, collection attributes, and expressing dependencies that are evaluated to a fixed point. Rather than implementing extensions natively in an attribute evaluator, we propose attribute decorators that describe an abstract evaluation mechanism for attributes, making it possible to provide such extensions as part of a library of decorators. Inspired by strategic programming, they are specified using generic traversal operators. To demonstrate their effectiveness, we describe how to employ decorators in name, type, and flow analysis.
The ideas have been implemented in Aster, an extension of Stratego with reference attribute grammars.