Parse Table Composition

December 19, 2008

As mentionted before, we’ve been doing some real parsing research to better support parsers for extensible languages. Parse table composition provides separate compilation for syntax components such that syntax extensions can be provided as plugins to a compiler for a base language. Due to various distractions last Summer I seem to have forgotten to blog about the paper that Martin Bravenboer and I got accepted at the first international conference on Software Language Engineering (which Martin was looking forward too).

M. Bravenboer and E. Visser. Parse Table Composition. Separate Compilation and Binary Extensibility of Grammars. In D. Gasevic and E. van Wyk, editors, First International Conference on Software Language Engineering (SLE 2008). To appear in Lecture Notes in Computer Science, Heidelberg, 2009. Springer. [pdf]

<img src=”” width=”240” height=”160” alt=”submittted” border=0 align=right />

Abstract: Module systems, separate compilation, deployment of binary components, and dynamic linking have enjoyed wide acceptance in programming languages and systems. In contrast, the syntax of languages is usually defined in a non-modular way, cannot be compiled separately, cannot easily be combined with the syntax of other languages, and cannot be deployed as a component for later composition. Grammar formalisms that do support modules use whole program compilation.

Current extensible compilers focus on source-level extensibility, which requires users to compile the compiler with a specific configuration of extensions. A compound parser needs to be generated for every combination of extensions. The generation of parse tables is expensive, which is a particular problem when the composition configuration is not fixed to enable users to choose language extensions.

In this paper we introduce an algorithm for parse table composition to support separate compilation of grammars to parse table components. Parse table components can be composed (linked) efficiently at runtime, i.e. just before parsing. While the worst-case time complexity of parse table composition is exponential (like the complexity of parse table generation itself), for realistic language combination scenarios involving grammars for real languages, our parse table composition algorithm is an order of magnitude faster than computation of the parse table for the combined grammars.

The experimental parser generator is available online.