Today I gave a mini-tutorial at the 5th International Conference on Software Language Engineering (SLE 2012), attempting to explain grammarware to meta-modeling researchers. Since grammarware is a huge area, I chose to discuss a selection of its memes, illustrated with examples from the Spoofax 'technological space' (as they say). Here are the slides. A recording of the talk may be published as well. Note that the slides do not have much text explaining what they are about; we plan to elaborate the tutorial material in an associated paper for the SLE 2012 post-proceedings.


Yesterday we deployed a new release of researchr. The release fixes a bug that prevented the creation of personal libraries. We have recently also addressed a number of performance bugs that only became manifest at scale. But the most significant change in functionality is the advanced support for faceted search. The previous, custom implementation of faceted search only allowed selecting publications in a publication list by a single facet at a time. Now you can explore the intersection of multiple facet categories and take the union or intersection of multiple facets within a category. Try it out. For example, explore the publications on domain-specific language design and find the publications from 2009 tagged 'grammar', or find the publications in the 'oopsla' venue tagged 'language workbench'.
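To make the combination concrete: facets selected within one category can be combined as a union or an intersection, and the selections for different categories are then intersected. Here is a minimal Java sketch of that semantics, with a made-up Publication record for illustration rather than the actual researchr/WebDSL code:

    import java.util.List;
    import java.util.Set;
    import java.util.function.Predicate;

    // Made-up model for illustration; not the actual researchr schema.
    record Publication(String title, int year, Set<String> tags, String venue) {}

    class FacetDemo {
        // Facets selected within one category (here: tags), combined as union or intersection.
        static Predicate<Publication> tagFacet(Set<String> selected, boolean union) {
            return p -> union
                ? selected.stream().anyMatch(p.tags()::contains)  // any selected tag matches
                : p.tags().containsAll(selected);                 // all selected tags match
        }

        public static void main(String[] args) {
            List<Publication> pubs = List.of(
                new Publication("A", 2009, Set.of("grammar", "parsing"), "sle"),
                new Publication("B", 2009, Set.of("language workbench"), "oopsla"),
                new Publication("C", 2011, Set.of("grammar"), "oopsla"));

            Predicate<Publication> year2009 = p -> p.year() == 2009;            // year category
            Predicate<Publication> grammar = tagFacet(Set.of("grammar"), true); // tag category

            // Selections from different categories are intersected.
            pubs.stream().filter(year2009.and(grammar))
                .forEach(p -> System.out.println(p.title()));                   // prints: A
        }
    }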

The custom implementation of faceted search lacked features, was slow, was not always correct, and took considerable effort to implement. The new faceted search is no longer a custom implementation specific to researchr, but is based on the search DSL for WebDSL that Elmer van Chastelet has been working on for his Master's thesis project. The DSL provides abstractions for search indexing with Apache Lucene and Hibernate Search, and a query language for searching objects. With these features, any WebDSL application can use these powerful libraries with a fraction of the effort that my custom implementation took. A full description of the DSL is under construction.
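For readers who have not used these libraries, the sketch below shows, in plain Java, roughly the kind of plumbing the DSL saves you from writing by hand. It assumes an @Indexed Publication entity with suitably indexed 'tags' and 'year' fields; the entity and field names are illustrative, and this is my approximation of the Hibernate Search query API, not the code WebDSL actually generates.

    import java.util.List;
    import javax.persistence.EntityManager;
    import org.hibernate.search.jpa.FullTextEntityManager;
    import org.hibernate.search.jpa.FullTextQuery;
    import org.hibernate.search.jpa.Search;
    import org.hibernate.search.query.dsl.QueryBuilder;
    import org.hibernate.search.query.facet.Facet;
    import org.hibernate.search.query.facet.FacetingRequest;

    public class PublicationSearch {

        // Full-text query over the indexed 'tags' field, with facet counts per year.
        public static List<?> byTag(EntityManager em, String tag) {
            FullTextEntityManager ftem = Search.getFullTextEntityManager(em);
            QueryBuilder qb = ftem.getSearchFactory()
                .buildQueryBuilder().forEntity(Publication.class).get();

            org.apache.lucene.search.Query query =
                qb.keyword().onField("tags").matching(tag).createQuery();
            FullTextQuery ftq = ftem.createFullTextQuery(query, Publication.class);

            // Facet counts over 'year' drive the drill-down links in the UI.
            FacetingRequest byYear = qb.facet()
                .name("byYear").onField("year").discrete().createFacetingRequest();
            ftq.getFacetManager().enableFaceting(byYear);
            for (Facet f : ftq.getFacetManager().getFacets("byYear")) {
                System.out.println(f.getValue() + " (" + f.getCount() + ")");
            }
            return ftq.getResultList();
        }
    }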

Triggered by the bridge parsing paper that Emma Nilsson-Nyman presented at SLE 2008, we started working in 2009 on error recovery for SGLR parsing in order to make Spoofax editors robust in the presence of syntactic errors. Most editor services, from syntax highlighting to code completion, depend on an abstract syntax tree. Since a program is frequently in a syntactically incorrect state while it is being edited, many editor services would break without parse error recovery.

Because of the parallel, forking nature of GLR parsing, error recovery looked like an impossible problem to solve. We ended up developing an interesting mix of techniques, consisting of permissive grammars, a back-tracking extension of SGLR, and layout-sensitive error region discovery, which together produce good error recovery without intervention from the language designer.
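To give a flavour of the last ingredient: when recovery rules alone cannot repair a fragment, the erroneous region can be over-approximated from the layout, on the assumption that indentation usually mirrors the intended structure, and then skipped. The following is a hypothetical, much-simplified Java illustration of that intuition, not the actual algorithm in our implementation:

    import java.util.List;

    // Hypothetical illustration: select the smallest indentation-delimited region
    // around an error line, so the parser can discard just that region and continue.
    class LayoutRegion {

        static int indent(String line) {
            if (line.isBlank()) return Integer.MAX_VALUE;  // blank lines never bound a region
            int i = 0;
            while (i < line.length() && line.charAt(i) == ' ') i++;
            return i;
        }

        /** Returns {startLine, endLine}, inclusive, of the region to discard. */
        static int[] errorRegion(List<String> lines, int errorLine) {
            int errorIndent = indent(lines.get(errorLine));
            // Walk up to the first line of the construct the error line belongs to.
            int start = errorLine;
            while (start > 0 && indent(lines.get(start - 1)) >= errorIndent) start--;
            // Extend down over all lines indented deeper than that first line.
            int regionIndent = indent(lines.get(start));
            int end = errorLine;
            while (end + 1 < lines.size() && indent(lines.get(end + 1)) > regionIndent) end++;
            return new int[] { start, end };
        }
    }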

However, evaluating the quality of error recovery turned out to be a laborious process with many pitfalls. An ASE 2012 short paper, which Maartje presented last week at the conference, proposes a solution to this problem. By generating erroneous programs from correct programs, we cheaply obtain a large collection of test programs for which we know a good recovery. The generators randomly insert errors, guided by rules that capture the typical kinds of errors that occur during programming.
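As a rough illustration of what such a generator does, the sketch below seeds a single random error of a 'typical' kind into a correct program; the concrete mutations here are invented for the example, while the rules in the paper are more refined and language-aware. The unmodified program then serves as the oracle against which the recovered parse is compared.

    import java.util.Random;

    // Hypothetical error-seeding generator: take a syntactically correct program
    // and apply one mutation that mimics a typical editing error.
    class ErrorSeeder {
        private final Random rnd = new Random();

        String seedError(String program) {
            int pos = rnd.nextInt(program.length());
            switch (rnd.nextInt(3)) {
                case 0:   // drop the next closing brace, leaving a block unclosed
                    int brace = program.indexOf('}', pos);
                    if (brace < 0) brace = pos;
                    return program.substring(0, brace) + program.substring(brace + 1);
                case 1:   // truncate the program, as if the user is still typing
                    return program.substring(0, pos);
                default:  // insert an incomplete construct at a random position
                    return program.substring(0, pos) + "if (" + program.substring(pos);
            }
        }
    }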

Maartje de Jonge, Eelco Visser. Automated Evaluation of Syntax Error Recovery. In 27th IEEE/ACM International Conference on Automated Software Engineering (ASE 2012), September 3-7, Essen, Germany. pages 322-325, ACM, 2012.

Abstract: Evaluation of parse error recovery techniques is an open problem. The community lacks objective standards and methods to measure the quality of recovery results. This paper proposes an automated technique for recovery evaluation that offers a solution for two main problems in this area. First, a representative test set is generated by a mutation-based fuzzing technique that applies knowledge about common syntax errors. Second, the quality of the recovery results is automatically measured using an oracle-based evaluation technique. We evaluate the validity of our approach by comparing results obtained by automated evaluation with results obtained by manual inspection. The evaluation shows a clear correspondence between our quality metric and human judgement.

We have extended the Spoofax Language Workbench with a domain-specific language for specifying the name binding and scope rules of programming languages. Instead of programmatically encoding name resolution algorithms, as is standard practice, a language designer defines name binding in terms of four basic domain-specific concepts: 'definitions', 'references', 'scopes', and 'imports'. With these concepts, a wide variety of name bindings can be expressed non-algorithmically. For example, the following rules define the binding of base class references (inheritance) and type references to class definitions in C#:

    rules
      Class(NonPartial(), c, _, _) : 
        defines unique class c
      Class(Partial(), c, _, _) : 
      defines non-unique class c
      Base(c) : 
        refers to class c
      ClassType(c) : 
        refers to class c

From such a definition we automatically derive a name resolution algorithm that is used as the basis for editor services such as reference resolution and code completion. Our hope is that NBL can play the role for name binding that BNF plays for syntax definition. That is, that a single declarative definition can be used as the basis for implementation and documentation.
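To illustrate what such a derived algorithm does, here is a hypothetical, much-simplified Java rendering of the idea (not Spoofax's actual implementation): definitions are recorded per namespace in a symbol table attached to a scope, and a reference is resolved by searching the enclosing scopes from innermost to outermost.

    import java.util.HashMap;
    import java.util.Map;
    import java.util.Optional;

    // Hypothetical sketch of name resolution over definitions, references, and scopes.
    class Scope {
        final Scope parent;  // enclosing scope; null for the global scope
        // symbol table: namespace (e.g. "class", "method") -> name -> definition site
        final Map<String, Map<String, Object>> definitions = new HashMap<>();

        Scope(Scope parent) { this.parent = parent; }

        void define(String namespace, String name, Object definitionSite) {
            definitions.computeIfAbsent(namespace, ns -> new HashMap<>())
                       .put(name, definitionSite);
        }

        // Resolve a reference by walking outward through the enclosing scopes.
        Optional<Object> resolve(String namespace, String name) {
            Object def = definitions.getOrDefault(namespace, Map.of()).get(name);
            if (def != null) return Optional.of(def);
            return parent == null ? Optional.empty() : parent.resolve(namespace, name);
        }
    }

In this simplified picture, imports would additionally make the definitions of an imported scope visible, and non-unique definitions such as partial classes would map one name to several definition sites.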

NBL is already available in the nightly builds of Spoofax. In the coming months we will be spreading the word at various events. We will present a poster at SPLASH and at SLE. I will present our NBL paper at SLE 2012, and the language will also feature in my grammarware tutorial at SLE. I may be giving a talk in the Bay Area as well.

For a full account of NBL see our SLE 2012 paper:

Gabriël D. P. Konat, Lennart C. L. Kats, Guido Wachsmuth, Eelco Visser. Declarative Name Binding and Scope Rules. In Krzysztof Czarnecki, Görel Hedin, editors, Software Language Engineering, 5th International Conference, SLE 2012, Dresden, Germany, September 26-28, 2012, Revised Selected Papers. Volume 7745 of Lecture Notes in Computer Science, pages 311-331, 2013.

Abstract: In textual software languages, names are used to reference elements like variables, methods, classes, etc. Name resolution analyses these names in order to establish references between definition and use sites of elements. In this paper, we identify reoccurring patterns for name bindings in programming languages and introduce a declarative metalanguage for the specification of name bindings in terms of namespaces, definition sites, use sites, and scopes. Based on such declarative name binding specifications, we provide a language-parametric algorithm for static name resolution during compile-time. We discuss the integration of the algorithm into the Spoofax Language Workbench and show how its results can be employed in semantic editor services like reference resolution, constraint checking, and content completion.