architecture

In July, we reported on our ongoing work to formalize the connection between name binding described with scope graphs and types, and in particular type-dependent name resolution in which name and type analysis are interdependent (and, as a result, cannot be staged). In a new technical report we present a generalization of that framework with as interface a constraint language that can serve as intermediate language for name and type analysis.

Hendrik van Antwerpen, Pierre Néron, Andrew Tolmach, Eelco Visser, and Guido Wachsmuth. A Constraint Language for Static Semantic Analysis based on Scope Graphs with Proofs. Technical Report TUD-SERG-2015-012, Delft University of Technology.

Abstract: In previous work, we introduced scope graphs as a formalism for describing program binding structure and performing name resolution in an AST-independent way. In this paper, we show how to use scope graphs to build static semantic analyzers. We use constraints extracted from the AST to specify facts about binding, typing, and initialization. We treat name and type resolution as separate building blocks, but our approach can handle language constructs—such as record field access—for which binding and typing are mutually dependent. We also refine and extend our previous scope graph theory to address practical concerns including ambiguity checking and support for a wider range of scope relationships. We describe the details of constraint generation for a model language that illustrates many of the interesting static analysis issues associated with modules and records.

Update 20 October, 2015: The paper has been accepted at PEPM 2016

Update December 2015: New version of technical report reflecting the final revision for the conference.

In this post I give an overview of WebLab, a web application for programming education, propose to use it to organize programming competitions, and call for student volunteers to organize a competition.

Traditional Programming Education

If you are a new computer science student, let me explain how the feedback cycle of programming courses used to be organized (and sometimes still is). As student you make a programming assignment and then you have to wait for a teaching assistant to manually grade your work. Typically, it takes at least a day, but often much longer, before you get feedback. And often there is no time to use that feedback to improve your work.

For teachers and assistants this is as frustrating as it is for you. Grading requires a lot of effort (with over 200 computer science students in our first year) and is extremely tedious. Moreover, it is quite hard to determine the correctness of a program just by reading code.

In addition, we have to make sure that all students can work with the same programming environment, which means installing software on lab machines and make sure that a distribution is available that works on a wide variety of student machines.

And finally, exams are typically conducted on paper, making it hard to test programming skills. An exam for a programming course should (at least) test whether you can write a correct solution for a programming problem in the scope of the course. Programming on paper is not what we train you for in the lab, so why test it in the exam?

Online Programming Education

To address these issues, Vlad Vergu and I started developing WebLab in early 2012. WebLab is a web application that supports all aspects of a programming course. Instead of installing a programming environment, editing a solution on your computer, and submitting it in a tool such as CPM, you can edit the solution for a WebLab assignment right there in the browser. While developing your solution, you can compile the solution, to make sure the program has no syntax or type errors, and then run it against a set of unit tests. This gives you direct feedback about the solution you are developing. By extending the set of unit tests you can express your increased understanding of the problem. When you are satisfied with your solution, you can run it against a set of secret specification tests. The only feedback you get from this test is the number of tests that succeeded and the total number of tests. While this is not a lot of information, it does give you an indication about how you are doing, and at a much earlier stage than waiting for an assistant to look at your code. After the submission deadline, grading is mostly based on this testing score. But some aspects of an assignment may be checked by a teaching assistant using a grading rubric. WebLab can also be used for exams. An exam is just a collection of regular WebLab assignments. Using personalized keys handed out on paper, we can make sure you are present in the room during the exam. This enables us to align the exam with the (programming) skills that you develop during the course.

WebLab also integrates the administration of courses. The grades for all your assignments are available in WebLab as soon grading is finalized (which can be on the day of the exam, when no additional manual grading is done). Overall, WebLab saves instructors and assistants a lot of time normally spent on grading and administration and the real-time statistics provides a good tool to monitor the progress of students in a course.

WebLab is now being used in several courses at TU Delft (Algorithms and Data Structures, Algorithms, Concepts of Programming Languages) and at TU Darmstadt (Concepts of Programming Languages, Type Systems of Programming Languages).

Engineering WebLab

WebLab is itself an interesting software engineering project, since it combines a number of non-trivial requirements.

  • Speed: The aim of WebLab is to provide real-time feedback on submissions to programming assignments, reducing the feedback cycle from days or weeks to minutes. However, as soon as the feedback cycle gets that short, students expect really short response times. When testing is part of the development cycle, waiting for a minute to see results feels like an eternity.

  • Robustness: The application must be able to deal with erroneous programs. A key part of learning to program is that one (accidentally) writes non-terminating programs or programs that do otherwise not correspond to our expectations. The application should also not lose solutions written by students.

  • Scalability: The application should do all this not for a single student, but for hundreds (eventually thousands) of students simultaneously writing and executing programs.

  • Security: The application executes user-written code on the server. For regular web applications this is a scenario to be avoided at all costs, since it could allow users to use excessive amounts of resources (memory, CPU), disrupt the application (denial of service), or access and tamper with private data. Thus, the application must restrict what user code can do and how what resources it can use.

  • Quality of feedback: After testing student programs, the application should give useful feedback to students.

Architecture

Satisfying all these requirements at the same time is non-trivial. It took us a number of iterations (and some painful failures) to get to the current implementation. And we have plenty of ideas for further improving it. Let me give an overview of the architecture of WebLab.

The front-end of WebLab takes care of the web-based user interface and maintains a database with all data for a course such as assignments, student enrollments, submissions, and grades. As it is a web application, it is not too hard to satisfy the requirements above even if does require care. Of course, the interaction design for such an application is not obvious and we are still fine tuning navigation and optimal use of the variety of screen resolutions the application is used with. One of the annoying issues with the front-end that we have recently addressed is data loss. We used to maintain only the latest version of a submission without support for undoing of changes. Editing accidents such as saving from an old open tab or pasting code in the wrong tab, could result in code loss. WebLab now maintains a revision history of all edits to a submission, so that you can always retrieve that clever piece of code that you accidentally deleted (as long as you saved it at some point).

The back-end of WebLab takes care of executing student code. This is where the requirements are harder to satisfy. The front-end stores submitted programs and tests in the database. On a request for a test of a submission, the front-end sends the solution code and the test code to the back-end. The back-end compiles and runs the code, and sends the results back to the front-end, which interprets the results and presents those in the console.

To enable rapid feedback (speed) on program executions, it is prohibitive to start a new process to run a compiler or a (Java) virtual machine for the execution of each solution. For scalability it would be problematic to allocate a running JVM for each user. For security we need to be able to restrict what a program can do; we should certainly not run a program in an unrestricted native process. For robustness we need to be able to control programs that attempt to use an excessive amount of CPU cycles, memory, or time.

We ended up with the following solution. A back-end is a Java Virtual Machine (JVM) that processes execution jobs. Each job is executed in a new thread using a separate class loader that restricts what the program can do. We can use this directly for the byte code generated from Java and Scala programs. C and JavaScript programs are first compiled so that they can be run on the JVM. A monitor kills execution tracks if they take too long. In order to cope with programs that crash a back-end anyway, we run a pool of back-ends that are restricted in the amount of memory and CPUs they can use.

WebLab Programming Competitions

In addition to use in traditional programming courses, WebLab has all the ingredients of a great tool for organizing programming competitions, provided we add support for rankings and leaderboards (which may be useful in (gamified) courses too). By frequently organizing programming competitions, you can test your programming skills and we can speed up the development iterations for WebLab, introducing new features for giving feedback and analyzing programs. These competitions would be different in format than the regular ACM competitions with a much shorter duration (say about 3 hours) and individual assignments.

Call for Competition Organizers

To help us organize the competitions, we are looking for student volunteers. Are you a second year (or later) computer science student with good results in programming? Would you like see what setting programming assignments (and WebLab) looks from the instructor side? Do you have creative ideas for assignments that are feasible for your colleagues to make, yet challenging at the same time? Would you like to contribute to new ways of assessing programming assignments? Then, get in touch with me.

Federated conferences such as SPLASH are complex organizations composed of many parts organized by volunteers. The main conference consists of multiple tracks and sub-conferences (OOPSLA, Onward!, SPLASH-I, invited speakers, tutorials, panels), and additional conferences, symposia, and workshops are co-located before the main conference. Each of the parts has its own steering, organizing, and program committees. For each of these parts, presentations (often based on papers) have to be selected. These presentations should be put together into a program that the attendees of the conference can inspect to decide what to attend. While there has been considerable attention for organizing and automating the paper submission and review process, developing the website for such a conference also requires a considerable effort, and requires input from many people over an extended period of time. Often the process of developing it is reinvented for each yearly edition of a conference by a fresh team of volunteers using software that provides little to no support for the domain. As a result, the information for many tracks is not integrated in the main website. Each workshop maintains their own website from which the attendees have to harvest calls for papers and programs.

Since early 2014, we have been developing Conf.Researchr.Org, a domain-specific content management system developed to support the production of large conference web sites. The system is developed using WebDSL, a domain-specific programming language for the development of data rich web applications that integrates sub-languages for data modeling, a template language, access control rules, data validation, and full text search. The application has been adopted by the main conferences in programming languages: SPLASH ('14,'15,'16), PLDI ('15,'16), ISMM'15, ESOP'15, ECOOP ('15,'16), POPL'16, PPoPP'16, Modularity'16, ICFP'16.

We will be presenting a poster about the site at SPLASH 2015. The extended abstract that comes with the poster explains the key concepts of the application. A pre-print is available.

conf-ecoop15-front

type-dependent name resolution

In our ESOP 2015 paper on A Theory of Name Resolution we presented a semantic framework for specification of the name binding patterns in programming languages. We are working on extending this framework with type analysis in order to support declarative specification of type systems. In this technical report we discuss our first results in this direction. We use constraints to formalize type analysis and connect these constraints to the references and declarations in scope graphs to provide a typing context to the names in programs. The connection between names and types is especially interesting in cases where type analysis is needed to perform name resolution, i.e. type-dependent name resolution. This occurs in programming languages in constructs such as field and method access in object-oriented languages, or in with expressions (inspired by Pascal's with statement). Figure 6 from the paper illustrates how such binding patterns are expressed in our framework using a combination of scope graph and constraints. We have a prototype implementation for the framework performing constraint resolution, and we have a prototype of a new version of our name binding and type analysis DSLs NaBL/TS targeting this framework as back-end.

Hendrik van Antwerpen, Pierre Neron, Andrew P. Tolmach, Eelco Visser, Guido Wachsmuth. Language-Independent Type-Dependent Name Resolution. Technical Report TUD-SERG-2015-006, Delft University of Technology, Software Engineering Research Group, Delft, The Netherlands, July 2015. [researchr, PDF]

Abstract: We extend and combine two existing declarative formalisms, the scope graphs of Neron et al. and type constraint systems, to build a language-independent theory that can describe both name and type resolution for realistic languages with complex scope and typing rules. Unlike conventional static semantics presentations, our approach maintains a clear separation between scoping and typing concerns, while still being able to handle language constructs, such as class field access, for which name and type resolution are necessarily intertwined. We define a constraint scheme that can express both typing and name binding constraints, and give a formal notion of constraint satisfiability together with a sound algorithm for finding solutions in important special cases. We describe the details of constraint generation for a model language that illustrates many of the interesting resolution issues associated with modules, classes, and records. Our constraint generator and solver have been implemented in the Spoofax Language Workbench.

In response to popular demand I have created a Spoofax project for LBNF, the grammar formalism of the BNFC project. The project contains an SDF3 definition for LBNF obtained by manually converting the LBNF grammar in LBNF, and a transformation from LBNF to SDF3.

The transformation is fairly straightforward since LBNF and SDF3 productions are similar. In particular, LBNF productions declare a constructor symbol, which is also needed in SDF3. The main distinction is the way that lists are specified. For example, the following LBNF productions

    Rule . Def ::= Label "." Cat "::=" [Item] ;
    []  . [Item] ::= ;
    (:) . [Item] ::= Item [Item] ;

translate to this SDF3 production:

    Def.Rule = [[Label] . [Cat] ::= [{Item " "}*]]

The transformation is probably somewhat rudimentary and ignores advanced features of LBNF. Yet the converter should be fairly robust, since it skips anything that it doesn't recognize. I have only tested it on the LBNF.cf file itself. More testing is needed. I will be happy to do some more work on the transformation if popular demand so requires.

Using LBNF to SDF3

To use the converter:

  • Check out the GitHub repository (https://github.com/MetaBorgCube/metaborg-lbnf) and import the project into your Eclipse with Spoofax.
  • Build the project.
  • Open a .cf file
  • Hit the 'Transform to SDF3' transformation in the 'Generation' menu
Read more