Mechanized λJS

Mechanized λJS
The Brown PLT Blog, 2012-06-04

In an earlier post, we introduced λJS, our operational semantics for JavaScript. Unlike many other operational semantics, λJS is no toy, but strives to correctly model JavaScript's messy details. To validate these claims, we test λJS with randomly generated tests and with portions of the Mozilla JavaScript test suite.
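
To make "operational semantics" concrete for readers who haven't worked with one: it is a definition of evaluation that is precise enough to execute, which is exactly what makes this kind of testing possible. Below is a minimal sketch in Scala of a call-by-value lambda calculus with a step function; it is a toy for illustration only, not λJS's actual definition, and every name in it is made up.

    // Toy sketch, not λJS: a call-by-value lambda calculus with an
    // executable small-step semantics.
    object MiniSemantics {
      sealed trait Exp
      case class Var(name: String)             extends Exp
      case class Lam(param: String, body: Exp) extends Exp
      case class App(fun: Exp, arg: Exp)       extends Exp

      def isValue(e: Exp): Boolean = e.isInstanceOf[Lam]

      // Naive substitution; fine here as long as bound names are distinct.
      def subst(body: Exp, param: String, v: Exp): Exp = body match {
        case Var(x)    => if (x == param) v else body
        case Lam(x, b) => if (x == param) body else Lam(x, subst(b, param, v))
        case App(f, a) => App(subst(f, param, v), subst(a, param, v))
      }

      // One step of reduction, or None if the expression is a value or stuck.
      def step(e: Exp): Option[Exp] = e match {
        case App(Lam(x, b), v) if isValue(v) => Some(subst(b, x, v))
        case App(f, a) if !isValue(f)        => step(f).map(App(_, a))
        case App(f, a)                       => step(a).map(App(f, _))
        case _                               => None
      }

      def eval(e: Exp): Exp = step(e).map(eval).getOrElse(e)

      def main(args: Array[String]): Unit = {
        val id = Lam("x", Var("x"))
        println(eval(App(id, Lam("y", Var("y")))))   // Lam(y,Var(y))
      }
    }

λJS plays the same game at the scale of JavaScript's objects, prototypes, and exceptions, which is where the messy details live.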

Testing is not enough. Despite our work, other researchers found a missing case in λJS. Today, we're introducing Mechanized λJS, which comes with a machine-checked proof of correctness, using the Coq proof assistant.

More work on mechanizing the actual, implemented semantics of a real language, rather than a toy.

ACM A.M. Turing Centenary Celebration

ACM A.M. Turing Centenary Celebration

33 ACM A.M. Turing Award Winners came together for the first time to honor the 100th anniversary of Alan Turing's birth and reflect on his contributions, as well as on the past and future of computing. The event has now taken place—but everyone can join the conversation, #ACMTuring100, and view the webcast.

This event totally flew under my radar! Many thanks to Scott Wallace for pointing it out.

How to Make Ad Hoc Proof Automation Less Ad Hoc

How to Make Ad Hoc Proof Automation Less Ad Hoc
Georges Gonthier, Beta Ziliani, Aleksandar Nanevski, and Derek Dreyer, to appear in ICFP 2011

Most interactive theorem provers provide support for some form of user-customizable proof automation. In a number of popular systems, such as Coq and Isabelle, this automation is achieved primarily through tactics, which are programmed in a separate language from that of the prover’s base logic. While tactics are clearly useful in practice, they can be difficult to maintain and compose because, unlike lemmas, their behavior cannot be specified within the expressive type system of the prover itself.

We propose a novel approach to proof automation in Coq that allows the user to specify the behavior of custom automated routines in terms of Coq’s own type system. Our approach involves a sophisticated application of Coq’s canonical structures, which generalize Haskell type classes and facilitate a flexible style of dependently-typed logic programming. Specifically, just as Haskell type classes are used to infer the canonical implementation of an overloaded term at a given type, canonical structures can be used to infer the canonical proof of an overloaded lemma for a given instantiation of its parameters. We present a series of design patterns for canonical structure programming that enable one to carefully and predictably coax Coq’s type inference engine into triggering the execution of user-supplied algorithms during unification, and we illustrate these patterns through several realistic examples drawn from Hoare Type Theory. We assume no prior knowledge of Coq and describe the relevant aspects of Coq type inference from first principles.

If you've ever toyed with Coq and run into the difficulties many encounter in trying to construct robust, comprehensible proof scripts using tactics (which manipulate the proof state, tend to leave you with the "ground" of the proof rather than the "figure," if you will, and are fragile in the face of change), you may wish to give this a read. It frankly never would have occurred to me to try to turn Ltac scripts into lemmas at all. This is much more appealing than most other approaches to the subject I've seen.
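
The type-class analogy in the abstract is the heart of the trick. As a loose illustration (Scala implicits standing in for Coq canonical structures; the names are invented for this sketch), here is instance resolution assembling a composite piece of "evidence" during type inference, much the way a canonical-structure search assembles the canonical proof of an overloaded lemma:

    // Loose analogy in Scala, not Coq: implicit resolution builds the
    // "canonical" instance for a type during type inference.
    object CanonicalAnalogy {
      trait Show[A] { def show(a: A): String }

      implicit val showInt: Show[Int] = new Show[Int] {
        def show(a: Int) = a.toString
      }

      // Composite instances are assembled recursively from instances for the
      // parts, mirroring how a canonical-structure search recurses on the
      // shape of the goal.
      implicit def showPair[A, B](implicit sa: Show[A], sb: Show[B]): Show[(A, B)] =
        new Show[(A, B)] {
          def show(p: (A, B)) = "(" + sa.show(p._1) + ", " + sb.show(p._2) + ")"
        }

      implicit def showList[A](implicit sa: Show[A]): Show[List[A]] =
        new Show[List[A]] {
          def show(xs: List[A]) = xs.map(sa.show).mkString("[", ", ", "]")
        }

      def display[A](a: A)(implicit s: Show[A]): String = s.show(a)

      def main(args: Array[String]): Unit = {
        // The compiler derives Show[List[(Int, Int)]] from the pieces above.
        println(display(List((1, 2), (3, 4))))   // [(1, 2), (3, 4)]
      }
    }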

Validating LR(1) parsers

Validating LR(1) parsers

An LR(1) parser is a finite-state automaton, equipped with a stack, which uses a combination of its current state and one lookahead symbol in order to determine which action to perform next. We present a validator which, when applied to a context-free grammar G and an automaton A, checks that A and G agree. Validating the parser provides the correctness guarantees required by verified compilers and other high-assurance software that involves parsing. The validation process is independent of which technique was used to construct A. The validator is implemented and proved correct using the Coq proof assistant. As an application, we build a formally-verified parser for the C99 language.

I've always been somewhat frustrated, while studying verified compiler technology, that the scope of the effort has generally been limited to ensuring that the AST and the generated code mean the same thing, as important as that obviously is. Not enough attention has been paid, IMHO, to other compiler phases. Parsing: The Solved Problem That Isn't does a good job illuminating some of the conceptual issues that arise in trying to take parsers seriously as functions we would like to compose and so on, while maintaining some set of properties that hold of the individual parsers. Perhaps this work can shed some light on possible solutions to some of those issues, in addition to being worthwhile in its own right. Note the pleasing presence of an actual implementation that's been used to build a parser for a real-world language, C99.
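
For readers who want the automaton side of the abstract pinned down: an LR(1) parser really is just a table-driven loop over a stack of states, and the contribution here is checking, after the fact, that a given table agrees with the grammar. Here is a schematic sketch with hand-built tables for the toy grammar E -> E '+' 'n' | 'n' (entirely hypothetical, and nothing like the paper's Coq development):

    // Schematic LR driver for the toy grammar  E -> E '+' 'n' | 'n'
    // (hypothetical hand-built tables, for illustration only).
    object TinyLR {
      sealed trait Action
      case class Shift(state: Int)                extends Action
      case class Reduce(lhs: String, rhsLen: Int) extends Action
      case object Accept                          extends Action

      // action(state, lookahead) and goto(state, nonterminal) tables.
      val action: Map[(Int, String), Action] = Map(
        (0, "n") -> Shift(2),
        (1, "+") -> Shift(3),
        (1, "$") -> Accept,
        (2, "+") -> Reduce("E", 1), (2, "$") -> Reduce("E", 1),
        (3, "n") -> Shift(4),
        (4, "+") -> Reduce("E", 3), (4, "$") -> Reduce("E", 3)
      )
      val goto: Map[(Int, String), Int] = Map((0, "E") -> 1)

      def parse(tokens: List[String]): Boolean = {
        var stack = List(0)            // stack of automaton states
        var input = tokens :+ "$"      // end-of-input marker
        while (true) {
          action.get((stack.head, input.head)) match {
            case Some(Shift(s))         => stack = s :: stack; input = input.tail
            case Some(Reduce(lhs, len)) =>
              stack = stack.drop(len)                  // pop one state per RHS symbol
              stack = goto((stack.head, lhs)) :: stack // then follow the goto edge
            case Some(Accept)           => return true
            case None                   => return false // syntax error
          }
        }
        false
      }

      def main(args: Array[String]): Unit =
        println(parse(List("n", "+", "n", "+", "n")))   // true
    }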

Google Blockly Lets You Hack With No Keyboard

Following on from recent discussions about graphical languages in the Russian space program, here's a recent story about Google's new visual programming language.

Cade Metz, "Google Blockly Lets You Hack With No Keyboard", Wired Enterprise.

Now available on Google Code — the company’s site for hosting open source software — the new language is called Google Blockly, and it’s reminiscent of Scratch, a platform developed at MIT that seeks to turn even young children into programmers.

As the Blockly FAQ says, "Blockly was influenced by App Inventor, which in turn was influenced by Scratch." So if you've seen Scratch before, this will look very familiar. If you haven't seen Scratch, and want to have a go with Blockly, you can find the maze demo from the Wired story here.

Graphical languages of the Russian space program

I found this very interesting: What software programming languages were used by the Soviet Union's space program?

The languages DRAGON and Grafit-Floks appear to be graphical. You can see some videos of working with DRAGON.

Interactive Tutorial of the Sequent Calculus

Interactive Tutorial of the Sequent Calculus by Edward Z. Yang.

This interactive tutorial will teach you how to use the sequent calculus, a simple set of rules you can use to show the truth of statements in first-order logic. It is geared towards anyone with some background in writing software for computers, with knowledge of basic boolean logic. ...

Proving theorems is not for the mathematicians anymore: with theorem provers, it's now a job for the hacker. — Martin Rinard ...

A common complaint with formal systems like the sequent calculus is, "I clicked around and managed to prove this, but I'm not really sure what happened!" This is what Martin means by the hacker mentality: it is now possible for people to prove things, even when they don't know what they're doing. The computer will ensure that, in the end, they will have gotten it right.
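
For readers who haven't seen the notation: a sequent Γ ⊢ Δ claims that the conjunction of the formulas in Γ entails the disjunction of those in Δ, and a proof is a tree of rule applications whose leaves are axioms. For example, with the usual axiom and conjunction rules (a standard presentation, not necessarily the tutorial's exact rule set):

    \frac{}{\Gamma, A \vdash A, \Delta}\;(\text{ax})
    \qquad
    \frac{\Gamma \vdash A, \Delta \qquad \Gamma \vdash B, \Delta}{\Gamma \vdash A \wedge B, \Delta}\;(\wedge R)
    \qquad
    \frac{\Gamma, A, B \vdash \Delta}{\Gamma, A \wedge B \vdash \Delta}\;(\wedge L)

    \frac{\dfrac{\dfrac{}{A, B \vdash B}\;(\text{ax}) \qquad \dfrac{}{A, B \vdash A}\;(\text{ax})}{A, B \vdash B \wedge A}\;(\wedge R)}{A \wedge B \vdash B \wedge A}\;(\wedge L)

Roughly, each click in the tutorial corresponds to applying one such rule to the current goal sequent.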

The tool behind this nice tutorial is Logitext.

Tool Demo: Scala-Virtualized

Tool Demo: Scala-Virtualized

This paper describes Scala-Virtualized, which extends the Scala language and compiler with a small number of features that enable combining the benefits of shallow and deep embeddings of DSLs. We demonstrate our approach by showing how to embed three different domain-specific languages in Scala. Moreover, we summarize how others have been using our extended compiler in their own research and teaching. Supporting artifacts of our tool include web-based tutorials, nightly builds, and an Eclipse update site hosting an up-to-date version of the Scala IDE for Eclipse based on the Virtualized Scala compiler and standard library.

Scala has always had a quite good EDSL story thanks to implicits, dot- and paren-inference, and methods-as-operators. Lately there are proposals to provide it with both macros-in-the-camlp4-sense and support for multi-stage programming. This paper goes into some depth on the foundations of the latter subject.
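
The shallow/deep distinction in one line: a shallow embedding gives DSL terms their meaning immediately as host-language values, while a deep embedding first reifies them as data that can be inspected, optimized, or compiled. A minimal sketch in plain Scala (no Scala-Virtualized features involved; the point is only the contrast):

    // Minimal sketch of shallow vs. deep embedding of an arithmetic DSL
    // in plain Scala.
    object Embeddings {
      // Shallow: a DSL term *is* its meaning in the host language.
      object Shallow {
        type Expr = Int
        def lit(n: Int): Expr           = n
        def add(a: Expr, b: Expr): Expr = a + b
      }

      // Deep: a DSL term is reified as data, so it can be analyzed,
      // optimized, or compiled before (or instead of) being evaluated.
      object Deep {
        sealed trait Expr
        case class Lit(n: Int)           extends Expr
        case class Add(a: Expr, b: Expr) extends Expr

        def eval(e: Expr): Int = e match {
          case Lit(n)    => n
          case Add(a, b) => eval(a) + eval(b)
        }
      }

      def main(args: Array[String]): Unit = {
        println(Shallow.add(Shallow.lit(1), Shallow.lit(2)))     // 3
        println(Deep.eval(Deep.Add(Deep.Lit(1), Deep.Lit(2))))   // 3
      }
    }

Roughly, Scala-Virtualized lets the second style keep the surface syntax of the first by making built-in constructs such as if and while overloadable.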

Common Lisp: The Untold Story

Common Lisp: The Untold Story, by Kent Pitman. A nice paper about the history of my favorite lightweight dynamic language.

This paper summarizes a talk given at “Lisp50@OOPSLA,” the 50th Anniversary of Lisp workshop, Monday, October 20, 2008, an event co-located with the OOPSLA’08 in Nashville, TN, in which I offered my personal, subjective account of how I came to be involved with Common Lisp and the Common Lisp standard, and of what I learned from the process.

Some of my favorite parts are:

  • How CL was viewed as competition to C++. (Really, what were they thinking?)
  • How CL was a reaction to the threat of Interlisp, and how "CLOS was the price of getting the Xerox/Interlisp community folded back into Lisp community as a whole" (link).
  • How individuals shaped the processes of standardization. MIT Sloan did an analysis of these processes.
  • How the two- to three-day roundtrip time for UUCP emails to Europe may be responsible for the creation of the separate EuLisp.

I have a soft spot for CL, so I am biased, but I think Greenspun's Tenth Rule (and Robert Morris' corollary) still holds: CL is the language that newer dynamic languages, such as Perl 6, JavaScript, and Racket, are asymptotically approaching (and exceeding in some cases, which is why I view CL as a lightweight language today).

Evaluating the Design of the R Language

Following our recent discussion on R, I thought this paper (ECOOP final version), by Floreal Morandat, Brandon Hill, Leo Osvald, and Jan Vitek, deserved its own post. Abstract:

R is a dynamic language for statistical computing that combines lazy functional features and object-oriented programming. This rather unlikely linguistic cocktail would probably never have been prepared by computer scientists, yet the language has become surprisingly popular. With millions of lines of R code available in repositories, we have an opportunity to evaluate the fundamental choices underlying the R language design. Using a combination of static and dynamic program analysis we assess the success of different language features.

Excerpts from the paper:

R comes equipped with a rather unlikely mix of features. In a nutshell, R is a dynamic language in the spirit of Scheme or JavaScript, but where the basic data type is the vector. It is functional in that functions are first-class values and arguments are passed by deep copy. Moreover, R uses lazy evaluation by default for all arguments, thus it has a pure functional core. Yet R does not optimize recursion, and instead encourages vectorized operations. Functions are lexically scoped and their local variables can be updated, allowing for an imperative programming style. R targets statistical computing, thus missing value support permeates all operations.

One of our discoveries while working out the semantics was how eager the evaluation of promises turns out to be. The semantics captures this with C[]; the only cases where promises are not evaluated are in the arguments of a function call and when promises occur in a nested function body; all other references to promises are evaluated. In particular, it was surprising and unnecessary to force assignments, as this hampers building infinite structures. Many basic functions that are lazy in Haskell, for example, are strict in R, including data type constructors. As for sharing, the semantics clearly demonstrates that R prevents sharing by performing copies at assignments.

The R implementation uses copy-on-write to reduce the number of copies. With superassignment, environments can be used as shared mutable data structures. The way assignment into vectors preserves the pass-by-value semantics is rather unusual and, from personal experience, it is unclear if programmers understand the feature. ... It is noteworthy that objects are mutable within a function (since fields are attributes), but are copied when passed as an argument.
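
To make two of the quoted points concrete without writing R: a Scala by-name parameter behaves like an R promise (unevaluated until used), and forcing it as soon as it is bound, which is roughly what the paper observes R doing at assignments, gives up most of the laziness; likewise, value semantics means an update reached through one binding must not be visible through another, as if assignment copied the data. This is only an analogy, not the paper's formal semantics:

    // Rough Scala analogy for two of the quoted observations about R.
    object RAnalogy {
      // A by-name argument is promise-like: not evaluated until forced.
      // (R promises are additionally memoized; a lazy val would model that.)
      def callWithPromise(arg: => Int): Int = {
        println("function entered, promise not yet forced")
        val x = arg   // forcing at the point of binding makes the call strict,
        x + x         // which is roughly what the paper observes for R
      }

      def main(args: Array[String]): Unit = {
        println(callWithPromise { println("promise forced"); 21 })   // 42

        // Value semantics: an update through one binding is not visible
        // through the other, as if assignment copied the vector.
        val v1 = Vector(1, 2, 3)
        val v2 = v1.updated(0, 99)   // persistent update; v1 is unchanged
        println(v1)                   // Vector(1, 2, 3)
        println(v2)                   // Vector(99, 2, 3)
      }
    }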

But then, they do a corpus analysis to see what features programmers actually use! We don't do enough of these in PL; examples from the paper:

R symbol lookup is context sensitive. This feature, which is neither Lisp nor Scheme scoping, is exercised in less than 0.05% of function name lookups... The only symbols for which this feature actually mattered in the Bioconductor vignettes are c and file, both popular variable names and built-in functions.

Lazy evaluation is a distinctive feature of R that has the potential for reducing unnecessary work performed by a computation. Our corpus, however, does not bear this out. Fig. 14(a) shows the rate of promise evaluation across all of our data sets. The average rate is 90%. Fig. 14(b) shows that on average 80% of promises are evaluated in the first function they are passed into.

And so on. A lot of great data-driven insights.