Generating compiler back ends at the snap of a finger

The paper:

Resourceable, Retargetable, Modular Instruction Selection Using a Machine-Independent, Type-Based Tiling of Low-Level Intermediate Code

Ramsey and Dias have a series of papers about making it ever easier to generate compiler back ends, with the claim that they produce decent code to boot. I wonder whether this stuff has shown up, or will show up, in compilers I can use. (Or do you think it doesn't actually matter, for some pragmatic reason or other?)

Abstract: We present a novel variation on the standard technique of selecting instructions by tiling an intermediate-code tree. Typical compilers use a different set of tiles for every target machine. By analyzing a formal model of machine-level computation, we have developed a set of tiles that is machine-independent while retaining the expressive power of machine code. Using this tileset, we reduce the number of tilers required from one per machine to one per architectural family (e.g., register architecture or stack architecture). Because the tiler is the part of the instruction selector that is most difficult to reason about, our technique makes it possible to retarget an instruction selector with significantly less effort than standard techniques. Retargeting effort is further reduced by applying an earlier result which generates the machine-dependent implementation of our tileset automatically from a declarative description of instructions' semantics. Our design has the additional benefit of enabling modular reasoning about three aspects of code generation that are not typically separated: the semantics of the compiler's intermediate representation, the semantics of the target instruction set, and the techniques needed to generate good target code.
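To make the tiling idea concrete, here is a minimal sketch in OCaml of the conventional approach the paper improves on: a tiler whose patterns are machine-specific, so every target needs its own. (The IR, tiles, and instruction names below are invented for illustration; this is not the paper's tileset.)

    (* A toy intermediate-code tree. *)
    type exp =
      | Const of int
      | Reg   of string
      | Add   of exp * exp
      | Mul   of exp * exp

    let fresh =
      let n = ref 0 in
      fun () -> incr n; Printf.sprintf "t%d" !n

    (* Each match arm is a "tile": it covers a tree fragment and emits one
       instruction for it. Note how the tiles bake in one target's
       instruction set; Ramsey and Dias's tileset is machine-independent. *)
    let rec select (e : exp) : string * string list =
      match e with
      | Reg r   -> (r, [])
      | Const k ->
          let t = fresh () in
          (t, [Printf.sprintf "li   %s, %d" t k])
      | Add (a, Const k) ->                      (* tile: add-immediate *)
          let ra, ia = select a in
          let t = fresh () in
          (t, ia @ [Printf.sprintf "addi %s, %s, %d" t ra k])
      | Add (a, b) ->                            (* tile: register-register add *)
          let ra, ia = select a in
          let rb, ib = select b in
          let t = fresh () in
          (t, ia @ ib @ [Printf.sprintf "add  %s, %s, %s" t ra rb])
      | Mul (a, b) ->
          let ra, ia = select a in
          let rb, ib = select b in
          let t = fresh () in
          (t, ia @ ib @ [Printf.sprintf "mul  %s, %s, %s" t ra rb])

    let () =
      let _, code = select (Add (Mul (Reg "x", Const 4), Const 1)) in
      List.iter print_endline code

The paper's point is that this select function is the part that is hardest to retarget; with a machine-independent tileset you write it once per architectural family, and the machine-dependent tile implementations are generated from declarative instruction descriptions.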

Paul Hudak

This is sad news indeed. I am sure almost everyone here has read at least one paper by Paul, and many knew him personally. When I first started thinking about programming languages I was fascinated by DSLs, and his work was simply inspiring. His voice will be missed.

Discussions of Paul Hudak's work

Update: There is some confusion about the situation. Please see the comments for further information.

Everything old is new again: Quoted Domain Specific Languages

Everything old is new again: Quoted Domain Specific Languages, by Shayan Najd, Sam Lindley, Josef Svenningsson, Philip Wadler:

We describe a new approach to domain specific languages (DSLs), called Quoted DSLs (QDSLs), that resurrects two old ideas: quotation, from McCarthy’s Lisp of 1960, and the subformula property, from Gentzen’s natural deduction of 1935. Quoted terms allow the DSL to share the syntax and type system of the host language. Normalising quoted terms ensures the subformula property, which guarantees that one can use higher-order types in the source while guaranteeing first-order types in the target, and enables using types to guide fusion. We test our ideas by re-implementing Feldspar, which was originally implemented as an Embedded DSL (EDSL), as a QDSL; and we compare the QDSL and EDSL variants.

Neat paper, first posted here by Blaisorblade, that formalizes the properties needed for using quotation to implement DSLs. Most embedded DSLs use custom operators and lifted conditionals to make embedded programs seem more natural, but quoting enables the use of native operators and conditionals, and might lead to more natural expressions of lightweight (possibly heterogeneous) staged computations.
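The contrast is easy to see in miniature. Here is a hedged sketch in OCaml (the paper uses Haskell; the quoted variant below needs BER MetaOCaml, and the example is mine, not the authors'):

    (* EDSL style: a custom AST, with lifted operators standing in for the
       host language's own syntax. *)
    type expr =
      | Lit of int
      | Add of expr * expr
      | If  of expr * expr * expr

    let ( +! ) a b = Add (a, b)                    (* lifted operator *)
    let example_edsl c = If (c, Lit 2 +! Lit 3, Lit 0)

    (* QDSL style (BER MetaOCaml): a quoted term .< e >. reuses the host's
       native operators, conditionals, and type checker; a QDSL then
       normalizes such quoted terms before translating them. *)
    let example_qdsl (c : bool code) = .< if .~c then 2 + 3 else 0 >.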

The Next Stage of Staging

The Next Stage of Staging, by Jun Inoue, Oleg Kiselyov, Yukiyoshi Kameyama:

This position paper argues for type-level metaprogramming, wherein types and type declarations are generated in addition to program terms. Term-level metaprogramming, which allows manipulating expressions only, has been extensively studied in the form of staging, which ensures static type safety with a clean semantics with hygiene (lexical scoping). However, the corresponding development is absent for type manipulation. We propose extensions to staging to cover ML-style module generation and show the possibilities they open up for type specialization and overhead-free parametrization of data types equipped with operations. We outline the challenges our proposed extensions pose for semantics and type safety, hence offering a starting point for a long-term program in the next stage of staging research. The key observation is that type declarations do not obey scoping rules as variables do, and that in metaprogramming, types are naturally prone to escaping the lexical environment in which they were declared. This sets next-stage staging apart from dependent types, whose benefits and implementation mechanisms overlap with our proposal, but which does not deal with type-declaration generation. Furthermore, it leads to an interesting connection between staging and the logic of definitions, adding to the study’s theoretical significance.

A position paper describing the next logical progression of staging to metaprogramming over types. Now with the true first-class modules of 1ML, perhaps there's a clearer way forward.
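For readers who haven't seen term-level staging, the classic example is the staged power function; here it is in BER MetaOCaml (my sketch, not from the paper). The paper asks what the analogous facility looks like when the generated artifacts are type and module declarations rather than terms.

    (* Brackets .< e >. build code; escapes .~ splice code into brackets. *)
    let rec power (n : int) (x : int code) : int code =
      if n = 0 then .< 1 >.
      else .< .~x * .~(power (n - 1) x) >.

    (* Specialize x^5: the recursion runs at generation time, leaving a
       flat chain of multiplications in the generated code. *)
    let power5 : (int -> int) code = .< fun x -> .~(power 5 .< x >.) >.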

A theory of changes for higher-order languages — incrementalizing λ-calculi by static differentiation

The project Incremental λ-Calculus is just starting (compared to more mature approaches like self-adjusting computation), with a first publication last year.

A theory of changes for higher-order languages — incrementalizing λ-calculi by static differentiation
Yufei Cai, Paolo G. Giarrusso, Tillmann Rendel, and Klaus Ostermann. 2014

If the result of an expensive computation is invalidated by a small change to the input, the old result should be updated incrementally instead of reexecuting the whole computation. We incrementalize programs through their derivative. A derivative maps changes in the program’s input directly to changes in the program’s output, without reexecuting the original program. We present a program transformation taking programs to their derivatives, which is fully static and automatic, supports first-class functions, and produces derivatives amenable to standard optimization.

We prove the program transformation correct in Agda for a family of simply-typed λ-calculi, parameterized by base types and primitives. A precise interface specifies what is required to incrementalize the chosen primitives.

We investigate performance by a case study: we implement the program transformation in Scala, as a plugin, and improve performance of a nontrivial program by orders of magnitude.
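To get a feel for what a derivative is here, consider a tiny hand-written instance in OCaml (my sketch of the idea; the paper generates such functions automatically, and its case study is in Scala): for sum, an input change maps directly to an additive delta on the output.

    (* A change to a collection of ints, and the derivative of sum. The
       derivative consumes only the change, never the whole input. *)
    type change = Insert of int | Delete of int

    let sum xs = List.fold_left ( + ) 0 xs

    (* d_sum maps an input change to an output change (a delta). *)
    let d_sum (dxs : change list) : int =
      List.fold_left
        (fun acc c ->
          match c with Insert n -> acc + n | Delete n -> acc - n)
        0 dxs

    (* Incremental update: new result = old result + delta. *)
    let () =
      let xs = [1; 2; 3] in
      let old_result = sum xs in
      let delta = d_sum [ Insert 10; Delete 2 ] in
      assert (old_result + delta = sum [ 1; 3; 10 ]);
      Printf.printf "updated sum = %d\n" (old_result + delta)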

I like the nice dependent types: a key idea of this work is that the "diffs" possible from a value v do not live in some common type diff(T), but rather in a value-dependent type diff(v). Intuitively, the empty list and a non-empty list admit fairly different kinds of changes. This makes change-merging and change-producing operations total, and allows us to give them a nice operational theory. Good design, through types.

(The program transformation seems related to the program-level parametricity transformation. Parametricity abstracts over equality justifications; differentiation abstracts over small differences.)

John C Reynolds Doctoral Dissertation Award nominations for 2014

Presented annually to the author of the outstanding doctoral dissertation in the area of Programming Languages. The award includes a prize of $1,000. The winner can choose to receive the award at ICFP, OOPSLA, POPL, or PLDI.

I guess it is fairly obvious why professors should nominate their students (the deadline is January 4th 2015). Newly minted PhDs should, for similar reasons, make sure their professors are reminded of these reasons. I can tell you that the competition is going to be tough this year; but hey, you didn't go into programming language theory thinking it was going to be easy, did you?

Zélus: A Synchronous Language with ODEs

Zélus: A Synchronous Language with ODEs
Timothy Bourke, Marc Pouzet
2013

Zélus is a new programming language for modeling systems that mix discrete logical time and continuous time behaviors. From a user's perspective, its main originality is to extend an existing Lustre-like synchronous language with Ordinary Differential Equations (ODEs). The extension is conservative: any synchronous program expressed as data-flow equations and hierarchical automata can be composed arbitrarily with ODEs in the same source code.

A dedicated type system and causality analysis ensure that all discrete changes are aligned with zero-crossing events so that no side effects or discontinuities occur during integration. Programs are statically scheduled and translated into sequential code that, by construction, runs in bounded time and space. Compilation is effected by source-to-source translation into a small synchronous subset which is processed by a standard synchronous compiler architecture. The resultant code is paired with an off-the-shelf numeric solver.

We show that it is possible to build a modeler for explicit hybrid systems à la Simulink/Stateflow on top of an existing synchronous language, using it both as a semantic basis and as a target for code generation.

Synchronous programming languages (à la Lucid Synchrone) are language designs for reactive systems with discrete time. Zélus extends them gracefully to hybrid discrete/continuous systems, to interact with the physical world, or simulate it -- while preserving their strong semantic qualities.

The paper is short (6 pages) and centered around examples rather than the theory -- I enjoyed it. Not being familiar with the domain, I was unsure what the "zero-crossings" mentioned in the introduction are, but there is a good explanation further down in the paper:

The standard way to detect events in a numeric solver is via zero-crossings where a solver monitors expressions for changes in sign and then, if they are detected, searches for a more precise instant of crossing.
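As a rough illustration of that mechanism (my sketch, not Zélus's actual solver interface), sign-change monitoring plus bisection looks like this in OCaml:

    (* If f changes sign between two solver steps at lo and hi, bisect
       [lo, hi] to locate the crossing instant more precisely. *)
    let rec bisect f lo hi eps =
      let mid = (lo +. hi) /. 2.0 in
      if hi -. lo < eps then mid
      else if f lo *. f mid <= 0.0 then bisect f lo mid eps
      else bisect f mid hi eps

    let detect_crossing f lo hi eps =
      if f lo *. f hi <= 0.0 then Some (bisect f lo hi eps) else None

    let () =
      (* The height of a falling object crosses zero between t = 1 and t = 2. *)
      match detect_crossing (fun t -> 1.0 -. (t *. t) /. 2.0) 1.0 2.0 1e-9 with
      | Some t -> Printf.printf "crossing near t = %.6f\n" t  (* ~ sqrt 2 *)
      | None   -> print_endline "no crossing"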

The Zélus website has a 'publications' page with more advanced material, and an 'examples' page with case studies.

Facebook releases "Flow", a statically typed JavaScript variant

The goal of Flow is to find errors in JavaScript code with little programmer effort. Flow relies heavily on type inference to find type errors even when the program has not been annotated - it precisely tracks the types of variables as they flow through the program.

At the same time, Flow is a gradual type system. Any parts of your program that are dynamic in nature can easily bypass the type checker, so you can mix statically typed code with dynamic code.

Flow also supports a highly expressive type language. Flow types can express much more fine-grained distinctions than traditional type systems. For example, Flow helps you catch errors involving null, unlike most type systems.

Read more here.
Here's the announcement from Facebook.
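The null-checking claim is roughly the discipline that ML programmers get from option types. A loose OCaml analogue (my comparison, not Flow's implementation):

    (* 'a option makes the possibly-missing case explicit in the type, and
       the compiler flags any match that forgets to handle None, much as
       Flow flags uses of a ?Type value that were never checked for null. *)
    let length_of (s : string option) : int =
      match s with
      | Some s -> String.length s
      | None   -> 0   (* omitting this arm triggers an exhaustiveness warning *)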

Why do we need modules at all?

Post by Joe Armstrong of Erlang fame. Leader:

Why do we need modules at all? This is a brain-dump-stream-of-consciousness-thing. I've been thinking about this for a while. I'm proposing a slightly different way of programming here. The basic idea is:

  • do away with modules
  • all functions have unique distinct names
  • all functions have (lots of) meta data
  • all functions go into a global (searchable) Key-value database
  • we need letrec
  • contribution to open source can be as simple as contributing a single function
  • there are no "open source projects" - only "the open source Key-Value database of all functions"
  • content is peer reviewed

Why does Erlang have modules? There's a good and a bad side to modules. Good: modules provide a unit of compilation, a unit of code distribution, a unit of code replacement. Bad: it's very difficult to decide which module to put an individual function in, and modules break encapsulation (see later).
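As a toy rendering of the proposal (my sketch; the names, metadata fields, and the monomorphic function type are all invented for illustration):

    (* A global, searchable key-value store of functions: unique names map
       to metadata plus the code itself. *)
    type meta = { author : string; doc : string; tags : string list }

    let db : (string, meta * (int -> int)) Hashtbl.t = Hashtbl.create 64

    let publish name meta f = Hashtbl.replace db name (meta, f)

    let search tag =
      Hashtbl.fold
        (fun name (m, _) acc -> if List.mem tag m.tags then name :: acc else acc)
        db []

    let () =
      publish "lists:double_v1"
        { author = "joe"; doc = "double an int"; tags = [ "arithmetic" ] }
        (fun x -> 2 * x);
      List.iter print_endline (search "arithmetic")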

Conservation laws for free!

In this year's POPL, Bob Atkey made a splash by showing how to get from parametricity to conservation laws, via Noether's theorem:

Invariance is of paramount importance in programming languages and in physics. In programming languages, John Reynolds’ theory of relational parametricity demonstrates that parametric polymorphic programs are invariant under change of data representation, a property that yields “free” theorems about programs just from their types. In physics, Emmy Noether showed that if the action of a physical system is invariant under change of coordinates, then the physical system has a conserved quantity: a quantity that remains constant for all time. Knowledge of conserved quantities can reveal deep properties of physical systems. For example, the conservation of energy is, by Noether’s theorem, a consequence of a system’s invariance under time-shifting.

In this paper, we link Reynolds’ relational parametricity with Noether’s theorem for deriving conserved quantities. We propose an extension of System Fω with new kinds, types and term constants for writing programs that describe classical mechanical systems in terms of their Lagrangians. We show, by constructing a relationally parametric model of our extension of Fω, that relational parametricity is enough to satisfy the hypotheses of Noether’s theorem, and so to derive conserved quantities for free, directly from the polymorphic types of Lagrangians expressed in our system.
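For reference, the textbook instance behind the abstract's example (standard classical mechanics, not the paper's System Fω development): for a Lagrangian L(q, q̇) with no explicit time dependence, the energy

    E = \dot q \, \frac{\partial L}{\partial \dot q} - L,
    \qquad
    \frac{dE}{dt} = -\frac{\partial L}{\partial t} = 0
    \quad \text{along Euler–Lagrange solutions,}

is conserved. Atkey's contribution is to derive such conserved quantities not from coordinate invariance directly, but from the relational parametricity of the Lagrangian's polymorphic type.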