Meta-Programming
In 2017, a team from Northeastern University released Turnstile, a framework for implementing propositionally typed languages in Racket; cf. naasking's story Type Systems as Macros. The system was really nice because it allowed type systems to be expressed in a manner similar to the way theoretical PL researchers would in a paper, and because it hooked into Racket's clean compiler backend.
Now Stephen Chang, one of that team, together with new coauthors Michael Ballantyne, Usamilo Turner and William Bowman, have released a rewrite that they call Turnstile+, together with a POPL article, Dependent Type Systems as Macros. From that article's introduction:
Turnstile+ represents a major research leap over its predecessor. Specifically, we solve the major challenges necessary to implement dependent types and their accompanying DSLs and extensions (which Turnstile could not support), while retaining the original abilities of Turnstile. For example, one considerable obstacle was the separation between the macro expansion phase and a program’s runtime phase. Since dependently typed languages may evaluate expressions while type checking, checking dependent types with macros requires new macrology design patterns and abstractions for interleaving expansion, type checking, and evaluation. The following summarizes our key innovations.
- Turnstile+ demands a radically different API for implementing a language’s types. It must be straightforward yet expressive enough to represent a range of constructs from base types, to binding forms like Π-types, to datatype definition forms for indexed inductive type families.
- Turnstile+ includes an API for defining type-level computation, which we dub normalization by macro expansion. A programmer writes a reduction rule using syntax resembling familiar on-paper notation, and Turnstile+ generates a macro definition that performs the reduction during macro expansion. This allows easily implementing modular type-level evaluation.
- Turnstile+’s new type API adds a generic type operation interface, enabling modular implementation of features such as error messages, pattern matching, and resugaring. This is particularly important for implementing tools like tactic systems that inspect intermediate type-checking steps and construct partial terms.
- Turnstile+’s core type checking infrastructure requires an overhaul, specifically with first-class type environments, in order to accommodate features like dependent binding structures of the shape[x:τ]...,i.e., telescopes [de Bruijn 1991; McBride 2000].
- Relatedly, Turnstile+’s inference-rule syntax is extended so that operations over telescopes, or premises with references to telescopes, operate as folds instead of as maps
The code is available at https://github.com/stchang/macrotypes.
How to Write Seemingly Unhygienic and Referentially
Opaque Macros with Syntax-rules
By Oleg Kiselyov
This paper details how folklore notions of hygiene and referential transparency
of R5RS macros are defeated by a systematic attack. We demonstrate
syntax-rules that seem to capture user identifiers and allow their own identifiers
to be captured by the closest lexical bindings. In other words, we have written
R5RS macros that accomplish what commonly believed to be impossible.
Type Systems as Macros, by Stephen Chang, Alex Knauth, Ben Greenman:
We present TURNSTILE, a metalanguage for creating typed embedded languages. To implement the type system, programmers write type checking rules resembling traditional judgment syntax. To implement the semantics, they incorporate elaborations into these rules. TURNSTILE critically depends on the idea of linguistic reuse. It exploits a macro system in a novel way to simultaneously type check and rewrite a surface program into a target language. Reusing a macro system also yields modular implementations whose rules may be mixed and matched to create other languages. Combined with typical compiler and runtime reuse, TURNSTILE produces performant typed embedded languages with little effort.
This looks pretty awesome considering it's not limited to simple typed languages, but extends all the way to System F and F-omega! Even better, they can reuse previous type systems to define new ones, thereby reducing the effort to implement more expressive type systems. All code and further details available here, and here's a blog post where Ben Greenman further discusses the related "type tailoring", and of course, these are both directly related to Active Libraries.
Taken to its extreme, why not have an assembler with a powerful macro system of this sort as your host language, and every high-level language would be built on this. I'm not sure if this approach would extend that far, but it's an interesting idea. You'd need a cpp-like highly portable macro tool, and porting to a new platform consists of writing architecture-specific macros for some core language, like System F.
This work may also conceptually dovetail with another thread discussing fexprs and compilation.
BER MetaOCaml -- an OCaml dialect for multi-stage programming
Oleg Kiselyov
2010-2015-
BER MetaOCaml is a conservative extension of OCaml for ``writing programs that generate programs''. BER MetaOCaml adds to OCaml the type of code values (denoting ``program code'', or future-stage computations), and two basic constructs to build them: quoting and splicing. The generated code can be printed, stored in a file -- or compiled and linked-back to the running program, thus implementing run-time code optimization. A well-typed BER MetaOCaml program generates only well-scoped and well-typed programs: The generated code shall compile without type errors. The generated code may run in the future but it is type checked now. BER MetaOCaml is a complete re-implementation of the original MetaOCaml by Walid Taha, Cristiano Calcagno and collaborators.
Introduction to staging and MetaOCaml
The standard example of meta-programming -- the running example of
A.P.Ershov's 1977 paper that begat partial evaluation -- is the
power function, computing x^n. In OCaml:
let square x = x * x
let rec power n x =
if n = 0 then 1
else if n mod 2 = 0 then square (power (n/2) x)
else x * (power (n-1) x)
[...]
In MetaOCaml, we may also specialize the power function to a particular value n, obtaining the code which will later receive x and compute x^n. We re-write power n x annotating expressions as computed `now' (when n is known) or `later' (when x is given).
let rec spower n x =
if n = 0 then .<1>.
else if n mod 2 = 0 then .<square .~(spower (n/2) x)>.
else .<.~x * .~(spower (n-1) x)>.;;
A brief history of (BER) MetaOCaml
As MetaOCaml was being developed, new versions of the mainline OCaml were released with sometimes many fixes and improvements. The MetaOCaml team tracked new OCaml releases and merged the changes into MetaOCaml. (The MetaOCaml version number has as its base OCaml's release version.) The merge was not painless. For example, any new function in the OCaml compiler that dealt with Parsetree (AST) or Typedtree has to be modified to handle MetaOCaml extensions to these data structures. The merge process became more and more painful as the two languages diverged. For instance, native code compilation that first appeared in MetaOCaml 3.07 relied on SCaml, a large set of patches to OCaml by malc at pulsesoft.com to support dynamic linking. OCaml 3.08 brought many changes that were incompatible with SCaml. Therefore, in MetaOCaml 3.08 the native compilation mode was broken. The mode was brought back in the Summer 2005, by re-engineering the SCaml patch and implementing the needed parts of dynamic linking without any modification to the OCaml code. The revived native compilation has survived through the end.
[...]
BER MetaOCaml has been re-structured to minimize the amount of changes to the OCaml type-checker and to separate the `kernel' from the `user-level'. The kernel is a set of patches and additions to OCaml, responsible for producing and type-checking code values. The processing of built code values -- so-called `running' -- is user-level. Currently the user-level metalib supports printing, type-checking, and byte-compiling and linking of code values. Users have added other ways of running the code, for example, compiling it to machine code, C or LLVM -- without any need to hack into (Meta)OCaml or even recompile it.
[...]
By relying on attributes, the feature of OCaml 4.02, BER N102 has become much closer integrated with OCaml. It is instructive to compare the amount of changes BER MetaOCaml makes to the OCaml distribution. The previous version (BER N101) modified 32 OCaml files. The new BER N102 modifies only 7 (that number could be further reduced to only 2; the only file with nontrivial modifications is typing/typecore.ml). It is now a distinct possibility that -- with small hooks that may be provided in the future OCaml versions -- MetaOCaml becomes just a regular library or a plug-in, rather being a fork.
The Next Stage of Staging, by Jun Inoue, Oleg Kiselyov, Yukiyoshi Kameyama:
This position paper argues for type-level metaprogramming, wherein types and type declarations are generated in addition to program terms. Term-level metaprogramming, which allows manipulating expressions only, has been extensively studied in the form of staging, which ensures static type safety with a clean semantics with hygiene (lexical scoping). However, the corresponding development is absent for type manipulation. We propose extensions to staging to cover ML-style module generation and show the possibilities they open up for type specialization and overhead-free parametrization of data types equipped with operations. We outline the challenges our proposed extensions pose for semantics and type safety, hence offering a starting point for a long-term program in the next stage of staging research. The key observation is that type declarations do not obey scoping rules as variables do, and that in metaprogramming, types are naturally prone to escaping the lexical environment in which they were declared. This sets next-stage staging apart from dependent types, whose benefits and implementation mechanisms overlap with our proposal, but which does not deal with type-declaration generation. Furthermore, it leads to an interesting connection between staging and the logic of definitions, adding to the study’s theoretical significance.
A position paper describing the next logical progression of staging to metaprogramming over types. Now with the true first-class modules of 1ML, perhaps there's a clearer way forward.
Types for Flexible Objects, by Pottayil Harisanker Menon, Zachary Palmer, Alexander Rozenshteyn, Scott Smith:
Scripting languages are popular in part due to their extremely flexible objects. These languages support numerous object features, including dynamic extension, mixins, traits, and first-class messages. While some work has succeeded in typing these features individually, the solutions have limitations in some cases and no project has combined the results.
In this paper we define TinyBang, a small typed language containing only functions, labeled data, a data combinator, and pattern matching. We show how it can directly express all of the aforementioned flexible object features and still have sound typing. We use a subtype constraint type inference system with several novel extensions to ensure full type inference; our algorithm reï¬nes parametric polymorphism for both flexibility and efficiency. We also use TinyBang to solve an open problem in OO literature: objects can be extended after being messaged without loss of width or depth subtyping and without dedicated metatheory. A core subset of TinyBang is proven sound and a preliminary implementation has been constructed.
An interesting paper I stumbled across quite by accident, it purports quite an ambitious set of features: generalizing previous work on first-class cases while supporting subtyping, mutation, and polymorphism all with full type inference, in an effort to match the flexibility of dynamically typed languages.
It does so by introducing a host of new concepts that are almost-but-not-quite generalizations of existing concepts, like "onions" which are kind of a type-indexed extensible record, and "scapes" which are sort of a generalization of pattern matching cases.
Instead of approaching objects via a record calculus, they approach it using its dual as variant matching. Matching functions then have degenerate dependent types, which I first saw in the paper Type Inference for First-Class Messages with Match-Functions. Interesting aside, Scott Smith was a coauthor on this last paper too, but it isn't referenced in the "flexible objects" paper, despite the fact that "scapes" are "match-functions".
Overall, quite a dense and ambitous paper, but the resulting TinyBang language looks very promising and quite expressive. Future work includes making the system more modular, as it currently requires whole program compilation, and adding first-class labels, which in past work has led to interesting results as well. Most work exploiting row polymorphism is particularly interesting because it supports efficient compilation to index-passing code for both records and variants. It's not clear if onions and scapes are also amenable to this sort of translation.
Edit: a previous paper was published in 2012, A Practical, Typed Variant Object Model -- Or, How to Stand On Your Head and Enjoy the View. BigBang is their language that provides syntactic sugar on top of TinyBang.
Edit 2: commas fixed, thanks!
Conor McBride gave an 8-lecture summer course on Dependently typed metaprogramming (in Agda) at the Cambridge University Computer Laboratory:
Dependently typed functional programming languages such as Agda are capable of expressing very precise types for data. When those data themselves encode types, we gain a powerful mechanism for abstracting generic operations over carefully circumscribed universes. This course will begin with a rapid depedently-typed programming primer in Agda, then explore techniques for and consequences of universe constructions. Of central importance are the “pattern functors†which determine the node structure of inductive and coinductive datatypes. We shall consider syntactic presentations of these functors (allowing operations as useful as symbolic differentiation), and relate them to the more uniform abstract notion of “containerâ€. We shall expose the double-life containers lead as “interaction structures†describing systems of effects. Later, we step up to functors over universes, acquiring the power of inductive-recursive definitions, and we use that power to build universes of dependent types.
The lecture notes, code, and video captures are available online.
As with his previous course, the notes contain many(!) mind expanding exploratory exercises, some of which quite challenging.
Tool Demo: Scala-Virtualized
This paper describes Scala-Virtualized, which extends the Scala language and compiler with a small number of features that enable combining the beneï¬ts of shallow and deep embeddings of DSLs. We demonstrate our approach by showing how to embed three different domain-speciï¬c languages in Scala. Moreover, we summarize how others have been using our extended compiler in their own research and teaching. Supporting artifacts of our tool include web-based tutorials, nightly builds, and an Eclipse update site hosting an up-to-date version of the Scala IDE for Eclipse based on the Virtualized Scala compiler and standard library.
Scala has always had a quite good EDSL story thanks to implicits, dot- and paren-inference, and methods-as-operators. Lately there are proposals to provide it with both macros-in-the-camlp4-sense and support for multi-stage programming. This paper goes into some depth on the foundations of the latter subject.
Macros that Work Together - Compile-Time Bindings, Partial Expansion, and Definition Contexts, Matthew Flatt, Ryan Culpepper, David Darais, and Robert Bruce Findler. Under consideration for publication in J. Functional Programming.
Racket (formerly PLT Scheme) is a large language that is built mostly within itself. Unlike the usual
approach taken by non-Lisp languages, the self-hosting of Racket is not a matter of bootstrapping
one implementation through a previous implementation, but instead a matter of building a tower of
languages and libraries via macros. The upper layers of the tower include a class system, a component
system, pedagogic variants of Scheme, a statically typed dialect of Scheme, and more. The demands
of this language-construction effort require a macro system that is substantially more expressive than
previous macro systems. In particular, while conventional Scheme macro systems handle stand-alone
syntactic forms adequately, they provide weak support for macros that share information or macros
that use existing syntactic forms in new contexts.
This paper describes and models novel features of the Racket macro system, including support for
general compile-time bindings, sub-form expansion and analysis, and environment management. The
presentation assumes a basic familiarity with Lisp-style macros, and it takes for granted the need for
macros that respect lexical scope. The model, however, strips away the pattern and template system
that is normally associated with Scheme macros, isolating a core that is simpler, that can support
pattern and template forms themselves as macros, and that generalizes naturally to Racket’s other
extensions.
A good description of Racket's rocket science tools for growing languages.
|
Recent comments
27 weeks 1 day ago
27 weeks 1 day ago
27 weeks 1 day ago
49 weeks 2 days ago
1 year 1 week ago
1 year 3 weeks ago
1 year 3 weeks ago
1 year 5 weeks ago
1 year 10 weeks ago
1 year 10 weeks ago