Type Systems as Macros

Type Systems as Macros, by Stephen Chang, Alex Knauth, Ben Greenman:

We present TURNSTILE, a metalanguage for creating typed embedded languages. To implement the type system, programmers write type checking rules resembling traditional judgment syntax. To implement the semantics, they incorporate elaborations into these rules. TURNSTILE critically depends on the idea of linguistic reuse. It exploits a macro system in a novel way to simultaneously type check and rewrite a surface program into a target language. Reusing a macro system also yields modular implementations whose rules may be mixed and matched to create other languages. Combined with typical compiler and runtime reuse, TURNSTILE produces performant typed embedded languages with little effort.

This looks pretty awesome, considering it's not limited to simply typed languages but extends all the way to System F and F-omega! Even better, they can reuse previous type systems to define new ones, thereby reducing the effort needed to implement more expressive type systems. All code and further details are available here, and here's a blog post where Ben Greenman further discusses the related "type tailoring"; of course, both of these are directly related to Active Libraries.

Taken to its extreme, why not have an assembler with a powerful macro system of this sort as your host language, and every high-level language would be built on this. I'm not sure if this approach would extend that far, but it's an interesting idea. You'd need a cpp-like highly portable macro tool, and porting to a new platform consists of writing architecture-specific macros for some core language, like System F.

This work may also conceptually dovetail with another thread discussing fexprs and compilation.

So, it's in Lisp, huh..

is this new?

This article is not original in its main thrust as described in the abstract. The use of judgments to fabricate new types and macros to design new typed languages was anticipated by the open science language Shen which was released in 2011. You'll find a 30 minute video describing Shen on the home page www.shenlanguage.org. Shen was first written in Lisp but now exists in many languages (ports in Java, JVM, JS, Clojure, C, Scheme, CL, Emacs Lisp, Python, Haskell, Ruby).

Prior to that we had Qi (2005) which also used 'judgments' in pursuit of the same goal and if you really want to go back - SEQUEL in 'A Language for Implementing Arbitrary Logics' in International Joint Conference on Artificial Intelligence (1993).
Both done in Lisp.

Anyway, here are some leads so you can follow them up and judge for yourselves.

1. The Shen appeal video of 2013 which shows the scope of our work (about 5 minutes). http://shenlanguage.org/appeal.mp4.

2. The Shen introductory video of 2017, which explains Shen without making any background assumptions (about 30 minutes). See http://shenlanguage.org/. In that video you'll find a program using macros and judgments to form an ML-style type system. The code is lifted from the Book of Shen (2015).

3. The OS Shen Manual; http://shenlanguage.org/learn-shen/index.html.

4. Really old; but my invited 2008 talk on Qi 'Lisp for the C21' paid for by sponsors. http://shenlanguage.org/l21.wmv.

5. My 2015 essay 'In Defence of Open Science', where I discuss the closed peer review process vs open science in the context of Wikipedia and the abuses of the traditional system. http://www.marktarver.com/openscience.html

6. Rather old; but a history page of the project from 1990 to 2014. http://shenlanguage.org/history.html

7. For the really keen, "The Book of Shen" (2015) 3rd edition which discusses the whole technology and the algorithms behind it.
https://www.amazon.co.uk/dp/1784562130

As a group we've been around for a long time; we have nearly 500 members. The Shen kernel went open source 2 years ago and we've got a group of open source enthusiasts who are now doing really well improving and porting it. Go to the news group if you want to learn more and participate in the project.

Mark Tarver

comparison with Shen

Our paper presents a method for implementing typed DSLs, particularly type checkers for these DSLs, using only an off-the-shelf macro system, namely Racket's.

Our method is pragmatic since it is compatible with the way Racket programmers already implement (untyped) DSLs, as a series of macro definitions. We show that these DSL creators may add type checking to their language simply by extending their macro definitions with a protocol for communicating the necessary type information. As a result, type checking is computed as part of macro expansion, hence "type systems as macros". Finally, we add a syntactic layer (Turnstile) that enables defining these type-checking macros using a declarative syntax that roughly resembles conventional type rule specifications.
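The protocol described here — each macro checks types while rewriting the program — can be modeled outside Racket. Below is a minimal illustrative sketch in Python (not Turnstile's actual API; the term encodings and the `expand` function are invented for exposition): each "rule" elaborates one surface form and computes its type in the same recursive pass, the way type checking piggybacks on macro expansion.

```python
# Illustrative model: each rule of `expand` plays the role of a type-checking
# macro. It simultaneously (a) checks the types of subterms and (b) rewrites
# the surface form into a target term (here, lambdas lose their annotations).

def expand(term, env):
    """Return (elaborated_term, type) or raise TypeError."""
    if isinstance(term, int):                      # integer literal
        return term, "Int"
    if isinstance(term, str):                      # variable reference
        return term, env[term]
    head = term[0]
    if head == "lam":                              # ("lam", x, annotation, body)
        _, x, ann, body = term
        body_e, body_t = expand(body, {**env, x: ann})
        return ("lam", x, body_e), ("->", ann, body_t)
    if head == "app":                              # ("app", fn, arg)
        _, fn, arg = term
        fn_e, fn_t = expand(fn, env)
        arg_e, arg_t = expand(arg, env)
        if not isinstance(fn_t, tuple) or fn_t[0] != "->" or fn_t[1] != arg_t:
            raise TypeError(f"cannot apply {fn_t} to {arg_t}")
        return ("app", fn_e, arg_e), fn_t[2]       # result type from the arrow
    raise TypeError(f"unknown form: {term}")

prog = ("app", ("lam", "x", "Int", "x"), 42)
elab, ty = expand(prog, {})
print(elab, ":", ty)    # the elaborated term together with its type
```

The point of the real technique is that this single pass is not a separate type checker bolted onto the language: it *is* the macro expander, so untyped DSL macros can be upgraded to typed ones by extending them with the type-communication protocol.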

In contrast, Shen appears to already come with a full-featured type system out of the box. Programmers then extend this type system via a special-purpose framework that allows defining new datatypes in a sequent-calculus-style syntax. Thus Shen seems to fall under the "extensible type systems" line of work?

Implementing typed DSLs with extensible type systems is typically more restrictive than our approach because the base type system constrains both the features that *may* be added and the features that *must* be included in any DSL. Shen's extension framework seems sufficiently expressive that the former may not be much of a problem, but I think the latter still might be an issue? In other words, can a DSL writer take away unwanted type system features? For example, can one reuse Shen's type system to create a simply-typed DSL where polymorphism is not allowed? How about defining a new function application rule for a DSL with a type-and-effect system? Or how about a DSL with completely different base types, e.g., Int for integers instead of Number?

Implementation-wise, neither Shen's base type system nor its extension rules appear to desugar into plain macro definitions. Instead, I think they correspond to rules in an underlying logic language? In short, I do not see "type systems as macros" in any of the linked examples, and thus I do not understand the similarity assertions. Would "type extensions as logic programs" be a more accurate description of Shen?

Turnstile and Shen

I don't doubt that there are differences; but the central claims of the abstract wrt novelty are questionable.

To implement the type system, programmers write type checking rules resembling traditional judgment syntax.

Definitely not new. I did this 25 years ago, published it in 1993, and the entire Qi/Shen group has been using the technology for > 10 years.

TURNSTILE ...... exploits a macro system in a novel way to simultaneously type check and rewrite a surface program into a target language.

Already well established in our group. Shenturians know how to do this. We've implemented macro constructions for FOR ..... NEXT, pattern-matching local assignments and other DSL features using this technology.

Shen's extension framework seems sufficiently expressive such that the former may not be much of problem, but I think the latter still might be an issue? In other words, can a DSL writer take away unwanted type system features?

Actually, yes. It is a built-in feature that hardly any Shenturian uses, but the power is there, has been for years, and can be easily invoked. The command (enable-type-theory -) wipes the slate clean.

(0+) enable-type-theory
enable-type-theory : (symbol --> boolean)

(1+) (datatype int

if (integer? N)
_______________
N : int;)
type#int : symbol

(2+) 1
1 : number

(3+) (enable-type-theory -)
false : boolean

(4+) 1
1 : int


You now have a blank slate.

Would "type extensions as logic programs" be a more accurate description of Shen?

Shen's logic engine is simply a means of compiling judgments into efficient code for type checking.

I'm not arguing that Turnstile is a reproduction of Shen. Simply that the degree of overlap is significant enough to warrant some comparison in the paper.

Mark

was SEQUEL new?

I'm looking at "A language for implementing arbitrary logics". How does it compare to the Edinburgh Logical Framework (and its incarnation Twelf) and to LambdaProlog? They both use logic programming and proof search, and are from the beginning of the 90s. The paper seems to cite LambdaProlog (Miller 1990) but I can't quickly find a mention of it or a "related work" discussion. LF's first paper is from the same year, so one can't complain about the lack of a citation.

I see that paper is cited a few times, including by later papers from you. So why no "papers" page? I've often tried to discern contributions from Shen's user manuals.

https://pdfs.semanticscholar.org/3bd4/9137d3f2451296e4804b6b95988f126bc630.pdf

Your "open science" essay raises questions about soundness. I agree peer review in CS is not sufficient to ensure proofs are bug-free, I have doubts on the rest.
Have you considered mechanizing your type soundness proof and publishing that?

EDIT: to be sure, nowadays peer review in programming languages is turning more and more often to mechanized proofs, since we are aware of the problem. The characterization of peer review in that essay mentions a few known problems, but seems otherwise somewhat unfair. While peer review might fail at checking correctness, it still checks whether you have a compelling argument for caring about your system.

was SEQUEL new?

SEQUEL was first published in 1992; the IJCAI paper followed a year later. Basically my direction was different from ELF and NuPrl because my goal was to create a shell in which judgments could be compiled to define types and used to check programs without interaction from the user. Since the system was a shell, there was no question of a metalogic beyond the notation of sequent calculus. In both these respects SEQUEL differed from NuPrl and ELF.

Shen is SEQUEL 20 years on - with a semantics and a correctness proof of the native type system (contained in the book) and a logic compiler many times better than SEQUEL's.

The book (with the proofs) has been purchased in the hundreds and the system has been in public use for 12 years without anybody finding any serious error. During that time we've had maybe 10,000+ downloads. We have nearly 500 members in the group.

Have you considered mechanizing your type soundness proof and publishing that?

Have I thought about it? Yes. Will I do it? No. I just haven't the luxury of giving a year or two to do this.

Regarding a fully formal proof: fully formal proofs are very rare in CS and in mathematics. A fully formal proof can be a Ph.D. in itself. Very few published proofs have that degree of certification.

One addendum though; my third and final book in CS is called 'Program Verification in Shen' and it does provide a framework for formal reasoning about Shen programs. PVS is much closer in spirit to ELF.

Mark

Type soundness proof

>>Have you considered mechanizing your type soundness proof and publishing that?

>Have I thought about it? Yes. Will I do it? No. I just haven't the luxury of giving a year or two to do this.

...

>my third and final book in CS is called 'Program Verification in Shen' and it does provide a framework for formal reasoning wrt Shen programs

Failing an investment of a great deal of time, and providing something that is easier to discuss on LtU than a pay-to-participate book: is it possible that you might outline your approach in an article or detailed blog post?

Twelf does a great deal of automation of structural induction. I would be interested to know if its approach could be carried over to Shen.

The proof of the soundness

The proof of the soundness of the type system comes within a segment of TBoS running from page 334 to 368. That's over 30 pages, including a formal semantics, a statement of the type theory, the actual proof (pp. 351-356), and a proof of the termination of the type checker (pp. 356-361). So it's rather a lot of dense material to précis in a blog.

Really the book needs to be 'nationalised' but there is no finance framework for supporting Shen outside the income stream from the book and other sources. The lack of a suitable funding framework means that Shen technology is not widely understood outside the Shen group.

The PVS system (Program Verification in Shen) is still in development and the central algorithm Omega needs a correctness proof.

This algorithm compiles Shen programs to formulae in a typed second-order logic called LLambda. I've done one basic inductive proof in LLambda so far - the associativity of append. This took 63 steps, indicating the kind of work you have to do to accomplish formal proofs. I'd agree that automation would be great!

Omega is generally applicable to any pattern-directed functional language like ML, so the work is of general interest outside Shen.

LF

The original LF paper was published in 1987, followed quickly by numerous other papers and theses.

Elf, which became Twelf, was created in 1989, and developed for decades thereafter.

SEQUEL

SEQUEL was a general purpose *programming language* whose first applications happened to be in LF's area of application. There wasn't much overlap between the designs of LF and SEQUEL.

Post-'90s, most programming in SEQUEL's successors, Qi and Shen, has not been in LF's area of application.

"The Book of Shen" is not expensive and the barriers to comprehension are no steeper than the pay walls around many academic journals. The free online materials on the site would inform even the most casual observer of what the group is doing and the news group is always there to provide information.

In fact at least one member of the Racket mailing list seemed to be aware of the Shen project right from the outset.

https://lists.racket-lang.org/users/archive/2012-January/049829.html

You'll also find discussion of Shen in the Racket IRC.

Mark

Unification

Isn't this basically showing that Racket's macro matching is essentially unification, which is the principal operation in most type systems?

Effectively the macros form a committed choice logic language.

Not unification

No, Racket's syntax matching is not a unification algorithm. Culpepper and Felleisen's ICFP 2010 paper has more details if you are interested.

Equivalence?

If you can construct a functioning type system, then matching must be equivalent to unification - otherwise how can the resulting type system be a consistent logic? Or am I reading too much into this paper? Have you simply built a unification algorithm as a meta-program? In other words, macros form a Turing-complete compile-time language, so they can implement any program, one of which is type inference/checking. You then use reflection to type the target program?
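The distinction this exchange turns on can be made concrete: one-way pattern matching binds variables only on the pattern side, while unification solves for variables on both sides. A minimal sketch in Python (hypothetical helper names, not Racket's or Shen's implementation; variables are strings starting with "?"):

```python
# One-way matching: only variables in `pattern` may bind; the term is ground.
def is_var(t):
    return isinstance(t, str) and t.startswith("?")

def match(pattern, term, subst):
    """Returns an extended substitution, or None on failure."""
    if is_var(pattern):
        if pattern in subst:
            return subst if subst[pattern] == term else None
        return {**subst, pattern: term}
    if isinstance(pattern, tuple) and isinstance(term, tuple) \
            and len(pattern) == len(term):
        for p, t in zip(pattern, term):
            subst = match(p, t, subst)
            if subst is None:
                return None
        return subst
    return subst if pattern == term else None

# Unification: variables on *either* side may bind. Shallow walk, no occurs
# check -- an illustration only, not a production algorithm.
def unify(a, b, subst):
    a, b = subst.get(a, a), subst.get(b, b)
    if a == b:
        return subst
    if is_var(a):
        return {**subst, a: b}
    if is_var(b):
        return {**subst, b: a}
    if isinstance(a, tuple) and isinstance(b, tuple) and len(a) == len(b):
        for x, y in zip(a, b):
            subst = unify(x, y, subst)
            if subst is None:
                return None
        return subst
    return None

# Matching fails when the term itself contains an unsolved variable:
print(match(("->", "?a", "Int"), ("->", "Bool", "?b"), {}))   # None
# Unification succeeds, binding variables on both sides:
print(unify(("->", "?a", "Int"), ("->", "Bool", "?b"), {}))   # {'?a': 'Bool', '?b': 'Int'}
```

Type inference needs the second behavior (both the rule's type and the term's type may contain unknowns), which is why matching alone does not suffice but a sufficiently expressive macro system can still implement unification as an ordinary meta-program.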

more than matching

Yes, we need more than matching. You're right that any sufficiently expressive metaprogramming framework can be used to implement a type checker. Our focus is more on the pragmatics. We want everyday programmers to be able to create typed DSLs if they want. Ours is just one possible approach, though preliminary feedback from users has been encouraging.

Hackett (mini-Haskell over Racket) uses this work

Alexis King's Hackett project uses this work (Type Systems as Macros) to implement a non-trivial subset of the Haskell type system. See the recent blog post and the interesting reddit discussion.