Fexprs as the basis of Lisp function application; or, $vau: the ultimate abstraction

John N. Shutt's PhD Thesis is posted.
Abstract

Abstraction creates custom programming languages that facilitate programming for specific problem domains. It is traditionally partitioned according to a two-phase model of program evaluation, into syntactic abstraction enacted at translation time, and semantic abstraction enacted at run time. Abstractions pigeon-holed into one phase cannot interact freely with those in the other, since they are required to occur at logically distinct times.

Fexprs are a Lisp device that subsumes the capabilities of syntactic abstraction, but is enacted at run-time, thus eliminating the phase barrier between abstractions. Lisps of recent decades have avoided fexprs because of semantic ill-behavedness that accompanied fexprs in the dynamically scoped Lisps of the 1960s and 70s.

This dissertation contends that the severe difficulties attendant on fexprs in the past are not essential, and can be overcome by judicious coordination with other elements of language design. In particular, fexprs can form the basis for a simple, well-behaved Scheme-like language, subsuming traditional abstractions without a multi-phase model of evaluation.

The thesis is supported by a new Scheme-like language called Kernel, created for this work, in which each Scheme-style procedure consists of a wrapper that induces evaluation of operands, around a fexpr that acts on the resulting arguments. This arrangement enables Kernel to use a simple direct style of selectively evaluating subexpressions, in place of most Lisps' indirect quasiquotation style of selectively suppressing subexpression evaluation. The semantics of Kernel are treated through a new family of formal calculi, introduced here, called vau calculi. Vau calculi use direct subexpression-evaluation style to extend lambda calculus, eliminating a long-standing incompatibility between lambda calculus and fexprs that would otherwise trivialize their equational theories.

The impure vau calculi introduce non-functional binding constructs and unconventional forms of substitution. This strategy avoids a difficulty of Felleisen's lambda-v-CS calculus, which modeled impure control and state using a partially non-compatible reduction relation, and therefore only approximated the Church-Rosser and Plotkin's Correspondence Theorems. The strategy here is supported by an abstract class of Regular Substitutive Reduction Systems, generalizing Klop's Regular Combinatory Reduction Systems.

$VAUs vs. fexprs

As has been pointed out before, the objects constructed with $VAU aren't realio, trulio, old Lisp fexprs, because they can't be invoked in the same way as ordinary procedures, as by using APPLY. You can only invoke a $VAU object that is not statically known by constructing a Lisp form and using EVAL to evaluate it.

re $VAUs vs. fexprs (what are you trying to say?)

the objects constructed with $VAU aren't realio, trulio, old Lisp fexprs, because they can't be invoked in the same way as ordinary procedures, as by using APPLY.

If I am taking your meaning correctly, one can pretty much define the "APPLY" that you are thinking of perfectly well. You are making a distinction without much difference. Or maybe you can make it more precise?

You can only invoke a $VAU object that is not statically known by constructing a Lisp form and using EVAL to evaluate it.

Can you say that more precisely? In particular, do you mean that it is impossible to create some call site (of the form "(x y z [....])") where the binding of x to a $VAU object is not statically known at all... or do you mean that such a form can see the binding of x to some $VAU object change at run-time only if the environment of x is captured and modified (in statically knowable ways)?

Either way, how exactly do you think this justifies refusing the name "FEXPR"?

The main reason why a linguistic maven might withhold the name "FEXPR" is, as far as I can tell, something that Shutt writes about: that FEXPRs were common when dynamically scoped lisps were the norm but fell out of practice, coincidentally(?), as lexically scoped lisps became more common. If you think "FEXPR" is a term that only applies in a dynamically scoped lisp then, trivially, it doesn't much apply (pun not intended but unavoidable).

The unkind but frankly-I-believe-it take is that as the compiler gear heads with the most grant money came to dominate Scheme, they chased FEXPRs out because there were a lot of easy plays to be had if they were gone -- not because it really made sense to get rid of FEXPRs or first class environments in a lexically scoped language. The hygienic macro movement is valuable -- and most valuably understood as a case study in a special case of FEXPRs and first class environments.

Anecdote on implementing hygiene

The hygienic macro movement is valuable -- and most valuably understood as a case study in a special case of FEXPRs and first class environments.

Since our last discussions on fexprs and hygiene, I've implemented a hygienic macro system patterned after SRFI 72 in a Lisp->C compiler I'm working on (here's SRFI 72's my-or example in my Lisp dialect). Unlike the usual implementations of such systems, which "render" down to a core language like R5RS, my implementation does have some support from the compiler - namely, identifier ASTs have a hygiene context slot, and this is set to a fresh UUID by quasisyntax (and can be manipulated using datum->syntax to inject code into a particular hygiene context). This makes hygiene extremely simple to implement - I think the C code in the compiler dedicated to hygiene amounts to no more than a few dozen lines.
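To give a flavor of the mechanism (a hypothetical sketch in portable Scheme, not the actual compiler code: the record type, fresh-context, and same-binding? are stand-ins for the compiler's identifier AST node, its UUID generator, and its binding comparison):

  ;; Identifiers carry a hygiene-context field; each quasisyntax expansion
  ;; stamps a fresh context into the identifiers it introduces, and a
  ;; datum->syntax-style operation copies an existing context instead.
  (define-record-type identifier              ; SRFI 9
    (make-identifier name hygiene-context)
    identifier?
    (name identifier-name)
    (hygiene-context identifier-context))

  ;; Stand-in for the compiler's fresh-UUID generator.
  (define fresh-context
    (let ((n 0))
      (lambda () (set! n (+ n 1)) n)))

  ;; Two identifiers denote the same binding only if both the symbolic
  ;; name and the hygiene context agree.
  (define (same-binding? a b)
    (and (eq? (identifier-name a) (identifier-name b))
         (eqv? (identifier-context a) (identifier-context b))))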

I just wanted to report this because usually hygiene is presented through a gigantic apparatus (e.g. portable syntax-case has thousands of lines of code), which makes it appear much more complicated than it really is. With support from the compiler core, hygiene is basically trivial to implement. I'll still take another deep look at Shutt's work, though, of course.

expansion-passing style

Unfortunately the hygiene movement sidelined Dybvig's expansion-passing style, which can handle call-by-name, call-by-need, code-walking, etc. very well.

Visionary directions for languages

The unkind but frankly-I-believe-it take is that as the compiler gear heads with the most grant money came to dominate Scheme, they chased FEXPRs out because there were a lot of easy plays to be had if they were gone -- not because it really made sense to get rid of FEXPRs or first class environments in a lexically scoped language.

Perhaps, but there was at least one paper in the 1980s, written after Pitman's pox on FEXPRs, that argued that macros were less efficient than fexprs for "call by text" scenarios, such as embedded DSLs.

Such embedded DSLs are broader than just Lisp, though! For example, Eelco Visser's scannerless GLR parsing (SGLR) and Bryan Ford's parsing expression grammars (PEG) both share a common language characteristic in that they both use a basic design trick of eliminating the scanner from the compiler.[1] The same trick shows up in other places as well, such as programming-by-demonstration, and it underlies Jonathan Edwards' "subtextual" languages (which project changes through copies). Since Subtext's parsing model isn't defined anywhere (and thus can't strictly be called scannerless), it is worth noting that SGLR, PEG, and Subtext share an even more basic characteristic: a direct style of providing language embeddings. That direct style removes obfuscations such as The Lexer Hack, and avoids the second-class copy-and-paste that otherwise requires extending the programming environment to enrich the IDE (see Managing Duplicated Code with Linked Editing). An indirect style, by contrast, also complicates unparser definitions, vis-a-vis (Lisp) code walkers, whose environment must ultimately know how to handle common uses of code walking such as the pretty printing required for Alex Plotnick's CLWEB -- code walkers tend to be used when non-local structure of the code must be known to generate a local effect.

See Sometimes an FEXPR is better than a macro (ACM Paywall):

Common Lisp, which is becoming THE Lisp standard, does not support call by text (FEXPR mechanism in Mac/Franz Lisp). This effect can be obtained using macros. Based on the experience of converting an OPS5 implementation from Franz Lisp to Common Lisp, it is argued that sometimes call by text is needed for efficiency, despite its negative aspects. In the case of languages embedded in a Lisp system, using the macro alternative for call by text can cause macro expansion at execution time. This leads to a somewhat less efficient implementation of the embedded language.

Of course, since Lichtman's paper was written, the Scheme community has developed conventions for writing macros that try to avoid these issues while simultaneously trying to solve the more important semantic issue of composition, such as CPS-style macros (see Oleg's coverage of composing macros for an introduction).

An important problem in language design is actually figuring out how to compose several embedded DSLs. This is one of the big ticket issues for VPRI's Fundamentals of New Computing project and NSF grant. In their work, they refer to these issues via metaphorical solutions -- "particles" and "fields". Such metaphors just point out how incomplete their solutions are.

It would also be nice if the next great SICP-like book were written with such questions in mind, since most real-world systems are not designed in isolation but require integrating across many systems with many syntaxes (e.g., an ORM).

With that, I will retreat into my nerd cave and read this monstrous dissertation. I hope to see this promoted to the front page. I am thankful this was posted to the LtU forum first, since front-page-only stories do not get bumped in "Active forum topics" when a reply is made many moons later. Considering the length of this dissertation and what it seeks to debunk, I expect very few initial comments here.

Footnotes:
[1] This trick -- scannerless parsing -- has been used to create two programming languages where control flow is effectively defined in terms of the parsing properties: Katahdin and OMeta. Both use PEGs as their core control flow construct.

No surprise....

Perhaps, but there was at least one paper in the 1980s, written after Pitman's pox on FEXPRs, that argued that macros were less efficient than fexprs for "call by text" scenarios, such as embedded DSLs.

The Common Lisp standard explicitly permits an implementation to provide a special form (FEXPR) corresponding to any standard macro for efficiency. However, it requires that the macro definition also be present, so that code-walkers (including compilers) can always expand them out, leaving only the 25 special forms defined by Common Lisp. (Scheme manages with seven or eight, nyah nyah.)

For example, Eelco Visser's scannerless GLR parsing (SGLR) and Bryan Ford's parsing expression grammars (PEG) both share a common language characteristic in that they both use a basic design trick of eliminating the scanner from the compiler.

I can't speak to SGLR, but PEG parsers typically don't have scanners because they don't need them. LL(1) and LALR(1) parsers need scanners to keep the amount of lookahead down to one token, but since PEG parsers are willing to look arbitrarily far ahead, they don't need a separate routine to chunk the input into tokens. In principle, a PEG parser could also have the bytes-to-characters decoding specified by rules, or for that matter the bits-to-bytes decoding, and accept input in the form of bits. It may well be, however, that certain PEG parsers would run faster given a regex-style scanner on the front end; I have no intuition about that, and I know of no relevant measurements.

Yes,

It may well be, however, that certain PEG parsers would run faster given a regex-style scanner on the front end.

Indeed, even without equipping them with a separate scanner in the strict sense, I can confirm (AFAIK) that PEG-based parsers (and their "PEG-oriented fluent interface" cousins) accommodate regular-expression matching quite naturally to consume the input you give them, wherever you see fit (within this or that outer PEG pattern).

Scheme special forms

I've heard that most Schemes have fewer than a dozen fundamental special forms. Can you provide a link to a paper that discusses what they are? Obviously there are going to be choices, but I'm interested in a representative choice.

The Racket list

Here's the list for Racket, in the docs. Note that begin0 and #%plain-lambda are both trivially expressible as macros, and are kept for efficiency and better error reporting.

Thanks!

I count 23 fundamental forms (including begin0 and #%plain-lambda).

Plain R5RS Scheme has ...

... variables, procedure calls, constants with or without quote, lambda, if, set!, and the declarations define and define-syntax; it's possible that let-syntax and letrec-syntax are also primitive. You can see the lists in R5RS sections 4.1, 5.2, and 4.3.1.
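For instance, let is one of the derived forms R5RS specifies by rewriting into those primitives; here is a sketch of the familiar expansion into a lambda application (named my-let here, a hypothetical name chosen to avoid shadowing the built-in):

  (define-syntax my-let
    (syntax-rules ()
      ((_ ((name val) ...) body1 body2 ...)
       ((lambda (name ...) body1 body2 ...) val ...))))

  ;; (my-let ((x 1) (y 2)) (+ x y))  expands to  ((lambda (x y) (+ x y)) 1 2)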

Racket's additions mostly have to do with its support for low-level macros (that is, ones whose bodies are written in Scheme), as mandated by R6RS but not R5RS.

Not correct

First, R5RS does not require any particular set of primitive forms. Implementations are free to make cond primitive, or any other special form.

Second, letrec is primitive; R5RS letrec cannot be faithfully implemented as a macro (as the R5RS discusses).

Third, very few of Racket's additions have to do with procedural macros - only quote-syntax and define-values-for-syntax.

letrec can be transformed

It's possible to transform letrec into e.g. let and set!, as "Fixing Letrec" does, but indeed to obey the letrec restriction, you need some help from the compiler.
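Roughly, and ignoring the validity checks and evaluation-order details that "Fixing Letrec" is actually concerned with, the shape of that transformation is:

  ;; letrec form:
  (letrec ((even? (lambda (n) (if (zero? n) #t (odd? (- n 1)))))
           (odd?  (lambda (n) (if (zero? n) #f (even? (- n 1))))))
    (even? 10))

  ;; rewritten with let and set!, using placeholder initial values:
  (let ((even? #f)
        (odd?  #f))
    (set! even? (lambda (n) (if (zero? n) #t (odd? (- n 1)))))
    (set! odd?  (lambda (n) (if (zero? n) #f (even? (- n 1)))))
    (even? 10))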

I would not be surprised if some let-syntax trickery could take care of the validity checks, though. Not that it matters to most people, and such a transformation would make optimization quite difficult.

counting Kernel "special forms"

This got me thinking about the corresponding count in Kernel. There isn't anything quite like what is meant by "fundamental special form" in Scheme, but the closest thing is probably Kernel primitive operatives, and in that case the count is three: $if, $define!, and $vau. All the other usual suspects are derived (in principle, anyway): $lambda, $sequence, $set!, $let, $letrec, etc.
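For example, the library derivation of $lambda (essentially as given in the Kernel Report, and leaning on the library applicative list*) is a one-liner over $vau, wrap, and eval:

  ($define! $lambda
    ($vau (formals . body) env
      (wrap (eval (list* $vau formals #ignore body) env))))

The operand tree is rebuilt as a $vau expression with #ignore as its environment parameter, evaluated in the dynamic environment to yield an operative, and then wrapped so that the resulting combiner evaluates its operands.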

Two caveats.

(1) Kernel operatives aren't "special forms" in the sense of reserved operators that don't get evaluated: all operators get evaluated. Consequently, primitive operatives don't naturally occur on the same list with things like "variables" and "constants". The evaluator algorithm has three cases — variables (symbols), combinations (pairs), and constants (everything else) — and combinations have two subcases, distinguished after operator evaluation — applicative combinations and operative combinations. But if "procedure call" is a single case for Scheme, then "operative combination" is a single case for Kernel (or perhaps "combination" is a single monolithic case, though that's not how the Report portrays it).

(2) In counting the number of primitive operatives, one could argue that every primitive applicative should be counted as well, since each one is a wrapper around an operative, resulting in a huge count — but I doubt it's useful to think of, say, number? as anything but an applicative, with the usual Kernel facilities for manipulating it (which may involve unwrapping it). For those applicatives one might as well think of unwrap as a way of deriving new operatives from them.
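To make caveat (1) concrete, here is a rough meta-circular sketch of that dispatch (a sketch only, not the Report's definition: lookup and operate are assumed helpers for environment lookup and for invoking an operative on an operand tree plus dynamic environment, and applicative?/unwrap stand for host-level access to the applicative one-tuple):

  (define (kernel-eval expr env)
    (cond ((symbol? expr) (lookup expr env))           ; variables
          ((pair? expr)                                ; combinations
           (combine (kernel-eval (car expr) env) (cdr expr) env))
          (else expr)))                                ; constants self-evaluate

  (define (combine combiner operands env)
    (if (applicative? combiner)
        ;; applicative combination: evaluate the operands (order glossed
        ;; over here), then combine the underlying combiner with the
        ;; resulting arguments
        (combine (unwrap combiner)
                 (map (lambda (o) (kernel-eval o env)) operands)
                 env)
        ;; operative combination: pass the operands unevaluated, along
        ;; with the dynamic environment
        (operate combiner operands env)))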

[edit: egregious typo fix, "now" -> "not"]

three "special forms"

That's fascinating, John. Do I understand correctly that, therefore, a very concrete operational semantics for the core could be given by a short C program that:

1) Defines a primitive tagged type which wraps either a cons pair, nil, a symbol, an environment, a primitive operative (which wraps a C function pointer), or a user-defined operative (which wraps a pointer to the body of the operative and a pointer to its static environment)
2) Defines a base environment that has exactly three bindings mapping the primitive operative's symbols into the operative's data representation.
3) Defines the three operatives themselves as C functions that take operands and a lexical environment data structure as arguments and return their result.

This would come with a standard library which defined all the usual suspects ($lambda, wrap, unwrap, etc...) in terms of the core three alone, and any "practical" application would include a few other primitive operatives (e.g. read, write).

Is this approximately right? Does it need a primitive $eval as well?

I'm sure the metacircular definitions work well enough for those better versed in implementing Scheme, but I'd personally find it very instructive to see what a minimal self-contained C definition of Kernel looks like.

three "special forms" and some primitive applicatives

Alas, no. Besides three additional types (two more single-value types, and the all-important applicative type — an applicative being a one-tuple whose element is either an operative or another applicative), you'd also want bindings for a number of other core primitive applicatives (or their underlying operatives, which I take it is what you mean by $eval). All I meant by "usual suspects" was the other operatives that would be primitive in Scheme but are derived in Kernel; and in order to derive those, you also need a bunch of the core primitive applicatives, which cannot themselves be derived from the three core primitive operatives. For example, the usual (non-robust) library derivation of $sequence uses not only $if, $define!, and $vau, but also wrap, null?, eval, and cons.
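For the curious, here is a simplified, non-robust sketch of such a derivation (less careful than the Report's version, and using a hypothetical helper name $seq-aux), relying only on those primitives plus parameter-tree destructuring:

  ($define! $sequence
    ($vau exprs env
      ($if (null? exprs)
           #inert
           (eval (cons $seq-aux exprs) env))))

  ($define! $seq-aux
    ($vau (head . tail) env
      ($if (null? tail)
           (eval head env)
           ;; applying a wrapped operative forces its operand (eval head env)
           ;; to be evaluated before the operative body evaluates the rest
           ((wrap ($vau #ignore #ignore
                    (eval (cons $seq-aux tail) env)))
            (eval head env)))))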

You'd surely want some of the following bound in your environment:

  • Nine core primitive type predicates. (The C program distinguishes ten, because it knows which operatives are primitive versus compound.)
  • Two core primitive comparison predicates. (eq?, equal?.)
  • Three core primitive constructors. (cons, make-environment, wrap.)
  • Two core primitive mutators. (set-car!, set-cdr!.)
  • One core primitive accessor. (unwrap.)
  • eval.

From the C program's perspective, all those core primitive applicatives appear to be additional core primitive operatives, so there are far more of them than there are Scheme "fundamental special forms". However, conceptually they seem to me analogous not to Scheme's fundamental special forms, but to Scheme's built-in procedures, which is why I didn't count them.

An additional difficulty in the comparison between languages, though, is that if most of the C program's "primitive operatives" are considered analogous to Scheme built-in procedures, why not all of them? If you treat $if, $define!, and $vau as just three more built-in procedures, then Kernel has no "fundamental special forms". It seems that counting Kernel's fundamental special forms is rather like dividing zero by zero: you may assign a value by convention, but it really depends on how you approach it.

three "special forms" and some primitive applicatives

I see. I agree that the number of primitive operatives is most parallel to the number of built-in procedures in Scheme. But mainly this is because I've come to believe that the salient issue as a model for the definitional complexity of the language is the total complexity of the primitive functions in the underlying implementation (in the reference language) plus the total complexity of the library code.

Chapter 6 section 6 of your thesis (plus appendixes), then, elegantly demonstrates an upper bound for the complexity of a Kernel evaluator, it being succinctly defined in the language of a minimalistic, Kernel-compatible, macro-free, vanilla Scheme, plus library code. It's of interest to me, then, if there would be economies to shift some of that logic into the underlying implementation, just so that I could pretend that I understand the "whole thing."

But I guess the even more pressing issue if one wants to "make things practical" is how to identify a programming idiom / type system under which compliant user-level code can be compiled into efficient machine code. This suggests to me that it would be important to develop an idiom for staged computation with explicit evaluation, which seems like something Kernel might be particularly good at.

However, I've run far beyond my depth so I'll just say thank you and express an earnest interest to see where you take things.

Well, a SGLR parser with a

Well, a SGLR parser with a scanner would just be a GLR parser :)

(and FWIW, I have seen PEGs with scanners)

'both share a common

'both share a common language characteristic in that they both use a basic design trick of eliminating the scanner from the compiler.'

A search through the literature will reveal that parsers with one grammar are easier to extend than a parser/scanner combination.

Extensible grammars don't play well with fixed lookahead, and most scannerless parsing requires infinite lookahead (roughly).

GLR parsers accommodate this through ambiguous parses and breadth-first search;
PEG parsers accommodate this through backtracking and, frequently, memoization.

Neither of these is cheap. GLR has cubic worst-case time, but PEGs can have explosive memory requirements.

There are other options, like Parr's DFA approximations of lookahead, but I like left recursion and I'm like that (and I've yet to see his method extended to LR parsing).

Really, what we want is determinism, linear time, and infinite lookahead, without backtracking or resorting to generalized parsing.

You might enjoy looking at shift-resolve parsing, which seems to offer a parsing method that satisfies these requirements:

http://www.lsv.ens-cachan.fr/Publis/PAPERS/PDF/FSF-ciaa06.pdf

but PEGs can have explosive

but PEGs can have explosive memory requirements

Sure, agreed. But... "hey, you gotta try memoization to avoid that, at a minimum." ;)

well, memoization may not always be the right choice

http://www.mercury.csse.unimelb.edu.au/information/papers/packrat.pdf

'This paper argues that (a) packrat parsers can be trivially implemented using a combination of definite clause grammar rules and memoing, and that (b) packrat parsing may actually be significantly less efficient than plain recursive descent with backtracking, but (c) memoing the recognizers of just one or two nonterminals, selected in accordance with Amdahl's law, can sometimes yield speedups. We present experimental evidence to support these claims.'

Ha, thank you again

This paper is indeed another must-read for PEG users like me. Most interesting to me, specifically, and as put in their conclusion, the results they obtained using DCGs vs. classic packrat parsers (with or without memoing) against real-size grammars (like Java's, etc.) confirm what I found myself thinking while implementing that PEG-oriented fluent interface and observing the parsing times I'd get:

"Most predicates don’t contribute significantly to the parser’s runtime, so tabling them just adds overhead in both space and time. For memoing to yield a benefit, the memoed predicate must contribute significantly to the runtime of the parser, and the average running time of one of its invocations multiplied by the hit rate of the table (the expected savings), must exceed the time taken by the tabling operations themselves (the cost). We propose that this be the chief consideration in deciding what predicates to memo in a recursive descent parser. This consideration is so important that respecting it, and tabling only a minimal set of predicates (usually only one, sometimes none) leads to a parser that is significantly faster than the one generated by Rats!, even though the Rats! applies a whole host of other optimizations we don’t.".

Now I know I shall definitely remember this useful study when I consider refactoring towards better performance (which I knowingly didn't want to do yet, as I was more interested in validating other ideas). So, thanks!

Thank you

And a very interesting link, too: thank you! I had definitely missed that one until now... Ah, and I can't say I'm surprised to see Farre's name among the authors...

the thesis is also a fun read on the parsing methodology

http://www.lsv.ens-cachan.fr/~schmitz/pub/phd.pdf

It is a small part of a larger system of approximating context-free grammars for parsing and validation.

fluent operative handling

... [operatives] can't be invoked in the same way as ordinary procedures, as by using APPLY. You can only invoke [an operative] that is not statically known by constructing a Lisp form and using EVAL to evaluate it.

This appears to be a straightforward technical misunderstanding. Statically unknown operatives are easily invoked in Kernel without constructing a form and using eval on it. It's even easy to invoke them using apply. You do need to specify what you want done, because Kernel apply has just one uniform behavior, without a discontinuous behavioral provision for operatives. Kernel apply is used in the tail recursion in both tail-recursive derivations of $cond in the Kernel Report. The Kernel Report rationale for apply is also worth a look. Note too that if, for some reason, you actually want an apply-like device with discontinuous behavior, you can easily build one; abstraction, you know.
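For instance, a hypothetical apply-like device with that discontinuous behavior (called apply* here) might be sketched as follows: an operative receives the given tree as unevaluated operands, while an applicative gets the usual apply behavior.

  ($define! apply*
    ($lambda (combiner operands env)
      (eval (cons ($if (applicative? combiner)
                       (unwrap combiner)   ; applicative: behave like Kernel apply
                       combiner)           ; operative: operands passed unevaluated
                  operands)
            env)))

Kernel's own apply, as noted above, instead keeps one uniform behavior by always working through unwrap.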

As the preface to the dissertation notes, although $vau is flashier, the treatment of applicatives (at the primitive level, wrap/unwrap) is the more elegant and subtler innovation. It empowers quite fluent use of statically unknown operatives.

Front page

I think this should be posted to the front page, but would rather see this story promoted than start a separate thread. Pretty please?

Your wish is my command.


Good step for lexical scope

Congratulations to John Shutt. But I'd say it is too easily accepted that the original fexprs in dynamically scoped Lisp did not "behave well"; Pitman's article in particular has a number of conceptual and technical difficulties. In any case, if one wants to stay with lexical scope, then Kernel is a significant step forward compared with anything else I've seen recently. The dissertation is well written, and I am enjoying reading it.