Lisps, First-Class Special Forms, Fexprs, The Kernel Programming Language

I've been thinking about macros vs. fexprs on and off for about a year. The following bit of Scheme code exposes what seems to me a troublesome difference between function application and special forms.


(define (a) 1)
(define (alpha) (a))
(alpha)             ;  1
(define (a) -1)
(alpha)             ; -1

(define-syntax b (syntax-rules () ((_) 2)))
(define (beta) (b))
(beta)              ;  2
(define-syntax b (syntax-rules () ((_) -2)))
(beta)              ;  2 still!

In the first half of the code above, the function being applied, a, is dynamically determined when the application occurs. A redefinition of a means that calling alpha will reflect that change. Dynamic languages. Late binding. Yay!

In the second half of the code above, the special form, b, is determined and expanded when beta is first defined, and redefining b has no effect on later calls to beta.

I know this is an old issue, but looking at John Shutt's reasonably recent The Kernel Programming Language (and at Wikipedia and here), it seems he has "solved" the problem by inventing $vau expressions, which fall roughly into the fexpr category.

A long time ago, it seems, Kent Pitman argued influentially against fexprs because they did not play well with the Lisps of 1980 and their dynamically scoped variables. The Kernel Programming Language has lexically scoped variables, and the $vau form is not one considered by Pitman.

I can find mention of John Shutt's language only once on Lambda the Ultimate and am surprised the $vau expression idea has not received more attention. It seems like a great idea, enabling a more dynamic and consistently defined language.


no sentiment

Unfortunately, there seem to be only two or perhaps three of us in the Scheme community who care to fix this kind of problem, and a lot of others riding other hobby horses.

Hell, I just want something roughly like R4 but with FEXPRs and first class environments and I'd call Scheme "done" but... it's not a popular notion, afaict.

-t

Compiler issue?

Couldn't this be solved at the environment level, by having the compiler/runtime keep track of macro uses, and recompile macro callers when a macro is changed?

Recompilation

Perhaps the lack of late binding in the example could be solved with a lot of bookkeeping and recompilation. That wouldn't make special forms first-class objects, however, and that uniformity across all forms would increase language consistency.

From the discussion, it

From the discussion, it sounds like fexprs are like C macros except that they are first class, right? So the point is that Lisp macros cannot capture variables at their point of use, while fexprs can. I can't think of a situation where I would want my C macro to capture a variable that is neither
- passed as an argument
- nor global.

Is it different with Lisp? If the idea of a global variable is not acceptable, how about making it "external" to the program (i.e., stored in a file or database)? I think that would work well with the Emacs-like editor.

Understanding fexprs versus macros (versus functions)

The excessively short answer is: no, fexprs aren't at all like macros. Hopefully I can do better than that, though.

There are three differences floating around here between how ordinary functions work, how macros work, and how fexprs work.

First, when an ordinary function is called, its operands are evaluated; in Scheme terminology, at least, the results of these evaluations are called arguments. The arguments are the inputs that will be made available to later processing by the function. Pretty much the only thing that macros and fexprs have in common is that they don't evaluate their operands: the operands become the inputs, rather than the arguments becoming the inputs as for an ordinary function.

Second, once the inputs have been chosen (either arguments or operands), an ordinary function computes a value by evaluating the expressions that make up its body, in a local environment that makes its inputs visible. In Lisp, the result of the last of these evaluations becomes the output of the function (which in C would be explicit using a return statement). Fexprs do exactly the same thing, although the inputs that are visible in that local environment are the operands rather than the arguments. Macros are the ones that are usually different at this stage: C-style macros, or hygienic Lisp macros, substitute the inputs into a template to produce an output expression. ("Traditional" Lisp macros actually do the same thing as fexprs at this stage to produce an output expression.)

Third, when an ordinary function produces its output, that output becomes the result of the call to the function. The same for a fexpr. But the output of the macro is an expression that is then evaluated in the context where the macro was called in the first place. That's what makes it a "macro", the fact that its output is then evaluated.

If you want to do some kind of manipulation on the operands before they're evaluated, you can't use an ordinary function; you have to use a fexpr or a macro. But the practical difference (IMHO, of course) between a fexpr and a macro is that with a macro, you specify how to construct an expression that specifies what you want done, whereas with a fexpr you directly specify what you want done.

One advantage of fexprs over macros is that when you're trying to make sure your code will do what you really want it to do, it's easier to do so by actually specifying what you want done (the fexpr strategy), than by specifying how to build an expression that will specify what you want done (the macro strategy). Like the difference between manipulating something with your hands, and manipulating it with tongs. (As I recall, Pitman even acknowledged this clarity advantage to fexprs in his paper, along the way to condemning fexprs for other reasons.)
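
To make the three-stage comparison concrete, here is a small sketch. The ordinary function and the macro are standard Scheme; the fexpr uses a Kernel-style $vau, which standard Scheme does not have, and all the names are purely illustrative.

; Ordinary function: the operand (+ 1 2) is evaluated, the argument 3
; becomes the input, and the body's value is the result.
(define (f-app x) (* x 10))
(f-app (+ 1 2))           ; => 30

; Macro: the operand is the input; the output is an *expression*,
; which is then evaluated where the macro was called.
(define-syntax m
  (syntax-rules ()
    ((_ x) (* x 10))))
(m (+ 1 2))               ; expands to (* (+ 1 2) 10), which evaluates to 30

; Fexpr (Kernel-style $vau, not standard Scheme): the operand is the
; input, the body runs like a function body, and its value is the
; result; nothing is evaluated unless the fexpr asks for it.
($define! $operand ($vau (x) #ignore x))
($operand (+ 1 2))        ; => the operand itself, (+ 1 2)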

Practical difference between fexpr and macro

If the practical difference is that the macro tells how to build a program, while the fexpr is a program, shouldn't the language have a facility to make a macro from a fexpr?

Back to the original post:
the problem I see is that macro expansion is not explicit, so it looks like the second function does the same as the first. If expansion were explicit, the difference would be clear. That does not solve the issue if you want to capture the variable in beta, but is that really necessary?

macros are fexprs + constant expression elimination

In my view (which is neither mine alone nor universally accepted), what compiler writers think of as macros *are* a subset of fexprs: those that a compiler can statically determine to have macro-like semantics -- thus enabling the compiler to go ahead and partially compute the application of a fexpr at "compile time". If you don't believe in phases (i.e., prefer on-demand compilation and caching) then the fexpr doesn't even need to be strictly macro-like to compile it, so long as you can unroll and redo the compilation when something it depends upon changes.
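
For instance (a sketch in Kernel-style notation; the name $swap-args is illustrative, not from any existing system): an operative whose output depends only on its unevaluated operands and its static environment is exactly the kind of fexpr a compiler could expand at "compile time", just as it would a macro.

($define! $swap-args
   ($vau (op a b) env
      (eval (list op b a) env)))   ; rebuild the form with the operands swapped,
                                   ; then evaluate it in the caller's environment

($swap-args - 1 10)                ; behaves like (- 10 1) => 9; a compiler that
                                   ; can prove the binding of $swap-args is stable
                                   ; may do that rewrite statically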

-t

Phase separation of macros

Macros are usually *required* to be processed at compile time, whereas fexprs and ordinary functions are, conceptually, processed at runtime. That was kind of the point of Pitman's favoring macros over fexprs, that because macros are required to be processed at compile time, they can't possibly fail to be analyzable at compile time — and the sense in which fexprs "aren't well-behaved" is exactly that they are sometimes impossible to analyze at compile time. In the extreme, you can't even tell whether the operands are going to be evaluated at all, because it depends on whether the operator evaluates *at runtime* to a fexpr, and it's impossible in general to determine at compile time what the result of a computation will be at runtime. This problem with fexprs is much, much more extreme if the language is dynamically scoped, which I believe is what caused the rejection of fexprs of which Pitman was the visible spokesman (because the mainstream Lisps in 1980 were dynamically scoped); and other facets of a Lisp language design can also be fine-tuned to maximize the analyzability of programs in the presence of fexprs.

A second reason to love fexprs (besides the fundamental clarity advantage of directly saying what you mean, which I mentioned before) is that fexprs don't require a logically separate phase of computation. That is, they don't require a phase-separation that restricts what the programmer can do. This is the flip side of the unanalyzability objection: the only way to guarantee analyzability at compile time is to impose restrictions that preclude all fexpr-based situations that can't be compile-time analyzed. Arbitrary restrictions are, of course, inherently objectionable to those of us who believe uniformity is prerequisite to maximal expressive power (in my case, I'd say, particularly, maximal *abstractive* power).

All this is about whether things *appear to the programmer* to be performed at compile time or runtime. Optimizations are beside the point(s): optimizations that cause some things to be taken care of at compile time rather than runtime (or vice versa) shouldn't even be visible, except perhaps by making things run faster, and in fact if they *are* visible then that's a bug.

The compile-time-versus-runtime issue is worrisome enough that various efforts have been made to create macros that are processed at runtime, or at least appear to be. (It generally wouldn't occur to those who do this to use fexprs instead of inventing ever-more-complicated kinds of macros, because they either aren't aware of fexprs, or mistakenly think that fexprs are unmanageable — a very understandable mistake, because it's been the folk wisdom about fexprs for decades.) Aside from individual shortcomings of particular first-class-macro efforts, there is still always a sort of impedance mismatch between the "first-class macros" (whatever exact form they take) and ordinary functions, simply because they're two separate devices with different computational properties, creating complex interactions. Fexprs, depending on just how they're integrated into the language design, may avoid the impedance mismatch as well. (I consider the way fexprs are integrated into Kernel to be especially smooth.)

Processing at compile time

In Common Lisp, macros are not required to be processed at compile time, and some implementations (like LispWorks) process them at runtime in interpreted mode, which is useful at least during development.

fexprs are a very bad idea

You should read and understand The Theory of Fexprs is Trivial before advocating for them. The problem with fexprs is that they defeat just about any equational reasoning anywhere. Pitman's note talks about problems for compilers (though IMO his note isn't very clear), but this is also a problem for any reasoning about programs.

Fexpr values can essentially turn up anywhere; whenever you apply a computed function to an argument expression, you don't know whether the function will turn out to be a fexpr, so you've essentially exposed every last syntactic detail of the argument expression. For example, in the program:

(define (foo f)
  (f (+ 2 3)))

You might think (+ 2 3) could be replaced with 5, but what if f turns out to be a fexpr? This even ruins alpha-equivalence, because fexprs can observe variable names, too.
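
To make the worry concrete, here is a sketch that mixes the Scheme-style define above with a Kernel-style operative ($vau is not standard Scheme, and $operand is an illustrative name):

($define! $operand ($vau (x) #ignore x))  ; a fexpr returning its operand unevaluated

(foo (lambda (y) y))   ; f evaluates its operand: result 5
(foo $operand)         ; f receives the operand itself: result (+ 2 3),
                       ; so rewriting foo's body as (f 5) would change this result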

Dynamic languages. Late binding. Yay!

I loathe the phrase "late binding" -- it's ill-defined, over-used, and essentially meaningless. It's used to arbitrarily justify anything that's more "dynamic." But dynamic is not always better than static (see also: dynamic scope).

If by "late binding," you mean the meaning of an operator can "change" -- which might either suggest first-class functions or mutable first-class functions -- then Scheme's already got it. There's plenty of dynamism to go around, without any need for macros to be expanded at runtime.

This isn't a "troublesome difference" between macros and procedures. The phase separation serves a purpose. Scheme is the only dynamic language that gives you programmatic control over compile-time behavior. There are many, many important things you can do with this.

Finally, I should just point out that just because a language feature is easier to implement doesn't mean it's easier to use. There's a woefully rampant confusion in the community between semantics and interpreter, between interface and implementation. Nobody claims that the last word has been written on macros; there's plenty of work left to do on the semantics of macros, but God help us if fexprs are the answer.

nah, fexprs are great.

Not so fast.

Fexprs thwart compilation and certain other kinds of static analysis. No kidding. Nobody ever thought otherwise, I think.

There are two and one half things that I think you're missing:

1) Compilation isn't everything. There are "many, many important things you can do with interpretation", too.

2) It's a tautology that (phased) compilers are tools for cases where you can perform certain kinds of static analysis. Well, you can perform such analysis on subsets and special cases of languages that use fexprs at the core. Have fexprs, but just make sure it isn't too hard to write code in a style that makes it trivially apparent to a compiler that the hard / impossible cases won't show up.

3) You excitedly note that Scheme's hygienic macros give you essentially Turing-complete extensibility of the compiler - well, fexprs give you essentially Turing-complete extensibility of the interpreter. Why favor one over the other? The core language definition should use fexprs (and first-class environments), because you can build hygienic macro subsets atop that and have a perfectly workable semantics.

-t

Flexibility is nice, but...

Flexibility is nice, but I don't believe fexprs are offering much 'new and useful' flexibility, by which I refer to flexibility that is both useful in practice and not readily available by leveraging other mechanisms. Further, you pay a not insignificant price for those rather marginal benefits.

If I wanted to define 'a' such that its meaning can change later and affect earlier definitions, I'd have defined a with reference to a mutable cell then modified that cell explicitly (with, IIRC, 'set!' in Scheme). Other alternatives would include a tweak to the environment to support 'special vars', or a tweak to the syntax to thread a context object similar to a Haskell monad. In all cases, the semantics would be far more explicit and apparent.
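
A minimal Scheme sketch of the explicit-cell alternative described above (the names are illustrative):

(define a-impl (lambda () 1))   ; the "cell": a top-level binding read at call time
(define (a) (a-impl))           ; callers always go through the cell
(define (alpha) (a))
(alpha)                         ;  1
(set! a-impl (lambda () -1))    ; the change is an explicit, visible mutation
(alpha)                         ; -1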

flexibility? no, simplicity.

It sounds to me like you want to define a far more complicated core language because of some "principled" avoidance of fexprs, after which you will add library code to simulate fexprs.

Fexprs have a nice, simple, operational semantics atop which all the statically analyzable things you want can be built as subset environments and, meanwhile, fexprs perform their basic function of adding extensibility to "eval".

-t

Simplicity?

I think you and I have different ideas of what 'simplicity' means. There's the Turing Tarpit form of 'simplistic' that characterizes Unlambda, the one instruction set computer, and other systems that may be easy to implement but blow up in your face when you try to use them. Then there's 'simplicity', which takes a real problem, makes it comprehensible for users, and makes static analysis and optimization tractable for those responsible for implementation.

Fexprs offer faux simplicity. The automatic rebinding of names makes it difficult for users to anticipate the impact of a new definition without also knowing the whole code-base. Unit tests of such code are similarly limited. And, of course, separate compilation and linking become more complex. If this is your 'simplicity', who needs it?

True simplicity, as I understand the word, requires capturing the relevant elements of accidental complexity to make other things simple - 'other things' in this context including modularity, configuration, policy and dependency injection, security, safety, ability for users to predict impact of a change, concurrency, distribution, persistence, disruption tolerance, and performance.

Also, I did not suggest "add library code to simulate fexprs". I said "leverage alternative mechanisms to achieve the same ends to which you'd be applying fexprs". Those are very different goals, especially in context where I'm suggesting fexprs offer more expressive power than users require.

yes, simplicity

Fexprs, like just about every interesting programming construct, allow you to do things that make a program all but incomprehensible. Abuse of name capture in fexprs is a fine example. Name capture in modern hygienic systems (that allow you to break hygiene) can be similarly abused. As the doctor says, if it hurts when you do that, then don't do that.

Fexprs can also be used in plenty of ways that are easy to reason about. They are easy to reason about operationally in an interpreter. Particular fexprs (that can be trivially shown to not play nasty tricks) can be reasoned about equationally. If you are worried about the bindings of global free variables in some source file, use them less. If you want to carve out a subset that can be statically analyzed and compiled with optimization, fexprs give you a fine cornerstone of a dialect in which to do so.

You say:

I'm suggesting fexprs offer more expressive power than users require.

That is anathema to all things lisp-ish, including (real) scheme.

-t

"Real" Scheme

Everyone's welcome to their own crazy ideas about programming languages, but this:

That is anathema to all things lisp-ish, including (real) scheme.

is very wrong. In particular, how do you square your attitude toward compilers with the existence of Rabbit, the first real Scheme implementation, written by one of its creators, and supervised by the other? You may want to read your preferences into the history of programming languages, but you shouldn't expect the rest of us to agree.

Rabbit thesis v. fexprs ("'Real' Scheme")

No, I'm not "very wrong."

In Steele's vocabulary at that time, the record would seem to indicate, the terms FEXPR and FSUBR are to be understood more or less as they come from the Lisp 1.5 manual. In AIM-349 Steele makes it clear that FEXPRs and FSUBRs are not "Lisp Functions" (thus not "primitive" operations in Scheme, whatever "primitive" means) but that Scheme's special forms "are to Scheme as FSUBRs are to lisp". In AIM-474 (the "Rabbit thesis") Steele exhibits a (for its time) startlingly elegant compiler for FEXPR-free code that uses but a specific, finite list of FSUBRs.

It may well be that Steele, then and/or now, thinks FEXPRs are just so ugly that we can do without all but a few of them, but there is nothing in the Lambda papers or Rabbit that rules them out, and some of that work makes careful use of them.

It is a scant ~15 years after Rabbit that SCM appears, implementing an extensible Scheme interpreter rather beautifully using two key variations on FEXPRs (traditional and "memoizing").

Since 1990 (SCM), the work on hygienic macros has begrudgingly led to systems in which free variable capture is entirely possible just so long as the programmer is explicit about it at the point of macro definition - thus even the modern FEXPR-less Scheme advocates have already retreated 80% of the way and reintroduced most of the "compiler needs to do global analysis" problems simple FEXPRs would present. (All that convolution, in my opinion, because compiler writers economically dominate Scheme standardization and don't want to admit that there's a useful language larger than what they can usefully compile!)

SCM has just about everything right, but is under-developed. You get a clear operational model using a classical graph-code interpreter. It implies a perfectly fine denotational model. You get very large, statically identifiable, and pragmatically handy subsets that you can reason about as you would an R4 program - and you get extensibility of that set of subsets.

Every single time this argument comes up, the hygiene zealots wind up falling back on arguments about what other programmers can't possibly reason about (in spite of historic evidence that they can) and about what power other programmers don't need (in spite of historic evidence that they do), and Scheme continues its slow death spiral into a bondage and discipline language of the sort that could only be loved by a "fascist with a read-only mind".

-t

Mind-reading

You seem to have read much more into my comment than was there. I didn't mention hygiene, I didn't claim that Steele hated FEXPRs, nor did I say anything about who can reason about what. What I said was that you have very particular ideas about what "Scheme" means, ideas that don't comport with the history of Scheme at all. If you want to claim that Rabbit was a step away from "Scheme", and that the community was rescued by a useless interpreter that no one uses, by all means do so, but don't claim that this is a well-supported reading of history.

"useless interpreter that no one uses"

It's not a scientific measure, certainly, but SCM and its protege/fork Guile are the third and second most popular implementations on Freshmeat. Saying that it is a "useless interpreter that no one uses" seems odd, unless you want to treat that as a lemma of a general hypothesis that Scheme is a useless language that no one uses.

As to reading history, well, it's a big topic. Scheme had "impact" for being a full upward-funargs dialect, for having a curious resonance with Scott-Strachey semantics, for admitting surprisingly small and simple interpreters and optimizing compilers, for showing a duel of "Actors", and for committing to lexical scope. As it left the lab of its origin and spread to other academic labs, there were some distinct shifts. First, at Yale and CMU there was a lot of emphasis on adding object-oriented programming / message passing to the mix (T and Oaklisp, later Dylan). Those looked promising for a while but eventually petered out. There was a CMU / MIT axis around systems and embedded programming for a while, and clever implementations of interactive environments giving such things as S48 and SCSH. But what came to dominate (Scheme standards) more recently was the influence of compiler research "elsewhere", and, attendant to that, the fascination with hygienic macrology and module systems that enforce separate compilability. Meanwhile, the intellectual but not so much academic circle around MIT yielded SCM, which is evolutionary from traditional Lisp hacks and Scheme origins.

"Fascist with a read only mind"

Nice connotations, Thomas Lord.

Personally, I'm not too concerned about how I might abuse fexprs in a small project. I'm concerned about how the me of last year, or the vast network of people whose libraries I might be using, may have abused them or might abuse them in a future 'upgrade'. I'd like to minimize the amount of detailed knowledge of the implementation of such libraries that I need to keep in my head at any given moment.

Your doctor - the one who offers you such wise advice as "if it hurts when you do that, then don't do that" - should also be telling you to avoid smoking, don't drink wine from lead cups, don't sniff paint, don't spend too much time in the sun without protection, etc. There are many acts that don't hurt when you do them but remain inadvisable nonetheless.

Powerful and generic language primitives (e.g. call/cc and fexprs) tend to make for a 'toy language' if the requirements include optimization, scaling, modularity, etc. The problem occurs because having fewer primitives fundamentally cannot eliminate essential complexity. Thus, fewer primitives means each primitive (on average) serves more semantic 'duties'. This makes abuse of primitives difficult to detect (too many false positives), and makes composition of primitives difficult to control (too much need for arbitrary composition in the absence of alternatives). Inability to control properties of composition additionally makes scaling difficult on the human front - programs cannot compose very 'deeply' before the knowledge and maintenance burden for programmers (how much they need to know, and how much they need to validate for component upgrades) becomes overwhelming.

Using fewer primitives leads, paradoxically, to a more complex interface. And if you want optimization, it also leads to a more complex implementation because it requires a 'sufficiently smart' optimizer to properly judge the use of a primitive rather than a 'dumb' optimizer with simple optimization rules.

I was once an admirer of Scheme. It was 'simple' for the projects to which I was applying it - lab-toy projects as part of a Computer Science curriculum. And I'll admit that by many metrics it still stands quite well against other popular languages (e.g. C++). When I initially made a hobby of pursuing open distributed programming, a Scheme framework was my first starting point. It wasn't long before I was disillusioned; the burden to support distribution, concurrency, security, and safe and scalable composition required more guarantees, and Scheme's 'self-disciplined' approach to programming was insufficient.

I've since become a fan of 'invariants' - promises from a language. But a language with many invariants is, by itself, far too restrictive, so I've pursued the 'layered' languages concept that supports invariants, successively reducing invariants and adding primitives in each higher 'layer', yet encouraging outputs of each layer to be available or referenced in a lower layer. Layering helps programmers control composition. Layering helps optimizers leverage optimizations. Layering simplifies the language for everyone but the language-designer. (The language-designer, of course, is left with two interdependent problems: figuring out the right primitives, and figuring out the right permutation.)

Scheme relies fairly heavily on just one optimization: tail recursion. The ability to count upon any given optimization is wonderful - it offers a great sense of freedom to a programmer, who may then avoid semantic-noise and explicit work-arounds to achieve performance goals. Other optimizations include lazy or lenient computations (e.g. Haskell programs are often far more straightforward than ML programs when programmers can take advantage of laziness), memoization, interning, flattening flow-based object graphs (i.e. combining objects), dead-object eliminations, smashing layers of abstraction via partial-evaluation, and full dead-code elimination across libraries.

It is especially important to that freedom that these optimizations are achieved in the face of modularity, 'deep' composition, concurrency, configuration, security, safety, etc. Any violation forces painful work-arounds. I'd say the same for any cross-cutting concern. Implicit support for safety concerns (including safe operations, order of operations, progress, liveness and redundancy, self-healing, graceful degradation, disruption tolerance), security concerns (authority, secrecy), configuration concerns (policies, process accounting, dependency management, debugging and logging), and so on is similarly liberating. Yet, these must still be achieved with minimal 'semantic noise' and without explicit work-arounds, lest they interfere with expression and become 'bondage and discipline'.

I think what we consider 'elegance' and 'simplicity' is this sort of freedom to work, to know that certain optimizations and other concerns will be taken care of: freedom from specific 'concerns'.

Pursuing this freedom from specific concerns (performance, safety, security, scalable composition, modularity, etc.) at the cost of faux simplicity that complicates both the user-interface and optimized-implementation layer seems to me a win-win-win-win-win-win situation.

. . . Unless, of course, you were simply unconcerned with a given property, or the scale is too small for a given concern to become relevant. These conditions aren't uncommon for lab-toy examples and CS curricula projects.

I can grok why a lone hacker, especially one obtaining an education in programming or exploring a thesis who wishes to minimize reliance upon external libraries, will be tempted by that 'New Jersey simplicity' for language-design. But I think, long term, it is hurtful - even for the lone hacker - due to insecurities for exploratory programming and an education that resists use and knowledge of third-party utilities. That isn't to say rich libraries aren't available for languages that provide a great deal of power then rely upon self-discipline. But it is my observation, and my personal experience, that programmers hesitate to use a library when they cannot depend upon its performance, safety, security, etc.

A lone-wolf hacker could have a great deal more confidence in the rich libraries and artifacts available to users of a 'scalable language' - one with well understood properties for composition - even if all he's doing is writing small, individual projects. The assured quality of those libraries may also benefit; a language that can provide security or limited confinement guarantees, for example, would be suitable for large-scale automated zero-button integration testing without risk of side-effects (other than memory and CPU for a suitable simulation).

All this is not the dream of a "fascist with a read-only mind" or even the result of a bondage & discipline fetish. I seek simplicity and freedom through liberating constraints. They don't even need to be 'in-your-face' constraints - I favor an 'incentives' based approach: working with a lower-layer where feasible (aka 'Principle of Least Expression') is not enforced, but is encouraged by improved modularity, reusability, security, safety, and performance.

can't wait

i want to be in on the beta release. seriously.

(restoration of ref)

I've been trying to shorten the above post (and failing) but in doing so I probably deleted that to which you refer. For completeness, I'll paraphrase it here.

Thomas Lord describes use of 'fexprs' for extensible applications. My favored design uses a composable library-level publish/subscribe model in which a flow-based programming style 'object configuration' can (as a language primitive) automatically hook and access data and event sources (FRP and SEP). A data source could identify a function, a procedure, or an object reference.

Advantages of this approach are manifold. The use of FRP and SEP as a basis supports caching, multi-cast, automatic hardening (if the update facet of an FRP data-source is collected, a JIT can leverage this... technically, this can also happen for slow-update sources), automatic distribution with locality-of-subscription (e.g. a 10 Hz signal subscription will automatically be sourced close to the recipient), graceful degradation and disruption tolerance through automatic fallbacks, persistence and self-healing through automatic recovery of high-quality sources, and demand-driven low-latency communications.

That sort of feature set is difficult to achieve at the implementation-layer by following the Scheme 'philosophy' of using just a few primitives that serve many duties, and of failing to control composition of primitives.

Further, the 'domain-centric' nature of publish/subscribe libraries offers expressiveness advantages over the suggested use of 'fexprs' to this end. In particular, data 'subscribed' in the publish/subscribe model may itself be described in terms of complex objects, potentially even with reactive control data (e.g. a mobile viewport and adjustable QoS and Level of Detail requirements). Compare fexprs, which offer a relatively poor 'symbol' override. It has been my experience that the application-domain properties are very useful in constructing and attaching application extensions.

On this basis, fexprs are somewhat inadequate to a suggested use, yet still too powerful to readily support many optimizations. As language designs go, that's pretty much the opposite set of properties one is looking for in a design 'sweet spot'.

I also excised discussion of other 'primitive' language extension mechanisms that referenced my project. My design currently favors annotations, an extensible syntax/macro support, and a require/provide/combine primitive trio (for open definitions, open logic/data predicates, and late-binding) inspired by IBM's hyperspaces.

All this is very interesting to me, but I felt it a voluminous distraction from the above, which remains ridiculously voluminous despite my attempts to shorten it.

---------

And I apologize, raould. The 'beta' won't be out any time soon, nor will the alpha, which is why I don't bother mentioning my project by name. To be honest, I'm scared of a premature success becoming my own worst competitor, and I'm also scared of an initial implementation gathering a reputation like the one that started Objective-C's career (the safety of C and the blazing speed of Smalltalk!), so I'm favoring taking my time.

Besides, I had three major design changes this year:

  1. switching from process calculus to asynchronous actors with distributed transactions circa Nov-Jan. Process calculus and linear serialization of messages was too difficult to reconcile with both composable and secure (multi-organization) concurrency control and with replication for performance. (Now only cells and 'edge' actors that proxy for device controllers are serialized, under restriction of never waiting for a reply).
  2. concluded (from use-cases) in May that 'actors for plumbing' in the style of Smalltalk, Erlang, E, etc. is far too 'leaky' in the face of network disruption, requires much micro-management, is too subject to programmer error, and is generally insufficient to deal with: locality of subscription, composition separate from demand-driven subscription, and automatic fallback for graceful degradation under partial disruption. Process calculus, at least, had (somewhat) nice properties for hooking processes together. I'm now using FRP and SEP in a layer below actors to fill this gap; these do much better than process calculus, which reinforces my belief in breaking primitives down into smaller semantic duties. The mutable 'cell actor' is now a three-facet configuration of update/access/demand, with 'demand' allowing one to stop and start updates to a cell based on whether anybody is looking at it (and thus chain demand-driven properties explicitly through a CEP or FBP network, where the language cannot automate it). There is a similar 'pipe' for the edge of SEP (it can be subscribed, but has no state).
  3. After reading much on LtU about zero-knowledge proofs (only half of which I understood), I decided to follow E's example for sealer/unsealer 'encryption' primitives. Each such object-pair serves as a phantom-type for secure protocols, and also implicitly enables encryption for messages that are hosted briefly across trust boundaries (without requiring per-message encryption for messages that never leave a set of trusted hosts). There is also a partner pair for signature and validation, added because even implicit hashing doesn't optimize well (even if one can prove A != B, one cannot prove H(A) != H(B) for secure-hash H).

Who knows what next year's changes will be? But I am actively implementing... a few hours per week.

Simplicity: MIT v. New Jersey

I think there are two similar-seeming, incommensurable notions of simplicity going on here, which I think means simplicity is a useless concept in this discussion. Maybe concrete instances are what this discussion needs?

Particular fexprs (that can be trivially shown to not play nasty tricks) can be reasoned about equationally.

To slightly reformulate my question below: are there any examples of these for which we cannot decide until run/eval-time (or, at least, not until after some expansion is completed) whether or not some variable is bound to a procedure or a special form? If so, are any of these examples useful?

MIT v. New Jersey

Charles, you ask:

To slightly reformulate my question below: are there any examples of these for which we cannot decide until run/eval-time (or, at least, not until after some expansion is completed) whether or not some variable is bound to a procedure or a special form? If so, are any of these examples useful?

Yes. An extensible application with an interactive interpreter. For example, a good Scheme implementation of an Emacs-like editor.

-t

Dynamic, staged computation

An extensible application with an interactive interpreter.

I'm generally inclined to defer to your superior knowledge on this application area, except that I'm guessing that this is not exactly the sort of example that I am looking for. Something like Emacs, if I understand correctly, is an example of dynamically staged computation using a bytecode VM, and the whole issue of expand-time vs. eval-time arises in just the same manner for regular macros here. It's not obvious to me that the particular sort of special FEXPR dynamicity that I'm talking about, is actually useful here.

What I'm really getting at, is whether we can expect a simple type system along the lines of Bawden's first-class macros to sort out what people like Will Clinger think is wrong with FEXPRs?

That's not the point.

Not only can't a *compiler* reason about your program in the presence of arbitrary fexprs, no human programmer can reliably do so either. And that includes you, when not in the heat of the moment.

Compiler reasoning about fexprs

Yes, a Lisp compiler can reason reliably about fexprs, as can a human Lisp programmer, *if* the other language features are carefully selected. Notably, Lisp functions (both applicative and operative, i.e. "procedures" and "fexprs") should be primarily statically scoped, and there shouldn't be a device for non-local binding mutation (i.e. set-bang). With those constraints, it's possible for some environments to have provably limited access, so that bindings in those environments can be provably stable.
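
A small Kernel-style sketch of the kind of provably stable binding meant here (my example, with illustrative names, not something taken from the Kernel report):

; The $let environment below is never exported: the applicative built by
; $lambda keeps it only as its static environment, and there is no
; non-local set-bang to reach into it. A compiler may therefore treat the
; binding of k as stable and fold it into the body.
($define! add42
   ($let ((k 42))
      ($lambda (x) (+ x k))))

(add42 1)                   ; => 43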

On the theory side, although the technical result in Wand's paper is quite solid of course, the usual informal interpretation of that result — that every calculus of fexprs must have a trivial theory — is an overgeneralization. His paper proves that if a "calculus" of fexprs is constructed in a certain way, then its equational theory is trivial. But a calculus of fexprs doesn't have to be constructed that way. Wand's "calculus" distinguishes between subterms to be evaluated and subterms not to be evaluated by, in essence, using a context to mark the ones that aren't to be evaluated. Naturally, the resulting system isn't compatible, so its equational theory is trivial. The alternative is to use a context to mark those subterms that *are* to be evaluated. That way, you can get a calculus of fexprs that is a conservative extension of lambda-calculus. Really. (I was really shocked when I first realized vau-calculus has a subsystem isomorphic to lambda-calculus. That isomorphic subset is the part that deals only with fexprs, never with evaluation.)

In fact, from the way pure vau-calculus works out, fexprs don't even count as "side-effect-ful": unless you also add some other feature that *is* side-effect-ful (like sequential control or state), vau-calculus achieves Church-Rosser-ness without a fixed order of argument evaluation.

Mitchell Wand's influence

John,

Great to see you active in the conversation.

It seems Pitman's stance against fexprs was very influential but Mitchell Wand's paper really put the theoretical kibosh on them all even though there are many types of fexprs.

I noticed you have acknowledged Mitchell Wand in your Kernel standard [.pdf]. Given that Wand himself is responsible perhaps for a long stall in academic interest in fexprs, and presuming you have spoken to him about your vau calculus, what does he think of the Kernel language and its use of the vau forms?

Mitch Wand in context

Although I do agree that Wand's paper has a pretty big footprint amongst the opposition to fexprs, I'm not so sure it's really been responsible for any change in the general atmosphere. From the scattered post-Pitman literature on fexprs I've managed to accumulate, that paper comes across to me as very much in line with its predecessors — at the intersection, in fact, of the two "major" veins of such literature, such as they are: work on so-called "reflective" Lisps, and work invoking fexprs as an illustration of a really preposterously unmanageable feature.

Reflective Lisps come into this because the "reflective procedures" that figure so prominently in that tradition are actually fexprs (on steroids), though of course they usually aren't called fexprs unless the author is fishing for sympathetic gasps of horror from their audience. From my point of view, the reflective Lisp tradition is significant because thinking of operand-access as a form of reflection, thus implying that evaluated arguments are the norm, subtly reinforces the quotation mindset (I call it the "implicit-evaluation paradigm"), which is at the heart of the strategy that induces the trivialization result in Wand's paper.

As for the other vein of literature, using fexprs as a punching bag... I recall an interesting 1993 paper by John C. Mitchell, "On Abstraction and the Expressive Power of Programming Languages", that defines a formal notion of "abstraction" and then shows, as an illustration of the concept, that Lisp with fexprs cannot support any abstraction at all. This is, from my point of view, a particularly ironic result, since a major motive for my interest in fexprs is that I suspect them of offering fundamentally greater abstractive power than any other programming language feature yet devised (when, that is, they're placed in a programming language design setting that brings out their natural luster, which is what I'm trying for with Kernel). Mitchell's formal definition of abstraction allows him to assess some interesting properties of language features, but I take his result on fexprs as evidence of the limits to the range of applicability of his sense of abstraction. (I've been attempting my own formal definition of abstractive power, which I hope may be able to discern the power of fexprs, but of course I can't possibly take time to pursue that until I'm out from under my dissertation. For what it's worth: "Abstractive Power of Programming Languages: Formal Definition" [.pdf].)

You did ask about my communications with Mitch Wand. In fact, I don't recall that any of my conversations with him got into vau-calculus; sadly, when I presented my vau-calculus talk at NEPLS, most of the Boston-based NEPLS people including Wand couldn't make it. My conversations with him have tended to focus on historical perspective. He was one of the participants in a lively between-talks discussion of macros and fexprs at the fall 2004 NEPLS, that started between Alan Bawden and me; most of what I remember of that discussion is recollections from the first Lisp conference — the one where Pitman's paper was presented — of the floor discussion that broke out over fexprs. (I came away rather sorry I couldn't get in a time machine and go back to witness it first-hand, with members of the audience standing up to speak, some of whom argued that fexprs could be handled using partial evaluation — and everyone else thought they were crazy. Which, in retrospect, perhaps they were, since the technology for partial evaluation on that scale just didn't exist yet.) Mitch seems to have a rather more relaxed attitude toward fexprs generally, and his 1998 paper particularly, than the deathly seriousness with which his 1998 paper is sometimes invoked. Even believing, as he apparently did, that the trivialization result was more general than I have found it to be, he still remarked in the conclusion, "we care not only when two terms have the same behavior, but we may also care just what that behavior is!" Hardly a defeatist sentiment. Similarly, after my NEPLS talk on Kernel in 2002, he remarked that I might prove Kernel does what I mean it to using the equational logic already present in the talk. (I started serious work on vau-calculus shortly after that.)

The theory of fexprs is not (necessarily) trivial

This seemed a very important point to be clear on. (Sorry to partly repeat myself, but my other posting on this was buried several levels down, thus easily overlooked.)

What Wand's paper proved (notwithstanding the unfortunate overgeneralization favored by its admittedly catchy title) is that if a "calculus" of fexprs is constructed by using contexts in the calculus to mark expressions whose operands should not be evaluated, then the equational theory of that "calculus" is necessarily trivial. This type of "calculus" also assumes that evaluation of a subterm and reduction of a subterm are the same thing; so there's a context that prevents all reductions of a subterm, therefore there is no reduction that can be performed in all contexts. And an equational theory is all about reductions that can be performed in all contexts, so since there are no such reductions, the theory is trivial. It's an important result to understand, but only because it shows what a calculus of fexprs must not do if it's going to have a useful equational theory.

Suppose you set up a conservative extension of lambda-calculus in which  (1) there is new syntax for representing unevaluated expressions — cons cells, symbols, and nil — and a new primitive context that marks its subterm as being something to evaluate; and  (2) there are additional reduction rules that specify how to reduce redexes marked by the new evaluation context. The reduction rules that govern evaluation always provide additional possible reductions, never trying to suppress reduction of subterms, so the theory isn't trivialized. In the various calculi I'm studying that work this way, the calculus operator for basic functional abstraction is called "vau", rather than "lambda", but it's actually just the traditional lambda operator with a different name: the reduction rule that eliminates it is just the traditional beta rule, except for superficially altered syntax. The beta rule doesn't cause evaluation of the operand; remember, evaluation is indicated by an explicit context, and the beta rule doesn't add that context, in fact doesn't involve it at all. And there's a name for a functional abstraction that doesn't cause evaluation of its operand: it's called a fexpr. So the (isomorphically) lambda-calculus subset of vau-calculus deals exclusively with fexprs.

LINQ?

If I understood the definition of fexprs correctly, then .NET expression trees provide a similar capability. As this technique is the key to LINQ, I would not call it "a very bad idea" (and it can be used for even more powerful tricks, of course).

I think I could buy a qualified statement: "fexprs are a very bad idea in a PL without a sufficiently rich static type system".

Not the same

.NET expression trees appear to just be `eval'. They're definitely not fexprs.

Trees are not eval

First, I do not agree that expression trees are eval.
Expression trees can be compiled to delegates, which can be invoked, and thus evaluated, but this is not what they are - it's merely one of their features.

Second, I didn't say expression trees (values of type Expression) are fexprs. If anything, expressions of type Expression<Func<...>> (pun not intended) are similar to fexprs. I used the phrase "expression trees" as it is more human-readable and popular than Expression<Func<...>>. Sorry about this confusion.

I see two main differences between fexprs and the mechanism in .NET (I never used fexprs, so take this with a huge grain of salt):
1. The .NET mechanism requires explicit static types.
2. The .NET mechanism only supports operands of delegate types.

Consider:

void Test() {
  f1(() => 2 + 3);
  f2(() => 2 + 3);
}

void f1(Func<int> func) {}

void f2(Expression<Func<int>> e) {}

f1 is invoked with a delegate as an argument, while f2 is invoked with an expression tree representing the ()=>2+3, despite invocation operands being syntactically identical.

So .NET does not allow the moral equivalent of the following code, which presumably accepts both procedures and fexprs as f:

(define (foo f)
  (f (+ 2 3)))

1. foo cannot be typed in the .NET type system to "accept both procedures and fexprs as f"
2. .NET "fexprs" only support "lambdas" as their parameters, and not arbitrary expressions such as 2+3.

It looks like the first restriction is enough to tame most of the potential evil coming from fexprs; I'm not sure whether the second has fundamental reasons or just pragmatic ones - I have wished more than once that it were not present.

Programmatic control over compile-time behavior

Scheme is not the only dynamic language that gives you programmatic control over compile-time behavior. Common Lisp has done this for at least two decades, and it is supported in a portable way across all existing Common Lisp implementations.

Mixing macros and procedures

FEXPRs might count as an example of pseudo-code: a notion of code with a well-defined, computational semantics that doesn't have an adequate notion of computation.

Tom, would Bawden's First-class macros be a more tractable fit for your needs? Are there any cases where one actually wants variables to have values where one cannot tell until run time whether they are bound to procedures or special forms?

re Mixing macros ... Bawden's First-class macros

The short answer is (afaict) "no". As a rule of thumb, if you only let me write code that can be typed (in the general language, not in some subset that the compiler can handle) then you are selling me short.

This isn't to devalue Bawden's work but to understand it as an identification of a useful special case of generalized macros.

-t

First-class sell out

As a rule of thumb, if you only let me write code that can be typed (in the general language, not in some subset that the compiler can handle) then you are selling me short.

Indubitably, you lose admissible code, but in this case, what are the examples that we should care about, if any?

The type system explicitly

The type system explicitly disallows some programs. So the question should be reversed: How do you know that you aren't disallowing useful programs? And if these programs do arise, how bad is the workaround?

I agree that a type system could be a very good way to distinguish between macros and other data. This is similar to using a type system to distinguish procedures from other data, versus using a special scope for functions (as in CL).

bogus programs considered helpful

As a kind of category of examples, consider writing an Emacs extension that is simultaneously composed of un-type-able command functions to be invoked by the generic command loop, and incredibly useful for years on end on the limited set of inputs I give it in practice.

How do I benefit if the language designer for Emacs says "No, you may not run that program because it can't be type checked in this particular way I made up last week for my thesis about a compiler hack?" How is anyone using Emacs rationally harmed if the language designer doesn't prevent me from running my program that is broken in unimportant ways?

-t

Whose type system? Which extension language?

un-type-able command functions

Untypeable in which type system? Neither CL nor Emacs has FEXPRs; all procedures and special forms that can be expressed in either can be assigned (trivial) types in Bawden's type system. I'm guessing you should take another look at the paper.

re "Whose type system? ..."

Bawden's system requires that at the point of applying a first-class macro a compiler can statically determine "everything it needs to know" about the environment where a macro was defined. That's an arbitrary restriction suitable for a compilable *subset* but not for the core language.

Look towards the end of section two of Bawden's paper (the "motivation" section) where he writes:

It is obviously quite difficult to generate reasonable compiled code for lazy-map if the compiler must anticipate that delay (or any of the other parameters!) might be a macro with an arbitrary definition.

But our imagined programmer does not need such excessive generality. All she wants is to be able to switch between [the two aforementioned] definitions of delay[....]

There, he's begged the question before us here. He points out, for example, that if the first-class macro being applied could be quote then the compiler can't easily do much with the code for the arguments supplied to the application.

Well, the possibility of quote (or something like it) in that context is an interesting one: it can be very handy for implementing run-time debugging tools, for example.

Generally speaking, FEXPRs and first-class environments give you plenty of power sufficient to thwart practical static compilation - that's a given. They also give useful power (such as debugger hacks but also beyond that). They are sufficiently expressive with a simple enough semantics that you can define things like Bawden's system in terms of them.

The key "trick" is not to deny FEXPRs and environments to the programmer but to make it easy for the programmer to write code in which it can be easily statically determined that only a disciplined and statically analyzable use of those features is being made.

Last I checked, the GNU Emacs compiler for Emacs lisp compiles some kinds of code with optimizations, but also recognizes code that it really can't do anything with. In the latter case it might issue a warning but basically emits essentially uncompiled code for the interpreter to deal with. That's the right way to do it: let programmers choose what trade-offs to make between what can and can not be optimized and compiled statically.

Also: GNU Emacs *does* have FEXPRs that work properly in the interpreter but that are mis-handled by the compiler. It would be very hard to fix the compiler, though, because its "error" in the way it handles FEXPRs is a deliberate override of GNU Emacs Lisp's usual dynamic scoping rules. An Emacs based on a lexically scoped Lisp would not have to have that compromise in the compiler.

-t

Aha!

I begin to get a glimmer of where you are coming from. Thanks for your stubbornness in this thread: I think you may have convinced a few of us that there is more to FEXPRs than had met the eye.

I still am not convinced, though. I'll explain why I think you are underestimating statically-typed syntax when I find time.

Check Newlisp for fexprs.

The programming language Newlisp has a nice implementation of fexprs. I used CL and Scheme before Newlisp, and I always missed first-class and runtime-generated macros. Fexprs are not only more expressive, they are also easier to use. There is a lot of code with fexprs on my blog.

Compilation is an optimization technique. Just like every optimization, it doesn't work in the general case. But if one limits the way he uses fexprs to be equivalent to, say, CL macros, compilation is theoretically possible. Re Kernel: I could imagine that lexical scope makes sense for Ada, Eiffel, and languages with similar priorities, but I think such a restriction is too strong for languages like Lisp. Why lexical scope? Safety? In my experience, some funarg problems can be solved in a pretty trivial way, with disciplined naming, and others do not occur in practice at all. Speed? Not really that important...

Pitman's reasons are accepted, but I believe the main reasons for the discontinuation of fexprs were social, i.e. it was the time the Lisp community wanted to go mainstream and earn some real money. They already had one of the most powerful languages around, but they had no performance. And whenever they tried to become more popular, people always told them: you are too bloated, we need speed.

Scoping fexprs

Fexprs with *purely* lexical scope would be pretty useless, since they would have no way to evaluate their operands in the dynamic environment (which is often what one wants to do with them, and is in a broad sense the hygienic thing to do with them). The body of a Kernel combiner is evaluated in a local environment that is a child of its static environment — but there may also be an "environment parameter" bound in that local environment to the captured dynamic environment. For example, one could write

($define! get-current-environment (wrap ($vau () e e)))

which creates an applicative of arity zero that locally binds symbol e to its dynamic environment (specified by the first e), and then evaluates symbol e in that local environment (specified by the second e). It's an applicative, i.e. a non-fexpr, because it's wrapped; every applicative is just an operative, i.e. a fexpr, that's been wrapped. Applicatives rarely want to capture their dynamic environments, so to avoid accidents the $lambda operative constructs applicatives that don't (which is why the definition of get-current-environment doesn't use $lambda).

As for why static scope as the primary scoping, that's because it's more stable than dynamic scope. One of the basic design principles of Kernel is that "dangerous things should be difficult to do by accident." If the local environment of a function call is a child of its static environment rather than of its dynamic environment, the person writing the function is more likely to be correct in their assumptions about what the body of the function will do. (In fact, if it were a child of the dynamic environment they wouldn't have the faintest idea what it was going to do.) Experience with Kernel fexprs (such as it is) suggests that in the event, if you're trying to do something that's mostly hygienic, and you go about it in a straightforward way, you won't accidentally commit any hygiene violations that weren't part of your intention. For example, here's the library implementation of $lambda:

($define! $lambda
   ($vau (formals . body) env
      (wrap (eval (list* $vau formals #ignore body)
                  env))))

Working in the local child of the static environment, this constructs an expression consisting of the $vau operative — not the symbol $vau, which gets evaluated in the local environment while constructing this expression — followed by the formal parameter tree specified as the first operand to $lambda, a specification that the dynamic environment is to be ignored (that's the "#ignore"), and the body (consisting of all operands to $lambda after the first one); and then it evaluates this constructed expression in the dynamic environment of the call to $lambda, and wraps the result. There is no possibility that the dynamic environment could capture the symbol $vau that was used when constructing the expression, because that symbol is long gone when the dynamic environment gets ahold of the expression: the symbol $vau was evaluated in the local environment of the call to $lambda, so that the car of the constructed expression is actually the self-evaluating, encapsulated operative object that symbol $vau is bound to in the static environment where $lambda was defined.
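
As another small illustration of operands being evaluated in the dynamic environment, here is a short-circuiting "and" written as an operative (a sketch along the lines of the library derivations in the Kernel report; treat the exact details, and the name $my-and, as illustrative, and note the operands are assumed to evaluate to booleans):

($define! $my-and
   ($vau operands env
      ($cond ((null? operands) #t)
             ((null? (cdr operands)) (eval (car operands) env))
             ((eval (car operands) env)
              (eval (cons $my-and (cdr operands)) env))
             (#t #f))))

($my-and #f (some-unbound-symbol))   ; => #f; the second operand is never
                                     ; evaluated, so the unbound symbol is
                                     ; never even looked up

As in $lambda above, the recursive call is built by consing the operative object itself (the value that symbol $my-and has via the local environment) onto the remaining operands, so the dynamic environment never sees the symbol $my-and at all.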

I can see how someone using a dynamically scoped Lisp would tend to attribute Pitman's position to politics, but I really do think it was all about fexprs being horribly badly behaved, in an insidious way that spread itself across the entire language platform making it impossible to know anything at all about the run-time behavior of an expression — which is what happens when fexprs are supported by a dynamically scoped Lisp. The cure for that horribly bad behavior is to support fexprs judiciously in a "statically scoped" Lisp (though not purely so, because as I've said fexprs need to be able to access their dynamic environments). It's ironic that Pitman's paper and the accompanying momentum for abandoning fexprs happened just before the dawn of the age of statically scoped Lisps (though if I were going to choose a single event that led to the deprecation of fexprs, it wouldn't be Pitman's paper; it would be the early bug that brought about dynamically scoped Lisp in the first place).

No

Fexprs with *purely* lexical scope would be pretty useless

Not in a language (i) where FEXPRs are first-class entities, and (ii) which has stateful variables.

purely?

Not in a language (i) where FEXPRs are first-class entities, and (ii) which has stateful variables.

This... might lead to some interesting insights all around (at the least, insights for each into the reasoning behind the other's position), but I'm getting an uncomfortable feeling we might be about to get lost in different understandings of terminology. I wouldn't ordinarily use the name "fexpr" for anything that wasn't first-class (though in fairness, I rarely work with anything at all that isn't first-class, that being another of Kernel's design guidelines). My understanding of "fexprs with purely lexical scope" is that the fexprs have no way to access their dynamic environment. Supposing that that is what you also understand by "fexprs with purely lexical scope", how do you envision gainfully employing such an entity?