First-class environments. Discuss. ;)

Thomas Lord's recent insistence that a simpler Scheme is possible (including that advanced features such as first-class macros and environments should be provided, with the caveat that their use may be less performant than non-first-class devices) has prompted me to investigate first-class environments.

Way back, there was a bit of discussion of this topic here.

The usual arguments against first-class environments are:

  • They're "dangerous". (IMO, dangerous is good. ;))
  • They mess up ahead-of-time compilation. (Well, many projects use JITs these days.)

On the pro side, we have:

So my question is: given that JITs are commonplace these days, should first-class environments be reconsidered for inclusion into programming languages?

Static reasoning

The programmer benefits as much as the compiler from the ability to reason statically about code. With disciplined use of first-class namespaces I'm sure they could be effective. I expect disciplined use would in practice mean "imitating existing constructs" so then I wonder why you'd bother.

There is another point: the Scheme standard is not really the place for experimentation. If you want to see something in the standard, get it into an implementation first, generate some significant experience with the feature, and then perhaps the committee will listen to you.

I expect disciplined use

I expect disciplined use would in practice mean "imitating existing constructs" so then I wonder why you'd bother.

I agree. I think that the extra expressivity of first-class environments is probably only used in exceptional circumstances (say, writing a module system).

Why I bother is that I'd like to have really clear and simple semantics for my languages. R6RS doesn't even say anything about the interactive top-level, IIRC, and other languages like O'Caml have slightly different semantics for interaction and static compilation. First-class environments seem to make it possible to give a nice and simple description of interactivity, and that's one reason why I find them interesting.

If you want to see something in the standard, get it into an implementation first, generate some significant experience with the feature, and then perhaps the committee will listen to you.

I'm not in any position to, nor am I trying to, influence the committee. I'm doing this for the fun of it.

~~ sigh ~~

I wish I could get the members of the committee to thoroughly respect that principle, and not invent things today to be standardized tomorrow, particularly in WG1.

hmm. I guess I should reply.

It might be helpful to describe what I had intended to propose in Scheme WG1 (before I was expelled from the group). As I go along I'll offer some rationale for that. And I'll respond (out of thread) to the comments that have appeared here so far.

My goals were threefold:

  • To simplify and better formalize the Scheme Report.
  • To, for the first time in the Report's history, provide a convincing and useful semantics for fully dynamic, interactive Scheme programming environments.
  • To, for the first time in any report but the poorly received R6RS, provide a mechanism for program-created disjoint, encapsulated types.

My strategy was to be as follows:

  • Define a tinier-than-customary "minimal Scheme" that resembles nothing so much as a plausible low-level intermediate form in a compiler or interpreter. This could be dubbed "WG0 Scheme".
  • Offer at least one complete formal semantics for WG0 Scheme.
  • Specify essentially all of the features found in R5RS by giving customary narrative descriptions of these features, but also providing a formal definition in the form of working code in WG0 Scheme.
  • Demonstrate mathematically that any existing, correct implementation of R5RS is, in retrospect, a correct implementation of the WG0 Scheme implementation of R5RS. That is, that full upward compatibility has been achieved for programs which use only the R5RS features.
  • Demonstrate (with an implementation) that a naive (i.e., highly unoptimized) direct implementation of WG0 Scheme could run the reference implementation of R5RS well enough to run a wide range of R5RS programs with performance that, while it wouldn't win any benchmarks, would be "good enough" for a wide variety of purposes. This was also to demonstrate that a direct WG0 implementation could be tiny and easy to understand.
  • Demonstrate with runnable WG0 Scheme code that highly dynamic and interactive programming environments can also be implemented on WG0 Scheme. Declare victory in WG1's charter to provide a convincing "REPL semantics".
  • Propose that WG1 draft a new Scheme Report that specifies WG0 Scheme as a set of optional features, that it then specify (more or less) all of R5RS in terms of WG0 Scheme as required features, and that it at least briefly point out the intended approach to "REPL semantics".
  • Time permitting, demonstrate that WG0 Scheme can be very nicely implemented as a program that uses the R5RS dialect augmented with "syntax-case" macros (macros that permit the deliberate breaking of hygiene). That is to say, at least some extant optimizing compilers for Scheme are, already - when certain Scheme libraries are added - optimizing WG0 compilers. In the ideal outcome of this, an R5RS program compiled indirectly using the WG0 libraries would produce code just about as good as if the R5RS program were compiled directly. While I believe that is possible in the case of some compilers, based on the narrative descriptions of the optimizations they perform, it is an uncertainty solved only by experiment whether or not this last bit actually works out.

Well, that was the plan, anyway.

The WG0 Scheme would contain the basic types of R5RS (characters, numbers, cons pairs, vectors, etc.) It would contain only the special forms LAMBDA, SET!, FEXPR, and THE-ENVIRONMENT. It would contain the procedure EVAL. It would contain a simple mechanism for creating new encapsulated types (much like the one observed in the Kernel programming language specification).

THE-ENVIRONMENT (invoked (the-environment)) would yield a procedural interface to the dynamic instance of the enclosing lexical environment. The procedural interface can read and write existing variables, but not (by default) change binding contours. Environments in which binding contours can change can be created using LAMBDA.

EVAL would of course accept a form to evaluate and a reified environment, and "do the obvious thing", although it would have some provisions to afford "extensible" environments (e.g., permit at least the late introduction of formerly unbound variables).

A FEXPR is like a LAMBDA except that when it occurs in the first position of an application, the operands are not evaluated and an additional parameter is passed which is the caller's value for (the-environment). APPLY of a FEXPR would pass an extension of (the-environment) in which a QUOTE form is certainly bound, and pass the arguments as QUOTED forms (e.g., (apply and (list #t x)) with x bound to #f would be equivalent to (and '#t '#f)).
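To make the calling convention concrete, here is a minimal sketch of a two-operand AND written as a FEXPR. The surface syntax of the fexpr form, and the name some-expensive-check, are assumptions for illustration; define and if are used R5RS-style for readability.

(define and2
  (fexpr (a b caller-env)
    ;; A and B arrive as unevaluated forms; CALLER-ENV is the
    ;; caller's value for (the-environment).
    (if (eval a caller-env)
        (eval b caller-env)
        #f)))

;; The second operand is evaluated only when the first is true:
(and2 (> 2 1) (some-expensive-check))   ; runs the check
(and2 (> 1 2) (some-expensive-check))   ; never runs it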

In WG0, the default "top-level" is immutable and contains only bindings for the WG0 primitives. The basic "unit" of code is a simple stream of S-exp forms. A dialect, such as R5RS, can be expressed as a WG0 program which, when run, ignores WG0 "units" and takes over reading, processing, and evaluating forms directly.

It's quite simple, really, and I make no claim to it being original. Rather, it seemed to me like a lot of "well known stuff" about Scheme that both deserved to be written down concisely in one place and that, by writing it down that way, could simplify the Report, put the formal semantics on a firmer ground, provide a convincing REPL semantics, and add the missing feature for creating new encapsulated types.

Above, Noel and Msimoni mention the importance to programmers of being able to statically reason about programs. Features such as EVAL, FEXPRS, and THE-ENVIRONMENT clearly limit that ability. On the other hand, they don't eliminate the ability to statically reason about programs. In particular, if you have a WG0 implementation of R5RS you'll find that in strictly R5RS programs, all calls to FEXPRs are statically detectable and easily eliminated - more or less using techniques that date all the way back to RABBIT, albeit with a few simple additional rules that express some EVAL identities. For example, (eval form (the-environment)) is, if EVAL and THE-ENVIRONMENT have their standard bindings in the lexical context, equivalent to form. All of the ways we are used to statically reasoning about, say, an R5RS program are still valid for a WG0 program written in the R5RS dialect.
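A tiny illustration of that identity, assuming EVAL and THE-ENVIRONMENT have their standard bindings in the enclosing scope:

;; A compiler may rewrite this ...
(eval '(vector-ref v i) (the-environment))
;; ... to simply this, with no change in behavior.
(vector-ref v i)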

On the other hand, for programs that use features like EVAL etc. in more complex ways: WG0 at least still gives you a solid operational semantics. You can reason about what the program will do even if it is harder to reason about what "value" the program will ultimately produce. You still get plenty of opportunities to assert equations that equate alternative expressions of a piece of code.

johncowan laments "I wish I could get the members of the committee to thoroughly respect that principle, and not invent things today to be standardized tomorrow, particularly in WG1."

Given that he himself seems to have proposed new inventions for WG1 Scheme, such as in his Unicode proposal, I'm afraid that whatever "principle" is at work here is lost on me. It was, perhaps by coincidence, just a few hours after I remarked on that (and then was soundly flamed by John) that my "expulsion" from WG1 took effect. I perhaps unfairly infer a "principle" at work there, but I have enough hope in humanity and in John to believe that was not quite the principle intended to be put into operation.

THE-ENVIRONMENT (invoked

THE-ENVIRONMENT (invoked (the-environment)) would yield a procedural interface to the dynamic instance of the enclosing lexical environment. The procedural interface can read and write existing variables, but not (by default) change binding contours. Environments in which binding contours can change can be created using LAMBDA.

By changing binding contours, do you mean e.g. enriching an environment with new bindings?

As to LAMBDA, can't it be defined in terms of FEXPR and EVAL?

re "THE-ENVIRONMENT"

About binding contours: yes. I make a further distinction: You can permit the dynamic extension of an environment only in ways that capture no previously bound identifier, or you can permit the dynamic extension of an environment in ways that can alter (at run time) which binding an identifier refers to. I propose only the former and care to say as little as possible about the hairy topic of the latter (because I can't begin to fathom how to deal with it).

About LAMBDA in terms of FEXPR and EVAL: yes. Cf. the Kernel programming language, roughly speaking. As a matter of exposition style or choice of foundation: I don't mind taking LAMBDA as primitive, but there are, indeed, some kinds of simplification, from some perspectives, that come from making LAMBDA derived in terms of FEXPR and EVAL. Roughly, a LAMBDA is a FEXPR that always evaluates all of its arguments before doing anything else. Depending on your goals that can be a useful reduction or just a tedious one.
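For comparison, this is roughly the shape of the derivation in the Kernel report (quoted loosely; see the report for the exact definition), where $vau is the fexpr constructor and wrap turns an operative into an applicative that evaluates its operands first:

($define! $lambda
  ($vau (formals . body) env
    ;; Build an operative that ignores its dynamic environment, then
    ;; WRAP it so its operands are evaluated before the body runs.
    (wrap (eval (list* $vau formals #ignore body) env))))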

Thomas, missing </code>

Right after (apply and (list #t x)).

It only affects the rest of the paragraph in Firefox. In Chrome, it's affecting the rest of the thread.

Thanks, fixed.

Thanks, fixed. (Although untested since I don't run Chrome.)

Thank you.

Thank you. [I would not be offended if you deleted this subthread, since it is no longer relevant.]

Hey, I'm a member too

What I'm suggesting, for those not following the scheme-reports-wg1 list, is allowing implementations to break the R5RS rule that says string<? and friends are lexicographical extensions of char<? and friends. Hardly a radical revision of the basic Scheme infrastructure.

If Mr. Lord thinks I flamed him, he should perhaps re-read some Erik Naggum posts.

First class environments

... are to modules and encapsulation almost exactly what reified continuations are to control flow.

They are a simple means of implementing or reasoning about the semantics of whatever module system you like, or a way to facilitate experimentation about module systems, or a way to emulate the semantics of any module system found in a language you're writing a system to interpret, etc. They make it unnecessary for a "minimal" standard to specify a module system at all.
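As a toy illustration of that claim, a "module" can be nothing more than a captured environment. The sketch below assumes the procedural 'ref interface that Tom describes further down in this thread, plus R5RS-style internal define; none of it is code from an actual proposal.

(define counter-module
  ((lambda ()
     (define count 0)
     (define (next!) (set! count (+ count 1)) count)
     ;; The module's "exports" are just its lexical environment.
     (the-environment))))

;; "Importing" a name is just looking it up in that environment.
(define next! (counter-module 'ref 'next!))
(next!)   ; => 1
(next!)   ; => 2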

But the current scheme process isn't going for "minimal." Its goals are to "facilitate the sharing of code" and "produce a standard such that any conforming program is also a conforming program of WG2 scheme" (Per the WG1 working group charter). These are not evil purposes, but they simply do not admit simplifying the standard by removing restrictions, nor especially by introducing more general and simpler constructs.

Sharing of code means putting together modules that were written by different people, probably for different projects, into the same program. That goal does not allow the existence of a simple minimal tool for implementation of and experimentation with module systems; it requires exactly (and ONLY) the same module system in use across all code.

Producing a language subset of WG2 scheme (such that any WG1 program is also a valid WG2 program and has the same semantics under WG2 that it has under WG1) does not allow the simplification or removal of restrictions from any construct that is complicated or restricted by WG2, nor the introduction of any more general feature not found in WG2. Because environments are not first-class in WG2, first-class environments cannot be allowed in WG1.

Scheme as a notation for computer science concepts and algorithms is no longer the principle at work in the scheme standardization process, and extensions to the domain of what scheme can be used to reason about (such as first-class environments or FEXPRs would provide) are no longer considered desirable by that process.

You're advocating two ideas in the above, both of them semantically radical. Both could simplify the language by removing restrictions, but both would result in a lisp which simply isn't Scheme.

In the first place, first class environments are THE solution to modules and THE mechanism by which a lisp that enables real discoveries about module systems can be made. Further, they are a compatible extension to current scheme semantics. But as a requirement, they fly in the face of the goals of the current standardization efforts. They've been flatly rejected in the past by people who want to build "efficient, statically optimizable" implementations of scheme, and who are still there. More than that, the currently favored module system is the part of R6RS (virtually the ONLY part) that was adopted wholesale by people who otherwise rejected R6RS totally.

The community of scheme programmers was *REALLY* hurting for a module system, the one in R6RS appears good enough, and at this point I think most of the people involved really want to put the pain of arguing about module systems out of their minds. First Class Environments are a stillborn idea in that reality. A language in which environments are first class is not a language in which modules can be meaningfully standardized in a way compatible with the restricted compromise of module definitions that people are now desperately clinging to.

To use a metaphor, a drowning man doesn't want a build-your-own-boat kit, even though he might learn a lot more about boats and naval architecture from getting one; he wants a boat. And a formerly-drowning man who now has a boat, has no room aboard for the kit and will rightly reject it for fear that its weight will upset his boat.

In the second place, FEXPR semantics violate the very fundamental principle, adhered to in scheme since its very beginning, that the called procedure and the argument expressions are evaluated in the same environment, by the same rules, before the procedure is called. While they could completely eliminate macrology and the runtime / loadtime phase separation and all the semantic complications that those things bring about, they break one of the very first definitional principles of Scheme and they break it good and hard.

The scheme community is now very invested in its macrology; they got there by long hard work and emotional processing and yelling and screaming and weeping and gnashing of teeth, and they still remember the pain of not having a standard macrology. You will not pry it away except from their cold dead fingers, and you will not redefine it without defeating them in mortal combat.

Unicode is, make no mistake, useful. Having everybody doing it the same way is useful. But, ultimately, there isn't that much to it semantically speaking, and there is nothing established in the scheme community which its fundamental semantics contradict. No fundamental computer science insights can be had from implementing it or reasoning about it, and it gives us no new tools for thinking about hard problems, but it doesn't particularly upset the very fundamental notions people have about what scheme is and what's good for it.

It's solved and safe and (although I myself have ranted and argued about the hamfisted way it's implemented) mostly uncontroversial; it can be standardized and will make a million (mostly-trivial) things work more smoothly together. You have to understand the difference. The "innovation" in Unicode libraries is like deciding what color to paint the living room, and "innovations" with FEXPRs and First-Class Environments are like deciding where to start building and whether you want to build a house or a railroad depot.

So, I advise you to come to the same conclusion as me; the language you want to work on is a lisp, but it is not any kind of scheme. They were right to expel you; it helps them to achieve the goals stated in the working group charter. Also, they did you a favor by expelling you; you are now free to work on a thing both more important and closer to your heart.

Ray

re: First Class Environments

Ray,

You and I have so often agreed on so many things that I'm shocked, shocked I say, to have to pick some nits with you here. Although there is also much to agree with.

I agree most especially with this: "Also, they did you a favor by expelling you; you are now free to work on a thing both more important and closer to your heart." That would be the theory. A nice thing is that, so far, that theory is borne out in practice. For example, I am free to not give a $#@! about proving what I know - that you can write WG0 atop a good R5-ish compiler using syntax-case: right there are several thousand lines of very tedious, boring code for which I went from having a process-imposed deadline to a "deadline" of "Meh, if it later seems useful maybe I should do it." Similarly, I just started over on my WG0 implementation from scratch because I can change from "working, even if ugly, within a couple of months" to "with all deliberate speed while producing a lovely implementation" (for my own aesthetic sense of "lovely").

Mind you, "they" did me no favor in the way in which they expelled me (e.g., at the very, very least, might "they" not have offered the courtesy of privately asking for my resignation?). Instead, I was "informed" of my expulsion - and even that my expulsion was being considered by the esteemed Steering Committee - by a generic bounce message when I sent mail to the WG1 list after said expulsion took place. This (LtU) is not the place to further rehearse my long list of grievances about the administration of WG1 from start to finish, but you get some sense of them, there.

I disagree with you about the nature of the WG1 charter. You are focusing on a few clauses and ignoring others. The WG0 approach I've described would have well satisfied the charter. Much of the ongoing discussion by the non-expelled members really does not. I guess that you are right that the charter means whatever WG1 takes it to mean -- but I think my reading of it was less "stretched" in terms of the plain language of the thing. That's life, I guess.

I disagree with you that my approach is particularly radical. In particular, it has (including what I'm calling FEXPRs) some strong resemblance to the Revised Report (R1RS). It also can be expressed as trivial extensions to, for example, the meta-circular implementation of SICP. It is more or less directly present in SCM and until quite recently in SCM's bastard child Guile. I agree with you, though, that (by social convention) - WG0 is no longer Scheme. It needs some distinct name. My WG0 interpreter is using the name "simp" for itself. It's tempting to call the language "SimpScheme" because that has some nice resonances but I fear that one of those same nice resonances might cause undue confusion with an otherwise respectable textbook.

I disagree strongly that Mr. Cowan's Unicode proposal is a bike-shed issue (what arbitrary color to paint something). In particular, he wants to break string ordering predicates in an upward incompatible way relative to R5 - unprecedented and, in my view, unwelcome. He could fix my complaint just by choosing new names for his proposed predicates.

I agree with you that I'm proposing a lisp. In a better world, I would still say I'm proposing a lisp deserving of being the next Scheme. I agree with you about the emotional investment of the (so-called) "Scheme community" in such things as avoiding FEXPRs in modern dialects in favor of macros but disagree if you mean to indicate that they have a coherent position on that topic.

Scheme is dead. Long live Scheme.

Details, please?

Tom, can we get more details on what a first-class environment proposal would entail? It seems like a lot of restrictions would be necessary to avoid the possibility of violating basic invariants we've come to expect from our code. For instance, when I call a function (or fexpr, or whatever is callable), I don't expect to end up with a bunch of new bindings in my current environment (though there are cases, notably at the top level, where this could be useful). I certainly wouldn't expect that some bindings I already have would be removed. I might be OK with having bindings mutated, though this is dynamic scoping, which is not something I normally want. The point is, there seems to be a big design space, and I'd like to see a detailed proposal about what is, and what isn't, going to be in it (you've provided some details, but not enough for me to feel like I really understand what it is you're after). The analogy with continuations is a good one; having first-class continuations doesn't mean that you can go up the call stack and arbitrarily delete or rearrange the order of frames -- it's only manageable because you can capture the continuation as an essentially immutable object.

What would also be interesting is a more detailed description of the kinds of things one might be able to do with FCE that are hard/impossible to do without them (and in more detail than just e.g. "module systems").

I'm not by any means against FCEs in principle; on the contrary, I think they're fascinating, but I don't have a good feel for what the use cases are or what the tradeoffs are (other than defeating some optimizations).

Tom, I agree with Ray that you're working on a new lisp, which is great. Maybe you and John Shutt can join forces and collaborate on a more fleshed-out version of Kernel/simp? I think this would be a valuable contribution to the language community. It seems like a lot of the opposition to FCE is somewhat knee-jerk; having an actual language with FCEs would at least make the issues more tangible.

details

Michael,

Please forgive me for not giving a fully fleshed out account here. The original plan, when I was on WG1, was to have such a thing done perhaps a month or two from now - but now I'm more focused on "simp" and taking "necessary distraction breaks" in contexts like here (and watching cute cat videos :-).

But also please let me try to raise your comfort level with some easy details:

All that I want from the primitive form of (the-environment) is a procedure that lets me set and get the values of statically, lexically apparent variables by dynamically computed name. E.g., analogs of these should work (modulo a less crude interface to environments):

((lambda (x) 
   (display ((the-environment) 'ref 'x))
   (newline)
   (display ((the-environment) 'set 'x 42))
   (newline))
 41)

prints:
41
42

You can write that version of THE-ENVIRONMENT using rather tedious syntax-case macros in a dialect of R5RS extended to permit them. And if you avoid using THE-ENVIRONMENT, you pay no performance penalty for it. And if you fail to avoid using THE-ENVIRONMENT, the performance penalty you pay is in most cases measurable but not so large as to make it useless.

Now, because we gave FCE's a procedural reification, we're free to add other operations to environments created by means other than a direct call to THE-ENVIRONMENT. We could, for example, add (an-environment 'define 'new-x 13).

We can make sure that primitive EVAL copes with that but we have to be clear what such a DEFINE operator is permitted to do.

There are two choices: we can make the detailed definitions of these things such that a DEFINE operator can change which binding a previously entered identifier refers to -- or we can restrict DEFINE such that it may only add new bindings for previously unbound identifiers. I propose the latter for WG0 (so, no - we specifically avoid dynamic binding in the sense that you mean that).
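A small sketch of the restriction being proposed. Here an-environment is the extensible environment from the example above, assumed to already hold a binding for X but none for Y; the exact error behavior is an assumption.

(an-environment 'define 'y 1)    ; fine: adds a binding for a previously
                                 ; unbound identifier
(an-environment 'define 'x 2)    ; rejected: X is already bound, so this
                                 ; would change which binding X refers to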

If you would rather have FCE's in which DEFINE operators can change binding contours - well, you can do it but you pay a price. You'll have to add to your library of WG0 code an implementation of DYNAMIC-EVAL and pay some performance costs for the run-time look-up of identifier to binding mappings at each reference or mutation. I think that this follows the good principle of "pay as you go" for exotic features. At least we still have a rigidly formal, robust semantics for DYNAMIC-EVAL, based on its plausible definitions in basic WG0.

As for Shutt, Ray, and me collaborating: I'm not sure I see the need. I infer that Ray is a happy camper on his own. I infer that Mr. Shutt is closing in on wrapping up some graduate thesis work and, anyway, he's way ahead of me on expressing some of these ideas in academic style -- I can't see much upside for him in my distracting him, other than to note that he's got some fans already.

I hope that helps. I'm not entirely sure what missing details to add to make it more persuasive for you, and I hope I've at least added something in that direction. For the moment, for me: back to simp. :-)

Collaboration? (and ideas about FCE's)

As for Shutt, Ray, and me collaborating: I'm not sure I see the need. I infer that Ray is a happy camper on his own. I infer that Mr. Shutt is closing in on wrapping up some graduate thesis work and, anyway, he's way ahead of me on expressing some of these ideas in academic style -- I can't see much upside for him in my distracting him, other than to note that he's got some fans already.

Hm. I'm certainly not averse to collaborating; although I'm happy to do my own thinking and implement my own ideas, I often find myself "waiting for insight" for months, or in the case of (my versions of) fexpr semantics or functions with multiple return values, evolving ideas over the course of years, before I really come to know what is exactly the "Right" semantics on a given subject. Even at that, the perception of "Rightness" is far from universal. Being part of a collaboration would most likely speed that process immensely, because we could pool our ability to have insights or see problems with the insights we've (collectively) had.

In order to explain my ideas about properly handling and programming with first-class environments, I'll start with the notation I'm using for functions that return multiple values. This is because that notation makes it easier for me to explain my definition of eval.

I use the pipe character in a call to introduce bindings within the current binding contour. So for example the code

;; returns the sum of foo and bar if it's more than 2.
;; returns 2 otherwise.
(if (> (+ foo bar | baz) 2) baz 2)

expresses the same semantics that, in scheme, would be expressed as

;; returns the sum of foo and bar if it's more than 2.
;; returns 2 otherwise.
(let ((baz (+ foo bar)))
   (if (> baz 2) baz 2))

... well, almost the same semantics. In the "pipe" version above, the binding persists to the end of the enclosing binding contour (let or lambda) whereas in the "let" version, the binding contour is restricted by the immediate let to the if statement.

If that were all there were to the pipe, it'd be a pretty trivial notational convenience. But it's easy to generalize the notation to handle multi-argument continuations and functions with multiple return values, without mucking about with extra nesting in "receive" forms etc. So, for example, a call to the function that returns both the sine and cosine of an angle (because it's way cheaper to compute them together and you usually need both for trig) can be written simply as

;; the sin function returns cosine as its second return value.
;; here we capture (by naming) both return values in the local 
;; scope.
(sin theta | sintheta costheta)

Now, if you're with me on the pipe notation, I'll start talking about eval. Eval, like sin above, returns two values. The first is the result of the evaluation, and the second is an environment identical to the environment with which eval was called except that it may have been extended or mutated by the evaluation itself. (note that the function can be specialized for single-return versus multiple-return contexts, so we don't have to do the "heavy lifting" of constructing a new environment all the time). So eval does not normally have side-effects to an environment, but we can use set! to mutate an environment deliberately by writing something like

 
;; env1 has no binding for foo
(seq
   (eval '(define foo 2) env1 | defreturn envreturn)
   (set! env1 envreturn))
;; in the new value of environment env1, foo is bound to 2.

As for deliberate alterations to the current environment via eval, it would work in a similar way. The code could look like this:

(seq
   (eval '(define foo 2) (the-environment) | defreturn envreturn)
   ;; setting the current environment to a new value requires 
   ;; a special form because it's fundamentally different from 
   ;; set!'s contract of modifying a single binding within that 
   ;; environment.
   (set-env! envreturn))

Now, I want to make a point about the semantics of (the-environment) as I envision it: It returns the value of the current environment, not a reference to the current environment. That is, mutations to a value returned by (the-environment) will not result in changing the actual current environment, nor will introducing bindings in the current environment change a value that has already been returned by (the-environment). You can think of it as an environment constructor with copying semantics, but not as a way of grabbing a pointer that you can use to do out-of-band mutations on the actual environment you're using.

If you want to do those dangerous kind of mutations, at least in my lisp, you have to actually call the special form provided for exactly that dangerous kind of mutation: set-env!
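A sketch of what that copying semantics means in practice (using Scheme-style define loosely for illustration):

(define snap (the-environment))
(define foo 2)        ; introduce a new binding in the current environment
(eval 'foo snap)      ; FOO is not found: SNAP is a value captured before
                      ; FOO existed, not a live reference to the environment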

Ray

guarded environments (also, abstractive power)

(Re collaboration, Tom's summed up my situation quite well.)

The parallel between continuations and FCEs is, IMHO, pretty spectacular. Continuations and environments are both naturally arranged in hierarchies — each continuation has a set of descendants that are within its dynamic extent, and each environment has a set of descendants that are within its "static extent" (the symmetry is just too compelling for me to resist this terminology). In Kernel, I've got "keyed dynamic variables" that provide encapsulated dynamically scoped data (akin to Scheme's current input/output ports), following the continuation hierarchy; and "keyed static variables" that provide encapsulated statically scoped data, following the environment hierarchy. Kernel also has a mechanism called guarded continuations, which are something like exception handling except that, instead of catching abnormal transfer of control based on an explicit type hierarchy (which Kernel doesn't have), it uses the natural continuation hierarchy. I have wondered about (but never had time to pursue) the possibility of an analogous "guarded environments" mechanism, that could intercept an attempt to use an environment to evaluate an expression that it wasn't originally associated with. For example, when a fexpr uses its dynamic environment to evaluate one of its operands, that presumably shouldn't trip anything because the operand was already in the static extent of that environment; but evaluating an operand in the fexpr's local environment, which is a child of its static environment, is sending the operand to a different static extent, analogous to passing an object to a remote continuation.

One comment on style of fexprs. Ray and Tom and I each seem to have different philosophies on what constitutes, or is acceptable in the way of, complexity of design. (Of course, differing philosophies like that can sometimes produce amazingly good collaborations, as well as utter failures, but I digress.) Note, for example, that Kernel has zero special forms — not four, or three, or even just one. Although that's consistent with my personal aesthetics, I'm actually not doing it for that reason (an unprovable proposition, of course). I'm interested in abstractive power — for which I have a moderately specific notion in mind, which I've been attempting to formalize, and a basic element of my work on this subject is that whenever the abstractive facilities of a language fail to apply to the base language in a free and uniform way (recalling the RxRS passage about weaknesses and restrictions that make additional features appear necessary), that failure ultimately bounds the abstractive power of the language.

re: guarded environments

John,

At the risk of distracting you:

How are you using the word "special forms"? I think that when I say "special form" I mean nothing more or less than what you mean (in Kernel docs) when you say "operative". (I'm trying to not say anything definitive about what operatives and applicatives are "primitive" - just starting exposition with a few I take to be "built in".)

Also, what do you mean by "abstractive facilities of a language [applying] to the base language in a free and uniform way"? For example, how does that describe $vau, wrap, and unwrap in some way that you think I'm missing with my approach?

For guarded environments, if I understand your intent correctly, that is part of why I propose a procedural interface to environments.

How are you using the word

How are you using the word "special forms"? I think that when I say "special form" I mean nothing more or less than what you mean (in Kernel docs) when you say "operative".

I can't speak for John of course, but when I use the words "special" and "form" together, I'm talking about any procedure that does not evaluate its arguments immediately, once each, and in normal order.

One of the things I like about the fexpr semantics is that it's possible to do the whole system so that every procedure, including the special forms, is first-class and applicable. But I need to distinguish between "normal" and "other" procedures somehow, so I use the "special form" rubric for that.

How I was using the word

Tom was spot on, I think, in identifying just where I'd misunderstood him. I tend to use "special form" only for a second-class operative determined by a reserved symbol. Whereas, when I'm being just mildly sloppy, I refer to all first-class operatives in Kernel as "fexprs", even though in the strictest technical sense only the compound Kernel operatives would be fexprs; that's a shorthand terminology that's also mentioned in the Wikipedia article "fexpr". (I do know of genuinely first-class compound operatives that aren't fexprs — I noticed them as part of my dissertation, where I call them "single-phase macros". SPMs are kind of fun, in an esoteric-language sort of way.)

I'm probably inclined to associate "special form" with second-class because the key to most of what I've figured out about making fexprs work — both in practice and in theory — hinges on thinking of them as the usual case, rather than an exceptional case. I call this the "explicit evaluation" view (or paradigm): that applicatives induce evaluation of their operands, in contrast to the implicit-evaluation view that operatives suppress evaluation of their operands. The implicit-evaluation view causes the trivialization of theory in Wand's 1998 paper. The distinction is also visible in meta-circular evaluation: for Scheme, general dispatch from the combination-handler is to mc-apply, with both operator and operands evaluated for the dispatch; whereas for Kernel, general dispatch from the combination-handler is to mc-operate, with neither operator nor operands evaluated for the dispatch (and the Kernel dispatch is the expected base case, whereas Scheme dispatch is expected to have exceptions). It's also significant that only under implicit evaluation can an operative accessing its operands be reification (converting an aspect of computation state into a data structure): under explicit evaluation, the operands were never anything else but data structures, so it's not possible to convert them into data structures, and no reification occurs.
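A sketch of the two dispatch styles as they might appear in a meta-circular evaluator; mc-eval, mc-apply, and mc-operate are the customary hypothetical names, not code from either system.

;; Implicit evaluation (Scheme-style): operator and operands are all
;; evaluated before the dispatch to mc-apply.
(mc-apply (mc-eval (car form) env)
          (map (lambda (operand) (mc-eval operand env)) (cdr form)))

;; Explicit evaluation (Kernel-style): the combination is handed over
;; with neither operator nor operands evaluated; mc-operate evaluates
;; the operator itself, and the resulting combiner decides whether its
;; operands get evaluated at all.
(mc-operate (car form) (cdr form) env)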

I tend to be wary of procedural representations of environments for a similar reason: procedurally represented environments are something I've encountered primarily in reflective Lisps, where the procedural representation was leveraged for general weirdness, whereas I'm trying to be non-reflective and provide a conservative environment facility.

procedural introspection & intercession

I tend to be wary of procedural representations of environments for a similar reason: procedurally represented environments are something I've encountered primarily in reflective Lisps, where the procedural representation was leveraged for general weirdness, whereas I'm trying to be non-reflective and provide a conservative environment facility.

Could you elaborate?

Most of this discussion in this thread is uninteresting to me - but this was surprising. What do you mean by non-reflective?

For example, how does Kernel compare to OMeta or other reflective term rewriting systems? I see OMeta as Alan Kay, Ian Piumarta and Alessandro Warth's interpretation of fexpr semantics...

fexprs and reflection

I admit, I wasn't familiar with OMeta. Perhaps I'm missing something because my knowledge of it is necessarily too superficial to pick up the subtleties, but it comes across to me as an interesting pattern-matching system that is neither fexpr-based, nor reflective. I do note that neither "fexpr" nor "reflect" shows up in the dissertation, and I don't see any really glaringly obvious references in the bibliography to support the presence of either feature, either (although there might be some that I'm simply not recognizing). What I perceive is a system that can perform pattern-matching operations on data structures, and in doing so can serve as a meta-circular evaluator, which doesn't seem to be anything more than Scheme could do. Perhaps you could comment on your view of how these two features pertain to OMeta? Perhaps my remarks on reflection below will help you see where we're failing to connect.

I'll try to elaborate on non-reflection in Kernel. (Sorry I didn't manage to make the following shorter.)

Reflection, as I understand it, is the ability of a computation to directly access facets of its own computational state that it wouldn't ordinarily be able to access. What is "ordinary" is a matter of how one perceives the computational paradigm. A stock technique for reflection is reification, which provides access to one of these ordinarily-inaccessible facets by manifesting it as a data structure — thing-ifying it.

Now, I'm particularly concerned with the contrast between two different ways of thinking about how combiners handle their operands. One view is that operatives (such as fexprs) suppress evaluation of their operands. That is, when we see an operand, we assume that it is meant to be evaluated — and then, if the combiner turns out to be operative rather than applicative, we take back the decision to evaluate it. I call this implicit evaluation. Unfortunately, when you go to construct a calculus for formal reasoning about computation, "suppressing subexpression evaluation" in certain contexts translates into suppressing subterm reduction in certain contexts, which immediately means that the term reduction relation is not compatible, therefore the term reduction relation does not imply contextual equivalence, therefore the contextual equivalence relation is rendered trivial — which is the reasoning in Mitchell Wand's classic 1998 paper.

The alternative view is that applicatives induce evaluation of their operands. So the operands are just data structures, and if they are passed unevaluated to an operative, that's just what happens to those data structures. If the combiner turns out to be applicative, then we commit to evaluating the operands, a decision that is irrevocable; and since there are no "suppressing" contexts involved, it's entirely possible to construct a compatible lambda-like calculus for this, with a nontrivial theory (whose nontrivial equations all involve terms that couldn't even be expressed in Wand's theory, but that's another story).

Now, here's where reification comes in. The term reify comes, of course, from philosophy; it's a kind of conceptual error that philosophers accuse each other of. We would call it a type error, in which something is treated as if it belonged to a type to which it does not belong. Reification in the reflective computational sense is the conversion of a facet of computation into a data structure, when that facet is ordinarily understood not to be a data structure. Why then would one consider a fexpr intrinsically reflective? Because it reifies its operands, of course. It might also be seen as "reifying" its dynamic environment, but I've never known anyone to claim that a fexpr is only reflective if it accesses its dynamic environment. To say that fexprs are reflective is to say that when fexprs access their operands they are performing reification. And under the explicit-evaluation assumption, when a fexpr accesses its operands it isn't reifying anything, because the operands started out as data structures and never stopped being data structures, so they can't be "reified". So, to say that fexprs are reflective is to embrace the implicit-evaluation paradigm. Which leads to all kinds of conceptual problems, including the trivialization of theory I described a moment ago.

So as I see it, through the lens of explicit evaluation, fexprs are not intrinsically reflective; and I am interested in studying fexprs for themselves. It will promote that study to separate fexprs as much as possible from this other feature (reflection) that they have been perceived as necessarily entangled with. So I really want to avoid flourishes, such as procedurally represented environments, that are favored in reflective Lisps because they afford lots of reflective power — when I don't want lots of reflective power, in fact I'd rather not have any at all.

set-env! scares me

Ray,

I have to think about that. I'm scared of the performance implications of having "the-environment" clone environments and by the performance and semantic implications of "set-env!".

Yah, it's a bit scary....

The performance implications are in fact scary, and I'll probably define another procedure, likely named eval!, that just mutates the environment in place. I expect eval to be frequently used in linear-update contexts, so in practice it will often be substitutable by eval!.

But the eval/set-env! pair is, I think, a way to abstract "environmental" side effects (that is, side effects on the environment). For example, you could use eval to do a very complicated transaction involving many steps, tests, etc, in a "scratch" environment, and then either set-env! to commit the changes atomically, or just drop the "scratch" environment if things don't work out.
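In the pipe notation above, that transactional pattern might look like this; transaction-forms and acceptable? are hypothetical placeholders.

(seq
   ;; Run the whole transaction against the current environment; any
   ;; definitions or mutations land only in NEW-ENV.
   (eval transaction-forms (the-environment) | result new-env)
   (if (acceptable? result)
       (set-env! new-env)   ; commit all the environmental effects at once
       result))             ; otherwise NEW-ENV is simply dropped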

OK, I get it.

Thanks for the clarification, Tom!

Explicit Evaluation

John Shutt wrote:

the key to most of what I've figured out about making fexprs work — both in practice and in theory — hinges on thinking of them as the usual case, rather than an exceptional case. I call this the "explicit evaluation" view (or paradigm): that applicatives induce evaluation of their operands, in contrast to the implicit-evaluation view that operatives suppress evaluation of their operands.

I absolutely agree. That was a breakthrough moment in my thinking about the semantics here too. Prior to that insight, I was looking at "runtime macros" in terms of runtime code transformation and finding that there was no way to write apply where you didn't have to know in advance what it was you were applying - which sort of defeats the point of apply. There were several trains of thought that came together there - separate compilability of modules was another important one - and finally I worked out what I needed to "encode" in an invocation frame in order to call a code-transforming object.

The blinding insight that hit about then was that "normal" procedures could use the same kind of invocation frame. All it would require would be that, like the "macros", they'd have to control the evaluation of their arguments. So I could compile a call to something without knowing what it was, if I set up everything, regardless of whether it was a "function" or a "macro", to use this new kind of extended call frame.

So suddenly I had a way to write "apply" that could be used for macros or functions. So, wait a minute, these "macros" are functions. So, wait a minute again, given control of argument evaluation, many of these functions didn't need to be written in terms of code transformation, although they could be. And a whole wall of distinctions and problems and exceptions and caveats and limitations just washed away like tears in the rain. A thing that had grown to towering complexity and endless hairy problems suddenly became a vastly superior thing that was much, much simpler.

Realizing, after a bit, that I had reinvented "Fexprs", of course I went to read the papers about them and see why everybody thought they were bad. And I found that the problem had been about what environment to evaluate arguments in, in a lisp with dynamic environments which exacerbated that problem. But we have now a quarter-century or more of excellent work with scoping strategies, environments, macro hygiene, and lazy evaluation, all applicable to this kind of problem, which we didn't have then.

That made me think very carefully about applying that work. Specifically, about when and how an argument expression can be separated from the environment in which it was formed or evaluated in a "wrong" environment, and how to prevent those things from happening accidentally.

Under ordinary circumstances keeping track of the "call site environment" is enough for this. But with fexprs you can carelessly create extraordinary circumstances involving multiple call sites.

For example, a function can pass an argument unevaluated to another function, which then evaluates it, along with an expression formed at the call site within the first function. Now how do you determine, from the information in the invocation frame of the second function, the correct environment for each of these subexpressions?

So that was when I discovered the need for the final feature of my invocation frames - in addition to keeping a reference to the calling environment, each argument has to be an aggregate type of both expression and environment.
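That aggregate is essentially the expression-plus-environment pair familiar from implementations of lazy languages; a minimal sketch, with hypothetical helper names:

;; Each operand is packaged with the environment it was written in ...
(define (make-operand expr env) (cons expr env))

;; ... so whoever eventually evaluates it cannot pick the wrong one.
(define (force-operand operand)
  (eval (car operand) (cdr operand)))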

Anyway, that's the short history of how my thinking about it has evolved. Once I realized I needed argument evaluation to be Explicit (under the control of the called function) rather than Implicit (done independently of that function or actively suppressed by the function's "type"), the rest flowed in a straightforward course of design considerations.

A question for John and Tom: Given that neither of you are talking about constructing expression/environment aggregates for individual arguments, how do you propose to keep argument expressions associated with the environments in which those expressions were formed?

Ray

Hygiene

in addition to keeping a reference to the calling environment, each argument has to be an aggregate type of both expression and environment.

I wonder if we need yet another ingredient in order to support hygiene.

In hygienic macro systems, you have this notion that if a macro introduces an identifier, it gets transparently renamed (or marked, or "painted"), so that it can't clash with other identifiers.

(defmacro-hygienic foo ()
  `(define quux 12))

In this contrived example, quux would be inaccessible outside foo, so it would be completely useless.

OTOH, if a macro uses an identifier it has received as an argument/operand, that identifier is not renamed; it refers to the original identifier from the macro's call site.

(defmacro-hygienic defun (name sig . body)
  `(define ,name (lambda ,sig ,@body)))

Here, name is accessible in the environment of the call-site of the macro.

My feeling is that we need similar renaming to enable hygienic fexprs.

But maybe, associating operands with environments, as you've described, is enough to make this work.

I swiped the aggregation technique

I swiped the technique of aggregating subexpressions with references to the environments where they were formed from "standard implementation strategy" for languages with lazy semantics.

It seemed like a well-explored method that was sufficient to serve the need. There may well be a way to do it otherwise, but if so that insight hasn't drifted yet into my poor little head.

Ray

Ray's operand/env aggregates and hygiene

A question for John and Tom: Given that neither of you are talking about constructing expression/environment aggregates for individual arguments, how do you propose to keep argument expressions associated with the environments in which those expressions were formed?

With the caveat that we have fairly different notions of environment mutation, I intend to do it similarly to what you describe, but only by convention and only in libraries rather than in the core language.

From my perspective, what you are talking about is a dynamic generalization of syntactic closures. And if you build a subset of those dynamic syntactic closures - namely, ordinary static syntactic closures - in a library atop my core, they can be statically eliminated by fairly simple rules that are deeply isomorphic to the customary rules for statically expanding, say, syntax-case macros.

That said, dynamic syntactic closures don't go in my "core" because they can be efficiently provided in a library and because there are plenty of times I just don't want them. They are overkill for simple FEXPRs like AND. They just get in the way for exotic FEXPRs that don't necessarily care about the caller's or the callee's lexical environment much (and will instead refer to some "third party" environment to resolve identifiers).

Lazy Evaluation

This all sounds like the rediscovery of lazy evaluation (as in, for example, Haskell). Yep, it is beneficial for the expressiveness of a language not to evaluate the operands before passing them to the operator. And having lexically scoped lazy evaluation mostly eliminates the need for macros (hence Haskell does not have macros ...)

lazy is only part of it

Nobody is claiming that FEXPRs are an original invention. They go *way* back. Even the "Schemish fexprs" that implicitly pass around first class environments go way back.

Yes, the option for lazy evaluation with sensitivity to lexical scope is one of their interesting benefits.

No, it's not *quite* Haskell, for with FEXPRs you have something lacking in Haskell: a deep identification of data structures (e.g., structures formed from constants and from CONS) with code (that which can be evaluated) that is nicely (simply, directly, usefully) related to the rest of the language. For high-falutin' theory stuff, have a look at appendix C of The Revised^-1 Report on the Kernel Programming Language. "Code == Data" is pretty deep and hairy in some ways and on the other hand intuitively simple and tractable in others. It's deeply part of what makes a lisp dialect a lisp dialect. Round about the time of the Revised Revised Report on Scheme the demi-gods of Scheme standardization shoved aside "Code == Data" -- initially appearing to merely postpone the topic but in the past decade seemingly intent on killing it entirely. And this in spite of mounting evidence that the rationales they offer for doing so simply don't add up. But enough about "them"....

FEXPRs are not quite simply "lazy evaluation" in the sense of "normal order reduction of lambda calculus terms" because they add concepts like data structures (such as cons pairs) and first class environments. FEXPRs of the sort we're discussing here admit useful yet naive implementations comparable to the meta-circular interpreter found in "The Structure and Interpretation of Computer Programs". They provide a conservative generalization of the theories of phased hygienic macros that lie behind R5RS and R6RS. They are a no brainer for a modern lisp, if you ask me, except for within certain influential social circles who are trapped in dysfunctional patterns of discourse - and whom I can only address from outside, and probably incomprehensibly to many of them.
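To put the code-equals-data point in a couple of lines: the very same cons structure can be built as ordinary data and then handed to EVAL.

(define form (cons '+ (cons 1 (cons 2 '()))))   ; plain data, built with CONS
(eval form (the-environment))                   ; => 3: the same object, run as code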

functional discourse

They are a no brainer for a modern lisp, if you ask me, except for within certain influential social circles who are trapped in dysfunctional patterns of discourse

Speaking of dysfunctional discourse, these sort of comments get tedious fast, and are difficult to distinguish from the utterings of a crank.

You'd do better to focus on technical explication of the viability of the approach you're advocating, beyond assertions in comments on LtU.

Remember the "fun" in "dysfunctional!"

I've tried pretty hard to stick to technical discussion here, but I don't think that being dumped from the working committee without notice or discussion was fair play. Is it unprofessional to vent about it here? Maybe. But is it unreasonable to ask Tom not to be angry? Definitely. I know you're not speaking for the committee, but one simply cannot treat someone that way and then ask them to shut up about it.

Anyway, Tom has no reason any more to cooperate with the committee, nor reason to care whether their feelings are hurt. In fact they've rather emphatically rejected his cooperation and deliberately disrespected him, as far as I can tell. If he now does not cooperate with their desire for silence, they have themselves to blame.

Mr. Lord's claims of unjust treatment are just that -- claims

Mr. Lord was removed from WG1 by the Scheme Steering Committee, which appointed him in the first place under conditions, just like every other member. I regret (though I am not responsible for it) that the SC has not made its actions public. I do know from personal knowledge that his claim of not being warned is simply false.

That's all I have to say here.

off list

I shall correct Mr. Cowan's impression off-blog.

LtU and discourse

Your point is taken and accepted in this context, Anton, that I should lay off for now and I shall. After this:

I think that LtU is a great place for front-page material that is mostly just literature review on a technical level - for a broad definition of literature. It adds up to a great reference resource, that way, and also a great way to keep a finger on the pulse of what's going on in PLT and new language engineering.

I think that LtU is also a great place to note some history of programming languages, both retrospectively and in real time. We all have our pet theories and notions about how programming language design goes in ideal practice, and here is a place where we can record those theories and notions and compare and contrast them with the historic facts of programming language development. Perhaps, or perhaps not, in 20 years someone will ponder "Why didn't R7RS include FEXPRs? They were fairly well understood by that time. Why did they omit them? It would have been an obvious step!" Well, here is a place for a partial chronicle of events.

Final "crank" statement: Let the record show that *this* former member of WG1 was informed by the WG1 chair that by *discussing* the *possibility* of FEXPrs for WG1 Scheme, on a discussion mailing list, in threads with other people entering into the discussion -- that mere act of talking about it on a discussion-oriented mailing list with apparently willing interlocutors -- had *forced* the resignation of another member of WG1 who was (I am informed) sufficiently offended by the mere contemplation of the topic. That other member's resignation, as I (imperfectly, no doubt) understand these recent events, was one of several strikes against me that led to my expulsion. (The other strikes, as I understand them, are about as equally ridiculous as is the manner in which the Steering Committee treated the entire matter from start to finish.)

I find that remarkable. I note it here, for the record which LtU is. What's that old saying about theory in practice and practice in theory? People can sort it out later. Yes, I have been frothing at the mouth. Yes, I'm sure I'm superficially hard to distinguish from a crank in these matters. Your point is taken. I hope you take mine.

Thanks

I appreciate your graceful acceptance of my point.

I think that LtU is also a great place to note some history of programming languages, both retrospectively and in real time

The structure which we've tried to encourage (although less actively lately) is that most substantial things worth saying should be posted elsewhere, on a blog, in a paper, etc. LtU has historically worked best when it links to such material.

One reason is that anything worth discussing needs some common foundation amongst the participants and audience. Few LtU readers are likely very aware of the workings of Scheme WG1 (I'm certainly not), and an external post about whatever the issue happens to be could take the space to lay out the author's perspective for those who aren't intimately familiar with the topic.

All true and important

All true and important points, and indeed linking to more background will surely be helpful to most readers. Still, I feel I should say that I find that sometimes asides such as the one Anton responded to can illuminate the discussion, and indeed infrequent angry posts are not the end of the world. Civilized discussion, which is our goal, can become heated from time to time, so long as the discussion remains grounded in facts and arguments.

I, for one, am learning quite a bit I didn't know before from the current discussion.

Seconded

I thought I knew quite a lot, actually, about Lisp and its history, both technical and social. Tom Lord and Ray Dillinger, in particular, are helping me to understand how far off the mark my sense of understanding has been. I owe them a debt of gratitude for that. Having my own strongly-held predilections, and having them serve as a focal point for opposition, I have to say that I can also, to a limited extent at least, sympathize with Tom's frustration. It's difficult to feel that you've been excluded from a community for unsound reasons.

♫♫♪♫♪ Feelings... whoa oh oh feelings...♫♫♪♫

It's difficult to feel that you've been excluded from a community for unsound reasons.

Well, "feelings" are one thing, but from the little I've seen in this thread, I suspect the reasons relating to excluding Tom from WG1 specifically may in fact have been quite sound. Noel preempted all this at the top, when he wrote:

There is another point: the Scheme standard is not really the place for experimentation. If you want to see something in the standard get it into an implementation first, generate some significant experience with the feature, and then perhaps the committee will listen to you.

I retract my earlier comment to Tom, apparently we're discussing this now.

Touchy Subject

I didn't mean to imply that I thought the reasons, whatever they may have been, were unsound. I don't know nearly enough about the charter, the interactions, or any of the private communication related to the issue to have an opinion one way or the other. I was only responding to Tom's clear frustration here, which came attached to, but distinct in my mind from, the extremely helpful exposition of his design arguments with respect to his Lisp dialect, whatever it may be called, and whether it qualifies as a "Scheme" or not.

re: Feelings / open letter to WG1

I retract my earlier comment to Tom, apparently we're discussing this now.

I appreciate both your original encouragement to back off the topic and the follow-ups from people saying they got something out of it and it wasn't wildly inappropriate.

I don't think that there's a lot of discussion left to be had in the topic but I'll mention these things. I hope this doesn't come off as condescending:

1) WG1, if they take my unsolicited advice, should say to itself "Well, that whole mess sucked," and then proceed to have fun and do good, thoughtful work. They presumably don't need me to tell them that but I figure it can't hurt.

2) WG1 should not forget that the charter calls for a language that admits simple, tiny, naive yet useful implementations and that also admits sophisticated implementations which should serve as a platform on which a larger language (say, WG2?) can be implemented in a portable fashion. I happen to think that optional FEXPRs in a very tiny core are a good strategy for that but I would also agree that it isn't the *only* strategy. It was a mistake, in my view, for WG1 to officially suppress discussion of the proposition but what's done is done.

3) As a kibbitz from the outside perspective, I think that both WGs are at risk of suffering from over-management and under-participation. The chairs are energetic; much of the membership - meh. One of the powers of the chairs is to appoint officers. Both could benefit, perhaps, from the appointment of a secretary to record "minutes" and chronicle the proceedings, an editor to maintain work-product for the final report, and subcommittees where appropriate to consider various topics and report back.

4) WG1 should carefully consider its current trajectory of ratifying R5 plus additional requirements drawn from various SRFIs against the charter.

5) WG2 should expect from WG1 a language in which WG2 Scheme can be usefully implemented in a portable way, and WG1 should expect the same. Some discussion so far has suggested punting on that point.

6) WG1 should take REPL / dynamic programming environment semantics much more seriously than it has so far (although, to be sure, it is only around 2 months into the process).

7) Scheme is dead. Long live Scheme. About the worst damage WG1 and WG2 can do is to fail to increase the practical relevance of the Scheme Report(s) in the next generation. Such damage can scarcely kill off the pursuit of the basic ideas at the core of Scheme. The "upside" potential of the WG's is to punt the ball far down the field in terms of Scheme's ... uh ... "acceptance", "relevance", what have you. The "downside" potential is to waste 1 - 1.5 years with little or no lasting impact (but hopefully at least have fun trying). If the WGs display some guile in pursuit of the gambit they've embarked upon they can PLoT a great future even if they are a bit chicken about fexprs. If they fail, nobody will accuse them of larceny for robbing Standard Scheme of a promising future since, as it stands, chez lambda is in rough shape anyway. I don't mean to be gauche but they should be cautious about letting some kind of Stalin start taking over the process but I don't suppose that any of the members are really SCuM at the end of the day and if the WGs don't get stuck in any Rabbit holes they can grow from chibi into a mighty Oak(lisp).

-T

LINQ

I don't know much about the tangled history of fexprs in Lisp, but much more than lazy evaluation, I keep being reminded of the (type-driven) way that C# reifies expressions as part of LINQ queries (and, I think, perhaps anywhere that an expression has a type like Expression<T>?). Unfortunately, I don't know much about LINQ either beyond the little I've read in papers, so I'm hoping someone else can shed some light.

Anyway, I don't know exactly how they handle free variables, whether they pass something like a first-class environment, or attach the bindings of each free variable in the current environment along with the variable's name, or what. But if fexprs are universally known to be such a bad idea, I'm curious what the difference is here. I guess the obvious distinction is that in C#, the difference between normal and "fexpr" evaluation is statically apparent. Perhaps that's 90% of the controversy around fexprs?

Only closures are compiled

Only lambdas are compiled to LINQ expressions. If an environment value is captured by such a LINQ expression, it's embedded as a ConstantExpression with the value at the time of capture, as if it were a closure (consistent with C#'s eager semantics).

If you want to defer capture, you'd have to lift it to a parameter of the lambda, or wrap it in a custom lazy type, or perhaps an expression node that your expression visitor understands and processes specially.

Thanks

That helps somewhat. I'm still a bit confused, though. LINQ allows you to introduce new bindings as well as capture them, e.g., column names, correct? And no explicit lambdas are needed, right? I have in mind stuff like "select foo as bar from blah", where at minimum "foo" and "bar" are not hygienic. How is this done?

Sorry, I suppose I could just go read the docs...

var q = from b in blah


var q = from b in blah select new { bar = b.foo };
// equivalent:
var q = blah.Select(b => new { bar = b.foo });
foreach (var item in q) Console.WriteLine(item.bar);

The type of blah (which is basically a monad) must have a valid match for blah.Select<TSource,TResult>(), which can be found in a number of ways; b is expected to be of type TSource. foo is statically determined to be a member of TSource. The type of blah, and the type argument TSource, and the member foo, may all have attributes (metadata) associated with their reflection objects, and the corresponding LINQ engine uses this metadata to map from the type system constructs to the database. Normally this metadata is in the form of attributes in the source code, where the source code is generated by a tool that looks at the database schema.

Perfect, thanks!

I just wasn't sure where they were hiding the lambdas...

Maybe this will help: C# 3.0

Maybe this will help: C# 3.0 query expression translation cheat sheet. It's a fairly concise summary of all the transforms LINQ handles with its query syntax.

The final step is understanding how some LINQ expressions are compiled to delegates, i.e. Func&lt;T, R&gt;, and some to quoted expressions, i.e. Expression&lt;Func&lt;T, R&gt;&gt;. LINQ collections are an example of the former; LINQ-to-SQL is an example of the latter.

Oh, that's it.

I see in the link you provided to John's spec for the Kernel language that he's doing the same as me and providing the environment with each argument to an "operator".

Which I feel vindicates a design choice, somewhat; meaning, I suppose, that even now I hadn't fully trusted my decision to make aggregates, and find it reassuring that a smart academic guy came to the same conclusion. To me it means that if there was an alternative I missed when I made that choice, it wasn't something blindingly obvious.

Clarification, and a new quasiquote

I didn't mean to criticize associating operands with environments. To the contrary, that seems like a very good thing to do.

My question was whether FCEs could obviate extra machinery for hygiene, and now I'm more certain than before that they can.

I think that quasiquote's (and maybe also quote's) definition needs to be extended, so that the objects it creates are also such form/environment aggregates, where the environment is the environment in which quasiquote is called.

Then, as in a hygienic macro system, the contrived

(defmacro-hygienic foo ()
  `(define quux 12))

creates an inaccessible variable, because quux is an aggregate with foo's environment.

</handwave>

Ah, I see what you mean.

I had not considered this very closely yet because I have not implemented quasiquote and unquote in my language. I had gotten as far as quote, and was still debating whether those should make it into the "first cut" of the language. I had noticed they needed to be more complicated somehow, but thought that maybe quasiquote should have a two-argument form with an environment as second argument. I was "waiting for insight" as to how to implement these features but hadn't gotten further than that.

Offhand, I believe that you are right, and that in a world with fexprs each unquote would need to have its own environment. That probably makes it simpler to use a "list of thunks" idiom than the more familiar "quote/unquote" idiom.
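
To make the idiom concrete, here is a hedged sketch (my reading of it, not the poster's code) of how a "list of thunks" can stand in for quasiquote/unquote: each "unquoted" hole becomes a closure, and so automatically carries the environment it should be evaluated in.

;; Build the template (f ,a ,b) as literal structure plus thunks.
(define (template a-thunk b-thunk)
  (list 'f a-thunk b-thunk))

;; Construction site: each thunk closes over the bindings in force here.
(define t
  (let ((x 1) (y 2))
    (template (lambda () x) (lambda () y))))

;; Consumption site: force each hole only where its value is finally needed.
(define (fill tmpl)
  (list (car tmpl) ((cadr tmpl)) ((caddr tmpl))))

(fill t)   ; => (f 1 2)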

Thank you for the insight.

SRFI-72-style quasiquote

[Dear LtU'ers, please excuse posting such "in-progress" thoughts. I post this here because this thread has strayed off-topic already, and one more comment can't hurt that much. And I think that I'm onto something here ;)]

I think that my above comment regarding a new quasiquote to support hygiene doesn't go far enough.

Summary: I think that we need to separate environments (mappings from symbols to locations) from "namespaces" (variable references "painted"/marked with the same "color"/set of marks). Every invocation of quasiquote creates a new namespace, disjoint from every other namespace. Symbols can be injected into a namespace with DATUM->SYNTAX to deliberately break hygiene.

Longer explanation:

In SRFI-72, André van Tonder presents an interesting hygiene rule:

A binding for an identifier can only capture a reference to another if both were present in the source or introduced during a single evaluation of a syntax or quasisyntax form, with the understanding that the evaluation of any nested, unquoted syntax or quasisyntax forms counts as part of the evaluation of an enclosing quasisyntax.

This seems to suggest that even if multiple quasiquotes (I'm using this interchangeably with quasisyntaxes) appear in the body of the same fexpr, the "namespaces" they introduce are different:

(fexpr (param1 ... env)
  (eval `(define x 1) env)
  (eval `x env)) ;; unbound-variable-error

Fix:

(fexpr (param1 ... env)
  (let ((var `x))
     (eval `(define ,var 1) env)
     (eval var env))) ;; ==> 1

So, associating the results of quasiquotes with the environments in which they were created is not enough for (SRFI-72-style) hygiene.

Rather, each quasiquote invocation should generate a new namespace (or "template identifier" in SRFI-72 parlance), that can then also be used with DATUM->SYNTAX to inject a symbol (or other form) into the namespace of the quotation.

For example, this macro defines the variable foo in the call-site's environment and namespace, by breaking hygiene with DATUM->SYNTAX:

(define foo-definer (fexpr (param1 ... env)
  (eval `(define ,(datum->syntax param1 'foo) 12) env)))
(foo-definer)
foo ;; ==> 12

(Edit: This is slightly handwavey, as the fexpr has no parameters, so param1 doesn't exist. Let's just say that param1 is a handle to the call-site's namespace. There could be a mechanism for the fexpr to receive the whole form (analogous to Common Lisp's &whole), and that would then be used.)

For comparison: this contrived, useless macro defines the variable foo in the call-site's environment, but in a fresh namespace, so the variable is inaccessible, because of hygiene:

(define broken-definer (fexpr (param1 ... env)
  (eval `(define foo 12) env)))
(broken-definer)
foo ;; unbound-variable-error

Thoughts?

Quotation considered harmful

The various ideas in this thread about how to make fexprs play well with quotation are fascinating and potentially exciting. (I am skeptical, in an open-minded sort of way, about ideas that involve introducing multiple kinds of environments, as I tend to suspect weaknesses and restrictions when I see additional features appearing necessary.) However, speculation on guarded environments notwithstanding, for my own research I've taken a different direction. This might sound trivial at first, but isn't: I'm pursuing the hypothesis that quotation is another language feature that, like dynamic scope, should not be put into the same language design with fexprs.

This hypothesis occurred to me when, after several years of work on the R-1RK — and still some time before I put it on the web as a techreport — I noticed that quotation was the common element in all my simple illustrations of hygiene violations with fexprs. The Kernel design philosophy says that compromising the language style will be ultimately self-defeating, therefore startling consequences of the style should be embraced wholeheartedly; so I ruthlessly suppressed my "well, that can't be right" reaction, removed the quotation and quasiquotation syntactic sugar and associated standard operatives from the language, and settled in to see whether, in the long haul, it would really be possible to make Lisp-without-quotation work. My impression, so far, is that fexprs can be used to make it work.

What happens here, I think, is that when an unevaluated operand is passed into a fexpr — recalling long-ago terminology, one might call this a "downward quotarg" — it's reasonably unlikely to cause accidental bad hygiene, because the fexpr knows what environment to associate with it (even though Kernel doesn't entangle operands with environments); but if the fexpr passes any of its unevaluated operands back out, as an "upward quotarg", then the association between operand and environment is lost, and accidents are much more likely. The $quote operative passes its operand outward/upward; but eliminating $quote is feasible, because much of what it's used for is simply preventing some operand from being evaluated on the way in to some programmer-defined combiner, which can be accomplished by using a fexpr.

A simple example that came up in my dissertation: When writing a Scheme meta-circular evaluator in Kernel, the mc-combine applicative takes an unevaluated operator and operand-list, and first compares the operator to several reserved operators. Using Scheme-ish style, it might look something like this (assuming syntactic sugar for quote):

($define! mc-combine
   ($lambda (operator operands env)
      ($cond ((equal? operator 'define)  (mc-define operands env))
             ((equal? operator 'if)      (mc-if     operands env))
             ((equal? operator 'set!)    (mc-set!   operands env))
             (#t  (mc-apply (mc-eval     operator env)
                            (mc-map-eval operands env))))))

Here's how I envisioned it using native Kernel style:

($define! mc-combine
   ($lambda (operator operands env)
      ($cond ((define-operator? operator)  (mc-define operands env))
             ((if-operator?     operator)  (mc-if     operands env))
             ((set!-operator?   operator)  (mc-set!   operands env))
             (#t  (mc-apply (mc-eval     operator env)
                            (mc-map-eval operands env))))))

($define! $make-tag-predicate
   ($vau (tag) #ignore
      ($lambda (x) (equal? x tag))))

($define! define-operator? ($make-tag-predicate define))
($define! if-operator?     ($make-tag-predicate if))
($define! set!-operator?   ($make-tag-predicate set!))

Operative $make-tag-predicate doesn't pass its operand outward/upward, but instead uses it on-site, keeping a tight rein on it.

(Fixed: some missing parentheses.)

Quote in a fexpr language means something else.

What happens here, I think, is that when an unevaluated operand is passed into a fexpr — recalling long-ago terminology, one might call this a "downward quotarg" — it's reasonably unlikely to cause accidental bad hygiene, because the fexpr knows what environment to associate with it (even though Kernel doesn't entangle operands with environments); but if the fexpr passes any of its unevaluated operands back out, as an "upward quotarg", then the association between operand and environment is lost, and accidents are much more likely.

Actually this problem (or a version of it) is the reason I decided that I needed to form expression-environment aggregates (shall I call them 'quasipromises'?) when passing arguments. Quasipromises, when evaluated, use their own environments. So a function can take an argument without evaluating it (the unevaluated object is a quasipromise), then pass it, still unevaluated, to another function. If it gets evaluated in that other function, it's still got a reference to the proper environment for its evaluation. And if it gets returned via tail call to a continuation closer to the root of the environment tree than its own, it still contains a reference to its own environment for evaluation.

Remember how, in the original Rabbit compiler, after CPS transformation, Sussman & Steele found that there was no real difference between function calls and function returns? Both were just "goto with arguments" in a language where data (in this case environments) has unlimited extent.

To me this indicates that for consistency's sake we ought to at least consider returning values exactly the same way we supply arguments. Since the argument of quote is a quasipromise which quote does not evaluate, this argument goes, quote itself ought to return that quasipromise. Or in the language you use in your report, since $quote must be supplied with both an operand and the environment of that operand, it must similarly return both.

Raw list structure - the traditional loading of quote - would seem to be an inappropriate meaning for it in a lisp with fexpr semantics.

Anyway, that was my analysis of the problem. Accordingly I've tied the quote syntax and the quote keyword to semantics that explicitly create a quasipromise rather than to semantics that create list structure. If you just need raw list structure, you have to 'break' the quasipromise and use only the list structure.
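
For readers following along, here is a hedged sketch (my own reconstruction, not the poster's implementation, and assuming a Scheme whose eval accepts a first-class environment) of what a quasipromise might look like as plain data:

;; A quasipromise pairs an expression with the environment it should be
;; evaluated in; unlike a promise, there is no evaluate-at-most-once rule.
(define (make-quasipromise expr env) (list 'quasipromise expr env))
(define (quasipromise? x) (and (pair? x) (eq? (car x) 'quasipromise)))

;; Evaluate in the environment captured at creation time; this may be
;; done any number of times.
(define (qp-eval qp) (eval (cadr qp) (caddr qp)))

;; "Break" the quasipromise: discard the environment and keep only the
;; raw list structure, for callers who want the form itself.
(define (qp-break qp) (cadr qp))

Under this reading, quote would simply build (make-quasipromise &lt;operand&gt; &lt;environment at the call site&gt;) instead of returning bare list structure.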

I'm still seeking an appropriate answer for quasiquote though, and may eventually discover that it is not really amenable to one.

(As an aside, talking with actual humans about this is helping me to discover terminology. My internal documentation so far has called these entities 'argument expressions' but that's so vague as to be useless in a general discussion. 'quasipromise' as a neologism allows a precise definition, which is better, and suggests 'break' as a natural terminology for what I've been calling 'decompose.')

Quote and the applicative/operative separation

There may be something profound, here, in the reason why the aggregation that ...I think... you're describing would not be natural in my approach to fexprs. Perhaps you can tell me (when I finish my long-winded explanation of this) whether I've misunderstood how your approach works in this regard.

You've described your approach as having "only one kind of function". That is not true of my approach. I have two mutually disjoint types of combiners: applicatives and operatives, where operatives (constructed via $vau) are the flashier type, but applicatives are key to the well-behavedness of actual programs, both in practice and in theory. An applicative is a wrapper around an underlying combiner, essentially a one-tuple; after the interpreter evaluates the operator of a combination, resulting in a combiner, if the combiner is applicative the interpreter unwraps it and evaluates the operands, and recurses on the resulting combination. Operatives are the base case, and all combination evaluations will (ordinarily) get to that case, but applicatives are the inductive step on the way to that base case. The meta-circular evaluator routine for this is

($define! mc-combine
   ($lambda (combiner operands env)
      ($if (mc-operative? combiner)
           (mc-operate combiner operands env)
           (mc-combine (mc-unwrap combiner)
                       (mc-map-eval operands env)
                       env))))

There's a programming flexibility advantage to this, because the separate, unencapsulated applicative wrapper can be used for facile combiner manipulations by a client without violating the encapsulation of the underlying operative. But closely related to this are well-behavedness advantages. In theory, a compound operative call is modeled in vau-calculus by the beta-rule, which involves substitution, so that any evaluations that are actually built in to the operative would be hard to disentangle from it; therefore, having the applicative wrapper be explicitly separate makes it much easier to reason formally about applicatives. In practice, the vast majority of all applicatives have underlying operatives that couldn't care less about their dynamic environments: almost all operands to operatives are really data, not source code. The applicative wrapper therefore makes it possible for the underlying operative to ignore its dynamic environment, which both vastly reduces the distribution of first-class environments (reducing the likelihood of accidents) and avoids an uncontrolled stack growth that could potentially mess up proper tail recursion. Here, for perspective on this, is the library derivation for $lambda:

($define! $lambda
   ($vau (formals . body) env
      (wrap (eval (list* $vau formals #ignore body)
                  env))))

The point here is that when $lambda is called, it constructs an applicative whose underlying operative doesn't capture its dynamic environment; and since most combiners will be constructed via $lambda, this means that most operatives won't capture their environments. So if I were to introduce some sort of operand–environment aggregate, I'd want to use it on some operatives but not on most of them — which wouldn't be much in the spirit of uniformity. So by envisioning an orthogonality between applicatives and operatives, I'm naturally guided away from operand/environment aggregates.

Calls, returns, and function types.

Perhaps you can tell me (when I finish my long-winded explanation of this) whether I've misunderstood how your approach works in this regard.

I don't think you have misunderstood, no. My objective with this project is to make a translation-target dialect. Specifically, for as many programming languages as possible, and especially as many lisps as possible, I want to be able to construct the abstract syntax tree for any program written in that language, pull it into the system, and do useful things with it. The two main useful things would be to, under a set of definitions varying by source language, either treat it as a program having the same semantics as the original, or enable a straightforward automated translation of the program into a program in the new dialect having the same semantics as the original and also substantially the same or isomorphic AST structure as the original.

This relatively sane goal, I complicate to the point of insanity by desiring also that people working in the new dialect should be able to maintain and modify these codebases which now exist under distinctly different semantic rules, including introducing direct calls from one codebase to another.

That objective means that generality is the overriding concern. Efficiency is not quite a nonissue, but its influence is limited to implementation strategy; it cannot be allowed to drive design decisions. In the same way semantic cleanliness cannot be too closely enforced, because I may need to implement semantically unclean models in a straightforward way. What I do, or try to do, is to make sure that semantically muddled constructions are clearly marked by forcing them to be expressed with constructs whose semantic difficulties are known, clearly documented, and elicit the proper warnings from the system when used.

Rather than have classes of function that can and can't do certain things, it's cleaner for my purpose to have a single function type that can do anything, even things that are bad ideas, because I may need to do those things in order to build a straightforward representation of the semantics of some language in which code I want to preserve originates. For example, modeling languages and some symbolic-algebra languages have semantics almost exclusively modeled by composition of macros, Algol had its peculiar call-by-name semantics, and so on.

The pursuit of optimization, I'm approaching from a post facto perspective; in working with programs from many different languages, and hoping to map function calls onto function calls, there can be no builtin proof of function properties from a type system about what functions can and can't do, because functions can do anything. There are only proofs about what particular, individual functions do and don't do.

Accordingly, I needed to construct a function call (particularly, in apply) that didn't need to know, or care, about the "type" (operative or applicative, to use your terms) of the function being called. So there really is only one kind of function call semantics. Evaluating each argument once, in order, by the standard evaluation rules, is a difference in function behavior that the programmer has to know about, not a difference in function type that apply has to know about. I've done this without "wrappers," so each function encapsulates all of its evaluation behavior as well as its other behavior.

An invocation frame or environment as I've implemented it contains all the arguments, plus explicit references to both the dynamic environment and the lexical environment of the function call. Each argument is represented as an aggregate or 'quasipromise', containing a reference to the evaluation environment for that subexpression. This is usually but not always the same as the dynamic environment in the call.

If the function exhibits "normal" behavior, it evaluates all its arguments, immediately, once each, in order, using the standard evaluation function, and replaces the quasipromises with simple values before user-written code begins.

If any parameters are specified as "lazy" in the lambda list, the corresponding quasipromises are treated as promises. That is, they are not evaluated until (or unless) the values are needed or the parameters are explicitly "forced." Whenever that happens, the function code immediately replaces the quasipromises with the returned values.

If any parameters are specified as "app" in the lambda list, the corresponding quasipromises are evaluated whenever the corresponding values are needed, which may be many times during the execution of the function. This is typically how looping constructs take their loop-body subexpressions. This is intended to model the "applicative semantics" of pure lambda calculus, which lisps normally elide by imposing a single-evaluation rule on function subexpressions. Yes, I'm aware that you and I are using the same word in drastically different ways. Maybe I should call these 'loop' parameters after the usual use case.

Parameters taken as 'app' parameters can be explicitly forced (which means you can't ever evaluate them again) or operated on by a destructuring primitive allowing you to get at raw list structure or the bare environment reference.

Within a function, the return continuation is bound to the name return and may be called with any number of arguments. But these arguments, like the arguments into the function, are passed as quasipromises containing references to the function environment, and unpacked at the call site. Like the arguments into the function, they are normally evaluated immediately by the caller, but can be taken in the call context as 'lazy' or 'app' returns. The lambda list, with its possible 'lazy' or 'app' subexpression keywords, is just inserted after the calling arguments and a vertical bar in the call expression.

So the lambda form for a particularly silly function might look like this:

(define silly (lambda (arg1 arg2 (lazy arg3) (app arg4))
  (return (f1 arg1 arg3) (f2 arg4) (f3 arg1 arg2))))

And at the call site you might have a single-return context defined as

(silly e1 e2 e3 e4 | (lazy %1) )

specifying that the first argument returned, which is in this case the only argument taken by the caller, is treated as a promise rather than evaluated immediately. Of note, when this call completes, e3 still has not been evaluated. It was passed to a function which treated it as "lazy" and never required its value. Then a subexpression involving the variable bound to it was returned in a context which treated the return as "lazy," which still doesn't force its evaluation.

For some time this struck me as mildly insane, but that's where generality-as-overriding concern has led me, and I've gotten more used to it as time goes on. I think I will borrow a page from your book and have a naming convention for non-normal functions though; knowing where the abstraction barriers are, when a function call may or may not have them, seems like a good idea.

No well-defined thoughts yet.

Insight into the nature of the design problems is one thing. I think I've achieved that now with respect to quasiquote / unquote.

But finding the Right solution (consistent, general, powerful, simple to use) in terms of language design is another, and on that second score I find that I am still waiting for insight.

How's that?

I see in the link you provided to John's spec for the Kernel language that he's doing the same as me and providing the environment with each argument to an "operator".

I'm puzzled by this comment. I'm not aware of doing anything that I would describe that way. In Kernel, the operand tree that is passed to a compound operative is separate from the environment that is passed to it; however many local bindings are created for parts of the operand tree, those local bindings are distinct from the local binding that may be created for the dynamic environment of the call. It doesn't seem to me that there's any aggregation involved.

re: "How's that"

In a rough and ready sense, it's the same thing either way. You could pass the operands aggregated with the environment in a dis-entanglable way or you could separately pass the raw operands and the environment in a way that allows them to be aggregated. You two are talking about trivial duals (which Ray recognized there).

It's a difference that makes little nevermind although, as per usual, you have in Kernel an eloquent and nicely simple formulation whose provocations by way of implied suggestion are perhaps more productive. You got good style (where I don't mean "style" in a way that is purely superficial). (Hrm. That's too fawning. So: Get back to work on your thesis!)

(Incidentally, just for amusement, you should check out at least superficially the presence of procedure->syntax in SCM and, more deeply, I can't recommend highly enough (nor endorse as perfectly The Right Thing) SCM's implementation strategy. Mr. Jaffer and some of his co-conspirators have a particular and rare flavor of genius.)

It wasn't the aggregation I

It wasn't the aggregation I was talking about; it was providing an environment for each argument expression ("operand" in your report's parlance). I'm forming an aggregate type of expression and environment (like a promise but without the execute-at-most-once guarantee), but that's an implementation detail. The important thing in terms of laying the groundwork for a useful, nontrivial fexpr semantics, I believe, is knowing and having available in the invocation frame the "right" environment for each operand.

I observed that you came to the same conclusion re: your operators, and felt a certain amount of relief at being agreed with.

The Pitmanual

I observed that you came to the same conclusion re: your operators, and felt a certain amount of relief at being agreed with.

Ah! Yes. My 2002 NEPLS talk was my first public outing of my fexpr ideas, and I didn't really have any idea what kind of reception to expect. The overwhelmingly positive responses were profoundly reassuring that I wasn't just crazy.

Historical note. In MACLISP, most FEXPRs had just one "argument", which was the cdr of the calling combination; but even though MACLISP was dynamically scoped, one could sometimes get into trouble by evaluating an operand in the local environment, because some binding in the dynamic environment would be shadowed by the name of the one argument. To solve this problem, there was a two-argument form of FEXPR, in which the second argument was, more or less, the dynamic environment (though it wasn't first-class).

(I noticed two-argument fexprs a few years ago during yet another foray into my well-worn and cherished copy of the Pitmanual, Saturday Evening Edition. The Sunday Morning Edition is now on the web; see in particular here.)

Bogus

This bogus claim always comes up in discussions about powerful macro systems from people who haven't really used languages with powerful macro systems. Look at PLT Scheme to see the things that have been done that go far beyond lazy evaluation. OCaml's CamlP4 is another good example.

Also, Haskell *does* have macros. See Template Haskell.

Template Haskell is not

Template Haskell is not Haskell. So, Haskell does NOT have macros.

I don't doubt that macros can do much more than lazy evaluation. But I doubt that this "much more" is needed in a modern language, which usually already has constructs for the most common uses of macros.

That's, like, your opinion man

But I doubt that this "much more" is needed in a modern language, which usually already has constructs for the most common uses of macros.

The why of macros is well established.

The thread you are pointing

The thread you are pointing to just establishes that the "why of macros" is not quite so well established. It rather reinforces than changes "my opinion".

Why Macros

It has been my impression that, no matter what specific thing you use macros to do, you can later invent a neat language feature that handles the problem in a clean and nice manner. The problem is the "later". Macros can do it "now"... in an ugly, brutish manner, true, but the job gets done.

I'll note that the same thing can be said of frameworks as of macros. Both offer means to hack advanced features into a language that are relatively convenient compared to boiler-plate or copy-and-paste programming.

Thus, the existing use of macros, and of frameworks, offers ideal terrain to mine for new language features. Achieving first-class support can often improve performance, safety, security, or composability beyond what would be achieved with the macros or frameworks (it can be very difficult to usefully compose frameworks or combine macro-based DSLs into a single program expression).

But, even after fulfilling a class of needs for a macro or a framework, one isn't rid of the utility for macros and frameworks; rather, those uses are simply pushed ever further into the language frontier. One might, tongue-in-cheek, call this 'Greenspun's Incompleteness Theorem': without meta-programming, a language will never be feature-complete. As you push any language to its limits, you will always encounter problems that cannot be readily abstracted without either sacrificing some nice property (static safety, performance, security, etc.) or inventing another language for the job.

That said, macros aren't necessary. Rather, entrepreneurial developers get to make a variety of unpalatable choices upon reaching the limitations of their language: boiler-plate, macros/extensible-syntax, third-party code-generation utilities, compromise or sacrifice of system properties, or developing and implementing a new language.

Even among people who find macros smelly, you'll find many who think macros the least distasteful of these choices.

Why Fexprs

Tying this in with fexprs:

Macros are an unpleasant feature to compensate, somewhat, for limitations of a language. It's better to compensate for the limitations than to leave the limitations in place without compensating at all. Better still, though, would be to eliminate the offending limitations. (Remove the weaknesses and restrictions that make additional features appear necessary.) Make no mistake, fexprs don't necessarily increase the uniform flexibility of the language; but with care in the design they can do so, and thereby, the jobs otherwise done by macros are brought within the purview of the elegant language core. So instead of an ugly brutish fix with macros now and hope for a more elegant replacement later, one can produce a clean solution now with fexprs.

It's an interesting question whether it's possible to introduce fexprs in an elegant way into a non-Lisp language without causing that language to become a dialect of Lisp. Fexprs come from a place deep in the fabric of Lisp, so that when they are added to a language, it seems that they would bring Lisp-nature with them. They were, after all, the native Lisp way of handling operative extensions, from the very first years of Lisp implementation before macros were grafted onto the language (in reaction, one suspects, to problems that arise when fexprs are mixed with dynamic scope).

Grafted on...

They were, after all, the native Lisp way of handling operative extensions, from the very first years of Lisp implementation before macros were grafted onto the language (in reaction, one suspects, to problems that arise when fexprs are mixed with dynamic scope).

Well, grafted on is a bit too strong for my taste. The Scheme community (err...) in particular has done a wonderful job on second-class macros. Sure, they are separate from, and complicate, the main language, but this separation enables the use of powerful compilers, something that hasn't been shown for first-class macros yet.

P.S. As I understand Lisp history, macros were not grafted on, but rather extracted out of Lisp, in order to tame them.

macros and fexprs

Wonderful job is a bit too strong for my taste, but I certainly agree that they've been done impressively well considering their natural limitations, and probably about as well as can be done without some fundamental change of strategy (such as fexprs instead of macros; my reservations about wonderful job are perhaps because I see modern Lisp macros as the best that can be done with an inherently flawed strategy). My impression of the history is that the decision to buckle down and find a way to make the Lisp macro strategy work was one with the decision to abandon the fexpr strategy. Because of those decisions, Lisp macros have a multi-decade head start on fexprs, and there are indeed some important things that haven't been shown for fexprs yet — although fexprs should be able to borrow from other technologies to help them catch up.

Although I consider the inherent phase separation of Lisp macros to be a defect in their extension strategy, I try to say that directly; I did not intend the phrase grafted on as an indirect aspersion against the strategy. Macros are a feature that evolved outside of Lisp and was then introduced into Lisp, rather than being a native development of the Lisp paradigm, so that the way they were introduced into Lisp resembles the horticultural practice of grafting.

Local Syntax Extension vs. Fexprs

I acknowledge a bias: I view every language feature in the context of distributed systems programming and open systems composition. You may find fexprs 'elegant' in the context of the Kernel's target use-cases. Perhaps you even see them as more Lisp than Lisp. But I, honestly, cannot bring myself to care.

When I study fexprs, my attention is almost immediately upon the vulgar violations of composition properties such as implementation hiding and parametricity. You can peek inside operands that were developed in non-local modules, and thus introduce behaviors dependent on how concepts are expressed non-locally. For composition, this is bad... one must restrict behavior to a dependency only upon what is expressed. AFAICT, there is no place for fexprs if you aren't willing to compromise composition. If you do not need to examine non-local code expressions, then a one-two punch of local syntax extension and first-class procedures will do the job.
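
To make the concern concrete, here is a hedged illustration (mine, not the commenter's, with all names invented for the example), in the Kernel-style notation already used in this thread: an operative that inspects the text of its operand rather than treating it as an opaque value.

($define! $nosy
   ($vau (expr) env
      ($if (pair? expr)
           ;; peek at the caller's source text instead of just its value
           (list "caller's operator was" (car expr))
           (eval expr env))))

;; ($nosy (account-balance alice))
;;    => ("caller's operator was" account-balance)

The callee now depends on, and can leak, how the caller chose to express its computation, which is exactly the loss of parametricity and implementation hiding described above.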

Now, I refer to 'local syntax extension' because 'macro' has too many connotations in a thread discussing Scheme and Lisp. I'm personally interested in extensible attribute grammars, by which I refer to an attribute grammar wherein one of the attributes is the full grammar, and the full grammar includes the syntax to manipulate said attribute. Technically, such grammars may also be mutable in non-monotonic manners. To the extent you avoid side-effects while processing them, Scheme macros provide a form of local syntax extension. It doesn't matter when the syntax is evaluated; what matters (for composition) is that dependencies only exist on the code local to each application of the syntax extension.

The job performed by local syntax extensions is brutish. Fundamentally, they convert high-level domain semantics into inadequate implementation semantics. They do so in a lossy manner that will hinder many sorts of high-level symbolic or logical analysis. (If you want an elegant solution to that problem, then you must look more in the direction of Term Rewrite Systems than fexprs!) This doesn't mean the syntax extensions themselves can't be elegant or efficient. There is a certain beauty in well-executed brutality.

I don't have a good, widely accepted definition for 'elegant', but I suspect you share my view that compromise is antonymous to elegance. (You assert in a recent draft of your Kernel report: "pure style is one that can be reconciled with practical concerns without compromise, neither to the style nor to the practicalities.") When the 'practicalities' include concern for code distribution and secure open-systems composition, I observe that fexprs require compromises that are not required by local syntax extensions, and thus I judge them relatively less elegant. [Of course, even without fexprs, Scheme has many other features that also compromise distribution and secure open-systems composition.]

Extensible attribute grammars; and general remarks on fexprs

I'm personally interested in extensible attribute grammars, by which I refer to an attribute grammar wherein one of the attributes is the full grammar, and the full grammar includes the syntax to manipulate said attribute.

This sounds an awful lot like Christiansen Grammars, the subject of Henning Christiansen's dissertation, "Programming as Language Development" (1988). In the following years his research in that area morphed into logic meta-programming; and about when it was morphing, I developed an alternative called Recursive Adaptive Grammars (for my Master's Thesis) whose stated purpose was to integrate the adaptivity into the CFG core of the grammar, so that instead of a clear but weak CFG and a powerful but opaque computational mechanism, you'd have everything in one lucid and powerful system. RAGs could be taken very far, I think, but I haven't done so because my own interest in abstraction has led me in a different direction; I do know, though, that someone has been experimenting with them in recent times (I've had some contact with them). Depending on your particular interests, you might also find somewhat interesting a TR I put out two years ago, Well-behaved parsing of extensible-syntax languages.

When the 'practicalities' include concern for code distribution and secure open-systems composition, I observe that fexprs require compromises that are not required by local syntax extensions, and thus I judge them relatively less elegant.

I can only, for the moment, offer a general observation, to explain why I'm... not exactly skeptical, perhaps cautious... about this objection to fexprs.
In the past, I have confronted things that fexprs supposedly made impossible to do (or that FCEs did, FCEs being very, very closely related to fexprs, as this discussion has demonstrated). Usually this supposed impossibility was presented as a reason why fexprs (or FCEs) are impractical. What I've repeatedly found is that fexprs made impossible the conventional solution, but that an alternative solution could be achieved by exploiting unconventional properties of fexprs and FCEs. And typically the alternative solution brought new insight into fexprs/FCEs. How that sort of approach could possibly apply to the problems you mention, I have off hand no idea — which suggests to me, optimistically, that an alternative solution in that area might be accompanied by some fairly spectacular new insight.

BTW, just to keep the record straight, the passage you quote from the R-1RK (which has been in there at least since 2003) is, taken in context, not a definition of pure style, but a declaration of preference — including ten more words at the front,

crystalization of style can only be fully effective if the pure style is one that can be reconciled with practical concerns without compromise, neither to the style nor to the practicalities.

awful lot like Christiansen Grammars...

My work on extensible attribute grammars was influenced more than a little by Christiansen. Is there a reason you call it 'awful'? ;-)

Besides Christiansen (and his survey), I read some of your own work on RAGs, and a few papers by Cardelli. IIRC, it was your own online bibliography on the subject that got me started (so, thanks for that). I was really hoping to find a pre-packaged solution that would solve all my problems - a sort of BNF for extensible syntax.

fexprs made impossible the conventional solution, but that an alternative solution could be achieved by exploiting unconventional properties of fexprs and FCEs.

For secure composition, the issue is really the conventional properties of fexprs, which is to expose to operatives the code, data, and environment of the operands. If operands are exposed to the operative, then they cannot encapsulate authorities. If operands cannot encapsulate authorities, then authorities cannot be delegated, which raises barriers to secure composition. (Beyond that, the exposure also allows tight coupling to syntax-level implementation details of other components, which raises potential for issues when modules are changed... not that you can ever escape such issues entirely.)

Fexprs (and Scheme and Lisp as a whole, really) assume a trustworthy environment... i.e. that all contributors to the system are making best-effort to keep everything well-behaved.

But for composition in open systems, you must assume instead that, in any population of developers contributing code (extensions, applications, queries, decision trees, agents, etc.) to the system, you'll have a few interested in smashing stacks for fun and profit, and you'll also have a few benignly clueless who will, nonetheless, contribute code that acts in a malign manner.

"BTW, just to keep the record straight..."

I apologize for obliging you to set the record straight. I hadn't intended that the quote represent a definition, only that it express your opinion that compromise is somewhat antithetical to elegance. I was assuming you consider pure styles to be reasonably elegant.

fexprs vs. composition

If operands are exposed to the operative, then they cannot encapsulate authorities.

That's not true. Consider rewriting combinations this way:

  (op rand0 ...)
  ==>
  (eval (list 'op (list (list (lambda () rand0))) ...))

We are presuming there that something which is not normally readable - the procedure values returned by (lambda () rand0) - is nevertheless valid syntax to EVAL, but that's a pretty reasonable assumption.

First-class Procedures

You repeat a point I made earlier with regards to first-class procedures. If you wrap every argument into a first-class procedure then, presuming you cannot peek inside procedures, you can encapsulate authorities.

On the other hand, if you performed the transform you describe systematically, you would have no need for fexprs. Local syntax extensions, plus the above transform, would be sufficient. The features that are unique to fexprs would not be leveraged. This has been the argument I've been making.

It would be an error to assume, however, that you control the rewriting of combinations you described above. Operatives receive their arguments unevaluated. Thus it will generally be possible for the developer of 'op' to see the implementation details of 'rand0', prior to evaluating it into a proper lambda.

There may be ways to work around this and force evaluation of rand0 before passing it to op. However, any explicit work-around to achieve security is a violation of an important security principle: that maintaining security must be the path-of-least-resistance.

fexprs and trust

Fexprs (and Scheme and Lisp as a whole, really) assume a trustworthy environment... i.e. that all contributors to the system are making best-effort to keep everything well-behaved.

Kernel doesn't assume this — at least, not as much as, say, Scheme does — exactly because it has fexprs. In the presence of fexprs, programs are heavily dependent on proving stability of bindings, so the language has to be designed defensively so a module can guarantee stability of its bindings no matter how badly behaved its clients are. A key element of Kernel's strategy for this is that an environment can only be locally mutated if one can acquire that environment as a first-class object — and given an arbitrary environment, one can't determine its ancestors. Various other facets of Kernel are designed to promote encapsulation based on this bounding of environment mutation.
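
For concreteness, here is a hedged Kernel-style sketch of that bounding (the primitive names are as I recall them from the R-1RK; treat this as a sketch rather than the report's own example):

($define! module-env (get-current-environment))
($define! secret 1)                                  ; lives in module-env
($define! client-env (make-environment module-env))  ; child handed to a client

;; A client holding client-env can only add or shadow bindings in that
;; environment's own local frame:
($remote-eval ($define! secret 99) client-env)

($remote-eval secret client-env)   ; => 99, the client's local shadow
($remote-eval secret module-env)   ; => 1, the module's binding is untouched

And, crucially, no operation takes client-env and yields module-env, so the module's bindings stay stable however badly behaved the client is.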

Re: fexprs and trust

I must apologize. This is, I realize in hindsight, a terrible topic to reuse the word "environment" with a different meaning than that used in the OP!

Anyhow, by 'environment', I refer to the (more or less integrated) development environment.

For open distributed systems programming, you must assume this development environment includes clients and servers and clouds, all injecting code into one another. Code generally includes commands, queries, libraries, extensions, decision trees, form validation, agents, UI support, etc. Code distribution is very important for latency and network performance, disruption tolerance, load-balancing, and resilience (self-healing). Ideally, one wishes to avoid cheap security hacks - such as sandboxes and JavaScript's 'same origin policy' - because those hinder useful compositions. (Unlike capabilities, a sandbox will prevent you from accessing even legit authorities if they weren't white-listed by the host.)

The Scheme or Kernel environment will still involve code-distribution of untrusted code, in the form of libraries and extensions. The developers simply pretend the code is trustworthy, and perhaps vet it a little. Assuming otherwise is simply not very profitable in a language ill equipped to do much about it. And this is what I meant by 'assume a trustworthy environment'.

Protecting against mutation of bindings is useful for reasoning about security. However, for a security hole it is sufficient to execute a procedure or access a data resource that developers would expect to be encapsulated. This doesn't require mutation of any bindings. Merely reading the environment is sufficient to enable both of these behaviors. Access to the operands' operands is similarly problematic.

I have to admit, I haven't been thinking much about security.

I don't think that encapsulation is quite a complete solution for security anyway. Make no mistake, it does defend from certain types of security problem, and from certain types of privilege escalation attacks. But when we assume that a certain fraction of the code in the system has been written by the devil, and we don't know which fraction exactly, encapsulation is not sufficient to provide "security" in a meaningful sense to the providers or owners of the resources that need to be secured.

Ultimately these objects are handled by routines written in machine code. Machine code has almost no provision for security. If the devil manages to substitute one of his routines for one of the official ones, for example with a locally hacked version of the JVM or CLR, then whatever faith you've put in language-level encapsulation is for naught. Like a lock on a wooden door, it will only stop the honest man.

I think that we as a community need to look for a better solution than encapsulation. Capabilities seem to be the leading candidate.

I think that we as a

I think that we as a community need to look for a better solution than encapsulation. Capabilities seem to be the leading candidate.

Capabilities pretty much require encapsulation, which is what David is trying to say. If you don't have encapsulation, any capability security you hope to employ will be very weak.

Encapsulation seems impossible

In any publicly distributed system with mobile code, encapsulation as required by security concerns seems impossible. The problem is that each and every one of those machines belongs to somebody (I use the term 'publicly distributed' to mean that the machines do not all have the same ownership). Whoever "somebody" is that owns a particular machine, he or she has the right to run any code on it that he or she wants.

Even if you contend that no such civil right exists and the owner of the machine can in principle be forced to run code that they don't want to run in order to preserve security (a claim I find highly dubious), any technological measure to prevent it would necessarily also pave the way for trammeling of even more fundamental civil liberties. Besides, in practice all attempts to develop such means have consistently failed so far.

Providing a language spec that gives people no way to look inside encapsulated objects, or even a runtime implementation that gives people no way to look inside encapsulated objects (including environments), does not protect these objects in a way relevant to security. It only protects them from ordinary poor programming practices and accidents.

The sort of attacker whom you would be concerned about from a security point of view will not be constrained by your language spec, nor your implementation. The devil has access to assembly language and debuggers and is able to produce a modified version of your runtime environment. If the "encapsulated" environment or object is on his or her machine, and s/he can do evil by reading or modifying it, s/he will read or modify it regardless of your language spec or the capabilities of the runtime you provided.

That means that encapsulation is relevant to language design for distributed systems only in the sense of defending against ordinary poor programming practices and accidents; if security is your concern then you're necessarily talking about achieving it via distribution over machines none of which is owned or controlled by an attacker.

Encapsulation possible, and useful.

It is true that the host of code is able to export some 'reflective' capabilities, i.e. at the interpreter layer, to poke through external code resources and mine them for any useful information. This does not reduce the utility of language-level encapsulation. 'Untrusted code' should not be able to poke about the rest of your program and gain authorities or information that were not explicitly granted to it. Language-layer encapsulation is what prevents this, and does so without hindering performance and other properties.

As the distributor of code, your concern is information assurance - i.e. whether the code contains sensitive information, intellectual property, or capabilities that the remote host is not authorized to possess. POLA is about limiting the maximum extent of damage, after all. The straightforward option, of course, is to hoard everything. But that option has some dire consequences for disruption tolerance, resilience (via redundancy), and performance (latency, bandwidth efficiency, and load balancing). To address these latter concerns, it is necessary to distribute code. Ideally, the language supports you in making correct code-distribution decisions, and even automates the distribution. If so, your language supports tierless programming - i.e. such that you need make no real distinctions between client, server, and database installations. (With tierless programming, all installations are merely volatile participants in an open cloud.)

As the host, your concern is whether 'untrusted code' will compromise the security of your system. As mentioned above, this is where encapsulation of arguments and environment (i.e. as in object capability security) is useful, as it allows one to control the authorities and information granted to 'untrusted code'. A system with capability security can make a nice guarantee: that the only difference between code executing remotely, and code executing locally, will be performance, disruption tolerance, and resilience - which are exactly the reasons to distribute code. But caution! Capability security will not protect a host against denial-of-service attacks performed by untrusted code. For that, the language must have a good model for concurrency and partial failure (fortunately, any language for distributed systems will already have these!) in addition to some basic resource accounting in the language or the host.
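As a rough sketch of the hosting side, assuming first-class environments roughly as discussed in this thread (make-safe-base-environment and environment-define! are invented names, not any particular implementation's API):

    ; make-safe-base-environment stands in for "an environment holding
    ; only harmless primitives"; environment-define! stands in for
    ; "add a binding to a first-class environment".
    (define (run-untrusted code grants)
      (let ((world (make-safe-base-environment)))   ; no ambient authority
        (for-each
         (lambda (grant)                            ; grant is (name . capability)
           (environment-define! world (car grant) (cdr grant)))
         grants)
        (eval code world)))                         ; code sees only what was granted

    ; e.g. (run-untrusted client-query
    ;        (list (cons 'read-sensor read-sensor-cap)
    ;              (cons 'log         attenuated-log-cap)))

The point is simply that the host constructs the environment the untrusted code runs in, so every authority it holds is one the host chose to grant.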

Now, you might have noticed my consistently placing air-quotes around 'untrusted code'. This is because 'trusted code' is very poorly defined. Whose trust? Trust for what? In a running system, even code you might provisionally trust will quickly intermingle with untrusted code via closures, callbacks, and continuations. In general, 'untrusted code' is not confined: it may be endowed with capabilities, or carefully hoard them in ways that a 'well-behaved' program should not. Where critical, there are security patterns to ensure confinement, i.e. by providing untrusted code as an auditable AST. Effect-typing can also help. However, with rare exceptions, you should not care. Principle of Least Authority is based on limiting the extent of damage, not trust. And there are good alternatives to privilege escalation (i.e. involving sealer/unsealer patterns) that do not require 'trust'. Trust is only useful in the company of auditing and responsibility or liability (fiscal, reputation, etc.), and even there Trust is the third wheel in the relationship.
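For anyone unfamiliar with the sealer/unsealer pattern mentioned above, here is the bare-bones closure trick in plain Scheme (single-threaded, purely illustrative, and not hardened against mismatched boxes):

    (define (make-sealer/unsealer)
      (let ((slot #f))
        (define (seal value)
          ; a sealed box: an opaque procedure that will deposit its
          ; contents into the shared slot when asked
          (lambda () (set! slot value)))
        (define (unseal box)
          (set! slot #f)
          (box)          ; only boxes made by the matching sealer
          slot)          ; deposit into *this* slot
        (cons seal unseal)))

    ; (define pair (make-sealer/unsealer))
    ; ((cdr pair) ((car pair) 'some-capability))   ; => some-capability
    ; A production version would also check that the box actually
    ; deposited something, rather than silently returning #f.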

I don't really want to trust any code. Not the code (decision tree, form validation, GUI control, etc.) distributed to my client from a remote server. Not the code (query, command, etc.) distributed to my remote server from a client. Not the applications, the extensions, or the libraries that I painstakingly obtain using the sado-masochistic state-of-the-art code distribution mechanisms (installation packages, obtained via download... or even obtained via sneaker-net from a local electronics warehouse!). I don't even want to trust the code I wrote for myself, years ago.

For security, the language design should reduce the need for trust of code by the hosts... but must do so without compromising performance or useful expressiveness. A great many language features can help in this goal: encapsulation and memory safety, a good model for concurrency and concurrency control, resource accounting, well-defined partial-failures, information-flow analysis, linear typing, effect-typing, and so on.

It is true that the host of

It is true that the host of code is able to export some 'reflective' capabilities, i.e. at the interpreter layer, to poke through external code resources and mine them for any useful information. This does not reduce the utility of language-level encapsulation.

But it does mean that relying on language-level encapsulation for security against attackers was not part of its utility in the first place.

'Untrusted code' should not be able to poke about the rest of your program and gain authorities or information that were not explicitly granted to it.

"Untrusted code" can take the form of a memory debugger running as root in a separate thread. There is no way at all to prevent it from poking about in any program running on the same machine.

Language-layer encapsulation is what prevents this, and does so without hindering performance and other properties.

How?? What possible implementation strategy can you have in mind that protects running code and live data from being snooped around in, when you don't even own the machine where that code and data, however temporarily, resides?

Language level encapsulation can provide hygiene, sure. And hygiene is valuable. It can protect against accidental access, or even make it impossible to write code that executes in your runtime that accesses the protected resource. But hygiene is not security.
Security protects against genuinely motivated, technologically sophisticated attackers who have root access to the machine. Language level encapsulation cannot do that; it is hygiene, not security.

As far as I know the only thing that provides security is NOT running code on a widely distributed network where the ownership / control of any machine (including routers, hubs, and switches) is uncertain.

relying on language-level

relying on language-level encapsulation for security against attackers was not part of its utility in the first place

Incorrect. Encapsulation when used effectively, as in an object capability discipline, protects the host against external attackers. Usefully, it does so without hindering composition, expressiveness, or performance.

What it doesn't do is protect the attackers. To 'attack' a remote host one sends commands, queries, decision trees, mobile agents, or other code to that host, whereupon it will be executed. (Attackers, naturally, look very much like any other user or service who interacts with the system. They depend on the fact that evil code is difficult to distinguish from benign code.) The host, however, is free to peek around inside that code (command, query, etc.), and steal any sensitive information or authorities that the attacker was foolish enough to send along with the attack.

Your argument amounts to: "but the attacker's code is wide open to perusal by the defender, so how can encapsulation possibly protect the defender?" That argument is a non sequitur. It is not the responsibility of the defender to protect the attacker's interests. The attacker, or any user, is responsible for deciding just how much information to divulge to a potentially 'untrusted' remote host.

Open distributed systems security involves systematically protecting both the distributor (who is the potential attacker) and the host. One protects the distributor by automating good decisions about how much code can be distributed, and by systematically cleaning up any accidental authority leaks (i.e. by revoking and renaming compromised authorities). One protects the host via object capability model, a good concurrency model, a well-defined failure model, and resource conservation (process accounting). When all the participants are selfishly protecting their own interests, systems security is the natural result... as is a market of services.

"Untrusted code" can take the form of a memory debugger running as root in a separate thread. There is no way at all to prevent it from poking about in any program running on the same machine.

It would be silly to run an untrusted debugger with a large degree of ambient authority. But what's important for security of the host is that the attacker cannot start or control such a debugger. And that attack is prevented because the attacker's command/query/script/extension/agent/code happens to be running in an environment with disciplined encapsulation of authorities.

That is, even if you had the authority to start a silly debugger, the attacker's code must not be able to steal it from you. If the language allows arbitrary code (including the attacker's code) to poke around the environment and unevaluated operands, it might be able to discover the authority to run the silly debugger. Encapsulation is the property that says the attacker cannot do this.

Object capability discipline doesn't really add anything to the power of encapsulation. Rather, it mostly involves streamlining the APIs to prevent developers from accidentally granting more authorities than intended.
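For instance, much of that streamlining amounts to patterns like attenuation and revocation, which need nothing beyond closures. A plain Scheme sketch, with the message convention and error behaviour invented for the example:

    ; Attenuation: delegate only part of a capability's authority.
    ; Here file-cap is assumed to be a procedure that responds to the
    ; messages 'read and 'write (an invented convention).
    (define (read-only file-cap)
      (lambda (msg . args)
        (if (eq? msg 'read)
            (apply file-cap msg args)
            (error "write authority was not delegated"))))

    ; Revocation: hand out a forwarder, keep the off-switch.
    (define (revocable cap)
      (let ((live #t))
        (cons (lambda args
                (if live (apply cap args) (error "capability revoked")))
              (lambda () (set! live #f)))))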

Security protects against genuinely motivated, technologically sophisticated attackers who have root access to the machine. Language level encapsulation cannot do that; it is hygiene, not security.

Security is fundamentally about liveness and predictability. Fine-grained control over distribution of authority, and support for information assurance, are two ways to manage predictability. "Root access" is an artifact of a few popular operating systems that has nothing at all to do with the nature of security.

Language level encapsulation can offer security even if the language implementation is running with root authority... or even if the language happens to be the operating system. The distinction between languages and operating systems is terribly vague even before you consider languages with proper security and concurrency models.

the only thing that provides security is NOT running code on a widely distributed network where the ownership / control of any machine (including routers, hubs, and switches) is uncertain

Sure. That will certainly protect 'the attacker' (distributor of code), albeit only up to the moment that the attacker becomes the attacked. But, as I noted above, the cost of hoarding computations onto a 'trusted' host - in terms of disruption tolerance, resilience and redundancy, latencies, bandwidth efficiencies, utilization and load-balancing - can be dire.

There are many benefits to be had by distributing as much code as possible. And, for most large services or programs, you'll find that the vast majority of their code can be acceptably distributed to untrusted nodes, and most of what is left can be run on only slightly trusted nodes (i.e. you might not trust hosts that carry a 'not evil' certificate from Google, but that might be good enough for a fair portion of whatever program or service you are offering - enough to enhance scalability).

Incorrect. Encapsulation

Incorrect. Encapsulation when used effectively, as in an object capability discipline, protects the host against external attackers. Usefully, it does so without hindering composition, expressiveness, or performance.

I do not think that we are talking about the same threat model. I was responding to your statement about running on a "widely distributed" network. On a widely distributed network, the attacker is NOT "external." He or she is the owner of, and in physical control of, one or more of the machines your application is running on. He can start high-ambient-authority processes from the command line, and is not restricted to your runtime. Now if you are claiming that the attacker is strictly an external or remote attacker, that seems to me to be backing off from your claim about providing meaningful security in a widely distributed application.

What it doesn't do is protect the attackers. To 'attack' a remote host one sends commands, queries, decision trees, mobile agents, or other code to that host, whereupon it will be executed. (Attackers, naturally, look very much like any other user or service who interacts with the system. They depend on the fact that evil code is difficult to distinguish from benign code.) The host, however, is free to peek around inside that code (command, query, etc.), and steal any sensitive information or authorities that the attacker was foolish enough to send along with the attack.

Wot? The attackers? What the bloody H... are you talking about?? That would be relevant in a remote attack on a hosted application, which we had not, up to now, been talking about. Yes, hygiene can help provide security on a hosted application. But you said you needed encapsulation for a widely distributed application, and it cannot provide that.

Widely distributed applications are subject to attempts by people with root access on the local machine to subvert the local node of the application. If your application is hosted rather than widely distributed then functional security can be built on hygiene plus the basic rule of "don't run anything you don't want hacked on a machine you don't own."

Threat Model or Application Model?

On a widely distributed network, the attacker is NOT "external." He or she is the owner of, and in physical control of, one or more of the machines your application is running on.

I ask that you ponder: In context of wide code distribution, what is an 'application'? To whom does it 'belong'? What, precisely, does it mean for a node already authorized to run the code to 'attack it'?

Consider modern support for continuous upgrades, and ask yourself: who really owns the applications, plugins, libraries, and operating system that you use on a regular basis?

It is true you cannot just distribute code with wild abandon. One of the fundamental rules - for information assurance - is that you do not distribute security-sensitive routines or data to nodes that are not authorized to host them. This does very little to hinder a significant degree of wide open-systems distribution because:

  1. In a large program that contains some sensitive code or data, it is common that most code or data is not especially security sensitive. One may often distribute a portion of any such program to untrusted nodes. Depending on communication patterns, it may even be useful to do so.
  2. Having exactly one node trusted to run a particular piece of code or host a particular piece of data is rare, especially for data and code resources with which external nodes are likely to interact. Even if you don't run your own security domains, it is not unrealistic to suggest that you might authorize nodes certified by a larger 'trusted backbone' organization, such as Google or Amazon. This still supports wide geographic distribution, including replication for performance, resilience, and regional disruption tolerance.

you said you needed encapsulation for a widely distributed application

And this is true. Your application code is potentially dangerous and expensive to hosts. Thus, unless you are depending upon the remote hosts to be foolish or (for some reason) desperate to run your code locally, there must be assurance that your code is 'mostly harmless'.

Disciplined encapsulation greatly supports this assurance (though it leaves untouched the issue of resource management). The object capability discipline, even within trusted code, improves confidence in interactions with untrusted code and minimizes the risk that developers will accidentally violate the Principle of Least Authority when communicating with your code. Because this encapsulation is enforced by the (virtual) machine, your untrusted code cannot obtain such authorities except through a grant.

It also helps that your distributed application code encapsulates its own authorities... i.e. that it comes pre-endowed with interesting and useful authorities. That is, developers can simply treat your code as a persistent service. Distributing low-level capabilities to 'libraries', so that they may build interesting, higher-level capabilities, is a vulnerable stage in a program's life-cycle because it requires trusting a lot of code to properly distribute the low-level capabilities. One avoids this stage for automatically distributed code. By recognizing code-distribution as a feature of the language, and protecting it under the security-umbrella of the distributed language - rather than leaving it to ad-hoc external systems - I suspect the 'widely distributed' model offers greater security and flexibility than 'centralized' (vat-based) capability languages such as E. Good support for code-distribution also helps solve the upgrade vs. persistence problem (by reducing it to a problem of runtime upgrade and live programming).

If your application is 'hosted' rather than 'widely distributed' [...]

Those terms are not contradictory.

One may easily have a widely distributed program in the sense that different components are hosted by different nodes. Indeed, this is ideal, because many resources - say, sensors in a sensor network - are widely distributed. You'll want to run code dealing with each sensor on an authorized node geographically close to that sensor, if not directly on the sensor.

Orthogonally, individual components of that distributed program may be replicated across nodes. This may be done for performance, such that common subroutines don't involve a call across the network. This replication may also be for resilience, such that if a node for one of the central components goes down, the program can quickly regenerate and keep running.

I do not think that we are talking about the same threat model.

Indeed. My impression is that you have the threat model backwards. E.g. for a distributed Poker app, the issue isn't protecting the application from the clients, but rather protecting the clients from the application and from one another.

K.I.S.S.

You can't solve impossible problems by asserting that you can invent a programming language in which they are automagically solved.

If a problem is impossible...

... the solution is to change to a different set of relevant problems.

I don't aim to solve any impossible problems, Thomas Lord.

[Though it seems Ray is intent upon finding an obscure definition of 'widely distributed' that turns it into an impossible problem. Wide distribution, IMO, does not mean that security-sensitive components get randomly distributed to untrusted nodes.]

;-)

"I don't aim to solve any impossible problems..."

Exactly.

K.I.S. for S.

I'm afraid you'll need to clarify your meaning.

ironically: it's a big topic (K.I.S.S.)

Well, you asked, so I'll be frank.

I just think that all of the progress in massively distributed, decentralized, distributed-code, massive-data-set systems for your lifetime and mine and probably for the next few generations as well - is going to be ad hoc. I think that you make robust systems out of stuff like "grep" or a good ftp client - not out of some idealized hyper-abstract model that reflects your half-baked notions of the possible potential of distributed computing.

If you were talking up some tiny languages for very domain-constrained problems in distributed computing, that could be interesting; but instead you are posing questions like:

"I ask you to ponder: In context of wide code distribution, what is an 'application'? To whom does it 'belong'? What, precisely, does it mean for a node already authorized to run the code to 'attack it'?"

Aside from being utterly non-responsive to Ray's comment, your response, taken on its own merits, at best suggests an urgent insistence to delve deeply into needless and quite possibly incoherent abstractions. What is an application?!?? WTF do you think it is... it's a commonly used term with a fairly clear meaning. You don't get to take it over because you have the vague idea of "solving" an ill-defined problem space around distributed computing using the implausible strategy of programming language design.

hyper-abstract + half-baked notions

That would certainly be a bad combination :-)

But abstract is not bad by itself. An abstract model is only bad if it does not contain the relevant concrete scenarios. I think these concrete scenarios have emerged over the last 10 or 15 years (partially by people using "grep"). It is time for a good abstract model and a language that fits this model.

And of course the language runtime is part of the host. So if you want a secure host that can run distributed code, you need a secure language that has encapsulation. This is not a sufficient condition for security, but a necessary one.

re: hyper/baked

I do understand that over the last 10 or 15 years there have emerged a lot of oft-repeated patterns in the architecture of distributed, decentralized apps. What I don't buy is that there is a "missing" abstraction that unifies all of these patterns and expresses them as novelties in programming language design.

You write "So if you want a secure host that can run distributed code, you need a secure language that has encapsulation."

Well, prima facie, we have some examples of such hosts. One is your browser. Others are commodity computing platforms such as EC2. Evidently, such languages as Javascript and C-over-unix provide as much encapsulation as is necessary. What is the big over-arching abstraction that usefully unifies these patterns (or at least, why should I believe that such an abstraction can be found)?

There are, I'm sure, good abstractions yet to be invented for many of the aspects of the problems that arise when building these systems. If you said to me "I'm going to try to design a programming language that is a kind of DSL for distributed, decentralized social networking software..." then I would believe you and think that you have a real chance at success. I wouldn't expect the language you make to be especially useful for writing, say, a distributed and decentralized file system or a distributed and decentralized render farm. I'd believe that there are some elements in common between those various domains - for example, a dist./decent. social network and a dist./decent. file system might share in common an abstraction for user identity and authentication - but I don't believe in a set of such commonalities that adds up to a general purpose programming language.

An analogy might be made to more conventional utility systems like telephony, electricity, water, and sewage. These distributed and decentralized capability resources share some common abstractions - for example in the management of "where is it safe to drop a backhoe and dig" questions. Yet, there is no over-arching design language that encompasses all of those infrastructure pieces. If you're designing a telephony sub-net your practice is very different from if you're designing a sewage system subnet. There are some abstract concepts in common between the two. There are some common and unified practices (like "where is safe to dig"). But there is no over-arching design language that covers both cases. Those "real-world" distributed/decentralized systems are best reasoned about in application-specific ways. You've got one design language for telephony, another for sewage. It's ad-hoc, in that sense.

Just as I don't expect a common implementation specification language between telephony and sewage system designers, I will be very, very surprised to see a common language that subsumes all of distributed and decentralized computing.

More realistic than an ultimate programming language for distributed and decentralized computing would be a programming language for analyzing networks and reasoning about them. Indeed, we have such analytic languages with things like queuing theory and cybernetics, and I'm sure there's plenty of room for expansion and improvement.

What I deeply object to is the rhetorical move of saying that a feature of a general purpose programming language, like FEXPRs, is somehow bad because in the theories of the speaker it interferes with the project of making a programming language that, so to speak, is good for implementing both telephone and sewage systems. One shouldn't make that rhetorical move until one has made a convincing case that such a programming language can even possibly exist.

RE: Missing Abstraction? General Purpose?

What I don't buy is that there is a "missing" abstraction that unifies all of these patterns and expresses them as novelties in programming language design.

Which abstraction are you talking about, and who is trying to sell it to you? I suspect you're reading something that wasn't written.

For my own language, I haven't used even one abstraction that hasn't been studied since the 70s. The design challenge isn't the abstractions themselves so much as choosing a proper subset, breaking them down, and fitting them together in a coherent manner. I'd rather avoid the Perl phenomenon of providing "less a language and more a flock of features flying in loose formation".

Evidently, such languages as Javascript and C-over-unix provide as much encapsulation as is necessary.

Perhaps you should share this evidence, because it seems to me that most evidence is to the contrary. Why would Javascript need hacked-in, painful restrictions like the 'same origin policy' if it were sufficient? Why would people be pursuing Caja?

How does C-over-unix let users place some code at or near a web-cam for bandwidth efficiency without leaping through bureaucratic hoops to obtain a user-account and shell access? Why would an administrator trust that they won't use more of that authority than is necessary to just get information off the camera and transmit it?

There are, I'm sure, good abstractions yet to be invented for many of the aspects of the problems that arise when building these systems.

Absolutely. And those abstractions that are not given first-class support from the language will become services and libraries.

If we later discover that there are significant optimization or security properties to be had by granting such abstractions first-class support, they can be added to future generations of distributed languages and protocols.

But most abstractions cannot offer significant enough runtime benefits to receive a first-class treatment. Those will forever be relegated to libraries and syntactic transforms or other forms of meta-programming.

More realistic than an ultimate programming language for distributed and decentralized computing

I have no illusions of offering an "ultimate" (final, perfect) language. That's hard enough to do even if the goal was to create a perfect Academic toy language without concern for practicalities (plenty of competition between Coq and Maude and the rest :-).

My goal is a general purpose programming language. That 'purpose' happens to include concerns that have grown over the last couple decades. A general purpose language doesn't need to be optimal for all use-cases. It only needs to be excellent for the common use-cases, practical for most other use-cases, and merely 'capable' of the rest.

You assert that I might better formulate such a pursuit as a DSL.

I'll match and raise you one: Scheme should no longer be considered a General Purpose Programming Language, because it no longer meets General Purpose needs. Back when Lisp was invented, all the relevant resources were centralized to the host, concurrency was a non-issue, and "node failure" was so far beyond reckoning that there was little practical reason to acknowledge it. When Scheme was later built in 1975, the situation was much the same - albeit now with mainframes and remote terminals.

Well, guess what: the hardware grew up. Lisp and Scheme... did not. Today, Clojure is close to a General Purpose Programming Language. Scheme is a DSL for volatile academic toys.

DSL vs. GPPL has no clear distinction other than suitability for a purpose.

What I deeply object to is the rhetorical move of saying that a feature of a general purpose programming language, like FEXPRs, is somehow bad [...]

I'm not immune to human nature, either. I could easily tell that your objection was more emotional than based in reasoning. You built more than enough straw-man arguments to burn - about 'missing abstractions' and 'ultimate languages'. You even resorted to one 'WTF', and one long sequence of exclamation points and question marks. I could easily feel the heat. You must have some emotional investment in the subject.

Anyhow, I do not believe there exists such things as 'general purpose language features'. General purpose languages should not be flocks of 'general-purpose features' flying in loose formation.

But, even if I did believe in 'general purpose language features', I don't have much reason to believe that FEXPRs are one of them. Why would I, if they seem to interfere with reasoning and practical concerns of the modern and future eras? Even without considering distribution, they introduce their share of potential problems for local modularity.

Ah, well. Even if Ehud hasn't made an [ADMIN] comment yet, I feel it's time to either take this conversation elsewhere, or terminate it. Feel free to have the last word here.

What is an application?!??

What is an application?!?? WTF do you think it is... it's a commonly used term with a fairly clear meaning. You don't get to take it over because you have the vague idea of "solving" an ill-defined problem space around distributed computing using the implausible strategy of programming language design.

David's point is that "application" does not have a fairly clear meaning in the distributed scenario with mutually suspicious agents, which is why it's so hard to figure out proper encapsulation boundaries. Any refinement of the ill-defined concept is a welcome data point in this open territory.

re: "what is an applicatin?!??"

David's point is that "application" does not have a fairly clear meaning in the distributed scenario with mutually suspicious agents, which is why it's so hard to figure out proper encapsulation boundaries. Any refinement of the ill-defined concept is a welcome data point in this open territory.

And my point is that so far we've solved such problems in many domain-specific cases in ad hoc ways. It's not at all clear that there can possibly exist a good definition for "application" in that broad sense - it could be that ad hoc and domain-specific is the only way. It seems likely, at least to me, that it is all ad-hoc at that scale now and effectively forever. Thus I object to the rhetorical move of saying that a proposed feature of a general purpose programming language is uninteresting or bad because it fails to solve the quite possibly insoluble problem of well-defining an "application" in that vaguely conceived sense. My perception is that this brave new definition of "application" is being sought for dubious reasons, like the quest for the gold of El Dorado.

Thus I object to the

Thus I object to the rhetorical move of saying that a proposed feature of a general purpose programming language is uninteresting or bad because it fails to solve the quite possibly insoluble problem of well-defining an "application" in that vaguely conceived sense.

That's not at all what this thread is about. David legitimately objected to fexprs because of the perceived violation of encapsulation properties that local and distributed capability systems rely on for reasoning about and enforcing security properties, a point which has yet to be addressed. The segue into "application" was merely a rhetorical device employed to convey the various security and locality properties that come into play in a distributed program.

As for your other point, re: distributed solutions will always be ad-hoc, I would take that as merely a sign of our undeveloped knowledge of a domain, not intrinsic to the domain itself. I'm not sure what evidence you have that generic tools and techniques for designing and building distributed programs are a hopeless endeavour, but it sounds very premature from what I know of the domain.

FEXPRs and capability security

David legitimately objected to fexprs because of the perceived violation of encapsulation properties that local and distributed capability systems rely on for reasoning about and enforcing security properties, a point which has yet to be addressed.

I think it's been addressed repeatedly. For the style of FEXPRs we're discussing, there is absolute local control in code of what to expose to other code. If you have a particular discipline that you want to maintain to keep capability keys safe, FEXPRs and FCEs give you plenty of mechanism for implementing that discipline as an enforced rule in the code in which you decide to deploy that discipline.

It seems to me that David went further than you suggest in positing the potential existence of a general purpose PL that would (through complex machinations like extensible grammars) have such capabilities built in and that, because of the desirability of such a language, FEXPRs were a bad idea. On the one hand I doubt the potential existence of such a PL (of much use) and on the other hand I note that if you do have a design for such a PL, by gosh, you can implement it atop FEXPRs in ways we've discussed.

I doubt the potential existence of such a PL and that is a reiteration of my belief that distributed solutions will always (my lifetime, yours, our kids) be ad hoc. Ray has done a handsome job of pointing out why: as soon as you have a massively distributed and decentralized architecture like that you have created incentive for nodes to cheat and cheating isn't all that hard. Every DRM system can be hacked to circumvent the DRM infrastructure and the more you insist on relying upon that infrastructure, the higher the incentive to work around it becomes. Things are permanently ad hoc in the sense that if we shift attention from the unsolvable generalized problem to specific cases, we get a garden of solvable problems although no one solution solves them all. The most secure security policy in a distributed and decentralized computing world is simply reticence to offer service at all (as in keeping some machines off the net entirely). When offering service is desired, the appropriate controls vary wildly depending on the nature of the service being offered - faith in an over-arching abstraction to tame that beast has no evidence-based support. Continuing the analogy I made earlier: the security measures you want to put in place to protect telephony infrastructure are ultimately incommensurate with the security measures to protect sewage infrastructure. The two might have some common patterns and the analog of "re-usable code" but in their totality, they are apples and oranges.

For the style of FEXPRs

For the style of FEXPRs we're discussing, there is absolute local control in code of what to expose to other code. If you have a particular discipline that you want to maintain to keep capability keys safe, FEXPRs and FCEs give you plenty of mechanism for implementing that discipline as an enforced rule in the code in which you decide to deploy that discipline.

If there was such an argument made, I haven't seen it. Instead, I saw an argument that encapsulation was fundamentally infeasible, and so why not abandon it for the benefits of fexprs? Perhaps I misread, in which case I look forward to reading your fexpr proposal and analyzing its local reasoning properties.

From an informal description, fexprs sound problematic, but I certainly acknowledge that you may find some region of this space that makes local reasoning easy while retaining extensibility. I myself will place my bets on staging.

Ray has done a handsome job of pointing out why: as soon as you have a massively distributed and decentralized architecture like that you have created incentive for nodes to cheat and cheating isn't all that hard.

Cheating is currently easy because predominant security models allow one to express unenforceable security properties. Capabilities (largely) do not have this deficiency, though they are still vulnerable to DoS of course. This is why a platform that can support capability reasoning is essential.

When offering service is desired, the appropriate controls vary wildly depending on the nature of the service being offered - faith in an over-arching abstraction to tame that beast has no evidence-based support. Continuing the analogy I made earlier: the security measures you want to put in place to protect telephony infrastructure are ultimately incommensurate with the security measures to protect sewage infrastructure.

By this argument, all programming is ad-hoc, even local programs, in which case "ad-hocness" is not a useful distinction. That any specific program utilizes domain-specific data types and algorithms is not important; what's important is that we can derive these domain-specific types and algorithms by the composition and application of more general abstractions.

I don't see how you can truly believe that we will not find such general distributed abstractions from which we can derive domain-specific distributed programs.

Finally, any specific security policy must be expressed in a language that specifies legal information flow. Capabilities are one such language, and so at a certain level of abstraction, the measures used to secure telephony and sewage infrastructure are indeed the same.

positing the potential

positing the potential existence of a general purpose PL that would (through complex machinations like extensible grammars) have such [referring to FEXPR or FCE] capabilities built in [...] I note that if you do have a design for such a PL, by gosh, you can implement it atop FEXPRs in ways we've discussed.

You misread me.

I never posited that FEXPR or FCE like capabilities would be achievable through such machinations.

Rather, I posited that any secure or modular use of FEXPRs and FCEs will inherently subset what can be achieved, more safely, through second-class metaprogramming. Macros and extensible attribute grammars are offered as examples for second-class metaprogramming.

Logically, those positions are very different.

my belief [is] that distributed solutions will always (my lifetime, yours, our kids) be ad hoc

That would be a self-fulfilling belief if you convinced everyone else. Maybe you should stop trying. ;-)

Every DRM system [...]

DRM is, in essence, a technology solution for that last mile: distribution and control for code and data on devices that happen to be physically controlled by sheep. I'm not especially enamored of DRM, but I recognize its utility for certain purposes, and I would be willing to buy it for some purposes (such as competitive gameplay). Sometimes sheep have more fun.

I'll admit, my current language design is quite promising (in a scary sort of way) for developing and integrating DRM systems.

I developed my distribution models with concern for unmanned systems falling in enemy territories, and with concern for untrusted nodes attempting to obtain sensitive information or control critical service components. In short: project-layer code and capabilities may be annotated with the authorizations required to 'host' and 'hear' about them, respectively. (That is to say, there are capabilities that you aren't even authorized to hear about.) A simple contagion model with voluntary weakening spreads these authorization requirements out to the objects whose code references the objects whose code references the sensitive objects, and so on. Inductively I can always guarantee that, at the very least, the creator of every 'new' object is among the set of legal hosts. This design is subject to some rather simple static analyses, in addition to dynamic checking for linking and code-distribution. The goal is two-fold: to prevent accidental delegation of sensitive information or authorities to lower-trust resources (thus, weakening is voluntary but explicit), and to support automated clean-up whenever a node is compromised. Clean-up is simple enough: hosts are free to audit the authorities of other hosts to control a capability ... or even to know about one. There is no penalty for failing these audits, except a denial of the specific service with a security exception. (Blame and Responsibility are not issues I handle in a primitive way, though libraries may utilize Horton patterns.)

Information can easily be 'tagged' as secure via carrying it in a record with any limited-distribution capability (which might not have any other purpose). Again, the goal is to prevent only accidental leaks.

What I support doesn't provide full DRM. That is, nothing restricts programming of the hardware. But DRM may easily benefit from what I provide. It is very easy to ensure that only DRM nodes can prove specific authorities, and having the language handle these issues could avoid a lot of hassle.

Every DRM system can be hacked to circumvent the DRM infrastructure

If the DRM infrastructure were rigid, like a cement wall, I suppose that would be a problem. There are ways to reduce the value of any particular violation, thus reducing the incentives, without reducing the cost in achieving the violation.

But I'm just playing devil's advocate. DRM is not my goal.

The most secure security policy in a distributed and decentralized computing world is simply reticence to offer service at all (as in keeping some machines off the net entirely).

People who actually study computer security call that a "denial of service attack" and categorize it as a "security violation".

Security requires liveness and accessibility. A resource or service is not 'secure' if you are authorized to use it but cannot do so. This is also a problem with sandboxes.

Things are permanently ad hoc in the sense that if we shift attention from the unsolvable generalized problem to specific cases, we get a garden of solvable problems although no one solution solves them all. [...] Appropriate controls vary wildly depending on the nature of the service being offered.

One might attack other GPPLs on the same basis: appropriate controls for a particular algorithm vary wildly based on whatever utility you are implementing; "solutions will always (my lifetime, yours, our kids) be ad hoc!" Clearly, Scheme is not dreamed of in such a philosophy.

While I would love to produce an ideal solution for all problems, I would be entirely satisfied with producing a very-good solution for many problems. Many others could be produced via meta-programming and frameworks. This is the tradition of all GPPLs.

Is this more humble goal possible? Sure. E, Erlang, Mozart, and Alice provide a lot of evidence suggesting so. Even Python and Java, with their frameworks and clouds forming 'higher level' distributed systems languages, provide evidence.

commonly used term with a

commonly used term with a fairly clear meaning

I asked Ray to 'ponder' it because I wasn't interested in derailing the whole thread on this subject.

I don't consider 'application', nor ownership thereof, to be at all a clear concept... not even today.

But, if you feel it is, then give it a try: define 'application'. Be sure to do so in a way that properly covers SETI at home, World of Warcraft, Google Maps, and Firefox with its auto-updates.

My own understanding of 'application' is 'code that specifies an interactive scene-graph (that can be rendered into a GUI)'. That's it. And my reasoning on this subject is basically: I've never heard anyone call a non-GUI service an 'application'.

But, while I do not consider 'application' to be a clear concept, I also do not consider it an especially useful one. In the most relevant ways, a specific GUI or application can be treated as just another service - or, most precisely, a service-adaptor that translates data resources and service capabilities into something efficient for display and human interaction. I don't think in terms of 'applications'. I think in terms of services, sensors, and actuators.

Despite your thoughts on the 'merits' of my comments, I offer no "urgent insistence" to define any "needless and quite possibly incoherent abstractions". But I believe that you, and Ray, could profit from ruminating over and reconsidering your own assumptions on the nature of development.

Chances are, your own assumptions on the subject make code-distribution itself seem a much less K.I.S.S. topic than it could be.

Distributed code is distributed code. I downloaded Firefox at some point. So did millions of other people. Thus, Firefox code is massively distributed. This method of code-distribution is almost stupidly insecure: at the moment, Firefox has access to my entire machine every time I run it. If I really wanted to, I could get the same effect by directly giving Mozilla.org an open pipe to talk directly to my Operating System, and there'd be no great difference in security. There would, of course, be a great difference in performance and disruption tolerance. The basic reasons for code distribution do not change. Whether I download the code myself, obtain it through sneaker-net, or support automated distribution, the goals are performance (latency, bandwidth efficiency, global utilization via distribution of load), resilience (through redundancy), disruption tolerance, and (potentially) irrevocable licensing.

re: commonly used term

I don't consider 'application', nor ownership thereof, to be at all a clear concept... not even today.

But, if you feel it is, then give it a try: define 'application'. Be sure to do so in a way that properly covers SETI at home, World of Warcraft, Google Maps, and Firefox with its auto-updates.

Why not stick to plain English? "SETI at home" is a (not for profit) business model implemented using certain server applications and certain client applications. "World of Warcraft" and "Google Maps" are similar except for having or being part of for-profit business models. Firefox is a browser application and its auto-updates are a way to upgrade installed versions of that application. It's one thing to wonder if there is some over-arching concept that somehow comprises all of those things but to call that concept an "application" and demand a programming language for such applications seems rather unjustified.

Distributed code is distributed code. I downloaded Firefox at some point. So did millions of other people. Thus, Firefox code is massively distributed. This method of code-distribution is almost stupidly insecure: at the moment, Firefox has access to my entire machine every time I run it. If I really wanted to, I could get the same effect by directly giving Mozilla.org an open pipe to talk directly to my Operating System, and there'd be no great difference in security. There would, of course, be a great difference in performance and disruption tolerance. The basic reasons for code distribution do not change. Whether I download the code myself, obtain it through sneaker-net, or support automated distribution, the goals are performance (latency, bandwidth efficiency, global utilization via distribution of load), resilience (through redundancy), disruption tolerance, and (potentially) irrevocable licensing.

Your wheels look to me like they are spinning in the mud, there. Hey, you left out goals of efficient power consumption, effective brand identity, and fund-raising for future R&D. I mean, if we're just throwing a bunch of unrelated desirable things in a big basket and saying "let's make a programming language for this" why not go all out? In any event, returning to your original point, you've made no convincing argument against FEXPRs there.

Miscommunication

In plain English:

  1. I never claimed 'application' to be a common word with a clear meaning. That was you.
  2. I do not believe that 'application' has a clear concept - no clear borders, no clear ownership, no clear packaging. Your attempts to categorize SETI, WoW, Google Maps, and Firefox do little to change my impression. I suspect this to be a case of violent agreement.
  3. I have never expressed a desire to usurp the word for my own use or inflict such a definition on anyone else. That assumption, and the friction it is causing, is entirely on your side of this conversation. I can only feel the heat indirectly, when you push it into your posts.
  4. I am under the impression that my other correspondent, Ray, had a particular 'application' model in mind, when he was discussing 'widely distributed applications'. In particular, his repeated references to 'ownership' indicate such a model. I believe that these assumptions hinder understanding of practical models for widely distributed programming. Therefore, I asked for him to reconsider his assumptions.

demand a programming language for such applications seems rather unjustified

How could demand for programming languages that support the applications of the modern era possibly be unjustified?

Maybe you wish to believe it unjustified, so that you can continue to justify a language product that is clearly deficient for these purposes?

You left out goals of efficient power consumption

Actually, I've given that goal more than a little consideration. Any GPPL for the modern era really ought to support mobile devices if it is to deserve the 'GPPL' label.

I just don't consider this germane to the conversation. FEXPRs cause problems for encapsulation, and therefore for secure program extension (such as third-party modules, libraries, queries, commands, scripts, plugins). But I do not see how FEXPRs cause any problems for efficient power consumption.

If you were promoting another alleged 'GPPL feature' with inherently poor power-performance properties, I might have brought up the issue.

In any event, returning to your original point, you've made no convincing argument against FEXPRs there.

I made a reasonable point. John Shutt recognized this point, and he did a great job of addressing half of it. He acknowledges and leaves the other half to 'future insight'.

That's really quite admirable. I am not convinced a solution is possible without either changing FEXPRs or adding other features to the language (FEXPR-typing?). But I aspire, and quite often fail, to address arguments with that level of decorum.

Anyhow, whether an argument is convincing depends as much on the audience as upon the argument. If you wish to dismiss my point for whatever reason - good or bad - I will do nothing to stop you. That said, I would rather not stand by while my valid point is characterized as the ravings of a sewage-and-telephony crank...

re: Miscommunication

Bah. Sorry that you felt I was labeling you a crank. I'm not. I think I do have a good point here so if you'll indulge me let's please rewind and go through this.

I don't agree that we are in violent agreement about the word "application" although I see it would be a bit of a pointless rat-hole to spend too much on sorting out our differences. I'll just say that: A browser is an application you can download and install on your machine. Is the world wide web an application? Sometimes people say things like "Email is the killer-app for the Internet," and other times they say things like "Sendmail Inc. sells a variety of applications for processing email." In the first case ("killer-app for the Internet") I think they are speaking metaphorically, in the latter case I think they are speaking literally. The world wide web and email are certainly applications of the Internet infrastructure in the sense of "use of" but neither is an "application" in the sense usually applied to software. Rather, the WWW and email are both distributed and decentralized systems that, at a technical level, are defined by various protocols and guidelines as to the expected behavior of various server and client programs. It doesn't advance any analysis to declare, as you did, that there is some special problem in recognizing the boundaries or ownership of apps like SETI at home, World of Warcraft, Google Maps, or Firefox. Sure, it's true that each does, in some sense, distributed computing. Several of them involve automagic software updates, and so forth. There are a bunch of interesting technical problems in and around the network service infrastructure that each comprises. There are, I think, various interesting PLT takes on those problems. I just don't think you make any progress on any of those technical challenges by proposing a reconsideration of what an "application" comprises. Indeed, none of the challenges of such systems as you listed are especially new.

How could demand for programming languages that support the applications of the modern era possibly be unjustified?

Briefly, because you haven't given any convincing (to me, at least) characterization of what distinguishes "applications of the modern era" and why one should believe there is a deficit of support for them among existing languages. For a few decades now we've gotten by pretty well by defining distributed and decentralized systems in terms of protocols, not programs. This seems to me to be deeply related to the decentralized aspect of things. There are concepts that cross-cut across protocol design (e.g., URLs, user identities, etc.) but these generally have not yielded a need for programming languages other than domain specific languages for very narrow purposes.

Now, as I understand you, you're in part concerned about an issue that cuts across multiple protocols: the exchange of programs and the execution of programs received from possibly untrustworthy sources. I would agree with you if you are saying that in the "modern era" there is a lot more such exchanges and execution going on than in the past. But "more" does not make for "different". In the past, different instances of this problem have called for wildly different solutions so I don't see any chance of an overarching concept that will fix that.

And I understand you to be concerned with composition of services and at moving around boundaries of trust (as with "capability"-based systems) but, here again, the decades of history suggest that the concept of such composition breaks down, as a practical matter, into a possibly infinite set of special case solutions.

[re: why not throw in a requirement for low power consumption] I just don't consider this germane to the conversation. FEXPRs cause problems for encapsulation, and therefore for secure program extension (such as third-party modules, libraries, queries, commands, scripts, plugins). But I do not see how FEXPRs cause any problems for efficient power consumption.

OK, on that point I am guilty of conflating two criticisms and being horribly confusing as a result. Let me tease them apart.

One issue is that the referent of "secure program extension" is not defined in a way that makes a convincing case for the need for a unified treatment of its problematics. To put it schematically: you could waltz into many different and diverse software engineering team meetings and say "Hey, we have a problem about how to safely run untrusted third party code in our app!" and everyone around the room will nod in agreement. But bring together any 3 of those groups in a single room and you are likely to get at least 4 opinions about what "safely run untrusted third party code" means. That's why I talked about "hyper-abstract" and "half-baked" notions: you have there (that untrusted code issue) an abstraction that works socially because it is vague but not so much technically because the use of that application in various real contexts has so many divergent meanings.

A second issue is simply that you don't get a good domain against which to define a PL just by making an arbitrary heap of attributes you want programs in that PL to display. "Low power consumption" was meant as a reductio kind of argument, as in, hey, why not also make a programming language that makes free ponies - that'd be cool, too. So, that was the (impolite, sorry) snark of "why not throw in low power consumption". It wasn't meant as a personal attack or to call you a crank - just to economically convey the concept of that kind of design error.

Finally, FEXPRs don't cause any kind of encapsulation problem. Mr. Shutt showed you how to force operand evaluation in Kernel, I showed you how to force operand evaluation in the face of Scheme+FEXPRs -- there's no problem there. And, for that matter, if you like: you can box up untrusted third party code in such a way that you have fine-grained control over which unevaluated operands are exposed to it. To say, as you seem to be, that FEXPRs introduce some special kind of security problem is, at best, to wildly exaggerate. On the contrary, FEXPRs give a nice operational and abstract semantic model on which to build those enforced disciplines you care for in particular domains.

That said, I would rather not stand by while my valid point is characterized as the ravings of a sewage-and-telephony crank...

From the start of my responding to you my intent (although observably not my effect) was to offer a constructive criticism, not to label you as a crank. Capability-based-security on distributed and decentralized systems comprising mutually untrusted nodes - and languages to orchestrate computations across these -- rocks my world, man. You seem to have a great agenda, in that regard. I think that such languages will be a fine thing to have in the toolbox. A lot of the foundation for such has been laid already in the protocol definition space. In this vague area I think we are, as you put it, in "violent agreement".

There are just two ways in which I think, at least as you've expressed your work and as imperfectly as I understand you, you have some problems. I'll boil it down to two that are worth mentioning: One is that I think (and have tried to say repeatedly) that you would benefit from a tighter agenda and suffer from a lack of domain specificity - you start lumping together SETI at home, Google Maps, and Firefox and I can't help but roll my eyes at the buzzword soup. I'm sorry I'm rude that way, I'm not trying to be hurtful - I just think you'd get further with a tactic of a narrower initial focus. I understand I'm perceived as rude for saying such things but it makes no sense to me: criticism of that kind can be right or wrong and I might be either but in neither case is it rude (in my book). The second problem I think you have is applying the analytic framework you're working on to FEXPRs in a really forced way - the Schemish FEXPRs we're talking about (as both I and Mr. Shutt have pointed out) don't challenge the possibility of encapsulation (as you're discussing it) at all. I would add that they don't challenge the ability to define environments in which the style of encapsulation you describe is the default - the path of least resistance. There seems to be some kind of conflation of levels in the way you conclude that FEXPRs are problematic for your domain. If you don't want them for untrusted code (a proposition I'm not sure I agree with at all, but if that's what you want) then use FEXPRs to program their own elimination, in that environment. That's the kind of thing for which they are there.

Application: It's Obvious!

I just don't think you make any progress on any of those technical challenges by proposing a reconsideration of what an "application" comprises.

"It's obvious" is the death of creativity. Until you relinquish that which is obvious, you cannot make any progress... technological or otherwise.

A reconsideration of what 'application' comprises does not achieve any progress on its own. However, it can remove a significant barrier against progress and understanding within someone who happens to believe 'application' to be a fairly clear term.

none of the challenges of such systems as you listed are especially new

That's good. I would really hate to develop a GPPL to solve "especially new" problems.

If I were asked, I'd suggest that "especially new" problems are properly the province of libraries and frameworks and - if I were feeling really daring - maybe even a DSL!

Funny, how differently you and I think.

I would agree with you if you are saying that in the "modern era" there is a lot more such exchanges and execution going on than in the past. But "more" does not make for "different".

The fact that "more" does not make for "different" indicates to me something suitable for abstraction. What does it mean to you?

In the past, different instances of this problem have called for wildly different solutions so I don't see any chance of an overarching concept that will fix that.

The relevant question isn't whether the solutions were wildly different "in the past". There's always an exploration phase with new ideas, after all.

Nowadays, we have all sorts of frameworks that do the same darn things with regards to communications - publish-subscribe patterns, message passing, promise pipelining, transactions. And E, Mozart, Alice, Erlang, plus a variety of frameworks also provide studies on the subject of code-distribution. There, too, you will find tons of commonalities.

decades of history suggest that the concept of such composition breaks down, as a practical matter, into a possibly infinite set of special case solutions

Indeed. Composition, by nature, is combinatorial. Decades of history also suggest the proper way to handle composition of special cases: through composition of generic abstractions, each of which may individually be specialized to its purpose.

This is the basis for functional programming, for procedural programming, for object oriented programming, for module systems... it's even the basis for fexprs. I am baffled why you expect this well proven and systematic solution to suddenly fail at handling special cases in distributed programs.

you don't get a good domain against which to define a PL just by making an arbitrary heap of attributes you want programs in that PL to display

True. But you must think me a little on the daft side if you believe I'm just "making an arbitrary heap of attributes". The 'domain' I target with my PL is survivable multi-tiered command and control of distributed unmanned systems and sensor networks. Issues include data fusion, autonomy, coordination between systems, wireless bandwidth concerns, link losses, communications silence and jamming, power consumption, delegation of control, operations safety, operations security, loss of a system in enemy lands, and much more.

Unmanned systems today do not make it easy to get code onto the system or securely keep it updated. And the missions they can run autonomously are extremely limited.

"Low power consumption" was meant as a reductio kind of argument, as in, hey, why not also make a programming language that makes free ponies - that'd be cool, too

Well, to be honest, I did understand this.

Finally, FEXPRs don't cause any kind of encapsulation problem. Mr. Shutt showed you how to force operand evaluation in Kernel

I suspect you and I have a different idea, then, of what constitutes a 'problem'. I consider it a serious problem to have to write 'apply' before every use of an apparently-a-function variable that might have been obtained from another module. I especially consider it a problem to ask third-party developers to maintain this discipline. Security should be the default.

Though, in that vein, if he reversed it, so that 'apply' were the default and you had to write something like ($ f arg1 arg2 argN) every place a FEXPR 'f' might legitimately be used, at least he'd have encapsulation as the default - and it would be easier to statically analyze code for potential vulnerabilities.
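
To make that concrete, here is a rough Kernel-style sketch of what such a '$' marker might mean - this is purely my reading of the proposal, not anything from the Kernel report:

($define! $
   ($vau (f . operands) env
      ;; evaluate only the combiner position; hand the remaining operands,
      ;; unevaluated, to whatever combiner f names in the caller's environment
      (eval (cons (eval f env) operands) env)))

Under the proposed default, an ordinary call site would behave like apply (operands always evaluated), and only a call written through $ could expose unevaluated operands to an operative - making the dangerous case syntactically visible and easy to search for.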

To say, as you seem to be, that FEXPRs introduce some special kind of security problem is, at best, to wildly exaggerate.

FEXPRs violate encapsulation, by default. That isn't a special kind of security problem. But it is a security problem.

FEXPRs give a nice operational and abstract semantic model on which to build those enforced disciplines you care for in particular domains.

I've never had much luck using Macros to 'enforce' anything without them becoming a bit draconian. In part, this is because code-walkers have a terrible tendency to interfere with one another. I.e. how can you tell legitimate uses of FEXPRs from bad ones, when enforcing the discipline? Would you restrict the degree to which developers within the discipline may develop and use FEXPRs? (Doing that would certainly justify my claims!)

I know that staging and macros are safe in a capability language. It doesn't take me much analysis to validate that judgement. I'm not keen on giving up on simple reasoning, unless you can show me how to buy it back, with interest.

I think you'd get further with a tactic of a narrower initial focus

I'll take this under advisement. That said, I have no reason to suspect pushing my particular interests directly will lead any more quickly to elucidation.

I understand I'm perceived as rude for saying such things

Not at all. It's the '[...] application?!?? WTF [...]' that makes me perceive rudeness. ;-)

the Schemish FEXPRs we're talking about (as both I and Mr. Shutt have pointed out) don't challenge the possibility of encapsulation (as you're discussing it) at all

There's a world of difference between challenging the 'possibility' of safety, vs. challenging safety. There's a world of difference between challenging the 'possibility' of performance, and challenging performance. There's a world of difference between challenging the 'possibility' of security, vs. challenging security. And I'm sure you see where I'm going with 'encapsulation'...

When I'm developing a program, I'd rather not spend a lot of time working on things that are merely 'possible' that really should be 'easy'. I've dealt with that generic problem often enough when using thread-based concurrency models, mutexes, reentrancy issues for mutable collections on delete, and so on.

The goal should be to lower the barrier for development, not merely keep it from touching the ceiling.

Hmm. okay.

Clearly I misunderstood you. Sorry, I'm sick to death of bogus security claims and requirements made by or on behalf of, eg, copyright holders or would-be law enforcers in widely distributed file-sharing applications like gnutella, or by or on behalf of censors or anti-spam crusaders in widely distributed applications such as HTTP, SMTP, or NNTP. Those are intractable problems as long as these things are not under any central control, and besides a central point of control would also be a central point of failure.

Finally, it's the nature of widely distributed systems that, much as we gripe about their problems, neither we nor any trustworthy central authority actually want them under any particular central authority's control.

When you started using the same "widely distributed" rubric I thought you were talking about the same kind of thing, but you're still talking about something which has particular owners, a central point of control, and non-peer nodes -- not about something I ever would have called widely distributed.

Anyway, you raise an interesting point for hosted applications, including those run on server farms or geographically distributed.

But, regardless, you will have to sandbox any procedure that is supplied by the client anyway, so you can kill it (rather than shut down your entire node) if it fails to halt or requests capabilities you're unwilling to give it.

With some careful design, it should be possible to rapidly prove via static analysis that any particular client-provided code is a member of a subset that makes no unauthorized accesses, so the code can be "safe" in that way (even if this subset doesn't include *all* code that makes no unauthorized accesses, it can be broad enough to be useful).

But unless the client is constrained to a very limited (non-Turing-complete) subset of the language, you can't easily prove that it halts in any reasonable time bound, so running client-provided code needs a sandbox anyway.

Stay tuned for the following post

...

Looks like your second link

Looks like your second link is broken.

RE: particular owners, central control, non-peer nodes, et. al.

I'm sick to death of bogus security claims and requirements made by or on behalf of, eg, copyright holders or would-be law enforcers [...] When you started using the same "widely distributed" rubric I thought you were talking about the same kind of thing

I would not consider the distributed system 'secure' if law-enforcers and copyright holders had a special key. It would be far too vulnerable to insider attack.

you're still talking about something which has particular owners, a central point of control, and non-peer nodes -- not about something I ever would have called widely distributed.

I'm not talking about this, either!

particular owners...

First, 'ownership' itself has an unclear meaning when applied to the interactions between services. To grasp the notion, ask yourself: who 'owns' a TCP connection? Object capability discipline strongly encourages one to place most code into such interactions. Doing so greatly simplifies reasoning about security, among other properties. New services grow out of such interactions... and third parties can start interacting with these services, totally ignorant of how they were constructed... thus, continuing to muddy the ownership issues.

Second, between the capability model, use of sealers/unsealers, and distribution of transparent (= non-revocable) data, it is entirely possible to express a wide variety of ownership patterns in the code. One can even describe such things as transfer of ownership, exclusive rights, shared rights, etc.

Thus, while I can express a distributed service with a 'particular owner', I am by no means required to do so.

The only components with 'clear' ownership in a widely distributed programming system are the actuators (i.e. the brakes on my car, the traffic lights, my monitor) and the 'unique' (not especially pervasive) sensors. And these only because they possess physical counterparts, and a lot of existing law surrounds such ownership.

a central point of control

I'm curious as to how you gained the impression I was talking about a central point of control.

Control of and access to service is easily distributed (that's what capabilities do). Of course, you'd typically want to distribute different controls to different groups of users.

There is no particular node you can shut down to stop a service - unless, of course, that service happens to be a sensor, actuator, or dependent upon one. Hosting of most services may fully or partially be distributed.

and non-peer nodes

It is unclear to me what you mean.

If you mean that not all nodes receive equal degrees of trust, that is true. But trust is not a property of a node! It is a relationship between a node and a developer. Given a million developers, you'll certainly see some sets of nodes that are very widely trusted and thus serve as a 'trusted backbone'. But there is nothing special about those nodes. What is 'special' is the shared confidence of disparate developers in a common third party.

If you mean that not all nodes have equal capabilities, that is certainly true, but is rather trivial. For example, there is no other monitor identical - in space, and time - to my own.

If you were thinking something else, I ask you to clarify.

you will have to sandbox any procedure that is supplied by the client anyway, so you can kill it (rather than shut down your entire node) if it fails to halt [...] unless the client is constrained to a very limited (non-Turing-complete) subset of the language, you can't easily prove that it halts in any reasonable time bound, so running client-provided code needs a sandbox anyway

I agree that you'll want to support process accounting. I do not see why a sandbox is appropriate for achieving this. This seems a case of asking for a child's toy shovel when what you really want is a six-foot hole-in-the-ground for misbehaving processes.

What I'd like to see is good support for pay-as-you-play. Even if that payment is in meaningless tokens from a purse, it would allow useful external control over the process behavior. If the tokens were truly redeemable for other resources, then both out-of-control computes and Denial of Service attacks could become self-regulating.

But process accounting is one of those problems that I haven't yet integrated cleanly into the rest of my language. (OTOH, I do make useful guarantees about termination up to message-passing cycles.)

you will have to sandbox any procedure that is supplied by the client anyway, so you can kill it if it requests capabilities you're unwilling to give it

The object capability discipline and its design patterns involve, to a large degree, doing away with "requests" for capabilities - i.e., the ability to turn strings into power. And for very good reasons:

  1. such features complicate the problem by requiring both a string and a capability to obtain something useful
  2. it is difficult to understand the context of such a request

There are related patterns (including match-maker, factory, dependency injection, and powerbox). But match-makers and factories and dependency-injection aren't about security (though they can be composed securely). And a powerbox suffers the problems listed above, though one might hope for a 'user' to understand the context for a request, which is why the powerbox is used in UI.

But it is true that, for distributed code, you'll sometimes want to access 'local' resources. For example, suppose access to 'time' is not a language primitive (it isn't in my language). It would be rather inconvenient - for performance, disruption-tolerance, and resilience - if one had to use a capability half-way around the world just to access the time on a particular remote node.

To handle this issue, most languages that recognize distribution also recognize the unum pattern or something similar. Essentially, a capability (to a specific clock, on a particular node, half-way around the world) is distributed along with an annotation that declares, "hey! I'm just a clock! I don't mind if you implement me locally! Really, I'd prefer it if you do!". The host, then, can implement the clock locally... or even ask a neighboring node. This pattern applies to regular in-language resources, pervasive sensors, and common FFI services. (A slight tweak and this is runtime-plugin-extensible and works for purely local stuff, too.)
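
As a toy sketch of the idea (the names and representation here are made up for illustration; no particular system's API is implied), a capability might travel together with a hint naming a locally substitutable implementation:

(define (make-unum remote-cap hint)           ; hint might be a tag such as 'clock
  (list 'unum remote-cap hint))

(define (resolve-unum u local-registry)
  ;; prefer a locally hosted implementation when the hint names one we can supply
  (let ((local (assq (caddr u) local-registry)))
    (if local (cdr local) (cadr u))))

;; e.g. (resolve-unum (make-unum remote-clock 'clock)
;;                    (list (cons 'clock local-clock)))   ; => local-clock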

Also, in case it was unclear to you: in an object-capability model, authority is not distributed based on locality.

The user story is more like: "I gave mozilla.org temporary/limited access to my graphics, mouse, and keyboard. The first time I did this, they sent a bunch of code to live 'nearby' this capability, to help shape the display into windows and buttons and text. The second time I did this, that code was still cached locally, so it popped right up."

Since authority is orthogonal to location, and since capabilities are never 'refused', I say: for object capability systems, there are no sandboxes.

it should be possible to rapidly prove via static analysis that any particular client-provided code is a member of a subset that makes no unauthorized accesses

Indeed. In object capability languages, any code whatsoever meets that criterion. Analysis complete. It's rather convenient.

The beauty of the object capability model is that the ability to make a request is proof that you are authorized to do so.
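
A toy Scheme rendering of that point (illustrative only): the only way untrusted code can reach a resource is to have been handed a reference to it, so reachability and authorization coincide.

(define (make-logger port)
  (lambda (msg) (display msg port) (newline port)))

(define (run-untrusted code . granted-caps)
  ;; the code sees exactly the capabilities passed here - no ambient authority
  (apply code granted-caps))

;; (run-untrusted (lambda (log) (log "hello"))
;;                (make-logger (current-output-port)))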

In any publicly distributed

In any publicly distributed system with mobile code, encapsulation as required by security concerns seems impossible.

Encapsulation of any migrating code is certainly impossible to ensure (absent homomorphic encryption, or TPM of some sort). Capability systems don't rely on encapsulation of migrating code though. Instead, all objects are tied to known hosts and secure remote references and message pipelining provide the necessary distribution mechanisms (generally via futures/promises, as in E), and persistence and replication provide the necessary fault tolerance. Encapsulation safety properties are thus ensured, and can be relied upon for reasoning about security.

Projecting too much E

Capability systems don't rely on encapsulation of migrating code though. Instead, all objects are tied to known hosts and secure remote references and message pipelining provide the necessary distribution mechanisms [...]

I don't think the 'tied down' or 'known hosts' or 'message pipelining' bits are true for 'Capability systems' in general.

Of course, but to maintain

Of course, but to maintain encapsulation any migration must happen only between hosts in the same TCB. Message pipelining is important for any high-performance distributed messaging, though not strictly necessary.

to maintain encapsulation

to maintain encapsulation any migration must happen only between hosts in the same TCB

For a security problem to exist, there must be both a 'vulnerability' and a 'threat'. Thus, allowing vulnerability is okay if you can avoid the threat. If the goal is to maintain code integrity and encapsulation, you do not need to stick to a TCB. It is sufficient to distribute to nodes that have no self-interest in violating that integrity.

This mostly means you can't distribute information or authorities (including reply authority) among nodes that will actually care. A lot of code contains no special information or authorities. Though you are certainly free to do a step better and distribute such among nodes that have a reputation to maintain (i.e. as part of a market of CPU, Memory, and Network resources).

Also, 'the same TCB' suggests you consider the concept to be unique to any given host and common to all the code it hosts. I've spent a lot of time thinking about how to help automate correct code-distribution decisions for tierless programming, and one of my insights was that nodes can be certified for multiple security domains, and that code can be authorized for multiple security domains. Also, one can find user-stories where different components of the same service are specified for different security domains.

If the goal is to maintain

If the goal is to maintain code integrity and encapsulation, you do not need to stick to a TCB. It is sufficient to distribute to nodes that have no self-interest in violating that integrity.

Sure, I was being conservative. You can cut it more finely, but you're more likely to make a mistake. I'd be more comfortable with those finer distinctions if you could express such properties for a tool or analysis to verify, though I'm not sure how to express high-level knowledge of "incentives". Sounds like an interesting research project. :-)

Also, 'the same TCB' suggests you consider the concept to be unique to any given host and common to all the code it hosts.

I don't mean host in the machine sense, I mean it in the E/Vat sense, ie. a local domain of shared vulnerability.

fexprs, trust, etc.

dmbarbour, your points about security requirements for certain kinds of distributed computing about which one might speculate are well taken, but they miss some key points about the context in which FEXPRs came to be discussed for Scheme.

To paraphrase the "Working Group 1" charter, some of the goals of the current effort include:

  1. creating a small and simple language definition
  2. which admits small and simple as well as sophisticated implementations
  3. and is a suitable platform for language experimentation and implementing new languages
  4. and includes facilities for encapsulating third party bodies of code (modules)

That isn't an exhaustive list of the goals.

Your project of language innovation for massive distributed systems looks from my perspective like it is a target application of the ideal WG1 Scheme. That is, WG1 Scheme would succeed, in this instance, if you found that it was a good environment in which to at least prototype your language. Thereafter you might or might not discover it convenient to retain the Scheme foundation: who knows. But if WG1 Scheme were already here and had turned out along the lines I envisioned for it, then we would have done our job well if you thought "Aha, this is a nice kit for at least quickly trying out my ideas."

That said, I'm a bit surprised that other virtues of a tiny Scheme-like language don't have more appeal to you in a massively distributed context. In particular, a tiny Scheme-like language admits very tiny, low-latency, memory-efficient implementations and yet is known to be a very flexible tool for specifying computations whose programs are relatively small. Having Schemes at every "node" seems like a potentially good strategy for your goals, at least to me.

"Schemes at every node" fails Scalability

your points [...] are well taken but miss some key points about the context in which FEXPRs came to be discussed for Scheme

I don't believe a discussion about a system feature - such as first-class environments or FEXPRs - is complete until it covers a wide variety of contexts. That includes interaction or interference with other potential features (especially those that interest me ;-). It isn't that I missed the context assumed at the start of this discussion, so much as I chose to not be constrained by it.

Your project of language innovation for massive distributed systems looks from my perspective like it is a target application of the ideal WG1 Scheme. [...] I'm a bit surprised that other virtues of a tiny Scheme-like language don't have more appeal to you in a massively distributed context. [...] Having Schemes at every "node" seems like a potentially good strategy for your goals, at least to me.

I did, in fact, start with a 'tiny Scheme-like language' when I first became enamored with massively distributed systems, back in 2003. I even treated it as an application, as it did not occur to me until a couple years later that concerns for security, efficient communications, distribution, persistence, and expression (language), were so tightly integrated.

My Scheme-based design fell to scalability years before I was focusing attention on security.

My goal, even back in 2003, was "Internet-scale", which was given a hand-wavy estimate of "millions of developers and billions of nodes". I could not (can not) fully grok that scale. Fortunately, I do not need to: I can look at the Internet, and study known problems that crop up in distributed systems programming. These include: node failures are regular, disruptions are not uncommon and tend to segregate subnets, node failure cannot be readily distinguished from temporary disruption, the slashdot effect is common and looks an awful lot like distributed denial of service, developers have different interests that may be adversarial, nodes may be mobile, and nodes are heterogeneous but rarely unique (even for sensors and effectors, such as access to a particular printer, authority tends to belong to a subset of nodes). I'm sure I could go on.

In this environment, a developer must reason about security (authority, information assurance), partial-failure, graceful degradation, resilience (self-healing, redundancy, persistence), disruption tolerance, progress and concurrency (delay, termination, race-conditions, deadlock, livelock, non-determinism), maintenance and upgrade, integration, validation, safety, performance (efficiency, utilization, load-balancing, latencies, real-time or embedded), resource accounting, and even packaging, marketing, licensing. (Code-distribution is useful for at least six of these reasons: disruption tolerance, load-balancing, redundancy, latency, bandwidth efficiency, and irrevocable licensing.) Even a distributed systems expert would have a terrible time reconciling all of these issues without support, and only a tiny minority of "millions of developers" will be distributed systems experts. I conclude that the system must not only support reasoning, but must also reduce need for such reasoning in nearly all cases... i.e. by automating correct distribution of code; by simplifying integration, validation, upgrade; by guiding developers towards a correct solution as the path-of-least-resistance. Lowering the barriers to distributed development is a prerequisite for scalable composition in open distributed systems. Only fools will open a distributed system if it is easy for themselves and others to get things wrong.

That is where Scheme fails. Several of Scheme's lowest-level decisions - including strict evaluation, side-effects in arbitrary functions, and widespread support for mutations (of structured data and environment variables) - do much to hinder reasoning about progress, concurrency, security, and where failure may occur, in addition to making code-distribution very inefficient (because all mutable state must be maintained across replicas).

If I continued with Scheme, I would have needed to change pretty much everything except the parentheses. And, to be honest, I've never been too enamored of those parentheses.

That said, I have taken inspiration from Scheme (among other languages, such as Charity and Oz). Similarly to Scheme's goals, I avoid overlap between language primitives, such that I can rightfully assert 'language minimalism' - that I cannot remove any primitive without losing something of critical importance to a large number of use-cases. I'll refrain from discussing exactly which primitives I'm favoring and why (as that discussion would continue all week).

A big difference between Scheme's philosophies and my own, though, is in the concepts of 'program' and 'module'. I consider 'program' to be a continuous concept, both in time and space. There is, in my philosophy, no such thing as a useful 'relatively small program', because every useful program involves something very large - external clients and services, markets and discovery, maintenance life-cycles, sensors and actuators - the whole world, present and future. While there are programs that require relatively little 'new' code, I measure program size by including every dependency - including services and libraries, servers and clients, the language runtime and link-loader, the Operating System and its drivers, the installation framework, the market integration (even including PayPal, if you use it), and so on. I believe that this global scale view is important. We too easily lose sight of what our decisions cost the forest while we focus on the trees. We too easily claim victory for optimizing in-the-small even at cost to performance in-the-large.

After studying distributed systems for a while, 'applications', 'plugins', 'third-party modules' and such all start looking like inefficient, untrustworthy, difficult-to-upgrade, difficult-to-validate, difficult-to-manage-and-configure, and otherwise horribly sado-masochistic mechanisms for what should be relatively simple issues of code distribution.

You speak of "tiny, low-latency, memory-efficient implementations" for Scheme. This is certainly true in some relative sense (compared to other, similar products). But, to help elucidate my position, I will claim that no such thing exists... not if you measure efficiency, latency, and memory costs on a more global scale. Include the time and bandwidth for downloading the code. Include the total space on hard-disks globally. Include the inefficiencies of OS-level process separation - process startup, process shutdown, and page sizes. Include various forms of overhead: context-switches, serialization, parsing, encryption, and decryption for messages passed between processes on the same machine. Include life-cycle costs of maintenance, upgrade, and eventual disposal. Include the human costs and latencies for each installation.

When we consider the global scale, the web applications today often perform at least as well as anything we download and run locally. My goal is to do much better than that, and not only for applications. There is more than enough waste - among both users and developers - that could be avoided.

Trust in Kernel

With regard to security in Kernel of one module against accidentally-allowed compromise by another module, there are some details that I consider to be not yet as fully resolved as I would like them to be. However, based on... well, ultimately, an intuitive sense of the ebb and flow of the Kernel design, developed from being immersed in it for years... I believe that these details are just that: details, things that can be dealt with.

Something that I suspect may be an important part of the answers (the answers that I don't have all of yet) is standard applicative apply, which has the (non-robust) library derivation

($define! apply
   ($lambda (appv arg . opt)
      (eval (cons (unwrap appv) arg)
            ($if (null? opt) (make-environment) (car opt)))))

If the optional environment argument isn't provided to apply, the call is made in a freshly created empty environment. If the combiner argument isn't applicative, it isn't called, because an error occurs when apply tries to unwrap it. And because apply is itself applicative, the object that gets passed to the unwrapped combiner is the result of an evaluation. So a call to a combiner by means of apply is pretty safe. To illustrate, here's a classic example of what can go wrong when using fexprs.

($define! compose ($lambda (f g) ($lambda (x) (f (g x)))))

There are two different problems here. Assuming that we have no control over what f and g are,

  • either one of them might be an operative, which would therefore have access to its unevaluated operand. That's information we didn't want f or g to have (although, thankfully, at least it won't be able to mutate the operand, because when Kernel constructs a compound combiner it stores an immutable copy of the body). And,
  • either one of them might capture its dynamic environment, which is the local environment created for the particular call to compose (although, thankfully, capturing that local environment won't automatically compromise the static environment of compose, because there's no general way to extract the ancestry of an environment). Of the two, this is the liability that mainly concerns me; the operand capture is a trifle by comparison.

Both of these problems would go away if one wrote compose using apply, which would both guarantee argument evaluation and prevent environment capture. Interestingly, it would also naturally favor a slightly different common behavior of compose,

($define! compose ($lambda (f g) ($lambda x (apply f (apply g x)))))

although one could get the earlier common behavior by writing "(apply f (list (apply g x)))". The trouble here is that it's still way too easy to accidentally write the unsafe version of compose. I've been aware of this problem since the first years of the Kernel design, of course, and while some possible approaches have come to mind, nothing has felt so obviously "right" that I've been moved to adopt it. I'm "waiting for insight" on this (rather as Ray reports doing), and meanwhile there is no shortage of other things about fexprs to occupy my time while I'm waiting.

BTW, it's worth noting that a first-class environment in Kernel is, in essence, a capability for read/write access to a local binding frame. Just as a first-class continuation is a sort of control-transfer capability.

Your brain on fexprs

I find that even if these "new-style fexprs" and FCEs we have been discussing are somewhat uncharted territory (e.g. compilation), they make thinking about the issues related to macros, phasing, and hygiene much more tractable.

For example, one of the insights I gained from thinking in terms of fexprs was that phase separation, currently a topic in Scheme, is simply an optimization: if the compiler can compile a given expression ahead of time – great! This includes evaluating a piece of code twice, or N times: once at compile-time (and once at the compile-time of that compile-time, ...), and once at runtime.

With fexprs, it becomes clearer that macros are related to inlined functions. With inlined functions it is self-evident that alpha-renaming (i.e. hygiene) is a good and necessary thing. The same holds for macros. Furthermore, a fexpr's semantics are, as you have stressed, runtime semantics: you don't say what you want done (as with current macros), you simply do it. Ahead-of-time compilation thus seems possible only when the fexpr's body is amenable to a kind of offline partial evaluation. However, the nice thing about fexpr semantics is: if you can't run it at compile-time, you simply run it at runtime (with a warning, possibly).
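
For instance, here is a minimal Kernel-style sketch (assuming $vau, eval, and $sequence as in Kernel) of a "macro-like" operative; a compiler that can see its operands statically may evaluate parts of it ahead of time, and otherwise exactly the same definition simply runs at runtime:

($define! $unless
   ($vau (test . body) env
      ($if (eval test env)
           #inert
           (eval (cons $sequence body) env))))   ; run the body as a sequence

Nothing in the definition mentions phases; phase separation shows up only as the question of when the evaluator happens to run this code.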

One man's non-local expression of concepts is another man's EDSL

You can peek inside operands that were developed in non-local modules, and thus introduce behaviors dependent on how concepts are expressed non-locally. For composition, this is bad...

The point of embedded DSLs (which is kinda the point of macros) is exactly to introduce and enable such non-local language uses, i.e. the non-local expression of concepts defined in the EDSL.

Whether that's bad for composition per se is a different question. I can imagine that combining a very complicated and esoteric EDSL (for example, a codewalker that introduces nondeterminism, like SCREAMER, or a control-flow manipulating macro suite such as SERIES) with another EDSL could be problematic.

However, many macro (and EDSL) uses are indeed very local affairs that introduce little or no trouble whatsoever for composition. (Think of defclass, for example - a prominent macro that's merely cosmetics for creating or updating a class object.)

One man's misunderstanding is another man's miscommunication...

My assertion applies specifically to "how concepts are expressed non-locally", not "which concepts are expressed non-locally". By peeking inside operands, for example, an operative can distinguish whether '(+ 1 1)' or '2' was used in some external component.
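
A minimal Kernel-style illustration of that kind of peeking (a deliberately trivial operative, written here only to show the leak):

($define! $show-operand
   ($vau (x) #ignore
      x))                     ; returns the operand as written, unevaluated

($show-operand (+ 1 1))       ; => (+ 1 1)
($show-operand 2)             ; => 2

A consumer written this way can observe how an expression was spelled in another component, not merely what it evaluates to - which is exactly the composition hazard being described.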

If the result of utilizing an EDSL is a bunch of domain data that is intended for external use, then so be it. Composition issues only arise when the consumer of data can poke around and discover how a non-local component was manufacturing that data.

Since it seems I was unclear: local syntax extensions, including ye' traditional macros (even the non-hygienic ones), do not suffer this problem. Anything you are using macros for today - excepting compile-time side-effects - you could continue doing without raising composition and modularity issues. Also, 'local syntax extensions' can still be subject to modular export and import. The 'local' refers to how their application affects the rest of the program, not to the distance between definition and application.

Anyhow, I don't believe DSLs are the point of macros. You'll end up writing domain-specific languages even without macros... you'll have domain-specific types, domain-specific domain models, domain-specific queries, and so on. I remember writing more than a few of these in Haskell. Syntax extensions only provide the finishing touch, to clean up 'syntactic noise' and boiler-plate, to simplify expression of the DSL.

Boilerplate-averse programming

Anyhow, I don't believe DSLs are the point of macros. You'll end up writing domain-specific languages even without macros... you'll have domain-specific types, domain-specific domain models, domain-specific queries, and so on. I remember writing more than a few of these in Haskell. Syntax extensions only provide the finishing touch, to clean up 'syntactic noise' and boiler-plate, to simplify expression of the DSL.

Good point, though I don't think it is entirely accurate. For example, in all my Java projects I created "domain-specific types, domain-specific domain models, domain-specific queries". But I still have to write out the boilerplate every time. In fact, when I look at the expansions of Lisp macros, they often look exactly like the code that I would have to write in Java by hand. I wouldn't call that a finishing touch. "Mere" syntactic abstraction enables a qualitatively better programming experience.
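
A small standard-Scheme illustration of the difference (a throwaway macro, invented here just to show the shape of the boilerplate being folded away):

(define-syntax define-getters
  (syntax-rules ()
    ((_ (name index) ...)
     (begin
       (define (name v) (vector-ref v index))   ; each accessor written once, here
       ...))))

;; one line instead of three hand-written accessor definitions:
(define-getters (point-x 0) (point-y 1) (point-z 2))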

Just let the happy campers camp

Steven, I don't see the point in doubting the need for a language device that (1) has been in daily heavy (and happy) use by many advanced programmers for decades (2) has demonstrably enabled Lisp to be at the forefront of dynamic languages for decades (3) is explained, analysed, and documented in a huge body of research papers, programming languages, and folklore.

You may well not like macros, and there are indeed many valid arguments against them, but doubting their need without proposing an alternative that replaces all the documented use cases of macros is kinda pointless, IMO. And "a little syntactic sugar for arbitrary monads" for example, is not one such alternative, in the context of Lisp, as this would require a huge reimagining and reengineering of core Lisp, something it has withstood for decades, happily.

Arguing against macros in a Lisp context is a bit like telling C programmers to stop using pointers.

I think we misunderstood

I think we misunderstood each other, I just said that I don't see the need for macros in a "modern language", and I deliberately exclude Lisp here :-)

WAIT!!! Don't start typing an angry response yet. I totally acknowledge what Lisp has done for programming and that it is still superior to, say, Java. But projects like "Kernel" show that there is still much room for experimentation and improvement in the Lisp arena.

I am currently in the process of developing my own dynamically-typed programming language Babel-17. Despite being dynamically-typed, Babel-17 is closer in spirit to Standard ML and Scala than Lisp. There won't be macros in Babel-17, and probably fexprs won't be a part of it, either. But I am definitely interested in the applications of macros and fexprs and how to provide language features for the important ones.

Modern

What exactly is modern about SML or Scala vis-a-vis Common Lisp, when you take static typechecking out of the equation?

Update: for SML I can see higher-order modules, and I don't know if Scala has an advanced module system.

Scala is actually all about

Scala is actually all about modules, both in the small and in the large.

What I like about Scala is that both objects and functions are taken seriously. The Lisp object system might be powerful, but it makes its generalizations in the wrong places. I don't really need multi-methods, but without encapsulation objects are much less interesting (because they cannot act as modules, then).

I think what I mean by modern is that the language tries to come up with a well-designed tension between the freedom that the programmer is given, and the guarantees that the language gives (to the programmer and the interpreter/compiler). With Lisp, I don't feel that tension, it is all on the freedom side.

Not Your Father's Lisp

I agree that Common Lisp and most Scheme implementations are not exploring new ground in programming languages. PLT Scheme is an exception. It has higher order modules, an interoperable statically typed variant, and a whole heap more all built on probably the world's most advanced macro system; the base language is just functions and structures. And yes, it does have strong abstraction boundaries.

Also Clojure

Thanks for the reference of PLT Scheme, I knew it was around, but never really looked at it. Its ideas of modules and objects (or structs) look pretty similar to what I want to put into Babel-17. Also Clojure has a lot of aspects that I really like.

The one serious reason I will never become a fan of Lisp is its syntax.
Although it is very uniform, with its incredible number of brackets, it is uniform in a way that is good for computers, not for me (and many other people; I am sure there are people who love this kind of syntax - usually the people who immediately spot the slightest typo in a lecture about logic).

Static analysis and compilation

My own interest in the evolution of Scheme is the elimination of eval and working toward a language predicated only on static compilation to machine code. Also, hopefully, a simple static code analysis model, separate compilation, and perhaps whole-program compilation leading to more potential optimizations.

The interpretative nature of Scheme is unimportant to me, whether it leads to a more minimal core language or not. Given the constraints of static compilation, yes, a minimal core is still a good ideal to shoot for.

I guess I am unclear on whether a first class environments language feature is still open to reasonable static compilation of Scheme programs?

Are any other folks out there seriously working on an "interpretation free" Scheme definition, free of eval and other runtime-only language semantics?

Statically compilable Scheme

As far as I can see, FEXPRs are impossible and first-class environments are in principle useless unless you have eval in the language, and also still have some representation of the abstract syntax tree (which in Lisps is effectively the source code) in memory at runtime.

It sounds like Scheme reached its pinnacle of usefulness to you with the R4RS/IEEE1178 standards; those languages were effectively compilable to pure machine code. That's also a valid vision of the future of Scheme; just not the path that the Scheme reports actually travelled. There's a significant and worthy niche in the design space for a statically compiled Lisp drastically simpler than Common Lisp, and I encourage you to use, maintain, or create one if that's your passion.

This is just one of the issues on which the scheme community was (and, I suppose, is) divided. I remember 'eval' and other runtime constructs being a huge deal in the R5RS discussions. R5RS was a major divergence from the path of being a statically compilable language. There was a lot of heated discussion about the dynamic features, but almost no one (except Jeffrey Mark Siskind) then took the really hard line of just flatly refusing the new standard, and the community did not have a schism at that time of the kind that R6RS caused.

Siskind had a very good reason for not implementing the dynamic features of R5RS; he is the author of Stalin, which is possibly the most advanced static optimizing compiler ever made. Its long compile times are not to my taste, but it produces blazing fast optimized code and, in fact, it sounds like Stalin is exactly the system you want.

further reading at:
http://en.wikipedia.org/wiki/Stalin_(Scheme_implementation)

Downloadable Gzipped Tarball, including source and documentation, at:
http://cobweb.ecn.purdue.edu/~qobi/software.html

Stalin hasn't been updated in some time; I have the impression that Siskind considers it "finished" and considers that there is nothing more to do. I think I remember that some folks intended to fork it via a sourceforge project, but I haven't heard anything about that since it happened.

ISLisp is also statically compilable...

... but uses the conventions of Common Lisp; it's very close to being a Common Lisp subset. I have a back-burner project to write an ISLisp compiler; the first step will be an interpreter, because ISLisp's macros are DEFMACROs, so you need an interpreter at compile time but not at run time!

Information at http://islisp.info .

re: static analysis and compilation

I guess I am unclear on whether a first class environments language feature is still open to reasonable static compilation of Scheme programs?

A good rule of thumb, I think, is Dybvig's notion that esoteric features should "pay their own way" - that is, they are admissible so long as they don't "tax" the performance of code which doesn't use them.

Suppose that you had a fully static version of Scheme with just syntax-case macros. And you have (and can find, extant in the real world) highly optimizing compilers for this.

Now, atop that, you can write a bunch of macros which define "(the-environment)" to, in every context, return a procedure that lets you set or get arbitrary variables from the lexical environment by name. You'll have to write macros to replace LAMBDA, LET, and so forth - and these macros get quite tedious - but you can do it. "(the-environment)" expands into something like "(lambda (var-name . optional-value) (case var-name ((x) ...) ...))". For the sake of discussion we can say that ((the-environment) 'x) returns the value of X and ((the-environment) 'x 42) sets X.

All of the code you write that does not use (the-environment) will compile just as it ever did, assuming the compiler is smart enough to eliminate the dead code definitions of the-environment. Any code that does use the-environment may very well take a modest performance hit, depending on how you use the reified environment. If you write ((the-environment) 'x 3) probably the compiler will trivially expand the macro and ultimately reduce this to an efficient (set! x 3) but if you let the reified environment escape or if you give it a dynamically computed variable name -- now your variables all have to be boxed somewhere and this limits the available optimizations. "Pay as you go."

How's about EVAL? Well, worst case, screw it! Just type in a variation on the meta-circular interpreter you find in SICP but make it use your new conveniently reified environments and you've got an EVAL that works - but that is slow compared to statically compiled code. Pay as you go, again, and extant and historic compiler-based implementations take this approach and yield a useful EVAL. It's useful enough for any undergrad course in Scheme. It's useful enough for an extension language in an editor. Etc. Many, many optimizations are available from there but the ground floor is already useful and "all" you did is write an optimizing compiler, and pass it a meta-circular evaluator written in Scheme.
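
As a sketch of what that looks like (a deliberately tiny evaluator over the reified-environment protocol described above, where (env 'x) reads x and (env 'x v) assigns it; this is not the SICP code itself, just the shape of the idea):

(define (tiny-eval expr env)
  (cond ((symbol? expr) (env expr))                        ; variable reference
        ((not (pair? expr)) expr)                          ; self-evaluating datum
        ((eq? (car expr) 'quote) (cadr expr))
        ((eq? (car expr) 'set!)
         (env (cadr expr) (tiny-eval (caddr expr) env)))   ; assignment via env
        ((eq? (car expr) 'if)
         (if (tiny-eval (cadr expr) env)
             (tiny-eval (caddr expr) env)
             (tiny-eval (cadddr expr) env)))
        (else (apply (tiny-eval (car expr) env)            ; procedure call
                     (map (lambda (e) (tiny-eval e env)) (cdr expr))))))

Growing this toward a full EVAL (lambda, definitions, and so on) is routine; the point is only that the reified environments already give it everything it needs to resolve and assign variables.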

Are any other folks out there seriously working on an "interpretation free" Scheme definition, free of eval and other runtime-only language semantics?

You will have trouble coherently explicating what exactly you mean by "free of eval and other runtime-only language semantics". One understands that you mean to minimize what exactly a compiler has to deal with -- the informal meaning is clear and sound -- but by the time you've got an optimizing Scheme compiler along those lines then the "run-time only" parts can be added via a library.

In the interzone of neutral territory where compiler fans and interpreter fans meet - the issue of FEXPRs, EVAL, and FCEs as it relates to language specification comes down to economy of expression in defining the overall language. I argue that it is more expositionally economic to define the semantics of your highly optimizable subset of the overall language by starting with a core that includes FEXPRs, EVAL, and FCEs. The "great divide" between an interpreted and an optimized compiled version of Scheme is greatly exaggerated except in just one sense: programmers concerned with performance have to be mindful of when they are invoking features that the compiler can't do much with. It should be easy to avoid doing so, and the optimizable subset of the language should be quite useful on its own.

reminds me of javascript and its eval

Javascript's eval is specified to operate within the lexical (and dynamic) context of its invocation, at least in ES3. ES5 appears to be changing this to be more static. I am having a tough time digging up links about this, though.

Environmental mutation

Edit: Withdrawn. Addressed by Mr. Lord higher in the thread.

If you write ((the-environment) 'x 3) probably the compiler will trivially expand the macro and ultimately reduce this to an efficient (set! x 3)

Why would this be compiled into (set! x 3) instead of (define x 3) with shadowing semantics? To my naive understanding, the former corresponds to value mutation and the latter corresponds to environmental mutation.

Say you have a fexpr constructed by a call to f, its parameter p and its environment, e. Would calling (e 'x 42) only affect the binding of x within p akin to let, or would it also affect operations following the call to f akin to define?

re: environment mutation

[why would] ((the-environment) 'x 3) [mean] (set! x 3) [rather than] (define x 3)

There are a few reasons and, also, if you want define you can obtain it from this core Scheme:

Unproblematic Semantics: If by default environments permit only referencing and setting extant variables, their semantics is both useful and simple. Consider code like this:

(lambda (a)
  (lambda (b)
    ... (the-environment) ...))

This is (essentially) equivalent to:

(lambda (a)
  (lambda (b)
    ...
    (lambda (var . opt-value)
      (case var
        ((b) (if (null? opt-value)
                 b
                 (set! b (car opt-value))))
        ((a) (if (null? opt-value)
                 a
                 (set! a (car opt-value))))
        [... etc for other lexically apparent variables ...]
        (else [... signal an error ...])))
    ...))

There are just no semantic surprises there. The procedure returned by (the-environment) can escape and be used any which way from Tuesday and nothing will happen that isn't easily understandable in terms of traditional, fexprless Scheme. If that environment procedure does escape (or is used in certain ways) then it's very clear that some potential compiler optimizations are thwarted but that's OK (from my perspective) because I assume we've agreed that first class environments should pay their own way.

In contrast, let's hypothesize that environments by default somehow allow define. Then consider this procedure:

((lambda (a)
   (let ((b 42))
     (let ((w (lambda () a))
           (x (the-environment))
           (y (lambda () a))
           (z (lambda () q)))
       (values w x y z))))
  13)

That returns four values, each a procedure. I'll call them by their internal names (w, x, y, and z). Suppose I call w - I'd expect 13. Same thing if I call y. It's unclear whether the definition of z should be accepted in the first place but, supposing it is, and I call it, I should get an unbound variable error. Then, using the environment x, I do the equivalent of (define a 'hrm) and also (define q 'ha). Now, what should calling w, y, and z do?

I don't know any one right and obviously good and useful answer to the question. So, I omit that capability by default. (Your mileage may vary. Kernel specifically provides an answer to those questions.)

Don't Need it (for Scheme): Internal defines in standard Scheme are (a) quite restricted in use and (b) not FEXPRs. In particular, internal defines in Scheme are not themselves FEXPRs but, rather, are in every case a syntactic feature of some enclosing FEXPR. For example, if I write (lambda (a) (define b ...) ...) the define is a clause of the lambda - not a stand-alone thing. Scheme restricts where internal defines can be used and makes them a syntactic property of an enclosing form precisely to avoid weird semantic questions similar to the one I illustrated above (with the "w,x,y,z" example).

You can have it anyway: There is a great deal of flexibility that comes with giving environments and procedures procedural interfaces: it's easy to define "non-standard" environments and to compose standard and non-standard environments in various ways. As one example that's actually important to the "tiny core Scheme" I'm describing: if you want extensible, mutable top-level environments (including the possibility of modules), you get them by defining non-standard environments, extending EVAL, and taking over the reading and interpretation of subsequent source (e.g., with a REPL, although you can also provide environments that look more like how a static compiler is usually expected to treat things).
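
As one hypothetical illustration of a "non-standard" environment in the procedural style used above (make-restricted-env and this particular design are inventions for this comment, not part of the proposal): a wrapper that exposes only a chosen set of names from a base environment, a crude building block for module exports.

(define (make-restricted-env base exports)
  ;; base is a procedure-style environment; exports is a list of symbols
  (lambda (var . opt-value)
    (if (memq var exports)
        (apply base var opt-value)     ; delegate reference or set! to base
        (error "unbound variable" var))))

Handing such a wrapper (rather than the base environment itself) to EVAL-like machinery is one way to compose standard and non-standard environments.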

Would calling (e 'x 42) only affect the binding of x within p akin to let, or would it also affect operations following the call to f akin to define?

Neither. It would be as if a (set! x 42) were executed at the position where f is called. The effect of the set! is apparent even after f returns - but it is not like define. If you wanted to specify a semantics for (dynamic invocations of) define in this context you could do it, although the caller would have to agree to let you by providing you with a non-standard environment.
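
A tiny sketch of that answer, assuming the the-environment construct discussed above (here f is an ordinary procedure handed the caller's environment explicitly, rather than a true fexpr, but the effect on x is the point):

(define (f e) (e 'x 42))        ; f sets x in whatever environment it is given

(let ((x 1))
  (f (the-environment))         ; as if (set! x 42) were executed right here
  x)                            ; => 42: the existing binding was mutated;
                                ;        no new binding was created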

Typed Scheme

Have you looked at Typed Scheme? It goes a long way to making useful optimisation simple. In the last few months it has become complete enough to be used for general hacking with only the occasional bit of pain.

Typed Scheme Totally Rocks

It's a lovely language in principle. I've made attempts in the past to use it for the intermediate representation and code analysis/transformation of my ongoing "toy" compiler experiment (an ML'ish language with OO features, co-routines, representation optimizations, etc.), but I hit "potholes" along that road.

Just from following the email list, I can see that many issues have been resolved or clarified (integrating Typed Scheme with vectors, pattern matching, comprehensions, and so on), and I hope to transform my entire code base (minus PLT Scheme's lex/yacc) to Typed Scheme.

In another thread

In another thread (http://lambda-the-ultimate.org/node/3794), John Shutt wrote:

Granted, I've glossed over the major obstacles to fexprs. (There are, literally, a dissertation's worth of them.) Dynamic scope isn't the only Lisp feature you have to avoid in order to make fexprs stable enough to use, and other such features are ubiquitous in modern Lisps.

Okay... Dynamic scope and lack of first-class environments are two of the first things one must give up to make fexprs work. John, I've heard you say you're giving up quote as well; I decided that it needed different semantics, so arguably I have also given it up.

Which other features are so problematic that their absence becomes necessary in a fexpr-enabled lisp with first-class environments?

Features that clash with fexprs

In retrospect, that passage makes it sound a little as if the major obstacles to fexprs were all features that they don't mix well with. The biggest challenge by intellectual weight is developing the theory of fexprs (and, even worse, figuring out how to explain why they have a nontrivial theory; the more worthwhile an idea is, I think, the more likely it is to have a beautiful, easily understandable explanation that is much harder to discover than the idea itself was — but I digress :-). And then there's the challenge of developing a good programming style for working with fexprs.

Features that destabilize fexprs. We've already mentioned dynamic scope (the obvious one), and quotation. I think I had two others in mind, one of which is unclear. The unclear one is macros. It's unclear because  (1) macros, at least of the ordinary sort, can be converted cleanly into fexprs; so if there's something that really clashes with fexprs it has to be unconverted macros, either ordinary or perhaps otherwise; and  (2) I really haven't investigated the use of unconverted macros together with fexprs, so I don't have any specific illustrations to back up my intuition on this.

The clear-cut example I somewhat vaguely alluded to above: non-local set-bang. Scheme's set! can mutate any binding that it can see, making it impossible to allow clients to see an environment without also allowing them to corrupt it. (In other words, no read capability without write capability.) The classic illustration is the Kernel ground environment, which exhibits all the standard bindings of Kernel, but no standard feature mutates it and there is no way for the programmer to acquire it as a first-class object. A standard environment is, by definition, a child of the ground environment (with no local bindings). Standard environments are cheap to create; the standard make-kernel-standard-environment applicative could be derived by evaluating the following definition in the ground environment (which the programmer isn't allowed to do, of course):

($define! make-kernel-standard-environment ($lambda () (get-current-environment)))

If it were possible to mutate all visible bindings, then it wouldn't be safe to have a single ground environment like this, because its bindings would never be provably stable, and nothing depending on them could be optimized for essentially the same reason that dynamically scoped combiners can't be optimized (in the presence of fexprs). The only way to provide stable standard bindings would then be to make wholesale clones of the pristine standard environment — which, as you and Thomas Lord were remarking earlier, has rather scary performance implications.
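
For contrast, the procedure-style environments discussed earlier in the thread can separate the two capabilities. This is only a sketch of that idea (read-only is my name for it), not a description of how Kernel or Scheme actually behave:

(define (read-only env)
  ;; env is a procedure-style environment as described earlier in the thread
  (lambda (var . opt-value)
    (if (null? opt-value)
        (env var)                                  ; references pass through
        (error "read-only view: cannot mutate" var))))

A client handed only the wrapper can look bindings up but cannot corrupt the environment behind it.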

Set!

In R4RS/IEEE 1178 Scheme, any set! of a variable that didn't have a corresponding define in scope was an error. Thus, unless you had used 'define' to shadow a predefined variable (in an environment necessarily a child of the standard environment), you were not permitted to mutate that variable. The standard environment was therefore safe from mutation.

This limitation on set! was handled in various ways by different implementations, and got dropped from R5RS. I thought at the time that dropping it was a bad idea.

I had observed that, for analysis purposes as well as for expressing programmer intent and protecting against mistakes, we need a way to declare a binding or environment to be immutable, or to make an existing binding immutable at runtime, or both. I had not really thought about whether such a limitation should depend on whether the mutator was in the 'local scope' of the definition or not, and it's an interesting point. I will need to consider it.

I don't believe wholesale cloning of environments really has to be a huge performance issue; environment-cloning is to me a semantics requirement, giving the power to evaluate without also giving the power to mutate the environment. Its naive or "obvious" implementation is very bad for performance, but may be okay under a different implementation strategy such as copy-on-write. I have not yet, however, implemented any such strategy, so it remains to be seen.
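
One possible shape for such a strategy, sketched purely as an illustration (not the implementation being described): a "clone" that starts as an empty overlay over the original procedure-style environment and copies a binding only when it is first written.

(define (cow-clone env)
  ;; env is a procedure-style environment; writes never reach it
  (let ((overlay '()))                    ; bindings written since cloning
    (lambda (var . opt-value)
      (let ((cell (assq var overlay)))
        (cond ((and cell (null? opt-value)) (cdr cell))  ; read our copy
              (cell (set-cdr! cell (car opt-value)))     ; overwrite our copy
              ((null? opt-value) (env var))              ; read through
              (else (set! overlay                        ; first write: copy
                          (cons (cons var (car opt-value)) overlay))))))))

The clone shares lookups with the original until a binding is mutated, so creating one is cheap; whether that is good enough in practice remains, as noted above, to be seen.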

Back in 1988...

I've just found this old thread (by Pavel Curtis, in 1988) with "A Proposal for Environments in Scheme". Worth a read, I think.

re: 1988

That's an interesting proposal and I'm sure it was put to good use in Cedar Scheme, but as a matter of history I think it is the kind of proposal that helped to give first-class environments an irrationally bad reputation fairly early on in Scheme standardization discussions. It was wildly complicated compared to anything already found in Scheme. It had lots of ad-hoc weirdness, particularly environment names and the syntax for "qualified" identifiers. It tossed in "multiple inheritance" for environments, which seems completely arbitrary. It offered ENVIRONMENT-DEFINE! but didn't offer any guidance as to the intended semantics or how it would interact with cached variable look-ups in an interpreter, never mind a compiler (though, to be sure, it did leave source files that didn't use these features in mostly good shape)...

With 22 years of hindsight it's easier to see how to do much better, but the irrationally bad reputation of FCEs persists, nonetheless.

Pebble

Actually, Kernel has multi-parented environments, with depth-first left-to-right search. Where things really go downhill, for me, is the proposal's unencapsulated approach; Kernel, by contrast, supports neither determining the parents of an arbitrary environment, nor determining the complete list of bindings of an arbitrary environment, nor (as I was remarking to Ray just above) mutating an arbitrary visible binding. Apparently they did see that last as enough of a problem that they suggested "locking" the standard environments against mutation.

All this reminded me of Pebble, which I had completely forgotten about. I think I was in a hurry when I first read about it; at any rate, I'm pretty sure I didn't give it as much attention as it deserves, because it looks a lot more interesting now than I remember it as seeming then. (Maybe I just know more now? Nah.) Uncharacteristically for me, I didn't hang on to a bibliographical reference for it. So now, reminded, I googled it, and found... that it was from 1988, that it was tangled up with Cedar... that it's presumably part of the same elephant. The basic reference for Pebble seems to be Lampson and Burstall, Pebble: A kernel language for modules and abstract data types.

(postscript: Sigh. The above is an interesting paper, but the more I remember about Pebble, the more glaringly obvious it is that this is a different Pebble. The one I was thinking of is a lot more like the Pavel Curtis proposal, and probably a lot less genetically related to it.)

First-class environments are

First-class environments are fundamental primitives with respect to encapsulation/isolation. They provide a powerful system for managing bindings. Mixing them with other primitives for managing encapsulation/isolation can complicate things; done poorly, it can make the resulting system difficult to analyze.

I think it's probably a good idea to have functions that return environment "constants" with "normal" values for all language bindings, as this proposal outlines. It's overcomplicated to have three degrees of semantic conformance built up in layers, though.

For purposes of module isolation and interoperation, mutations made to the "parent" environment of one module, if any, should not be visible in a different module. I gather John thinks that parent environments should be immutable by code in any child environment, which is an even stronger condition. He may be right; I hadn't really considered sub-modules and sub-sub-modules, etc., which is where the semantic models diverge.

In the simple case these ideas are dual to each other. Having an immutable parent-environment whose bindings you can shadow locally (and freely mutate the local shadowing variables) without affecting global bindings isn't much different from having your own copy of the parent environment whose bindings you can mutate without altering the parent-environment bindings visible from other modules. That is, it's not different UNLESS you re-export your local shadowing bindings to other modules, which I think is a Bad Idea.
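
A tiny sketch of the first model in ordinary Scheme terms (names invented for illustration): the module shadows a parent binding locally and mutates only the shadow, so the binding visible to other modules is untouched.

(define x 'parent-value)            ; stands in for a parent-environment binding

(define (module-body)
  (let ((x x))                      ; local shadow, initialized from the parent
    (set! x 'local-value)           ; mutates only the shadow
    x))

(module-body)                       ; => local-value
x                                   ; => parent-value, unaffected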

So it comes down to exports. I prefer mutations to the module's copy of the parent environment, because it seems more "natural" for there to be no exports (no visibility to other modules) of bindings that aren't locally defined.

⨯-reference