the gnu extension language

I found this to be an entertaining and interesting read: the gnu extension language, by Andy Wingo, maintainer of Guile.

Guile is the GNU extension language. This is the case because Richard Stallman said so, 17 years ago. Beyond being a smart guy, Richard is powerfully eloquent: his "let there be Guile" proclamation was sufficient to activate the existing efforts to give GNU a good extension language. These disparate efforts became a piece of software, a community of hackers and users, and an idea in peoples' heads, on their lips, and at their fingertips.

The two features of Guile he highlights are macros ("With most languages, either you have pattern matching, because Joe Armstrong put it there, or you don't") and delimited continuations.

The accompanying slides, The User in the Loop, for the 2011 GNU Hackers Meeting are also noteworthy, because they are not as dry as usual PL fare - instead Wingo revives the spirit of the Portland Pattern Repository:

  • "Thesis: Some places just feel right"
  • "Architectural patterns help produce that feeling"
  • "E11y [extensibility] is fundamental to human agency and happiness"
  • "Moglen: ‘Software is the steel of the 21st century’"
  • "Building Materials: Le Corbusier's concrete; GNU's C"

[Update: Video now available.]

Funny title

I think that the title "The User in the Loop" is quite funny, considering that it advocates a language which has been quite clearly rejected by users compared to Ruby or Lua, which are the same age.

agreed for $0.02

as much as i like the nerdy alternative under-loved programming languages in the world, including all the variants of scheme out there, i think the guile folks probably need to just give in to the fact that adoption is a curious and subjective mix, and that RMS saying so isn't enough. go with something that sucks but has mind-share. the point is to get people extending things, not to have the one true language for that purpose. i'm being a pessimist and going with the "the perfect is the enemy of the good."

rdf

My reality distortion field is clearly not functioning properly, I need to get it checked ;-)

But yes, it does seem that if Guile is to succeed, whatever that means, it will need to not only do Elisp and Scheme but also something more normal-looking, like Lua or something. Whether that is possible to do right is another issue. I am cautiously optimistic in that regard.

For functions...

...this doesn't seem too bad, as long as you are smart and stick with an expression-oriented infix syntax rather than trying to jump through hoops to introduce a statement/expression distinction (which is purely an obstacle for programming anyway). For macros, the problem seems a lot harder. IIRC, the Dylan guys just punted on translating s-expression macros into block-expression macros and rewrote them all.

Anyway, if you can give Emacs lexical scoping, tail recursion, and some kind of threading then from my POV Guile will have succeeded. :)

Here's an example

I should make this suggestion a little more concrete. Here's a YACC-ish grammar for, basically, core Scheme with a C-like syntax. Given precedence rules under which assignment binds more tightly than function application, which in turn binds more tightly than binary operators, there should be no shift-reduce conflicts with an LALR(1) parser generator.

expr :
| IDENT
| NUM
| STRING
| '(' block ')'
| expr '(' eargs ')'
| IDENT '=' expr
| 'fun' '(' arglist ')' stmt_expr
| expr BINOP expr
;

stmt_expr :
| '{' block '}'
| 'if' expr stmt_expr 'else' stmt_expr
| 'while' expr stmt_expr
;

block :
| 'let' IDENT '=' expr ';' block
| 'let' 'rec' IDENT '(' arglist ')' stmt_expr block
| stmt_expr block
| expr ';' block
| expr
| stmt_expr
;


eargs :
|
| expr
| expr ',' eargs
;

arglist :
|
| IDENT
| IDENT ',' arglist
;

The key idea is to distinguish between ordinary expressions and block expressions. Block expressions are like Scheme BEGIN form + internal defines -- you give a semicolon-delimited sequence of ordinary expressions and let-bindings which scope over the remainder of the block-expression. Since block-expressions can be embedded in an ordinary expression with a pair of parens, we don't introduce an artificial statement/expression dichotomy. The internal defines are the natural place to add defclass and defgeneric, as well.
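To illustrate how a let-binding scopes over the remainder of a block, here is a minimal Python sketch of a block evaluator. It is only an assumption about how such blocks might be interpreted, not part of the proposal itself; the tuple encoding and the name `eval_block` are hypothetical.

```python
def eval_block(items, env):
    """Evaluate a block; each 'let' is visible only in the rest of the block."""
    if not items:
        raise ValueError("a block must end in an expression")
    head, rest = items[0], items[1:]
    if head[0] == "let":
        _, name, expr = head
        # Copy the environment so the binding does not leak outside the
        # block, much like ((lambda (name) rest...) expr) in Scheme.
        inner = dict(env)
        inner[name] = expr(env)
        return eval_block(rest, inner)
    _, expr = head
    value = expr(env)
    # The value of a block is the value of its last expression.
    return value if not rest else eval_block(rest, env)


# Roughly: ( let x = 2; let y = x + 3; x * y )
block = [
    ("let", "x", lambda e: 2),
    ("let", "y", lambda e: e["x"] + 3),
    ("expr", lambda e: e["x"] * e["y"]),
]
outer_env = {}
result = eval_block(block, outer_env)
print(result)
```

The outer environment is left untouched, which is the sense in which the bindings scope over the remainder of the block only.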

Furthermore, to continue replicating the appearance of C, we mark certain expressions as "statement expressions" -- these are things (like if-then-else or while-loops) which are ordinarily written as statements in C. These get removed from the expression grammar, and can only occur in blocks. We don't lose any expressiveness, since blocks are still in expressions, but we do gain the ability to write them in blocks without marking them with semicolons. Hence we can write things like:

   let rec fact_iter(n) { 
     let acc = 1;
     while n > 0 {
        acc = n * acc;
        n = n - 1;
     }
     acc
   }
   let value = fact_iter(6);
   print("Six factorial iteratively is %d", value);

   let fact_tr(n) { 
     let rec loop(n, acc) {
        if n == 0 { acc } else { loop(n - 1, n * acc) }
     }
     loop(n, 1)
   }
   let value2 = fact_tr(6);
   print("Six factorial tail-recursively is %d", value2);

   let rec fact(n) {
     if n == 0 { 1 } else { n * fact(n-1) }
   }
   print("Six factorial recursively is %d", fact(6))

Personally, I simply wouldn't bother adding macros to this language. Both having two kinds of macros and trying to autogenerate one syntax from the other seem likely to be a never-ending source of pain. This is just a more C-like, but also more limited, syntax skin for Guile; it is only there to help people doing limited scripting of C/Java/C++ programs with Guile stay in the same syntactic paradigm.

Chibi

Compare notes with Alex Shinn, as he plans to implement both Lua and JavaScript in Chibi Scheme (which is also embedded, and emphasizes small size rather than speed). No reason why the front ends shouldn't be shared.

With all respect: your

With all respect: your reality distortion field is working properly if you think that we use Lua as an extension language just because of its more "normal-looking" syntax. :-)

Again, with all respect, let me highlight what Lua's most important features are for us:

- Small. A single C compiler and you're all set (you don't have a dependency on the GNU MP library, for instance).
- Fast when needed (the best JIT available)
- Lots of community contributed code in case of need.
- Long term readability (i.e.: no macros, for instance, nor multi language support).
- Tail recursion & closures.

I don't know what the keys for Guile's success are either. All I can say is that I'll try Guile (again) as soon as it has as *little* functionality and as few requirements as Lua has.

Whether it has a 'normal-looking' syntax or not ;-)

Lua

(Sorry, if this seems flamebaity or trollish, but something about your post just rubbed me the wrong way.)

Re: Small. Does Lua have a full numerical tower? Or even just built-in bigints?

Re: Has LuaJIT actually been demonstrated to be "the best JIT available"? What does that even mean and how do you measure it? How, for instance, does it compare to Sun's Java JIT? Is this even relevant for an extension language -- where you're presumably just driving a lot of high-level built-in (C, C++) functionality with a "script-level" program in $EXTENSION_LANGUAGE?

Re: Lots of community code. I think you may be underestimating the amount of Scheme/LISP code out there. Still, it doesn't matter how much code is out there. What matters is how much code there needs to be to have a strong foundation for programming extensions.

Re: Long term readability. If you think macros decrease readability, then your doin it wrong. Hygienic macros (or even better Racket style macros which are truly modular) can add a lot of DSL capability to a language... which I would have thought would be important for an extension language.

Re: Tail recursion & closures: AFAIK scheme R5RS requires both of these.

A couple of other points:

Does Lua nowadays have arrays and linked lists as proper data structures? Or do you still have to emulate them using hash maps?

Does Lua follow an actual standard or is it the usual case of "the implementation is the standard"?

Lua can be extended with

Lua can be extended with arbitrary precision arithmetic, interval arithmetic, macros, etc.

The important thing is that the freedom to select which extensions you want for a given problem is what makes an extension language powerful (after all languages should be designed not by piling feature on top of feature, right?)

And this is precisely the approach chosen for R7RS, where you have a small core and a bigger one with more functionality.

Finally:

- yes I'm pragmatic enough to prefer a wise implementation being a standard (Lua, Racket) to a standard without implementations (R7RS).

- and I think macros are a double edged sword. Things are more readable (at the beginning) at the expense of being more difficult to debug and understand in the long term (see, for instance Debugging Hygienic Macros). The ability to remove macros when you're building stuff with a not-so-proficient team does actually *boost* the team's efficiency.

Extending the Extension Language

R7RS is a special case because they're aiming to ensure that (a) there are only two languages, and (b) one is a clean subset of the other. You should not generalize from a special case. In general, freedom to select extensions is not a good thing.

Consider: given N extensions, we potentially have 2^N 'languages'. Testing, maintaining, documenting, and debugging 2^N languages seems a seriously undesirable hassle.
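As a quick illustration of the count, assuming each extension can be switched on or off independently, every subset of the extensions is a distinct variant, which gives 2 to the power N of them. The extension names below are made up for the example.

```python
from itertools import combinations


def language_variants(extensions):
    """Return every subset of the extensions; each subset is one 'language'."""
    variants = []
    for k in range(len(extensions) + 1):
        variants.extend(combinations(extensions, k))
    return variants


extensions = ["bignums", "macros", "threads"]  # hypothetical extension names
variants = language_variants(extensions)
print(len(variants))  # 2**3 variants, including the bare language ()
```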

Language extensions are evidence of two things: (1) a language's maturity, and (2) a language's inadequacy. When you see a language with a lot of extensions, that serves as evidence that it wasn't a very good language to start with, yet that it was popular enough that developers chose to extend the language rather than modify it in a backwards-incompatible fashion.

We should learn from extensions when we revise the language, especially if we figure out how they generalize.

the quick vs. the dead -- er, correct

playing devil's advocate: time to market. if i can use lua and pick whatever extensions i want and manually slap them together and make my app and i do the testing of those things then hopefully i can get that out and make $. rather than having to wait for those things to be done correctly in the core language / heat death of the universe.

library vs. extension

I would not assume a faster time-to-market using extensions.

Consider: when a language has a lot of extensions, the quality of development tools (compilers, optimizers, debuggers, documentation, and libraries) will suffer. Cross-cutting extensions (e.g. ad-hoc concurrency, garbage collection, reflection, ambient authority or global state, orthogonal persistence) are especially problematic. Tools are rarely tested under all such combinations of extensions. As a developer, you can easily find yourself sinking time into subtle bugs that result from your wider TCB.

If you have good libraries, you won't need extensions to achieve a good time-to-market. If you have a lot of extensions (especially cross-cutting ones), you probably won't have robust libraries. If you have syntactic abstraction (macros and the like), cross-cutting extensions would be the only type one really needs anyway, especially if you have a good model for foreign service integration and orchestration. Frameworks can model arbitrary new cross-cutting features without need for extensions. (And, while frameworks don't compose effectively in imperative or OOP programming models, they're straightforward abstractions in reactive programming models.)

The productivity story I favor is: pick whatever libraries and services I want, mix in a few frameworks as needed (leveraging a reactive programming model), and glue them together to build the app. I don't need extensions, so I can use a common higher-quality and simpler language for these roles.

another attempt

i wasn't doing a good job making my point about time to market...

> When you see a language with a lot of extensions, that serves as evidence that it wasn't a very good language to start with, yet that it was popular enough that developers chose to extend the language rather than modify it in a backwards-incompatible fashion.

...except that then you are waiting for things to be folded into the language, whereas you could have probably shipped something more half-assed based on extensions sooner and started making $ sooner, assuming that it isn't flight control software etc. well, there's a gamut anyway. not everybody wants to have to be gated by waiting for the full tool chain to be updated, that's all.

no waiting

Developers won't wait for things to be folded into the language. Instead they'll write a library, maybe a few macros, and achieve the same features without the extension. If necessary, they might add a foreign service / FFI, or develop a framework.

It seems your argument is based on an invalid premise.

i would have guessed that we're just talking past each other :-)

i interpreted what you wrote, and i already quoted,

> When you see a language with a lot of extensions, that serves as evidence that it wasn't a very good language to start with, yet that it was popular enough that developers chose to extend the language rather than modify it in a backwards-incompatible fashion.

which i read as meaning releasing a new version. which is different than making a new library.

making and releasing a new, backwards compatible version means taking time.

so if i could do it with extensions myself, it could take less time.

Toil is an option

There are at least three ways to fulfill any given feature need:

  1. Extend the language.
  2. Develop an abstraction (i.e. library, module, function)
  3. Fix the language.

The ideal option is abstraction. In a perfect language, we'd use abstractions every time. By definition, we'd never have motivation to 'change' a perfect language. However, no known language is perfect for all abstractions, so we end up using boiler-plate code, abstraction-inversions, building a new language atop the first. This is what I call 'inadequacy'.

When faced with inadequacy, we still have all three options available to us. We can extend the language, trying to maintain backwards compatibility, but at risk of increasing complexity. We can try to build a better language that solves the problem while preserving simplicity, but that risks backwards compatibility. Or, we can just accept the inadequacy and bulldoze through the problem, writing some boiler-plate and frameworks and Greenspunning a language on the way.

I posit abstraction is the most productive, even when inadequate, once you weigh it against non-local complexity costs of extension and indefinite waits associated with fixing the language.

I thought this option was obvious enough that it didn't require mention. After all, bulldozing through problems seems to be what most developers do by default.

Long term, of course, we do want better programming languages. This means evolving the language as best we can, possibly replacing it. But there are many ways a language could mask inadequacies, cover corner cases, and stretch the use of abstraction - e.g. staging, partial evaluation, syntactic abstraction.

Extensions vs. libraries.

I have not really considered "extensions" to be different from "libraries." Clearly you do. But I have some very strong opinions about what are "good" and "bad" ways to extend a language, and I think maybe they map onto what you're talking about here.

When a capability is added to a language, using a syntax (or, typical for libraries, using bindings) that simply wouldn't be accepted by or aren't defined in the base language, and it doesn't change the meaning of any code that isn't explicitly calling on it by using that syntax or those bindings, I don't have a problem with it.

For example, I don't have a problem with CL-style 'if' where both false and NULL are considered false. I don't have a problem with Scheme-style 'if' where false is false and NULL is true. I don't have a problem with adding a definition for 'nif' to a scheme so that I have a CL-style 'if' available.

But I would *fight* anyone who was promulgating something that would modify either language's native 'if' to have the other language's 'if' semantics, because that would assign different meaning to existing code.
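The difference between the two 'if' conventions, and the idea of adding 'nif' alongside the native 'if' rather than redefining it, can be sketched in Python. This is only an analogy under stated assumptions: `NIL`, `scheme_if` and `nif` are stand-in names, and the branches are passed as thunks because Python has no macros.

```python
NIL = ()  # stand-in for the empty list / NULL


def scheme_truthy(value):
    """Scheme-style: only the false object is false; the empty list is true."""
    return value is not False


def cl_truthy(value):
    """CL-style: both false and the empty list count as false."""
    return value is not False and value != NIL


def scheme_if(test, then, otherwise):
    """The 'native' conditional, left unchanged."""
    return then() if scheme_truthy(test) else otherwise()


def nif(test, then, otherwise):
    """A new name with CL-style semantics; existing scheme_if code is unaffected."""
    return then() if cl_truthy(test) else otherwise()


print(scheme_if(NIL, lambda: "yes", lambda: "no"))
print(nif(NIL, lambda: "yes", lambda: "no"))
```

Code that already calls `scheme_if` keeps its meaning; only code that opts in to `nif` sees the other semantics.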

I have the same basic problem with something that adds extended numerics in a way that changes the meaning of existing numeric code (via "overloading" or similar), but no problem at all with it in a typed language where you have to declare numeric types and it just makes one or more new numeric types available.

I have strong opinions about anything that assigns different semantics to code that would otherwise be accepted by the system as having some other meaning. Therefore, ad-hoc concurrency, garbage collection, etc, are things I *DO NOT WANT* in a language if it doesn't have them "out of the box."

Is this regard for leaving alone the meanings of things whose meanings are already given by the base language the same as your distaste for the things you're calling extensions? IOW, do I read your intent correctly if I believe that you're simply giving a name to this strong opinion of mine about "good" and "bad" ways to add something to a language?

When people speak of

When people speak of 'extensions' to languages - e.g. C++ or Haskell - they generally aren't speaking of libraries. And, indeed, extensions do not usually provide a feature that can be achieved by library (because, in those cases, the path of least resistance is to develop a library).

There is a difference between 'extensible language' and 'language extensions'. The former suggests the language is extensible in a standard way that will be portable across all valid implementations of the language, e.g. user-defined syntax or reader macros or a programmable link-time environment. I'm okay with those techniques so long as they respect black-box modularity and are easy to use. But language extensions more generally are non-standard, implementation specific, often under exclusive control of the compiler developers (cf. embrace and extend).

In general, if we extend a language L, we have a new language L' that is not compatible with most tools for language L. If we have many extensions that can be applied, we may even have a combinatorial number of languages - many of which are ill-tested and unvalidated. If the extensions aren't commutative, then order of application might also matter.
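A toy example of non-commutative extensions, modelling each extension as a source-to-source rewrite. Both rewrites are hypothetical and deliberately naive; the point is only that applying them in a different order yields a different program.

```python
def ext_unless(src):
    """Hypothetical extension A: rewrite 'unless X' into 'if not X'."""
    return src.replace("unless ", "if not ")


def ext_bang(src):
    """Hypothetical extension B: rewrite the word 'not' into '!'."""
    return src.replace("not ", "! ")


source = "unless ready"
a_then_b = ext_bang(ext_unless(source))
b_then_a = ext_unless(ext_bang(source))

print(a_then_b)
print(b_then_a)
print(a_then_b == b_then_a)
```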

Your attentions seem to be narrowly focused on one concern: is the language L' a strict superset of language L (such that L expressions have the same meaning in L')? But it seems you're considering only one extension at a time. Have you considered how you feel if two different extensions (L1->L2, L1->L3) resulted in the same syntax (L2 looks exactly like L3) but with two different meanings? L2 and L3 may even be the products of different L1 compiler developers, resulting in non-portable code that looks portable. Also, I think there is a role for 'non-monotonic' extension where L' is a subset of L (for EDSLs, shaders, optimizers, etc.).

It's probably difficult to

It's probably difficult to distinguish between extensions, language features and libraries. Are Scheme's SRFIs extensions, language features or libraries? Or a mix of them? Is the mere existence of SRFIs a sign of Scheme's maturity or inadequacy? I'd say it's not, Scheme is a nice small language that can be extended with additional features, on demand.

Guile is probably a platform and not a language. It has a virtual machine (as the .NET or JavaEE platforms have), a lot of libraries and modules, which you can program with a variety of languages: Scheme, JavaScript, Lua or elisp (thus n=4 and you end up with 2^4 'languages').

My point is that you may or may not want to extend an application with an 'extension platform' (like guile). Maybe all you need is an 'extension language' (like Lua) that you can in turn extend on demand.

If it's a parameter or

If it's a parameter or modification to the compiler or interpreter that affects semantics (e.g. new primitives), this is the most general case of 'extension'. If it is written as a regular abstraction but organized for multiple clients, it's a library. Other possibilities include FFI or capability for a foreign service. I grant, one can consider library abstractions and capabilities to be constrained 'extensions', but when we use the word 'extension' we should be considering the general case.

Extensions are evidence that the language is not fully adequate for certain roles. They indicate demand for additional language-layer features that could not be effectively achieved by use of libraries, or that were difficult enough that even the language developers acknowledged an apparent inadequacy by modifying the language. But not all RFIs are extensions, nor are all requested extensions acknowledged by acceptance. 'Extension' means something more concrete than 'desiderata'.

Using Guile as a target for Lua, Scheme, JavaScript, and elisp means you have four languages. The exponential factor applies for extensions because they are combinatorial, not exclusive. (I do not believe you mean to combine an ad-hoc mix of Lua and Scheme within a module.)

Note that with that

Note that with that definition (a modification to the compiler/interpreter that affects semantics) Scheme is full of 'extensions': macros and many SRFIs would be extensions.

Untrue. Support for macros

Untrue. Support for macros is standard for Scheme. No extensions, modifications, or switches are needed to enable macros. And none are provided. You can have libraries of macros - they're abstractions. You might say that abstractions extend the 'language', but by nature abstractions do not modify the interpreter or compiler or language semantics.

Also, 'Requests' (including 'Scheme Requests for Implementation') are not extensions. They're wishlists. Some are accepted, at which point they might become extensions, but they could just as easily become part of the language standard, or a library. You'll need to judge them on a case-by-case basis.

Please improve the quality of your arguments, if you plan to continue this discussion.

What I was meaning is that

What I was meaning is that you can extend the language from within the language (you can redefine 'let' for instance). As per your definition that would be an extension (a modification of the language semantics).

But this is starting to look like these emacs (guile) vs vi (Lua) discussions :-) I'm with vi!

A modification to the

A modification to the program is not a parameter or modification to the compiler or interpreter. It seems you misread my definition.

The fact is that macros are

The fact is that macros are a way to program the compiler, and this may lead to peculiar situations. I know this is getting off-topic, but please let me just quote this:

"Finally, in the Lisp and Scheme tradition where macros are themselves defined in a macro-extensible language, extensions can be stacked in a “language tower.” Each extension of the language can be used in implementing the next extension."

And

"Without an enforced separation, the meaning of a code fragment can depend on the order in which code is compiled and executed. At best, programmers must work hard to manage the dependencies. At worst, and more commonly, the dependencies are too subtle for programmers to manage correctly, and they cannot expect predictable results when combining libraries in new ways or when using new programming tools."

From Composable and compilable macros

Metaprogramming is programming

Macros might be considered a way to program a compiler. But the compiler/interpreter of the source language - which happens to support macros - is not modified by the presence of macros in the source language. Programming in a meta-language is still programming.

Actually, not quite

Even within the small language of R7RS, there is considerable modularity. Of the 16 libraries described by the standard, only (scheme base) is actually required for conformance, though providing the others is recommended; that is, "there may exist valid reasons in particular circumstances to ignore a particular item, but the full implications must be understood and carefully weighed before choosing a different course", in the words of RFC 2119. The reason for this is to handle the extremely broad range of systems and applications for which the small language was, by its charter, designed.

Similarly, as currently planned the large language will also be modular. There will be something around 100 libraries, but most of these will again be only recommended, not required by the standard. The precise requirements will be established by Working Group 2.

misc notes / questions

WARNING
actively trying to be devil's advocate, raining on parades, since the tone of "the gnu extension language" post really rubbed me the wrong way.

i'm happy to be totally wrong about any of my knee-jerk reactions. but i hope they actually give the Guile people a chance to actually try to view things from another angle than their apparently Kool-Aid addled one?

* "Since then, many languages have come and gone, and most of those that have stuck around do have many of the features of Scheme: garbage collection, closures, lexical scope, etc." er and maybe some things that most Schemes don't do: typing (ignoring Typed Racket, Bigloo, etc.). and just because X did it first doesn't mean X doesn't suck. we don't actually generally use Simula any more.

* "First and foremost, Scheme's macros have no comparison in any other language. Macros let users extend their language." uh could we up the hyperbole a little? "no comparison" seems overbearing to me. Nemerle? Lisp-Flavoured-Erlang? ATS? Are they a band-aid anyway? And don't get me started on "oh great so now YOUR pattern matching doesn't work with MY pattern matching macros" let alone all the other issues that come with macros.

* oy veh, you can "probably" get other languages working, whereas some systems are already doing that cf. Racket, LLVM, JVM (tho Gnu wouldn't go with the JVM). i'm sure that it will be fun for devs to have to find all the bugs you haven't found in all the implementations?

but in the end like you said, it has to be Gnu technology to be Gnu, which i sorta think -- as much as i love Gnu, seriously, i've donated real cash to FSF -- is the core problem. Gnu is great in its little niche. if that's the only place you expect to play, great, but don't be surprised when e.g. there has been insufficient uptake of Guile in the 17 years it has been around.

you claim to worry about now and 10 years hence. so that will be 27 years with nobody using your stuff???

again -- happy to be explained how i'm a doofus in my reactions. and at the end of the day yes this is all highly subjective. there's got to be a certain je ne sais quoi for something to really get traction in the world.

With friends like this...

"First and foremost, Scheme's macros have no comparison in any other language. Macros let users extend their language."

This sorely tempts me to write a "Lisp: the Ultimate Blub, With No Apologies to Paul Graham" blog post.

At what point, I wonder, did the Lisp community go from being among the best informed to the most ignorant?

If I understand you

If I understand you correctly, this post was totally unnecessary. There are good discussions to be had, but we can do that without name-calling.

OK

> At what point, I wonder, did the Lisp community go from being among the best informed to the most ignorant?

What's the answer to that question?

Update: I'm completely unconcerned about my name-calling while Paul Graham, and many other Lispers, get to name-call users of superior languages to Lisp.

Ah, I see. Thank you for

Ah, I see. Thank you for the update, I can see how being called a "blub programmer" would be upsetting. However it is still entirely out of place for you to continue name-calling, on this forum at least, and I would appreciate it if you would stop.

OK

> At what point, I wonder, did the Lisp community go from being among the best informed to the most ignorant?

What's the answer to that question?

Update: I'm completely unconcerned about my name-calling while Paul Graham, and many other Lispers, get to name-call users of superior languages to Lisp.

Update to the Update: Lispers need to grow thicker skin and quit talking obvious nonsense before I get concerned about hurting their feelings.

Paul. If you really wanted

Paul. If you really wanted to know, you wouldn't have added the ad hominem to the begged question. But I don't think you do, do you?

LtU deserves better, and I think you know it.

Not OK

Nope. It's not OK that only one side of the question gets to be provocative. That's exactly my point. "First and foremost, Scheme's macros have no comparison in any other language. Macros let users extend their language" is either embarrassingly ignorant or pugnacious. Which is it?

That's a fair point. I

That's a fair point. I think that in this case the answer is twofold; one, that there is a place for writing that doesn't have nuance. If I had known that I was writing for LtU, which I didn't, then I probably would have used different language; but I wasn't. I was promoting one implementation of one language to a specific audience. That's fine, IMO.

My real point in my replies to you is that LtU is a valuable space in which we suspend our inherent conflicts, because indeed everyone here is working on some language or other, and in some way we all want to win. This is a great thing, and we should work to keep it that way. That's all I'm saying there. I'm sorry that I provoked you in this way, but I did not intend it.

Now, your question. I freely admit to being ignorant of many, many things! So I'll take the ignorance route, though without the embarrassment, and with a little pluckiness. However, that doesn't mean that I am a representative in any way for "all Lispers". I also plead the audience (JavaScript users rather than e.g. ML users).

But also, if your answer to macros is laziness, then... well, I will admit that I disagree there. But is there not room to disagree, here?

Good Points and Greatly Appreciated

Hi Andy. First, thanks for this. I might've saved us both some heartburn if I'd taken into account from the outset the different audiences you mention, because it's certainly fair to write differently for them. In particular, my own ignorance is showing: if I'd known you were writing for a JavaScript audience, I'd probably have considered the likelihood of their familiarity with any dialect of Lisp, let alone the likelihood of their familiarity with languages such as OCaml with camlp4, or any of the term-rewriting languages like Pure and Maude.

So my answers to macros aren't (just) laziness, although I do think that's one tool in the toolbox. But elsewhere on LtU recently we've seen the resurrection of fexprs, and I just mentioned camlp4 and term rewriting. I think what galls me (and where I err is in making this a personal thing, for which I must apologize) is that OCaml and camlp4 are so obviously the non-Lisp answer to macros in not only effect, but implementation, and have been around for 15 years or so.

But you know what? You're right. The reasonable response to Lisp's claims about macros isn't to point to even more obscure languages, and no real harm is done when writing for 99.9% of programmers on the face of the earth to point out that Lisp has always taken "code = data" seriously and that this affords it an unusual kind and level of power.

If you're still willing at this point—heaven knows you might very reasonably not be—I know I could learn a lot from current thinking about macros if you believe Guile does something with them that I wouldn't be familiar with from having a pretty extensive R5RS/SLIB background. Similarly, if you haven't seen what John Shutt is doing with Kernel, I definitely recommend checking it out. For term rewriting, Pure seems like a very interesting approach to it that's easy to pick up on, while Maude is a much more formal algebraic specification language that uses rewriting logic as its foundational principle.

Anyway, thanks again for understanding. I was wrong, and it's very generous of you to keep striving for the middle ground in the face of that.

macros

Thanks for the gracious reply, Paul. Apologies again, and let's forget about it :)

Thanks also for the links; I have not looked into any of Pure, Maude, or camlp4. I know John Shutt is doing some interesting stuff, but I've never been able to bend my head around it. Maybe it's time to take another look.

You mention that your Scheme background was mostly R5RS and SLIB. As I understand it, though SLIB does (did?) include some macro expansion implementations, the main thing that it uses internally is defmacro. Neither environment has modules, properly speaking. I wrote in the article about features that compose well together. I don't think that defmacro composes well with modules, certainly not in a Lisp-1, because of the identifier scoping issue -- it is easy to accidentally scope free variables relative to the macro use rather than the definition.

The name "hygienic" is a bit nasty, because it implies that other things are bad; I prefer to say that Scheme's macros facilitate lexical scoping. Free variables in a macro are, by default, scoped relative to their macro's definition, not its use. I used defmacros for a number of years before starting with syntax-case (and syntax-rules), and was pleasantly surprised at the convenient, expressive power that "hygiene" gives you. The defmacro debate seems to me to be an after-echo of the old lexical-vs-dynamic-scope discussions of the early 80s.
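As a small illustration of that point, here is a minimal sketch using syntax-rules. The names my-or and demo are made up for this example; it should behave the same in any Scheme with syntax-rules, Guile included.

```scheme
;; Hypothetical example: my-or and demo are illustrative names only.
(define-syntax my-or
  (syntax-rules ()
    ((_ a b)
     (let ((x a))          ; this x belongs to the macro definition
       (if x x b)))))      ; this if is the one visible at the definition

(define (demo)
  ;; The caller binds both x and if locally.
  (let ((x 5)
        (if list))
    (my-or #f x)))         ; the caller's x is passed through untouched

(display (demo))
(newline)
```

Evaluating (demo) gives 5: the x introduced by the macro and the caller's x stay distinct, and the caller's rebinding of if does not leak into the expansion.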

Anyway, perhaps you knew all that already. Happy hacking in any case.

If you just said Kernel is

If you just said Kernel is mind-bending, thankyou-I-think :-). You might find of some use the few (three so far) blog posts I've written relating to it, here.

Fwiw, the word "hygiene" had always put me in mind of sterile labs (not always practical), with perhaps overtones of compulsive hand-washing, so I didn't take it as a slur on the alternative; the alternative would be getting one's hands dirty, metaphor for not being impractically squeamish.

Paul Graham hates ML

I'm sure you're aware of this, but Paul Graham has made at least one ignorant comment about ML and Haskell, essentially writing them off completely (see "What do you think of ML and its derivatives?" in his Lisp FAQ) because they weren't dynamically typed. I'm not sure if he used any of them at all.

I assume that was one of the aspects of Blub you were referring to when you were talking about Paul Graham and Lisp hackers.

I mean no offense to Paul Graham, I simply don't agree with him on this point.

That aside, I don't know of any languages other than Scheme that have hygienic macros (except maybe Dylan), so I consider that to be unique. I'd love to see more languages adopt them.

Hygiene in Kernel

I've been accused of drinking too much Kernel Kool-Aid ;-), but if you like hygienic macros, be sure to also check out the approach John Shutt's Kernel language takes to hygiene.

The basic idea is to not quote.

With macros, you have hygiene issues because you work with quoted identifiers, that somehow have to refer back to their original bindings. (Macro-generated identifiers need to refer to the bindings in effect where the macro was defined; user-generated identifiers need to refer to the bindings in the original (user's) source code - no matter how macros or users rearrange the binding structure (e.g. introduce LET's with clashing names or whatever.))

With fexprs, you have the luxury of using live values instead of identifiers referring to values, thereby achieving hygiene simply through good old lexical scope, already built-in to the language anyway.

Let's take as an example the typical (WHEN test . body) macro, a "one-armed IF", that evaluates the body expressions when the test is true, otherwise does nothing (returns a boring value).

In a Scheme macro system, e.g. SRFI 72:

(define-syntax (when test . body)
  (quasisyntax (if ,test (begin ,@body) nil)))

(when #t 1 2 3) ==> 3
(when #f 1 2 3) ==> nil

Scheme being Scheme, there are no "reserved words", so any caller may rebind IF or BEGIN - but of course, the WHEN macro needs to continue working even in this case - which is the definition of hygiene:

(let ((if launch-missiles!))
  (when #t 1 2 3))
==> 3

In Kernel, the WHEN fexpr may look like this:

($define! $when
   ($vau (test . body) env
      (eval (list $if test (list* $sequence body) #inert)
            env)))

($SEQUENCE is Kernel's BEGIN; #INERT is used as the boring value.)

Note that the Kernel fexpr doesn't quote. The forms (lists) we construct and pass to EVAL contain the actual values of $IF, $SEQUENCE, and #INERT. Thereby we reuse ordinary lexical scoping to provide us with hygiene (as $DEITY intended, I might say) - without the need for a hygienic macro system's machinery.

($let (($if launch-missiles!))
   ($when #t 1 2 3))
==> 3

Interesting!

I've used pretty much the same (or at least a very similar) technique, if I understand your description correctly, in muSE - a scheme-like scripting language I wrote for authoring "styles" for automatic video editing. What I refer to as "macros" in muSE there are just functions that don't evaluate the argument list and they can themselves be passed around as arguments, closed over, stored in lists, etc. "Macros" and functions have the same lambda-like syntax too - just that the argument lists of macros appear quoted.

For example,

(define postfix (fn (x y op) (op x y)))

defines postfix as a function, whereas

(define postfix (fn '(x y op) (list op y x)))

defines it as a macro such that (postfix 1 2 +) will evaluate to 3 (the macro is being "used") and (apply postfix (list 1 2 +)) will give you the list (+ 1 2).

I've found this flexibility rather useful in the standard library shipped with the automatic video editor .. where I abstract some otherwise highly repetitive dsl infrastructure code as a function that returns a macro as its result :)

There is also some amount of hygiene in muSE that is a consequence of the way closures are implemented (a macro is also a closure). To redo the when example in muSE,

(define when* 
  (fn '(test . body) 
    (list if test (cons do body)  '())))

(let ((if launch-missiles) 
      (test "blastoff"))
  (when* T 1 2 test))

.. will give you "blastoff".

(PS: I write when* because using when will give you a redefinition warning in muSE, though it'll work as well. Also, I chose not to bother with quasiquote syntax for muSE, so I need to explicitly build the code body using usual functions.)

Is hygiene necessary?

I have little experience of actually using Scheme, but I'm not sure hygiene is actually necessary for a macro system. I have some experience using Camlp4 (a preprocessor for the OCaml language). I have discovered that I never needed hygiene because I could always choose an expansion where the context (bound variables) of each subterm after macro-expansion exactly matched the intended context in the pre-expansion program.

Consider the following arch-classical example of scope-subtle macro justifying hygiene -- given for example in Felleisen's "Hygienic macros for ACL2" paper:

(defmacro or (a b) `(let ((x ,a)) (if x x ,b)))

You could also define it this way (pardon me if it's not exactly correct, my Scheme is a bit rusty):

(defmacro or (a b) `(let ((x ,a) (y (lambda () ,b))) (if x x (y))))

With this version, the ,a and ,b pieces of user-code are evaluated in their intended variable context, with no possibility for shadowing and thus no need for hygiene.

Granted, I see two disadvantages for this version:
- with non-optimizing compilers, this expansion will most probably be less efficient because of the allocation of the `(lambda () ,b)` thunk
- it is more cluttered than the original hygiene-needing code, possibly obfuscating the real macro meaning; why play sophisticated scope tricks if a hygiene system can do them itself under-the-hood?
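To make the difference between the two expansions concrete, here is a rough sketch that assumes Guile's define-macro as a stand-in for defmacro. The names naive-or, thunk-or and capture-demo are invented for the illustration and are not from any library.

```scheme
;; Guile-specific sketch: define-macro is Guile's unhygienic,
;; defmacro-style form. All names below are hypothetical.
(define-macro (naive-or a b)
  `(let ((x ,a)) (if x x ,b)))

(define-macro (thunk-or a b)
  `(let ((x ,a) (y (lambda () ,b))) (if x x (y))))

(define (capture-demo)
  (let ((x 5))
    (list (naive-or #f x)     ; user's x is captured by the macro's x
          (thunk-or #f x))))  ; user's x is closed over before x is rebound

(display (capture-demo))
(newline)
```

Here capture-demo returns (#f 5): in the first expansion the user's x ends up referring to the macro's temporary, while in the second the thunk is created in the let initializers, outside the scope of the new x. As the replies note, this still assumes nobody has rebound let, lambda or if at the use site.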

In both cases, I think a satisfying answer (more interesting to me than a hygiene system) would be to enrich the language with specific constructions that allow it to express those cases more easily, efficiently or naturally. For example, one may add a `subst-let` construct that would syntactically substitute its argument (without forcing evaluation first) with correct alpha-conversion/renaming, allowing one to write `subst-let ((a ,a) (b ,b)) (let ((x a)) (if x x b))` (no lambda allocation), or an `unlet` construct that would locally disable the latest binding for some variables, binding them to their shadowed value if it exists: `let ((x ,a)) (if x x (unlet (x) ,b))`.

Those two constructs may not enhance readability and maintainability of user-written code and thus be discouraged for use in hand-written code. But even if they are only used in macro definitions, they have the advantage of allowing one to write safe macros that only expand to source code in the language, without any specific gensym-phase at preprocessing time. All the scope/binding logic is embedded in the language constructs directly, and I think this is a cleaner and more flexible approach; e.g., unlet may be useful in other contexts, for example to implement refactorings/renamings in an IDE.

Finally, while I have argued that hygiene is not necessary to create scope-correct code (even in existing languages without the proposed `unlet` construct), it should be noted that it is still possible to write erroneous macros that produce scope-incorrect code -- accidental shadowing, etc. I personally think that a static or dynamic check for scope-safeness (based on scope annotations) that would fail in case of scope violation would be preferable to an implicitly-fixed hygiene system. For example, it helps to design macros for new variable binding constructs -- a case in which hygiene must be disabled.
In absence of such a system, I was actually happy to manually check that my macros were scope-safe. Scoping contexts are really a primitive form of typing contexts -- we speak of 'well-formed', instead of 'well-typed', terms -- and rules for valid scoping are really easy to check syntactically, contrary to typing rules that can be quite subtle and are preferably left for the mechanized tools to check.

In essence, I mean that the extra-linguistic, preprocessing-time logic of hygienic macros may be profitably exchanged for intra-linguistic, binding-resolution-time constructs in the source language. You get a simpler macro layer, and a more expressive language.

By the way, I'm also not sure homoiconicity is really necessary. Camlp4 manipulates a Caml abstract syntax tree as an algebraic datatype, which is an idiomatic and rather convenient choice, but I don't think we could say it's homoiconic, or that it matters much.

(defmacro or (a b) `(let

(defmacro or (a b) `(let ((x ,a) (y (lambda () ,b))) (if x x (y))))

With this version, the ,a and ,b pieces of user-code are evaluated in their intended variable context, with no possibility for shadowing and thus no need for hygiene.

Somebody may have rebound LET, LAMBDA, or IF.

Your proposal may work in a language with "reserved words" - Scheme is not one of them. Even the SUBST-LET form may be rebound in the context where the macro occurs to a different value.

re: (defmacro or (a b) `(let

Put commas in front of the backquoted let, lambda, and if, and, so long as those denote applicable fexprs in the definition environment, all should be well. [Current Guile doesn't seem to support that. I forget whether older Guile and SCM did but I am certain it would have been trivially easy to support.]

passing macros by value

Somebody may have rebound LET, LAMBDA, or IF.

For some time it's seemed unfortunate to me that you can't evaluate the macro to a value, and then embed that value. One can imagine preventing this problem by evaluating LET, LAMBDA, and IF to their relevant macro expander objects, and embedding those objects:

(defmacro or (a b) 
  `(,let ((x ,a) (y (,lambda () ,b))) 
    (,if x x (y))))

This would require some distinction between normal functions and macro expanders of course, as the meaning of the expression `(,+ x y) is already that of a function call, not a macro expansion. I imagine there are other problems as well that I haven't thought of yet.

Also, this only solves half of the hygiene problem. X and Y are still being shadowed.

re: macros by value

There are other problems with passing macros by value. You've only considered the case where one is delaying evaluation, but you should also consider the use of macros that pattern-match their inputs.

I find it worthwhile to guard against this.

If a language permits redefinition of things that already have a syntactic definition, then it should also allow people to write something that will fail immediately and hard if this has actually occurred. And that thing must not be itself redefinable.

IOW, I want to write "assert #SANITY" or something at the top of a file and be absolutely certain that the code in that file cannot import (and has not imported) any redefinition or shadow binding for LET, LAMBDA, IF, or anything else defined by the language standard.

Let there be as many new forms as anybody wants. But don't surprise the authors of other parts of the system by modifying basic infrastructure. If basic infrastructure can be modified, then let there be a way for them to be certain that they are not working under such a modification.

Hygiene isn't necessary

But it helps prevent a lot of subtle errors.

Assuming you want a non-hygienic macro extension mechanism, it would be better if extra macro-glue is required to break hygiene (e.g. to explicitly inject a symbol) rather than to protect it.

re: hygiene isn't necessary

Just to bring this back full circle:

What's a good extension language for a (purported) GNU project? In the vision of a complete GNU system, based on but extending unix with a graphical desktop and lots of commonly used applications -- what's a good extension language for those apps to use?

There's no perfect answer. The experience of early GNU hackers was that a lisp is a fine extension language -- and that a few clean-ups relative to Emacs lisp would be nice.

Guile, since very early on, had standard hygienic macros built on top of a traditional FEXPR-like mechanism inherited from SCM. It took a while to shake out the bugs and, by now, all of that has probably been completely rewritten but... it was trivial, relatively speaking, to have hygienic macros for convenience and lower level traditional sorts of macros for a certain kind of raw gritty power when you really wanted to seize the graph interpreter by the throat and teach it a trick or two.

There was never really any fight here, in the Guile world, at least back when. Fexprs? Hygiene? Which is it? Guile's original answer was "yes".

Again: Guile did (does) not struggle for popularity for any language design reason that I can see. It was political and economic power struggles early on.

Hygiene is necessary

I assume you mean: The mechanism of hygienic macro systems isn't the only way to achieve it. (There are also fexprs, and faking it with gensym+Common Lisp's package system.)

No

I mean that sometimes it's okay to break hygiene, and sometimes it's even useful. Not every macro needs to be hygienic. Hygiene is valuable, and should be the default, but isn't necessary.

I think it would be better that, when we break hygiene, we should do so in an obvious, explicit, and controlled way. I.e. consider having the converse of gensym: instead of using gensym to fake hygiene, use 'inject-sym' to break hygiene.

I am assuming clean staging and modularity - i.e. macros run their course before static analysis, and cannot reach into code from other modules. As you know, I do not favor frankenprogramming, including fexprs (frankenprogramming expressions).

Under these constraints, though one might break hygiene for convenience (i.e. so you can capture boiler-plate that requires or binds names) it would at least have local scope.

Yes

I mean that sometimes it's okay to break hygiene, and sometimes it's even useful.

Of course, yes.

I.e. consider having the converse of gensym: instead of using gensym to fake hygiene, use 'inject-sym' to break hygiene.

datum->syntax.
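For readers who haven't used it, here is a minimal sketch of datum->syntax breaking hygiene on purpose, assuming a Scheme with syntax-case (Guile or an R6RS system). The macro aif and the procedure aif-demo are made-up names for the example.

```scheme
;; Hypothetical example: an "anaphoric if" that deliberately injects
;; the identifier `it` into the caller's scope.
(define-syntax aif
  (lambda (stx)
    (syntax-case stx ()
      ((k test conseq alt)
       ;; Build `it` with the lexical context of the macro use (k),
       ;; so the caller's code can see it.
       (with-syntax ((it (datum->syntax #'k 'it)))
         #'(let ((it test))
             (if it conseq alt)))))))

(define (aif-demo)
  (aif (assq 'b '((a . 1) (b . 2)))
       (cdr it)       ; `it` is the pair found by assq
       'none))

(display (aif-demo))
(newline)
```

The injection is explicit and local to this one macro, and aif-demo evaluates to 2; every other identifier in the template remains hygienic.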

fexprs (frankenprogramming expressions)

I had a good laugh - but you're wrong: fexprs are fun expressions!

All mad scientists

All mad (computer) scientists would think so... and laugh, and LAUGH.

Be sure to practice your mad-laugh in front of a mirror before exposing it in public. ;)

Fexprs aren't the evil

Fexprs aren't the evil anti-modularity device you think they are. And they are too fun. <Cackles insanely>

Anarchy and Anti-modularity

With discipline by both the module developers and clients, secure modularity is feasible with fexprs.

But I hypothesize: the domain described by 'modular applications of fexprs' is equal to module-local syntactic abstraction - such as macros. If I want modularity, macros will always be sufficient. The difference in abstractive power of fexprs relative to macros is good only for anarchistic anti-modularity.

Anarchy isn't 'evil'. Anarchy can be exciting, and surprising, and fun. But anarchy is rarely predictable, secure, scalable, or efficient in-the-large.

I would prefer to not need words like 'exciting' and 'surprising' to describe the developer experience. 'Fun' is okay, but one can obtain plenty of entertainment from predictable systems and environments (e.g. Chess). If we want exciting, combine predictable with constrained resources (like time or budget) or limited knowledge (e.g. incomplete requirements).

Constraints

You're misapprehending fexprs as if they were a tool for violating constraints. In the context of the language as a whole, they're a tool for shaping constraints, so one can end up allowing clients to do exactly what one wants to allow them to do, neither more nor less. Juxtapose that with the earlier thread, where you tried to claim that it would be more powerful to allow violation of constraints, whereas one of the most basic insights behind my treatment is that a language with no constraints is very weak in the sense I'm interested in.

For what I'm calling "abstractive power", a language without constraints is weak, and a language with excessive constraints is weak. Constraints that are always subject to revocation aren't really constraints, so that's just the unconstrained option. A language with constraints of an inflexible form isn't strong, anyway. Of interest is the power of the developer to control exactly what constraints the clients will and won't be subject to. Keeping in mind, the clients (in the sense I mean, anyway) are themselves developers of what will be used by later clients, so the developer's power to shape constraints includes power to shape constraints on how later clients will be able to shape constraints on their successors.

For example. You (I'm pretty sure it was you :-) asked me a question quite a while back that I answered with a discussion of unwanted capture. A function f takes a parameter g and calls it, giving g the option of capturing its dynamic environment when called, and thus fully accessing f's local environment and reading f's static environment. It's clearly not good enough to say "g shouldn't do that", because we want to be robust against possible malicious (or incompetent) design of g. At the time I simply said I was awaiting further insight — but I could have added that once one decides what one wants to do, there's a decent chance it might not require new facilities to do it.

Perhaps you'd like to arrange that when any combiner g is called from inside module m, and g tries to capture its dynamic environment, it will only get the real one if its static environment is within either m or some other module trusted by m. I'd have to wade through it, but I suspect you'd make a minor tweak in the evaluator and use the available properties of operatives, environments, and keyed static variables.

I've got one possible approach I've mused on (in a footnote in my dissertation) that might require a language change even in theory. Even that I wouldn't care to bet on, till I'd worked it out in full detail, though. Because it's easy to underestimate what sorts of constraints can be articulated with the devices already in the language.

Rigid constraints

A language with constraints of an inflexible form isn't strong, anyway.

I disagree. Rigid constraints do not inherently conflict with abstractive power. They only constrain certain paths to abstractive power. Some paths compatible with rigid constraints include: local syntactic abstraction, frameworks, staging, monads or arrows or multi-level languages, and secure meta-object protocols.

I favor rigid constraints - i.e. universally protecting properties such as safety, modularity, purity, capability security, safe concurrency, reactive composition, and termination or progress. One aspect of modularity - that we do not leak implementation details between modules - is a sufficient condition to undermine the utility of fexprs.

Defaults matter. Dangerous things should be difficult to even allow by accident. In your `f calls g` example, even if you find a fexpr-based solution, you're still violating secure interaction design principles (path of least resistance, visibility, self-awareness) because it was not secure-by-default.

once one decides what one wants to do, there's a decent chance it might not require new facilities to do it.

IMO, a good language designer must know what developers want, and even help teach developers what they want. Language designers truly are in a position to 'know better', especially for in-the-large properties, because regular developers focus on a particular domain or problem. Application developers should not be wasting their precious brainpower thinking about all the piddly details of security and safe concurrency and runtime upgrade and compatibility between libraries.

I sometimes feel that language designers favor fexprs and the like so they can shirk the responsibility to actually provide a productive programming environment. (I.e. "I'm not giving you a good language. I'm giving you - who has no skill in language design - the tools to build a good language.")

Anyhow, your argument provides a nice example: some developer decided on an insane security policy - "g can capture the dynamic environments of modules trusted by m" - and your fexprs probably provide the expressiveness to implement it since we lack security-by-default. What we really wanted, though, was "g can capture the dynamic environments of modules that trust m".

In a model with object capability constraints, developers would bump into a wall when trying to implement this silly policy and would quickly realize: "hey, it's the policy that's the problem, not this wall". Rather than following the path-of-greater-resistance (building a multi-level language to surmount the wall) they will simply fix the policy. Fexprs provide no guidance, only power.

it's easy to underestimate what sorts of constraints can be articulated with the devices already in the language

Given my work with monads, arrows, and the like, I believe it's hard to overestimate what sort of constraints can be articulated in a language - even one with rigid constraints.

Somewhere in Thomas Kuhn's

Somewhere in Thomas Kuhn's Structure of Scientific Revolutions, he describes how scientists using different paradigms talk completely past each other.  The disparate paradigms cause them to not even be talking about the same thing.

I disagree. ... They [rigid constraints] only constrain certain paths to abstractive power.

Well, you aren't disagreeing with me about the effect of rigid constraints on abstractive power.  Nor agreeing with me.  Because you can't be talking about the concept I'm referring to under the name "abstractive power".  The thing I'm talking about, it's meaningless to talk about a path to.  The power I'm talking about lies in the shape of the entire network of paths departing from the language of interest.  Any path you look at is just a part of the topology, rather than leading to the topology.

The basic communication rift continues down through your post, flickering from one form to another.  I think you mean something entirely different by "rigid constraints" than I did by "constraints of inflexible form". The bit about

Dangerous things should be difficult to even allow by accident.

seems to reveal a worldview in which allowing something to be done is not, itself, doing something — whereas my concept of abstractive power is built on a worldview in which the consequences of each action are understood entirely by what later events are thereby allowed or disallowed. So that not only is allowing/disallowing something a form of doing something, but in fact it is the only form of doing something.

By the time you're talking about secure interaction design, I suspect we're so far down different roads, talking about different things, that it's not even practical to try to synch the secure interaction design remark with what I'm doing — the structurally divergent worldviews would have to be synched first, so we'd have some common ground from which to view other issues.

Abstractive Power

I believe you are measuring abstractive power incorrectly, at least by your own prior accounting of the subject.

A language is a point in a language-space. Abstraction is modifying one language into another by means of the former. This does not suggest the path is relevant to abstraction. The most natural measure of abstractive power is thus a reachable volume in the language-space. Two languages can have equal abstractive power even if they require different paths to reach each node in the language-space.

This is different than measuring the 'topology' of the language-space (how you get from one point to another with intermediate languages), which is an issue of expression rather than abstraction. Rigid constraints on topology can still support paths to equal volumes of the language space - i.e. paths for abstractive power.

Regarding dangerous things - even when 'allowing' is modeled as an action (e.g. sharing capabilities), the position holds. This is not only about abstractive power, but also about expression, and defaults - e.g. whitelist vs. blacklist, and whether explicit syntax is needed to provide paths or to constrain them. Fexprs have the wrong default.

I'm reminded that my ideas

I'm reminded that my ideas on this did not develop quickly.  My notion of a language-space (as you say) dates from about 1990.  I didn't clearly see how to use it as more than a vague metaphor for about ten years after that.  The topology/"shape" insight might appear, from the outside, to have arrived relatively abruptly, but internally was more like reaching critical mass after a slow/patient accumulation of thought over the intervening decade.  So I appreciate this is nontrivial.

I believe you are measuring abstractive power incorrectly, at least by your own prior accounting of the subject.

If one were to suppose that 'volume' and 'topology' were separable concepts, this would seem a little like someone describing computational complexity and then going on to describe a really interesting problem they're studying in time complexity, only to be told this is a mistake because they should be studying space complexity instead.

Except that the concept of 'volume' isn't separable from 'topology'.  Which is basically what I realized after those ten years of accumulated thought.  (Fourteen years trying to articulate an epiphany about fexprs from 1997, and now what I want to articulate is an epiphany about abstraction theory from 2001.  It's not fair! :-)

At first blush, it seems intuitively that abstractive "power" would mean "volume".  But what does "volume" mean?  One could say "how many languages are reachable", but... "how many"?  These are countable infinities.  To pin down a comparison between two languages, so as to formulate a specific claim that one is at least as powerful as the other, you have to set up some mapping between languages reachable from one and languages reachable from the other.  When are two reachable languages "the same", so one should map to the other?  When their topologies of reachable languages match.  It's going to be topologies all the way down.  The point about flexibility follows very quickly from there (so quickly one might miss if one blinks).  Reaching "more" languages means reaching languages with a greater variety of topologies.  And greater variety of topologies means the "reaching" language affords greater flexibility in controlling what will and will not be possible to do in the "reached" languages.

Is "abstractive power" the only property of these topologies that is interesting/desirable?  Surely not; that would be bizarre.  On other measures alongside abstractive power, I'm now in slow/patient accumulation-of-thought mode.  Accumulating from any and all sources, and I'd love to figure out what your insight into this is... supposing that your insight might be related to it, which I can't even be sure of yet since we have this metastasized difference of terminology between us.  I also have in mind to extensively study the consequences of abstractive power, to see what other subtleties of the "shapes" might be revealed as interesting by a deep understanding of those consequences.

Equal Power

I feel you've somehow missed the broader implications of one of your earlier arguments: "not only is allowing/disallowing something a form of doing something, but in fact it is the only form of doing something"

Consider two different origin languages (O and O') that have similar properties, and two reachable languages (R and R') that have similar properties. The languages are independent of their origin - i.e. R is a single language (with a single topology), reachable from both O and O'. O and O' may have an equivalence - they can modify themselves into the same set (volume) of languages. R and R' may have the same equivalence. I would say that this equivalence maps well to your 'abstractive power'.

O and O' may have different topologies - i.e. require different operations for composition and abstraction. If we were to assign costs, we might measure and find that the path from O->R is orders of magnitude cheaper than the path from O'->R or O->R'.

A language with constraints of an inflexible form might refer to constraints in expression that result in a rigid topology. This is, of course, a property measured in the origin language, not in the languages we reach. Such languages can raise barriers, e.g. to discourage insane security policies or unsafe composition. When faced with such languages, developers will tend to follow paths of least resistance, e.g. from O to R rather than struggling to reach R'. After all, R effectively allows/disallows the same things as R'!

I hope some of this helps clarify my notion of 'paths to [languages of equivalent] abstractive power'. Following the paths made efficient by the origin language (and, inductively, by each intermediate language), it is possible we'll never reach the same target language even if we seek a language of equal abstractive power. In a sense, we can have natural 'parallel paths' through a language space determined by topology surrounding the origin language.

Fexprs are very 'flexible' in not just abstraction, but also expression. The benefit is that they make it cheap to reach many languages. The cost is that it's easy to reach many bad languages that are close to good languages but subtly wrong.

Your uses of the word rigid

Your uses of the word rigid consistently don't add up for me —the ways you apply it generally make me doubt we mean the same thing by the noun you're applying it to— so that I'm (unfortunately) quite confident we haven't bridged the paradigm gap yet.

My overall sense, from that post, is that you're still thinking too much of languages as islands that one occasionally hops between via acts of abstraction.  In my treatment, a language is defined entirely by the possible sequences of texts by which it may be transformed.  Think of the 'language space' as an infinite-state machine, whose transitions are labeled by terms over a CFG.  For each state of this machine (i.e., for each 'language'), there is a set of finite paths starting from that state, and a set of finite sequences of terms that label those paths.  Each such set of term sequences is closed under prefixing, i.e., for all sequences u and v, if uv is in the set, then u is in the set.  Now throw out the states.  You don't need them.  Everything that can be meaningfully said about a language can be found from that prefix-closed set of term sequences; so just formally define a language to be a prefix-closed set of term sequences.
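A toy rendering of that definition, as a sketch only: it assumes terms are plain strings and the language is finite, neither of which is part of the formal treatment. The helper names (prefixes, prefix_closure, is_prefix_closed, after) are invented here for illustration.

```python
def prefixes(seq):
    """All prefixes of a term sequence, from the empty one up to seq itself."""
    return [seq[:i] for i in range(len(seq) + 1)]

def prefix_closure(seqs):
    """Smallest prefix-closed set containing the given term sequences."""
    return {p for s in seqs for p in prefixes(tuple(s))}

def is_prefix_closed(lang):
    """True if, whenever uv is in the set, u is in the set too."""
    return all(p in lang for s in lang for p in prefixes(s))

def after(lang, term):
    """The language one arrives at by accepting `term`.

    No state is needed: the 'next language' is just the set of
    continuations of the sequences that begin with `term`.
    """
    return {s[1:] for s in lang if s and s[0] == term}

lang = prefix_closure([("define x", "use x"), ("define y",)])
next_lang = after(lang, "define x")
```

Here next_lang is itself a prefix-closed set, which is the sense in which the states can be thrown out.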

(As I've noted, my technique for explaining my fexprs insight, though far from perfect, is years more evolved than my technique for explaining abstraction theory.  Some of the terminology needs to be used with care lest it become a distraction.  I also don't know what the preferred minimal set of points to explain would be; one can't very well explain the whole of the techreport (linked below in my reply to Matt M) as an "introduction" to the subject.)

[edit: typo fix]

Islands bridged by abstractions

Your uses of the word rigid consistently don't add up for me [...] you're still thinking too much of languages as islands that one occasionally hops between via acts of abstraction

When our abstraction mechanisms are rigid, this seems an apt metaphor. The goal is to reach a language with specified properties, not merely to move, nor to reach any particular language. We don't need to reach the same target language - e.g. if proposed target languages R and R' each meet requirements, we'll approach the one that is on the path of least resistance based on the origin language. The next edit may then start from R or R', the choice of which depends inductively on the predecessor language (ultimately, all the way back to the initial language). We may also change the program 'upstream' or in a different layer to eliminate local barriers. The initial language thus has a deep, inductive, pervasive impact on all paths of least resistance, all 'bridges'.

Take your state-machine metaphor. Step back a bit so you can see state-subgraphs of low connectivity between 'languages'. The 'islands of languages' and the 'bridges between them' will become visible. Pull back even further and you'll see each island as an individual state (a class of languages). Pull back even further, and these classes of languages will also be clumped together in subgraphs with relatively low connectivity. And on and on and on. The topology is not uniform.

And, of course, there are languages where you're pretty much 'stuck'. Not every island has outwards connections. We can easily get 'stuck in a loop'. No matter how you spin your definitions, a Brainfuck program will STILL be Brainfuck after you add another operation.

You can't avoid rigidity by putting your nose really close to the language graph.

Rigidity doesn't imply we're stuck, but does suggest we don't have many options for transitions - just a few bridges between (classes of) languages, and developers with clipped wings.

What is abstractive power?

I just found and skimmed your paper on 'S-expressiveness.' Is that what you have in mind for a definition of 'abstractive power'? Why is this a useful definition? Can't all Turing complete languages implement all the same evaluators?

Edit: Specifically, I'm thinking of Turing Complete languages that have the ability to define string literals in the "tail" of a program (such that the string is terminated by EOF). It seems that every such language would be "universal" according to this definition.

Addendum: The main problem I see with the definition of S-expressivity (the one I pulled from a .PS file on the internet that lists you as author), is that it doesn't seem to capture interaction between the tail of the program and the environment (including any prefix of the program). I think the right definition has to capture the idea of what semantics a sub-program can have, and that isn't well captured by just a language (subset of strings).

Abstractive Power of Programming Languages: Formal Definition

No, that's an old paper (1999). The most recent stuff available atm is a 2008 tech report,

(PDF) WPI-CS-TR-08-01  Abstractive Power of Programming Languages: Formal Definition.

Ok I read through this new

Ok I read through this new stuff and agree it's better than the old stuff. I'm mostly with the setup through your definition of expressiveness. But when you go on to define expressive structures on the way to abstractive power, you lose me. Why do we want to study expressive structures rather than just languages with semantics (which you capture with observables)? Also, you say that a language without 'private' to encapsulate definitions can macro express one with 'private' by just projecting away 'private', but that doesn't seem true to me since it transforms erroneous programs into correct ones. What's wrong with my reasoning?

Edit: I guess really my point is not that you're wrong, but that if you'd just captured the semantics more completely by allowing result to be 'error', then that would seem to resolve the conundrum without the 'expressive structure' machinery.

To compare expressive power,

To compare expressive power, consider languages with semantics.  Look for a transformation from one language to another (in some chosen class of transformations — Felleisen used macro transformations) that also preserves semantics.  Because the thing preserved is semantics, expressiveness measures a language's ability to regulate semantics.  That's certainly of interest, but I set out to measure ability to do abstraction, i.e., ability to regulate how things will be expressible.

For abstractive power, you need to compare not languages with semantics, but languages with expressive structure.  So that when you transform one language to another, the thing preserved is expressive structure, and you get a measure of a language's ability to regulate expression, rather than its ability to regulate semantics.  If expressive power is the first derivative of computation, abstractive power is its second derivative.

Including 'error' in language semantics would presumably affect expressiveness, rather than abstractiveness, since expressiveness is what semantics directly affects.  The effect on expressiveness should be to essentially disable it:  statements of the form 'Y is at least as expressive as X' are only possible because the mapping from X to Y doesn't have to map error semantics of X onto anything of Y.  Because X error semantics don't get mapped, Y is permitted to do more than X; otherwise, the transformations wouldn't provide a partial order, merely an equivalence relation.

One use of fexprs

One use of fexprs is to collapse staged programming to a single stage of runtime.

Specifying when something is to be evaluated in the course of putting together your program can become arbitrarily complicated. Fexprs can completely avoid that problem. Fexprs implemented with proper respect for the scopes of their arguments can do so without introducing new problems.

Fexprs, unconstrained, can

Fexprs, unconstrained, can do a lot of things. Fexprs, under a constraint to respect black-box modularity, can do a lot less. I hypothesize the latter are roughly as capable as macros or user-defined syntax. You mention "fexprs implemented with proper respect for the scopes", but have you defined 'proper respect for the scopes' or considered whether something more specialized than fexprs would fulfill the same role?

In fact, I have.

"Proper respect" is just lexical scope and hygiene, same as for macros. The key is to ensure that no expression is accidentally evaluated in an environment which is not the environment where it appears.

Dynamic scope and a failure to represent closures fully are what made them such a bad idea for the environments where they were widely discovered to be a bad idea. We've done a lot since then, mostly for the sake of languages with lazy evaluation, but also for the sake of hygienic macros.

As to having the same expressiveness as macros; as a callable form you're right, but there is more to expressiveness than what happens in a call. Consider that macros aren't first-class and don't exist at runtime. Routines, including those defined by fexprs, are first-class values and do exist at runtime. They can be stored in structures, returned from functions, applied using APPLY, bound to identifiers, rebound during runtime, used anonymously the way lambda expressions allow the use of anonymous functions, etc... all of which means they allow one to express things not easily expressed with macros. To translate a program that uses those means of expressiveness into a typical Lisp, it would be necessary to substantially (nonlocally) restructure the program.

Hence, via Felleisen's definition of expressiveness, they are strictly more expressive than macros and user-defined syntax.
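A minimal sketch of both points, written in Python rather than any real Lisp, and not modeled on any particular implementation. The assumptions: expressions are nested Python lists, symbols are Python strings (so this toy has no string literals), and an operative (the fexpr-like thing) is a Python function that receives its operands unevaluated together with the caller's environment. The names Env, Operative and evaluate are invented for this illustration.

```python
import operator

class Env:
    """A lexical environment: local bindings plus a link to the enclosing one."""
    def __init__(self, bindings, parent=None):
        self.bindings = dict(bindings)
        self.parent = parent

    def lookup(self, name):
        env = self
        while env is not None:
            if name in env.bindings:
                return env.bindings[name]
            env = env.parent
        raise NameError(name)

class Operative:
    """Fexpr-like combiner: called with raw operands and the caller's environment."""
    def __init__(self, fn):
        self.fn = fn

def evaluate(expr, env):
    if isinstance(expr, str):       # a symbol
        return env.lookup(expr)
    if not isinstance(expr, list):  # a literal
        return expr
    combiner = evaluate(expr[0], env)
    operands = expr[1:]
    if isinstance(combiner, Operative):
        return combiner.fn(operands, env)  # operands are NOT evaluated here
    return combiner(*[evaluate(o, env) for o in operands])

def _if(operands, env):
    test, then, alt = operands
    # Only the chosen branch is evaluated, and in the environment where it appears.
    return evaluate(then if evaluate(test, env) else alt, env)

def _let(operands, env):
    name, value_expr, body = operands
    return evaluate(body, Env({name: evaluate(value_expr, env)}, env))

base = Env({"if": Operative(_if), "let": Operative(_let),
            "/": operator.truediv, "=": operator.eq, "x": 1})

# The untaken branch (a division by zero) is never evaluated.
safe = evaluate(["if", ["=", "x", 1], 10, ["/", 1, 0]], base)
# The operand x is evaluated where it appears, so it sees the inner binding.
scoped = evaluate(["let", "x", 5, ["if", True, "x", 0]], base)
# The operative is an ordinary runtime value: bind it to a new name and call it.
renamed = evaluate(["let", "when-ok", "if", ["when-ok", True, "x", 0]], base)
```

The last line is the part a macro system would struggle with: "if" is looked up, bound to another identifier at runtime, and then applied, with no special treatment in the evaluator.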

However, given a language that can express (in a positive sense) any semantics, the more interesting question becomes How, specifically, do you specify and implement guarantees of desired properties?

Obviously, if you have a system where any application can be defined to mean anything (which is where fexpr-based systems are going) it becomes vitally important that the programmer must be able to express and rely on _restrictions_and_constraints_ on the program in a way that makes sure those restrictions are not violated, just as easily as he can express positive actions and non-error semantics in a way that makes sure those semantics get executed when called.

Thus, abstraction is built in the space between expression (where we tell the program to 'do this', and ensure by automatic means that it does) and prohibition (where we tell the program that it must NEVER do-that, and ensure by automatic means that it does not).

Of course such expressible prohibitions have always been needed, ever since people started #include'ing code which they did not write, have not examined, and do not understand. It may be that the library works by redefining something which they rely on in other parts of the program, or that different libraries may want to introduce conflicting redefinitions of the same thing. It's just that the "completely open" nature of fexprs, which can have just about any semantics at all, brings it to a head and reminds us of what's been missing from our model of abstraction.

For example, when using Scheme, I'd really like to be able to guarantee that nothing can capture a continuation that includes code in my module. Having that abstraction barrier would allow me to treat the other modules in a much more tractable way. But Scheme gives me no way to express that, with the result that some folks may use my module where it's not valid, and with the additional result that I may accidentally choose a library that does not comply with the constraint to import when there is one available for the purpose which does comply.

Fexprs are more expressive

Fexprs are more expressive than macros, I agree. But it seems to me that the greater expressiveness of fexprs (relative to macros + first-class functions) can only be utilized in ways that reveal the implementation details of one module to another - even at runtime, as you mention.

My goal is to express useful, high quality, maintainable programs. A high level of expressiveness for useless, error-prone, or unmaintainable programs does not contribute towards my goals. Hence, I don't acknowledge expressiveness as a goal, good, or end of its own. I believe we can find more balanced approaches that achieve expressiveness without sacrificing effective control, though I think such approaches may be more specialized (compared to the one-abstraction-fits-all fexpr).

Different goals, different needs.

If you don't acknowledge expressiveness as a goal, good, or end of its own, then you're clearly not interested in solving the same problem I'm working on.

For me it's all about creating a translation-target dialect that doesn't lower the level of abstraction of the translated programs. In other words, that which can be expressed in any language I'm regarding as a source, I want to be able to define a way to express, with no nonlocal transformations -- ie, with the same abstract syntax tree.

Ultimately my goal is to render as much code as possible automatically-translatable, and all translated code interoperable.

So, clearly expressiveness is the most important thing for my problem -- and just as clearly, not for yours.

Interoperability

If your ultimate goal is interoperability, then it is not clear to me that "expressiveness is the most important thing for your problem." Are you missing a step somewhere between "I translated all my code to a common target" and "The code interacts in a useful, consistent, and comprehensible manner"? Do you believe you will not need to constrain expressiveness to achieve interop?

Interop and smooth integration (e.g. mashups) of independently developed applets, libraries, services, and frameworks has been among my goals for years. Translating individual modules to a common target language (i.e. module-local, user-defined syntax) is part of my modus operandi, but I consequently constrain such ad-hoc expressiveness to module boundaries. Interop is achieved via composable abstractions and collaborative resources.

Again, different goals.

Interoperability is more than just getting mashups working; it's also about doing so without lowering the value and maintainability of the codebase. Constraints are an important part of the maintainability and value of the codebase, and no matter what semantics the target language *allows*, it must be constrained as the individual modules need it to be constrained in order for those modules in the translation-target language to have the same value and maintainability as the original.

Because different modules, originally from different languages, have different requirements for constraint and abstractions that often depend on those requirements, it follows that constraints must be controlled intentionally and configurably. That means either that constraints can be explicitly expressed, or that they are present by default but capable of being explicitly suppressed -- probably some of each.

It does not permit such constraints to be incontrovertibly built-in at module boundaries or elsewhere.

I believe that because different languages are constrained differently (and some, like machine code, hardly at all) it is necessary to allow the language to define the level of constraint (and which constraints) are active in a given module.

For example, one cannot implement an isomorphic translation of C without global variables, unchecked casts, preprocessor directives, and assignment. If you limit the effects of these things to module boundaries then different modules both defined originally in C will not interact as they would have in the source language, and translation of a program containing both modules therefore fails. But if these things are allowed everywhere, then the value of a module originally defined in Haskell is subverted because its guarantees are no longer valid.

A translation of something into another language is not as useful as the original unless it can provide the same guarantees as the original. Where such guarantees (compilability, static typechecking, etc) are provided due to the way the original language is implemented, the translation target needs a way for modules originally from that language to state that those guarantees are required and the implementation needs a way to check that they are achieved, or else further development done to such modules after translation (using the target language) will destroy those desirable properties and hence part of the value of those modules.

So ... short version ... I do not believe in the existence of one-size-fits-all module boundaries with abstraction barriers. I believe that it is necessary to define ways for these barriers to be configured to meet the needs of different modules depending on their source and on the properties a maintainer wants to preserve (or achieve) after translation.

Reporting conflicts

Interop really is about getting mashups/toolchains/components/frameworks/services/etc. working together - preferably without hassle. With your emphasis on ad-hoc constraints, it seems you're instead looking for a good report of why things failed to work together.

one cannot implement an isomorphic translation of C without global variables, unchecked casts, preprocessor directives, and assignment. If you limit the effects of these things to module boundaries

I limit user-defined syntax to module boundaries. But that doesn't mean cross-module effects cannot be modeled; it just needs to happen at a later stage. (e.g. modular manipulation of the link-time environment with collaborative resources)

I expect, in general, you'll need to model the whole compilation toolchain for most languages. E.g. you use phrases like "modules defined originally in C", but C technically doesn't have a module system, and you'll need to model the nitty details of the preprocessor and #include directives if you want to preserve behavior.

Unfortunately, that just means you now have a whole compilation toolchain that doesn't interop. E.g. the toolchain you model for C will not interop with the toolchain you model for Haskell. And even if you don't model the toolchains, you'll have a similar issue regarding the constraints for C modules vs. Haskell modules.

Expressiveness might give you simple translations, but will not give you interop.

Anyhow, I hope your immediate goals don't prove a dead-end for your longer term ('ultimately') goals. I can say from experience that it's quite irritating. (Back in 2006-7, my main focus was expressiveness, but it couldn't take me as far as I had hoped.)

Right...

Immediate goal: get the translation target dialect (positive semantics) working and implement an automatic translator for one or two languages (proof of concept).

Medium-term goals: extend the translation target dialect with constraints and restrictions (negative semantics) that allow modules to be specified and protected differently. Add extensive runtime libraries, possibly by translation from other languages, to fulfill the requirements of those modules.

Long-term goal: Document and code as many ways to discover semantic equality of code vectors as possible, both to eliminate redundant dependencies, and as a basis for optimization. I'm aware that the problem is intractable in general, but I consider it likely that handling a few thousand particular cases and subfamilies of it will get 90% or more of the job done in practice.

the greater expressiveness of fexprs

it seems to me that the greater expressiveness of fexprs (relative to macros + first-class functions) can only be utilized in ways that reveal the implementation details of one module to another

It is, perhaps, of some passing interest that this is a relatively specific point on which our intuitions appear to be telling us quite different things. My intuition tells me that, although the greater expressiveness of fexprs obviously does include techniques that break abstraction barriers (break either surgically or brutally), it also involves removal of deeply gratuitous restrictions on expression of things that do not break abstraction barriers.

One caveat. My intuition is likely driven, here, by a more general notion of expressiveness than Felleisen used. Felleisen's expressiveness imposed a partial ordering on languages based on semantics-preserving language transformations of a certain class — polynomial transformations, aka macro transformations. As a peripheral development of my work on abstractive power (mentioned elsewhere in this discussion), I parameterized expressiveness by the class of language transformations, so that Felleisen's notion would be just the particular expressiveness structure produced by using the class of polynomial transformations. Using a more powerful class of transformations makes "A cannot express B" a stronger statement, "A expresses B" a weaker statement, washing out subtler differences between languages. A less powerful class of transformations makes "A cannot express B" weaker, "A expresses B" stronger, identifying subtler distinctions between languages. When I speak of fexpr expressiveness that doesn't break abstraction barriers, I don't have a clear notion of how subtle the distinction may be.

What other systems are already available to extend an app?

Last time I had a C program to which I wanted to add a scripting language (first to configure the C program, then to extend it), I considered:

- chicken
- racket
- bigloo
- ecl
- clisp
- python
- lua
- guile

I needed POSIX threads and a simple, well-documented interface with C, which ruled out all but guile and lua. I favored guile since (as my initial list suggests) I wanted to practice scheme a little (and I like gnus).

I read here and there many criticisms about how guile is useless since there are so many high level languages available already, but I wonder how many people actually tried to extend a C program with them? Python for instance is particularly not appropriate, yet it's often cited...

JavaScript is an option

JavaScript is an option - one that benefits from existing reach and use.

Javascript as an extension language

What implementation do you think is more appropriate to link with a C program?

JavaScript implementations

I'm only familiar with two implementations - SpiderMonkey (C), and V8 (C++). Of these, for a C program, I would favor SpiderMonkey. For C++, today I would favor V8 - it is the cleaner and simpler implementation.

Unfortunately, SpiderMonkey is difficult to access these days. It used to be you could download and build just SpiderMonkey. But if you want all the improvements for the last few years, you need the version buried in a massive Mozilla package with a difficult Cygwin environment.

Recently, I've become interested in HJS (an interpreter for JavaScript in Haskell) and YHC (a Haskell-to-JavaScript compiler). Haskell, today, is extremely competitive as a web-application server.

use your delusion

Hi Raould! Thanks for the comments, they are fair. You start with the meta, and it's true that the article is parade-like; in some ways it reminds me of military parades, with the attendant possibility that the missiles are actually duds. (Digression: is there a situation in which parading a dud missile is the right thing to do? Lambda: the ultimate deterrent?) But anyway, let me explain the context a bit.

Jim Blandy also gave a presentation at the GHM that argued that (1) the multi-lingual thing won't work, making that potential benefit of Guile moot; and that (2) languages don't actually matter, and that GNU needs to do something more popular, Python in particular. I think that (1) is incorrect, and that (2) is misguided. This article is part of the argument for (2) (though there are some other things to say there).

I didn't write this article for the LtU audience; it was really for GNU folk that knew Guile. That said, while this tone is useful when you have strategic goals (as I do), there is a danger in writing with the prophetic voice, in that there is less room for nuance. So, I totally understand your reaction. But I think that marketers understand this voice as well, so there is also the danger that if you avoid charismatic writing, that your thing will never be accepted. After all, we're all about getting language ideas out there, right? We need to use the tools at hand, to the best of our abilities. And, I think, we need to be supportive of each other as we drag the C people towards Lambda the Ultimate, obliquely though it may be.

Now, unmeta: you mention typing. I mentioned it in the presentation but the video is not up yet. I grant that it is a point in which other languages are more expressive than Scheme. But I think you have to consider, is a language with a powerful type system a good extension language? Furthermore, does such a language support a dynamic programming environment? The things I heard about Yi were interesting but I also heard there were some nasty hacks.

As Paul points out, I am ignorant about many points, not coming from the typed-languages tradition. It's telling, but the most convincing approaches I have seen meld contracts and types, as the fine Racket folks are doing.

You mention macros. I would first of all question your second point; I don't even know what it means for a matcher to conflict with another matcher. Matchers interact with data and scope, not other matchers. Your argument could read, "my procedure doesn't work with your procedure problem". But, maybe you are just arguing for features to be baked into languages. While there are some positive points here, I cannot agree.

Haskell (e.g.) does manage to get away without macros, but laziness has its cost, as Peyton-Jones has argued. As a general meta-level facility, macros appear to be second only to fexprs in that regard. Even if this is mistaken (quite possible), most PL folks could make a good debate about it -- which, for the general audience, makes it true compared to whatever language they are using. (They're probably not using Haskell.)

Regarding other languages; this is a tricky topic. It is clear that you can implement any language on Guile. The question is, will it interoperate? Even on the JVM this experience is often not so great (I hear). If it doesn't interoperate well, why bother? Basically this was Jim's argument. Racket and Typed Racket is a good example of working well, though there are issues of course, even there; and again, the PLT folk are doing very interesting work here. But what if you want to offer someone Lua? Not like-Lua, Lua really. Can you compose features from the various languages in a sensible way? Procedures, lists? Not clear, I don't think.

You are also concerned about GNU NIH syndrome. There might be some of it there. But what is GNU, if not an organization that makes software? Take away the software and the organization is damaged, to me. If Chrome stomples Firefox, will Mozilla's stated commitment to "the open web" lead them to drop Firefox and work on something else? It sounds extraordinarily painful, and damaging to the organization. It is great that the boundaries of the free software movement have grown so much as to not possibly fit within the bounds of one project, but projects do have anchors.

Finally regarding popularity, nobody worried about this for Guile five years ago, and that's because it wasn't doing very well. But after a fair amount of work, it's doing OK. And now, the resistance comes; are we at step 2? I cannot predict the future, but if I do that first-order correlation around the short now, I see that if I work at Guile, then things get better. While that continues to be the case, I'll continue the hack (and, for better or worse, the preaching, I think). Other folks might perceive a different future, but if we were only concerned about what other folks think, we would all be using Java!

Maybe not LtU-oriented, but definitely LtU-relevant

You didn't write this article for an LtU audience, but its content feels strongly LtU-relevant to me. In essence, you argue for a flexible, 'low-level' in the "build your own abstraction on top of mine" sense, expressive extension language. Your macro example says that, but to me it's the delimited continuation example that really emphasizes that part of your argumentation; indeed, delimited continuations are really the mother of all control structures. This idea is debatable (many people have talked here about the 'incompatible libraries by lack of standard high-level abstraction' problems, it's unclear whether it applies to extension languages as well), but definitely language-centered and interesting; of course this has been the general Scheme position for a long time, but Guile may be in an interesting position to continue this vision.

It's interesting you mention the Racket people, both for their macro work and their typing and contract work. On both fronts, theirs seems to me to be the most advanced Lisp/Scheme implementation out there. You seem to be quite interested in low-level implementation details and efficiency problematics -- it's interesting that everybody mentions v8 or LuaJIT performance as a big point in favor of JS or Lua, while no one questions the idea that an extension language may not need so much performance after all. I hope you can advance in this way; LuaJIT certainly demonstrated that, starting from a well-designed language, a talented individual or small group of individuals could build a very competitive implementation. The Racket folks are doing research on other aspects of the language, and I would love to see these ideas going into Guile -- and performance improvements go to Racket -- so I hope you find a way to collaborate.

You also talk about your view on success, popularity, group values, which I'm not sure are so relevant to LtU -- and I have generally not been very interested in the comments to your post expanding this aspect. I also wish to apologize for having maybe a bit too naively submitted your post on reddit -- I didn't think of LtU, but should have; while I think it deserves the interest it got, it was apparently not meant to be widely distributed; and I felt a bit depressed reading some of the comments on your blog.

High level incompatibility

many people have talked here about the 'incompatible libraries by lack of standard high-level abstraction' problems, it's unclear whether it applies to extension languages as well

The issue certainly applies for extension languages. But an application's architecture (regarding concurrency, shared references or closures, and other interactions between extensions, extension model, threat model) can do much to marginalize the concern.

For example, the issue of having dozens of incompatible frameworks doesn't seem so bad for JavaScript and DHTML until you start trying to develop secure mashups.

an extension language may not need so much performance after all

History has demonstrated repeatedly and adequately that the extension language eventually becomes the whole application. I personally consider the distinction a bogus one - a GPPL should be designed for extension, and it's the low-level CPU or GPU code that becomes the DSL.

Performance is certainly a concern. Even if we allow for hand-waving about how much more powerful our computers are, battery-life still demands efficiency. But I agree with the route you favor: develop a good programming model, and don't compromise the design for apparent performance advantage.

Performance and Extension Languages

I add to my earlier statement.

If our components cannot work together effectively (safely, securely, efficiently, concurrently, remotely), we'll inevitably pay 'boiler-plate' costs at the composition boundaries - e.g. for loading and initializing replicate libraries, serialization, deserialization, validating structure or authority, synchronizing, scheduling resources, and so on. Such redundancy is often reflected in the code, which costs productivity. Further, many automated optimizations are unavailable across these boundaries - e.g. even if some data is never used, we might still pay to serialize, store, and parse it.

To avoid the composition overheads, developers are under pressure to write monolithic code. And monolithic code in turn hinders reuse. And hindered reuse, in turn, results in more not-very-composable components. This is a vicious cycle.

Languages such as C/C++ demonstrate these issues. However, these aren't the sort of problems that show up in a shootout benchmark, and they apply just as effectively to extension languages.

My work on resolving the bloat and redundancy performance problems immediately preceded my development of RDP. Unnecessary redundancy can be eliminated by leveraging spatial commutativity and spatial idempotence to support multi-cast, caching proxy, and other distributed optimizations. Bloat can be reduced by 'demand-driven' resource management - i.e. RDP makes it easy to manage which resources and computations are necessary, and when, due to duration coupling, temporal semantics, and asynchronous arrows.

People who say things like: "but we need imperative languages, because our CPUs are imperative and we need this low-level performance!" are the very same people who are responsible for applications and operating systems taking LONGER to start up on machines that are vastly more powerful. I don't blame them for their narrow perspective, though. I blame the system they are forced to work within.

as the person who started the Guile project

First, as the person who started the Guile project, I should mention that Andy has by now done considerably more work on it than I did in the first place - and taken in directions I'm pretty sure I would not have. I mention this to make clear that I am not trying to take credit for his work, which I respect.

Some things I believe: I think a lot of the comments here aren't very interesting. I think that Andy has some good ideas but should stop believing the GNU project exists. I think Guile "failed" (if you see it that way) mostly because early on it was sabotaged and because the GNU project (in any meaningful sense) kind of fell apart. To these points:

It was a good goal to have Guile as "the" GNU extension language library. Here are some sample reasons: A decent extension language is a good thing to have for many kinds of program. Lisp family languages can work nicely - for example GNU Emacs is still having a decent run (and how many programs that old can say that?). For parsimony, it's nice to imagine a personal computing system in which many major components all have the same extension language support (so that they can share code and so that users can leverage what they learn). Compared to GNU Emacs lisp, Guile's dialect of scheme corrects many minor annoyances noticed over the years. By emphasizing lexical scoping -- Guile can have an easier time improving performance. Because of decent macros, Guile can be easily extended without having to modify the core of the implementation. [*] The simplicity of a Scheme core is a nice high level target for extensions to the language itself. I could go on... it has a lot of virtues, as Andy noted.

[* note: I've forgotten if Andy has deprecated or eliminated the non-hygienic macro support from Guile but I think so. For the record, I think that (if so) that was a mistake. Also, I think keeping the graph interpreter rather than bytecode (but sure, adding incremental compilation) would have been nice. But this topic digresses so... moving on.]

It's really not important that a GNU extension language triumph over _______. You can fill in the blank with your favorite language feature that Guile lacks or your favorite scripting language or what have you. The original aim was for Guile to be useful and long-lasting and an improvement on the other useful and long lasting extension language, Emacs lisp. That original aim is still valid.

It's a little bit absurd to say something like "Well, language X came on the scene at the same time as Guile and today X is very popular while Guile far less so. Obviously X got something right that Guile got wrong." In pretty much every case, the Xs that people usually bring up were significantly advantaged because they had some loyalty of the early supporters. The organizations around those Xs were, far more than Guile, interested in seeing the effort succeed.

In contrast and for reasons far outside the scope of programming language design and implementation, a significant part of the organization around the early days of Guile tried to kill it. They damn near succeeded and damn near took out me in the process. They pushed to resist RMS' goal of making GNU programs extensible using Guile (some strongly favored TCL, instead). They pushed to diminish the FSF's ability to lead the GNU project in any meaningful sense at all.

In that sense, Guile was sabotaged: born into the world and immediately fitted with concrete shoes.

Meanwhile, the GNU project itself was (and is) not exactly a model of managerial success. In the same years that Guile was just getting started, the Free Software Foundation was less and less able and willing to attempt to coordinate the technical aspects of building a "complete system" for personal computing. This was not an entirely bad development, but also not entirely good:

Interest around the world in developing free software began to explode around that time. The Linux kernel was a shiny new toy as was the Gimp, the various desktop GUI toolkits, and so forth.

In the years preceding, GNU was being developed mainly by a relatively small society of hackers who were reasonably able to coordinate and try to build a coherent collection of software. The prospect of a "GNU extension language" seemed an easy choice because, well, many or most of the GNU hackers wanted it. As the dot com bubble formed, as corporations began to resist the FSF's leadership, and as the number of free software developers exploded -- all of that chance for coordination fell apart. The FSF, I'm told, threw up its hands and gave up on orchestrating any serious degree of technical leadership and coordination.

What that meant for Guile is that the early constituency who were expected to be early adopters became in part irrelevant, and in part corporate employees who were actively resisting the FSF's leadership.

And so the GNU project became "hollowed out" as any kind of technical vision -- and Guile was left standing in isolation, supported only by a handful of volunteers.

I think that all of that is tragic. I think that the organization I worked for at that time was corrupt in its values and did great harm to a worthy project (GNU, including Guile).

Most importantly for this discussion: I think that Andy's considerable efforts over the years since have been impressive. I think that his description of why Guile is still worth doing is simple and sensible. And I hope some of us can help restore something like the GNU project to a meaningful state of technical leadership and development.

Starting with Controversy

Thanks for taking the time to write what you saw of the events. I think Andy and yourself did / have done some great work.

I wonder if things would have been different if the whole RMS / TCL thing had not occurred. As an unimportant programmer who was just part of the crowd, it seemed like an attack out of nowhere and colored any solutions by RMS / GNU in a bad light. I was ok with TCL at the time and probably could have migrated to something else.

Lisp must be on the way up.

If a mainstream language's acceptance is directly proportional to the amount of vitriolic reaction it inspires, à la what is often directed at C++ and Java, then judging by quite a few statements here Lisp is primed for the big time. Which is great, I'm one of those strange ones that enjoys the s-exp based language.

Don't mistake static type checking for usability.

Right now, static types are getting a lot of attention in the formal-languages community, but I have the impression that much of the attention is because at this point we're starting to have tools in our toolbox of formal methods that allow us to reason in terms of static types. We can prove new things with these tools, and show relationships we couldn't show before, so right now types are where programming languages as a formal discipline are progressing.

But with all due respect, the ability for mathematically minded people to formally prove things is only partially related to ease of use and usability by programmers. And even if it did define ease of use and usability, I don't think anyone can claim that it is impossible to statically check types in a scheme program (hint: it's a pretty standard junior project to write a static checker). Nor can anyone claim that a scheme (such as guile) which implements records lacks generative types or type expressions.

Y'all are upset, perhaps, because static type checking is optional in scheme, the programmers rarely bother, and sometimes expressions which have a complete union type (like READ) crop up and nobody actually using the language much minds. Dynamic checking sorts it out when necessary and that's enough for them. So, perhaps, you feel that your worldview is being somehow disrespected because these programmers aren't attaching "enough" importance to something vital to your discipline. The implication that it isn't necessarily vital, or even particularly important, to their discipline is a challenge to a dearly held belief.

But if the actual programmers don't much mind using dynamic rather than static type checking, does that offer some perspective on how important static type checking actually is to the programmers? It's terribly important right now to guys coming up with mathematical proofs, because we have tools that can use it to shed light on some things - but the evidence from the programmers themselves seems to indicate that dynamic type checking is enough to satisfy many of them about usability and ease of use. And even if it isn't, there are static checkers out there for people who want more reassurance. But the dissonance here sounds suspiciously like people getting defensive and upset because another community fails to value something (static checking of types) which they need and hold dear.

We have to accept that, while static type checking does contribute in certain ways to usability and ease of use, it's vital right now to mathematical proofs about language but not necessarily vital to programmers actually using a language. At a later period in history, we'll have mathematical tools that enable us to prove more interesting things about dynamic type checking or whatever, and then the focus of the formal-language community will change to cover the new ground.

As for the question of macrology, I can't offhand think of any other remotely mainstream language which has exactly the kind of macros that scheme has - which is to say, hygienic macros expressible and manipulable in a homoiconic language. That's a hell of a macro system to wrap joe average programmer's head around, and starting by saying it's unique isn't all that far off the mark. On the other hand, if you know of hygienic macros defined with homoiconic macro languages in other languages that allow manipulation of macros, please do speak up with specific counterexamples -- those are some languages I want to check out.

Nice rant. But I don't see

Nice rant. But I don't see anyone in this thread confusing static type checking for usability.

Regarding remotely mainstream languages with powerful, hygienic macros or equivalent, I would suggest you look into languages such as Dylan and XL. I favor extensible syntax to macros, and prefer standardization of AST or GADT (plus compiler/interpreter as library) to homoiconic.

Several languages use techniques even more powerful than macros - fexprs, pattern calculi, term rewriting. But such power comes at significant cost.

Who are you replying to?

Nobody in the reaction to Andy's post insisted on static typing. raould mentioned it in passing, the slides of the presentation have one single line about typing, and really it wasn't part of the debate at any point. Some people have talked about other potential choices for extensibility, and I've seen Python, Javascript and Lua mentioned, none of them being statically typed.

(I have some doubt about your claim that it is "a pretty standard junior project to write a static checker" [for a scheme program], assuming the program is not type-annotated. The closest thing I know is the Dialyzer tool for Erlang; it's very helpful and interesting but definitely not a 'junior project'. But really I'm not sure if this is in any way related to the Guile discussion.)

Standard Junior Project

My whole class got to implement static typecheckers when I was a junior in university. I used a variant of Java from Andrew Appel (as did most of the class, since we were using Appel's compilers textbook), but I know a member of class chose to use an annotated variant of Scheme.

Granted, this is very different from handling the full language, but I think Ray's claim here is reasonable - that implementing a static type-system is not an especially difficult task. There is a certain degree of type-inference even if manifest types are heavily used.
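
For a sense of scale, here is a rough Python sketch of a checker for a tiny annotated expression language. The tuple encoding, the form names, and the two base types are assumptions made up for this illustration; this is not the language from the Appel course or any Scheme dialect. Parameters carry manifest types, and the result type of a function is still inferred from its body, which is the small amount of inference mentioned above.

```python
# Expressions are nested tuples (an encoding invented for this sketch):
#   ("int", 3)  ("bool", True)  ("var", "x")
#   ("add", e1, e2)  ("if", cond, then, else)
#   ("lam", "x", "int", body)   parameter carries a manifest type
#   ("app", fn, arg)

class TypeCheckError(Exception):
    pass

def check(expr, env=None):
    """Return the type of expr, or raise TypeCheckError."""
    env = env or {}
    tag = expr[0]
    if tag == "int":
        return "int"
    if tag == "bool":
        return "bool"
    if tag == "var":
        if expr[1] not in env:
            raise TypeCheckError(f"unbound variable {expr[1]}")
        return env[expr[1]]
    if tag == "add":
        if check(expr[1], env) != "int" or check(expr[2], env) != "int":
            raise TypeCheckError("add expects two ints")
        return "int"
    if tag == "if":
        if check(expr[1], env) != "bool":
            raise TypeCheckError("if condition must be bool")
        then_t, else_t = check(expr[2], env), check(expr[3], env)
        if then_t != else_t:
            raise TypeCheckError("if branches disagree")
        return then_t
    if tag == "lam":
        _, param, param_t, body = expr
        # The body type is inferred; only the parameter is annotated.
        body_t = check(body, {**env, param: param_t})
        return ("fn", param_t, body_t)
    if tag == "app":
        fn_t, arg_t = check(expr[1], env), check(expr[2], env)
        if not (isinstance(fn_t, tuple) and fn_t[0] == "fn"):
            raise TypeCheckError("applying a non-function")
        if fn_t[1] != arg_t:
            raise TypeCheckError("argument type mismatch")
        return fn_t[2]
    raise TypeCheckError(f"unknown form {tag}")

inc = ("lam", "x", "int", ("add", ("var", "x"), ("int", 1)))
print(check(inc))
print(check(("app", inc, ("int", 41))))
```

This handles nothing like full Scheme (no mutation, no union types, no READ), which is exactly the gap between a class project and a tool like Dialyzer.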

I didn't start the fire

I assume this is mostly directed at Paul Snively, based on the tone and attacks on static typing, and I don't really want to join in the flame throwing. But just fanning the flames should be ok:

At a later period in history, we'll have mathematical tools that enable us to prove more interesting things about dynamic type checking or whatever, and then the focus of the formal-language community will change to cover the new ground.

I can sort of imagine such a system, where such proofs about dynamic programs are mostly inferred, but I think we'll always need to help the inference system along with some sort of static proof annotations. Maybe in the future they'll call these annotations "static types" :).

As for the question of macrology, I can't offhand think of any other remotely mainstream language which has exactly the kind of macros that scheme has - which is to say, hygienic macros expressible and manipulable in a homoiconic language. That's a hell of a macro system to wrap joe average programmer's head around, and starting by saying it's unique isn't all that far off the mark. On the other hand, if you know of hygienic macros defined with homoiconic macro languages in other languages that allow manipulation of macros, please do speak up with specific counterexamples -- those are some languages I want to check out.

Among mainstream languages, I agree with you that there's nothing really comparable to Lisp macros, but I'm also convinced that manipulating S-exprs isn't a good idea for the same reasons that building C programs with string manipulation isn't a good idea - it's not the right level of abstraction.

Among mainstream languages,

Among mainstream languages, I agree with you that there's nothing really comparable to Lisp macros, but I'm also convinced that manipulating S-exprs isn't a good idea for the same reasons that building C programs with string manipulation isn't a good idea - it's not the right level of abstraction.

I can sort of see your point; the abstract syntax tree of an expression (which is what Lispy macros manipulate) is not actually the semantics of the expression. But I'm having real trouble imagining a system for manipulating the semantics of an expression directly, or coming up with a better manipulable representation for the semantics of an expression than its abstract syntax tree.

And of course, we're talking specifically about Scheme macros here, which means that even regular Lisp macros aren't directly comparable, because regular Lisp macros don't enforce naming hygiene. So, as far as I know, Scheme macrology truly is unique, even among lisps.
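
To illustrate what "naming hygiene" buys, here is a small Python sketch that models S-expressions as nested lists and expands a hypothetical `swap` macro two ways. The `gensym` helper and both expanders are invented for this example. It shows only the renaming half of hygiene (introduced bindings cannot capture user variables); real Scheme macro systems also protect the macro's own free identifiers, which this does not attempt.

```python
import itertools

_counter = itertools.count()

def gensym(base):
    """Return a fresh name; ':' is assumed not to occur in user identifiers."""
    return f"{base}:{next(_counter)}"

def swap_unhygienic(a, b):
    # Expands (swap a b) using a fixed temporary name.
    return ["let", [["tmp", a]], ["set!", a, b], ["set!", b, "tmp"]]

def swap_hygienic(a, b):
    # Same expansion, but the temporary is renamed to a fresh symbol.
    tmp = gensym("tmp")
    return ["let", [[tmp, a]], ["set!", a, b], ["set!", b, tmp]]

# The user's own variable happens to be called "tmp".
bad = swap_unhygienic("tmp", "other")
good = swap_hygienic("tmp", "other")
print(bad)
print(good)
```

In `bad`, the `(set! tmp other)` form assigns to the macro's local `tmp` instead of the user's variable, so the swap silently fails. In `good`, the temporary has a name the user could not have written.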

Be that as it may, you have raised a fascinating point; What is the proper level of abstraction for macros?

This is what I'm implementing with my three day weekend

I have an answer in mind to your question, but I don't want to present my approach until I've played with it some. I don't know if it's the right answer, but I think it will be nicer than macros. The basic design is that we have a syntax layer that helps assemble an AST (if we want S-expr syntax, this step becomes trivial), and then the AST is used to build a tree of what I call Constructs, which interact via a sort of typed meta-object protocol. So in a situation where you have macros A and B interacting like (A (B foo)), you don't have A interacting with B via pattern matching on its expansion. Instead, there is some protocol for what A is expecting (maybe it's expecting a Symbol or an Expression) that defines what A can do with its parameter. I'll leave the details missing, but hopefully that gives you a picture of what I have in mind. I think it will be more composable than macros and will allow a non-S-Expr surface syntax that doesn't hinder macro composition.

You should take a look at

You should take a look at David Fisher's PhD thesis on Ziggurat, links here: http://www.ccs.neu.edu/home/dfisher/

Thanks for the link. I'll

Thanks for the link. I'll check it out sometime to see how it compares.

OMeta

This reminds me of Alessandro Warth's OMeta

Thanks, I'll check this out

Thanks, I'll check this out sometime in more detail. I've looked at this before and think there are probably considerable differences, but I could be wrong about that.

See Also...

Modules over Monads and Linearity

Inspired by the classical theory of modules over a monoid, we give a first account of the natural notion of module over a monad. The associated notion of morphism of left modules ("Linear" natural transformations) captures an important property of compatibility with substitution, in the heterogeneous case where "terms" and variables therein could be of different types as well as in the homogeneous case. In this paper, we present basic constructions of modules and we show examples concerning in particular abstract syntax and lambda-calculus.

You're putting the cart

You're putting the cart before the horse. We won't be deriving proofs for programs, we're deriving programs from proofs.

But if the actual programmers don't much mind using dynamic rather than static type checking, does that offer some perspective on how important static type checking actually is to the programmers?

Those programmers may not have the mathematical background necessary to understand why static type checking is important and so their perspective is lacking. Now, I'm not trying to be elitist, but they may not have had a good survey of the computing field since actual programmers are working for actual businesses which want actual profits and they don't care how they get those profits.

We have to accept that, while static type checking does contribute in certain ways to usability and ease of use, it's vital right now to mathematical proofs about language but not necessarily vital to programmers actually using a language.

Is the goal of computing scientists and software "engineers" to appeal to those who cannot program or who do not wish to expand their knowledge of programming? Is our goal to allow people who cannot program to be able to program? Why does mediocrity have to be our minimum standard for a programmer?

You may not think these formal methods are necessary for programmers using the language but there are hundreds of thousands of program "bugs"/errors/defects that prove otherwise.

What is convincing?

Andy Wingo's editorial called "the gnu extension language" concluded "I have argued, I hope convincingly, that Guile is a good choice for an extension language." The argument there is made by highlighting a list of technical features and claiming, somewhat abstractly, that these features are important and that Guile uniquely brings this combination.

My intuition says that an argument in the style of the Computer Language Shootout (CLS) could be the basis of a more convincing argument. The relevant comparison would not, of course, be about measuring execution speed, but rather be optimized for purposes of comparing source code representations of the same small programming task across languages, which is something people often do anyway with the CLS. Rather than reading a list of technical features, I would like to see how the presence and absence of those features is reflected in the preferred source code representations for various scripts done in different languages. It should be possible to choose stand alone tasks that are sufficiently similar to application extension tasks, under some well described mapping, to yield a reasonable comparison.

Function literals for extensibility

Macros are very good for extensibility, so that's a good choice by itself. However, the article says they are the only game in town, which is not true. Function literals serve much of the same needs in practice without adding such a wild and wooly feature. Since macros are so very powerful, any code using them requires significantly more time to read, because first you have to learn what all the macros do.

Historically, Lispy languages have had a relatively heavy syntax for function literals, because you have to write out the word "lambda", thus making the weight of a small function be large compared to its body. Compare these options from other languages:

(lambda (x) (+ x 1))
x => x + 1
[ :x | x + 1 ]

Additionally, there have been developments in the ability to leave out any function indicator at all for cases where it is clear from context that the argument is a function. In statically typed languages, the types are used:

using(new File("foo.txt"), { /* this is a function in here */ })

The type of "using" would indicate that the second argument is a function. While this feature hasn't, to my knowledge, been explored in dynamic languages, there is no fundamental technical obstacle. You just wait until each method call to decide whether the responding method wants its arguments to be wrapped into a function:

http://www.lexspoon.org/blog/late-bind-order.html

Long story short: saying that macros have no competitors is ignoring the more popular part of this design space. Functions are a very strong competitor if you lighten their syntax.
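
As a rough illustration of the `using` pattern above with an explicit function literal, here is a Python sketch. The `using` helper is hypothetical (Python's built-in `with` statement already covers this case), and `io.StringIO` stands in for a real file so the example is self-contained.

```python
import io

def using(resource, body):
    """Call body(resource), then close the resource even on error."""
    try:
        return body(resource)
    finally:
        resource.close()

# io.StringIO stands in for a real file here.
handle = io.StringIO("first line\nsecond line\n")
first = using(handle, lambda f: f.readline().strip())
print(first, handle.closed)
```

The extension point is an ordinary function taking a function; no macro is needed, and the cost to the reader is one `lambda`.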

Delaying evaluation is only one aspect of macros

I think this is a good point, but it addresses only a small subset of macro uses - delaying evaluation.

It cannot do things like the "user-interface macros" of CLOS (DEFCLASS, DEFMETHOD, ...), for example.

The luxury of having macros in your language is knowing that whatever abstraction you invent or need, you'll be able to have a boilerplate-free syntax for it, too. In this area, I don't see lightweight function syntax as a competitor to macros.
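
The "delaying evaluation" subset is the part that function literals do cover. A minimal Python sketch, with a made-up `my_if` and a `noisy` helper that records which branch ran:

```python
def my_if(condition, then_thunk, else_thunk):
    """User-defined conditional: only the chosen branch is evaluated."""
    return then_thunk() if condition else else_thunk()

calls = []

def noisy(label, value):
    calls.append(label)  # record that this branch actually ran
    return value

result = my_if(True, lambda: noisy("then", 1), lambda: noisy("else", 2))
print(result, calls)
```

What this cannot do is introduce new definitions or a new declarative form at the call site, which is the DEFCLASS-style use that the comment says lightweight function syntax does not reach.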

Lambdas are not (only) about delaying evaluation

Lambdas are not only about delaying evaluation. In fact, historically, they were invented as a universal way to express quantifiers, that is, a universal variable binding device. Lambdas bind variables in a body, and I think that's the essence of lambdas.

In programming languages, lambdas have also been given an operational semantics that may or may not express "delaying evaluation" (in some lambda-calculi you may reduce under abstractions; of course it's only tractable in the absence of global side effects), and this is one of their main strengths. But they're still being used, when you "factor duplicate code into a function", as a fundamentally syntactic way to parametrize a piece of code over a variable value.

In fact, it is not terribly practical to use the same construct (undifferentiated lambda-abstractions) both as a syntactical device ("the one true way of representing variable binding, that expresses all others") and as a computational device that delays evaluation, captures the environment, and does all the things that come with the "efficient" operational use of lambdas (allocating a closure, etc.), when the syntactic evaluation strategy "substitute the argument in the body" is rejected for efficiency reasons. Indeed, the ability of "operational lambdas" to compute on their argument is at odds with what we expect of the "syntactical lambdas" that only bind them.

This tension appears clearly in the strongly typed lambda-calculi that use the Higher Order Abstract Syntax style: one usually needs several different function types and lambda-abstractions to express those contradictory needs. See for example the "positive" (binding) and "negative" (computational) arrow types in Dan Licata's work with Robert Harper and Noam Zeilberger ("Focusing on Binding and computation", 2008, and "A Universe of Binding and Computation", 2009).

I've often wondered how it would be possible to split those two uses in different lambda-abstractions in a more down-to-earth, less precisely typed language (no focusing in the type system, etc.). One classical example is the `with-open-file "foo.txt" (lambda (file-handle) ...)` use; here we only need a binder-lambda, not a computational-lambda -- and in particular it's not really necessary to allocate a closure and all.

CBPV

Another nice aspect of Call-by-Push-Value-like languages is that they decouple identifier binding from delaying computations (thunking). Such languages let you express, and study, these two behaviours separately.

When someone finally writes

When someone finally writes "Call by Push Value For Dummies", I'm gonna come back and re-read that comment. :)

Thanks

That got me thinking!

A good extension language

Thomas Lord asks rhetorically:

What's a good extension language for a (purported) GNU project? In the vision of a complete GNU system, based on but extending unix with a graphical desktop and lots of commonly used applications -- what's a good extension language for those apps to use?

There's no perfect answer. The experience of early GNU hackers was that a lisp is a fine extension language -- and that a few clean-ups relative to Emacs lisp would be nice.

That was a vision from fifteen to twenty years ago. The Internet was crawling from infancy into toddlerhood. AOL disks were in every mailbox. Memory-to-CPU performance ratio was orders of magnitude better. Concurrency was mostly for time-share and job control.

But what would be a good extension language today?

I feel that 'extension mashups' will be an important part of that answer - i.e. extensions that modify or integrate other extensions. So will be code distribution, continuous upgrade or maintenance, and cross-app communication. Naturally, it must be feasible to cleanly remove an extension from a mashup. Mobility and synchronization between stations (e.g. laptop and desktop) would also be nice.

It would be very useful to extend services, too, e.g. putting agent code near a database or a web server, whether to receive and transform requests from other people, or to serve as an ambassador.

Effective security and resource control really aren't optional, in my vision.

Concurrency will be much more important, due to relationships between applications, services, extensions, and users. An extension language that cannot support sanity in the face of continuous change seems to me of limited utility, even today, and would severely constrain which classes of extension are feasible.

Also, efficiency and fast startup should be a priority. Operating systems today take forever to start because the programming model for extending them (i.e. 'applications' and 'services') requires a lot of redundant loading and initialization. I wouldn't want the applications to suffer the same fate due to extensions. A programming model that can load code on demand, and cache or persist a lot of the initialization rituals, would be valuable if we succeed.
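
A minimal Python sketch of the "load on demand, cache the initialization" idea, assuming a hypothetical `load_extension` function whose body stands in for real loading work. Persisting the cache across runs is not shown.

```python
import functools

init_log = []

@functools.lru_cache(maxsize=None)
def load_extension(name):
    """Run the (expensive) initialization once per extension name."""
    init_log.append(name)  # stands in for real loading work
    return {"name": name, "commands": [f"{name}-run"]}

def invoke(name):
    # The extension is loaded on first use, not at application startup.
    ext = load_extension(name)
    return ext["commands"][0]

invoke("spellcheck")
invoke("spellcheck")
print(init_log)
```

Startup cost is then proportional to what the user actually touches rather than to what is installed, though this says nothing yet about concurrency or clean removal of an extension.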

I won't pick on Guile, because it's just one of a crowd that seem inadequate as an extension language for today and tomorrow.

re: A good extension language

None of the features or requirements you describe are stated clearly enough that I see how to relate them to any practical concern for a GNU extension language in this context. I think you make a fine case for a more simple-minded approach.

Incidentally, one side effect of Guile's beat down is that because it didn't get picked up in some key components when we would have liked, the architecture of those components didn't evolve in the context of a powerful but simple extension language. Huge loss.

Simple minded approach

Requirements analysis, and turning vague concepts into precise values for a particular context, is not the job of the prospective users. I have precise requirements for my own programming model - which aims to obviate the notion of 'desktops' anyway, in favor of ZUIs and a continuous browsing/IDE experience - not for the GNU project. (I have no love for GNU or its vision, though I do appreciate the groundbreaking work developing an open-source ecosystem.)

I believe Guile will be less effective than desirable as an extension language in this world with concurrency and networking and continuous updates. No huge loss, though, since expectations for Guile are nice and simple-minded. Developers can always hack some extra APIs or frameworks to deal with inadequacies as they become obvious.

Video now available

HEY: project suggestion

The original idea of Guile is still viable, I think, for some measure of "viable". Here is a project suggestion:

Someone could write a new version of Emacs, mostly in Guile, not worrying about elisp compatibility, but worrying about getting an Emacs architecture really nicely done in Scheme. By "emacs architecture" I mean things like the nature of the event loop and keymaps, including how they generalize to graphical displays; concepts like buffers that are distinct from windows-on-buffers; an "interactive user" perspective for writing extensions (e.g., primitive functions like "move forward one character"); modes; ... that kind of thing. It should be suitable for writing an Emacs-like text processor, a word processor, a spreadsheet, a fancy emacs-like shell, and so forth.

A hard part is that it should be light enough to run on very low-end hardware. It should be lighter than a modern web browser even though it would have some analogous functionality.

It should be cleanly customizable, extensible, and self-documenting.

It should still work, on plain text files, on regular terminals (i.e., should not require a GUI).

If all of that is possible well enough to create healthy alternatives to a lot of popular free software desktop tools, then in my view the Guile project will have at long last succeeded.
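
To give a picture of the pieces named in "emacs architecture" (buffers distinct from windows, commands written from the interactive user's perspective, keymaps driving an event loop), here is a very rough Python sketch. All class and function names are invented for illustration and do not correspond to Emacs or Guile APIs; for brevity the point lives on the buffer rather than on each window.

```python
class Buffer:
    """Text plus a point (cursor position); knows nothing about display."""
    def __init__(self, text=""):
        self.text = text
        self.point = 0

class Window:
    """A view onto a buffer; several windows may share one buffer."""
    def __init__(self, buffer):
        self.buffer = buffer

# Commands are written from the interactive user's point of view.
def forward_char(buf):
    buf.point = min(buf.point + 1, len(buf.text))

def backward_char(buf):
    buf.point = max(buf.point - 1, 0)

def self_insert(ch):
    def command(buf):
        buf.text = buf.text[:buf.point] + ch + buf.text[buf.point:]
        buf.point += 1
    return command

# A keymap is plain data, so extensions can rebind keys at runtime.
keymap = {"C-f": forward_char, "C-b": backward_char}

def dispatch(window, key):
    # Event loop body: look up the key, fall back to inserting it.
    command = keymap.get(key) or self_insert(key)
    command(window.buffer)

buf = Buffer("guile")
left, right = Window(buf), Window(buf)
for key in ["C-f", "C-f", "!", "C-b"]:
    dispatch(left, key)
print(right.buffer.text, right.buffer.point)
```

Edits made through one window are visible through the other because both share the buffer, and nothing here depends on a GUI toolkit or a particular display.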

p.s. re project suggestion

I would strongly suggest not beginning with an existing "gui toolkit" because those toolkits and an "Emacs architecture" don't map to one another at all cleanly. Don't worry about "native look and feel" for a project like this.