First-class Macros

In First-class Macros Have Types, POPL 2000, Alan Bawden describes a way to make macros behave as first class values.

In modern Scheme, a macro captures the lexical environment where it is defined. This creates an opportunity for extending Scheme so that macros are first-class values. The key to achieving this goal, while preserving the ability to compile programs into reasonable code, is the addition of a type system. Many interesting things can be done with first-class macros, including the construction of a useful module system in which modules are also first-class.

Bawden points out that while he used a static type system...

[t]here is no fundamental reason why this macro type system needs to be static. The same safety could be achieved using dynamic typing. ... The compiler would emit code to check the tags at runtime to make sure that a value used as an operator always had the type that the programmer promised the compiler it would have.

2000?

hm, has nobody taken up that research, towards practice?

Gasbichler's modules

Martin Gasbichler's PhD thesis, Fully-parameterized, First-class Modules with Hygienic Macros, is the most promising further work I have seen. Check out the discussion in section 6.1.

Is It Even This Complicated?

A closure-function captures the environment it is defined in. A macro captures the environment it is applied in and evaluates its own result in that environment.

Not with hygienic macros

A macro captures the environment it is applied in

The idea of an hygienic macro is to capture the defining environment rather than the expansion environment.

What this paper adds to hygienic macros is the ability to pass a macro as an argument to a function. The result is a very clean, very powerful form of both syntactic abstraction and multi-stage compilation.

I'm with raould, this sounds like a very fruitful line of PL design.

There are languages with first-class 'macros'

The Template Haskell school (for want of a better phrase - it might better be called the MetaML school) of compile-time meta-programming inverts the traditional Lisp notion of macros. Put very simply, in Lisp macros are special things that are explicitly identified, but macro calls aren't (they're normal-looking function calls that happen to do special stuff at compile-time); in Template Haskell, macros are normal functions (that just so happen to return ASTs) but macro calls are explicitly identified. This means that you can do anything you would normally do with functions, so 'macros' are first-class in such an approach. You can also see this in my Converge language (which is dynamically typed, just to show that first-class macro-ness and static typing are not inextricably linked).

Unless I missed something

Unless I missed something in Template Haskell "macros" are expanded only at compile time whereas MetaML supports staged compilation.

Correct

Correct. This isn't a very important difference IMHO because (simplifying somewhat) I've yet to see a convincing use of MetaML's multi-stage programming. Don't forget, though, that TH/Converge-style 'macros' can generate 'macros' (since this just means generating an AST of a function) - it's not exactly the same thing, but it gives you similar power.

Benefits of macros?

Are there any benefits to macros, first-class or not, assuming one has first-class functions and staging?

Sweet, sweet sugar

Macros allow you to create several related declarations (modules, functions, variables, types, aliases, etc) from only a small bit of code. E.g. if a language doesn't have records with named fields then the right macro can create a type alias for a tuple/array/list/whatever, a constructor, and field getter/updater functions.
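
Here is a rough sketch of that idea in PLT-style Scheme (a minimal sketch only: the macro, its name, and the tagged-list representation are all invented for illustration, and a real version would also generate updaters and handle any number of fields):

  ;; One macro use expands into a constructor plus per-field accessors
  ;; for a two-field "record" represented as a tagged list.
  (define-syntax define-pair-record
    (syntax-rules ()
      ((_ tag make field1 field2)
       (begin
         (define (make a b) (list 'tag a b))
         (define (field1 r) (cadr r))
         (define (field2 r) (caddr r))))))

  ;; one line of code, three related definitions:
  (define-pair-record point make-point point-x point-y)
  (point-y (make-point 3 4))   ; => 4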

Another example: if your language doesn't have sugar for working with arrows, monads, or the newest functor du jour, macros can let you add the sugar to make hairy expressions more pleasant.

Another example: if your

Another example: if your language doesn't have sugar for working with arrows, monads, or the newest functor du jour, macros can let you add the sugar to make hairy expressions more pleasant.

Perhaps I should qualify my statements. In a language with S-expressions, like Scheme or Lisp, I don't see any advantage of macros over lambdas, modulo efficiency. I only have limited experience with these languages, but given the simple syntactic structure, I think that's an accurate statement.

In languages like OCaml, with its camlp4 preprocessor, I agree that syntactic sugar can be considered "an advantage" in a way, though I don't think there's a gain in expressiveness over ordinary combinators. Syntax extension is a dubious advantage to my mind. I used to be in the opposite camp for years, but I now think it's too easily abused.

Indeed

Matthias Felleisen and Avi Bryant have made (IMHO) strong points about macros being surprisingly overrated (in the sense that you can do a lot of the same stuff without them). See past discussion and the referenced LL1 thread (in particular Matthias, Avi, Guy Steele).

But I'll throw another example of a nice macro out there. Edi Weitz's CL-PPCRE regular expression library will report errors in your regexp at compile-time. That's simple and natural because the regexp is compiled by a macro and not interpreted at runtime.
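
The same trick can be sketched in PLT Scheme (this is not CL-PPCRE; checked-regexp is a hypothetical macro that merely tries to compile the pattern while the macro expands and turns any failure into a syntax error):

  #lang scheme
  (require (for-syntax scheme/base))

  ;; Hypothetical sketch: validate a literal regexp pattern at expansion
  ;; time, so a malformed pattern is reported at compile time.
  (define-syntax (checked-regexp stx)
    (syntax-case stx ()
      [(_ pat)
       (begin
         (with-handlers ([exn:fail?
                          (lambda (e)
                            (raise-syntax-error 'checked-regexp
                                                (exn-message e) stx))])
           (regexp (syntax-e #'pat)))   ; compile once, only to validate
         #'(regexp pat))]))             ; emit ordinary run-time code

  ;; (checked-regexp "a(b")   ; rejected during expansion
  (checked-regexp "a+b*")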

Static checks

Luke is, I guess, referencing Felleisen's LL1 post on the uses of macros: Felleisen didn't list implementing compile-time checks among his principled uses of macros, which I don't suppose is an omission he would make today.

Aren't regexps simply a less

Aren't regexps simply a less expressive parser combinator library? Parser combinators also yield compile-time errors. Again, efficiency is a concern, but then you can augment your language with staging and gain back any lost efficiency.

Staging?

How do the parser combinators yield compile-time errors? I didn't realise that e.g. Haskell supported compile-time evaluation. I've seen references to various "staging" compilers (without knowing precisely what that term means...) but I'd thought they were only research prototypes (e.g. Template Haskell). I'm curious to know more.

How do the parser

How do the parser combinators yield compile-time errors?

A parser built from combinators is an executable specification of that parser. The only errors that can creep in are thus specification errors, so I doubt macros can do better than that.

I've seen references to various "staging" compilers (without knowing precisely what that term means..)

Staging is strongly-typed runtime compilation. So you can replace an interpreted parser expression built from closures with a compiler using staging, and get a specialized parser that executes at full speed with no interpretive overhead.

No...

...if your parser combinators are recursive descent, then you have to worry about left-recursion leading to nontermination, and uncontrolled backtracking leading to exponential search. A good parser algorithm can and will detect (and sometimes, fix) such problems for you.

I'm not sure which of my

I'm not sure which of my statements you're refuting. I think you are refuting my statement that only specification errors can creep into combinator-built parsers? This is subject to the limits of your language of course. If your language is strongly normalizing, then nontermination is not a problem. If it is not strongly normalizing, then the programmer must guard against that possibility of course.

Types, modules, etc

In a language with S-expressions, like Scheme or Lisp, I don't see any advantage of macros over lambdas, modulo efficiency.

I don't think the form of the syntax is that central to the advantage of macros (though it might make writing them easier or harder). Functions are generally constrained to work with terms. If your language has declarations that are not terms then macros provide an ability to abstract the generation of such declarations.

For example, Liskell is S-exp based, but has many constructs that can't be generated by functions, e.g. here's a monomorphic list specialized to Integers.

(defdata IList
  (IListNil)
  (IListCons Integer IList))

There's no way to write a function that generates that, but you can write a macro which would allow you to write

(mono-list IList Integer)
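
Roughly, a Scheme flavor of that macro might look like the following sketch (this is not Liskell: the names are invented, structs stand in for the data declaration, and nothing enforces the Integer element type since Scheme has no static types):

  ;; Loose PLT Scheme analogue: one macro application expands into the two
  ;; struct declarations of a monomorphic list.  The point is only that a
  ;; macro can emit declarations that no function could.
  (define-syntax mono-list
    (syntax-rules ()
      ((_ nil-name cons-name)
       (begin
         (define-struct nil-name ())
         (define-struct cons-name (head tail))))))

  ;; defines make-inil, inil?, make-icons, icons-head, icons-tail, icons?
  (mono-list inil icons)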

Of course, a monomorphic list isn't all that interesting or useful, many compilers and runtimes can automatically specialize polymorphic structures for you, etc. But it does illustrate something that macros can do that is beyond lambdas and the normal conception of staged compilation.

Yes

I'm not sure what you mean by 'staging', but the answer is almost certainly yes. The rationale for MetaML, for example, is mostly given in terms of performance. Even MacroML is far less expressive than most Scheme or Lisp macro systems.

Some examples of things that can be implemented with macros, without having to extend the underlying language (all of these are from PLT Scheme): a first-class module system, a Java-like class system, a CLOS-like class system, a type system for Scheme, a contract system, a pattern matcher, a lazy version of Scheme, and a functional-reactive version of Scheme. None of these could be implemented without macros.

None of these could be

None of these could be implemented without macros.

I don't think your statement is strictly true. Laziness vs. strictness is an implementation artifact of a program's interpreter. Anyone who is unclear about my meaning of 'staging' should read the finally tagless paper, as they use staging in a pertinent example.

I think all of the examples you have named can be provided from within a language by mixing orthogonal features, like higher rank polymorphism, extensible records, and staging. Using staging, one could create a metacircular interpreter for a language, and use that to load and execute a program lazily or strictly, for instance, or using dataflow ala FRP.

I don't think bolting a macro language on top of a core language as one catch-all solution is necessarily a good thing simply because of its apparent flexibility.

Laziness

First, given a strict language, adding a lazy dialect is not something that can be dismissed as an implementation artifact.

Second, creating an interpreter is not extending the language - it's inventing a new one. Unless you have some mechanism for both handling communication between the languages and recovering the lost performance, this approach is unreasonable.

Third, just saying that these examples could be implemented with other features is insufficient. To my knowledge, no one has implemented anything like these features without macros. If you have any examples, please share them.

Finally, well designed macro languages are not "bolted on". In modern Scheme macro systems, macros are written in the same language as any other part of the program.

First, given a strict

First, given a strict language, adding a lazy dialect is not something that can be dismissed as an implementation artifact.

Second, creating an interpreter is not extending the language - it's inventing a new one. [...]

Actually, I think macros and interpreters are almost identical approaches from a certain viewpoint.

As far as the value of staging, I think in an untyped language staging would be purely a performance tool, but in a typed language it's more than that. Though it seems to me that dependent typing could have all the expressive power of staging. Anyone know if that's true?

Macros and Interpreters

Actually, I think macros and interpreters are almost identical approaches from a certain viewpoint.

I think this is almost exactly wrong. A macro is a compiler from the extended language to the less-extended language. An interpreter operates at runtime, a macro operates at compile time.

As for staging, it all depends on what your multi-stage system can do. If it has the power of a macro system, then it can do more than performance hacks. I don't see why types come into it though.

Dependent types are not able to replicate the expressive power of staging - you can't get back `eval' if you don't have it already.

Macros and Interpreters

I think this is almost exactly wrong. A macro is a compiler from the extended language to the less-extended language. An interpreter operates at runtime, a macro operates at compile time.

I was obviously referring to a staged interpreter with the symbol interpreting part done at compile-time. I think staging a self-evaluator to occur at compile-time is very close to adding a macro system as long as your language constructs correspond in a simple way to values (thus the need for things to be first class in this approach).

Dependent types are not able to replicate the expressive power of staging - you can't get back `eval' if you don't have it already.

Ah, right, I forgot, adding staging should increase the consistency strength (or whatever the analogous terminology is) of a strongly normalizing language.

As for staging, it all depends on what your multi-stage system can do. I don't see why types come into it though.

For example, I don't think you gain anything by adding staging to the lambda calculus other than just changing when terms get reduced. On the other hand, adding staging (and supporting it in the type system) of some strongly normalizing typed lambda calculus would seem to increase its expressive power (not just what stage reduction occurs).

Staging

I was obviously referring to a staged interpreter with the symbol interpreting part done at compile-time. I think staging a self-evaluator to occur at compile-time is very close to adding a macro system as long as your language constructs correspond in a simple way to values (thus the need for things to be first class in this approach).

I didn't know that you were referring to staged interpreters. I also don't see why adding a self-evaluator is the same as adding a macro system. For example, MacroML is much less expressive than any Scheme macro system, and it's the most advanced example of such a system that I've heard of.

For example, I don't think you gain anything by adding staging to the lambda calculus other than just changing when terms get reduced. On the other hand, adding staging (and supporting it in the type system) of some strongly normalizing typed lambda calculus would seem to increase its expressive power (not just what stage reduction occurs).

I don't think either of these claims are true as stated. First, eval adds significant expressive power to a language, even an untyped one. Your second claim rests upon a bunch of terms that we haven't precisely defined (such as staging and expressive power) which would have to have precise definitions in order to make sense of your claim.

For example, I don't think

For example, I don't think you gain anything by adding staging to the lambda calculus other than just changing when terms get reduced. [...]

I don't think either of these claims are true as stated. First, eval adds significant expressive power to a language, even an untyped one. [...]

Adding eval to the lambda calculus can't be too significantly expressive since you can write it yourself in a small number of lines (though you first have to choose an encoding). But the more important point is that it seems to me staging (a la MetaML) can be projected away by just treating brackets with escapes as functions, and the projection will compute the same result (though possibly more slowly).

You're right that the next claim wasn't precise... I didn't even have a particular type system in mind.

But the more important point

But the more important point is that it seems to me staging (a la MetaML) can be projected away by just treating brackets with escapes as functions, and the projection will compute the same result (though possibly more slowly).

Indeed.

First, given a strict

First, given a strict language, adding a lazy dialect is not something that can be dismissed as an implementation artifact. [...] Second, creating an interpreter is not extending the language - it's inventing a new one [...]

Please see the tagless paper I linked to, as I am using "interpreter" in their sense of the word, which includes compilers, partial evaluators, CPS transformers (which gives reduction semantics independence), etc.

Once you have read the paper, and perhaps the comments where I started discussion on this possibility, consider a language which compiles programs to a portable bytecode. Executing a compiled program merely consists of "interpreting" the bytecode using any one of the interpreters presented in that paper, which includes a native compiler (essentially a JIT), CPS interpreters, partial evaluators, etc. We can further build "interpreters" with aspect weaving if we so choose, so long as the type signatures are satisfied.

In such a system dynamic loading is trivial, and further, the executing program can load and execute further program fragments using different interpreters.

Such a design requires no preprocessing, template expansions, or macros, just a well designed phase separation between compilation and runtime placed under the control of the program and programmer.

Finally, well designed macro languages are not "bolted on". In modern Scheme macro systems, macros are written in the same language as any other part of the program.

Same syntax, different semantics. I wouldn't consider two languages with the same syntax but different semantics to be the same language. If they were truly the same language, then there would be no expressiveness gained by adding macros to core Scheme. I am trying to argue that augmenting the core language with orthogonal constructs is preferable to adding a preprocessing/macro language.

First-class macros can possibly blur this line, since making any construct first-class means giving it a full semantics in the language. I will have to review this paper more fully to form an opinion on that.

Same syntax, different semantics

Same syntax, different semantics

That's false. The "macro language" in the modern Schemes that Sam mentioned is the exact same language as the runtime language. Same syntax, same semantics, same compiler/interpreter.

Not exactly what naasking was saying

Depends on how you look at it. Naasking didn't mean it was a different language in the "it's a separate thing compiled by a different compiler" sense, but more in the sense that people can talk about the module language in ML as distinct from the rest of it.

It's fair enough to say that Scheme has the lambda language and the macro language as two conceptually different things, even though in practice you use both of them as a cohesive whole.

But that's wrong

In ML, the module language is a different language from the regular language. For example, functors and signatures are module-language-only concepts, and functions and types are value-language-only concepts.

In Scheme, they are the same language. The way you write functions in macros is with `lambda'; the way you write functions in other places is the exact same `lambda'. The syntax is the same, the semantics are the same, everything is the same.

If the only macro language you have is `syntax-rules', then the systems may look different, but almost every Scheme system provides procedural macros that make macros just Scheme code. `syntax-rules' is just a macro itself in systems like PLT Scheme.

Depends on how you look at

Depends on how you look at it. Naasking didn't mean it was a different language in the "it's a separate thing compiled by a different compiler" sense, but more in the sense that people can talk about the module language in ML as distinct from the rest of it.

Yes, macro definition and macro application have their own rewriting semantics distinct from the lambda language, as you say. If macros weren't so distinct a concept from the core language, we probably wouldn't have named them.

It does not depend on how you look at it

See what Sam said: if you only look at something like syntax-rules, then you get a simple rewrite language as your macro system. But we're talking about what is usually referred to as the "low-level" macro language -- that's actually a "low-level" API, since the language is always Scheme (in all existing implementations, at least).

Here's an example, here's a piece of code that you can run in PLT Scheme:

  #lang scheme
  (provide random-elt)
  (define (random-elt l) (list-ref l (random (length l))))

And assuming that this code is in "random-elt.ss", I can write the following module:

  #lang scheme
  (require "random-elt.ss" (for-syntax "random-elt.ss"))
  (define-syntax (random-expr stx)
    (syntax-case stx ()
      [(_ expr ...) (random-elt (syntax->list #'(expr ...)))]))
  (define (foo) (random-elt '(1 2 3)))
  (define (bar) (random-expr 1 2 3))
  (list (list (foo) (foo) (foo))
        (list (bar) (bar) (bar)))

The last expression will evaluate to two lists, the first with three random numbers and the second with a random number repeated three times.

Note that "random-elt.ss" is required twice: for use in the run-time code and in the macro -- but it's exactly the same code that gets used. That's not speaking at some conceptual level, it is the exact same code down to the jitted code. It should be obvious now that whatever I can do in PLT, I can do at the syntactic level or in the runtime level in the same way. I can easily write macros that use continuations, the GUI system, run multiple threads, or play music that fits the mood of the compiled code. It is almost trivial to write a macro that starts a web server which will allow you to decide how to compile code.

There's clearly still some

There's clearly still some disconnect here, because I don't recall ever saying macros couldn't use core language values or call functions, etc. From my understanding, macros operate by a series of expansions into the context in which they're applied, performing alpha conversions on any variables as needed to avoid capture.

So when you say "exactly the same code gets used", to me that means that the exact same sequence of core language statements is generated from the macro transformation at each macro application site, not that a single code sequence is generated which is then called at the various macro application sites. Would you say this is accurate?

Still...

...wrong. The call to random-elt is a plain function call. It's (PLT) Scheme. The result of the call is the generated code (which is not really generated here, but rather chosen among the provided subexpressions).

[But all of this is basic stuff; certainly no shortage of places to read about it.]

Hmm

You continue to seem confused about how macros actually work. Have you actually used a Lisp or Scheme system with macros?

On to your comments:

Please see the tagless paper I linked to, as I am using "interpreter" in their sense of the word, which includes compilers, partial evaluators, CPS transformers (which gives reduction semantics independence), etc

First, I did look at the paper. Second, they do not use "interpreter" to mean all of those things. Third, they only use staging in the definition of a compiler.

Once you have read the paper, and perhaps the comments where I started discussion on this possibility, consider a language which compiles programs to a portable bytecode. Executing a compiled program merely consists of "interpreting" the bytecode using any one of the interpreters presented in that paper, which includes a native compiler (essentially a JIT), CPS interpreters, partial evaluators, etc. We can further build "interpreters" with aspect weaving if we so choose, so long as the type signatures are satisfied.

First, I fail to see the connection between macros, tag-free interpreters, and portable bytecode. Bytecode compilation is an implementation technique that is irrelevant to language expressiveness. Second, the paper does not include a native compiler. Third, the term 'aspect' does not appear in the paper, and suggesting building interpreters out of aspect-weaving has no relation to anything we've discussed here.

In such a system dynamic loading is trivial, and further, the executing program can load and execute further program fragments using different interpreters.

Such a design requires no preprocessing, template expansions, or macros, just a well designed phase separation between compilation and runtime placed under the control of the program and programmer.

It may be that dynamic loading is trivial in this system, but it has nothing to do with macros. Again, macros are not the same as adding interpreters or eval to your language.

Same syntax, different semantics. I wouldn't consider two languages with the same syntax but different semantics to be the same language. If they were truly the same language, then there would be no expressiveness gained by adding macros to core Scheme. I am trying to argue that augmenting the core language with orthogonal constructs is preferable to adding a preprocessing/macro language.

Here you are clearly confused about macros. The code that defines a macro (which is a source to source transformation) is simply Scheme code, with the same semantics as any other Scheme code. The additional expressive power comes from being able to specify the use of these transformations at macro-expansion time.

Adding new "orthogonal constructs" is useful, and macros are such a construct - they give you different abstractive power than functions or modules, for example.

Finally, 'first-classness' has nothing to do with it. Your statement "making any construct first-class means giving it a full semantics in the language" makes no sense. Is your claim that methods do not have full semantics in Java? What other semantics might they have?

S-exp's and sugar

In languages like OCaml I agree that syntactic sugar can be considered "an advantage"

Even with S-exp's, a little sugar can go a long way. I'd much rather write

(do
  (<- a x)
  (<- b y)
  (<- c z)
  (return (+ a (* b c))))

than


(>>= x 
  (lambda (a)
    (>>= y
      (lambda (b)
        (>>= z
           (lambda (c)
              (return (+ a (* b c)))))))))
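
And the sugar itself is only a handful of lines of macro. A minimal sketch (called mdo here only to avoid clashing with Scheme's built-in do; it assumes >>= and return are already bound to the monad operations in scope):

  (define-syntax mdo
    (syntax-rules (<-)
      ((_ m) m)                          ; last expression, e.g. (return ...)
      ((_ (<- x m) rest ...)             ; bind: x gets the result of m
       (>>= m (lambda (x) (mdo rest ...))))
      ((_ m rest ...)                    ; sequencing, result ignored
       (>>= m (lambda (ignored) (mdo rest ...))))))

  ;; (mdo (<- a x) (<- b y) (<- c z) (return (+ a (* b c))))
  ;; expands into exactly the nested >>=/lambda chain above.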

Which is just an indication

Which is just an indication of how awkward lambdas are to write. I strongly believe that functional languages should optimize for small functions, including lambdas. Consider your example written as:

x >>=:a
y >>=:b
z >>=:c
return (a + b * c)

Lambdas are delimited by an identifier prefixed with a ":" and followed by white space in the above language. So 'id' would be "let id = :a a".

I accept your argument that macros can easily abstract over program terms that functions can't, but to me that indicates a certain lack of expressiveness in the core language. For instance, constructors can be passed as functions in OCaml but not in SML IIRC.

A lambda or a macro must be used to provide this additional abstraction, but there is no fundamental reason why the core language couldn't do it for you, is there? It seems like an artificial restriction; though I'm sure there's a good reason for it, it's no doubt mired in implementation details of little relevance to the end-user.

Restrictions

but there is no fundamental reason why the core language couldn't do it for you, is there? It seems like an artificial restriction; though I'm sure there's a good reason for it, it's no doubt mired in implementation details of little relevance to the end-user.

Sure, but most languages have some kind of restrictions on what functions can and can't do.

Lambdas are delimited by an identifier prefixed with a ":" and followed by white space in the above language. So 'id' would be "let id = :a a".

Where do they end? A blank line?

Anyway, I'm with you on a lightweight syntax. In S-exp form my example could have been written using a Haskell style \ for lambda and a requirement that lambdas have only a single argument

(>>= x (\ a
(>>= y (\ b
(>>= z (\ c
(return (+ a (* b c)))))))))

That still doesn't refute my point that sugar can be good for S-exps.

Where do they end? A blank

Where do they end? A blank line?

Probably at the closing parenthesis or at any other lower precedence operator.

That still doesn't refute my

Sure, but most languages have some kind of restrictions on what functions can and can't do.

Sure: any computation should be limited to the parameters that were passed in. Are there any other limitations that make sense?

Perhaps I'm trying to say that all language constants, constructors, or operators either are functions themselves, or are automatically lifted to functions as needed. Taken to a logical extreme, this probably means everything becomes first-class.

That still doesn't refute my point that sugar can be good for S-exps.

I was disputing whether macros were a good idea at all. I fully accept that they are good for many things, but all of the ones I've read about seem to be poorly integrated with the language. They're not really part of the language's semantics, just sort of a layer bolted on top, so you now have two languages to learn: a template expansion-like language (macros), and the underlying language. This seems true of S-exp languages too.

I've always thought there had to be a cleaner way to embed this type of computation into the language itself. I think lambdas + staging get you at least most of the way there.

..so you now have two

..so you now have two languages to learn: a template expansion-like language (macros), and the underlying language.

Why would you have to learn the underlying language? This seems to me like arguing that to program in Java you have to learn JVM bytecode, and while that certainly is helpful, I can't say it's necessary.

No, I'm arguing that to

No, I'm arguing that to program in C, you have to learn the semantics of the C preprocessor (macros), and the semantics of C (the underlying language), instead of a single unified language. While C's preprocessor is a poor macro processor, the analogy holds I think. Can macros ever be sufficiently integrated with a language such that they are seamless? If so, then you have a single language and perhaps my objections don't apply. S-exp languages certainly get close.

Macros vs. Meta-programming

I was disputing whether macros were a good idea at all ...

I don't know how much you intend to be covered here by 'macro', but I don't think lambdas + staging replace the need for meta-programming in general. As one example, it's useful to have multiple semantics assigned to a single block of code: the primary expression value, a type descriptor (which is not accessible as a value in your scheme), maybe embedded unit tests, IO to occur during the build process, debug mode differences, network orchestration, compilation results, etc. With something akin to a meta-object protocol, you can have programmable control over the code's meaning.

Edit: And responding to your comment in a parallel post, I agree that you should set things up so that you have one language semantics. The description of constructs in your language should be described in the kernel of the language. For kernel constructs it will be meta-circular.

Re: what is a macro?

Re: what is a macro?

Hmm, I would have said macros are any strongly normalizing stage -1 computation (stage 0 refers to the compiled program in the literature). The macro language is either the same as the expanded language, or different. I'm arguing for it to be the same, in which case it's just a limited form of staging. However, first-class macros may violate that definition, so I'll have to get back to you.

IO to occur during the build process

Can't this just be considered stage -1 of the program? Getting a little off-topic, but given staging and a strongly normalizing language, there's no reason I can see why stage -1 couldn't have user-specified computation.

debug mode differences

This is definitely an interesting problem. The handling of effects in general is tough, since any minor effect can significantly change the behaviour of a program. This does seem to argue for something along the lines of conditional compilation.

a type descriptor (which is not accessible as a value in your scheme)

If I understand you correctly, that's not necessarily the case. Depends whether the language permits runtime type analysis.

Re: metaprogramming

Staging provides limited forms of reflection as well, as Oleg demonstrated with a type-safe printf in MetaOCaml. I'm playing with various small languages built using a tagless encoding of a metacircular interpreter, which should permit some metaprogramming.

Maybe somewhat equivalent

IO to occur during the build process

Can't this just be considered stage -1 of the program?

It can, but then you have to separate out the part that does the IO. You can't just go to an ordinary stage 0 expression and embed a construct that does IO at compile time.

The macro language is either the same as the expanded language, or different. I'm arguing for it to be the same, in which case it's just a limited form of staging.

I think the staged computation approach would be equivalent to what I'm talking about under an idiomatic translation. Basically you'd build stage -1 computations everywhere that return meta-objects. You'd even use stage -1 computations to build run of the mill stage 0 computations, so that your constructions would be able to participate in the meta-object protocol. I think this would require way too much visible plumbing, though, to be done in practice.

It can, but then you have to

It can, but then you have to separate out the part that does the IO. You can't just go to an ordinary stage 0 expression and embed a construct that does IO at compile time. [...] I think this would require way too much visible plumbing, though, to be done in practice.

I'm actually thinking the other way around. A program specifies how to compile itself, so the compiler starts executing the program at a stage -1 "main" point, and this "main" specifies how to compile the rest of the program. The compiler can provide a default "main" for systems that don't require this complexity.

I think the more first-class all language constructs become, modules in particular, the more something like this might be needed.

We both would have the

We both would have the compiler start at stage -1 (or stage -N), so that's not a difference. The main difference I see is what definition constructs do. In a traditional staged approach, they make available a single value definition, whereas I'm proposing a more elaborate meta-object protocol take place. In my approach, 'main' is not a function but an ADT supporting a number of meta operations in addition to 'getValue'.

Also, I don't see any differences related to first class modules.

In my approach, 'main' is

In my approach, 'main' is not a function but an ADT supporting a number of meta operations in addition to 'getValue'.

Don't think I quite get it, but my experience with metaprogramming is limited, aside from reflection and staging. Since we're speculating, I'm growing to appreciate the EDSL approach via tagless interpretation.

I think metaprogramming is generally harmful (to maintainability), and as long as your language supports polymorphism, polytypism/genericity, I think most problems can be solved without metaprogramming, particularly if one has staging to make any EDSL efficient.

Also, I don't see any differences related to first class modules.

I was referring to MixML directly, not first-class modules, though I suppose it applies there too. MixML raises a runtime exception when mixing two modules creates an invalid recursion, and the thought of linking modules raising exceptions almost seems bizarre given a static, strongly typed language. I was just raising the point that the more first-class constructs in the language, the more important reasoning about a program in terms of stages becomes.

The loading and linking phase for such modules is actually a stage which most programmers can now ignore, but which they may not be able to ignore in the future.

Correction

For instance, constructors can be passed as functions in OCaml but not in SML IIRC.

No, it is the other way:

$ ocaml
        Objective Caml version 3.10.2

# type 'a foo = Foo of 'a | Bar of 'a ;;
type 'a foo = Foo of 'a | Bar of 'a
# let make ctor v = ctor v ;;
val make : ('a -> 'b) -> 'a -> 'b = <fun>
# make Foo () ;;
The constructor Foo expects 1 argument(s),
but is here applied to 0 argument(s)

Standard ML of New Jersey v110.68 [built: Wed Aug 20 22:43:11 2008]
- datatype 'a foo = Foo of 'a | Bar of 'a ;;
datatype 'a foo = Bar of 'a | Foo of 'a
- fun make ctor v = ctor v ;;
val make = fn : ('a -> 'b) -> 'a -> 'b
- make Foo () ;;
val it = Foo () : unit foo

It shouldn't take more than a couple of seconds to check things like this before posting.

I don't have access to

I don't have access to either on this machine, sorry.

Second, they do not use

Second, they do not use "interpreter" to mean all of those things.

Yes they do. R, C and P, the interpreter, the compiler aka staged interpreter, and the partial evaluator respectively, are all referred to as interpreters in the paper. Further, in the comments of that thread, Oleg repeatedly refers to abstracting over interpreters. But as long as you understand my meaning, we need not quibble over this.

Third, they only use staging in the definition of a compiler.

Yes. My original point was that macros only seemed attractive for syntactic sugar reasons and performance. If one believes, as I do, that syntactic sugar is harmful to maintainability of software systems, then performance is the only other motivator. Staging can remove the interpretive overhead of any abstraction one would care to build, which led me to question whether macros have any other uses.

First, I fail to see the connection between macros, tag-free interpreters, and portable bytecode. Bytecode compilation is an implementation technique that is irrelevant to language expressiveness. Second, the paper does not include a native compiler. Third, the term 'aspect' does not appear in the paper, and suggesting building interpreters out of aspect-weaving has no relation to anything we've discussed here.

Hmm, I had hoped that by describing a concrete implementation addressing each of your points supporting the use of macros, you would see my point that staging can be used to build a system that allows all the power of macros with static typing to boot.

To take each point in turn: yes, bytecode is not necessary, merely the easiest way to explain the system I was describing; yes, the paper does not describe a native compiler, but staging links to the native compiler, so it should be clear that a native compiler is already available for the host language, and the staged interpreter is a native compiler for the embedded language; I mentioned aspects just to point out that the meta-circular interpreter structure lends itself to more uses than merely what was described in the paper.

Again, macros are not the same as adding interpreters or eval to your language.

I never said it was. I'm saying the advantages of macros are achievable via more disciplined mechanisms that do not require full macros.

The code that defines a macro (which is a source to source transformation) is simply Scheme code, with the same semantics as any other Scheme code. The additional expressive power comes from being able to specify the use of these transformations at macro-expansion time.

I don't think I am confused. Macros have their own reduction/rewriting semantics distinct from the language core reduction semantics. Further, Scheme macros have not been given a sound typing that I know of, which makes them somewhat unsafe for developers. Apart from any other issues they may have, this would suffice for me.

However, there have been many discussions about how macros are non-compositional, a problem one does not have with lambdas. Given my argument above, I hope I have more clearly explained why I think the complexity of macros is unnecessary given lambdas and staging.

Lambdas provide sufficient control-flow abstraction, and staging provides sufficient performance by eliminating any abstraction overhead.

[Edit: sorry, not sure how this got disconnected from the parent thread.]

Unnecessary?

Look. You stated originally, and continue to state, that macros aren't necessary given "lambdas and staging". I've pointed out that there are numerous examples of abstractions written with macros that have not and cannot be written with "lambdas and staging". You've reiterated your claims, and pointed to some papers that are not really related to the discussion.

I recommend that you familiarize yourself with some actual macro systems, learn what's possible, and return to this discussion when you can give some actual examples of how to write complex macros with "lambdas and staging".

At that point, perhaps we can have a productive discussion.
