Metaprogramming and Heron : Five Years Later

I just realized that I first posted about my programming language Heron here on Lambda-the-Ultimate.org over five years ago. Kind of funny to think that at the time I didn't even know where the name "Lambda the Ultimate" came from. I have learned a lot since, in no small part thanks to the insights of the brilliant and helpful people at this site!

I have spent the last year completely redesigning Heron, and I have come up with a language that is inspired by a slew of other languages. It is still a curly-brace language, with a run-of-the-mill Java-style object-oriented approach (minus virtual functions) plus some functional features. In fact it looks a lot like JavaScript 4.0 without prototypes. However, I think that potentially the most interesting feature is the new meta-programming system. Or is it a macro system? I don't actually know, which is why I am writing here.

The idea behind the Heron metaprogramming system is quite simple: a Heron program has two entry points, one for run-time execution and another for compile-time execution. The only difference between the two entry points is that the compile-time execution is passed a reference to the abstract syntax tree. At run-time, the modified version of the abstract syntax tree is what gets executed.
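To make the two-entry-point idea concrete, here is a rough sketch in Python rather than Heron, using Python's standard `ast` module. The names `build`, `meta`, and `main` are invented for this illustration and are not Heron's actual API; the sketch only assumes that the compile-time entry point is optional and mutates the tree in place.

```python
import ast

# A toy "program" with both entry points (names are hypothetical).
SOURCE = '''
def meta(tree):
    # Compile-time entry point: receives the AST of the whole program.
    for fn in tree.body:
        if isinstance(fn, ast.FunctionDef) and fn.name == "main":
            for node in ast.walk(fn):
                if isinstance(node, ast.Constant):
                    node.value = "rewritten at compile-time"

def main():
    # Run-time entry point.
    return "original"
'''

def build(source):
    """Parse the program, let its optional meta() mutate the AST, then compile it."""
    tree = ast.parse(source)
    # Run the original, unmodified program once to get at its compile-time entry point.
    stage0 = {"ast": ast}
    exec(compile(tree, "<program>", "exec"), stage0)
    meta = stage0.get("meta")
    if meta is not None:
        meta(tree)                       # mutates the tree in place
        ast.fix_missing_locations(tree)
    # The (possibly modified) tree is what becomes the run-time program.
    runtime = {"ast": ast}
    exec(compile(tree, "<program>", "exec"), runtime)
    return runtime

program = build(SOURCE)
print(program["main"]())  # prints "rewritten at compile-time"
```

A program that defines no `meta` goes through `build` unchanged, which matches the idea that the compile-time component is optional.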

You can see this article for a more detailed description if you want, or just look at this Quine demo which prints its own source code out at compile-time.

So I have a couple of questions for the members here:

  1. Is it more accurate to call this a metaprogramming system, a macro system, or a compile-time reflection system?
  2. What other programming language metaprogramming facilities most resemble the one I describe here?
  3. Am I confused or being unintentionally confusing or misleading in some ways on this topic? I find this whole subject area a bit murky with regards to correct usage of the terminology.

Staged mutable meta-programming?

The confusing thing to me is why you set things up as the compile-time program modifying a run-time program that exists at the outset. Why don't you view this as a compile-time program that produces a run-time program? I think that would fit more cleanly into existing models of staged meta-programming.

Is that a more common method

Is that a more common method of explaining this kind of meta-programming system?

My rationale was that the compile-time component is optional, so if it doesn't get run, then where would the run-time program come from?

The compile-time program is

My rationale was that the compile-time component is optional, so if it doesn't get run, then where would the run-time program come from?

When unused, the compile-time program is the identity function?

But it isn't a function

The only problem is that the approach is very explicitly non-functional. The compile-time pass does not construct a new AST from an existing one; it mutates the AST (while executing the original copy).

Well, you can model it

Well, you can model it functionally. I was just trying to point out that for each "cut point" where you can define a metaprogram, there's a default identity function, and if the developer provides a metaprogram it's like overriding the default behaviour.

Identity function is still slightly different

That still assumes that the compile-time meta-program modifies an existing run-time program. I was thinking that a trivial compile-time program would be one that returns a literal for the run-time program.

If meta-programs act by modifying programs, how well-formed do the original programs have to be? I'd think they couldn't possibly type-check, since the meta-program might be introducing important types. Similarly I wouldn't expect that you can answer questions about symbol binding.

So semantically, this seems messy to me. My guess is that the motivation is in making the syntax more palatable than having quoted program fragments everywhere.
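The two views in this sub-thread can be put side by side in a small sketch. It is written in Python with the standard `ast` module, and the function names are hypothetical: one default metaprogram is the identity over an existing run-time program, the other is a compile-time program that simply returns the run-time program as a literal.

```python
import ast

def identity_metaprogram(tree: ast.Module) -> ast.Module:
    """Default when no metaprogram is supplied: the existing program passes through."""
    return tree

def literal_metaprogram() -> ast.Module:
    """Staged view: the compile-time program returns the run-time program as a literal."""
    return ast.parse("def main():\n    return 42\n")

def run(tree: ast.Module):
    """Compile a program tree and call its run-time entry point."""
    env = {}
    exec(compile(tree, "<runtime>", "exec"), env)
    return env["main"]()

original = ast.parse("def main():\n    return 42\n")
result_identity = run(identity_metaprogram(original))
result_literal = run(literal_metaprogram())
print(result_identity == result_literal)
```

In the first view a run-time program must already exist before the compile-time stage runs; in the second, the compile-time stage is the only source of it.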

Messy semantics

If meta-programs act by modifying programs, how well-formed do the original programs have to be?

I think that this is an important question. I have my ideas, but I do want to kick them around a bit.

I'd think they couldn't possibly type-check, since the meta-program might be introducing important types.

Yes, I was thinking that static type-checking would be done after the meta-programming phase. At compile-time the code is interpreted and all type-checking is done dynamically.

Similarly I wouldn't expect that you can answer questions about symbol binding.

That is correct. Well, to be precise, you can answer questions, but you have a chance of being wrong. The meta-programming phase may add/rename/remove symbols and their bindings.

The only thing that one can rely upon about the input code is that it still has to follow a somewhat disciplined syntax. There is no room for new operators or constructs. I am definitely not trying to create a DSL tool-kit, or some kind of all-encompassing "superlanguage".

So semantically, this seems messy to me.

I would be inclined to agree, regarding the semantics at compile-time. There are some real pitfalls introduced that some theorists may loathe: compilers that fail to accept valid programs because the meta-programming system enters an infinite loop, or that have behavior dependent on user input.

So the meta-programming system is like any standard scripting language: messy and hard to analyze, but it gets the job done.

However, the language semantics after the meta-programming phase can be defined neatly. The language core is small and (potentially) well-defined. I say potentially simply because I haven't yet formally defined the semantics.

My guess is that the motivation is in making the syntax more palatable than having quoted program fragments everywhere.

Exactly! This is exactly what I am trying to solve.

abstract syntax tree

the compile-time execution is passed a reference to the abstract syntax tree.

Just to be clear, is that the abstract syntax tree for the whole program?

I think that you want to define more entry points for parts of the program, because it could become unmanageable for no good reason.

What you have is not simply a macro system, so it should probably be categorised in meta-programming.

The whole program

Yes the AST is for the whole program.

I am not sure why multiple entry points would be needed. At compile-time a programmer only needs a single entry point to perform multiple passes over the AST (or any portion that they want, by navigating to the relevant portion).

I am also not sure where things could become unmanageable. Not trying to be argumentative, just overtired and lacking imagination. :-)

In my mind most meta-programming code would occur in imported modules, so that may help, but it is hard for me to imagine all the ways things could go wrong.

by navigating to the

by navigating to the relevant portion

That is the part I fear most; maybe an XPath-style navigation would help? And at times you only want the AST, not the code (e.g. for macro substitution). How do you obtain that?

Interesting idea

The XPath navigation idea is an interesting one. I don't know that it is necessary though. The Heron syntax for navigating the AST/CodeModel is pretty easy.

My inspiration was the HTML Document object model, and how it is manipulated by JavaScript scripts when the document is loaded. I found that this model works well, and I have always preferred that approach to XPath. However, this could be subjective, and I have to be careful about imposing my preferences on language users. Perhaps I need to create an XPath-style library for meta-programming.

I'm not sure what you meant by "the AST, not the code". I think I made things overly complicated by talking about code and AST separately; they are effectively the same thing in Heron.

Does the DOM work well for

Does the DOM work well for tasks such as creating an index entry for all bold text in a document? Or in Heron, replace all calls to a specific function with a different syntax tree?

The path may be arbitrary, or it may be relative to the AST morphing instructions. Can you express "one level up from the current point"?

My "code" was not clear, sorry; I meant compiled code (the second entry point). To insert stuff in the AST, it would be more convenient to express it as Heron syntax rather than as abstract syntax tree-- but then you don't want the expression evaluated before it is inserted in the AST.

Or in Heron, replace all

Or in Heron, replace all calls to a specific function with a different syntax tree?

This isn't easy out of the box, but by coding some metaprogramming libraries, it should then become easy.
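For a sense of what such a library routine might look like, here is a sketch in Python using the standard `ast.NodeTransformer`. It is not Heron code, and the class name `ReplaceCalls` and the function name `old_api` are made up for the example. It replaces every call to one named function with a freshly parsed expression tree.

```python
import ast

class ReplaceCalls(ast.NodeTransformer):
    """Replace every call to `target` with the expression in `replacement_source`."""

    def __init__(self, target, replacement_source):
        self.target = target
        self.replacement_source = replacement_source

    def visit_Call(self, node):
        self.generic_visit(node)  # rewrite calls nested in the arguments first
        if isinstance(node.func, ast.Name) and node.func.id == self.target:
            # Parse a fresh subtree per call site so nodes are never shared.
            new = ast.parse(self.replacement_source, mode="eval").body
            return ast.copy_location(new, node)
        return node

tree = ast.parse("x = old_api() + 1\ny = other(old_api())")
tree = ast.fix_missing_locations(ReplaceCalls("old_api", "10 * 2").visit(tree))

# Run the rewritten program; `other` is a stand-in that returns its argument.
env = {"other": lambda value: value}
exec(compile(tree, "<rewritten>", "exec"), env)
print(env["x"], env["y"])
```

Note that `old_api` is never defined: the rewritten program runs only because every call to it was replaced before compilation.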

Can you express "one level up from the current point"?

Right now, no. So either my raw API is insufficient, or the burden is back on libraries to make it easier to navigate the AST.

It's all just tree manipulation. Relatively straightforward stuff.
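One way a library could supply "one level up" on top of a raw API without parent links is to annotate the tree once before navigating. The sketch below does this in Python with the standard `ast` module; the `parent` attribute and the `link_parents` helper are assumptions of this example, not part of Python's `ast` or of Heron.

```python
import ast

def link_parents(tree):
    """Attach a hypothetical `parent` attribute to every node in the tree."""
    tree.parent = None
    for node in ast.walk(tree):
        for child in ast.iter_child_nodes(node):
            child.parent = node
    return tree

tree = link_parents(ast.parse("def f():\n    return g(1)\n"))

# Find the call to g(...) and walk upwards from it.
call = next(n for n in ast.walk(tree) if isinstance(n, ast.Call))
print(type(call.parent).__name__)         # Return
print(type(call.parent.parent).__name__)  # FunctionDef
```

The cost is that the links have to be refreshed after any mutation that moves or replaces nodes.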

To insert stuff in the AST, it would be more convenient to express it as Heron syntax rather than as abstract syntax tree-- but then you don't want the expression evaluated before it is inserted in the AST.

What is missing now is an API for parsing a String into the appropriate AST node at compile-time.

These are all great points you are bringing up. Thanks!
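The string-to-AST facility mentioned above might look roughly like the following in Python, where `ast.parse` plays that role. The helper names `quote_expr` and `quote_stmts` are invented for this sketch. The key property is that the text is parsed but not evaluated, so it can be spliced into the tree first.

```python
import ast

def quote_expr(source: str) -> ast.expr:
    """Parse source text into a single expression node without evaluating it."""
    return ast.parse(source, mode="eval").body

def quote_stmts(source: str) -> list:
    """Parse source text into a list of statement nodes without executing them."""
    return ast.parse(source).body

# `a` and `b` are undefined here, which is fine: nothing is evaluated yet.
node = quote_expr("a + b * 2")

# Splice quoted statements into the body of an existing function.
tree = ast.parse("def main():\n    return total\n")
tree.body[0].body[:0] = quote_stmts("total = 1 + 2")
ast.fix_missing_locations(tree)

env = {}
exec(compile(tree, "<spliced>", "exec"), env)
print(env["main"]())  # prints 3
```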