## Everybody Needs a Syntax Extension Sometimes

Sebastian Erdweg, Felix Rieger, Tillmann Rendel, and Klaus Ostermann, to appear in Proc. of Haskell Symposium, 2012.

Programmers need convenient syntax to write elegant and concise programs. Consequently, the Haskell standard provides syntactic sugar for some scenarios (e.g., do notation for monadic code), authors of Haskell compilers provide syntactic sugar for more scenarios (e.g., arrow notation in GHC), and some Haskell programmers implement preprocessors for their individual needs (e.g., idiom brackets in SHE). But manually written preprocessors cannot scale: They are expensive, error-prone, and not composable. Most researchers and programmers therefore refrain from using the syntactic notations they need in actual Haskell programs, but only use them in documentation or papers. We present a syntactically extensible version of Haskell, SugarHaskell, that empowers ordinary programmers to implement and use custom syntactic sugar.
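The do-notation mentioned above is exactly this kind of sugar: a purely syntactic rewrite into the monadic bind operator. As a toy illustration only (nothing like SugarHaskell's actual grammar-based machinery, and written in Python purely for brevity), here is a sketch of what a desugarer does once the do-statements have been split into lines:

```python
# Toy sketch of desugaring Haskell do-notation into explicit (>>=) / (>>).
# Hypothetical and illustrative only; real preprocessors parse properly.

def desugar_do(stmts):
    """Desugar ["x <- e", "f x"] into nested bind/lambda form."""
    head, *rest = stmts
    if not rest:                      # the last statement stands on its own
        return head
    if "<-" in head:                  # "x <- e"  ==>  e >>= \x -> ...
        var, expr = (s.strip() for s in head.split("<-", 1))
        return f"{expr} >>= \\{var} -> {desugar_do(rest)}"
    return f"{head} >> {desugar_do(rest)}"   # plain "e"  ==>  e >> ...

print(desugar_do(["line <- getLine", "putStrLn line"]))
# getLine >>= \line -> putStrLn line
```

Even this tiny rewriter hints at the composability problem the paper raises: two such textual preprocessors know nothing of each other's notation, so stacking them is fragile.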

Building on our previous work on syntactic extensibility for Java, SugarHaskell integrates syntactic extensions as sugar libraries into Haskell's module system. Syntax extensions in SugarHaskell can declare arbitrary context-free and layout-sensitive syntax. SugarHaskell modules are compiled into Haskell modules and further processed by a Haskell compiler. We provide an Eclipse-based IDE for SugarHaskell that is extensible, too, and automatically provides syntax coloring for all syntax extensions imported into a module.

The paper describes an approach to extensible Haskell syntax. The whole concept looks like it could be integrated into GHC via the language-extension mechanism, which would be easier to use than an external preprocessor (though SugarHaskell itself is written in Java). Compared to Template Haskell, the authors emphasize that their extensions are modular and composable, which cannot be said of TH. Sources are on GitHub.

### Extensible language problem

The usual problem with extensions to language syntax is applying them to the mind of the maintenance programmer who has to fix the code. Allowing extensions in the form of new delimiters is scary.

There was a fad for this sort of thing a few decades ago, but it never caught on. The M4 macro processor in UNIX, still used for Sendmail config files, is a legacy of that era.

### Reminds me of Zawinski

Reminds me of that Zawinski quote:

'Some people, when confronted with a problem, think "I know, I'll use a [syntax extension]." Now they have two problems.'

### Do you believe that?

Surely mathematicians have been developing their syntactical notations for the last 50,000 or so years. If you and Zawinski are correct then that's quite a headache they've accumulated.

### That's not the point

I'm sorry, but I think you may have missed his point.

### -

Can you explain the point in a different way then? [This is not meant to be confrontational. It's a genuine request.]

### Economics of Language Design

Developing a better syntax can be a worthwhile effort. But developing a better syntax does not solve the problem originally being addressed. Even after you have the new syntax, you must still solve the original problem.

The hope (without guarantee!) is that the original problem will be easier to solve in the new syntax. But it is unlikely that the total effort - of developing a new syntax and then solving the original problem - is less than the effort of solving the specific problem in the established syntax.

In most cases, the syntax can only pay for itself in a volume of similar problems. And, to achieve this volume, the syntax must generally be taught to a number of practitioners - which has its own costs.

Ultimately, from an economics standpoint, the development of any new syntax must be considered a risky and significant action, albeit potentially a rewarding one. In any case, it's certainly a second problem.

### Money is a red herring.

"There are more things in heaven and earth, Horatio, Than are dreamt of in your philosophy."
-- Shakespeare

There are areas where the extension/creation of syntax is very common. I started leafing through "Meta-Logics and Logic Programming", and pretty much every paper introduced a new syntax. I've probably got more Prolog and Lisp books than most programmers so I've seen my fair share of meta interpreters that introduce new syntax. This is pretty much idiomatic usage. It solves problems.

"Get into a rut early: Do the same process the same way. Accumulate idioms. Standardize. The only difference(!) between Shakespeare and you was the size of his idiom list - not the size of his vocabulary."
-- Alan Perlis

### If you think economics is

If you think economics is about money, you need to educate yourself. There are more things in economics, Barry Watson, than are dreamt of in your philosophy.

Keep in mind, while leafing through successfully published papers, that you are studying a biased sample.

I would expect it to be even more biased with a subject line of "meta logics and logic programming" - that book is specifically discussing different logics, which is different than solving real world problems with logic. (It seems analogous to pointing at "Rise and Fall of Civilizations" to argue that revolution is common and is an effective way to solve relationship problems.)

### -

"There are more things in economics, Barry Watson, than are dreamt of in your philosophy."

:) You and I are obviously products of two very different cultures.

"Keep in mind, while leafing through successfully published papers, that you are studying a biased sample."

Don't care. I used the book as an example of an area where the creation/extension of syntax was commonplace. Bias or no bias makes no difference. There is at least one group of individuals who see no problem with the creation/extension of syntax.

"I would expect it to be even more biased with a subject line of "meta logics and logic programming" - that book is specifically discussing different logics, which is different than solving real world problems with logic. "

Now, where did you get your list of "real world problems" and who decided what went on the list and what didn't?

Are you seriously trying to say that all of the authors in that book which created new syntax are:
1) engaging in risky behaviour, and
2) not solving real world problems with logic.

"(It seems analogous to pointing at "Rise and Fall of Civilizations" to argue that revolution is common and is an effective way to solve relationship problems.)"

No. We can reason about reasoning. We can use logic as a tool when we discuss logic. We can also use logic as a tool when we discuss metalogics. In such a case we might even end up solving problems in metalogic (which I would describe then as real world problems) with logic. Your analogy didn't quite hit the mark.

### Seriously?

"You are studying a biased sample."

"Don't care."

You aren't much into the whole science or logic shtick, are you?

You say that creation of syntax is "commonplace" (and a non-problematic strategy to solve problems) based on a book that surveys meta-logics. By the same logic - assuming you had a consistent logic - you could argue that winning lotteries is "commonplace" (and a non-problematic strategy to solve problems) based on a book about lottery winners.

"Now, where did you get your list of "real world problems" and who decided what went on the list and what didn't?"

A real world problem is one motivated by a real world use-case, i.e. where current solutions are somehow inadequate or problematic. (I don't have a list. I don't need one. Similarly, I don't need a list of real world cats to recognize them.)

"Are you seriously trying to say that all of the authors in that book which created new syntax are:
1) engaging in risky behaviour, and
2) not solving real world problems with logic."

Yes. Yet, I believe you misunderstand what I have said.

The behavior of creating any syntax is economically risky. That is, there are resource costs to develop, advocate, teach, and learn a syntax - time, food, paper, ink, sometimes even money. This effort is not guaranteed a positive return on investment - i.e. sufficient adoption and future savings on time, energy, paper, ink, money, etc. relative to established solutions to make up for the costs. This effort is, therefore, economically risky. Sure, you can point at success cases - where the effort DID pay off - but that just creates a biased sample. It does not change the fact that the effort was risky when it was performed.

The book in question is such a biased sample: it only describes the efforts that some small group of authors believed to be useful and successful. (BTW, this is NOT the same as endorsing the creation and extension of syntax as a way to solve problems.)

Regarding the second point, the problem the authors are solving is teaching the reader: a survey of declarative meta-programming. This is a real world problem, but not a problem they are solving with logic.

"We can reason about reasoning. We can use logic as a tool when we discuss logic."

Indeed we can. Have you tried it?

### -

"You say that creation of syntax is "commonplace""

A large chunk of your reply is based upon this incorrect statement. No! I said "There are areas where the extension/creation of syntax is very common." I later gave an example as proof. All your talk about bias is nonsensical. I never claimed that the book was representative of the entire field of programming.

"Regarding the second point, the problem the authors are solving is teaching the reader: a survey of declarative meta-programming. This is a real world problem, but not a problem they are solving with logic."

I don't think you've read the book. The authors solve real world problems with logic. To demonstrate this I open it up to find a random example.... page 69, and we have "3.5 S-semantics for Meta-interpreters". The authors then build up to a logical proof on page 71 for Theorem 30. Embarrassingly enough the proof is omitted :) but they do tell us it is an inductive proof. However, the lack of trivial inductive proofs is not unusual at this level w.r.t. logic.

"We can reason about reasoning. We can use logic as a tool when we discuss logic."

"Indeed we can. Have you tried it?"

That actually made me laugh! Don't get angry, it's just a discussion.

"All your talk about bias is nonsensical. I never claimed that the book was representative of the entire field of programming."

The bias applies even to the specific field of programming you are discussing. Even for logic and meta-logic programming, creating and extending syntax is not common on a per problem basis. A book that surveys logic and meta-logic is offering a severely skewed view of how often new syntax is introduced in those disciplines.

You further said: "There is at least one group of individuals who see no problem with the creation/extension of syntax." I think the authors who created said extensions would say otherwise: that they do recognize the problem - the resources and risks involved with developing, documenting, and teaching a new syntax. Of course, they also recognize the potential rewards.

"I don't think you've read the book. The authors solve real world problems with logic."

Indeed, I haven't read this specific book. I've books of similar subject and nature. Didactic sample problems are rarely real-world problems, btw.

### -

"Even for logic and meta-logic programming, creating and extending syntax is not common on a per problem basis."

It is a standard technique used to differentiate between metatheory and object theory. Either quotations are introduced (syntax extension) or a new language is created for one of the theories (syntax creation).

"I've books of similar subject and nature."

How did the authors differentiate between metatheory and object theory in these books? I'm seriously interested in the answer.

"Didactic sample problems are rarely real-world problems, btw."

Are they not real world problems for those who are paid to generate their solutions?

This little discussion needs to come to a head so I'm going to summarise my thoughts.

Basically, I think your economic theory is wrong. I'm saying that the creation and extension of syntax solves problems, and, doing such things is not - as you put it - "a risky and significant action".

To demonstrate that syntax solves problems I merely have to point out the plurality of different syntaxes developed since the 1950s. These systems were not designed just for the fun of it; no, thinking men and women chose not to go down the same old path for a reason - they needed to represent new ideas - they needed new syntax. The creation of syntax was not "certainly a second problem", as you put it, for people who had no other means of expression. The syntax of FORTRAN had to be invented.

Now, your claim of this being a risky action is baffling. If the creation of syntax was risky, students of computer science would be warned of the foolishness of Panini, van Wijngaarden, Backus and Naur. On the contrary, in that list we find a recipient of the IEEE Computer Pioneer Award, two recipients of the Turing Award, and a name not forgotten in 2,500 years. So let me ease your mind - the creation or modification of syntax contains no inherent risk. I think we can safely say that no programmer suffered misfortune because they put pen to paper and wrote a line of original BNF. Your risks have been misplaced; they belong elsewhere.

Let me now tackle your idea that syntax creation or extension is a "significant action". I say that collective experience proves this wrong. The creation of syntax is not difficult for the trained practitioner. It has been the object of rigorous study since the 1960s and has become one of the most well understood areas of computer science. Who amongst us has met a student of the subject who was not taught how to write a parser? Your claim is greatly exaggerated.

At this point I think I'm done. There's not much more I can say to be honest.
You have offered no proof. You have provided no numbers. You have only given us wishy-washy claims in the language of one of the social sciences.

### Many logic programming and

Many logic programming and term rewrite systems allow introduction of arbitrary symbols (including operators) without any syntax manipulation. The distinction between metatheory and object theory is easy enough to achieve via a modal representation - e.g. staged programming with an Applicative or Free Monad.

"your claim of this being a risky action is baffling. If the creation of syntax was risky, students of computer science would be warned of the foolishness"

That you associate foolishness with risk is baffling. Risky behaviors can still be wise ones. Often, reward is commensurate with risk. Have you never heard the phrase: no risk, no reward? (Here we aren't risking life or limb, of course. We're risking time, resources, money.)

Economic risk is pervasive in any research and development field. Risk is something to understand, account, and mitigate, not to avoid entirely.

"Your claim is greatly exaggerated." - It seems you're the one exaggerating it.

### Obvious Risk

One obvious risk, especially in logic programming, is the possibility that a construct introduces inconsistencies. The result will be that the corresponding program might generate rubbish. Logicians state this precisely: we might have formulas A such that:

G |- A


and

G |- ~A


But there are also other notions of inconsistency, which might be needed for example for non-classical logics. A very prominent example of a construct in the history of mathematical logic that was able to generate rubbish was naive set comprehension. It was pointed out to Frege by Russell that one of his systems had this problem.

I was careful when showing the lambda introduction inference rule somewhere else here, to point out that it is possible to show that the logic remains consistent. This can be shown via cut elimination and by imposing certain type restrictions on the arguments.

Unfortunately showing consistency is always relative to a meta system we are using in the consistency proof, which we must believe to be consistent. So we either live in constant uncertainty, or we can build trust in the chosen meta system. There are a couple of interesting meta systems around.

P.S.: Interestingly hierarchical symbol definitions don't lead to serious problems. There is an easy proof that if G is consistent, and D is a definition of a symbol S which does not itself contain the symbol S, then G, D is also consistent. If I remember well I saw a proof of this in some book by Shoenfinkel.

P.P.S.: @dmbarbour I would be interested in references to "staged programming with an Applicative or Free Monad", so as to get a grip on what this term means.

### It is my strong preference

It is my strong preference to syntactically (aka structurally) prevent or control most semantic garbage. While I accept static analysis, I believe it is far better to prevent a problem from being expressed than to detect the problem afterwards. Your point is well taken - a new syntax introduces risk of accepting nonsense that another formulation may have prevented.

I refer to Applicative and Free Monad as the terms are used in Haskell. Free Monads are closely related to applicative structure for staged programming. Applicatives are easily staged because they control the flow of information and prevent ad-hoc introduction of loops. If you study the StaticArrow transform, you can see an example of applicatives being leveraged directly for staged programming.
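The staging idea here (first reify the program as inert data, then give it meaning with a separate interpreter) can be sketched outside Haskell as well. The following is a hypothetical Python analogue of a free-monad-style program; the real technique in Haskell lives in Control.Monad.Free, and all names below are illustrative:

```python
# Hypothetical sketch (Python, illustration only) of the staging idea behind
# a free-monad-like structure: stage 1 builds the program as inert data;
# stage 2 is a separate interpreter that assigns it meaning.

class Instr:
    """One reified instruction plus a continuation to the rest of the program."""
    def __init__(self, op, arg, k):
        self.op, self.arg, self.k = op, arg, k

class Done:
    """End of the program, carrying the final value."""
    def __init__(self, value):
        self.value = value

# Stage 1: no effect runs here; we only construct a data structure.
prog = Instr("put", "hello",
             lambda _: Instr("get", None,
                             lambda s: Done(s.upper())))

# Stage 2: one possible interpreter; another could log, optimize, or
# analyze the same program value instead of running it.
def run(p, store=None):
    while isinstance(p, Instr):
        if p.op == "put":
            store = p.arg
            p = p.k(None)
        elif p.op == "get":
            p = p.k(store)
    return p.value

print(run(prog))  # HELLO
```

Because the program is a value, the "ad-hoc introduction of loops" mentioned above is visible structurally: any loop must appear in the data or the interpreter, not sneak in through hidden control flow.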

### Where do these Alan Perlis

Where do these Alan Perlis quotes come from?

### Mathematicians and notations

Is it really true? I mean mathematicians often pick symbols, but programmers often pick variable names and it doesn't count as inventing "syntax". The syntax of modern mathematics is mostly centered around a number of conventions of how to use subscripts and superscripts and some amount of 2D-syntax. Most published mathematics is now written in LaTeX, which means there is an implementable core set of syntax rules that they follow. Finally there are a few domain-specific syntaxes (matrix syntax, Hasse diagrams, string diagrams, knots...) floating around, but again I don't have the impression that one new such syntax is invented daily.

You may precisely argue that the invention of revolutionary notations has slowed down since the monopoly of LaTeX, because it makes them harder. In any case, even taking a long-term historic perspective, I'm not sure that mathematicians have invented radically new syntaxes more often than programmers have come up with new languages.

### Gottlob Frege said it best in 1896

"the comfort of the typesetter is certainly not the summum bonum"

I got this from the introductory article to his Begriffsschrift in "From Frege to Gödel, A Sourcebook in Mathematical Logic".

I would easily believe that the world has seen more number systems than programming languages. Here's six that haven't been forgotten:
http://www.math.twsu.edu/history/topics/num-sys.html

### I would say that syntax

I would say that syntax extension facilities may be really useful, but it is better to limit their use cases to core language libraries. For example, do-notation, monad comprehensions, and any other sugar (arrows, codo) can be implemented using these facilities, which seems a much better choice than hardcoding them into the language.

However, using syntax extension to solve everyday tasks may not be a good idea.

### I agree that syntax is

I agree that syntax is generally a matter for a core language (or maybe for a very widely used DSL). But, if you think (as I do) that syntactic differences between languages are important, then having syntactic flexibility might help people use Haskell to design other languages.

### Yes, but I would say that

Yes, but I would say that languages are not primarily intended to be tools to create other languages. Maybe this is a task that should be solved by some specialized framework? Maybe something like Roslyn? Similarly, there may be a Haskell framework that allows you to build your own language by leveraging the existing Haskell compiler and run-time. But a Haskell framework is not Haskell.

The greatest benefit of syntax - or, to generalize, language - extension facilities is that they can turn a language developer into a library developer (which is, I believe, very good). But I don't want to see people using compiler services and effectively building their own languages to solve tasks which are already idiomatic for Haskell, without any syntax/language extensions. But how can you distinguish here - what is worthy of a new language/eDSL and what is not? A lot of tasks can be solved 1) traditionally or 2) by creating a DSL. Building DSLs all the time is not something that I want to see in a language like Haskell. But how to understand when this is appropriate?

### good question

Good question. I don't know.

### Roslyn

I think you can start here - http://en.wikipedia.org/wiki/Microsoft_Roslyn
Other links are at the end of the article.

thanks

### Logic programming

"Yes, but I would say that languages are not primarily intended to be tools to create other languages."

I have to respectfully disagree with you here. Let me offer you an argument. A first-order logical language is determined by its set of non-logical symbols (constants, functions, and relations). In logic programming, a program is an example of the Horn clause fragment of first-order logic which is given a computational interpretation --- a program has its set of constants, functions, and relations. If we accept these two premises, is the conclusion not that a whole host of programming languages are intended to create other languages?

### If we accept these two

"If we accept these two premises, is the conclusion not that a whole host of programming languages are intended to create other languages?"

Intended to express other language semantics, not necessarily express other syntax + semantics.

### I think it depends on our

I think it depends on our understanding of "language" in the first place. On the one hand, yes, we can think that we really do create other languages all the time, but these languages are usually expressed in terms of their "host" language. However, there might be a completely different approach - when you use some extension facilities to add completely new, foreign terms to your language or even to change the semantics of existing terms. These two cases should be clearly differentiated here.

BTW I am afraid that under "languages are not primarily intended to be tools to create other languages" I meant "Haskell is not primarily intended..." in the first place :) Of course, there exist DSL-oriented programming and powerful languages based on macro-meta-programming techniques. But is it really a good thing if, say, a functional language exposes some "services" that will enable its users to turn it into something completely non-functional and to effectively override its semantics?

### Sorry

"I think it depends on our understanding of the "language" in the first place."
...
"I meant "Haskell is not primarily intended..." in the first place :)"

I have to admit that I had guessed this is what you meant, and I extended the definition of "language" used in these parts (LtU) to make a point that in all honesty didn't really need to be made. My only defence is that I was bored at the time :) I promise to be serious from now on.

"....but these languages are usually expressed in terms of their "host" language. However, there might be a completely different approach - when you use some extension facilities to add completely new, foreign terms to your language or even to change the semantics of existing terms."

Are you describing the difference between new libraries and new control structures here?

### This is a good point; I

This is a good point; I agree.

Didn't the authors of SICP state somewhere that they thought that programming was largely an exercise in language design? Although I think they may have been referring more to embedded languages.

Sorry that I don't have a reference for that ... I hope I'm not making it up.

### -

Are you thinking of the foreword from the 2nd edition of EOPL?

"If you do this, you will change your view of your programming, and your view of yourself as a programmer. You'll come to see yourself as a designer of languages rather than only a user of languages...."

This way of doing things has been called stratified design by some:

"A programming methodology often used in Artificial Intelligence (AI), as well as other fields, is _stratified_ or _layered_ design. When faced with some problem, rather than implementing a solution directly in some base language, a new, more appropriate, higher level language is designed in which to express a solution, and this is implemented in the base language."
-- Bowles and Wilk, "Tracing Requirements for Multi-layered Meta-programming"

### Prolog Yes, Logic Programming No

In reference to this post.

There is a difference between the field of logic programming and Prolog. Logic programming embraces the whole of mathematical logic and its intersection with computation. Whereas Prolog is based on Horn clauses. So you cannot say in "logic programming" a program is a set of Horn clauses, but you can say this for pure Prolog.

But for practical Prolog this is also not true. There is a call/1 which you can provide with a term (constants/functions) and which gets interpreted as a goal (relations/constants/functions). This is one of the early forms of meta programming found in Prolog. Today we have an ISO corrigendum which requires a call/n predicate. This gives pretty much what application does in functional programming languages, but in a relational way.

On top of call/n there are several proposals for abstraction. There is one found in Logtalk, and one by Ulrich Neumerkel. My own take is the ^/n operator with roughly the following inference rule:

G |- A[X/T]
----------------
G |- call(X^A,T)


This is a true extension of Horn clauses. The extension is not dangerous. In a Minimal logic setting cut elimination can be shown for the above inference rule under certain type assumptions for the arguments.

So Prolog is evolving towards a variant of the vision of Logic Programming. In this variant the interesting logic is not FOL but some HOL. BTW: In his booklet Dag Prawitz calls a very similar inference rule the lambda introduction rule. See page 66 of Natural Deduction, A Proof-Theoretical Study, 2006.

### -

"Whereas Prolog is based on Horn clauses. So you cannot say in "logic programming" a program is a set of Horn clauses, but you can say this for pure Prolog."

Ok, I got my definition from Doets "From Logic to Logic Programming". I might be misinterpreting him. Can you give me references for your definition - I need to do some reading.

[EDIT: I've just found the following in the introduction to "A Grammatical View of Logic Programming": "The class of logic programs considered is restricted to definite programs". - I was incorrect. You're right, my example is restricted to a subclass of logic programs.]

"My own take is the ^/n operator with roughly the following inference rule:"

You've got me worried now! They're defining ^ ? But what about the existentially quantified uninstantiated variables in goals given to bagof and setof?

I've got this in my Prolog implementation ('$interpret'/2 does the heavy lifting of call/1 in my implementation; the second argument (CP) is a reification of the current choice point when call/1 was called):

'$interpret'('^'(_, G), CP) :- !, '$interpret'(G, CP).

I just skip the first argument of ^/2 and interpret the second. I thought it was pretty normal.

### ^/n works since it is not ^/2

^/n works since it is not ^/2.

When you call call(X^A,T) in your program, this first gets transformed to ^(X,A,T), since call/n works as if it were defined as follows:

call(p(X1,..,Xn),Y1,..,Ym) :- p(X1,..,Xn,Y1,..,Ym).


And call(X^A,T) is the same as call(^(X,A),T). The predicate ^/3 then does the work for you. For more info see for example:

Ulrich Neumerkel's library

My take

It works pretty neat.
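The call/n behaviour described above (extra arguments are appended to the goal term before it is invoked) can be mimicked in any language with first-class functions. A hypothetical Python sketch, with tuples standing in for partially applied Prolog terms:

```python
# Hypothetical sketch of Prolog's call/n semantics: calling p(X1,..,Xn)
# with extra arguments Y1,..,Ym behaves like calling p(X1,..,Xn,Y1,..,Ym).

def call_n(goal, *extra):
    pred, args = goal                # a goal term: predicate + collected args
    return pred(*(args + extra))     # append the new arguments, then invoke

def plus(a, b):                      # stands in for a Prolog predicate plus/2
    return a + b

partial = (plus, (1,))               # the term plus(1): not yet a complete call
print(call_n(partial, 2))            # behaves like plus(1, 2), i.e. 3
```

This is the "partial application, but in a relational way" mentioned earlier in the thread: the goal term carries some arguments already, and call/n supplies the rest.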

### Confused.

Surely that won't work unless they've changed the operator table.

Do you mean call(X^(A,T)) ?

I'm confused...... help.....

### No, I am using call/n and not call/1

No, I am using call/n and not call/1, in particular with n=2.

call/n is a new meta-call predicate; it goes back to Richard O'Keefe and Mycroft. It is meanwhile supported by many Prolog systems. I gave the definition in my previous post. You find a spec of call/n here:

Technical Corrigendum 2, call/2-8

The corrigendum has been approved this year (2012). From Ulrich Neumerkel:

"The actual first written source I am aware of is the 1984 Draft Proposed Standard for Prolog Evaluable Predicates by Richard O'Keefe which was first publicly announced 1984-07-25. You find a precursor restricted to some not fully specified kind of lambdas but not permitting partial application in Mycroft, O'Keefe A polymorphic type system for Prolog, AI Journal, August 1984."

### I think I've understood

Several different predicates with the same principal functor, differing in their arity.

### Bug fixed

In an older take (^)/2 was overloaded to denote:

• Lambda abstraction.
• Higher order existential quantifier (aka local variables).

The examples we have run so far were all looking good, including "partial application", which amounts to variants of closures resulting from eta-conversion.

Bottom line was that there are cases where the above
distinction cannot be resolved at runtime without any
type information present.

So there is a new library(lambda) for Jekejeke Prolog on its way. It will have two distinct operators: the (\)/2 for lambda abstraction and the (^)/2 for local variables.

These will also have different meta predicate
declarations:

:- meta_predicate \(?,0,?).
:- meta_predicate \(?,1,?,?).
:- meta_predicate \(?,2,?,?,?).
Etc..

:- meta_predicate ^(?,1,?).
:- meta_predicate ^(?,2,?,?).
:- meta_predicate ^(?,3,?,?,?).
Etc..


The fix is scheduled for the upcoming release 0.9.9.

### Felix system

In Felix I took a different approach. Instead of thinking about some syntax and syntax extensions .. there is a bootstrap syntax which is useful only to define syntax, plus a fixed set of AST terms. What you would then consider the actual core Felix language is then defined entirely in user space as a syntax extension.

The effect is that any suitable non-ambiguous syntax can be used to generate AST terms. The parser being used is Dypgen, a GLR parser which accepts dynamic extensions; the action codes are written in Scheme, with the AST comprising certain s-expressions.

It is not intended that "on the fly" syntax extensions be written by the programmer to solve one problem. I will give one example of how I used this feature myself.

Just for fun, I decided to make an object system based on functions returning records of closures. I gave this system syntax that looks a lot like Java: you can specify interface types and extend them, and you can specify objects which implement these types and extend them. It isn't Java, but it looks a bit like it, so it's reasonably easy to understand.

I just did it for fun. I never expected to use it. I'm not an OO person. But then I added "yet another feature" to the fdoc format my webserver processes: the ability to present a slideshow. The fdoc-to-HTML converter function was now really messy. So I decided to refactor it into separate run-time dynamic link libraries (plugins) to isolate the different features.

It's a real pain in Felix to get the types to agree across dynamic link boundaries because the compiler generates types using a sequential counter, so you have to export functions with extern "C" linkage using a programmer-specified C name. However, the actual types don't have to be the same provided they're layout compatible.

The obvious solution: use my Java like syntax. It worked like a dream. I only had to export a single function (the object constructor) and layout compatibility took care of the record type.

IMHO extensible syntax isn't about on the fly extensions for a single problem, but about syntactically abstracting idiomatic usage patterns at a high level. Such as defining an "object oriented" system entirely in the parser.

You can find the whole grammar here, everything you see there is a syntax extension:

http://felix-lang.org/lib/grammar

### Changing the economics

I agree with some of the points made earlier, that adding additional syntax has real economic costs. I view the Sugar* projects as trying to change the economics here, by making syntax extension as lightweight as possible, and providing all the programmer usability infrastructure (fine-grained IDE support, etc.) for as close to free as possible. The developer still has to learn the syntax, but at least they are well supported in doing so.

Another interesting paper on extensible syntax is this ICFP'12 paper on a generic typing approach to composing languages, drawing heavily from Data Types a la Carte. This is focusing on the language technology rather than on the programming experience, but still very interesting. Also, MetaHaskell tries to bring typechecking support to metaprogramming.

The reason it makes sense to me to build extensible syntax systems at the language level is mainly around typechecking; being able to define typing rules for your extensible syntax makes it much more useful.

With all this progress on all these fronts, I think the future is bright for extensible languages in general.