
A subtle extension to Lisp-style macros

I've been teaching myself language and compiler/interpreter design. Along the way, I have been studying existing languages as much as possible, to see what concepts I can borrow from them. One recurring theme I've seen in the discussions around here is that all things come back to Lisp, and specifically to Lisp macros.

Now I can see how Lisp macros are more powerful than, say, C macros, because they execute a chunk of Lisp code at compile time in order to generate the substitution code (I think I've got that right). This is why it is claimed that one can create other mini-languages within Lisp. However, a Lisp macro is still invoked using function-call syntax. So even if other languages implemented Lisp-type macros, this still wouldn't remove the need for domain-specific preprocessors such as Oracle's Pro*C (for using embedded SQL in a C program), or the Qt moc preprocessor used on C++ code.

So here's how I've decided to structure macros in my experimental language (called 2e). Macro / preprocessor code is enclosed in a pair of symbols "/% ... %/" (designed to somewhat resemble C comments). Whatever appears between those symbols is executed as 2e code by the preprocessor. That code has access to the whole language and its libraries; it also sees the entire input program as a character array (or string object). The job of each code fragment is then to use normal string manipulation techniques to modify the program text however it wants. This will include ordinary preprocessor macro replacements, processing "include" directives, etc. There will be a set of functions defined to perform common operations, but one can also supply DSL-specific routines that in effect extend the language syntax in any arbitrary way. For example, 2e doesn't currently have a CASE statement, but one can write a preprocessor script that adds that functionality (simply search for the pattern of how you define your case statement, and replace it with a "case" function call, converting the body of the case statement into anonymous functions that get passed as function arguments). In fact, once this is implemented I can start removing some of the current syntax from the base kernel language and moving it into the preprocessor system.
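To make the scheme above concrete, here is a minimal sketch in Python rather than 2e (which isn't implemented yet). It assumes the preprocessor first strips the "/% ... %/" blocks out of the program and then runs each one against the remaining text. The names preprocess and BLOCK are made up for this illustration, and input_code_string is borrowed from the example further down; none of this is meant as the actual 2e design.

```python
import re

# Hypothetical stand-in for the "/% ... %/" delimiters (not real 2e code).
BLOCK = re.compile(r"/%(.*?)%/", re.DOTALL)

def preprocess(source: str) -> str:
    """Run each /% ... %/ block as code that may rewrite the program text."""
    blocks = BLOCK.findall(source)
    # Remove the blocks themselves so only ordinary program text remains,
    # then expose that text to the blocks as one plain string.
    env = {"re": re, "input_code_string": BLOCK.sub("", source)}
    for code in blocks:
        # A block rewrites the program by reassigning input_code_string.
        exec(code.strip(), env)
    return env["input_code_string"]

program = '''/%
input_code_string = input_code_string.replace("SQUARE(x)", "(x * x)")
%/
y = SQUARE(x)
'''

result = preprocess(program)
print(result.strip())
```

Because each block is ordinary code with the whole language available, anything from a simple textual macro to a full DSL translator could in principle go between the delimiters. The cost is that every block works on raw text, with no parse tree to lean on.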

Now my question is whether this seems like the ultimate macro system. Or are there problems with it that I'm not seeing? (They'll probably show up once I start the implementation.)

Example:
/* Since the 2e language doesn't have a native "if" statement... */
/%
regex_replace(input_code_string, "if {\(.*\)} then {\(.*\)} else {\(.*\)}", "ifelse(@(\1), @(\2), @(\3))")
%/

The above would search through the input code, replacing any occurrence of "if {...} then {...} else {...}" with a call to the ifelse() function, in order to provide a more C-like "if" construct. (Of course, the regexp would need some further tuning to get it to work properly; this is just a quick example.)
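For anyone who wants to experiment with the idea, here is roughly the same substitution written with Python's re module. This is only a sketch: the pattern is my 2e example translated to Python regex syntax, with non-greedy groups added as one possible bit of "tuning", and rewrite_if is a made-up helper name.

```python
import re

# Non-greedy groups so each {...} stops at the nearest matching delimiter text.
IF_ELSE = re.compile(r"if \{(.*?)\} then \{(.*?)\} else \{(.*?)\}", re.DOTALL)

def rewrite_if(source: str) -> str:
    """Replace flat if/then/else forms with an ifelse() call."""
    return IF_ELSE.sub(r"ifelse(@(\1), @(\2), @(\3))", source)

flat = "if {a > b} then {print(a)} else {print(b)}"
rewritten = rewrite_if(flat)
print(rewritten)  # ifelse(@(a > b), @(print(a)), @(print(b)))

# A nested "if" shows where plain regexes run out of steam: the pattern
# cannot balance braces, so the inner "else" is taken for the outer one.
nested = "if {a} then {if {b} then {c} else {d}} else {e}"
nested_rewritten = rewrite_if(nested)
print(nested_rewritten)
```

The flat case comes out as intended, but the nested case is mangled and leaves a stray "} else {e}" behind. Handling that properly would take a brace-matching scanner or a real parser rather than a single regexp.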

Other items, such as embedded SQL (as processed by Oracle Pro*C or the PostgreSQL equivalent), can be handled in a similar way.

Does any other mainstream language have a similar capability (that is, a preprocessor or parser that can execute code fragments which directly manipulate the input program's code stream)?

JOVIAL: Stand up Schwartz

Information on the web about JOVIAL is rather scarce. I came across some old, grainy MPEG footage of Jules Schwartz giving an amusing speech, for those interested in the early pioneers of PLs.