
Language design: Escaping escapes

Most languages use special characters, or special sequences of characters, to denote special directives. For instance, in C string literals:

"\n" denotes a newline, and if you wanted a literal backslash followed by n, you'd have to write "\\n".

In VB, since " is used to delimit strings, you'd write "" to mean a double quote, e.g. "He said ""Boo""!"

In each case, showing source code written in the same language gets very painful, since you have to escape the escapes. For instance, to show an example of an <img> tag in HTML, I'd have to write this:

&lt;img src="example" &gt;

In C you'd do this:

printf("For example: printf(\"Hello World\\n\"); prints \"Hello World\"\n");

Are there any languages or patterns which handle this kind of situation more elegantly?

Examples of patterns:

  1. Escaping HTML escapes is not really necessary because the user can "View source"
  2. Another example is Python, which has triple-quoted strings ("""), inside which a lone double quote needs no escaping.
  3. ASP, JSP, and PHP use a sequence of unusual characters to mark code, e.g. <% ... %> or <?php ... ?>
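
To make the Python item concrete, here is a minimal sketch of how a raw, triple-quoted string avoids escaping escapes. The variable names are made up for illustration; the snippet embeds the C example from above without touching its quotes or backslashes.

```python
# A raw (r prefix), triple-quoted string keeps backslashes and
# double quotes exactly as typed.
c_source = r'''printf("Hello World\n");'''

# The same text as an ordinary string literal: every double quote
# and every backslash has to be escaped by hand.
escaped = "printf(\"Hello World\\n\");"

print(c_source)
print(c_source == escaped)  # True: both hold the same 24 characters
```

This only moves the problem rather than removing it: the text still cannot contain the chosen triple-quote delimiter, and a raw string cannot end in an odd number of backslashes.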

Writing a DSL for Java

The site java.net has an article on extending Java with Tasks, code blocks executing in a separate thread. This particular extension may not be very exciting, but perhaps it will introduce the idea of DSLs to a new group of programmers.

The article uses a parser generator called VisualLangLab that seems to be "Yacc with a GUI". Since the goal of the article is to extend the Java language, they need a Java grammar to start with. VisualLangLab comes with a Java 1.4 grammar, so that is what is used.

This means that the Task extension cannot be used with a 1.5 compiler, even though the extension only touches parts of the Java grammar that (I guess) have not changed between 1.4 and 1.5. Could this problem be circumvented if the standard libraries provided a representation of the grammar used in the current platform?
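
As a rough illustration of what such a facility might look like, here is a sketch by analogy in Python, not in Java. Python's standard library ships an `ast` module whose parser always matches the grammar of the interpreter that is running it. This is an assumption about what "a representation of the grammar" could mean in practice: it exposes the platform's parser and syntax tree, not the grammar rules themselves.

```python
import ast

# Parse a snippet using the grammar of whichever Python version
# is running this script; no separately maintained grammar file.
source = "x = 1 + 2"
tree = ast.parse(source)

# The result is an ordinary tree that a tool could inspect or rewrite.
stmt = tree.body[0]
print(type(stmt).__name__)        # Assign
print(type(stmt.value).__name__)  # BinOp
```

A language extension built on a facility like this would follow the platform's grammar automatically, instead of being tied to whichever grammar version came bundled with the parser generator.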