a + b * c in Brian Meek's "The static semantics file"

Although my previous post, translational vs. denotational semantics, failed to stimulate any discussion, hopefully this one will.

Consider the following excerpt from Brian Meek's well-written "The static semantics file".

[...]
term = factor | term, multiplying op, factor;
and
term = factor, [multiplying op, factor];
apparently mean the same (square brackets = zero or more repetitions). However, the recursive version better expresses the associativity from left to right. [...]

[...] The order of evaluation of an expression like a + b * c is undoubtedly semantic (linked to code generation, see later) but the use of recursion above only hints at the semantics (to use a phrase from a later electronic correspondent); it is descriptive, not prescriptive. The difference is that the expression can be generated by either rule (or by one implying right to left associativity). [...] The semantics gets put in at code generation and that can be done from any of the equivalent definitions. [...] [I]t appears to me to be the nearest approach among the examples put forward to anything which could be termed "static semantics"; it is perhaps a borderline case.

This seems a bit "off" to me. First of all, how "a + b * c" is parsed is a question of operator precedence, not operator associativity. Second, and more importantly, it seems to me that parsing is clearly a syntactic issue unless you view the semantics of a language as being defined with respect to its concrete rather than abstract syntax.
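For example, precedence is conventionally encoded by stratifying the grammar, one level per precedence class. A sketch in the same EBNF style as Meek's excerpt (the expression and factor rules here are my additions, not from the paper):

expression = term | expression, adding op, term;
term = factor | term, multiplying op, factor;
factor = identifier | "(", expression, ")";

The stratification forces a + b * c to parse as a + (b * c); the left recursion within each level is what encodes left-to-right associativity.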

Such a view can be reconciled with the more traditional "semantics is defined with respect to abstract syntax" view in the following manner. Let language L's "broad" semantic function be defined as the composition of L's parser with L's "narrow" semantic function. Another (perhaps weirder) way to look at it is to view the parser as defining a language in its own right, with the abstract syntactic domain of L being the parser's semantic domain. To put this all in Haskell-ish notation,

parser :: ConSynDom -> AbsSynDom
narrow :: AbsSynDom -> SemDom
broad  :: ConSynDom -> SemDom

broad = narrow . parser
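To make this concrete, here is a minimal runnable sketch of these three functions, using a toy concrete syntax of single digits, +, and * (the names and the recursive-descent parser are my own illustration, not anything from Meek's paper):

module Main where

import Data.Char (isDigit, digitToInt)

-- Abstract syntactic domain: precedence and associativity already resolved.
data Expr = Lit Int | Add Expr Expr | Mul Expr Expr
  deriving Show

-- parser :: ConSynDom -> AbsSynDom
-- Recursive descent; '*' binds tighter than '+', both left-associative.
parser :: String -> Expr
parser s = case expr (filter (/= ' ') s) of
  (e, "")   -> e
  (_, rest) -> error ("unconsumed input: " ++ rest)
  where
    expr cs = let (t, cs') = term cs in expr' t cs'
    expr' acc ('+':cs) = let (t, cs') = term cs in expr' (Add acc t) cs'
    expr' acc cs       = (acc, cs)
    term cs = let (f, cs') = factor cs in term' f cs'
    term' acc ('*':cs) = let (f, cs') = factor cs in term' (Mul acc f) cs'
    term' acc cs       = (acc, cs)
    factor (c:cs) | isDigit c = (Lit (digitToInt c), cs)
    factor cs = error ("expected a digit at: " ++ cs)

-- narrow :: AbsSynDom -> SemDom
narrow :: Expr -> Int
narrow (Lit n)   = n
narrow (Add x y) = narrow x + narrow y
narrow (Mul x y) = narrow x * narrow y

-- broad :: ConSynDom -> SemDom
broad :: String -> Int
broad = narrow . parser

main :: IO ()
main = do
  print (parser "1 + 2 * 3")  -- Add (Lit 1) (Mul (Lit 2) (Lit 3))
  print (broad  "1 + 2 * 3")  -- 7

Note that it is parser, not narrow, that decides "1 + 2 * 3" means Add (Lit 1) (Mul (Lit 2) (Lit 3)); by the time narrow runs, the precedence question has already been settled syntactically.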

Another way of looking at this is that syntax can be viewed either as merely discriminating between valid and invalid sentences, or as also assigning a parse to each valid sentence. The former, "boolean" role for syntax pushes parsing issues somewhere else: perhaps into pragmatics, perhaps into semantics, or perhaps into some ill-defined area called "static semantics." This last option is of course what Meek is writing about.
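In the notation above, these two roles correspond to two different types for the syntactic front end (recognize is just a name I'm introducing for the "boolean" role):

recognize :: ConSynDom -> Bool
parser    :: ConSynDom -> AbsSynDom

The "boolean" role gives you recognize and nothing more; a language definition built on it has to recover the parse somewhere else.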

So, if all we're looking for from a grammar is whether it can generate a sentence or not (the "boolean" role), then, as Meek mentions, it need not deal with ambiguities arising from issues like operator precedence and associativity. Maybe this "boolean" role is what Meek calls a "descriptive" role: the grammar merely describes what valid sentences look like. But I'm having trouble seeing how a "prescriptive" role includes parsing. Would this mean that an unambiguous grammar, i.e., one that assigns a unique parse to each sentence, is "prescriptive" in the sense that it tells you where you "should" have put parentheses? Seems like a stretch.

Anyway, my final take is this: I think I agree with Meek in the big picture. Different languages draw the syntax/semantics line in different places, but it is always there, and there is nothing useful in between called "static semantics." One of the ways in which you can move the line so that the semantics is bigger is to define the syntax ambiguously with regard to operator precedence or associativity. Then these issues must be defined by the semantics. In the extreme, you could define the syntax of a traditional textual language merely by the character set its programs are expected to be in, leaving everything else to semantics. But in practice this is not a useful way to define a language.


I find friendly people suspicious.

From the link:

If there is a definitive statement of what static semantics is - or even where syntax stops and semantics starts - I have not found it.

I have to agree with this. The situation described in "The static semantics file", where people are poring over error messages and trying to classify them as syntactic or static-semantic, is misguided from the start. The distinction between the two is a property of a particular definition, not of the thing being defined. You can have two different definitions of the same thing. For example, a property which is enforced by a type system formalism in one definition could be enforced by a sufficiently powerful grammar formalism in another. The distinction is useful only because traditionally we limit syntax formalisms to CFG-type things.

Abstract Syntax and VW Grammars...

Well, having read that particular paper while working on my Master's thesis, hopefully I can clear some things up. The paper is using van Wijngaarden (VW) grammars to define the languages in question.

VW grammars can specify semantics because they are Turing-complete. They are "two-level" grammars: the upper (meta) level is itself a context-free grammar, and its derivations are substituted into the lower level's rule schemas to produce a potentially infinite set of context-free rules.
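To give the flavor, here is my sketch of the standard textbook example, not an excerpt from Meek's paper (notational details vary between presentations): a VW grammar for the non-context-free language a^n b^n c^n.

Metarules (the upper context-free grammar):

N :: one ; N one .
A :: a ; b ; c .

Hyperrules (rule schemas; each metanotion must be replaced consistently within a single rule):

start : N a part , N b part , N c part .
N one A part : A symbol , N A part .
one A part : A symbol .

Because the same replacement for N must be used throughout the start hyperrule, the three parts are forced to have equal length, something no single context-free grammar can express.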

Anyway, VW grammars are really neat and have some interesting ideas, but their use (in Algol 68) predates other ways of handling semantics, which are for the most part better all around. Meek is using, as Frank said, a "sufficiently powerful grammar formalism", and uses it to show that the line between "syntax" and "semantics" can be pretty arbitrary.

I don't know if this helps, but I spent two years of my life on this particular back-alley of CS, trying to figure out semantics the wrong way :). Someone said that "Syntax is the Vietnam of Programming Languages"; I must agree.

VW grammars

I wouldn't say the paper uses two-level/VW grammars, though I agree it does mention them to make the important point that by using a more powerful grammar you can shift the dividing line between syntax and semantics.

This gets to the point that Frank Atanassow reinforced, that the dividing line is a function of the way the language is specified, not an inherent property of the language itself.

But, in this respect, there is nothing special about two-level/VW grammars; as Meek mentions, the point is really about context-free vs. context-sensitive ("more powerful") grammars, which would include the more popular alternative to two-level/VW grammars, attribute grammars.

I would also qualify your statement "VW grammars can specify semantics" to "VW grammars can specify what would be called semantics if a context-free grammar were used to specify syntax".