
Help with Mixfix in Bison?

As BitC moves towards a more human-compatible (sorry lispers!) surface syntax, we're considering mixfix parsing. Since it really applies only in the expression sub-grammar, it seems like a shame not to be able to use Bison (or something similar) for the rest of the grammar.

As near as I can tell, the only way to implement this is to have Bison simply accumulate a token sequence for expressions without trying to deal with precedence at all, and then run a rewriter over the resulting AST that applies the operator precedence rules dynamically.
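To make the "accumulate, then rewrite" idea concrete, here is a minimal sketch of the second phase using precedence climbing over a flat token array. It is a simplification under stated assumptions: it handles only binary infix operators (true mixfix needs more), it evaluates directly instead of building a tree, and every name in it (`Token`, `OpInfo`, `op_table`, `eval_flat`) is hypothetical rather than anything from BitC or Bison.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A flat expression as the parser might hand it over: operands and
   operator names alternate, with no precedence applied yet. */
typedef enum { TOK_NUM, TOK_OP } TokKind;
typedef struct { TokKind kind; long value; const char *op; } Token;

/* Hypothetical run-time operator table. In a mixfix setting, entries
   would be added as the program declares new operators. */
typedef struct { const char *name; int prec; int right_assoc; } OpInfo;

static const OpInfo op_table[] = {
    { "+", 10, 0 }, { "-", 10, 0 }, { "*", 20, 0 }, { "^", 30, 1 },
};

static const OpInfo *lookup_op(const char *name) {
    for (size_t i = 0; i < sizeof op_table / sizeof op_table[0]; i++)
        if (strcmp(op_table[i].name, name) == 0) return &op_table[i];
    return NULL;
}

static long apply_op(const char *op, long l, long r) {
    switch (op[0]) {
    case '+': return l + r;
    case '-': return l - r;
    case '*': return l * r;
    default: {                      /* '^': integer power, r >= 0 assumed */
        long acc = 1;
        while (r-- > 0) acc *= l;
        return acc;
    }
    }
}

/* Precedence climbing over toks[*pos .. n). Consumes one operand, then
   every following operator whose precedence is at least min_prec. */
static long climb(const Token *toks, size_t n, size_t *pos, int min_prec) {
    assert(*pos < n && toks[*pos].kind == TOK_NUM);
    long lhs = toks[(*pos)++].value;
    while (*pos < n && toks[*pos].kind == TOK_OP) {
        const OpInfo *info = lookup_op(toks[*pos].op);
        if (info == NULL || info->prec < min_prec) break;
        (*pos)++;
        /* Left-associative operators must not re-enter at their own level. */
        int next_min = info->right_assoc ? info->prec : info->prec + 1;
        long rhs = climb(toks, n, pos, next_min);
        lhs = apply_op(info->name, lhs, rhs);
    }
    return lhs;
}

/* Entry point: resolve precedence for a whole flat token sequence. */
long eval_flat(const Token *toks, size_t n) {
    size_t pos = 0;
    return climb(toks, n, &pos, 0);
}
```

Because precedence and associativity are looked up in a table at rewrite time, the Bison grammar itself would not need to know about them; whether that is less of a kludge than the GLR route is a separate question.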

Hmm. A kludge may be possible with mid-production actions and GLR parsing.

Is there a known solution to this, or am I just barking up trees?

To CPS or not to CPS

Old question, still difficult. I am in the process of moving from a compiler which compiles the source code to an AST representation in C and then runs a trampolining interpreter on top of it, towards full-blown C compilation.

I am unsure whether to do the CPS transform or not (it causes major code expansion, which seems like a large performance hit; or am I using the wrong transform?). I could also go with ANF, but I am not sure how to compile the ANF representation efficiently.

And actually, is there any _real_ difference between CPS and ANF? Seems to me that the CPS transform just makes the return pointer explicit in the lambda representation.
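To pin down what I mean, here is the same tiny computation, `twice(add1(x))`, written three ways in C. This is only a hand-written sketch of one common reading of the two forms, not the output of any particular transform; the `Cont` struct and all function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Direct style: nested calls, evaluation order implicit. */
static int add1(int x)  { return x + 1; }
static int twice(int x) { return x * 2; }
int direct(int x) { return twice(add1(x)); }

/* ANF: every intermediate result gets a name, but calls still
   return to their caller in the ordinary way. */
int anf(int x) {
    int t1 = add1(x);
    int t2 = twice(t1);
    return t2;
}

/* CPS: no function returns a value; each one passes its result
   to an explicit continuation instead. */
typedef struct Cont Cont;
struct Cont {
    void (*fn)(int result, Cont *self);
    Cont *next;   /* continuation to run after this one */
    int  *out;    /* used only by the final continuation */
};

static void add1_k(int x, Cont *k)  { k->fn(x + 1, k); }
static void twice_k(int x, Cont *k) { k->fn(x * 2, k); }

/* Final continuation: stash the answer where the caller can see it. */
static void store_result(int result, Cont *self) { *self->out = result; }

/* "The rest of the computation" after add1: double, then continue. */
static void then_twice(int result, Cont *self) {
    assert(self->next != NULL);
    twice_k(result, self->next);
}

int cps(int x) {
    int out = 0;
    Cont halt      = { store_result, NULL, &out };
    Cont after_add = { then_twice, &halt, NULL };
    add1_k(x, &after_add);
    return out;
}
```

The ANF version maps onto plain C locals and ordinary returns, while the CPS version needs a continuation record per call. Since C does not guarantee tail-call elimination, the nested `k->fn(...)` calls also grow the C stack, which is what the trampoline was working around in the first place.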

Any thoughts?

Can function pointers be "fixed"?

Whilst thinking on the subject of language design, specifically lowish-level (C++ level), I came up against the seeming brick wall of function pointers. Now, whilst function declarations can be modified to allow for such niceties as closures, coroutines and multiple return values, function pointers, it seems, can never be harnessed for the power of good. The reason for this seems to be the way that functions are declared in C-type languages, i.e. completely differently from data and, I might add, rightly so. I much prefer:

int doIt(int a)
CODE

or even something along the lines of:

int,int iHandle:swapInt(int a, int b)
CODE

as a more elaborate coroutine with multiple return values and a handle to its own instance, compared to more object-based declarations:


(int)function(int) doIt = new (int)function(int)
CODE

(Note: this is just an example in a sugarless but statically typed language; I'm sure there are nicer ways of declaring functions in an OOP style.)
This, in any case, also brings up the question of integrating return types into the declaration (hint: Pascal is also very ugly).
However, even for a modest, simple function, the C function pointer

int(*fp)(int);

is just completely whack compared to the rest of the data declarations.

int a;
char c;
int(*fp)(int);

(spot the odd one out)
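For what it's worth, standard C can narrow the gap with a typedef of the function type, so that the pointer declaration at least follows the usual "type name;" shape. A minimal sketch, with `int_to_int`, `add_one`, `call_through` and `demo` being names made up for this example:

```c
#include <assert.h>
#include <stddef.h>

/* Name the function type once; pointer declarations then look like
   any other declaration. */
typedef int int_to_int(int);

static int add_one(int a) { return a + 1; }

/* Takes a pointer to any function of type int(int) and calls it. */
int call_through(int_to_int *fp, int arg) {
    assert(fp != NULL);
    return fp(arg);
}

int demo(void) {
    int a = 41;
    char c = 'x';
    int_to_int *fp = add_one;   /* same as: int (*fp)(int) = add_one; */
    (void)c;                    /* declared only to show the parallel */
    return call_through(fp, a);
}
```

This only hides the problem, of course: the typedef itself still has to be written in the odd form, once per signature.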

So my question is this: has anyone encountered a statically typed function or function-pointer declaration system that is at all elegant?

-DNQ

Programming Languages Aren’t

What is a programming language? The common definition generally only includes the syntax and semantics of, for lack of a better description, the “text” of a program. For instance, we think of languages as defining how variables are declared, how loops are specified, how functions are defined, whether certain mechanisms like continuations or closures are first-class (or close to it), etc.

This definition fails to capture how the languages are really used, though. Just because a handy library is written in C++ doesn’t mean it will integrate nicely into my present C++ program. The reason is that it is the APIs that are the real language - not the obscure rules that define what a loop looks like or which characters are allowed in a variable name.

I realize this is a (potentially misinformed) blog post (from my own blog, no less), but I'm not familiar enough with the research to know whether this phenomenon has been widely studied, and if so, where I can read more about it.

In my experience, the integration of multiple libraries with different styles is one of the primary reasons large programs often devolve into an incomprehensible mess over time. It's not the only reason, of course, but I think it contributes - especially when some programmers on the team are more familiar with one library vs. another, etc.

What could be done in the context of a programming language to mitigate this? Or, what has been done? It doesn't seem like the mainstream languages are very good at keeping this problem at bay. (Or is it perhaps not a problem for anyone but me?)