DSL or dialects used inside compilers

Hello All,

I am interested in languages used inside compilers. Of course, for front-ends, there are a lot of parser generators (like ANTLR, bison, menhir, ...) and some more sophisticated ones (e.g. attribute-grammar based).

Also, functional languages are used not only for their own compilers (e.g. OCaml being coded in OCaml) but also in other source-code-related tools like Frama-C (a C static analyzer framework coded in OCaml).

Code generation also uses specialized formalisms (e.g. GCC machine descriptions) and all the BURG-like tools.

And even GCC has a middle-end Lisp dialect (my MELT branch) designed for middle-end transformation and static analysis.

But there have surely been many other dialects or DSLs used inside compilers. Any hints?

Regards.

Gimple/Generic

A question of my own: what is the status of MELT? (How does it compare to Gimple/Generic? Do you produce Generic from it?)

MELT & Gimple/Generic

MELT is able to handle GIMPLE tuples, in the sense that some MELT code mentioning GIMPLE (e.g. gcc/melt/ana-base.bysl) is translated into C.

But MELT is a translator to C; it is not a GCC front-end. The generated C code is expected to be compiled (by your C compiler, which usually is gcc but could be your system C compiler) into a dynamically loadable library. Generating C means that MELT can work (e.g. when gcc is a cross compiler) even on systems whose C compiler is not gcc.

I played with in-compiler DSLs a bit

I found it quite useful to have a rule propagation DSL inside a compiler. Prolog (or something similar) is good as well, but its performance is not acceptable in general. I'm using a Prolog interpreter and autogenerated Prolog programs for quick and easy implementations of type systems (like Hindley-Milner), and a simple rule propagation DSL for tagging AST nodes with trivial but useful facts. The DSL itself looks like this:

    ( (returns Cn Cx) (conscell Cx Orig) -> (conscell Cn Orig) )
    ( (binds Vn Cn) (conscell Cn Orig) -> (consvar Vn Orig) )
    ( (varrefs Tn Va) (consvar Va Orig)
      -> (conscell Tn Orig) )
    ( (conscell Tn Orig) -> (deconode Tn) )

The code above tracks variables and AST nodes where they're bound to a result of a constructor application, and another set of similar rules is used to unroll deconstructors, so a static supercompiler can get rid of temporary constructions, and things like map or fold over a map result are optimised properly.
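To make the idea concrete, here is a minimal sketch of how such rules could be run to a fixpoint by naive forward chaining. It is only an illustration in Python, not the actual implementation: the function names (`match`, `solve`, `propagate`), the convention that capitalised terms are variables, and the sample facts and their meaning are all assumptions made for the example.

```python
# Rules are (premises, conclusion); every term is a tuple of strings.
# Assumption: a term starting with an uppercase letter is a variable.
RULES = [
    ([("returns", "Cn", "Cx"), ("conscell", "Cx", "Orig")], ("conscell", "Cn", "Orig")),
    ([("binds", "Vn", "Cn"), ("conscell", "Cn", "Orig")], ("consvar", "Vn", "Orig")),
    ([("varrefs", "Tn", "Va"), ("consvar", "Va", "Orig")], ("conscell", "Tn", "Orig")),
    ([("conscell", "Tn", "Orig")], ("deconode", "Tn")),
]

def is_var(term):
    return term[:1].isupper()

def match(pattern, fact, env):
    """Extend env so that pattern equals fact, or return None on mismatch."""
    if len(pattern) != len(fact) or pattern[0] != fact[0]:
        return None
    env = dict(env)
    for p, f in zip(pattern[1:], fact[1:]):
        if is_var(p):
            if env.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return env

def solve(premises, facts, env):
    """Yield every variable binding that satisfies all premises."""
    if not premises:
        yield env
        return
    for fact in facts:
        new_env = match(premises[0], fact, env)
        if new_env is not None:
            yield from solve(premises[1:], facts, new_env)

def propagate(rules, facts):
    """Apply the rules repeatedly until no new fact can be derived."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            # Materialise the bindings first so the set is not mutated mid-iteration.
            for env in list(solve(premises, facts, {})):
                derived = tuple(env.get(t, t) for t in conclusion)
                if derived not in facts:
                    facts.add(derived)
                    changed = True
    return facts

# Hypothetical facts: node c1 is a constructor application k1, block b1
# returns c1, variable v is bound to b1, and node t1 references v.
facts = propagate(RULES, {
    ("conscell", "c1", "k1"),
    ("returns", "b1", "c1"),
    ("binds", "v", "b1"),
    ("varrefs", "t1", "v"),
})
```

With these sample facts the rules tag `t1` as a node whose value comes from constructor `k1`, which is the kind of information a deconstructor-unrolling pass would need. A real implementation would index facts by predicate rather than rescanning the whole set.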

Another DSL I'm using is an abstract flat representation of code, which provides variable liveness information for register scheduling and other similar things. The advantage of this approach is that some algorithms can be independent of the source and target languages, operating only on a generic intermediate abstract DSL.
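As a rough illustration of why a flat form helps, the sketch below computes liveness over a straight-line instruction list with a single backward scan. The `Instr` record, the opcode names, and the restriction to code without branches are assumptions for the example, not the representation described above; nothing in the pass depends on what the opcodes mean.

```python
from typing import NamedTuple, Optional, Tuple

class Instr(NamedTuple):
    dest: Optional[str]      # variable defined by the instruction, if any
    op: str                  # opaque opcode, never inspected by the analysis
    srcs: Tuple[str, ...]    # variables read by the instruction

def liveness(code, live_out=frozenset()):
    """Return, for each instruction, the set of variables live after it.

    Assumes straight-line code; branches would need an iterative
    dataflow analysis over basic blocks instead of one scan.
    """
    live = set(live_out)
    live_after = [frozenset()] * len(code)
    for i in range(len(code) - 1, -1, -1):
        ins = code[i]
        live_after[i] = frozenset(live)
        if ins.dest is not None:
            live.discard(ins.dest)   # a definition ends the previous live range
        live.update(ins.srcs)        # every use makes the variable live before it
    return live_after

code = [
    Instr("a", "const", ()),
    Instr("b", "const", ()),
    Instr("c", "add", ("a", "b")),
    Instr("d", "mul", ("c", "a")),
    Instr(None, "ret", ("d",)),
]
live = liveness(code)
```

Here `b` is dead after the `add` while `a` stays live until the `mul`, so an allocator could reuse the register holding `b` for `c`.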

This is an amusing question.

This is an amusing question. Do you have a particular reason for asking it, or just curiosity?

What is so funny about

What is so funny about asking the initial question?

WAG

I'd assume there is a common interest in DSLs.

But then the pater familias of LtU has been known to find humor in arid climes before. :-)

I meant amusing as in

I meant amusing as in "providing pleasure", hence "interesting", not as in "funny". Sorry if I offended you.

sorry for the misunderstanding

Please accept my apologies for misunderstanding your comments.

(I was asking my initial question while writing a grant proposal)

And yes, I am interested in compilation, DSLs, metaprogramming, and more generally meta-knowledge-based systems (but have too few occasions to really work on that last item).

One place I worked at used

One place I worked at used Velocity for some things, but I wasn't too happy with it (unfortunately, I didn't record my thoughts, and haven't had much success with other systems). I also did some source-to-source stuff and SSA conversion with ANTLR 3 for JavaScript a while back, and that wasn't too fun either (tree rewrite grammars were still being added at that point; they've probably matured since then). ocamllex/yacc is fast and safe, but doing typed stages is verbose.

As another data point, we experimented with doing our compilers course at Berkeley in Python, and I had very few basic programming questions that semester. This included some parsing, type checking, and a few source transformation / translation assignments. We're a Java school (!) with a diverse student body.

ROSE compiler

ROSE is one of the more sophisticated AST manipulation languages I've seen; it uses the EDG compiler front-end to compile ANSI C++. It allows for nearly arbitrary rewriting of the AST during compilation. I'm also starting to work in this area; I call these languages DSTLs, Domain Specific Transformation Languages. I see them as higher-level, special-purpose transformation languages that use more general languages like ROSE or MELT.