An Incremental Approach to Compiler Construction

Abdulaziz Ghuloum (Indiana University), An Incremental Approach to Compiler Construction

Compilers are perceived to be magical artifacts, carefully crafted by the wizards, and unfathomable by the mere mortals. Books on compilers are better described as wizard-talk: written by and for a clique of all-knowing practitioners. Real-life compilers are too complex to serve as an educational tool. And the gap between real-life compilers and the educational toy compilers is too wide. The novice compiler writer stands puzzled facing an impenetrable barrier, â€œbetter write an interpreter instead.â€

The goal of this paper is to break that barrier. We show that building a compiler can be as easy as building an interpreter.

This paper from the Scheme workshop presents an extended tutorial showing how to implement a compiler from Scheme to x86 assembly language. The compiler is written in Scheme, obviously (something that may confuse students, I might add, so beware).

An important aspect of the presentation of the material is that the compiler is built incrementally, and the product of each step is a working compiler for a growing subset of the language. This is similar to the EOPL approach. In contrast many compiler textbooks and courses follow a different route, in which each phase is dedicated to building a different part of the compiler (i.e., lexer, parser etc.), and thus the student only sees how the parts interconnect, and gets the satisfaction of seeing the compiler work, at the very end of the process.

Supporting material can be found here.

By Ehud Lamm at 2006-10-02 12:02 | Implementation | Teaching & Learning | other blogs | 45909 reads

Bottom up

It's a particularly fun paper because it builds the compiler bottom up. The first thing he handles are fixnums: he moves them into eax, because that's what gcc does for a simple program that returns an integer. Then he moves on to tagging for different types of values, and so on.

A pretty cool idea, actually -- he assumes no knowledge of x86 in particular, instead using the code that gcc generates as a guide.

By Michael Greenberg at Mon, 2006-10-02 15:33 | login or register to post comments

definitely fun

It's a particularly fun paper

I am working though the paper now and I can definitely agree that it is fun!

By Peter Michaux at Wed, 2009-12-30 18:23 | login or register to post comments

Nice

For the enthusiast that's not going to be building compilers for a living, it's nice to be able to get to the meat of a compiler without getting bogged down in theory, which can be explored later if further interest is there.

I'm sure it's been mentioned here before, but Jack Crenshaw's Let's Build a Compiler seems to be something along the same lines.

By Dave Lopez at Tue, 2006-10-03 01:39 | login or register to post comments

It has been mentioned

I first heard about Let's Buld a Compiler through this very forum, but was a bit disappointed when I tried reading it. It's a great series of articles, but it's rather long-in-the-tooth now, and I don't think the years have been kind to it. (68000 assembler + Pascal anyone?)

By Peter McArthur at Tue, 2006-10-03 08:09 | login or register to post comments

Well for readability, I'd

Well for readability, I'd take 68000 assembler instead of x86 assembler without any hesitation!

By renox at Tue, 2006-10-03 12:07 | login or register to post comments

Minimalist good

I appreciate the minimalist style of this tutorial. Ghuloum seems to have put a lot of thought into giving the reader exactly the right amount of information -- just enough to leave the reader feeling confident that he can figure out what to do next for himself.

Previously on LtU: The 90 Minute Scheme to C Compiler.

By Peter McArthur at Tue, 2006-10-03 08:17 | login or register to post comments

Good, but familiar

Structure and Interpretation of Computer Programs makes the same point. Ditto for Norvig's wonderful Paradigms of AI Programming. And then of course there's Crenshaw's classic "Let's Build a Compiler!" (which now has a Forth version. Not that I get tired of papers like this, mind you :)

By James Hague at Tue, 2006-10-03 13:14 | login or register to post comments

Minimalist Language Development

How does this approach lend itself to language development? It would be interesting to develop a language from object code primitives and build its compiler at the same time...

By Nathaniel Dean at Tue, 2006-10-03 14:34 | login or register to post comments

Compared to EOPL

How would this compare to the approach in Essentials of Programming Languages, from those who have read both?

By rbe5000 at Thu, 2006-10-05 00:07 | login or register to post comments

Both? I am not sure exactly

Both? I am not sure exactly what you are referring to.

The EOPL appraoch (philosophy, if you will) is to gradually build language constructs, while having a working interepreter at all stages. That's was my main point.

As regards compilation, EOPL1 derives a compiler (and VM) from an interpreter. This specific technique is unrelated to the current paper.

By Ehud Lamm at Thu, 2006-10-05 18:43 | login or register to post comments

Ah yes

Sorry. I somehow forgot the line in your initial post that says it is similar to EOPL in this respect.

By rbe5000 at Thu, 2006-10-05 20:48 | login or register to post comments

Lambda the Ultimate

User login

Navigation