Generating compiler back ends at the snap of a finger

The paper:

Resourceable, Retargetable, Modular Instruction Selection Using a Machine-Independent, Type-Based Tiling of Low-Level Intermediate Code

Ramsey and Dias have a series of papers about making it ever easier to generate compiler backends, and the claim is that they produce decent code to boot. I wonder if this stuff has/will show up in compilers I can use? (Or, do you think it not actually matter, for some pragmatic reason or other?)

Abstract: We present a novel variation on the standard technique of selecting instructions by tiling an intermediate-code tree. Typical compilers use a different set of tiles for every target machine. By analyzing a formal model of machine-level computation, we have developed a set of tiles that is machine-independent while retaining the expressive power of machine code. Using this tileset, we reduce the number of tilers required from one per machine to one per architectural family (e.g., register architecture or stack architecture). Because the tiler is the part of the instruction selector that is most difficult to reason about, our technique makes it possible to retarget an instruction selector with significantly less effort than standard techniques. Retargeting effort is further reduced by applying an earlier result which generates the machine-dependent implementation of our tileset automatically from a declarative description of instructions' semantics. Our design has the additional benefit of enabling modular reasoning about three aspects of code generation that are not typically separated: the semantics of the compiler's intermediate representation, the semantics of the target instruction set, and the techniques needed to generate good target code.

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

Neat

Though, it isn't clear to me that this easily targets higher level backends such as JavaScript. But I might give this a try.

Pragmatically...

... let me handwave with (somewhat) educated guesses, but I don't see GCC/clang & friends throwing out code to switch to this technique, unless the code is a maintenance headache.

What seems to me more realistic is that this kind of research will be used either in new compilers, or in code generators that don't have the manpower behind industrial C/Fortran compilers, so don't have this kind of phase yet. I'm thinking of JITted languages that have performance within 2x/4x of C simply because of the engineering effort of the backend (I seem to remember this kind of performance difference in papers about Smalltalk/Self virtual machines).

But this is somewhat far from my field, so I'd be curious on feedback.

Less potential

I think the 2x-4x performance that separates an awesome backend from a mediocre one is mostly register allocation. I'd be surprised if instruction tiling makes a performance difference of more than 15% on x86.

Related work?

No mention of CoSy or BEG?