Why compiler optimizations are interesting

    Compiler optimizations are interesting because they automatically improve the efficiency of programs. Hand-optimization by a programmer is a time-consuming and notoriously error-prone activity, so it is of the utmost importance for the compiler to make correct optimizations automatically wherever possible. Compiler optimization also serves as an important link between theory and practice: many optimizations are made by proving certain properties of programs -- for instance, that an array index cannot be out of bounds or that a data structure is uniquely referenced -- and the relevant proof techniques are a valuable area of research in themselves.

    Ensuring that whole programs can be effectively optimized does impose certain design constraints on the compiler and on the programming language. The benefits of automatic optimization do, however, far outweigh these relatively minor restrictions.

To me that is a reasonable perspective and I think it's widely held. My own real perspective is quite different, but it's nothing novel and can probably be summarised using catch-phrases:

    Adequate performance really comes from algorithms, data structures, profilers, experience; the compiler itself is the program that most needs to be simple and fast; whole-program analysis is the enemy of interactive programming; the means seem to justify the ends for many optimizations presented at conferences; I must trust my own experience more than microbenchmarks and Socratic discourses on modern computer architecture.
Of course this has absolutely nothing to do with type systems.

Simple and fast

Just yesterday I finally took a look at all the presentations from Oberon Day. Aside from Wirth and Digital Mars, who else cares about simplicity?

Sometimes I wonder whether in the future, due to the increasing popularity of bytecode or maybe back-end frameworks like MLRISC, compilers will simply compile and optimizers (as completely separate entities) will optimize.

lost in translation

    compilers will simply compile and optimizers (as completely separate entities) will optimize

Do you mean that an optimizer would be a new pass that operates on already-compiled code? If so, then wouldn't you miss out on all the optimizations that can be done due to properties of the source language that get lost in translation?

Or maybe you mean for optimization to be modularized, which is less controversial (although it might be ripe for some interesting--wait for it--modularization of cross-cutting concerns).

Re: lost in translation

He might mean that the compilation phases should be split into two distinct programs.

So the front-end, or "compiler" piece, would run and generate some sort of intermediate code file that carries additional optimization info. The optimizer would read in this file and produce the executable.

The final goal, of course, would be that the intermediate representation becomes some sort of UNCOL. ;-)

Matt Markland

Optimizations

    If so, then wouldn't you miss out on all the optimizations that can be done due to properties of the source language that get lost in translation?

I think the point is that you can make whatever high-level, language-specific optimizations you want in your compiler, but that the optimizer can still perform very significant language-independent optimizations: all the standbys like inlining, constant propagation, dead code elimination, loop unrolling, and so on.
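
As a minimal before/after sketch (in Java, with invented names), here is the kind of rewriting such a language-independent pass can do entirely on its own:

    // Before: as a programmer might write it.
    class Before {
        static int square(int x) { return x * x; }

        static int compute() {
            int a = 3;
            int b = square(a);     // candidate for inlining
            int unused = b * 100;  // result never read: dead code
            return b + 1;
        }
    }

    // After: what the optimizer can derive with no knowledge
    // of the source language.
    class After {
        static int compute() {
            // inline:  b = 3 * 3
            // fold:    b = 9
            // 'unused' eliminated entirely
            return 10; // 9 + 1, computed at compile time
        }
    }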

If you look at LLVM, you can see that, even though the instruction set is pretty simple, it carries significant optimization-supporting information.

Compilers separate from optimizers

    compilers will simply compile and optimizers (as completely separate entities) will optimize.

I was just beginning to think the same thing. I'm in the middle of writing a dynamic recompilation core for a Z80 emulator that targets the JVM. The cool thing is that it looks like I won't have to generate an IR and perform compiler optimizations on it, because I can just generate Java bytecode directly and let the optimizer in the virtual machine take care of it. The end result is that the emulator should run very fast, with a modicum of effort on my part.
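
A hypothetical sketch of that approach, emitting Java source rather than bytecode for readability; the three opcodes are standard Z80 encodings, but the one-register model is a deliberate simplification and not the actual emulator:

    // Translate each Z80 opcode into straight-line Java code over an
    // accumulator field, and let the JVM's optimizer fold and inline
    // the result. (Simplified: no flags, tiny opcode subset.)
    class Z80Translator {
        static String translate(int opcode) {
            switch (opcode) {
                case 0x3C: return "a = (a + 1) & 0xFF;"; // INC A
                case 0x3D: return "a = (a - 1) & 0xFF;"; // DEC A
                case 0xAF: return "a = 0;";              // XOR A clears A
                default: throw new IllegalArgumentException("unhandled opcode");
            }
        }

        public static void main(String[] args) {
            // A basic block becomes a run of simple statements; the JIT
            // can then optimize across them (e.g. INC A; DEC A cancels).
            for (int op : new int[] { 0xAF, 0x3C, 0x3D }) {
                System.out.println(translate(op));
            }
        }
    }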

I love writing optimizers

The idea that with just one bit of code I can make every occurrence of a certain code pattern run faster in language X is just very appealing. It is meta-refactoring, in a way. A sport in itself.
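
For instance, a single peephole rule of this shape speeds up every multiply-by-a-power-of-two the compiler ever sees; the toy Insn type here is invented for the sketch, not any real compiler's IR:

    import java.util.List;

    // One rule, applied everywhere: strength-reduce MUL by 2^k to SHL by k.
    class Peephole {
        record Insn(String op, int operand) { }

        static Insn rewrite(Insn insn) {
            if (insn.op().equals("MUL") && Integer.bitCount(insn.operand()) == 1) {
                return new Insn("SHL", Integer.numberOfTrailingZeros(insn.operand()));
            }
            return insn;
        }

        public static void main(String[] args) {
            // MUL 8 becomes SHL 3; everything else passes through untouched.
            for (Insn i : List.of(new Insn("MUL", 8), new Insn("ADD", 5))) {
                System.out.println(rewrite(i));
            }
        }
    }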

I am very big on simple/small/fast, but I will happily make an exception for optimizers. A good optimizer can be scaled, so it can be fast if needed. A bug-free optimizer is a black box, so all its complexity is completely hidden. The cost of "complexity" is worth it, as every algorithm in it is amortized over so many pieces of code.

Like me, a lot of people here like very productive/expressive/dynamic languages... but out there many people still think in terms of "won't use it unless it's within N% of C". Optimizers can pull a larger share of expressive language features within that benchmark... they are essential for a language's marketability.

The biggest hurdle for optimizers for very expressive languages is performance predictability. Certain optimisations (e.g. constant folding) are trivial and therefore predictable; the more complex they get, the less predictable they become. It is very cool if "primes" in Haskell can run as fast as C, but it becomes useless if a similar program suddenly runs 100x slower. Sure, you can always do more with smart algorithms, but working with such variance is frustrating. Besides, it is sometimes nice not having to hand-optimise your simple & elegant but brute-force code because the compiler guarantees very tight code that is "fast enough".
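
A toy sketch of why constant folding sits at the predictable end of the spectrum: it is one deterministic bottom-up pass, with no global analysis whose success the programmer has to guess at (the Expr type here is invented):

    // Toy constant folder: a single deterministic bottom-up pass.
    class ConstFold {
        interface Expr { }
        record Lit(int value) implements Expr { }
        record Add(Expr left, Expr right) implements Expr { }

        static Expr fold(Expr e) {
            if (e instanceof Add add) {
                Expr l = fold(add.left()), r = fold(add.right());
                // Both sides constant: compute now rather than at run time.
                if (l instanceof Lit a && r instanceof Lit b) {
                    return new Lit(a.value() + b.value());
                }
                return new Add(l, r);
            }
            return e;
        }

        public static void main(String[] args) {
            // (1 + 2) + 3 folds to Lit[value=6], every time, guaranteed.
            System.out.println(fold(new Add(new Add(new Lit(1), new Lit(2)), new Lit(3))));
        }
    }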

But the type system is important

    Of course this has absolutely nothing to do with type systems.

In statically typed languages, the optimizer can use the fact that types are known at compile time to perform type-dependent optimizations (e.g. keeping integers unboxed in registers).
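
A rough Java illustration of the difference, with invented method names: the statically typed loop runs on unboxed machine integers, while the dynamically typed shape forces a type check and an unboxing on every element:

    class TypedSum {
        // Static types known: the whole loop runs on unboxed ints,
        // which the JIT can keep in registers.
        static int sumTyped(int[] xs) {
            int total = 0;
            for (int x : xs) total += x;
            return total;
        }

        // The dynamically typed shape: each element is a boxed Object
        // that must be type-checked and unwrapped before every addition.
        static int sumDynamic(Object[] xs) {
            int total = 0;
            for (Object x : xs) {
                if (!(x instanceof Integer i)) throw new ClassCastException();
                total += i; // unbox on every iteration
            }
            return total;
        }
    }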

Erehwon

It's tempting to name a language that.

I'm beginning to think that the place to focus now is an extremely minimalistic vectorized virtual machine, bootstrapping from there, with the emphasis squarely on threading and vectorization for modern hardware. This is not at all an original thought of mine; see vlerq.