## Supporting a spectrum from whole program to separate compilation to aid in efficient program generation

There are certainly whole program compilers with the aim of making higher level languages compile far more efficient runtime executables. But as I currently understand these compilers, their practical usefulness for large scale program development is limited due to efficiency of the compilation process itself.

So, I was wondering if any languages - more specifically, the types of higher level languages that feature language abstractions that might benefit most whole program compilation - supported a compilation model and associated language abstractions that tried to encompass a wider spectrum of optional efficiency oriented language abstractions and features between whole program model compilation and separate compilation in the name of runtime efficiency.

As for "whole program compilation," what I have in mind instead is some kind of limited "library" - or "package" or "unit" (etc.), let's use "library" for now - abstraction comprised of multiple source files that are subject to "whole library compilation. This would allow programming teams to decide when to apply a more limited scope of "whole program compilation" just to performance critical libraries in their programs.

Between these "libraries" (however they are compiled), we have "separate compilation" made safe via traditional module "signatures," or unit "interfaces" or whatever (pick your lingo and preferred module header styled feature).

But additionally, within the "separate compilation model," we might also support other language features that assist the compiler in generating more efficient code. Some low hanging fruit might be: annotations and/or compiler driven function inlining, annotation or compiler driven loop unrolling, C++ or D style template data types and functions (as an alternative to "normal" generic, parametric polymorphic functions and data types), C (and Smalltalk, interestingly)-styled variable length data structures, full blown Scheme styled macros (no, not always or even primarily an efficiency tool, but still...), annotations for "destructured" function argument data types (potentially avoiding memory allocation for passing many function arguments), and so on and so forth.

Once could write entire programs in the language without using these features, but one could also write entire libraries (say, just for example, a performance oriented OpenGL or encryption library) using these directives and alternative language level abstractions - and then pull out all the stops by subjecting the "library" to whole program compilation. Aside from segregation of the program into discrete libraries (yes, not always an easy task in the general case), most of these efficiency oriented features could be utilized incrementally in modifying a program in response to profiling.

Skipping the not-so-high-level C++ (simple macros, templates), are there any higher level languages that support such a "spectrum of efficiency oriented language abstractions and features." I'm particularly interested in the concept of a more limited scope for "whole program compilation" via some kind of more selective "library" abstraction.

I'd welcome any wisdom on efficacy, usability and implementation challenges.

## Comment viewing options

### Partial Evaluation

I certainly don't believe profile directed control is an answer.

I'm more inclined to think partial evaluation is the answer. You have a program with general types such as "matrix". Then you specialise to "dense matrix". Then to "symmetric matrix". Each type refinement produces optimisations.

The final refinement produces the ultimate optimisation: you refine to the exact data and get a result.

What is important here is not to pick the best points to do incremental specialisations for solving one problem, but rather to consider the interesting sets of problems and a whole hierarchy of partial specialisations: we want to solve many problems quickly not just one.

So, the cost of solutions should be directed by the structure of the data set.

I hate to say this but Object Oriented systems today have the best support for this. You can write programs that solve general problems and improve the performance by use of derived classes.

### Interfaces are already a spectrum

The very point of "separate compilation" is to force you to design an interface language to describe which aspects of other modules you rely on. A perfect interface language is one where you can always abstract a concrete definition into an abstract one, you can always describe what was being relied upon, and no more.

ML languages have reasonable, but perfectible, interface languages. I claim that an excellent interface language will already encompass your "spectrum" from separate compilation to whole-compilation... while still guaranteeing separation compilation by definition.

When you use a value M.foo from a concrete module M, and suddenly you decide to abstract M under an interface, what will you describe about foo?

- maybe you and the compiler only relied on the fact that M.foo is an integer; then foo : int should suffice in the interface description

- maybe your language has some notion of constant analysis that was able to determine that foo is a constant value, and use it for constant propagation; then you should be able to give this constant value in the interface

- maybe foo is actually an elaborate definition that it is important to have inlined for efficiency (eg. because it usually expose wrapper/worker-like optimization opportunities); then you should be able to give this elaborate definition through the interface

An interface language for a module should be able, in the limit case, to simply give the concrete definition of the value its characterizing. You get a "not really abstract" interface that is usable by the compiler for inlining, static reasoning, etc.

Of course, the less abstract an interface is, the more fragile the abstraction boundary; if you published the concrete definition through the interface, and you now change it, you probably need to recompile the depending module that compiled against the old definition (and, on the user side, fix the depending code that relied on this old definition). You may want to show different interfaces to your fellow coworkers and to the optimizing compilers, but in the end you'll always be encouraged to sacrifice some performance for more flexibility and less hard-binding. Notice that you're able to make this choice locally, on a value-per-value basis, as a natural property of a rich interface language.

### Also for sharing

The very point of "separate compilation" is to force you to design an interface language to describe which aspects of other modules you rely on. A perfect interface language is one where you can always abstract a concrete definition into an abstract one, you can always describe what was being relied upon, and no more.

That's an interesting view, mainly because it never occurred to me. So far as I can tell, the historical point of separate compilation was to deal with limited compiler and CPU performance and limited available memory. It simply wasn't feasible to do whole-program compilation on large bodies of code in any human-tolerable compile/edit/debug cycle.

Nowadays that remains true for a few very large programs, mainly because certain optimizations are exponential in code size, but the primary use remaining seems to be shared libraries, for which the interface needs to be concrete rather than abstract.

One can, of course, choose to compile down to some form of bytecode and do code generation at runtime. Whether that is viable depends a lot on how robust the resulting binary execution needs to be. A runtime code generator is a lot of code, and therefore a lot of potential bugs.

### You're right that separate

You're right that separate compilation was mostly implemented as a means of dealing with CPU and memory limitations. OTOH, module interfaces as we understand them are mostly a means of dealing with namespace management. We consider namespace management to be for our own benefit now (and it is mainly for our own benefit -- *now* ), but originally file-level scope was a side effect of using headers to speed up the linker.

There is in principle no problem with deriving interface information from the source code itself. You can do that with separate compilation, because the compiler will discover all the interface points (definitions that might be referred to from elsewhere and references not resolved by definitions in scope in the current file) while it does compilation. The compiler can write all this information to a file and the linker can then use it (along with the object code file which contains the locations that the interface file refers to) to put together the project.

Poof, under this scenario separate compilation works. But that assumes you have a global namespace, or else you have to introduce an ugly hack like using the filename that source code appears in when referring to its definitions. And global namespaces are sort of a problem; now because they are confusing, but at the time more because they were potentially big.

Header files as we understand them address namespace management; on one hand they are ways of having seven different variables named 'list' in different modules and they don't interfere with each other. On the other hand they are a promise that we can limit the amount the linker has to know. The header file contains *ONLY* those symbols we want to have resolved by definitions in the relevant file.

Honestly, I think that limiting the size of the data the linker had to work with was probably more important early on; remember some of those early languages had compilers implemented as seven or eight different binaries each of which did one "pass" and wrote intermediate results for the next program to use, because a single program that did everything would have been too big to fit in memory (or even in the addressable space in some cases). Remember that many of them existed on systems where you couldn't have more than three files open at a time, or where the filesystem was a tape drive and thus random-access took time linear in file size. The linker, in this view, was just one last "pass" in the process of compilation, and was the pass most sensitive to the whole project size because it was the only one that had to deal with all of the source files.

The linking job, which is bottlenecked by file I/O and thus slower than almost everything else a computer can do, scales as the product of the amount of object code you have to work on and the number of times bigger the symbol table is than the symbol table you can fit in memory -- which is to say, roughly as the square of the size of the total source code in our global-namespace scenario.

Under those circumstances it really benefits you to get the whole symbol table to fit into memory, so you can hold it there while you process each of the other files ONCE. So headers, by implicitly declaring that the linker didn't need to worry about any of the *other* definitions in the file, limited the job of the linker and could result in reducing link times by 80% or more in adverse circumstances. And that was more important, back when programs were relatively small and simple, than the organizational benefits of namespace management.

Ray Dillinger