Seeking nearly anything re: so called language "bootstrapping" process

So like 30 years ago in Byte, some cool dude had an m4 clone and an assembler on a CP/M machine and managed to bootstrap a full Pascal development environment.

Ok, I exaggerate, but it's hard to meet an old Forth'er who doesn't have more than one Forth system bootstrap story.

I don't know if there's a formal definition of "bootstrap" (esp. re: prog langs/environments).... So any help here welcome too.

But in this day of compiling to C, the JVM or CLR, LLVM, C--, Boehm's "instant" GC, huge and complex runtime systems and tools + tools + tools, it got me thinking about the days or yore (or current practice, even better) of "bootstrapping" a language and programming environment from "minimal components," to say the least.

I'm also interested if there are *qualities* of various languages/environments that lend themselves to bootstrapping from small parts, while other languages lack these qualities for whatever reasons.

Thanks much in advance.


Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.


If you haven't seen jonesforth before, I'm sure you'll enjoy it now.

90 minute Scheme to C compiler

How about SmallTalks - classic "image" based bootstrap

Still amazes me that folks will write a relatively tiny core in C or something, implement a *really* tiny number of primops by today's standard (think BitBlt), and then just mostly magically "bring up" the SmallTalk image and have a truly REMARKABLE body of library code and development functionality at their fingertips.

Anyone know who specifically came up with this original approach to Smalltalk implementation and why? What about Smalltalk 72? I can't imagine that the core was implemented in C back then? Any one know the implementation language?

I might be able to look it up

Somewhere in my midden I have a listing of a Smalltalk 72 system (printed on an XGP!) I believe there was a small assembly language core with entries that looked something like [CODE 29 arg ...]. It was a moving target though. At some point all of lowest levels turned into microcode.

And by the way - What About Lisp Machines.....

Why weren't Lisp Machines designed to be similarly "bootstrapped" like the Smalltalk Machines???? I mean, the Smalltalk boxen had their own dedicated hardware too, etc.

Such similar languages in many ways, but no real, relatively easily "boot strappable" popular Lisp or Scheme that I can think off.

Ten years further back...

...a team including participants from Stanford and UC Santa Cruz developed XPL, a streamlined dialect of PL/I intended for compiler writing, briefly described on Wikipedia.

This link has a bit more information about XPL and ports to various environments, and mentions the book A Compiler Generator, by McKeeman, Horning, and Wortman, which discussed XPL bootstrapping. The "T-diagrams", used in that book to describe the bootstrapping process, seem to have fallen out of use, but may be of interest to you.

Aren't T diagrams described

Aren't T diagrams described in the Dragon Book?

What types are Ts?

I personally was always confused by T-diagrams, especially when Translators were intermixed with Interpreters (TI-diagrams?).
Would be interesting to provide a mapping from these diagrams to higher-order functional types (or maybe first-order functions on existentials/ADTs (or maybe even just first-order functions on code representations :) )).

T diagrams

"Aren't T diagrams described in the Dragon Book?"

Yes, briefly.

"Recipe for yogurt: Add yogurt to milk." - Anon.

Dragon Book and T Diagrams

An employee stole my dragon book years back :-(

There's gotta be more war stories here.... type systems?

With all the expertise on type systems, I'd LOVE to hear about bootstrapping some language from zero to one that's finally statically typed. Or how about the building/integration of really complex type systems into languages?

Paper on the topic

I think the following reference to a paper by Andrew Appel might be useful. I got the reference off his page; unfortunately, the link was broken:

Axiomatic Bootstrapping: A guide for compiler hackers, Andrew W. Appel, ACM Transactions on Programming Languages and Systems, vol. 16, number 6, pp. 1699-1718, November 1994.


TR-451-94 Axiomatic Bootstrapping: A Guide for Compiler Hackers

The Standard ML of New Jersey compiler is written in Standard ML. Thus, SML/NJ can compile itself. This should be simple and obvious. There are some complications. The compiler produces binary files that contain both executable code and static environments (mapping identifiers to structures, signatures, and types). Normal binary files are intended to be loaded into a running SML/NJ system, where the static environment enhances the environment already available in that system, and the code is loaded and run. Some new version of the compiler may make different assumptions about the format of static environments, and its executing code may not be compatible with the existing interactive system. Thus, there can be a severe ``bootstrap'' problem: how can the old version of the compiler compile and bootstrap the new version? Here I attempt an axiomatic clarification of the bootstrapping technique. This should be useful to implementors of any self-applicable compiler with nontrivial object-file and runtime-system compatibility problems.

Appel again....

Leave it to Andrew Appel to really nicely address an interesting and complicated issue more-or-less clearly, rigorously and thoroughly :-)

What is he up to these days? I never did read any of his new series of compiler text books? Are they worth a read?


Compiling with Continuations

Compiling with Continuations is a very good read, and well worth the price. It is an engineering manual written by a computer scientist. It basically sacrifices nothing in explanation, while still being to-the-point.

I've not read any of his other texts.


It depends on what you are bootstrapping.

For an OS, the hardest and most annoying piece of code in the entire OS to write, to simply build a working OS, will likely be the bootloader.

In Advances in Object-Oriented Metalevel Architectures and Reflection there is a paper on how to use reflection to bootstrap a system, as well as what challenges reflection causes the implementor to deal with: Bootstrapping the Object-Oriented Operating System Merlin: Just Add Reflection -- before PLT-tight people complain about this being an OS theory paper, it is not! Merlin is very much a Smalltalk-like environment setup, where the language is used to write the OS. The author thus also deals with the question of bootstrapping both an OS and a language at the same time.

Not so myopic

before PLT-tight people complain about this being an OS theory paper...

To be fair, I don't recall ever having seen this attitude on LtU...


Viewpoints Research Institute, aka Alan Kay's organization, has been mentioned on LtU a number of times in the past. Check out the various bootstraps they've done for the COLA (Combined Object-Lambda Architecture) family of languages, as well as the bootstrapping of its SourceIDE.

Need to add "http://" to

Need to add "http://" to your links.



Some new links to share

Been going back reading some older things that I read a few years ago, trying to cycle through ideas I've seen before and see if I missed any wisdom.

Read Kay's VPRI NSF 2006 grant proposal. He lists the following sources for "from nothing" bootstrapping on page 21:

Page 10 [FNB] “From Nothing” Bootstrapping
E. T. Irons, “Experience with an Extensible Language,” in Comm. of the ACM, vol. 13, no. 1, pp. 31-40, 1970.
L. Tesler, H. Enea and D.C. Smith, “The Lisp-70 pattern matching system,” J. Nilsson, Ed., in Proc. of the 3rd
International Joint Conference on Artificial Intelligence, 1973, pp. 671-676.
D. Ingalls, “The Smalltalk-76 Programming System Design and Implementation,” in Proc. of the 5th ACM SIGACTSIGPLAN
Symposium on Principles of Programming Languages, 1978, pp. 9-16.
G. Krasner (ed), Smalltalk-80: Bits of History, Words of Advice (Addison-Wesley series in computer science)
(Paperback) 1983
P. Deutsch, “The past, present, and future of Smalltalk,” in Proc. of the 3rd European Conference on Object Oriented
Programming, Cambridge University Press, 1989, pp. 73-87.
A. Kay, “The Early History of Smalltalk”, History of Programming Languages II, T. Bergin (ed.), Addison Wesley,
D. Ingalls, T. Kaehler, J. Maloney, S. Wallace and A. Kay, “Back to the future: the story of Squeak, a practical
Smalltalk written in itself,”
in OOPSLA ’97: Proc. of the 12th ACM SIGPLAN Conference on Object-oriented
Programming, 1997, pp. 318-326.
J. Aycock, “A brief history of just-in-time”, in ACM Computing Surveys, vol. 35, no. 2, pp. 97 – 113, 2003.
I. Piumarta, “The Virtual Processor: Fast, Architecture-Neutral Dynamic Code Generation,” in Proc. of the 3rd
Virtual Machine Research and Technology Symposium, 2004. Available at HTTP:
I. Piumarta, “Accessible Language-Based Environments of Recursive Theories”[Internal Whitepaper], Glendale,
Calif.: Viewpoints Research Institute, Inc., Sept., 2005. Available at HTTP:

Some old links too

The implementation of BCPL was one of the earliest examples of compiler bootstrapping. Specifically, bootstrapping was achieved by introducing a virtual machine: OCODE. I believe this was the first virtual machine ever invented – a stack machine. M.Richards later came up with a lower-level register machine which he called Intcode: In the current implementation of BCPL it is Cintcode: