Lambda the Ultimate: The Programming Languages Weblog


Lisp Machine Progress Report
AI Memo 444, 1977. Alan Bawden, Richard Greenblatt, Jack Holloway, Thomas Knight, David Moon, Daniel Weinreb

This informal paper introduces the Lisp machine, describes the goals and current status of the project, and explicates some of the key ideas. It covers the Lisp machine implementation, Lisp as a system language, input/output, representation of data, representation of programs, control structures, storage organization, garbage collection, the editor, and the current status of the work.

This illustrates what I find attractive about Lisp: never mind the macros, I want to do stuff as cool as this!

P.S., does anyone know if the system sources to the MIT Lisp Machine, which RMS so famously hacked, are available? I want them!

Posted to implementation by Luke Gorrie on 5/16/04; 7:16:58 PM

Discuss (1 response)

The GNU 64-bit PL8 compiler
The GNU 64-bit PL8 compiler: Toward an open standard environment for firmware development. W. Gellerich, T. Hendel, R. Land, H. Lehmann, M. Mueller, P. H. Oden, and H. Penner. IBM Journal of R & D. Vol. 48 No. 3/4. 2004.

PL8 is used in zSeries (S/370, S/390) firmware development. Since the new machines are 64-bit addressable, the 32-bit PL8 compiler had to be rewritten.

This paper provides a rather rare opportunity to look inside the beast. I am sure BAL (ASM/370) programmers curious about microcode are going to find this paper interesting, and I am pretty sure others will also find the implementation details of the new (GNU-based) compiler interesting.

A useful, if extreme, perspective on high-level languages and how they relate to programming for the bare metal.

Posted to implementation by Ehud Lamm on 5/15/04; 2:27:13 AM


Narcissus: JavaScript in JavaScript
Not sure if I understand the motivation for building a meta-circular interpreter, but the project might be of interest to those using or tinkering with JavaScript. The source is fairly easy to follow at this stage of the project.
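
For readers new to the idea, a definitional (meta-circular in spirit) interpreter is small enough to sketch. Here is a Lisp-flavored core in Python rather than JavaScript; all names and the expression encoding are invented for illustration, not taken from Narcissus:

```python
# A tiny interpreter for a Lisp-like core, in the spirit of a meta-circular
# evaluator (Python sketch; names are illustrative, not from the Narcissus source).

def evaluate(expr, env):
    """Evaluate a nested-tuple expression in an environment dict."""
    if isinstance(expr, str):          # variable reference
        return env[expr]
    if not isinstance(expr, tuple):    # literal value
        return expr
    op, *args = expr
    if op == "if":
        cond, then, alt = args
        return evaluate(then if evaluate(cond, env) else alt, env)
    if op == "lambda":                 # ("lambda", ("x",), body)
        params, body = args
        return lambda *vals: evaluate(body, {**env, **dict(zip(params, vals))})
    # application: evaluate operator and operands, then apply
    fn = evaluate(op, env)
    return fn(*(evaluate(a, env) for a in args))

env = {"+": lambda a, b: a + b, "<": lambda a, b: a < b, "*": lambda a, b: a * b}
square = evaluate(("lambda", ("x",), ("*", "x", "x")), env)
print(square(7))  # 49
```

The interesting part is how little machinery is needed once the host language supplies closures and data structures, which is precisely why such interpreters are good vehicles for language experiments.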
Posted to implementation by Chris Rathman on 4/7/04; 7:46:06 AM

Discuss (3 responses)

Eric Lippert does SimpleScript
I got to thinking last night that maybe it would be interesting to develop a script engine from scratch, describing every interface as I implement it, giving some of the design history behind the interfaces, etc, etc, etc. I could start it in C++ and then see whether there were ways to make it work well in managed code, etc.

Ehud, I recall you asking for exactly this information. Seems Eric Lippert is going to provide it over weeks/months.
Posted to implementation by Dan Shappir on 3/30/04; 3:14:07 PM

Discuss (6 responses)

Practical Aspects of Multi-Stage Programming
via MetaOcaml list

Jason L. Eckhardt, Roumen Kaiabachev, Kedar N. Swadi, Walid Taha and Oleg Kiselyov (draft)

Abstract. High-level languages offer abstraction mechanisms that can reduce development time and improve software quality. But abstraction mechanisms often have an accumulative runtime overhead that can discourage their use.

Multi-stage programming (MSP) languages offer constructs that make it possible to use abstraction mechanisms without paying a runtime overhead. This paper studies applying MSP to implementing dynamic programming (DP) problems. The study reveals that staging high-level implementations of DP algorithms naturally leads to a code explosion problem.

In addition, it is common that high-level languages are not designed to deliver the kind of performance that is desirable in implementations of such algorithms. The paper proposes a solution to each of these two problems. Staged memoization is used for code explosion, and a kind of “offshoring” translation is used to address the second. For basic DP problems, the performance of the resulting specialized C implementations is almost always better than the hand-written generic C implementations.
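
The staged-memoization idea from the abstract can be caricatured in Python (MetaOCaml uses typed quoting; the sketch below just builds source text, and all names are mine). Instead of running the DP recursion, we generate straight-line code for a fixed n, memoizing on n so each subproblem is emitted exactly once; naive staging of fib would duplicate code exponentially:

```python
# Staged memoization, roughly: memoize at code-generation time so the
# residual program contains one binding per subproblem. (Python stand-in
# for the MetaOCaml technique, not the authors' code.)

def stage_fib(n):
    lines, names = [], {}
    def gen(k):
        if k in names:               # memo hit: reuse the generated binding
            return names[k]
        var = f"f{k}"
        if k < 2:
            lines.append(f"{var} = {k}")
        else:
            a, b = gen(k - 1), gen(k - 2)
            lines.append(f"{var} = {a} + {b}")
        names[k] = var
        return var
    result = gen(n)
    return "\n".join(lines) + f"\nresult = {result}"

code = stage_fib(10)          # straight-line code, one line per subproblem
scope = {}
exec(code, scope)
print(scope["result"])        # fib(10) = 55
```

Without the `names` memo table the generated code for fib(n) would grow exponentially, which is exactly the code-explosion problem the paper identifies.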

Since I know at least one of the authors reads LTU fairly regularly, perhaps they'll comment?
Posted to implementation by Bryn Keller on 3/4/04; 10:02:51 AM

Discuss (2 responses)

Interview with Intel's C++ Compiler Team

After all the CLR posts, here's a real compiler story.

Posted to implementation by Mark Evans on 2/18/04; 8:27:44 AM

Discuss (8 responses)

Packrat Parsing
Packrat parsing is a novel and practical method for implementing linear-time parsers for grammars defined in Top-Down Parsing Language (TDPL). While TDPL was originally created as a formal model for top-down parsers with backtracking capability, this thesis extends TDPL into a powerful general-purpose notation for describing language syntax, providing a compelling alternative to traditional context-free grammars (CFGs). Common syntactic idioms that cannot be represented concisely in a CFG are easily expressed in TDPL, such as longest-match disambiguation and "syntactic predicates," making it possible to describe the complete lexical and grammatical syntax of a practical programming language in a single TDPL grammar.
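
The essence of packrat parsing fits in a few lines: a recursive-descent PEG parser whose intermediate results are memoized per (rule, input position), giving linear time even with unlimited backtracking. A toy Python sketch, with a grammar invented purely for illustration:

```python
# A minimal packrat parser: memoized recursive descent over a PEG.
# Grammar (illustrative): Expr <- Num '+' Expr / Num ; Num <- [0-9]+
from functools import lru_cache

def parse_sum(text):
    @lru_cache(maxsize=None)
    def num(i):                       # Num <- [0-9]+
        j = i
        while j < len(text) and text[j].isdigit():
            j += 1
        return (int(text[i:j]), j) if j > i else None

    @lru_cache(maxsize=None)          # the memoization is what makes it "packrat"
    def expr(i):                      # Expr <- Num '+' Expr / Num
        head = num(i)
        if head and head[1] < len(text) and text[head[1]] == "+":
            rest = expr(head[1] + 1)  # ordered choice: try the '+' branch first
            if rest:
                return (head[0] + rest[0], rest[1])
        return head                   # fall back to the second alternative

    r = expr(0)
    return r[0] if r and r[1] == len(text) else None

print(parse_sum("1+2+39"))  # 42
```

The ordered choice (`/`) is what distinguishes TDPL/PEGs from CFGs: alternatives are tried in order and the first success wins, so there is no ambiguity by construction.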

There's also an implementation.

From ll1-discuss
Posted to implementation by andrew cooke on 2/13/04; 12:51:26 PM

Discuss (2 responses)

Why Events Are a Bad Idea
posted in the discussion group by Matt Brubeck.

Event-based programming has been highly touted in recent years as the best way to write highly concurrent applications. Having worked on several of these systems, we now believe this approach to be a mistake. Specifically, we believe that threads can achieve all of the strengths of events, including support for high concurrency, low overhead, and a simple concurrency model. Moreover, we argue that threads allow a simpler and more natural programming style.

A recent paper from the networking community that considers higher-level approaches to concurrent programming, and cites both Erlang and Concurrent ML (albeit in passing).

Unfortunately they take only a small step towards language-level concurrency. They are enthusiastic about whole-program static analysis to discover concurrency/atomicity, but don't consider the Erlang approach that makes this simple and explicit in the language. It will be exciting to see what the networking community make of that when they evaluate it seriously.

(Continuation-savvy LtU readers may groan at some of the State Machine vs. Threads myths that they have to dispel.)
Posted to implementation by Luke Gorrie on 1/23/04; 10:05:24 AM

Discuss (18 responses)

Combining Generics, Pre-Compilation and Sharing Between Software-Based Processes
Andrew Kennedy and Don Syme. Combining Generics, Pre-Compilation and Sharing Between Software-Based Processes. Semantics, Program Analysis, and Computing Environments for memory management (SPACE 2004).

Now that the CLR has generics, I suppose it is only natural to see more research on generics from MSR...

The underlying theme of this paper is the 'PL as OS' paradigm, which is quickly becoming the dominant approach, surpassing the 'PL as Platform' we were getting used to. In a nutshell: Platforms mean huge libraries; OS-like thinking means resource management.

Be that as it may, recall that when the CLR generics paper was first mentioned here, I asked about run-time types. This paper shows what happens when you have to deal with run-time type descriptors in the presence of resource management and sharing. I am still not sure that compilation by erasure wouldn't have been a better approach.

Posted to implementation by Ehud Lamm on 1/13/04; 2:19:57 AM

Discuss (1 response)

Continuations: Haystack and Python
From the Chandler email list...

In September, Hamish Harvey described some of the neat things about Haystack, including UI continuations. (Harvey's message was posted Fri, 26 Sep 2003; there's an article describing Haystack's continuations at

I just wanted to add a reference to a tutorial article on continuations using Python: "Continuations Made Simple and Illustrated" by Denys Duchier at
Posted to implementation by Patrick Logan on 1/4/04; 4:18:42 PM

Discuss (1 response)

via slashdot

As the programmability and performance of modern GPUs continues to increase, many researchers are looking to graphics hardware to solve problems previously performed on general purpose CPUs. In many cases, performing general purpose computation on graphics hardware can provide a significant advantage over implementations on traditional CPUs. However, if GPUs are to become a powerful processing resource, it is important to establish the correct abstraction of the hardware; this will encourage efficient application design as well as an optimizable interface for hardware designers.

They have implemented a data-parallel language based on C that compiles code to execute on the powerful graphics cards used in standard PCs. WOW!

Posted to implementation by Luke Gorrie on 12/21/03; 7:41:20 PM

Discuss (8 responses)

Transparent Language Implementation and Design
I ran across an interesting comp.lang.scheme discussion on numerical processing in Scheme implementations. This dovetails with Ehud's C++ posting, in which he discussed how blogging by language designers gives us a view into the workings of language design.

In this usenet post about MzScheme's numerical shortcomings (and Matthew Flatt's response) we can see useful technology transfer between Gambit and MzScheme.

This is the kind of thing I would expect from Open Source projects, but have rarely seen in practice. I often see cases where one project will crib code from another, but I don't often see active discussions between "competing" implementations about why one is better at X than another. Open Source language projects should ideally provide cross-pollination that allows the sum to be greater than the parts. This does happen to some extent, but all too often we see a balkanization of projects in which there are no open dialogs between development camps, and the only information exchange occurs when someone peeks into someone else's CVS tree.

I wonder which approach is more successful? Free exchange of ideas in the electronic beatnik cafe of USENET, or miserly code hoarding resulting in an "arms race" as two implementations seek to outdo each other? I guess the MzScheme/Gambit exchange would be the former case, and the Mono/Portable DotGNU effort the latter. Either way, it will be interesting to find out.
Posted to implementation by Brent Fulgham on 12/3/03; 12:05:28 PM

Discuss (5 responses)

On Garbage Collection
This comes from the EROS capability-based operating system architecture email list. The leader of that project describes what he believes may be needed as a language for writing applications safely for EROS, especially with respect to garbage collection, concurrency, response time, and protection from memory allocation faults.

Several responses have been posted to the list as well. Here's the top of the thread.

In the course of deciding not to submit a DARPA proposal, I spent some time looking at what is going on in the world of penetration attacks. I conclude that we are solving the wrong problem -- or rather, we have been solving phase II in the hope that other people would solve phase I. I have now concluded that this hope is in vain...

There are those who may argue that we already have a replacement. Possible candidates might include C#, Java, or D (I restrict my attention here to statically typed languages, but without prejudice). I think that none of these are a good replacement.
Posted to implementation by Patrick Logan on 11/22/03; 7:14:28 PM

Discuss (12 responses)

LLVM Compiler Infrastructure Project
Technology which allows for continuous optimization of a program throughout its lifetime, starting with compile time, again at link time, at run time, and in-between runs. Looks interesting. The code is available, though apparently only for Linux and Solaris at this point.

Low Level Virtual Machine (LLVM) is:

1. A compilation strategy - Fundamentally, LLVM is a compilation strategy designed to enable effective program optimization across the entire lifetime of a program. LLVM supports effective optimization at compile time, link-time (particularly interprocedural), run-time and offline (i.e., after software is installed), while remaining transparent to developers and maintaining compatibility with existing build scripts.

2. A virtual instruction set - LLVM is a low-level object code representation that uses simple RISC-like instructions, but provides rich, language-independent, type information and dataflow (SSA) information about operands. This combination enables sophisticated transformations on object code, while remaining light-weight enough to be attached to the executable. This combination is key to allowing link-time, run-time, and offline transformations.

3. A compiler infrastructure - LLVM is also a collection of source code that implements the language and compilation strategy. The primary components of the LLVM infrastructure are a GCC-based C & C++ front-end, a link-time optimization framework with a growing set of global and interprocedural analyses and transformations, static back-ends for the SPARC v9 and X86 architectures, a back-end which emits C, and a Just-In-Time compiler for X86 and SPARC v9 processors. See "Current Projects" for information about other components under development.

Posted to implementation by Bryn Keller on 11/7/03; 2:37:52 PM

Discuss (3 responses)

Open Programming Services for Virtual Machines: The Design of Mozart and SEAM
Peter mentioned their SEAM approach to virtual machines. There are some definite merits over more commercial approaches. Here's their site. And the blurb...

This paper discusses designs for integrating services in general and open programming services in particular into virtual machines. We draw on our experience with two systems.

The first is Mozart, a programming system implementing the language Oz. Mozart's virtual machine provides a rich set of services for open programming, such as concurrency, persistence of data and code, components with dynamic linking, and distribution. The second system is Alice, which is at the same time a statically-typed variant of Oz, and an extension of Standard ML for open programming.

Our design proposals for open programming services culminate in the definition of a virtual machine called SEAM, which we claim to be simple, efficient, and extensible. We substantiate these claims with preliminary performance results.
Posted to implementation by Patrick Logan on 10/23/03; 10:36:47 AM

Discuss (1 response)

A Research C# Compiler
Todd A. Proebsting and David R. Hanson. MSR-2003-32

This paper describes a new C# compiler designed specifically to provide that infrastructure. The overall design is deceptively simple. The parser is generated automatically from a possibly ambiguous grammar, accepts C# source, perhaps with new features, and produces an abstract syntax tree, or AST. Subsequent phases (dubbed visitors) traverse the AST, perhaps modifying, annotating it or emitting output, and pass it along to the next visitor. Visitors are specified entirely at compilation time and are loaded dynamically as needed. There is no fixed set of visitors, and visitors are completely unconstrained.

Being able to reason about language behaviour based solely on abstract semantics is like having perfect pitch in music: immensely useful if you are a composer (or language designer) but extremely rare in the population. Practice helps, but not much.

That's why definitional interpreters are so useful: they allow you to experiment with language constructs, and language design ideas. Want to see if your new construct should have dynamic scope? Make the change, and see what happens.

This work follows this standard approach, even if it deals with a compilation system (is that just a fancy way to say compiler?). Obviously, the focus here is on compile-time facilities and not run-time behaviour, and indeed the authors confess that most usage to date has been for tools, not for language design experiments. Still, via the magic of source-to-source transformations, real language design work can also be done.

Posted to implementation by Ehud Lamm on 10/19/03; 12:06:14 PM

Discuss (3 responses)

Microsoft Forges Ahead With New Compiler Technology
Next week, Microsoft plans to offer an update on new compiler technology that is being co-developed by the company's Programmer Productivity Research Center (PPRC) and the company's developer tools division.

From the Phoenix web site:

The basic Phoenix framework is an extensible system that can be adapted to read and write binaries and MSIL assemblies, represent the input files in an IR which can be analyzed and manipulated by applications by using the Phoenix API and then the code can be written in binary, or JITable form for execution.

The Research Development Kit (RDK) may prove to be interesting. If you are into that sort of thing, you may want to take a look at the tool ideas presented in the Phoenix presentation (ppt).

Judging from the last couple of days we are quickly becoming a Microsoft-friendly blog. Let's hope we can keep our objectivity...

Posted to implementation by Ehud Lamm on 7/23/03; 1:05:36 PM

Discuss (2 responses)

Compilation of Functional Programming Languages using GCC -- Tail Calls
This master's thesis by Andreas Bauer covers the state of the art in compiling functional languages using the GNU Compiler Collection (GCC), with particular reference to the Glasgow Haskell Compiler (GHC). Here's a summary of the abstract, edited for brevity:

Explains the difficulty of implementing proper tail call support in a C compiler. Discusses approaches to overcoming this problem in GCC, so that functional language front ends like GHC, which use GCC, gain a greater amount of flexibility. Shows a connection between current C compiler constraints affecting tail calls, and early Unix design decisions. Examines broader issues surrounding the GCC project in general.

This ambitious, 120-page paper could be summarized as "all you ever wanted to know about compiling functional languages using GCC but were afraid to ask".

Posted to implementation by Anton van Straaten on 6/29/03; 12:23:45 PM


CLR Memory Model
So what is a memory model? It’s the abstraction that makes the reality of today’s exotic hardware comprehensible to software developers.

Chris Brumme has a very informative discussion of the issues involved in designing a memory model for a programming language.

Many programmers take the memory model for granted, and often don't realize that this abstraction layer actually exists. They are certain that hardware "really" behaves the way their programs expect it to behave.

This is something we should really educate people about. Hopefully, as more languages are based on standard VMs, this concept will become more well known.

Another lesson to take home from Chris's post is about the complexity of modern CPUs. Even if you know how a specific CPU works, it is quite hard to predict how programs will behave, even before you introduce complicating factors like garbage collection, virtual memory and multiprogramming operating systems.

Posted to implementation by Ehud Lamm on 5/28/03; 1:27:58 PM

Discuss (4 responses)

Dynamic Languages on CLR

This is a pretty good thread about compiling for CLR and the implementation of the runtime itself. Miguel de Icaza, Fergus Henderson and Don Syme all pitched in.

The mailing list archive is pretty messy. The best way to read this thread is to click on the "Messages sorted by subject" link.

Posted to implementation by Dejan Jelovic on 5/14/03; 8:18:38 AM

Discuss (1 response)

Weblog on CLR Implementation

Chris Brumme has a weblog about various design decisions that Microsoft programmers made when implementing the CLR. Lots of detailed, precise stuff that people interested in runtimes will love.

Posted to implementation by Dejan Jelovic on 5/12/03; 6:02:58 AM

Discuss (3 responses)

Don Box on C# generics vs. C++ generics
C++ can get away with this very loose model as template instantiation is always done at compile-time and finding a member that "fits" can happen at a somewhat leisurely pace.

In contrast, type instantiation of a CLR generics happens at runtime, and things like speed and type-safe verifiability are the order of the day.

A short piece by Don Box demonstrating the difference between generics in C# and C++. Given the similarity in syntax, I wonder how MS will add .NET generics to C++.NET.
Posted to implementation by Dan Shappir on 5/12/03; 12:36:19 AM

Discuss (12 responses)

Schemix - A Scheme In The Linux Kernel
via Slashdot

Schemix is a Scheme system, implemented as a patch to the Linux kernel. It aims to attain R5RS compliance while remaining small, fast and easy to understand.

The intended use of Schemix is for exploration of the Linux kernel and for rapid, interactive prototyping of Linux drivers and other new kernel features. To achieve this, Schemix will attempt to make a large subset of the kernel functionality available to Scheme programs. Interactivity is via a character device, /dev/schemix which presents a REPL (Read, Eval, Print Loop) to anyone having access to the device.

The perfect thing to combat Insecure Macho Love.
Posted to implementation by Dan Shappir on 4/28/03; 1:35:58 AM

Discuss (2 responses)

How to make a fast curry: push/enter vs eval/apply
Simon Marlow and Simon Peyton Jones. 12 pages, March 2003.

Higher-order languages that encourage currying are implemented using one of two basic evaluation models: push/enter or eval/apply. Implementors use their intuition and qualitative judgements to choose one model or the other.

Our goal in this paper is to provide, for the first time, a more substantial basis for this choice, based on our qualitative and quantitative experience of implementing both models in a state-of-the-art compiler for Haskell.

Our conclusion is simple, and contradicts our initial intuition: compiled implementations should use eval/apply.

Well worth your time even if you are not going to implement a functional language any time soon.
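
The distinction can be caricatured in a few lines of Python (a sketch of the idea only; GHC's actual machinery is far more involved, and the representation below is invented). A function value records its arity, and the question is who handles over- and under-application, the callee or the caller:

```python
# push/enter vs eval/apply, caricatured. A function value is a dict with an
# arity and a Python callable taking exactly that many arguments.

def fn(arity, code):
    return {"arity": arity, "code": code}

def push_enter(f, stack):
    """Push/enter: the caller pushes ALL arguments; the callee pops what it needs."""
    while True:
        if len(stack) < f["arity"]:       # too few: build a partial application
            got = tuple(stack)
            return fn(f["arity"] - len(got),
                      (lambda g, c: lambda *rest: c(*g, *rest))(got, f["code"]))
        args = [stack.pop(0) for _ in range(f["arity"])]
        result = f["code"](*args)
        if not stack:
            return result
        f = result                        # over-application: enter the result

def eval_apply(f, args):
    """Eval/apply: the CALLER inspects the arity and applies exactly that many."""
    while args:
        if len(args) < f["arity"]:
            got = tuple(args)
            return fn(f["arity"] - len(got),
                      (lambda g, c: lambda *rest: c(*g, *rest))(got, f["code"]))
        now, args = args[:f["arity"]], args[f["arity"]:]
        f = f["code"](*now)
    return f

# a curried three-argument sum: each application returns another function value
add3 = fn(1, lambda a: fn(1, lambda b: fn(1, lambda c: a + b + c)))
print(push_enter(add3, [1, 2, 3]))   # 6
print(eval_apply(add3, [1, 2, 3]))   # 6
```

Both models give the same answers; the paper's point is that their costs and implementation complexity differ, and that measurements favor eval/apply.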

Posted to implementation by Ehud Lamm on 4/27/03; 7:21:01 AM


Glowing review of Shared Source CLI Essentials
Werner Vogels posted a glowing review of Shared Source CLI Essentials by David Stutz, Ted Neward and Geoff Shilling.

Looks like a book I should order for our university library.

I am not aware of good books about language (run time) implementation, that cover the techniques used in real systems (as opposed to student projects). Feel free to tell us about your favorites.

Posted to implementation by Ehud Lamm on 4/4/03; 5:23:09 AM

Discuss (2 responses)

Byte Code Engineering Library
The Byte Code Engineering Library (formerly known as JavaClass) is intended to give users a convenient possibility to analyze, create, and manipulate (binary) Java class files (those ending with .class). Classes are represented by objects which contain all the symbolic information of the given class: methods, fields and byte code instructions, in particular.

I don't think we mentioned this before, and it does look like a useful tool.

Among the projects using BCEL, you may find the following especially interesting: Poor Man's Genericity, Portable JRAF, Funnel, Kava and Sally.

Posted to implementation by Ehud Lamm on 3/5/03; 7:15:23 AM

Discuss (5 responses)

Control Transfer in Operating System Kernels

You've got to love practical research. Richard Draves has modified the Mach 3.0 kernel to use continuations for control transfer. This has resulted in increased flexibility and performance.

[Via Green Hat Journal]

Posted to implementation by Dejan Jelovic on 3/1/03; 2:55:29 PM


The essence of compiling exceptions
The essence of compiling exceptions. Graham Hutton and Joel Wright. To be submitted for publication, 2003.

Given the importance of exceptions, it is perhaps surprising that in textbooks and research articles on compilation, exceptions are often not considered at all, or one finds rather brief mention of concepts such as "marking the stack" and "unwinding the stack". In this article we make such ideas precise, by presenting a compiler for a small language with exceptions, together with a proof of its correctness with respect to a formal semantics for the language. To the best of our knowledge, this is the first time that a compiler for exceptions has been proved to be correct, but we would welcome any pointers to other work in this area.
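
To make "marking" and "unwinding" concrete, here is a toy compiler and stack machine in the same spirit. This is a Python sketch with invented instruction names; the paper itself works in Haskell and proves correctness:

```python
# Arithmetic expressions with throw/catch, compiled to a stack machine where
# MARK pushes a handler address onto the stack and THROW unwinds to the
# nearest mark. (Toy model for illustration only.)

def compile_expr(e):
    tag = e[0]
    if tag == "val":   return [("PUSH", e[1])]
    if tag == "add":   return compile_expr(e[1]) + compile_expr(e[2]) + [("ADD", None)]
    if tag == "throw": return [("THROW", None)]
    if tag == "catch":                    # ("catch", body, handler)
        b, h = compile_expr(e[1]), compile_expr(e[2])
        # MARK's operand is the handler's start, relative to the MARK itself
        return [("MARK", len(b) + 3)] + b + [("UNMARK", None), ("JUMP", len(h))] + h

def run(code):
    pc, stack = 0, []                     # stack holds values and handler marks
    while pc < len(code):
        op, arg = code[pc]
        if op == "PUSH":
            stack.append(("v", arg)); pc += 1
        elif op == "ADD":
            b, a = stack.pop()[1], stack.pop()[1]
            stack.append(("v", a + b)); pc += 1
        elif op == "MARK":                # mark the stack with the handler address
            stack.append(("h", pc + arg)); pc += 1
        elif op == "UNMARK":              # body succeeded: drop the mark under the result
            v = stack.pop(); stack.pop(); stack.append(v); pc += 1
        elif op == "JUMP":                # skip over the (unused) handler code
            pc += 1 + arg
        elif op == "THROW":               # unwind the stack to the nearest mark
            while stack and stack[-1][0] != "h":
                stack.pop()
            if not stack:
                return None               # uncaught exception
            pc = stack.pop()[1]
    return stack.pop()[1]

prog = ("catch", ("add", ("val", 1), ("throw",)), ("val", 7))
print(run(compile_expr(prog)))  # 7: the throw unwinds into the handler
```

Note how UNMARK and THROW are duals: the success path removes the mark explicitly, while the failure path uses it as the unwinding target.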

A nice, and rather short, paper.

The first two sections (sections 2 and 3), which show a basic compiler for arithmetic expressions and prove its correctness, are well worth reading for their elegance, even if you are not interested in the discussion of compiling exceptions.

It should be noted that the exception mechanism discussed is less expressive than what you would find in languages such as Ada or C++, and the implementation discussed is a bit naive.

EOPL2 devotes section 7.4 to exceptions, but these are implemented using continuations, in the context of a CPS interpreter, so efficient compilation of exceptions isn't covered.

The paper uses Haskell as an executable notation. Interestingly, the authors mention they used QuickCheck while developing the code in the paper and before proving it is correct.

Posted to implementation by Ehud Lamm on 2/18/03; 1:09:18 PM

Discuss (1 response)

Apache vs. Yaws

Apache is the web server most of the world uses. Yaws is a web server written in Erlang. Apache has recently been totally redesigned to take advantage of OS threads in an effort to support higher loads. How successful have the Apache team been? Not very, compared to Yaws. In a throughput benchmark Apache falters at around 8,000 concurrent sessions. Yaws handles over 80,000 concurrent sessions. Why the difference? Because Erlang doesn't use OS threads, but its own user-level threads. In other words, continuations. The two points I got from this are:

  • Theory is inextricably tied up with practice
  • Languages are operating systems
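
The user-level-threads trick is easy to sketch in Python, with generators standing in for Erlang processes; this is a toy model of the idea, nothing like Yaws itself:

```python
# Generators act as suspendable "processes" and a trivial scheduler
# round-robins thousands of them inside one OS thread.

from collections import deque

def session(sid, log):
    for step in range(3):            # pretend each yield is waiting on I/O
        log.append((sid, step))
        yield                        # suspend: control returns to the scheduler

def run(threads):
    ready = deque(threads)
    while ready:
        t = ready.popleft()
        try:
            next(t)                  # resume the process until its next yield
            ready.append(t)          # still alive: requeue it
        except StopIteration:
            pass                     # finished: drop it

log = []
run([session(i, log) for i in range(10000)])  # 10,000 "concurrent" sessions
print(len(log))  # 30000
```

Ten thousand of these cost only a few objects each, whereas ten thousand OS threads cost a stack and kernel bookkeeping apiece, which is the crux of the Apache/Yaws gap.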

The talk from LL2 is also worth a read.

Seen on the plt-scheme mailing list. Thanks Brent!

Posted to implementation by Noel Welsh on 2/12/03; 7:26:20 AM

Discuss (17 responses)

Partial Evaluation (slides)
A quick introduction to partial evaluation.

A nice touch is the discussion of the Futamura Projections, and compiling by partial evaluation.
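
The basic move, specializing a program with respect to its statically known inputs, can be shown with the classic power example. This is a Python sketch of the idea from such a course, generating residual source text:

```python
# Partial evaluation, toy version: specialize power(x, n) with respect to a
# known n, unfolding the recursion on the static argument into straight-line code.

def specialize_power(n):
    """Generate the residual program for x**n as Python source."""
    body = "1"
    for _ in range(n):
        body = f"x * ({body})"       # unfold the static recursion on n
    return f"def power_{n}(x):\n    return {body}"

src = specialize_power(5)
scope = {}
exec(src, scope)                     # the residual program is ordinary code
print(scope["power_5"](2))  # 32
```

The first Futamura projection is this same move applied to an interpreter: specializing an interpreter with respect to a known source program yields a compiled version of that program.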

The following set of slides from the same course goes into more detail, covering binding time annotation and specialization.

Posted to implementation by Ehud Lamm on 2/8/03; 10:29:09 AM

Discuss (2 responses)

How Java's Floating Point Hurts Everyone Everywhere
An interesting document I saw mentioned on the OCaml mailing list. It talks (in great detail) about what is wrong with Java's floating point implementation, and floating point implementations in general. It suggests several additions to future languages that will overcome these deficiencies. A short taste can be had by reading The Numerical Analyst as Computer Scientist Curmudgeon.
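
For a quick taste of why language-level floating point semantics matter: IEEE 754 addition is not associative, so any language or compiler that feels free to reorder it silently changes answers.

```python
# Floating-point addition is not associative: the same three values summed
# in two different orders give two different results.

a, b, c = 0.1, 0.2, 0.3
left = (a + b) + c
right = a + (b + c)
print(left == right)   # False
print(left, right)     # 0.6000000000000001 0.6
```

This is exactly the kind of detail a language must pin down (or deliberately leave loose) in its specification, which is the heart of the document's complaint about Java.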
Posted to implementation by Noel Welsh on 1/21/03; 5:57:19 AM

Discuss (4 responses)

From Reginald Braithwaite-Lee, as announced on ll1-discuss:
Somewhat inspired by Todd's talk at LL1, I wrote up some thoughts I had about a 'disruptive' programming language.

The target users are the many people who have jobs or contracts building Intranet and Internet applications for businesses but would prefer to build them with a lightweight language.

FWIW, my early proof of concept experiments include a macro facility. ;-)

and from the web site:
"Elle" is an extension of the Java programming language and environment for building Intranet and Internet applications. Elle is for individuals and small teams that are under pressure to deliver robust business applications with quick turnaround.

The "Todd" mentioned is Todd Proebsting, Manager of the Programming Language Systems group of Microsoft Research, who actually spoke at LL2 this past year. The joke about macros refers to Todd's arguments against them (PowerPoint Presentation).
Posted to implementation by Dan Moniz on 1/14/03; 1:28:24 AM

Discuss (5 responses)

Refill - A Generative Java Dialect
Kasper Osterbye, Refill - a generative java dialect (presentation and paper), ECOOP'02

It has become popular to extend the Java programming language to experiment with different new and old language ideas. If the extension requires changes to the syntax, extensions are typically done by adapting an existing compiler, or by building a pre-processor. The pre-processor approach has the disadvantage that it is difficult to integrate static error checking, the adaptation approach has the disadvantage that standard compilers are optimised for speed and not flexibility. Hence they are not geared towards experiments in language extension. The refill compiler generator for Java is our attempt to provide an environment where one can experiment with Java extensions that requires extensions to syntax, static semantic checking, and where final code generation is done by generating base level Java.

There's also a powerpoint presentation here.
Posted to implementation by Bryn Keller on 1/13/03; 1:40:14 PM

Discuss (1 response)

Vortex and Cecil
We don't normally post news of compiler upgrades here, but this one seemed worth an exception - plus, we don't actually have a news item for Cecil or Vortex yet, just the occasional mention when Self or Dylan comes up.

I've made a release of our latest versions of Cecil and Vortex available off our project web page ( under the Software link. This release is mostly just a maintenance release, including many small improvements to the Vortex optimizer and the Cecil language and run-time system. An experimental implementation of interprocedural synchronization elimination has been added (see our SAS'99 paper and the follow-on journal article). Debugging support for Cecil has been improved, including analogues to gdb's next and finish. Cecil's parameterized type system has been refined. Support for the current version of Solaris (2.5.8) has been added. Plus many more small things accumulated over the past three years.

Happy Holidays!

-- Craig Chambers
Posted to implementation by Bryn Keller on 12/30/02; 3:29:06 PM

Discuss (2 responses)

Type inference in Icon
More holiday reading.

The original, interpretive implementation of Icon performs rigorous run-time type checking and incurs significant overhead as a result. A new optimizing compiler for Icon, on the other hand, has a type inferencing system that is effective in determining type usage and in eliminating much of the run-time checking that otherwise would be required.

In a sense this is similar to soft typing.

One interesting particularity is that Icon generators, which produce a sequence of values, need not always yield values of the same type.

The type inferencing system was used to collect data about many Icon programs. The data showed that, on the average, the operands of about 80% of the operators in these programs always had the same type.

Posted to implementation by Ehud Lamm on 12/26/02; 3:26:11 PM

Discuss (1 response)

vmgen: A Generator of Efficient Virtual Machine Interpreters
... We present an interpreter generator that automatically generates code for the virtual machine instructions from simple instruction descriptions; it generates code for the [vm] interpreter, for generating [vm] code, for [vm] code disassembly, for tracing, and for profiling. The generator is designed to support efficient interpreters: it supports threaded code, caching the top-of-stack item in a register, combining simple instructions into superinstructions... We have used the generator to create interpreters for Forth and Java. The resulting interpreters are faster than other interpreters for the same languages. This paper also presents results for the effects of the individual optimizations supported by the generator.
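
Two of those techniques, dispatch through a table of instruction routines and superinstructions, can be sketched in Python. This is illustrative only; vmgen actually emits C, with real threaded code and top-of-stack caching:

```python
# A toy stack VM: one indirect dispatch per instruction, plus a
# "superinstruction" (lit_add) that fuses two frequent opcodes so the
# same program executes with one fewer dispatch.

def run(code):
    stack = []
    ops = {
        "lit":     lambda arg: stack.append(arg),
        "add":     lambda arg: stack.append(stack.pop() + stack.pop()),
        "lit_add": lambda arg: stack.append(stack.pop() + arg),  # fused opcode
    }
    for op, arg in code:
        ops[op](arg)        # the per-instruction dispatch we want to minimize
    return stack.pop()

plain = [("lit", 2), ("lit", 40), ("add", None)]
fused = [("lit", 2), ("lit_add", 40)]      # same meaning, one fewer dispatch
print(run(plain), run(fused))  # 42 42
```

In a real interpreter the dispatch is the dominant cost, which is why superinstructions and threaded code pay off far more there than this Python model can show.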

I thought we already had a pointer to this, but apparently not...

This seems like an interesting and useful system, which brings the authors' experience designing and maintaining Gforth to a more general audience. Does anyone know of research which shows whether stack- or register-based VMs are faster? The authors claim that stack-based VMs are faster, but I know the Parrot folks chose a register-based approach because they thought that would be faster. Is there no conclusive evidence?
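To make the terminology concrete, here is a toy sketch (all names invented) of a stack VM showing two of the optimizations vmgen supports: caching the top of stack in a variable, and a fused superinstruction:

```python
def run(code):
    """Toy stack VM: `tos` caches the top of the stack in a local
    variable, and `lit_add` is a superinstruction fusing lit + add."""
    stack, tos, pc = [], None, 0
    while pc < len(code):
        op = code[pc]; pc += 1
        if op == "lit":
            if tos is not None:
                stack.append(tos)       # spill old cached top
            tos = code[pc]; pc += 1
        elif op == "add":
            tos = stack.pop() + tos     # result stays in the cache
        elif op == "lit_add":           # superinstruction: no push/pop
            tos = tos + code[pc]; pc += 1
        elif op == "halt":
            return tos

print(run(["lit", 2, "lit", 3, "add", "lit_add", 4, "halt"]))  # 9
```

The superinstruction saves both a dispatch and the stack traffic that separate `lit`/`add` would incur, which is the effect vmgen measures.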
Posted to implementation by Bryn Keller on 12/17/02; 2:43:04 PM

Discuss (2 responses)

Barrier Methods for Garbage Collection
Benjamin Zorn. Barrier methods for garbage collection. Technical Report CU-CS-494-90, University of Colorado at Boulder, November 1990.

Since I am reading a student project about GC, here's another GC related paper.

Generation based collectors require a "write barrier", while incremental collectors (a la Baker) require "read barriers."

These barriers can be implemented in a variety of ways. This paper evaluates the performance of several implementation techniques, including hardware support, software implementation, and OS support (protection faults).
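As an illustration of the software approach, here is a hypothetical sketch of a write barrier maintaining a remembered set for a generational collector (the `Obj`, `remembered_set`, and `write_field` names are invented for illustration):

```python
class Obj:
    def __init__(self, gen):
        self.gen = gen            # 0 = young generation, 1 = old
        self.fields = {}

remembered_set = set()            # old objects that may point to young ones

def write_field(obj, name, value):
    """Every pointer store goes through the barrier: an old->young
    store is recorded, so a minor GC need scan only the remembered
    set instead of the whole old generation."""
    if isinstance(value, Obj) and obj.gen > value.gen:
        remembered_set.add(obj)
    obj.fields[name] = value

old, young = Obj(gen=1), Obj(gen=0)
write_field(young, "x", old)      # young->old: no entry needed
write_field(old, "x", young)      # old->young: recorded
print(old in remembered_set, young in remembered_set)  # True False
```

The paper's comparison is essentially about where this check lives: inlined in software as above, in hardware, or delegated to page-protection faults.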

Posted to implementation by Ehud Lamm on 12/17/02; 5:39:57 AM


On-the-fly garbage collection: An exercise in cooperation
Edsger W. Dijkstra, Leslie Lamport, A. J. Martin, C. S. Scholten, and E. F. M. Steffens. On-the-fly garbage collection: An exercise in cooperation. Communications of the ACM, 21(11):965-975, November 1978.

One of the classic papers about incremental and parallel garbage collection. The form of parallelism used is communicating sequential processes implementing cooperative multitasking.

The main goal was to design an algorithm with as little synchronization and mutual exclusion as possible.

Earlier version of the paper: EWD492, EWD520.

Posted to implementation by Ehud Lamm on 12/15/02; 2:57:21 AM


SmartEiffel: implementation techniques
SmartEiffel, the GNU Eiffel compiler, claimed to be the fastest and the slimmest multi-platform Eiffel compiler on earth, was released on Friday December 6th.

These research papers cover a range of implementation issues, including optimizing dynamic binding and compiler support for garbage collection.

(thanks Chad)

Posted to implementation by Ehud Lamm on 12/7/02; 12:15:50 PM


[The Creation of a Constraint-based Programming Language]
I had to paraphrase the title to fit the maximum length, it's really:

The Definition and Implementation of a Computer Programming Language Based on Constraints, by Guy L. Steele (thesis)

This is a really wonderful piece of writing. What I got out of it was an introduction to constraint-based programming, a gentle tour of growing a language bit by bit, and an amazing demonstration of Lisp techniques for writing interpreters that fit in neatly with their host environment. (And I only read the first half!)

That said, it is book-length and the electronic copy is a scan. The result is 20 megabytes and challenging to print - but well worth the effort.
Posted to implementation by Luke Gorrie on 11/12/02; 7:30:09 AM

Discuss (5 responses)

Loell, a new programming language
Found via Chris Double's web log:
Traditional languages either have static or dynamic scoping. With Loell you can have both. When a closure is created, the scope it is created in is stored in the scope property of the closure. When you use that scope when evaluating the closure
you have static scoping. But when you use the current scope
you get dynamic scoping. But you can use any object as scope.

Posted to implementation by Dan Moniz on 11/6/02; 10:33:34 PM

Discuss (4 responses)

A Free Implementation of CLIM
McCLIM is a free implementation of the Common Lisp Interface Manager, or CLIM, specification. In this paper we review the distinguishing features of CLIM, describe the McCLIM implementation, and recount some of the history of the McCLIM effort, give a status report, and contemplate future directions.

A paper about McCLIM from the International Lisp Conference 2002.

This is exciting because it's a major piece of the fabled Lisp Machine functionality (the user interface) being implemented as free software for widely available platforms.

I've been playing with McCLIM on Linux with CMUCL for a while, and it seems to be progressing very well.

Posted to implementation by Luke Gorrie on 11/1/02; 12:39:42 AM


Idel: a virtual machine for mobile code
Idel is a free virtual machine for safely and portably running untrusted code. This is not a new programming language, but a substrate for existing ones. It's designed to:

  • be relatively language neutral, able to sandbox unsafe languages like C
  • control resource consumption
  • support mobility and persistence (including runtime stack state)
  • need only an auditably small trusted computing base
  • yield small object files and fast execution
  • accommodate capability security mechanisms in a coming release
  • be easy to incorporate in other programs, the way Tcl is.

A very cool program by our own Darius Bacon, who is also the reigning champion of our Idel-based Roshambo contest.
Posted to implementation by Luke Gorrie on 10/31/02; 9:08:34 AM

Discuss (1 response)

Jikes RVM implementation tutorials
Two interesting presentations about the Jikes RVM and its optimizing compiler, one from PLDI'02 and a much longer and more detailed one from PACT'01.

The web page lists several other interesting presentations.

Posted to implementation by Ehud Lamm on 10/29/02; 5:51:25 AM

Discuss (1 response)

Efficient JVM Just-in-Time Compilation
A. Krall, Efficient JavaVM just-in-time compilation. Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, pp. 54--61, 1998.

This is not a new paper but it does provide a detailed and readable description of JIT compilation.

Any good references on state of the art JIT techniques?

Posted to implementation by Ehud Lamm on 10/20/02; 1:34:29 PM


M: The basis for Matrix
See Java as a metalinguistic framework by Vijay Saraswat (HTML slides produced by PowerPoint, works best in Internet Explorer). A JPEG version and a PowerPoint version are also available.
Posted to implementation by Dan Moniz on 10/9/02; 3:19:02 PM


The Implementation of the Icon Programming Language
The Implementation of the Icon Programming Language. Ralph E. Griswold and Madge T. Griswold Princeton University Press, 1986 ISBN 0-691-08431-9, out of print.

Great news from the LtU DG.

I've been wanting to read this book for a long time.

Icon has some interesting control structures (e.g., generators and goal directed evaluation). The original implementation, as far as I know, was done by translation into VM instructions (at least, that's the original implementation I used...)

Posted to implementation by Ehud Lamm on 10/4/02; 2:39:34 PM

Discuss (1 response)

A Type Notation for Scheme

A lightweight type notation and type checker for Scheme. The type annotations can be added to already existing code in a file separate from the Scheme source code. These types "can also be thought of as abstractions of the type predicates found in Scheme." A nice table gives the "correspondence between types, predicates, and expressions." There is also a PDF version of the paper. The paper Polymorphic Type-Checking in Scheme (Jenkins-Leavens, 1996) provides a broader overview of the notation.

Posted to implementation by jon fernquest on 9/25/02; 2:49:18 AM

Discuss (5 responses)

Evaluation Order Semantics in Expressions with Side Effects
The paper is a "step towards a complete and accurate denotational description of ANSI C." "ELSE", a simple eager expression language with side effects and a subset of ANSI C expressions, is introduced to explore "the denotational semantics of evaluation order." Monads and monad transformers are used to modularly introduce variations in evaluation order.

Several evaluation orders for "+" are implemented in Haskell: left-to-right evaluation (Java, ML), non-deterministic choice, interleaving, evaluation in one atomic step, and evaluation with ANSI C control points, which leave a margin of semantic ambiguity in which to implement optimizations. (A Study of Evaluation Order Semantics in Expressions with Side Effects, Nikolaos S. Papaspyrou and Dragan Macos, 2000)
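The phenomenon the paper studies can be seen with a tiny evaluator. This is an illustrative Python sketch (invented names, toy expression language), not the paper's Haskell monadic construction:

```python
# Evaluating "+" left-to-right vs right-to-left changes the result
# when an operand has side effects (cf. C's (i++) + i).
store = {}

def ev(expr):
    kind = expr[0]
    if kind == "lit":
        return expr[1]
    if kind == "var":
        return store[expr[1]]
    if kind == "postinc":                 # x++ : returns old value, bumps x
        old = store[expr[1]]
        store[expr[1]] = old + 1
        return old
    if kind == "add_lr":                  # left-to-right, as in Java/ML
        left = ev(expr[1]); right = ev(expr[2])
        return left + right
    if kind == "add_rl":                  # right-to-left
        right = ev(expr[2]); left = ev(expr[1])
        return left + right

e = lambda order: (order, ("postinc", "i"), ("var", "i"))
store["i"] = 0
lr = ev(e("add_lr"))   # (i++) reads 0, then i reads 1  ->  1
store["i"] = 0
rl = ev(e("add_rl"))   # i reads 0 first, then (i++) reads 0  ->  0
print(lr, rl)
```

The monad-transformer approach in the paper makes exactly this choice of sequencing a pluggable parameter of the semantics instead of hard-coding it in the evaluator.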
Posted to implementation by jon fernquest on 9/15/02; 8:10:08 AM

Discuss (2 responses)

How small is small ?

On the VM side (SqM, the SmallSqueak MobVM), for Windows, the executable SqM.exe is about 10KB and _almost_ fully compatible with the mainstream VM. The goal is 100% compatibility.

The size of the image depends entirely on the Squeak modularizing efforts. The goal for this image is one around 0KB ;-) in size and fully compatible with the mainstream Squeak image.

This looks like an interesting experiment.

Note that even though Smalltalk is an OOP language, this item goes under implementation, because it is about implementation techniques. Same goes for entries under Software-Eng etc.
Posted to implementation by Ehud Lamm on 9/11/02; 1:25:18 PM


Composing Programming Languages by Combining Action-Semantic
(Composing Programming Languages by Combining Action-Semantics Modules, Kyung-Goo Doh and Peter D. Mosses) Reuse of language functionality by combining modules of language features and resolving conflicts in these combinations:

Hoare noted in his POPL’73 paper Hints on programming language design that much of programming language design is consolidation, not innovation. In other words, a language designer should largely utilize constructions and principles that have worked well in earlier design projects and experiments (cf. the evident close relationship between C and C++, C and Java). To this end, language definitions should be modularized.... Modules for constructs stemming from the so-called Landin-Tennent design principles (the principles of abstraction, parameterization, correspondence, and qualification) are particularly uniform, and straightforward to combine.

The ASF-SDF formalism is used together with a new streamlined version of action semantic notation, AN-2. A simple functional expression language is used as a running example, in a literate programming style that intersperses code with commentary on the code. The language features combined include:

"...expression bindings, expression blocks, expression parameters, and expressions with effects,... eager vs. lazy binding, static vs. dynamic scoping, call-by-value vs. call-by-name parameter passing schemes, and expressions with effects."

The later part of the paper deals with the issues of unifying modules, module consistency, module conflicts ("when there are two different semantic equations for the same syntactic construct in the modules to be combined"), and conditions for a language composed from modules to be complete.
Posted to implementation by jon fernquest on 9/2/02; 9:13:52 AM


Why Don’t Programming Language Designers Use Formal Methods?
An argument for Action Semantics. Discusses the life cycles of some languages (Pascal, Ada 83, ML) and how formal methods played no part in their design, with formal specifications arriving only long after the fact. Action Semantics emphasizes pragmatics and can specify the semantics of imperative, object-oriented, functional, and concurrent programming languages.

...these specifications are unusually writable, modifiable, reusable, and readable. These qualities are consequences of the design of action notation. Familiar programming language concepts like binding, storing, sequencing, iteration, etc., are directly expressed in action notation. An action-semantic specification reads rather like an operational specification expressed in English.

There is also an Action Semantics home page that has three well-written tutorials:

Also note that Action Semantics influenced Espinosa's Scheme monad-based Semantic Lego ("Why Don’t Programming Language Designers Use Formal Methods?", David A. Watt, 1996)
Posted to implementation by jon fernquest on 8/24/02; 3:23:25 AM

Discuss (2 responses)

Mithril vm
This plain text document is a one draft/one sitting description of the Mithril virtual machine which intends to motivate a specification of programming language syntax by grounding syntax in a semantic context...

The Mithril programming language is a Smalltalk variant with an object system and a coding syntax that both strongly resemble Smalltalk's.

Many interesting memory management techniques.

Posted to implementation by Ehud Lamm on 8/17/02; 12:04:24 PM

Discuss (1 response)

Mark-sweep vs. copying collection
Hans-J. Boehm: It has often been argued that copying collection is superior to mark-sweep for two reasons. First, it compacts memory, and hence avoids any fragmentation. Second, its running time is proportional to the amount of live memory, not the size of the heap. We argue here that both are relevant only under special circumstances. Furthermore, the second claim is meaningful only if garbage collection time is defined in a somewhat contrived manner.
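A minimal sketch of why the costs differ: the mark phase touches only reachable objects, while the sweep scans the entire heap, live or dead (toy model, invented names):

```python
def mark(roots, heap):
    """Trace from the roots: work is proportional to live objects."""
    live, stack = set(), list(roots)
    while stack:
        obj = stack.pop()
        if obj not in live:
            live.add(obj)
            stack.extend(heap[obj])      # children of obj
    return live

def sweep(heap, live):
    """Reclaim the rest: this pass visits every heap cell."""
    freed = [obj for obj in heap if obj not in live]
    for obj in freed:
        del heap[obj]
    return freed

heap = {"a": ["b"], "b": [], "c": ["d"], "d": []}   # c, d unreachable
live = mark(roots=["a"], heap=heap)
print(sorted(live), sorted(sweep(heap, live)))
```

A copying collector has only the trace phase (live objects are evacuated as they are traced), which is the asymptotic advantage Boehm argues matters less in practice than it appears.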

Opinions? Any more recent references?

Posted to implementation by Ehud Lamm on 8/14/02; 2:52:35 PM


Perl Foundation Grants are Almost Empty
As noted and discussed on Slashdot.
Posted to implementation by andrew cooke on 7/8/02; 1:35:58 PM


Playing with the GNAT Run Time
Miranda, J., Guerra, F., Gonzalez, A., Martin, J. How to modify the GNAT Run Time to experiment with Ada extensions. (ISBN: 84-87526-68-3).

Another Ada-Europe link. Gnat (free download) is an open source GCC based Ada compiler. It is one of the widely used Ada compilers, and provides many interesting features. This book length document is a detailed description of the Gnat runtime.

Since the Ada language supports concurrency, exceptions, and various other language features that affect the runtime, this book may even interest people not using Ada. Obviously, if you want to play around with Gnat this is a great resource.

If you are more interested in the compiler front-end, I believe a book about the Gnat front-end is in the works. Until it comes out you can make use of some papers by the same author that talk about extending the front-end. (Springer LINK required for full-text access)

Posted to implementation by Ehud Lamm on 6/26/02; 8:07:03 AM

Discuss (4 responses)

Runtime optimization mailing list
Topics of interest include (but are not limited to) binary translations and code generation techniques, dynamic and adaptive optimization technologies, interpreters, emulators and runtime environments, static versus dynamic compiler technologies, profile- or feedback-directed optimizations, testing and performance measurement, hot-spot or critical path prediction, interaction between hardware and software technologies, cache/caching and memory management strategies, and more.

I am not going to follow this closely, but if you find something interesting on this list, do let us know.

Posted to implementation by Ehud Lamm on 6/14/02; 8:03:26 AM


Stack Inspection: Theory and Variants
Stack inspection is a security mechanism implemented in runtimes such as the JVM and the CLR to accommodate components with diverse levels of trust. Although stack inspection enables the fine-grained expression of access control policies, it has rather a complex and subtle semantics. We present a formal semantics and an equational theory to explain how stack inspection affects program behaviour and code optimisations.

The tech report is very detailed (~70 pages).

The most interesting part of this work is the careful discussion of how various code transformations are affected by stack inspection. Especially note the discussion of tail calls.
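A toy model of the mechanism (all names invented; real JVM/CLR stack inspection is considerably more involved): a permission demand walks the stack and fails if any intervening frame lacks the permission, unless a trusted frame explicitly asserted it (à la doPrivileged):

```python
class SecurityError(Exception):
    pass

def check(stack, permission):
    """Walk the call stack from the most recent frame outward."""
    for frame in reversed(stack):
        if permission not in frame["perms"]:
            raise SecurityError(f"{frame['name']} lacks {permission}")
        if frame.get("asserts") == permission:   # like doPrivileged
            return True                          # stop the walk here
    return True

lib_priv  = {"name": "library", "perms": {"file"}, "asserts": "file"}
lib_plain = {"name": "library", "perms": {"file"}}
applet    = {"name": "applet",  "perms": set()}

print(check([applet, lib_priv], "file"))     # True: library asserted
try:
    check([applet, lib_plain], "file")       # untrusted caller on stack
except SecurityError:
    print("denied")
```

One can already see from this model why tail calls are delicate: eliminating a frame erases exactly the information the walk consults.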

Posted to implementation by Ehud Lamm on 6/13/02; 8:18:20 AM


Investigations into Portable Object Codes
In contrast to functional assembler code, a functional object code is akin to the lowest-level intermediate code in a transformation-based, type-preserving compiler for a functional language. From a denotational point of view, it is indeed nothing but a simple functional language; however, it has a clearly defined operational interpretation, which determines its translation into machine code. As the functional flavour of the program is still preserved, a range of optimisations (such as type specialisation and points-to analysis) are still possible which can no longer be applied after the program has been translated into machine code. The abstraction level of functional object code is similar to that of the Spineless Tagless G-machine, and an instance is defined in a workshop paper.

Another approach to language neutrality.

Posted to implementation by Ehud Lamm on 6/3/02; 4:35:25 AM

Discuss (2 responses)

Semantic Lego
[...] we divide a semantics into two parts, a computation ADT and a language ADT. The computation ADT represents the basic semantic structure of the language. The language ADT represents the [...] grammar. [...] Language constructs are polymorphic over many different computation ADTs. [...] We build the computation ADT from composable parts [...] many different computation ADTs and, since our language constructs are polymorphic, many different language semantics.

Plug and play semantics. The quote above is from the thesis, but the link contains related papers. This all follows on from this thread.
Posted to implementation by andrew cooke on 5/21/02; 5:47:34 PM


Squeampiler is a retargetable optimizing compiler backend for Squeak by Lee Salzman. Squeampiler aspires to become a dynamic and static compiler for Squeak. Squeampiler implements a number of strong optimizations and analyses found in commercial compilers, including SSA translation, memory alias analysis, second-chance linear scan register allocation, and more.

Posted to implementation by Ehud Lamm on 5/6/02; 7:53:29 AM

Discuss (3 responses)

Compiling functional languages
Xavier Leroy, Compiling functional languages. Course given at the Spring School Semantics of programming languages, Agay, March 2002.

A set of 90 slides covering important aspects of the implementation of functional programming languages (including closure representation, abstract machines, control flow analysis, and representation of data).

This is a good starting point for anyone interested in the implementation of functional languages. References to research papers are given throughout.

I seem to recall someone around these parts is looking for info on compiling closures. This may be your lucky day.

A Postscript version is also available (two slides per page).

Posted to implementation by Ehud Lamm on 5/1/02; 11:53:47 PM


Implementation Strategies for Continuations
Will Clinger, Anne Hartheimer, Eric Ost. Implementation strategies for continuations. Proceedings of the 1988 ACM conference on LISP and functional programming.

Scheme and Smalltalk continuations may have unlimited extent. This means that a purely stack-based implementation of continuations, as suffices for most languages, is inadequate. Several implementation strategies have been described in the literature. Determining which is best requires knowledge of the kinds of programs that will commonly be run.

I found this through the LL1 discussion list which is mostly dead by now. This message gives more details as to the origin of the paper.

This paper may help make the quite abstract concept of continuations more concrete - since it describes implementation techniques. It also touches on how continuations are used in practice.
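One way to make "unlimited extent" concrete: in continuation-passing style the rest of the computation is an ordinary function value, so it survives the call that created it and can even be resumed twice. A small sketch (not one of the paper's strategies, just the concept):

```python
# The continuation k stands for "everything that happens after the
# addition". Because it is a first-class value, we can stash it and
# invoke it again long after add_cps has returned.
saved = []

def add_cps(a, b, k):
    saved.append(k)            # capture the continuation for later reuse
    return k(a + b)

result1 = add_cps(1, 2, lambda v: v * 10)   # continue normally: 30
result2 = saved[0](100)                      # resume the same k again: 1000
print(result1, result2)
```

It is precisely this "call it again later" ability that rules out a naive stack discipline and motivates the heap-allocation, stack-copying, and hybrid strategies the paper compares.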

Posted to implementation by Ehud Lamm on 3/11/02; 4:43:48 AM


Partial Evaluation and Automatic Program Generation
(via comp.compilers)

The book Partial Evaluation and Automatic Program Generation (N.D. Jones, C.K. Gomard, and P. Sestoft) gives a comprehensive presentation of partial evaluation: theory, techniques, and applications. It is suitable for self-study, and for graduate courses and advanced undergraduate courses on program transformation techniques.

The full text of the book can be downloaded from the site.

The site includes a short explanation of what partial evaluation is all about, and links to Scheme and Prolog example code from the book.

Posted to implementation by Ehud Lamm on 3/9/02; 4:32:12 AM


Teaching an ML compiler to speak C natively
No-Longer-Foreign: Teaching an ML compiler to speak C ``natively.'' Matthias Blume. In BABEL'01: First workshop on multi-language infrastructure and interoperability, September 2001, Firenze, Italy. (Here is a slightly more detailed early draft of the same paper.)

This paper shows how to encode (almost all of) the C type system using ML. This allows ML and C program to achieve data-level interoperability.

The ML FFI system described in the paper knows how to take C type definitions (h files) and create the ML information automatically.

The paper shows how to represent array types so that array dimensions can be checked statically, using phantom types. I believe Oleg has has implemented similar things in Haskell.

Posted to implementation by Ehud Lamm on 2/25/02; 10:56:09 AM


A Reduction Semantics for Array Expressions

Arrays are one of the most important data structures in computing. They are also one of the most fun; if you're into sound or graphics you will almost certainly do some array processing at some time. Unsurprisingly there are many languages around that focus on array manipulation. Broadly speaking they are divided into two camps: rubbish languages that produce fast code (think Fortran and derivatives) and good languages that produce slow code (think APL and derivatives). Wouldn't it be nice to have a decent language that produced fast code?

Well the state of the art is getting there. Single Assignment C produces extremely fast code and uses a pleasant declarative form to specify array manipulation. It's based on a calculus of arrays called the Mathematics of Arrays. This paper has the details of the reduction rules used to perform these optimisations. It's worth reading if you're interested in getting fast array code out of a functional language; hopefully something functional compiler authors will turn to.
Posted to implementation by Noel Welsh on 1/29/02; 12:22:25 AM

Discuss (1 response)

New Block Closures in Squeak
A block closure is a block that has its own state, separate from its creating context, and can be invoked more than once at the same time.

Over on the Squeak wiki. The page includes a brief explanation of some implementation details.

More on the semantics.

Posted to implementation by Ehud Lamm on 1/22/02; 5:56:40 AM

Discuss (2 responses)

Incremental Mature GC Using the Train Algorithm
(via caml-list)
We present an implementation of the Train Algorithm, an incremental collection scheme for reclamation of mature garbage in generation-based memory management systems. To the best of our knowledge, this is the first Train Algorithm implementation ever. Using the algorithm, the traditional mark-sweep garbage collector employed by the Mjølner run-time system for the object-oriented BETA programming language was replaced by a non-disruptive one, with only negligible time and storage overheads.

Also check out Mjølner, the makers of a commercial BETA implementation.
Posted to implementation by Bryn Keller on 1/11/02; 1:15:41 PM


Trampolined Style
Trampolined Style, Steven E. Ganz, Daniel P. Friedman, and Mitchell Wand, ICFP 1999.

I was sure this paper was linked here before, but I can't seem to find it.

Since two of my students recently needed this paper for projects they are doing, I decided it may be of interest to loyal LtU readers.

A trampolined program is organised as a single loop in which computations are scheduled and their execution allowed to proceed in discrete steps.

Trampolining can be used as a way to avoid stack overflow in interpreters written in languages not supporting proper tail calls. More interesting uses are for debugging (stepping, breakpoints, etc.) and multithreading. These can be based on trampolining the interpreter, or the program itself.
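A minimal Python sketch of the style: mutually tail-recursive functions return thunks instead of calling each other, and a single driver loop runs the steps (names invented; Python, like the languages the paper targets, has no proper tail calls):

```python
def even(n):
    # instead of the tail call odd(n - 1), return a thunk for it
    return True if n == 0 else (lambda: odd(n - 1))

def odd(n):
    return False if n == 0 else (lambda: even(n - 1))

def trampoline(step):
    """The single loop: run one discrete step at a time until a
    non-callable final value appears. The stack never grows."""
    while callable(step):
        step = step()
    return step

print(trampoline(lambda: even(100000)))  # True, with no stack overflow
```

The debugging and multithreading uses the post mentions fall out of the same structure: since control returns to the loop between steps, the loop is free to set breakpoints, single-step, or interleave several computations.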

Posted to implementation by Ehud Lamm on 1/8/02; 11:00:54 AM


The Future of Compilers
Proebsting's Law states that compiler technology doubles in power every 18 years, a direct analog to Moore's famous 18-month rule. This suggests that increasing programmer productivity is a better research field, though others contend that the evidence does not support Proebsting's Law.

Regardless of the validity of Proebsting's Law, it is interesting to ponder the future of compilers, since they affect all programming activities. Some say more effort should be spent optimising higher-level code, to increase programmer productivity without harming runtime performance. No doubt LtU readers are familiar with efforts in this area, exemplified by the ML and Haskell compilers. In fact, higher-level languages offer more opportunities for optimisation due to two properties. Firstly, they tend to have much cleaner semantics and are therefore easier to reason about within a compiler. Secondly, there isn't such a direct mapping to CPU operations as there is in portable assembly like C or C++, so there is more room for the compiler to try different options. I'll return to this point later.

It is interesting to look at novel ways of increasing program performance. The Dynamo project has created an on-line compiler that complements traditional off-line optimisations by concentrating on optimising the things that are hard to get right off-line: branch prediction and code placement. The results are impressive, showing 20% speedups over code compiled at -O4, and even improving code compiled on the basis of profile information. In addition to the technical reports there is a nice overview at ArsTechnica and the usual grab-bag at Slashdot.

The Haskell compiler GHC takes a different approach, based on two observations: firstly, there are infinitely many optimisations; secondly, most optimisations are application-specific. The conclusion is to let users specify optimisations in the form of rewrite rules. (As a Scheme hacker I must point out that Lisp has had this since the beginning of time in the form of compiler macros.) This approach offers great potential for high-level optimisations, though it remains to be seen how effective it is in practice.
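A toy version of user-specified rewrite rules, sketched with Python's ast module rather than GHC's RULES pragmas (the rule here is the classic e + 0 ==> e; names invented):

```python
import ast

class PlusZero(ast.NodeTransformer):
    """A single rewrite rule applied bottom-up over the syntax tree."""
    def visit_BinOp(self, node):
        self.generic_visit(node)          # rewrite subexpressions first
        if (isinstance(node.op, ast.Add)
                and isinstance(node.right, ast.Constant)
                and node.right.value == 0):
            return node.left              # fire the rule: e + 0 ==> e
        return node

def optimize(src):
    tree = ast.fix_missing_locations(PlusZero().visit(ast.parse(src)))
    return ast.unparse(tree)

print(optimize("y = (x + 0) + 0"))   # y = x
```

The GHC mechanism is far more powerful (rules fire on typed Core, interact with inlining, and are supplied by library authors), but the shape is the same: a user-written left-hand side pattern and a right-hand side replacement.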

The final point I'd like to raise gets back to the 'more-wriggle-room' argument above. It has been suggested that future compilers should incorporate AI techniques to search for the best optimisations for a given program. Traditional compilers don't adapt their optimisations to the program. Smarter compilers will, but the complexities in this approach are formidable. Heavily optimising compilers (e.g. Vortex and Stalin) already have long compile times. Imagine what could happen if you then allow the compiler to search through different representations. There are undoubtedly some challenges to solve with this approach, but there is room for great gains as well.
Posted to implementation by Noel Welsh on 12/8/01; 2:49:43 AM

Discuss (4 responses)

Hitch Hiker's Guide to the Smalltalk Compiler
A fun tour of the innards of two Smalltalk compilers.

Touches on the virtual machine, and shows some interesting hacks like adding support for assertions.

Posted to implementation by Ehud Lamm on 12/4/01; 8:31:45 AM


ElectricalFire is a Java Virtual Machine that uses JIT (Just In Time) compilation techniques to accelerate Java code execution. ElectricalFire was designed from the start to generate high-performance machine code and to be portable to many different processor architectures.

I think this is the last JVM/Java JIT compiler I am going to mention for a while...

This one is at least open source.

Thanks, Paul.
Posted to implementation by Ehud Lamm on 12/4/01; 8:26:27 AM


GNU lightning
(via sweetcode)

GNU lightning is a library that generates assembly language code at run-time; it is very fast, making it ideal for Just-In-Time compilers, and it abstracts over the target CPU, as it exposes to the clients a standardized RISC instruction set inspired by the MIPS and SPARC chips.

This is almost all the information found on the web site. I guess if you want more, you just have to read the source...

Too much about implementation recently; anyone got some cool theory papers?

Posted to implementation by Ehud Lamm on 11/30/01; 12:56:53 AM

Discuss (4 responses)

JRockit is a family of superior Virtual Machines for server-side Java that can be integrated, optimized and independently configured and managed for their unique operating environments.

This trend of more and more Java VMs and native compilers is starting to get boring, but at least it's a chance to compare various design decisions.

JRockit comes with a variety of garbage collectors. You can also check this list for other distinguishing features.

Posted to implementation by Ehud Lamm on 11/28/01; 12:37:14 PM


Spirit Parser Framework
A parser "generator" that uses C++ template metaprogramming. You write EBNF directly into C++ source code, without any preprocessors or code generators (seen at sweetcode).

Could someone wiser than me compare this with combinator parsing in functional languages? Is it the same thing?

(I searched for "Spirit", but turned up nothing, apologies if this is a duplicate - only been reading this site again for the last week or two)
Posted to implementation by andrew cooke on 11/26/01; 2:47:37 AM

Discuss (2 responses)

Proto was one of the languages discussed at the lightweight languages workshop. The site features some presentations about Proto. I found the discussion on bootstrapping interesting. The presentation about the implementation details covers some issues that arise when using C as a target language, specifically the techniques of closure flattening and lambda lifting.

Posted to implementation by Ehud Lamm on 11/23/01; 7:38:09 AM

Discuss (1 response)

Psyco, the Python Specializing Compiler
(via Daily Python-URL)

The aim of the Psyco project is to show that it is possible to execute Python code at speeds approaching that of fully compiled languages. The current prototype operates on i386-compatible processors. Preliminary results show potential speed-ups of a factor 10 to 100, which means that we can hope execution speeds closer to fully optimized C than to traditional Python.

Psyco is an on-line partial evaluator for the Python language. It is based on run-time code generation.
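The flavour of run-time code generation can be shown with a deliberately naive sketch (this is not how Psyco works internally; it specializes at the level of generated machine code, not Python source):

```python
def make_power(n):
    """Generate and compile a version of x**n specialized for this
    fixed n, with the multiplication loop fully unrolled."""
    body = " * ".join(["x"] * n) or "1"
    src = f"def power(x):\n    return {body}\n"
    env = {}
    exec(compile(src, "<specialized>", "exec"), env)
    return env["power"]

power3 = make_power(3)               # body is literally: x * x * x
print(power3(2), make_power(0)(5))   # 8 1
```

The payoff is the same one a specializing compiler chases: once an argument is known (here `n`), the interpretive work that depended on it, the loop and its tests, disappears from the generated code.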

Posted to implementation by Ehud Lamm on 11/22/01; 11:16:03 AM

Discuss (2 responses)

Wonka - Embedded Java VM
(via HtP)

Wonka is ACUNIA's cleanroom Virtual Machine for the Java language. It is extremely portable and self-contained, and can optionally be used with its own real-time executive (OSwald) to provide a complete solution for embedded devices. It is a full implementation of the Java language, not just a subset. And it's Open Source.

You may find the design decisions interesting.

Posted to implementation by Ehud Lamm on 11/12/01; 2:35:39 AM

Discuss (3 responses)

(via comp.compilers)

Manta is a native Java compiler. It compiles Java source code to x86 executables. Its goal is to beat the performance of all current Java implementations. It already contains a highly efficient RMI implementation (source-code compatible with standard RMI) that is about 30 times faster than standard implementations.

Manta was created for experimenting with parallel and distributed computing:

Manta also supports some Java extensions, such as the JavaParty programming model (the 'remote' keyword), replicated objects (described at JavaGrande 2000), and efficient divide-and-conquer parallelism (the 'spawn' and 'sync' keywords from Cilk). The divide-and-conquer system is called 'Satin' and was described at Euro-Par 2000 and PPoPP'01. Furthermore, we have built a distributed shared memory (DSM) system on top of Manta, called Jackal, described at JavaGrande 2001 and PPoPP'01.

(the papers are linked to directly from the Manta website).
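The 'spawn' and 'sync' keywords have no plain sequential analogue, so here is a sequential sketch (mine, not Satin code) with the Cilk-style annotations as comments, showing where the parallelism would go:

```python
# Sequential sketch of Cilk/Satin-style divide and conquer: 'spawn' marks
# calls that may run in parallel, 'sync' waits for their results. Here the
# annotations are only comments; the computation is ordinary recursion.

def fib(n):
    if n < 2:
        return n
    a = fib(n - 1)   # spawn fib(n - 1)  -- may run in parallel
    b = fib(n - 2)   # spawn fib(n - 2)  -- may run in parallel
    return a + b     # sync -- both results must be ready before combining

assert fib(10) == 55
```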

Posted to implementation by Ehud Lamm on 10/26/01; 8:02:42 AM


Jikes Research Virtual Machine (RVM)
(via comp.compilers)

The Jikes™ Research Virtual Machine (Jikes RVM) provides the academic and research communities with a flexible open testbed to prototype new virtual machine technologies and experiment with a large variety of design alternatives...

...The Jikes RVM runs on AIX™/PowerPC™, Linux©/PowerPC and Linux/IA-32 platforms and includes state-of-the-art virtual machine technologies for dynamic compilation, adaptive optimization, garbage collection, thread scheduling and synchronization.

Jikes can be downloaded from the website.

Posted to implementation by Ehud Lamm on 10/26/01; 7:57:27 AM


Instruction scheduling and imperative functional programming
The article makes the case that a transformation of a typical imperative code into functional code is similar to the data and control dependency analyses in compilers for modern CPUs. By typical imperative code we mean a "typical numerical" code that involves nested loops and updateable arrays....

...The analysis of data and control dependencies across loop iterations, the scheduling of instructions and interlocking of their results, the pre-fetching of data -- this is the job of an advanced compiler or the CPU. On the other hand, we have to do the same kind of analyses to separate the functional part of a computation from its imperative part. It appears functional programming is closer to "the bare metal" than one would have thought.
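As a small illustration of the kind of rewriting the article has in mind (my example, not the author's): the loop-carried dependence hidden in an imperative accumulation becomes an explicit reduction in the functional version, which is exactly the dependence a scheduling compiler must discover.

```python
# The same dot product, twice. The imperative loop carries a dependence on
# 'acc' from one iteration to the next; the functional version makes that
# dependence chain explicit as a reduction.
from functools import reduce

def dot_imperative(xs, ys):
    acc = 0.0
    for i in range(len(xs)):        # loop-carried dependence on acc
        acc += xs[i] * ys[i]
    return acc

def dot_functional(xs, ys):
    return reduce(lambda acc, p: acc + p[0] * p[1], zip(xs, ys), 0.0)

assert dot_imperative([1, 2, 3], [4, 5, 6]) == \
       dot_functional([1, 2, 3], [4, 5, 6]) == 32.0
```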

Posted to implementation by Ehud Lamm on 10/23/01; 1:55:51 PM

Discuss (2 responses)

Java Performance
(TechNetCast program)

IBM Research engineer Peter Haggar dispels the myth of Java's slowness, making the case that Java's reputation as a slug is entirely undeserved.

The main focus of this talk is JIT and HotSpot compilation, rather than what directly affects run-time performance, such as heap management and GC. More useful information on these topics can be found in the links posted here in the past about Java and realtime.

Posted to implementation by Ehud Lamm on 9/29/01; 8:28:43 AM


Garbage collection vs. reference counting in .NET

One of the gripes I have about garbage collection is that it is, unlike reference counting, non-deterministic and thus unusable for managing resources other than memory. As a result managing resources is much easier in C++ than in, say, Java.

A while ago there was a heated discussion on one of the Develop Mentor mailing lists about the above problem. It culminated with a message in which Brian Harry, one of the Product Managers on the .NET team, explained why they decided to go with a tracing collector and not reference counting after all. A very interesting read.
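The usual workaround under a tracing collector is to release resources explicitly at a known point, rather than in a finalizer that runs whenever collection happens to occur. A minimal sketch in Python, whose `with` statement plays the role a C++ destructor plays:

```python
# With a tracing GC, a finalizer may run arbitrarily late, so non-memory
# resources (files, locks, sockets) need explicit, deterministic release.
# A context manager guarantees cleanup at block exit, not at collection time.

class Resource:
    def __init__(self):
        self.open = True
    def close(self):
        self.open = False
    def __enter__(self):
        return self
    def __exit__(self, *exc):
        self.close()        # runs exactly at block exit, even on exceptions

with Resource() as r:
    assert r.open           # resource is live inside the block
assert not r.open           # ...and deterministically released after it
```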

Posted to implementation by Dejan Jelovic on 8/20/01; 8:49:15 AM

Discuss (3 responses)

Dave Simmons of QKS (a Smalltalk implementation company) posted an interesting message to the language-dev mailing list discussing his forthcoming AOS VM, expected in August or September. He describes it as being extremely fast (within 1-2x C speeds), having a small footprint (~600k including compiler and tools), and supporting a horde of nice features (e.g., multi-methods, type inference, a MOP) in addition to the usual.

It runs his language, SmallScript, but a number of ports are apparently underway for Python and so on. Programs which run on the AOS VM can be 'cross-jitted' into .Net IL/VM code. The AOS VM will be free (no cost), but not open source as far as I can tell. The AOS/.Net converter will be a commercial product.

More information (disappointingly little, actually) is available at

Posted to implementation by Bryn Keller on 8/8/01; 1:28:54 PM


for and against conservative GC
An informative comp.lang.scheme thread (with contributions by William Clinger and Hans Boehm among others).

(Another interesting but long thread on comp.lang.scheme revolves around the environment model of evaluation, mutation etc.)

Posted to implementation by Ehud Lamm on 8/3/01; 4:32:39 AM


Hidden complexities of tail-call/tail-recursion optimization
Tail call elimination and the difference between tail calls and tail recursion are frequent topics on language-related discussion groups. Last time this came up on comp.lang.functional, I asked Oleg to summarize the subject for LtU. Happily, he obliged.
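The core point can be shown in a few lines (my sketch, not Oleg's summary): a call in tail position can be compiled as a jump, which is why the two versions below are equivalent. Python itself does not perform this optimization, so the rewrite is done by hand.

```python
# A tail call is a call after which the caller has nothing left to do, so
# its stack frame can be reused: the call becomes a jump, and the recursion
# becomes a loop. Python keeps the frames, hence the manual rewrite.

def sum_to_rec(n, acc=0):
    if n == 0:
        return acc
    return sum_to_rec(n - 1, acc + n)   # tail call: result returned as-is

def sum_to_loop(n, acc=0):
    while n != 0:                        # the same tail call, as a jump
        n, acc = n - 1, acc + n
    return acc

assert sum_to_rec(100) == sum_to_loop(100) == 5050
```

Note that `sum_to_loop` runs in constant stack space for any `n`, while `sum_to_rec` hits Python's recursion limit; with tail-call elimination the two would be compiled identically.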

Posted to implementation by Ehud Lamm on 7/11/01; 5:24:52 AM

Discuss (4 responses)

Language Independent Arithmetic (LIA-2)
I don't know if this is a well-known document; I just heard about it on the Mercury mailing list, in a discussion about whether 0^0 should return 1 (at least for float, int -> float).

This document can settle questions like whether 13 / -4 should return -3 (C) or -4 (Python, Ruby...). (Ruby's behaviour evolved on this.)
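The two conventions are easy to compare directly. In each, the remainder is chosen so that q*d + r == n holds: truncation toward zero (C) pairs with a remainder that takes the sign of the dividend, while flooring (Python, Ruby) pairs with a remainder that takes the sign of the divisor.

```python
# 13 / -4 under the two common integer-division conventions.
n, d = 13, -4

q_trunc = int(n / d)        # truncation toward zero, as in C: -3
r_trunc = n - q_trunc * d   # remainder has the sign of the dividend: 1

q_floor = n // d            # floor division, as in Python/Ruby: -4
r_floor = n % d             # remainder has the sign of the divisor: -3

assert (q_trunc, r_trunc) == (-3, 1)
assert (q_floor, r_floor) == (-4, -3)
assert q_trunc * d + r_trunc == n == q_floor * d + r_floor
```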

Alas I can't find an html version available...
Posted to implementation by pixel on 7/4/01; 5:04:43 PM

Discuss (1 response)

Cross-Language Implementation mailing list
An article on the use Perl site notes the creation of a new mailing list for language developers. The list is meant to include Perl, Python, and Tcl developers, and has been set up to discuss common problems in language development, including Unicode, threads, and numeric conversion.
Posted to implementation by pixel on 7/4/01; 5:01:10 PM

Discuss (1 response)

Linux Mag: GCC.NET
Quite a few interesting ideas here.

The article talks about various ideas connecting GCC and .Net. These include using GCC to compile the IL to machine code and making GCC emit IL bytecodes. Compiling C# is also discussed.

Posted to implementation by Ehud Lamm on 6/28/01; 5:23:32 AM


Hardware Support for Objects/Java
Hardware for supporting objects and other features of Java such as multithreading, dynamic linking and loading is the focus of this workshop. The impact of Java's features on micro-architectural resources and issues in the design of Java-specific architectures are interesting topics that require the immediate attention of the research community. While Java has become an important part of desktop applications, it is now being used widely in high-end server markets, and making forays into low-end embedded computing. Papers focussing on these emerging areas are particularly invited.

This is the site of the third annual workshop, which will take place in September 2001. The site links to the previous workshops' pages and their proceedings. These include discussions of such things as garbage collection, threading, and just-in-time compilation.

Posted to implementation by Ehud Lamm on 6/5/01; 1:58:06 PM


Compiler reliability
The question of compiler reliability came up in a class discussion. I tried to explain that this question is more subtle than it seems. Mission-critical systems may require that the compiler be validated, which requires a standard. But they may also require the ability to review the object code, and to have direct traceability to the source.

This paper is about validating a compiler.

Another, perhaps more interesting, approach is discussed in this paper. It talks about using proof-carrying code, etc. (also check the references).

The traditional approach to these issues is, of course, to create validation test suites for compilers. Check the Ada Conformity Assessment Test Suite.

Posted to implementation by Ehud Lamm on 5/31/01; 1:41:46 PM


Will Java always be slower than C++
This essay claims that Java is inherently slow, because of its language features. I'd be glad to hear more opinions on this.

One of the keynote talks in the conference I attended last week was about real-time Java. As is well known, Ada is very strong in the real-time and high-reliability sectors. The paper was written by two experts in both Ada and Java, and my conclusion from the paper is that some interesting work is being done.

Some relevant links are the requirements for RT Java from NIST, the work of the J Consortium, and this paper about The Design and Implementation of Real-Time Java.

Posted to implementation by Ehud Lamm on 5/22/01; 8:33:27 AM

Discuss (8 responses)

Janos VM is available
The Janos project's objective is to develop a principled local operating system for active network nodes, oriented to executing untrusted Java bytecode. The primary focus is resource management and control, with secondary objectives of information security, performance, and technology transfer of broadly and separately useful software components.

The Janos VM is special because unlike any available virtual machine it supports multiple, separate processes (called "teams" in JanosVM) within a single VM. The JanosVM supports per-team separate heaps, per-team garbage collection threads, inter-team thread migration, and safe cross-team reference objects. Designed to support asynchronous termination of uncooperative or malicious Java bytecode applications, the JanosVM provides robust and scalable multi-process support within a single virtual machine.

Posted to implementation by Ehud Lamm on 5/9/01; 7:56:02 AM


Garbage Collection Can Be Faster Than Stack Allocation
A paper by Andrew Appel.

The conclusion of this paper may seem trivial at first: A very old and simple algorithm for garbage collection gives very good results when the physical memory is much larger than the number of reachable cells.

However, get this: the overhead associated with allocating and collecting cells from the heap can be reduced to less than one instruction per cell by increasing the size of physical memory.
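A back-of-the-envelope version of the argument (my numbers and function, not Appel's): a copying collector's work is proportional to the live data only, and that work is amortized over everything allocated since the last collection. Enlarging the heap while the live set stays fixed therefore drives the per-cell overhead toward zero.

```python
# Amortized GC cost per allocated cell for a copying collector: each
# collection copies only the live cells, and pays for all the cells
# allocated since the previous collection (heap size minus live size).

def gc_cost_per_cell(heap_cells, live_cells, copy_cost_per_live_cell=10):
    allocated_between_gcs = heap_cells - live_cells
    return copy_cost_per_live_cell * live_cells / allocated_between_gcs

small = gc_cost_per_cell(10_000, 1_000)    # ~1.11 units per cell
big = gc_cost_per_cell(100_000, 1_000)     # ~0.10 units per cell

# Growing the heap 10x cuts the amortized per-cell cost by roughly 10x:
assert big < small / 9
```

With a large enough heap, this amortized cost drops below the fixed per-cell cost of explicit allocation and deallocation, which is the paper's point.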

Whether trivial or surprising, it is worth reading.

Posted to implementation by Ehud Lamm on 5/1/01; 8:25:05 AM


Uniprocessor Garbage Collection Techniques
We survey basic garbage collection algorithms, and variations such as incremental and generational collection; we then discuss low-level implementation considerations and relationships between storage management systems, languages, and compilers. Throughout, we attempt to present a unified view based on abstract traversal strategies, addressing issues of conservatism, opportunism, and immediacy of reclamation; we also point out a variety of implementation details that are likely to have significant impact on performance.

A very extensive (~70 pages) survey paper.

Covers most common GC issues, and contains a section about GC-related language features. Doesn't cover functional programming languages specifically, but touches on many of the important issues. (I was looking for a discussion of closure and continuation allocation, and these are not covered here. I'll post references to papers discussing this issue later on).

Even if you don't have the time to read all of this, I recommend reading the beginning which discusses the pros and cons of GC as compared to manual memory handling.
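For readers who want the flavor of the basic algorithms the survey covers, here is a toy mark-and-sweep collector (my sketch, not code from the paper): objects form a reference graph, marking traverses it from the roots, and sweeping reclaims everything left unmarked.

```python
# Toy mark-and-sweep: the mark phase traverses the object graph from the
# roots; the sweep phase keeps the marked (live) objects and reclaims the
# rest. Real collectors work on raw heap cells, not Python objects.

class Obj:
    def __init__(self):
        self.refs = []       # outgoing references to other objects
        self.marked = False

def collect(roots, heap):
    stack = list(roots)
    while stack:             # mark phase: depth-first from the roots
        o = stack.pop()
        if not o.marked:
            o.marked = True
            stack.extend(o.refs)
    live = [o for o in heap if o.marked]
    for o in live:           # clear marks for the next cycle
        o.marked = False
    return live              # everything not returned is garbage

a, b, c = Obj(), Obj(), Obj()
a.refs = [b]                 # a -> b; c is unreachable
assert collect([a], [a, b, c]) == [a, b]
```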

Also check out the large paper archive.

Posted to implementation by Ehud Lamm on 4/30/01; 8:34:05 AM

Discuss (1 response)
