Implementation

COLA Brainfuck

From the Software Architecture Group at the Hasso Plattner Institut:

Our tutorial on COLA provides insight into how programming languages can be implemented using the combined abstractions, together with an implementation of parsing expression grammars in COLA. The "esoteric" programming language brainfuck was chosen for its simplicity, which allows the tutorial to concentrate on COLA's features.

Previously: COLA and Open, extensible object models; via neuraxon77.
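For readers who have never looked at brainfuck, here is a plain reference interpreter in OCaml, just to show how little the language demands of an implementor. This is my own sketch, not the tutorial's COLA/PEG-based version; the 30,000-cell tape and lack of pointer bounds checks follow the usual informal conventions.

    (* A minimal brainfuck interpreter: eight one-character commands over a
       byte tape.  Not COLA code -- just a reference point for what the
       tutorial implements via parsing expression grammars. *)
    let run program =
      let n = String.length program in
      (* Precompute matching-bracket positions with a stack. *)
      let jump = Array.make n 0 in
      let stack = ref [] in
      String.iteri
        (fun i c ->
          match c with
          | '[' -> stack := i :: !stack
          | ']' ->
              (match !stack with
               | j :: rest -> jump.(i) <- j; jump.(j) <- i; stack := rest
               | [] -> failwith "unbalanced ']'")
          | _ -> ())
        program;
      if !stack <> [] then failwith "unbalanced '['";
      let tape = Bytes.make 30_000 '\000' in
      let ptr = ref 0 in
      let pc = ref 0 in
      let cell () = Char.code (Bytes.get tape !ptr) in
      let set v = Bytes.set tape !ptr (Char.chr ((v + 256) mod 256)) in
      while !pc < n do
        (match program.[!pc] with
         | '>' -> incr ptr
         | '<' -> decr ptr
         | '+' -> set (cell () + 1)
         | '-' -> set (cell () - 1)
         | '.' -> print_char (Bytes.get tape !ptr)
         | ',' -> (try set (Char.code (input_char stdin)) with End_of_file -> ())
         | '[' -> if cell () = 0 then pc := jump.(!pc)
         | ']' -> if cell () <> 0 then pc := jump.(!pc)
         | _ -> ());
        incr pc
      done

    (* Prints "H" (ASCII 72). *)
    let () = run "++++++++[>+++++++++<-]>."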

Register Allocation by Proof Transformation

Register Allocation by Proof Transformation, Atsushi Ohori. ESOP 2003.

This paper presents a proof-theoretical framework for register allocation that accounts for the entire process of register allocation. Liveness analysis is proof reconstruction (similar to type inference), and register allocation is proof transformation from a proof system with unrestricted variable accesses to the one with restricted variable access. In our framework, the set of registers acts as the "working set" of the live variables at each instruction step, which changes during the execution of the code. This eliminates the ad-hoc notion of "spilling". The necessary memory-register moves are systematically incorporated in the proof transformation process. Its correctness is a simple corollary of our construction; the resulting proof is equivalent to the proof of the original code modulo the treatment of structural rules. This yields a simple yet powerful register allocation algorithm. The algorithm has been implemented, demonstrating the feasibility of the framework.

The idea that making uses of the structural rules explicit gives you proof terms to model register-memory moves is very pretty. Another fun thing to do would be to take a continuation calculus and apply the ideas here to see if it produces a good low-level IR.

EDIT: Ehud asked for some further exposition, so here goes.

At a high level, you can think of the need to do register allocation as arising from a mismatch between a programming language and the hardware. In a language, we refer to data using variables, and in any given expression, we can use as many variables as we like. However, when a CPU does stuff, it wants the data to be in registers -- and it has only a small, finite set of them. So when a program is compiled, some variables can be represented by registers, and the rest must be represented by locations in memory (usually on the stack). Whenever the CPU needs to use a variable in memory, there must be explicit code to move it from memory into a register.

The idea in this paper is to take the typing derivation of a program with an unbounded variable set, and then divide the context into two parts, one representing the register file and the other representing variables in memory. In terms of the implementation, moves between these two zones correspond to register-memory moves; and in terms of logic, this is a use of the structural rule of Exchange, which permutes the order of variables in a context.

So this gives us a high-level, machine-independent characterization of the register allocation problem: take a one-zone derivation and convert it to a two-zone derivation. But it gets even better: as long as the register allocation algorithm only adds uses of the structural rules in its transformation, we know that the meaning of the original program is unchanged -- so this method also yields a simple way of proving that a register allocation pass is semantics-preserving. (The fact that this is an easy proof is one indication of the power of this idea.)
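To make the two-zone picture concrete, here is a toy working-set allocator in OCaml. This is my own illustrative sketch, not Ohori's algorithm: the three-address instruction set, the LRU eviction policy, and names like ensure are all invented. The point it shares with the paper is that variables live either in a small register zone or in an unbounded memory zone, and every crossing between the two zones becomes an explicit Load or Store in the output.

    type var = string

    type src =                        (* one-zone code: named variables only *)
      | SAdd of var * var * var       (* x := y + z *)

    type tgt =                        (* two-zone code: registers + explicit moves *)
      | Load  of int * var            (* register i := memory slot of the variable *)
      | Store of var * int            (* memory slot of the variable := register i *)
      | TAdd  of int * int * int      (* r_i := r_j + r_k *)

    let nregs = 2

    let allocate (code : src list) : tgt list =
      (* [regs] maps registers to the variable they currently hold, most
         recently used first; it is the working set in the register zone. *)
      let regs = ref [] in
      let out = ref [] in
      let emit i = out := i :: !out in
      (* Bring [x] into a register, evicting the least recently used variable
         (an explicit zone crossing) when the register zone is full. *)
      let ensure ~load x =
        match List.find_opt (fun (_, v) -> v = x) !regs with
        | Some (r, _) ->
            regs := (r, x) :: List.remove_assoc r !regs;
            r
        | None ->
            let r =
              if List.length !regs < nregs then List.length !regs
              else begin
                let r, victim = List.hd (List.rev !regs) in
                emit (Store (victim, r));
                regs := List.remove_assoc r !regs;
                r
              end
            in
            if load then emit (Load (r, x));
            regs := (r, x) :: !regs;
            r
      in
      List.iter
        (fun (SAdd (x, y, z)) ->
          let ry = ensure ~load:true y in
          let rz = ensure ~load:true z in
          let rx = ensure ~load:false x in
          emit (TAdd (rx, ry, rz)))
        code;
      (* Flush what remains of the working set back to the memory zone. *)
      List.iter (fun (r, v) -> emit (Store (v, r))) !regs;
      List.rev !out

    (* x := a + b; y := x + c  with two registers: the output interleaves the
       adds with explicit Loads of a, b, c and Stores for evicted variables. *)
    let _ = allocate [ SAdd ("x", "a", "b"); SAdd ("y", "x", "c") ]

In the paper the analogous moves are not produced by an ad-hoc policy but fall out of inserting structural rules into the typing derivation, which is what makes the correctness argument essentially free.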

Simply efficient functional reactivity

Simply efficient functional reactivity. Conal Elliott.

Functional reactive programming (FRP) has simple and powerful semantics, but has resisted efficient implementation. In particular, most past implementations have used demand-driven sampling, which accommodates FRP's continuous time semantics and fits well with the nature of functional programming. Consequently, values are wastefully recomputed even when inputs don't change, and reaction latency can be as high as the sampling period.

This paper presents a way to implement FRP that combines data- and demand-driven evaluation, in which values are recomputed only when necessary, and reactions are nearly instantaneous. The implementation is rooted in a new simple formulation of FRP and its semantics and so is easy to understand and reason about.

On the road to efficiency and simplicity, we'll meet some old friends (monoids, functors, applicative functors, monads, morphisms, and improving values) and make some new friends (functional future values, reactive normal form, and concurrent “unambiguous choice”).

I'm not sure exactly where to classify this submission to ICFP 2008, but I think many here will be interested in it.
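For orientation, here is the classic demand-driven ("pull") model that the abstract contrasts its push-pull implementation with, written as a few lines of OCaml rather than the paper's Haskell. The types are the usual denotational ones; run_pull and its parameters are my own names. The point is visible in the loop: the whole behavior expression is re-evaluated at every tick whether or not anything changed, which is exactly the waste the paper sets out to remove.

    type time = float
    type 'a behavior = time -> 'a              (* continuous-time value *)
    type 'a event = (time * 'a) list           (* time-ordered occurrences *)

    let time : time behavior = fun t -> t

    let lift2 f (b1 : 'a behavior) (b2 : 'b behavior) : 'c behavior =
      fun t -> f (b1 t) (b2 t)

    (* Demand-driven sampling: recompute everything at every sample point. *)
    let run_pull (b : string behavior) ~period ~steps =
      for i = 0 to steps - 1 do
        let t = float_of_int i *. period in
        print_endline (b t)                    (* recomputed every tick *)
      done

    let () =
      let label : string behavior = fun _t -> "mostly constant" in
      run_pull
        (lift2 (fun s t -> Printf.sprintf "%s @ %.1f" s t) label time)
        ~period:0.1 ~steps:5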

Relating Complexity and Precision in Control Flow Analysis

Relating Complexity and Precision in Control Flow Analysis, David Van Horn and Harry Mairson. ICFP 2007.

We analyze the computational complexity of kCFA, a hierarchy of control flow analyses that determine which functions may be applied at a given call-site. This hierarchy specifies related decision problems, quite apart from any algorithms that may implement their solutions. We identify a simple decision problem answered by this analysis and prove that in the 0CFA case, the problem is complete for polynomial time. The proof is based on a nonstandard, symmetric implementation of Boolean logic within multiplicative linear logic (MLL). We also identify a simpler version of 0CFA related to eta-expansion, and prove that it is complete for logarithmic space, using arguments based on computing paths and permutations.

For any fixed k > 0, it is known that kCFA (and the analogous decision problem) can be computed in time exponential in the program size. For k = 1, we show that the decision problem is NP-hard, and sketch why this remains true for larger fixed values of k. The proof technique depends on using the approximation of CFA as an essentially nondeterministic computing mechanism, as distinct from the exactness of normalization. When k = n, so that the "depth" of the control flow analysis grows linearly in the program length, we show that the decision problem is complete for exponential time.

In addition, we sketch how the analysis presented here may be extended naturally to languages with control operators. All of the insights presented give clear examples of how straightforward observations about linearity, and linear logic, may in turn be used to give a greater understanding of functional programming and program analysis.

There's a ton of really good stuff in here; I was particularly fascinated by the fact that 0CFA is exact for multiplicatively linear programs (i.e., programs that use each variable at most once), because linearity guarantees that every lambda can flow to at most one use site.
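To make the object of study concrete, here is a naive 0CFA for a labelled lambda calculus in OCaml. This is the textbook constraint-iteration formulation, not the paper's decision-problem encoding or its MLL construction; the labels, table names, and helpers are mine. The final example is linear, and the analysis result is a singleton, illustrating the exactness point above.

    type label = int

    type term =
      | Var of label * string
      | Lam of label * string * term
      | App of label * term * term

    let label_of = function Var (l, _) | Lam (l, _, _) | App (l, _, _) -> l

    module S = Set.Make (Int)        (* abstract values = labels of lambdas *)

    let analyze (program : term) =
      (* cache : label -> lambdas the subterm may evaluate to
         env   : variable -> lambdas that may be bound to it *)
      let cache : (label, S.t) Hashtbl.t = Hashtbl.create 16 in
      let env : (string, S.t) Hashtbl.t = Hashtbl.create 16 in
      let lams : (label, string * term) Hashtbl.t = Hashtbl.create 16 in
      let get tbl k = Option.value (Hashtbl.find_opt tbl k) ~default:S.empty in
      let changed = ref true in
      let add tbl k s =
        let old = get tbl k in
        if not (S.subset s old) then begin
          Hashtbl.replace tbl k (S.union old s);
          changed := true
        end
      in
      (* Record every lambda once, so applications can look up bodies. *)
      let rec index = function
        | Lam (l, x, body) -> Hashtbl.replace lams l (x, body); index body
        | App (_, f, a) -> index f; index a
        | Var _ -> ()
      in
      index program;
      let rec pass = function
        | Var (l, x) -> add cache l (get env x)
        | Lam (l, _, body) -> add cache l (S.singleton l); pass body
        | App (l, f, a) ->
            pass f;
            pass a;
            S.iter
              (fun lam ->
                let x, body = Hashtbl.find lams lam in
                add env x (get cache (label_of a));
                add cache l (get cache (label_of body)))
              (get cache (label_of f))
      in
      while !changed do
        changed := false;
        pass program
      done;
      (cache, env)

    (* (fun x -> x) (fun y -> y): the program is linear, so 0CFA is exact --
       the application at label 0 flows only to the lambda at label 3. *)
    let example = App (0, Lam (1, "x", Var (2, "x")), Lam (3, "y", Var (4, "y")))

    let () =
      let cache, _ = analyze example in
      match Hashtbl.find_opt cache (label_of example) with
      | Some s -> S.iter (Printf.printf "may evaluate to the lambda at label %d\n") s
      | None -> print_endline "no flow information"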

The YNot Project

The YNot Project

The goal of the Ynot project is to make programming with dependent types practical for a modern programming language. In particular, we are extending the Coq proof assistant to make it possible to write higher-order, imperative and concurrent programs (in the style of Haskell) through a shallow embedding of Hoare Type Theory (HTT). HTT provides a clean separation between pure and effectful computations, makes it possible to formally specify and reason about effects, and is fully compositional.

This was actually commented on here, by Greg Morrisett himself. Conceptually, this seems like it's related to Adam Chlipala's A Certified Type-Preserving Compiler from Lambda Calculus to Assembly Language. See, in particular, slides 23-24 of this presentation (PDF). More generally, computation and reflection seem to be gaining recognition as important features for the practical use of such powerful tools as Coq; see also SSREFLECT tactics for Coq and their accompanying paper A Small Scale Reflection Extension for the Coq system (PDF).

It's also interesting to observe that the work appears to depend on the "Program" keyword in the forthcoming Coq 8.2; the work behind it is known as "Russell," as referred to in this thread. In the meantime, Russell's main page is here. Again, the point is to simplify programming with dependent types in Coq.

Update: Some preliminary code is available from the project page.
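As a very loose OCaml caricature of the "clean separation between pure and effectful computations" mentioned in the abstract: effectful computations live in an abstract type distinct from ordinary values, so a function's type tells you whether effects can happen. This signature is my own toy, not Ynot's interface; HTT goes much further, because the pre/postconditions that are mere comments here become part of the type and are checked by Coq.

    module type STATE = sig
      type 'a st                              (* effectful computation of an 'a *)
      val return : 'a -> 'a st                (* pure value, no effects *)
      val bind : 'a st -> ('a -> 'b st) -> 'b st
      val get : int st                        (* post: result = current state *)
      val put : int -> unit st                (* post: state = argument *)
      val run : 'a st -> init:int -> 'a
    end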

Open Multi-Methods for C++

Peter Pirkelbauer, Yuriy Solodkyy, and Bjarne Stroustrup. Open Multi-Methods for C++. Proc. ACM 6th International Conference on Generative Programming and Component Engineering (GPCE). October 2007.

Multiple dispatch – the selection of a function to be invoked based on the dynamic type of two or more arguments – is a solution to several classical problems in object-oriented programming. Open multi-methods generalize multiple dispatch towards open-class extensions, which improve separation of concerns and provisions for retroactive design. We present the rationale, design, implementation, and performance of a language feature, called open multi-methods, for C++. Our open multi-methods support both repeated and virtual inheritance... our approach is simpler to use, catches more user mistakes, and resolves more ambiguities through link-time analysis, runs significantly faster, and requires less memory. In particular, the runtime cost of calling an open multimethod is constant and less than the cost of a double dispatch (two virtual function calls). Finally, we provide a sketch of a design for open multi-methods in the presence of dynamic loading and linking of libraries.

Who said C++ isn't evolving?

The discussion in section 4 of the actual implementation (using EDG) is particularly detailed, which is a bonus.
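For anyone unsure what "dispatch on the dynamic type of two or more arguments" buys you, here is the idea rendered as a closed two-argument match in OCaml (shape and intersects are invented names). The paper's contribution is making this open, so new cases can be added in separate translation units, and constant-time in C++; a closed variant match has neither property, it only shows the semantics of choosing a branch by both arguments' runtime tags, which single virtual-call dispatch cannot express without a double-dispatch encoding.

    type shape = Rectangle of float * float | Circle of float

    (* Which test runs depends on the dynamic variant of BOTH arguments. *)
    let intersects a b =
      match a, b with
      | Rectangle _, Rectangle _ -> "axis-aligned box test"
      | Circle _, Circle _ -> "compare centre distance with the sum of radii"
      | Rectangle _, Circle _ | Circle _, Rectangle _ -> "clamp the centre to the box"

    let () = print_endline (intersects (Circle 1.0) (Rectangle (2.0, 3.0)))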

Theorem proving support in programming language semantics

We describe several views of the semantics of a simple programming language as formal documents in the calculus of inductive constructions that can be verified by the Coq proof system. Covered aspects are natural semantics, denotational semantics, axiomatic semantics, and abstract interpretation. Descriptions as recursive functions are also provided whenever suitable, thus yielding a verification condition generator and a static analyser that can be run inside the theorem prover for use in reflective proofs. Extraction of an interpreter from the denotational semantics is also described. All different aspects are formally proved sound with respect to the natural semantics specification.

More work on mechanized metatheory with an eye towards naturalness of representation and automation. This seems to me to relate to Adam Chlipala's work on A Certified Type-Preserving Compiler from Lambda Calculus to Assembly Language, which relies on denotational semantics and proof by reflection, in crucial ways. More generally, it seems to reinforce an important trend in using type-theory-based theorem provers to tackle programming language design from the semantic point of view (see also A Very Modal Model of a Modern, Major, General Type System and Verifying Semantic Type Soundness of a Simple Compiler). I find the trend exciting, but of course I also wonder how far we can practically go with it today, given that the overwhelming majority of the literature, including our beloved Types and Programming Languages, is based on A Syntactic Approach to Type Soundness. Even the upcoming How to write your next POPL paper in Coq at POPL '08 centers on the syntactic approach.

Is the syntactic approach barking up the wrong tree, in the long term? The semantic approach? Both? Neither?
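To fix intuitions about the kind of object language and "semantics as a recursive function" the abstract describes, here is a tiny imperative language with its natural (big-step) semantics written as an OCaml function. In the paper this lives inside Coq and is proved sound against a relational specification; here it is plain OCaml, with a fuel parameter standing in for Coq's termination requirement. The syntax and names are my own minimal choices.

    type aexp = Num of int | Var of string | Add of aexp * aexp
    type bexp = Less of aexp * aexp
    type com =
      | Skip
      | Assign of string * aexp
      | Seq of com * com
      | If of bexp * com * com
      | While of bexp * com

    type state = (string * int) list

    let rec aeval (s : state) = function
      | Num n -> n
      | Var x -> (try List.assoc x s with Not_found -> 0)
      | Add (a1, a2) -> aeval s a1 + aeval s a2

    let beval s (Less (a1, a2)) = aeval s a1 < aeval s a2

    (* Big-step execution; None means the fuel ran out (e.g. a diverging loop). *)
    let rec exec fuel (s : state) (c : com) : state option =
      if fuel = 0 then None
      else
        match c with
        | Skip -> Some s
        | Assign (x, a) -> Some ((x, aeval s a) :: List.remove_assoc x s)
        | Seq (c1, c2) ->
            (match exec (fuel - 1) s c1 with
             | Some s' -> exec (fuel - 1) s' c2
             | None -> None)
        | If (b, c1, c2) -> exec (fuel - 1) s (if beval s b then c1 else c2)
        | While (b, body) ->
            if beval s b then exec (fuel - 1) s (Seq (body, c)) else Some s

    (* x := 0; while x < 3 do x := x + 1   ==>   x = 3 *)
    let () =
      match
        exec 100 []
          (Seq (Assign ("x", Num 0),
                While (Less (Var "x", Num 3), Assign ("x", Add (Var "x", Num 1)))))
      with
      | Some s -> Printf.printf "x = %d\n" (List.assoc "x" s)
      | None -> print_endline "ran out of fuel"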

Avi Bryant: Ruby IS-A Smalltalk

A short audio presentation (Avi speaks for less than ten minutes, I guess) about the lessons the Ruby community should learn from Smalltalk. It's mainly about turtles-all-the-way-down, but Self (fast VMs), GemStone (transactional distributed persistence), and Seaside (web frameworks) are also mentioned briefly.

Concurrency: The Compiler Writer's Perspective

A short interview with Brian Grant (of PeakStream and currently Google). From SD Times, Nov. 15, 2007.

SD Times: Is concurrency too difficult for developers accustomed to linear programming to grasp?

Brian Grant: It is challenging, but I don’t think it’s too challenging. I do think certain languages, especially C and C++, make it very challenging to write robust big multithreaded systems. Basically, they have features that are hostile to concurrency.

Does your experience in compilers give you a different perspective than the average developer?

[...]
Higher-level languages favor ease of correctness over ease of performance. For example, in functional languages such as Haskell, the code you write does not have any side effects. The model is that they don’t modify existing data; they generate new data. The advantage is that it’s easier for the compiler to reason about what different components of the code can do.
[...]
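The "generate new data instead of modifying it" point fits in a few lines; the interview is talking about Haskell, but the contrast reads the same in OCaml (prices and the 0.9 factor are just made-up example data). The pure version leaves the original array untouched, so other threads holding it can keep reading without synchronisation; the in-place version makes every concurrent reader a potential data race.

    let prices = [| 10.0; 20.0; 30.0 |]

    (* Pure: a fresh array; [prices] is unchanged and safe to share. *)
    let discounted = Array.map (fun p -> p *. 0.9) prices

    (* Effectful: updates in place; concurrent readers now race with the writer. *)
    let discount_in_place a = Array.iteri (fun i p -> a.(i) <- p *. 0.9) a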

OCaml Light: A Formal Semantics For a Substantial Subset of the Objective Caml Language

OCaml Light: a formal semantics for a substantial subset of the Objective Caml language.

OCaml light is a formal semantics for a substantial subset of the Objective Caml language. It is written in Ott, and it comprises a small-step operational semantics and a syntactic, non-algorithmic type system. A type soundness theorem has been proved and mechanized using the HOL-4 proof assistant, thereby ensuring that the proof is free from errors. To ensure that the operational semantics accurately models Objective Caml, an executable version of the semantics has been created (and proved equivalent in HOL to the original, relational version) and tested on a number of small test cases.

Note that while we have tried to make the semantics accurate, we are not part of the OCaml development team - this is in no sense a normative specification of the implemented language.

From a team including Peter Sewell (Acute, HashCaml, Ott).

I continue to believe that things are heating up nicely in mechanized metatheory, which, in the multicore/multiprocessor world in which we now live, is extremely good news.
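For readers new to the terminology, here is what "a small-step operational semantics with an executable version" looks like for a far smaller language than OCaml light covers: call-by-value lambda calculus with integers, in OCaml. step is the one-step relation written as a function and eval iterates it; the real development does this in HOL for a substantial subset of OCaml and proves type soundness on top. All names here are mine.

    type term =
      | Int of int
      | Var of string
      | Lam of string * term
      | App of term * term
      | Add of term * term

    let is_value = function Int _ | Lam _ -> true | _ -> false

    (* Naive substitution; adequate here because we only substitute closed values. *)
    let rec subst x v = function
      | Int n -> Int n
      | Var y -> if x = y then v else Var y
      | Lam (y, b) -> if x = y then Lam (y, b) else Lam (y, subst x v b)
      | App (t1, t2) -> App (subst x v t1, subst x v t2)
      | Add (t1, t2) -> Add (subst x v t1, subst x v t2)

    (* One reduction step, or None if the term is a value or is stuck. *)
    let rec step = function
      | App (Lam (x, b), v) when is_value v -> Some (subst x v b)
      | App (t1, t2) when not (is_value t1) ->
          Option.map (fun t1' -> App (t1', t2)) (step t1)
      | App (v1, t2) -> Option.map (fun t2' -> App (v1, t2')) (step t2)
      | Add (Int n1, Int n2) -> Some (Int (n1 + n2))
      | Add (t1, t2) when not (is_value t1) ->
          Option.map (fun t1' -> Add (t1', t2)) (step t1)
      | Add (v1, t2) -> Option.map (fun t2' -> Add (v1, t2')) (step t2)
      | _ -> None

    let rec eval t = match step t with Some t' -> eval t' | None -> t

    (* ((fun x -> x + 1) 41)  -->*  42 *)
    let () =
      match eval (App (Lam ("x", Add (Var "x", Int 1)), Int 41)) with
      | Int n -> Printf.printf "%d\n" n
      | _ -> print_endline "stuck"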
