Implementation

Representing Control in the Presence of First-Class Continuations

Robert Hieb, R. Kent Dybvig, and Carl Bruggeman. Representing Control in the Presence of First-Class Continuations. ACM SIGPLAN 1990 Conference on Programming Language Design and Implementation, June 1990.

Languages such as Scheme and Smalltalk that provide continuations as first-class data objects present a challenge to efficient implementation. Allocating activation records in a heap has proven unsatisfactory because of increased frame linkage costs, increased garbage collection overhead, and decreased locality of reference. However, simply allocating activation records on a stack and copying them when a continuation is created results in unbounded copying overhead. This paper describes a new approach based on stack allocation that does not require the stack to be copied when a continuation is created and that allows us to place a small upper bound on the amount copied when a continuation is reinstated. This new approach is faster than the naive stack allocation approach, and it does not suffer from the problems associated with unbounded copying. For continuation-intensive programs, our approach is at worst a constant factor slower than the heap allocation approach, and for typical programs, it is significantly faster. An important additional benefit is that recovery from stack overflow is handled gracefully and efficiently.

Many Scheme implementations do not implement call/cc particularly well, but it is important to remember that call/cc is not inherently slow. This is the classic paper describing a clever run-time representation that supports call/cc efficiently.
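
C makes the contrast concrete: setjmp/longjmp gives only a one-shot, upward-escaping continuation, which is precisely the case that plain stack allocation already handles well. What the paper makes cheap is the general case, where a continuation outlives the frames it captured and may be reinstated more than once. A minimal sketch of the easy case (illustrative C, not from the paper):

    /* A one-shot escaping continuation via setjmp/longjmp: the special
       case that a plain stack handles for free. First-class call/cc must
       also survive the captured frames' return and allow repeated
       reinstatement, which is what forces the copying and segmenting
       strategies the paper compares. */
    #include <setjmp.h>
    #include <stdio.h>

    static jmp_buf escape;                     /* the captured control point */

    static void search(const int *xs, int n, int target)
    {
        for (int i = 0; i < n; i++)
            if (xs[i] == target)
                longjmp(escape, i + 1);        /* "invoke the continuation" */
    }

    int main(void)
    {
        const int xs[] = {3, 1, 4, 1, 5, 9};
        int r = setjmp(escape);                /* "capture the continuation" */
        if (r == 0) {
            search(xs, 6, 5);
            puts("not found");
        } else {
            printf("found at index %d\n", r - 1);
        }
        return 0;
    }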

Unladen Swallow: LLVM-based Python compiler

The second release of Unladen Swallow was made available yesterday. The project plan describes the goals as follows:

We want to make Python faster, but we also want to make it easy for large, well-established applications to switch to Unladen Swallow.

  1. Produce a version of Python at least 5x faster than CPython.
  2. Python application performance should be stable.
  3. Maintain source-level compatibility with CPython applications.
  4. Maintain source-level compatibility with CPython extension modules.
  5. We do not want to maintain a Python implementation forever; we view our work as a branch, not a fork.

Of course, Google is known for writing a lot of its software in Python, and now it is trying to speed the language up, just as the V8 project has done for JavaScript.

A Reactive Model-based Programming Language for Robotic Space Explorers

Michel Ingham, Robert Ragno, and Brian Williams, A Reactive Model-based Programming Language for Robotic Space Explorers, Proceedings of the 6th International Symposium on Artificial Intelligence, Robotics and Automation in Space, Montreal, Canada, June 2001.

Model-based autonomous agents have emerged recently as vital technologies in the development of highly autonomous reactive systems, particularly in the aerospace domain. These agents utilize many automated reasoning capabilities, but are complicated to use because of the variety of languages employed for each capability. To address this problem, we introduce model-based programming, a novel approach to designing embedded software systems. In particular, we introduce the Reactive Model-based Programming Language (RMPL), which provides a framework for constraint-based modeling, as well as a suite of reactive programming constructs. To convey the expressiveness of RMPL, we show how it captures the main features of synchronous programming languages and advanced robotic execution languages.

Reactive programming has been discussed on LtU quite a bit in the past. Functional reactive programming, in particular, has received a lot of attention. But I don't think we've covered reactive model-based programming before. This is an interesting approach that combines constraint-based modeling with reactive programming, and compiles programs down to hierarchical constraint automata for execution. Whereas reactive languages like Esterel focus on manipulating signals, RMPL works by manipulating (potentially hidden) system states.

Modern dynamic linking infrastructure for PLT

Given that Unix won, I think it's interesting that dynamic languages make very little use of the dynamic linking and loading infrastructure provided by modern free Unixes such as Linux and the BSDs.

Most dynamic PLs opt to implement "dynamism" (i.e., redefining things and loading code at runtime) with application-specific data structures (e.g., in Lisp, red-black trees for uniquifying symbols, and function pointers with indirect function calls), and they do so solely at runtime, mostly using interpreters and JITs (although Scheme, one of the most advanced dynamic languages, is increasingly illuminating the possibilities of static, independent compilation of dynamic programs).

(Metaprogramming at runtime is perilous, since it is easy to mix up phase distinctions; we can expect newer dynamic programming languages to discover this in a decade, though of course we don't know which decade.)

Link-time is mostly ignored. And yet, under Linux with its heavy use of shared objects, one cannot even start a single program without invoking the dynamic linker.

But some people, even some computer programmers, don't know how linkers work and what they do. Basically, a modern linking file format, such as ELF, is a declarative way to construct the memory image of a running process, with lots of features for dynamic customization of the image construction process (ELF even contains a customization hook called "program interpreter" in every executable!).
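
The dynamic loader is not just implicit machinery, either; it is directly scriptable from user code through the standard libdl interface, which is arguably the primitive that a dynamic language's "load code at runtime" could bottom out in. A sketch using POSIX dlopen/dlsym (the soname below is the usual one for libm on Linux):

    /* Load a shared object and resolve a symbol at runtime.
       Build with: cc demo.c -ldl */
    #include <dlfcn.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        void *handle = dlopen("libm.so.6", RTLD_NOW);
        if (!handle) {
            fprintf(stderr, "dlopen: %s\n", dlerror());
            return EXIT_FAILURE;
        }

        /* Resolve a symbol in the freshly mapped image. */
        double (*cosine)(double) = (double (*)(double))dlsym(handle, "cos");
        if (!cosine) {
            fprintf(stderr, "dlsym: %s\n", dlerror());
            return EXIT_FAILURE;
        }

        printf("cos(0.0) = %f\n", cosine(0.0));
        dlclose(handle);
        return 0;
    }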

Likewise, modern compilers and runtime systems such as GNU C contain sophisticated features aimed squarely at dynamic languages: weak symbols for runtime redefinability (used by libc's malloc and in the Linux kernel, for example), computed gotos, nested functions, and increasingly, GC. And there is evidence that dynamic compilation and linking of C snippets is accepted and used in modern systems software.
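
Weak symbols, for instance, give a library link-time redefinability with no registration API at all. A sketch using the GCC/Clang attribute (the function names here are made up for illustration):

    /* lib.c: ship a weak default. The linker keeps this definition only
       if no strong definition of trace_hook appears in any other object
       file, which is the mechanism that lets applications replace
       libc's malloc. */
    #include <stdio.h>

    __attribute__((weak)) void trace_hook(const char *event)
    {
        (void)event;                 /* default: do nothing */
    }

    void do_work(void)
    {
        trace_hook("do_work");       /* call site is oblivious to which definition won */
        puts("working...");
    }

    /* user.c: an application overrides the hook simply by defining a
       strong symbol of the same name:

       void trace_hook(const char *event)
       {
           fprintf(stderr, "trace: %s\n", event);
       }
    */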

I have collected some links to these topics, and would be interested to hear of languages and systems that you know that exploit them.

(Updates: added 3 ELF links; added Drepper; added Taylor)

Types are Calling Conventions

If optimization and intermediate languages for lazy functional languages are your thing, take a look at Types are Calling Conventions by Max Bolingbroke and Simon Peyton Jones.

It is common for compilers to derive the calling convention of a function from its type. Doing so is simple and modular but misses many optimisation opportunities, particularly in lazy, higher-order functional languages with extensive use of currying. We restore the lost opportunities by defining Strict Core, a new intermediate language whose type system makes the missing distinctions: laziness is explicit, and functions take multiple arguments and return multiple results.
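
The lost opportunity is easy to mimic in C. Compiled generically from its curried type, a two-argument function takes one argument at a time and materializes a partial application in between; a type that records "two strict arguments, one result" turns the same function into a single direct call. An illustrative analogue (not the paper's Strict Core; the names are invented):

    #include <stdio.h>
    #include <stdlib.h>

    /* Generic currying: apply add to one argument at a time,
       allocating a partial-application record in the middle. */
    typedef struct { int x; } AddPap;

    static void *add_first(int x)
    {
        AddPap *p = malloc(sizeof *p);
        p->x = x;
        return p;                        /* heap traffic per argument */
    }

    static int add_second(void *pap, int y)
    {
        return ((AddPap *)pap)->x + y;   /* indirect, unknown-arity call */
    }

    /* Known calling convention "two strict arguments, one result":
       the whole application is one direct call, no allocation. */
    static int add_both(int x, int y) { return x + y; }

    int main(void)
    {
        void *p = add_first(3);
        printf("%d %d\n", add_second(p, 4), add_both(3, 4));   /* 7 7 */
        free(p);
        return 0;
    }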

LNgen

LNgen

LNgen generates locally nameless infrastructure for the Coq proof assistant from Ott-like specifications. Its output is based on the locally nameless style advocated in Engineering Formal Metatheory and includes all of the "infrastructure" lemmas associated with that style. Compared to Ott's locally nameless backend, LNgen favors generating a large collection of infrastructure lemmas over handling complex binding specifications and methods of defining syntax and relations.
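
The representation itself is small enough to state directly: bound variables are de Bruijn indices, free variables are names, and the workhorse operation is "opening" a binder's body by replacing index k with a fresh free variable. The lemmas LNgen generates are largely about operations of this shape. A C sketch (the constructors and names are illustrative, not LNgen output):

    #include <stdlib.h>

    /* Locally nameless lambda terms. */
    typedef enum { BVAR, FVAR, LAM, APP } Tag;

    typedef struct Term {
        Tag tag;
        int index;                 /* BVAR: de Bruijn index */
        const char *name;          /* FVAR: free-variable name */
        struct Term *t1, *t2;      /* LAM: body in t1; APP: t1 t2 */
    } Term;

    static Term *mk(Tag tag)
    {
        Term *t = calloc(1, sizeof *t);
        t->tag = tag;
        return t;
    }

    static Term *fvar(const char *x) { Term *t = mk(FVAR); t->name = x; return t; }

    /* open_rec k x t: substitute the free variable x for bound index k. */
    static Term *open_rec(int k, const char *x, const Term *t)
    {
        Term *r;
        switch (t->tag) {
        case BVAR:
            if (t->index == k) return fvar(x);
            r = mk(BVAR); r->index = t->index; return r;
        case FVAR:
            return fvar(t->name);
        case LAM:                              /* one more binder in scope */
            r = mk(LAM); r->t1 = open_rec(k + 1, x, t->t1); return r;
        default:                               /* APP */
            r = mk(APP);
            r->t1 = open_rec(k, x, t->t1);
            r->t2 = open_rec(k, x, t->t2);
            return r;
        }
    }

    int main(void)
    {
        Term *body = mk(BVAR);                 /* body of the identity: index 0 */
        Term *opened = open_rec(0, "x", body);
        return opened->tag == FVAR ? 0 : 1;    /* opening exposes a free variable */
    }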

There are really three stories here:

  1. Coq 8.2 shipped a while ago.
  2. Ott, a tool for PLT semantics work, has acquired a backend in support of the increasingly popular "locally nameless" representation of binding structure in mechanized programming language metatheory.
  3. LNgen is another tool, using a subset of Ott syntax, that takes a slightly different approach from Ott's new backend to addressing the same issues.

From the U. Penn folks who brought us the Coq tutorial at POPL '08.

Achieving Security Despite Compromise Using Zero-Knowledge

Achieving Security Despite Compromise Using Zero-Knowledge

One of the important challenges when designing and analyzing cryptographic protocols is the enforcement of security properties in the presence of compromised participants. This paper presents a general technique for strengthening cryptographic protocols in order to satisfy authorization policies despite participant compromise. The central idea is to automatically transform the original cryptographic protocols by adding non-interactive zero-knowledge proofs. Each participant proves that the messages sent to the other participants are generated in accordance with the protocol. The zero-knowledge proofs are forwarded to ensure the correct behavior of all participants involved in the protocol, without revealing any secret data. We use an enhanced type system for zero-knowledge to verify that the transformed protocols conform to their authorization policy even if some participants are compromised. Finally, we developed a tool that automatically generates ML implementations of protocols based on zero-knowledge proofs. The protocol transformation, the verification, and the generation of protocol implementations are fully automated.

This is the follow-up to this story. The prior work did not account for compromised participants. This work does.

I continue to be excited about the prospect of this previous story's work being applied to the type system described in this story, possibly resulting in an awesome new language for developing secure software.

A Generic Type-and-Effect System

A Generic Type-and-Effect System

Type-and-effect systems are a natural approach for statically reasoning about a program's execution. They have been used to track a variety of computational effects, for example memory manipulation, exceptions, and locking. However, each type-and-effect system is typically implemented as its own monolithic type system that hard-codes a particular syntax of effects along with particular rules to track and control those effects. We present a generic type-and-effect system, which is parameterized by the syntax of effects to track and by two functions that together specify the effect discipline to be statically enforced. We describe how a standard form of type soundness is ensured by requiring these two functions to obey a few natural monotonicity requirements. We demonstrate that several effect systems from the literature can be viewed as instantiations of our generic type system. Finally, we describe the implementation of our type-and-effect system and mechanically checked type soundness proof in the Twelf proof assistant.
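
The parameterization is concrete enough to sketch: represent effects as a small set, and supply the two functions, one adjusting the ambient privileges on entering a construct and one checking whether an operation's effects are permitted under them. A bitset caricature in C (the effect names and the "no throwing inside an atomic region" discipline are invented for illustration; the monotonicity requirements the paper imposes on these two functions are what make soundness go through):

    #include <stdbool.h>
    #include <stdio.h>

    typedef unsigned Effects;                  /* a set of primitive effects */
    enum { EFF_READ = 1, EFF_WRITE = 2, EFF_THROW = 4 };

    /* The generic system is parameterized by these two operations. */
    typedef struct {
        Effects (*adjust)(Effects ambient);    /* privileges inside a construct */
        bool    (*check)(Effects ambient, Effects op);
    } Discipline;

    /* Instance: entering an atomic region revokes the THROW privilege;
       an operation checks iff its effects are a subset of the ambient ones. */
    static Effects atomic_adjust(Effects ambient) { return ambient & ~EFF_THROW; }
    static bool subset_check(Effects ambient, Effects op) { return (op & ~ambient) == 0; }

    int main(void)
    {
        Discipline d = { atomic_adjust, subset_check };
        Effects top = EFF_READ | EFF_WRITE | EFF_THROW;
        Effects inside = d.adjust(top);

        printf("write inside region: %s\n", d.check(inside, EFF_WRITE) ? "ok" : "rejected");
        printf("throw inside region: %s\n", d.check(inside, EFF_THROW) ? "ok" : "rejected");
        return 0;
    }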

Things really seem to be heating up in both the type-and-effects world and the mechanized metatheory world (but then, long-time LtU readers will expect me to claim that, won't they?). Of course, I also have my usual question as to what aspects of Twelf these researchers found helpful to their constructions vs. those of Coq.

The Art of the Propagator

The Art of the Propagator, Alexey Radul and Gerald Jay Sussman.

We develop a programming model built on the idea that the basic computational elements are autonomous machines interconnected by shared cells through which they communicate. Each machine continuously examines the cells it is interested in, and adds information to some based on deductions it can make from information from the others. This model makes it easy to smoothly combine expression-oriented and constraint-based programming; it also easily accommodates implicit incremental distributed search in ordinary programs. This work builds on the original research of Guy Lewis Steele Jr. and was developed more recently with the help of Chris Hanson.
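
The model reads almost directly as code. A cell holds possibly-absent information; a propagator examines its cells and adds what it can deduce; the network runs to quiescence. A minimal one-propagator sketch in C (my own toy rendering, not the authors' Scheme system), with a bidirectional adder that enforces a + b = c:

    #include <stdbool.h>
    #include <stdio.h>

    typedef struct { bool known; double value; } Cell;

    /* Merge information into a cell; report whether anything new was
       learned. (A real system would also detect contradictions.) */
    static bool tell(Cell *c, double v)
    {
        if (c->known) return false;
        c->known = true;
        c->value = v;
        return true;
    }

    /* The "autonomous machine": deduce any one of a, b, c from the other two. */
    static bool adder(Cell *a, Cell *b, Cell *c)
    {
        bool changed = false;
        if (a->known && b->known) changed |= tell(c, a->value + b->value);
        if (a->known && c->known) changed |= tell(b, c->value - a->value);
        if (b->known && c->known) changed |= tell(a, c->value - b->value);
        return changed;
    }

    int main(void)
    {
        Cell a = {0}, b = {0}, c = {0};
        tell(&a, 3.0);
        tell(&c, 10.0);                /* information flows "backwards" too */
        while (adder(&a, &b, &c))      /* run the network to quiescence */
            ;
        if (b.known) printf("b = %g\n", b.value);   /* prints b = 7 */
        return 0;
    }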

I just ran across this tech report. I haven't read it yet, but the subject is particularly well-timed for me, since I just finished a correctness proof for a simple FRP system implemented via imperative dataflow graphs, and so constraint propagation has been much on my mind recently. It's pretty clear that constraint propagation can do things that FRP doesn't, but it's not so clear to me whether this is a case of "more expressiveness" or "more fragile abstractions".

A Tiny Computer

A Tiny Computer. An unpublished memo by Chuck Thacker, Microsoft Research, 3 September 2007. Posted with permission.

Alan Kay recently posed the following problem:

"I'd like to show JHS and HS kids 'the simplest non-tricky architecture' in which simple gates and flipflops manifest a programmable computer”.

Alan posed a couple of other desiderata, primarily that the computer needs to demonstrate fundamental principles, but should be capable of running real programs produced by a compiler. This introduces some tension into the design, since simplicity and performance sometimes are in conflict.

This sounded like an interesting challenge, and I have a proposed design.

Presents the design of a complete CPU in under two pages of FPGA-ready Verilog. The TC3 is a Harvard-architecture 32-bit RISC with 1KB of instruction memory, 1KB of data memory, and 128 general-purpose registers. This design is an ancestor of the DDR2 DRAM Controller used in the BEE3.
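
For a feel of how little a "simplest non-tricky" machine needs, here is a software model in C with the TC3's stated sizes (1KB instruction memory, 1KB data memory, 128 registers). The instruction encoding is invented for the sketch and is not Thacker's actual ISA:

    #include <stdint.h>
    #include <stdio.h>

    static uint32_t imem[256];   /* 1KB: 256 x 32-bit instructions (Harvard split) */
    static uint32_t dmem[256];   /* 1KB of data */
    static uint32_t reg[128];    /* general-purpose registers */

    enum { LOADI, ADD, SUB, LOAD, STORE, JNZ, HALT };

    int main(void)
    {
        /* Toy program: r1 = 5; r2 = 7; r3 = r1 + r2; halt. */
        imem[0] = LOADI << 24 | 1 << 16 | 5;
        imem[1] = LOADI << 24 | 2 << 16 | 7;
        imem[2] = ADD   << 24 | 3 << 16 | 1 << 8 | 2;
        imem[3] = HALT  << 24;

        for (uint32_t pc = 0;;) {
            /* Fetch, then decode opcode and register fields. */
            uint32_t ins = imem[pc++ & 0xff];
            uint32_t op = ins >> 24, rd = (ins >> 16) & 0x7f,
                     ra = (ins >> 8) & 0x7f, rb = ins & 0x7f;
            switch (op) {
            case LOADI: reg[rd] = ins & 0xffff;          break;
            case ADD:   reg[rd] = reg[ra] + reg[rb];     break;
            case SUB:   reg[rd] = reg[ra] - reg[rb];     break;
            case LOAD:  reg[rd] = dmem[reg[ra] & 0xff];  break;
            case STORE: dmem[reg[ra] & 0xff] = reg[rd];  break;
            case JNZ:   if (reg[rd]) pc = ins & 0xff;    break;
            case HALT:  printf("r3 = %u\n", reg[3]);     return 0;
            }
        }
    }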

To help us into the brave new world of Hardware/Software co-design!
