Implementation

Modern dynamic linking infrastructure for PLT

Given that Unix won, I think it's interesting that dynamic languages make very little use of the dynamic linking and loading infrastructure provided by modern free Unixes such as Linux and the BSDs.

Most dynamic PLs opt to implement "dynamism" (i.e. redefining stuff, loading code at runtime) with application-specific data structures (e.g. Lisp: red-black trees for uniquifying symbols, function pointers and indirect function calls), and they do so solely at runtime (mostly using interpreters and JITs, although Scheme, one of the most advanced dynamic languages, is increasingly illuminating the possibilities of static, independent compilation of dynamic programs).

(Metaprogramming at runtime is perilous, as it is easy to mix up phase distinctions, something we can expect newer dynamic programming languages to discover in a decade (of course we don't know which decade.))

Link-time is mostly ignored. And yet, under Linux with its heavy use of shared objects, one cannot even start a single program without invoking the dynamic linker.

But some people, even some computer programmers, don't know how linkers work and what they do. Basically, a modern linking file format, such as ELF, is a declarative way to construct the memory image of a running process, with lots of features for dynamic customization of the image construction process (ELF even contains a customization hook called "program interpreter" in every executable!).

Likewise, modern compilers and runtime systems such as GNU C contain sophisticated features aimed squarely at dynamic languages: weak symbols for runtime redefinability (used by libc's malloc and in the Linux kernel, for example), computed gotos, nested functions, and increasingly, GC. And there is evidence that dynamic compilation and linking of C snippets is accepted and used in modern systems software.

I have collected some links to these topics, and would be interested to hear of languages and systems that you know that exploit them.

(Updates: added 3 ELF links; added Drepper; added Taylor)

Types are Calling Conventions

If optimization and intermediate languages for lazy functional languages are your things, take a look at Types are Calling Conventions by Max Bolingbroke and Simon Peyton Jones.

It is common for compilers to derive the calling convention of a function from its type. Doing so is simple and modular but misses many optimisation opportunities, particularly in lazy, higher-order functional languages with extensive use of currying. We restore the lost opportunities by defining Strict Core, a new intermediate language whose type system makes the missing distinctions: laziness is explicit, and functions take multiple arguments and return multiple results.

LNGen

LNGen

LNgen generates locally nameless infrastructure for the Coq proof assistant from Ott-like specifications. Its output is based on the locally nameless style advocated in Engineering Formal Metatheory and includes all of the "infrastructure" lemmas associated with that style. Compared to Ott's locally nameless backend, LNgen favors generating a large collection of infrastructure lemmas over handling complex binding specifications and methods of defining syntax and relations.

There are really three stories here:

  1. Coq 8.2 shipped a while ago.
  2. Ott, a tool for PLT semantics work, has acquired a backend in support of the increasingly-popular "locally nameless" representation of binding structure in mechanized programming language metatheory.
  3. LNGen is another tool, using a subset of Ott syntax, that takes a slightly different approach from Ott's new backend to addressing the same issues.

From the U. Penn folks who brought us the Coq tutorial at POPL '08.

Achieving Security Despite Compromise Using Zero-Knowledge

Achieving Security Despite Compromise Using Zero-Knowledge

One of the important challenges when designing and analyzing cryptographic protocols is the enforcement of security properties in the presence of compromised participants. This paper presents a general technique for strengthening cryptographic protocols in order to satisfy authorization policies despite participant compromise. The central idea is to automatically transform the original cryptographic protocols by adding non-interactive zero-knowledge proofs. Each participant proves that the messages sent to the other participants are generated
in accordance to the protocol. The zero-knowledge proofs are forwarded to ensure the correct behavior of all participants involved in the protocol, without revealing any secret data. We use an enhanced type system for zero-knowledge to verify that the transformed protocols conform to their authorization policy even if some participants are compromised. Finally, we developed a tool that automatically generates ML implementations of protocols based on zero-knowledge proofs. The protocol transformation, the verification, and the generation of protocol implementations are fully automated.

This is the follow-up to this story. The prior work did not account for compromised participants. This work does.

I continue to be excited about the prospect of this previous story's work being applied to the type system described in this story, possibly resulting in an awesome new language for developing secure software.

A Generic Type-and-Effect System

A Generic Type-and-Effect System

Type-and-effect systems are a natural approach for statically reasoning about a program's execution. They have been used to track a variety of computational effects, for example memory manipulation, exceptions, and locking. However, each type-and-effect system is typically implemented as its own monolithic type system that hard-codes a particular syntax of effects along with particular rules to track and control those effects. We present a generic type-and-effect system, which is parameterized by the syntax of effects to track and by two functions that together specify the effect discipline to be statically enforced. We describe how a standard form of type soundness is ensured by requiring these two functions to obey a few natural monotonicity requirements. We demonstrate that several effect systems from the literature can be viewed as instantiations of our generic type system. Finally, we describe the implementation of our type-and-effect system and mechanically checked type soundness proof in the Twelf proof assistant.

Things really seem to be heating up in both the type-and-effects world and the mechanized metatheory world (but then, long-time LtU readers will expect me to claim that, won't they)? Of course, I also have my usual question as to what aspects of Twelf these researchers found helpful to their constructions vs. those of Coq.

The Art of the Propagator

The Art of the Propagator, Alexey Radul and Gerald Jay Sussman.

We develop a programming model built on the idea that the basic computational elements are autonomous machines interconnected by shared cells through which they communicate. Each machine continuously examines the cells it is interested in, and adds information to some based on deductions it can make from information from the others. This model makes it easy to smoothly combine expression-oriented and constraint-based programming; it also easily accommodates implicit incremental distributed search in ordinary programs. This work builds on the original research of Guy Lewis Steele Jr. and was developed more recently with the help of Chris Hanson.

I just ran across this tech report. I haven't read it yet, but the subject is particularly well-timed for me, since I just finished a correctness proof for a simple FRP system implemented via imperative dataflow graphs, and so constraint propagation has been much on my mind recently. It's pretty clear that constraint propagation can do things that FRP doesn't, but it's not so clear to me whether this is a case of "more expressiveness" or "more fragile abstractions".

A Tiny Computer

A Tiny Computer. An unpublished memo by Chuck Thacker, Microsoft Research, 3 September 2007. Posted with permission.

Alan Kay recently posed the following problem:

"I'd like to show JHS and HS kids 'the simplest non-tricky architecture' in which simple gates and flipflops manifest a programmable computer”.

Alan posed a couple of other desiderata, primarily that the computer needs to demonstrate fundamental principles, but should be capable of running real programs produced by a compiler. This introduces some tension into the design, since simplicity and performance sometimes are in conflict.

This sounded like an interesting challenge, and I have a proposed design.

Presents the design of a complete CPU in under two pages of FPGA-ready Verilog. The TC3 is a Harvard architecture 32-bit RISC with 1KB of instruction memory, 1KB of data memory, and 128 general-purpose registers. This design is an ancestor of the DDR2 DRAM Controller used in the BEE3.

To help us into the brave new world of Hardware/Software co-design!

Ensuring Correct-by-Construction Resource Usage by using Full-Spectrum Dependent Types

Ensuring Correct-by-Construction Resource Usage by using Full-Spectrum Dependent Types

Where it has been done at all, formally demonstrating the correctness of functional programs has historically focused on proving essential properties derived from the functional specification of the software. In this paper, we show that correct-by-construction software development can also handle the equally important class of extra-functional properties, namely the correct usage of resources. We do this using a novel embedded domain-specific language approach that exploits the capabilities of full-spectrum dependent types. Our approach provides significant benefits over previous approaches based on less powerful type systems in reducing notational overhead, and in simplifying the process of formal proof.

More ammunition for the importance of embedded domain-specific languages, dependent types, and correctness-by-construction.

Equality Saturation: A New Approach to Optimization

Equality Saturation: A New Approach to Optimization, Ross Tate, Michael Stepp, Zachary Tatlock, Sorin Lerner, POPL 2009.

Optimizations in a traditional compiler are applied sequentially, with each optimization destructively modifying the program to produce a transformed program that is then passed to the next optimization.

We present a new approach for structuring the optimization phase of a compiler. In our approach, optimizations take the form of equality analyses that add equality information to a common intermediate representation. The optimizer works by repeatedly applying these analyses to infer equivalences between program fragments, thus saturating the intermediate representation with equalities. Once saturated, the intermediate representation encodes multiple optimized versions of the input program.

At this point, a profitability heuristic picks the final optimized program from the various programs represented in the saturated representation. Our proposed way of structuring optimizers has a variety of benefits over previous approaches: our approach obviates the need to worry about optimization ordering, enables the use of a global optimization heuristic that selects among fully optimized programs, and can be used to perform translation validation, even on compilers other than our own.

We present our approach, formalize it, and describe our choice of intermediate representation. We also present experimental results showing that our approach is practical in terms of time and space overhead, is effective at discovering intricate optimization opportunities, and is effective at performing translation validation for a realistic optimizer.

I thought this was one of the more interesting papers at POPL this year. The idea of tackling the phase ordering problem by splitting the problem into two steps --- first computing classes of equivalent programs, and second picking the best member of the equivalence class --- is very clever.

Dana

Luke Palmer and Nick Szabo can shoot me for this if they want, but I think this is warranted, and I want to connect a couple of dots as well. Luke is one of a number of computer scientists, with Conal Elliott probably being the best known, who have devoted quite a bit of attention to Functional Reactive Programming, or FRP. FRP has been discussed on LtU off and on over the years, but, unusually for LtU IMHO, does not seem to have gotten the traction that some other similarly abstruse subjects have.

In parallel, LtU has had a couple of interesting threads about Second Life's economy, smart contracts, usage control, denial of service, technical vs. legal remedies, and the like. I would particularly like to call attention to this post by Nick Szabo, in which he discusses a contract language that he designed:

Designing the contract language radically changed my idea of what program flow and instruction pointers can be. Its successor, a general-purpose programming language, may thereby make event-oriented programming and concurrency far easier. The language is targeted at GUI programming as well as smart contracts, real-time, and workflow programming. My answer to the problems of concurrency and event handling is to make the ordering of instructions the syntactic and semantic core of the language. The order of execution and event handlers are the easiest things to express in the language rather than kludged add-ons to a procedural or functional language.

In recent private correspondence, Nick commented that he'd determined that he was reinventing synchronous programming à la Esterel, and mentioned "Reactive" programming.

Ding!

To make a potentially long entry somewhat shorter, Luke is working on a new language, Dana, which appears to have grown out of some frustration with existing FRP systems, including Conal Elliot's Reactive, currently perhaps the lynchpin of FRP research. Luke's motivating kickoff post for the Dana project can be found here, and there are several follow-up posts, including links to experimental source code repositories. Of particularly motivating interest, IMHO, is this post, where Luke discusses FRP's interaction with garbage collection succinctly but nevertheless in some depth. Luke's most recent post makes the connection from Dana, which Luke has determined needs to have a dependently-typed core, to Illative Combinatory Logic, explicit, and offers a ~100 line type checker for the core.

I find this very exciting, as I believe strongly in the project of being able to express computation centered on time, in the sense of Nick's contract language, in easy and safe ways extremely compelling. I've intuited for some time now that FRP represents a real breakthrough in moving us past the Von Neumann runtime paradigm in fundamental ways, and between Conal Elliott's and Luke's work (and no doubt that of others), it seems to me that my sense of this may be borne out, with Nick's contract language, or something like it, as an initial application realm.

So I wanted to call attention to Luke's work, and by extension recapitulate Conal's and Nick's, both for the PLT aspects that Luke's clearly represents, but also as a challenge to the community to assist in the realization of Nick's design efforts, if at all possible.

XML feed