DSL

Translation of Tree-processing Programs into Stream-processing Programs based on Ordered Linear Types

Translation of Tree-processing Programs into Stream-processing Programs based on Ordered Linear Types, Koichi Kodama, Kohei Suenaga, Naoki Kobayashi, JFP 2008.

There are two ways to write a program for manipulating tree-structured data such as XML documents: One is to write a tree-processing program focusing on the logical structure of the data and the other is to write a stream-processing program focusing on the physical structure. While tree-processing programs are easier to write than stream-processing programs, tree-processing programs are less efficient in memory usage since they use trees as intermediate data. Our aim is to establish a method for automatically translating a tree-processing program to a stream-processing one in order to take the best of both worlds.

We first define a programming language for processing binary trees and a type system based on ordered linear types, and show that every well-typed program can be translated to an equivalent stream-processing program. We then extend the language and the type system to deal with XML documents. We have implemented an XML stream processor generator based on our algorithm, and obtained promising experimental results.

Linear logic is what you get when you give up the structural rules of weakening and contraction -- it's logic when you cannot reuse or forget any hypothesis you make in the course of a proof. This means that every formula is used exactly once, which makes it useful (via Curry-Howard) for programming, as well: linear types give us a way of tracking resources and state in a type system. Intuitively, you can think of it as a kind of static analysis that ensures that an object's runtime reference count will always be one.

However, linear logic retains the structural rule of exchange, which lets us use hypotheses in a different order than we introduced them. Ordered logic is what you get when you drop exchange, as well. (Amusingly, it was invented long before linear logic -- in the 1950s, Lambek introduced it with an eye towards applications in linguistics: he wanted to express the difference between one word occurring either before, or after, another.)

This means that you can use ordered logic to express the order in your types the order in which you use data, as well, which the authors here use to design a language whose typechecker statically rules out memory-inefficient programs.

Going functional on exotic trades

Simon Frankau, Diomidis Spinellis, Nick Nassuphis, Christoph Burgard. Going functional on exotic trades. JFP 19(01):27-45.

The Functional Payout Framework (fpf) is a Haskell application that uses an embedded domain-specific functional language to represent and process exotic financial derivatives. Whereas scripting languages for pricing exotic derivatives are common in banking, fpf uses multiple interpretations to not only price such trades, but also to analyse the scripts to provide lifecycle support and more. This paper discusses fpf in relation to the wider trading workflow and our experiences in using a functional language in such a system as both an implementation language and a domain-specific language.

Section 3 is a nice discussion of why Haskell was chosen. Section 4 illustrates one of the major benefits of using DSLs, namely that different backends can be applied to programs written in the DSL, allowing various types of runtime behavior and analysis without the need to re-code each program for each purpose (thus n+m, rather than n*m programs).

Another topic that may be worth discussing is the authors' dislike of point free style.

What we really need, of course, are DSLs for financial regulation (and possibly for specifying various definitions of honesty), but that's a separate issue...

Purpose-Built Languages

Mike Shapiro, Purpose-Built Languages, ACM Queue vol. 7, no. 1, February 2009.

Mike Shapiro from Sun Microsystems examines the evolution of what he calls "purpose-built languages" (essentially domain-specific languages). Shapiro discusses the way that such languages have often been a key part of the development of larger software systems, and describes how many of them have sprung into existence and evolved without formal design. In particular, he traces the evolution of the adb debugger language. As Shapiro says:

The debugger tale illustrates that a little purpose-built language can evolve essentially at random, have no clear design, no consistent grammar or parser, and no name, and yet endure and grow in shipping operating systems for more than 40 years. In the same time period, many mainstream languages came and went into the great beyond... For purpose-built languages, a deep connection to a task and the user community for that task is often worth more than clever design or elegant syntax.

The same article also appeared in the April 2009 issue of Communications of the ACM.

The Art of the Propagator

The Art of the Propagator, Alexey Radul and Gerald Jay Sussman.

We develop a programming model built on the idea that the basic computational elements are autonomous machines interconnected by shared cells through which they communicate. Each machine continuously examines the cells it is interested in, and adds information to some based on deductions it can make from information from the others. This model makes it easy to smoothly combine expression-oriented and constraint-based programming; it also easily accommodates implicit incremental distributed search in ordinary programs. This work builds on the original research of Guy Lewis Steele Jr. and was developed more recently with the help of Chris Hanson.

I just ran across this tech report. I haven't read it yet, but the subject is particularly well-timed for me, since I just finished a correctness proof for a simple FRP system implemented via imperative dataflow graphs, and so constraint propagation has been much on my mind recently. It's pretty clear that constraint propagation can do things that FRP doesn't, but it's not so clear to me whether this is a case of "more expressiveness" or "more fragile abstractions".

D is for Domain and Declarative

The list of accepted papers is out for the IFIP Working Conference on Domain Specific Languages. Happily for me, the program reveals much interest in languages for reasoning, decision making, and search. Even among people who are not my coauthors. :)

Declarative programming tends to attract skepticism because it has the reputation of poor and hard-to-control performance. The approach of DSL embedding appears to ameliorate this problem, and the success of SAT solvers appears to chip away at this reputation.

Meanwhile, the call for papers is out for Principles and Practice of Declarative Programming 2009, which has a venerable program committee. The submission deadline is May 7.

Swift: making web applications secure by construction

Swift is a language-based approach to building web applications that are secure by construction. Swift applications are written in the Jif language, a Java-based language that incorporates "security-typing" to manage the flow of information within an application. The Swift compiler automatically partitions the application code into a client-side JavaScript application and a server-side Java application, with code placement constrained by declarative information flow policies that strongly enforce the confidentiality and integrity of server-side information.

Swift was recently featured in the "Research Highlights" section of the Communications of the ACM, as a condensed version of an earlier conference paper. The original conference paper is Stephen Chong, Jed Liu, Andrew C. Myers, Xin Qi, K. Vikram, Lantian Zheng, and Xin Zheng, Secure web applications via automatic partitioning, Proceedings of the 21st ACM Symposium on Operating Systems Principles (SOSP'07), pages 31–44, October 2007.

Jif has been mentioned previously on LtU here and here.

A Foundation for Flow-Based Program Matching Using Temporal Logic and Model Checking

A Foundation for Flow-Based Program Matching Using Temporal Logic and Model Checking, Julien Brunel, Damien Doliguez, René Rydhof Hansen, Julia Lawall, Gilles Muller. POPL 2009.

Reasoning about program control-flow paths is an important functionality of a number of recent program matching languages and associated searching and transformation tools. Temporal logic provides a well-defined means of expressing properties of control-flow paths in programs, and indeed an extension of the temporal logic CTL has been applied to the problem of specifying and verifying the transformations commonly performed by optimizing compilers.

Nevertheless, in developing the Coccinelle program transformation tool for performing Linux collateral evolutions in systems code, we have found that existing variants of CTL do not adequately support rules that transform subterms other than the ones matching an entire formula. Being able to transform any of the subterms of a matched term seems essential in the domain targeted by Coccinelle. In this paper, we propose an extension to CTL named CTL-VW (CTL with variables and witnesses) that is a suitable basis for the semantics and implementation of the Coccinelle’s program matching language.

Our extension to CTL includes existential quantification over program fragments, which allows metavariables in the program matching language to range over different values within different control-flow paths, and a notion of witnesses that record such existential bindings for use in the subsequent program transformation process. We formalize CTL-VW and describe its use in the context of Coccinelle. We then assess the performance of the approach in practice, using a transformation rule that fixes several reference count bugs in Linux code

The Coccinelle tool is quite fun to play with. You write things that look like the output of patch, only with some extra patterns and boolean conditions in it, and the tool will go over your C source, find all the source code that matches it, and apply all the changes you've specified. It's open source and available online.

The theory described in this paper is quite fun, too -- the algorithms they describe are (surprisingly) not too complicated and apparently quite speedy.

Ensuring Correct-by-Construction Resource Usage by using Full-Spectrum Dependent Types

Ensuring Correct-by-Construction Resource Usage by using Full-Spectrum Dependent Types

Where it has been done at all, formally demonstrating the correctness of functional programs has historically focused on proving essential properties derived from the functional specification of the software. In this paper, we show that correct-by-construction software development can also handle the equally important class of extra-functional properties, namely the correct usage of resources. We do this using a novel embedded domain-specific language approach that exploits the capabilities of full-spectrum dependent types. Our approach provides significant benefits over previous approaches based on less powerful type systems in reducing notational overhead, and in simplifying the process of formal proof.

More ammunition for the importance of embedded domain-specific languages, dependent types, and correctness-by-construction.

Ziggurat

Ziggurat is a framework for writing extensible programming languages, and for defining static semantics for those languages. In other words, it is a language designer's toolkit.

Ziggurat is based on macros. When building a language using Ziggurat, it is easy to make that language extensible by adding a macro system. Ziggurat macros allow for incremental extension of the language by rewriting. What makes Ziggurat different from other macro systems is that Ziggurat allows the language extender to optionally define static semantics for her new language, and connect these static semantics amongst language levels. This makes it possible to write specialized analysis algorithms for the higher-level languages, either for optimization purposes, profiling purposes, debugging purposes, or whatever task analysis is put to.

Strangely enough this project from Olin Shivers and David Fisher was not mentioned here before.

Those with access may want to check out the paper on Ziggurat in the September 2008 double issue of JFP.

The programming languages behind "the mother of all demos"

On December 9, 1968, Dr. Douglas C. Engelbart and the Augmentation Research Center (ARC) at Stanford Research Institute staged a 90-minute public multimedia demonstration at the Fall Joint Computer Conference in San Francisco. It was the world debut of personal and interactive computing: for the first time, the public saw a computer mouse, which controlled a networked computer system to demonstrate hypertext linking, real-time text editing, multiple windows with flexible view control, cathode display tubes, and shared-screen teleconferencing.

To commemorate this famous event, commonly known as the mother of all demos, SRI held a 40th anniversary celebration at Stanford today. As a small tribute to the innovative ideas that made up the demo, it is befitting to mention some of the programming languages that were used by Engelbart's team. A few were mentioned in passing in the event today, making me realize that they are not that widely known.

The Tree Meta Language was used for describing translators, which were produced by the Tree Meta compiler-compiler. MOL940 ("Machine Oriented Language" for the SDS 940) was an Algol-like high level language for system programming which allowed the programmer to switch to machine-level coding where necessary. Alas (and ironically), I have not found the primary documents about these languages online. Section IV of Engelbart's Study for the development of Human Augmentation Techniques gives an account of the language and tools that were used in the project, and includes an example giving the metalanguage description for part of the Control Language. Figure 8 in in this document is a useful overview of the system and the compilers and compiler compilers used to build it. The tech report Development of a Multidisplay, Time-Shared Computer Facility and Computer-Augmented Management-System Research (only the abstract of which is online) also mentions "four Special-Purpose Languages (SPL's), for high-level specification of user control functions" which sound intriguing. The tech report specifying MOL 940 is also apparently not available online.

If I understood what Andries van Dam said, the Language for Systems Development (LSD) developed at Brown, which targeted OS/360 and was based on PL/I, was influenced by the work of Engelbart's team. They were also claiming to have built the first (or one of the first) cross-compiler.

When asked about prior work that influenced them, SNOBOL was mentioned as an important influence. The influence the demo had on programming languages was manifested by having Alan Kay's talk conclude the event (he did not mention Smalltalk once in his talk, by the way, but it was mentioned a couple of times earlier in the day).

XML feed