DryadLINQ

The goal of DryadLINQ is to make distributed computing on large compute cluster simple enough for ordinary programmers. DryadLINQ combines two important pieces of Microsoft technology: the Dryad distributed execution engine and the .NET Language Integrated Query (LINQ).

I don't think this project was mentioned here before.

History of Logic Programming: What went wrong, What was done about it, and What it might mean for the future

Carl Hewitt is speaking tomorrow at Stanford's CSLI CogLunch on the history of logic programming.

A paper is here, so LtU readers can offer their perspectives on the argument.

Language geek at the Maker Faire

Maker Faire was fun, but you can read all about it on numerous web sites and blogs. While I enjoyed the Coke + Mentos demonstration like everyone else, some things caught my eye in particular and may also amuse LtU readers.

Talking to the guys demoing the CNC machines I discovered G Code which turns out to be the main machine languages used to control the CNC machines. It was cool to meet people who actually wrote their own software to emit or/consume G codes (the styrofoam CNC machine was way cool).

I also enjoyed The Art of Motion Control sculpture, since the text said "Path designs using custom LISP routines running within AutoCAD." I think that was the only one explicitly mentioning Lisp. At least, that was the only one I saw...

Any cool language references I missed?

Automatic Generation of Peephole Superoptimizers

Automatic Generation of Peephole Superoptimizers, Sorav Bansal and Alex Aiken, ASPLOS 2006.

Peephole optimizers are typically constructed using human-written pattern matching rules, an approach that requires expertise and time, as well as being less than systematic at exploiting all opportunities for optimization. We explore fully automatic construction of peephole optimizers using brute force superoptimization. While the optimizations discovered by our automatic system may be less general than human-written counterparts, our approach has the potential to automatically learn a database of thousands to millions of optimizations, in contrast to the hundreds found in current peephole optimizers. We show experimentally that our optimizer is able to exploit performance opportunities not found by existing compilers; in particular, we show speedups from 1.7 to a factor of 10 on some compute intensive kernels over a conventional optimizing compiler.

It's always fun to see a method that ought to be intractable rendered tractable through a combination of cleverness and strategically applied brute force.

I found their performance measurements suggestive but perplexing. Their results on their kernels are really astoundingly good, which they claim that is because they make use of the x86's SIMD instructions. But there is very little change in the SPEC benchmarks, and they subsequently suggest that their peephole optimizer catches many things that other compilers get via dataflow optimization. As a result, I don't feel like I have a clear picture of what their optimizer does and doesn't do, or where it does and doesn't overlaps with existing optimizations. But they definitely have enough to convince me that this is worth further study, so I really hope Bonsal and Aiken publish some more about this -- a tech report or journal paper with more systematic measurements and in-depth analysis would be really illuminating.

Arrows generalise monads and idioms

Two fresh papers from the Edinburgh theory stable:
  • Lindley, Wadler & Yallop, 2008. The Arrow Calculus, (Functional Pearl) (submitted to ICFP).
  • Lindley, Wadler & Yallop, 2008. Idioms are oblivious, arrows are meticulous, monads are promiscuous (submitted to MSFP)
    We revisit the connection between three notions of computation: Moggi’s monads, Hughes’s arrows and McBride and Paterson’s idioms (also called applicative functors ). We show that idioms are equivalent to arrows that satisfy the type isomorphism A ∼> B ≅ 1 ∼> (A -> B) and that monads are equivalent to arrows that satisfy the type isomorphism A ∼> B ≅A → (1 ∼> B). Further, idioms embed into arrows and arrows embed into monads.
The first paper introduce a reformulation of the Power/Thielecke/Paterson/McBride axiomatisation of arrows, which the authors argue is more natural, and shows that arrows generalise both monads and idioms. The second paper studies the relationships between the three formalisations in more formal depth; in particular the results about applicative functors struck me as significant.

A located lambda calculus

A located lambda calculus. Ezra Cooper and Philip Wadler. Submitted to ICFP 2008.

Several recent language designs have offered a unified language for programming a distributed system; we call these "location-aware" languages. These languages provide constructs that allow the programmer to control the location (the choice of host, for example) where a piece of code should run, which can be useful for security or performance reasons. On the other hand, a central mantra of web engineering insists that web servers should be "stateless": that no "session state" should be maintained on behalf of individual clients---that is, no state that pertains to the particular point of the interaction at which a client program resides. Thus far, most implementations of unified location-aware languages have ignored this precept, usually keeping a process for each client running on the server, or otherwise storing state information in memory. We show how to implement a location-aware language on top of the stateless-server model.

This paper is technical, and I assume most LtU members will mainly read sections 1, 5 & 6. Figure 5 is definition of the located LC.

COLA Brainfuck

From the Software Architecture Group at the Hasso Plattner Institut:

Our tutorial on COLA provides insight on how programming languages can be implemented using the combined abstractions and an implementation of parsing expression grammars in COLA. The "esoteric" programming language brainfuck was chosen for its simplicity, which allows for concentrating on COLA's features.

Previously: COLA and Open, extensible object models; via neuraxon77.

Automatic Patch-Based Exploit Generation

Brumley, Poosankam, Song & Zheng, 2008. Automatic Patch-Based Exploit Generation is Possible: Techniques and Implications:
The automatic patch-based exploit generation problem is: given a program P and a patched version of the program P′, automatically generate an exploit for the potentially unknown vulnerability present in P but fixed in P'. In this paper, we propose techniques for automatic patch-based exploit generation, and show that our techniques can automatically generate exploits for 5 Microsoft programs based upon patches provided via Windows Update. Although our techniques may not work in all cases, a fundamental tenet of security is to conservatively estimate the capabilities of attackers. Thus, our results indicate that automatic patch-based exploit generation should be considered practical. One important security implication of our results is that current patch distribution schemes which stagger patch distribution over long time periods, such as Windows Update, may allow attackers who receive the patch first to compromise the significant fraction of vulnerable hosts who have not yet received the patch.
The technique is based on flow analysis, to test code that gets changed for boundaries where safety properties fail. The limitations of the technique they have developed automatically generate vulnerabilities for only a small fraction of propagated updates. Nonetheless I find it astonishing that such a simple analysis can provide such a payoff. Via Bruce Schneier.

Species: making analytic functors practical for functional programming

Species: making analytic functors practical for functional programming, Jacques Carette and Gordon Uszkay. Submitted to MSFP 2008.

Inspired by Joyal's theory of species, we show how to add new type constructors and constructor combinators to the tool set of functional languages. We show that all the important properties of inductive types lift to this new setting. Species are analytic functors, representing a broader range of structures than regular functors. This includes structures such as bags, cycles and graphs. The theory is greatly inspired by combinatorics rather than type theory: this adds interesting new tools to bear, but also requires more work on our part to show that species form a good foundations for a theory of type constructors. The combinatorial tools provide a calculus for these structures which has strong links with classical analysis, via a functorial interpretation of generating series. Furthermore, we show how generic programming idioms also generalise to this richer setting. Once the theory is understood, various methods of implementation are relatively straightforward.

Learning more about the theory of species and working out its implications for programming has been on my to-do list for several years now. It really seems like the natural way to more tightly integrate combinatoric ideas into the functional programming style. (For an example of how fruitful this connection can be, consider how datatype differentiation explains zippers/functional pointers.)

However, there's been this whole "graduation" and "primary research focus" stuff that's kept getting in the way, so I'm happy to see that Carette and Uszkay are figuring it out so I can just crib from them. :)

program verification: the very idea

James H. Fetzer's Program Verification: The Very Idea (1988) is one of the two most frequently cited position papers on the subject of program verification. The other one is Social Processes and Proofs by De Millo, Lipton, and Perlis (1979), previously discussed on LtU. Fetzer's paper generated a lot of heated discussion, both in the subsequent issues of CACM and on Usenet.

It's not clear to me what all the fuss is about. Fetzer's main thesis seems pretty uncontroversial:

The notion of program verification appears to trade upon an equivocation. Algorithms, as logical structures, are appropriate subjects for deductive verification. Programs, as causal models of those structures, are not. The success of program verification as a generally applicable and completely reliable method for guaranteeing program performance is not even a theoretical possibility.

(See also part I, part II, and part III.)