The essence of Dataflow Programming by Tarmo Uustalu and Varmo Vene

The Essence of Dataflow Programming

The abstract:

We propose a novel, comonadic approach to dataflow (streambased) computation. This is based on the observation that both general and causal stream functions can be characterized as coKleisli arrows of comonads and on the intuition that comonads in general must be a good means to structure context-dependent computation. In particular, we develop a generic comonadic interpreter of languages for context-dependent computation and instantiate it for stream-based computation. We also discuss distributive laws of a comonad over a monad as a means to structure combinations of effectful and context-dependent computation. We apply the latter to analyse clocked dataflow (partial streams based) computation.

If you've ever wondered about dataflow or comonads, this paper is a good read. It begins with short reviews of monads, arrows, and comonads and includes an implementation. One feature that stood out is the idea of a higher-order dataflow language.

Plugging Haskell In

André Pang, Don Stewart, Sean Seefried, and Manuel M. T. Chakravarty. Plugging Haskell In (pdf). Proceedings of the ACM SIGPLAN Workshop on Haskell, 2004, pp. 10-21.

Extension languages enable users to expand the functionality of an application without touching its source code. Commonly, these languages are dynamically typed languages, such as Lisp, Python, or domain-specific languages, which support runtime plugins via dynamic loading of components. We show that Haskell can be comfortably used as a statically typed extension language, and that it can support type-safe dynamic loading of plugins using dynamic types.

Two of the authors also have a more recent paper (pdf) describing applications they've built using hs-plugins.

A Typed Intermediate Language for Compiling Multiple Inheritance

Juan Chen. A Typed Intermediate Language for Compiling Multiple Inheritance.
This paper presents a typed intermediate language EMI that supports multiple and virtual inheritance of classes in C++-like languages. EMI faithfully represents standard implementation strategies of object layout, "this" pointer adjustment, and dynamic dispatch. The type system is sound. Type checking is decidable. The translation from a source language to EMI preserves types. We believe that EMI is the first typed intermediate language that is expressive enough for describing implementation details of multiple and virtual inheritance of classes.

If you really must have mutiple inheritance...

A Plan for Pugs

Anyway. So, I ordered a bunch of books online including TaPL and ATTaPL so I could learn more about mysterious things like Category Theory and Type Inference and Curry-Howard Correspondence.

A rather amusing interview about Pugs, the Perl 6 implementation written in Haskell.

Parrot 0.2.2 Released

Parrot 0.2.2 has been released! Parrot is a versatile virtual machine for dynamic languages. Though originally designed for Perl 6, it's power and flexibility has generated a lot of interest in the language development community. Parrot is distributed with a number of sample compiler implementations at varying stages of completion, including partial compilers for common languages like lisp, scheme, tcl and python.

The latest release features grammar and rule support in PGE, the Parrot Grammar Engine. Parrot also comes with utilties that convert PIR (Parrot Intermediate Representation) and PASM (Parrot Assembly) into PBC (Parrot bytecode).

Those who are interested in learning how to implement languages for Parrot should start with the documentation. You may also be interested in my own (extremely feeble) attempt at implementing PIR generators in Haskell and Ocaml.

A Typed, Compositional Logic for a Stack-Based Abstract Machine

A Typed, Compositional Logic for a Stack-Based Abstract Machine. Nick Benton. MSR-TR-2005-84. June 2005.

We define a compositional program logic in the style of Floyd and Hoare for a simple, typed, stack-based abstract machine with unstructured control flow, global variables and mutually recursive procedure calls. Notable features of the logic include a careful treatment of auxiliary variables and quantification and the use of substructural typing to permit local, modular reasoning about program fragments. Semantic soundness is established using an interpretation of types and assertions defined by orthogonality with respect to sets of contexts.

Also related to PCC, TAL etc.

GHC Survey Results

The results are in for the 2005 Glasgow Haskell Compiler user survey, with a summary and all the raw data. The comments were the highlight for me; see for instance Applications I use GHC for.

(Previous LtU mention)

Accurate step counting

Accurate step counting. Catherine Hope and Graham Hutton.

Starting with an evaluator for a language, an abstract machine for the same language can be mechanically derived using successive program transformations. This has relevance to studying both the space and time properties of programs because these can be estimated by counting transitions of the abstract machine and measuring the size of the additional data structures needed, such as environments and stacks. In this article we use this process to derive a function that accurately counts the number of steps required to evaluate expressions in a simple language.

Somewhat related to a recent discussion (in which I mentioned EOPL1's scetion on deriving a compiler and machine from an interpreter).

The paper is, of course, worth checking out regardless.

HP's Dynamo

Dynamic optimization refers to the runtime optimization of a native program binary. This paper describes the design and implementation of Dynamo, a prototype dynamic optimizer that is capable of optimizing a native program binary at runtime... Contrary to intuition, we demonstrate that it is possible to use a piece of software to improve the performance of a native, statically optimized program binary, while it is executing. Dynamo not only speeds up real application programs, its performance improvement is often quite significant. For example, the performance of many +O2 optimized SPECint95 binaries running under Dynamo is comparable to the performance of their +O4 optimized version running without Dynamo.

I think Dynamo is pretty well known, but in the light of the Apple switch to x86, it offers perspective on the possibility of using JIT compilation to translate binaries.

The application to programming languages is also straightforward.

Judy Stores

A Judy tree is generally faster than and uses less memory than contemporary forms of trees such as binary (AVL) trees, b-trees, and skip-lists. When used in the "Judy Scalable Hashing" configuration, Judy is generally faster then a hashing method at all populations.

Some of the reasons Judy outperforms binary trees, b-trees, and skip-lists:

  • Judy rarely compromises speed/space performance for simplicity (Judy will never be called simple except at the API).
  • Judy is designed to avoid cache-line fills wherever possible.
  • A b-tree requires a search of each node (branch), resulting in more cache-line fills.
  • A binary-tree has many more levels (about 8X), resulting in more cache-line fills.
  • A skip-list is roughly equivalent to a degree-4 (4-ary) tree, resulting in more cache-line fills.
  • An "expanse"-based digital tree (of which Judy is a variation) never needs balancing as it grows.
  • A portion (8 bits) of the key is used to subdivide an expanse into sub-trees. Only the remainder of the key need exist in the sub-trees, if at all, resulting in key compression.

The Achilles heel of a simple digital tree is very poor memory utilization, especially when the N in N-ary (the degree or fanout of each branch) increases. The Judy tree design was able to solve this problem. In fact a Judy tree is more memory-efficient than almost any other competitive structure (including a simple linked list). A highly populated linear array[] is the notable exception.

Worth a look, considering the open-source LGPL license and range of applications. HP released this library in 2004. From the PDF manual,

Judy arrays are remarkably fast, space-efficient, and simple to use. No initialization, configuration, or tuning is required or even possible, yet Judy works well over a wide dynamic range from zero to billions of indexes, over a wide variety of types of data sets -- sequential, clustered, periodic, random.

XML feed