PyPy, the Python implementation written in Python, was mentioned here a couple of times in the past. After it was mentioned in a recent LtU discussion, I took another look, and boy did they make a lot of progress when I wasn't looking. PyPy can even compile itself now... You should check it out again if you are interested in this sort of thing.

There's even an introduction to the techniques used by PyPy, including a nice (but very high level) overview of abstract interpretation.

Semantic Distance: NLP Not a Resource Sink

Following the story on Mind Mappers and other RDF comments of late, I thought this NLP slide show (PDF) should get a story. Dr. Adrian Walker offers an interesting perspective in a friendly crayon-colored format, including a critique of RDF. Source site Internet Business Logic has other offerings including an online demo.

Ragel State Machine Compiler

What kind of task is Ragel good for?

  • Lexing programming languages.
  • Parsing file formats.
  • Creating robust custom protocols.
  • Checking your "Theory of Computation" homework.


  • Describe arbitrary state machines using regular language operators and/or state tables.
  • NFA to DFA conversion.
  • Hopcroft's state minimization.
  • Embed any number of actions into machines at arbitrary places.
  • Control non-determinism using priorities on transitions.
  • Visualize output with Graphviz.
  • Use byte, double byte or word sized alphabets.
  • Generate C/C++/Objective-C code with no dependencies.
  • Choose from table or control flow driven output.

An Overview of the Singularity Project

Singularity is a research project in Microsoft Research that started with the question: what would a software platform look like if it was designed from scratch with the primary goal of dependability? Singularity is working to answer this question by building on advances in programming languages and tools to develop a new system architecture and operating system (named Singularity), with the aim of producing a more robust and dependable software platform. Singularity demonstrates the practicality of new technologies and architectural decisions, which should lead to the construction of more robust and dependable systems...
Singularity... starts from a premise of language safety and builds a system architecture that supports and enhances the language guarantees.

An interesting overview of what sounds like an intersting project.

The choice of implementation language is also interesting:

Singularity is written in Sing#, which is an extension to the Spec# language developed in Microsoft Research. Spec# itself is an extension to Microsoft’s C# language that provides constructs (pre- and post-conditions and object invariants) for specifying program behavior. Specifications can be statically verified by the Boogie verifier or checked by compiler-inserted run-time tests. Sing# extends this language with support for channels and low-level constructs necessary for system code....integrating a feature into a language allows more aspects of a program to be verified. Singularity’s constructs allow communication to be statically verified.

An interesting aspect is the support for meta-programming, which is implemented in an unusal manner:

Compile-time reflection (CTR) is a partial substitute for the CLR’s full reflection capability. CTR is similar to techniques such as macros, binary code rewriting, aspects, meta-programming, and multi-stage languages. The basic idea is that programs may contain place-holder elements (classes, methods, fields, etc.) that are subsequently expanded by a generator.

Many other intersting design decisions are discussed in the paper (e.g., various DbC facilities), so do check it out.

Abstract interpretation for constraint handling rules

Abstract interpretation for constraint handling rules. Tom Schrijvers, Peter J. Stuckey, Gregory J. Duck. PPDP’05

Program analysis is essential for the optimized compilation of Constraint Handling Rules (CHRs) as well as the inference of behavioral properties such as confluence and termination. Up to now all program analyses for CHRs have been developed in an ad hoc fashion.In this work we bring the general program analysis methodology of abstract interpretation to CHRs: we formulate an abstract interpretation framework over the call-based operational semantics of CHRs. The abstract interpretation framework is non-obvious since it needs to handle the highly non-deterministic execution of CHRs. The use of the framework is illustrated with two instantiations: the CHR-specific late storage analysis and the more generally known groundness analysis. In addition, we discuss optimizations based on these analyses and present experimental results.

I haven't read this paper carefully yet, but since the authors claim it is the first published account of abstract interpretation for CHRs, I decided to mention it here anyway.

Automatic type inference via partial evaluation

Automatic type inference via partial evaluation. Aaron Tomb, Cormac Flanagan. PPDP’05.

Type checking and type inference are fundamentally similar problems. However, the algorithms for performing the two operations, on the same type system, often differ significantly. The type checker is typically a straightforward encoding of the original type rules. For many systems, type inference is performed using a two-phase, constraint-based algorithm.We present an approach that, given the original type rules written as clauses in a logic programming language, automatically generates an efficient, two-phase, constraint-based type inference algorithm. Our approach works by partially evaluating the type checking rules with respect to the target program to yield a set of constraints suitable for input to an external constraint solver. This approach avoids the need to manually develop and verify a separate type inference algorithm, and is ideal for experimentation with and rapid prototyping of novel type systems.

Also somewhat relevant to the discussions here about type checking as abstract interpretation.

Implicitly Heterogeneous Multi-stage Programming

Implicitly Heterogeneous Multi-stage Programming. Jason Eckhardt, Roumen Kaiabachev, Emir Pasalic, Kedar Swadi and Walid Taha

Previous work on semantics-based multi-stage programming (MSP) language design focused on homogeneous languages designs, where the generating and the generated languages are the same. Homogeneous designs simply add a hygienic quasi-quotation and evaluation mechanism to a base language. An apparent disadvantage of this approach is that the programmer is bound to both expressivity and performance charcteristics of the base language. This paper proposes a practical means to show that this can be avoided by providing specialized translations from subsets of the base language to different target languages.

The idea is that the first stage is done in OCaml and the second stage in C or Fortran (or other similar language). The main point is that the output from the first stage is a subset of OCaml (actually, a subset of Ocaml AST), which is more regular and permits efficient translation to C (as compared to what's required in compiling full Ocaml).

The generated C code is, of course, automatically type-correct. As Oleg remarks, this brings us close to the goal of enjoying all benefits of abstractions with no overhead.

More information here.

Relating FFTW and Split-Radix

Relating FFTW and Split-Radix. Proc. of ICESS'04, the First International Conference on Embedded Software and System, December 9-10 2004, Hangzhou (Zhejiang), China.

This ongoing attempt to reproduce an efficient implementation of FFT using staging and abstract interpretation attempts to answer the question "How can we get the raw performance of hardware without giving up the expressivity and clarity of software?"

Here's how Oleg describes the contribution of this paper,

One may think that generating truly optimal power-of-two FFT is straightforward: we generate the naive radix-2 FFT code, and then optimize it, removing trivial multiplications (x*1), trivial additions (x+0), etc. That was the approach demonstrated previously.

And the bottom line is: the improved code is better than the original, but still worse than the best (FFTW) code. Here's why: if we consider an expression, (sin(pi/4) * x + cos(pi/4) * y), we may think that it can be optimized to sin(pi/4)*(x+y) thus saving one multiplication. Alas, computed sin(pi/4) and cos(pi/4) are not _exactly_ equal. Therefore, no optimizer is able to detect that they are the common factor. The previous paper BTW did handle the case of sin(pi/4) -- and still fell short of FFTW. To achieve optimum, more complex trigonometric transformations have to be applied.

This paper shows that to achieve the best quality of the generated code, we have to use domain knowledge. In case of FFT, we use the fact that it is a linear transform whose factors are roots of unity. This gives us an abstract representation of terms amenable to exact trigonometry. We can essentially symbolically manipulate the terms, and then concretize our abstract representation into the code. In the end, we generated FFT code that exactly matches the number of FP operations of FFTW. Somewhat unexpectedly, when we changed the function that generates the code for general complex multiplication, we obtained `split-radix FFT' -- another well-known FFT technique.

Oleg points out that the point isn't that they managed to reproduced the FFTW results. The crucial point is that we know exactly which identities (i.e., axioms) contributed to the optimum. The search was principled rather heuristic, and the code is generated in only one pass. There are no manipulations on the code: it is generated just right.


Gianluigi Ferrari, Eugenio Moggi and Rosario Pugliese

MetaKlaim - a Type Safe Multi-stage Language for Global Computing

This paper describes the design and the semantics of MetaKlaim, an higher order distributed process calculus equipped with staging mechanisms. MetaKlaim integrates MetaML (an extension of SML for multi-stage programming) and Klaim (a Kernel Language for Agents Interaction and Mobility), to permit interleaving of meta-programming activities (like assembly and linking of code fragments), dynamic checking of security policies at administrative boundaries and “traditional” computational activities on a wide area network (like remote communication and code mobility). MetaKlaim exploits a powerful type system (including polymorphic types ´a la system F) to deal with highly parameterized mobile components and to dynamically enforce security policies: types are metadata which are extracted from code at run-time and are used to express trustiness guarantees. The dynamic type checking ensures that the trustiness guarantees of wide are network applications are maintained whenever computations interoperate with potentially untrusted components.


Ωmega is a new programming language by Tim Sheard which is descended from Haskell and adds new facilities for defining static type constraints, such as allowing "users to write functions at the level of types, and then use those functions in the type of functions at value level". It also has "equality qualified types". See also Programming with Static Invariants in Omega and the manual for more information. Mentioned previously (in passing) on LtU.

XML feed