Implementation

An Overview of the Singularity Project

Singularity is a research project in Microsoft Research that started with the question: what would a software platform look like if it was designed from scratch with the primary goal of dependability? Singularity is working to answer this question by building on advances in programming languages and tools to develop a new system architecture and operating system (named Singularity), with the aim of producing a more robust and dependable software platform. Singularity demonstrates the practicality of new technologies and architectural decisions, which should lead to the construction of more robust and dependable systems...
Singularity... starts from a premise of language safety and builds a system architecture that supports and enhances the language guarantees.

An interesting overview of what sounds like an intersting project.

The choice of implementation language is also interesting:

Singularity is written in Sing#, which is an extension to the Spec# language developed in Microsoft Research. Spec# itself is an extension to Microsoft’s C# language that provides constructs (pre- and post-conditions and object invariants) for specifying program behavior. Specifications can be statically verified by the Boogie verifier or checked by compiler-inserted run-time tests. Sing# extends this language with support for channels and low-level constructs necessary for system code....integrating a feature into a language allows more aspects of a program to be verified. Singularity’s constructs allow communication to be statically verified.

An interesting aspect is the support for meta-programming, which is implemented in an unusal manner:

Compile-time reflection (CTR) is a partial substitute for the CLR’s full reflection capability. CTR is similar to techniques such as macros, binary code rewriting, aspects, meta-programming, and multi-stage languages. The basic idea is that programs may contain place-holder elements (classes, methods, fields, etc.) that are subsequently expanded by a generator.

Many other intersting design decisions are discussed in the paper (e.g., various DbC facilities), so do check it out.

Zipper-based file server/OS

We present a file server/OS where threading and exceptions are all realized via delimited continuations. We use zipper to navigate within any term. If the term in question is a finite map whose keys are strings and values are either strings or other finite maps, the zipper-based file system looks almost the same as the Unix file system. Unlike the latter, however, we offer: transactional semantics; undo of any file and directory operation; snapshots; statically guaranteed the strongest, repeatable read, isolation mode for clients; built-in traversal facility; and just the right behavior for cyclic directory references.



We can easily change our file server to support NFS or 9P or other distributed file system protocol. We can traverse richer terms than mere finite maps with string keys. In particular, we can use lambda-terms as our file system: one can cd into a lambda-term in bash.



The implementation of ZFS has no unsafe operations, no GHC let alone Unix threads, and no concurrency problems. Our threads cannot even do IO and cannot mutate any global state --- and the type system sees to it. We thus demonstrate how delimited continuations let us statically isolate effects even if the whole program eventually runs in an IO monad.

zfs-talk: Expanded talk with the demo. It was originally presented as an extra demo at the Haskell Workshop 2005

ZFS.tar.gz: The complete source code, with comments. It was tested with GHC 6.4 on FreeBSD and Linux


From: Zipper-based file server/OS


Neat, a referentially transparent filesystem with transactional semantics in 540 lines of Haskell. I can think of some interesting uses for this.

Control-Flow Integrity

Two papers about CFI.

Control-Flow Integrity - Principles, Implementations, and Applications:

Current software attacks often build on exploits that subvert machine- code execution. The enforcement of a basic safety property, Control-Flow Integrity (CFI), can prevent such attacks from arbitrarily controlling program behavior. CFI enforcement is simple, and its guarantees can be established formally, even with respect to powerful adversaries. Moreover, CFI enforcement is practical: it is compatible with existing software and can be done efficiently using software rewriting in commodity systems. Finally, CFI provides a useful foundation for enforcing further security policies, as we demonstrate with efficient software implementations of a protected shadow call stack and of access control for memory regions.

A Theory of Secure Control-Flow:

Control-Flow Integrity (CFI) means that the execution of a program dynamically follows only certain paths, in accordance with a static policy. CFI can prevent attacks that, by exploiting buffer overflows and other vulnerabilities, attempt to control program behavior. This paper develops the basic theory that underlies two practical techniques for CFI enforcement, with precise formulations of hypotheses and guarantees.

Not very language oriented, I am afraid. However, this is related to PCC and TAL which were discussed here in the past.

Number-Parameterized Types by Oleg Kiselyov

Number-Parameterized Types by Oleg Kiselyov

from the abstract:

This paper describes practical programming with types parameterized by numbers: e.g., an array type parameterized by the array's size or a modular group type Zn parameterized by the modulus. An attempt to add, for example, two integers of different moduli should result in a compile-time error with a clear error message. Number-parameterized types let the programmer capture more invariants through types and eliminate some run-time checks.

Oleg shows several approaches towards encoding numbers into types and using those numbers to check list length, matching sizes for matrices or vectors. Oleg also points out connections to dependent types, phantom types, and shape-invariant programming.

Continuations library for Java

RIFE is an open source Java web application framework which allows web applications to benefit from first-class continuations. Now, RIFE's pure Java continuation engine, which uses Java bytecode manipulation to implement continuations, has been extracted into a standalone Java library:

Announcing RIFE/Continuations, pure Java continuations for everyone

The immediate motivation for doing this was to achieve support for continuations in the WebWork application framework, which now supports them it its latest beta.

Along similar lines, there's also the Apache Jakarta project, Commons Javaflow.

From a language design perspective, it's interesting to see features like this being introduced without modifying the language implementation. Other examples of continuations being added to languages, as in Stackless Python and the Javascript implementation Rhino with Continuations, have been done by forking the language implementation, which of course isn't as practical in Java's case.

XQuery Implementation in a Relational Database System

XQuery Implementation in a Relational Database System. VLDB 2005.

This paper describes the experiences and the challenges in implementing XQuery in Microsoft’s SQL Server 2005. XQuery language constructs are compiled into an enhanced set of relational operators while preserving the semantics of XQuery. The query tree is optimized using relational optimization techniques, such as cost-based decisions, and rewrite rules based on XML schemas.

Quite detailed. Section 6 covers optimization, and coupled with section 7 discussing related work, might be enough for those not really interested in all the intricate details of SQL Server's implementation of XQuery.

The essence of Dataflow Programming by Tarmo Uustalu and Varmo Vene

The Essence of Dataflow Programming

The abstract:

We propose a novel, comonadic approach to dataflow (streambased) computation. This is based on the observation that both general and causal stream functions can be characterized as coKleisli arrows of comonads and on the intuition that comonads in general must be a good means to structure context-dependent computation. In particular, we develop a generic comonadic interpreter of languages for context-dependent computation and instantiate it for stream-based computation. We also discuss distributive laws of a comonad over a monad as a means to structure combinations of effectful and context-dependent computation. We apply the latter to analyse clocked dataflow (partial streams based) computation.

If you've ever wondered about dataflow or comonads, this paper is a good read. It begins with short reviews of monads, arrows, and comonads and includes an implementation. One feature that stood out is the idea of a higher-order dataflow language.

Plugging Haskell In

André Pang, Don Stewart, Sean Seefried, and Manuel M. T. Chakravarty. Plugging Haskell In (pdf). Proceedings of the ACM SIGPLAN Workshop on Haskell, 2004, pp. 10-21.

Extension languages enable users to expand the functionality of an application without touching its source code. Commonly, these languages are dynamically typed languages, such as Lisp, Python, or domain-specific languages, which support runtime plugins via dynamic loading of components. We show that Haskell can be comfortably used as a statically typed extension language, and that it can support type-safe dynamic loading of plugins using dynamic types.

Two of the authors also have a more recent paper (pdf) describing applications they've built using hs-plugins.

A Typed Intermediate Language for Compiling Multiple Inheritance

Juan Chen. A Typed Intermediate Language for Compiling Multiple Inheritance.
This paper presents a typed intermediate language EMI that supports multiple and virtual inheritance of classes in C++-like languages. EMI faithfully represents standard implementation strategies of object layout, "this" pointer adjustment, and dynamic dispatch. The type system is sound. Type checking is decidable. The translation from a source language to EMI preserves types. We believe that EMI is the first typed intermediate language that is expressive enough for describing implementation details of multiple and virtual inheritance of classes.

If you really must have mutiple inheritance...

A Plan for Pugs

Anyway. So, I ordered a bunch of books online including TaPL and ATTaPL so I could learn more about mysterious things like Category Theory and Type Inference and Curry-Howard Correspondence.

A rather amusing interview about Pugs, the Perl 6 implementation written in Haskell.

XML feed