Guest Bloggers

Programming Language Beauty: Look Closure

In the past year I have been passionately fighting what Simon Peyton Jones calls "the effects monster", although often it feels like I am fighting windmills instead. No useful programs can be written without effects, but effects turn bad when they are observable from within the program itself. Instead we should strive to encapsulate effects so that they become harmless first-class pure values, but more on that in the future. In this first installment of a longer series on the perils of side effects, we will look at one of the most beautiful examples of observable effects, namely closures and variable capture in imperative languages.
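As a small taste of what variable capture looks like in practice, here is a minimal sketch in Python (chosen only for illustration; the names `callbacks` and `snapshots` are made up for this example). It assumes nothing beyond the standard language: each lambda closes over the variable itself rather than the value it held at creation time, so the mutation performed by the loop is observable from inside the closures.

```python
# Each lambda closes over the variable i itself, not the value i had
# when the lambda was created.
callbacks = []
for i in range(3):
    callbacks.append(lambda: i)

# By the time the closures run, the loop has finished and i is 2.
late = [f() for f in callbacks]

# Binding the current value as a default argument takes a snapshot
# of i on every iteration instead of sharing the variable.
snapshots = []
for i in range(3):
    snapshots.append(lambda i=i: i)

early = [f() for f in snapshots]

print(late)   # [2, 2, 2]
print(early)  # [0, 1, 2]
```

The two loops differ only in whether the closure shares the mutable loop variable or receives its own copy, which is exactly the kind of effect that becomes visible from within the program.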

Democratizing the Cloud using Microsoft Live Labs Volta

Nearly two years ago I posted Beyond LINQ: A Manifesto For Distributed Data-Intensive Programming on this forum. Now, within a period of a few weeks, both LINQ and a rather different realization of my original post-LINQ plans are shipping. I am particularly proud to announce that a community preview of Volta is available for immediate download from http://labs.live.com/volta/.

Volta is a collection of tools that enable programmers to develop asynchronous and distributed (including but not limited to AJAX) applications by successive refactoring of normal, sequential programs written in standard .NET languages (this CTP requires Visual Studio 2008) and to deploy the resulting applications on a wide variety of target platforms (this CTP supports Internet Explorer and Firefox). Or, as I sometimes say when I am trying to sound like a marketing person, “Volta stretches the .NET platform to cover the Cloud.” Volta allows programmers to concentrate on the essential complexity involved in building AJAX applications and have our tools take care of the gory details and accidental complexity.

When using Volta, programmers can specify their intent of running certain classes on the server by decorating the class declaration using a [RunAtOrigin()] attribute. The Volta post-compiler then weaves in (you never heard me say that I thought AOP was a bad idea did you :-) all the necessary boilerplate code to partition the original program to run across multiple tiers. Similarly, programmers can create an asynchronous version of a method by decorating an empty method declaration of a related signature with an [Async()] attribute. Again, the Volta post-compiler takes care of all the boilerplate code under the hood to enable asynchronous invocation of the target method. Another pain point in writing AJAX applications is supporting multiple browsers. To ease this pain, Volta includes Scott Isaac’s cross-browser compatibility layer that is also used in Windows Live.

Volta embraces the Lean Programming principle of delaying irreversible decisions until the last possible responsible moment. In particular we want to delay decisions about distribution as long as possible. To help developers make informed decisions about the distribution of a program across tiers, the Rotunda profiler from MSR is fully integrated in the Volta toolchain. By automatically injecting hooks for all interesting events, Rotunda creates trace information that can be inspected using the standard Service Trace Viewer tool.

When I speak about Volta or show a demo, the best compliment I can get is when people say this is “trivial” or “really straightforward”. The best tools are those that do their work unobtrusively, hidden in the background. Anyway, with the holidays around the corner you may have some spare cycles to give Volta a spin and let us know what you think!

Lang .NET Symposium Registration Now Open

The day of the beast has passed without any noticeable effect; Bill Gates has announced his retirement and the Microsoft stock actually goes up. The coast finally seems safe for another post on the Lang .NET symposium!

Registration for this heavily debated event is now open. I believe that we have a very interesting set of non-Microsoft invited speakers including Gilad Bracha from Sun, William Cook from UT Austin, John Gough from QUT, Miguel de Icaza from Novell, and Shriram Krishnamurthi from Brown; Microsoft folks including Mike Barnett, Gary Flake, Jim Hugunin, Polita Paulus, Don Syme, and Paul Vick; and a fine line-up of submitted papers.

Hope to see you all in Redmond this August!

Gilad Is Right (Confessions From A Recovering Typoholic)

If you have not seen Gilad Bracha's talk on pluggable and optional type systems or read the corresponding paper, I really urge you to do so (or invite Gilad to give the invited talk at your conference or workshop). The thesis of optional and pluggable type systems is that type systems should be an optional layer on top of an otherwise dynamically typed system in such a way that (a) types cannot change the run-time behavior of the program, and (b) they cannot prevent an otherwise legal program from compiling or executing. In short, what Gilad is saying is that you should not depend on static typing. However, we all know that static type systems are very addictive, like the finest crack from the backstreets of the ghetto, and I will stop beating around the bush and confess "I am Erik, and I am a (recovering) typoholic".
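To make conditions (a) and (b) concrete, here is a minimal sketch using Python's optional type annotations. This is only an illustration chosen for this post, not an example from Gilad's talk or paper, and the function names `double` and `double_plain` are invented. Python annotations are recorded but not enforced at run time, so they behave like an optional layer.

```python
def double(x: int) -> int:
    # The annotations are stored as metadata; the interpreter never checks them.
    return x * 2

def double_plain(x):
    # The same function with the type layer stripped off.
    return x * 2

# (a) Types do not change run-time behavior: both versions agree.
typed_result = double(21)
plain_result = double_plain(21)

# (b) Types do not prevent an otherwise legal program from running:
# an external checker would flag this call, but it still executes.
untyped_result = double("ab")

annotations = double.__annotations__

print(typed_result, plain_result)  # 42 42
print(untyped_result)              # abab
```

A separate checker can be run over the annotated program or left out entirely, which is the sense in which the type system is pluggable rather than load-bearing.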

To illustrate the tantalizing power of static typing, take the concept of axis members in Visual Basic 9. In our first design we keyed "late" binding over XML on the static type of the receiver. For example, take the following element declaration:

Dim Pu As XElement = <Atom AtomicWeight="244">
                       <Name>Plutonium</Name> 
                       <Symbol AtomicNumber="94">Pu</Symbol>
                       <Radioactive>true</Radioactive> 
                     </Atom> 

Since the static type of Pu is XElement, our compiler interprets the member access expression Pu.Symbol as the call Pu.Elements("Symbol") on the underlying XLinq API. While this is rather cute, it is not without problems. First of all, the XElement class itself has quite a lot of members, and the question is who wins when there is an ambiguity like in Pu.Name. Should this mean Pu.Elements("Name"), or should it just directly call the underlying Name property of XElement? Even worse, what happens if the static type of the receiver is Object, but its dynamic type is XElement, as in CObj(Pu).Symbol? What we should really do is extend the Visual Basic late binder to understand axis members, which means we now have two levels of possible ambiguity! At this point, we have lost the majority of our users. Keying axis member binding on the static type of the receiver is just too cute; we are really doing some kind of fancy type-based overloading.

Besides the child axis, we have special support for the attribute axis, written using an @-sign as in Pu.@AtomicWeight, and for the descendant axis, written using three consecutive dots as in Pu...Radioactive. Obviously, for these two there is no ambiguity with normal member access. Depending on how you look at it, it is clear from either the member name (Pu . @AtomicWeight and Pu . ..Radioactive) or the selector (Pu .@ AtomicWeight and Pu ... Radioactive) what the intention is, independent of the static type of the receiver. There is no danger of ambiguity, and it all works fine for late binding when the receiver has type Object, since we can always interpret Pu.@AtomicWeight as Pu.Attributes("AtomicWeight") and then do ordinary member resolution and type-checking on that.

To solve our pain, we recently decided to also introduce special syntax for the child axis and write Pu.<Symbol> instead of Pu.Symbol. Now there is no ambiguity between Pu.Name, which returns the element's own name "Atom", and Pu.<Name>, which returns the XElement child node <Name>Plutonium</Name>. For consistency, we also changed the syntax for the descendant axis to be Pu...<Radioactive> instead of Pu...RadioActive.
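For readers who want to poke at the three axes outside Visual Basic, here is a rough analogue in Python. It is only a sketch: Python has no axis-member syntax, so the standard library's xml.etree.ElementTree calls stand in for the XLinq methods that the Visual Basic compiler targets, and the variable names are invented for this example.

```python
import xml.etree.ElementTree as ET

pu = ET.fromstring(
    '<Atom AtomicWeight="244">'
    '<Name>Plutonium</Name>'
    '<Symbol AtomicNumber="94">Pu</Symbol>'
    '<Radioactive>true</Radioactive>'
    '</Atom>'
)

# Ordinary member access, like Pu.Name: the element's own name.
own_name = pu.tag

# Child axis, like Pu.<Name>: the child elements called Name.
name_children = pu.findall("Name")

# Attribute axis, like Pu.@AtomicWeight: the value of the attribute.
weight = pu.get("AtomicWeight")

# Descendant axis, like Pu...<Radioactive>. Note that iter() would also
# yield the root itself if its tag matched, which a descendant axis does not.
radioactive = list(pu.iter("Radioactive"))

print(own_name)               # Atom
print(name_children[0].text)  # Plutonium
print(weight)                 # 244
print(radioactive[0].text)    # true
```

The point carries over directly: because each axis has its own selector, the element's own name and its child named Name are reached by visibly different expressions, with no reliance on the static type of the receiver.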

I hope that you agree that we have masked out the seductive voices of the static typing sirens by providing a syntax that is more beautiful and a semantics that is much simpler than our previous one that relied heavily on static typing. Gilad is right!

Beyond LINQ: A Manifesto For Distributed Data-Intensive Programming

The LINQ project as embodied by C# 3.0 and Visual Basic 9 brings concepts from functional programming such as type inference, lambda expressions, and most importantly monad comprehensions into mainstream object-oriented programming. This is definitely exciting for the programming language community, but realistically, it is just a tiny step towards democratizing the building of distributed data-intensive applications. To merely approach that goal there is still much work to do in (at least) the following areas:

Tools and IDE
It is fair to say that the days of writing code using a text editor and batch compiler are over. Visual Studio, Eclipse, and Emacs are the norm rather than the exception. However, whenever you meet an (ex-)Smalltalk or VB6 programmer, they reminisce about the highly interactive development environment, scripters cannot live without their REPL, and tools like Ruby on Rails disrupt traditional development because of their simplicity and quick turn-around time. To simplify programming for the masses we need to shake off the yoke of the dreaded (edit, compile, run, debug)* loop and replace it with a lightweight (edit=compile=run=debug) experience.
Language and Type Systems
Writing distributed data-intensive applications naturally means dealing with many forms of data: relational, XML, objects, typed, semi-structured, or untyped. Current languages are not well equipped to deal with any scrap of data, and much language and type-system innovation is required, such as explicit relationships, contracts, layered type systems, seamlessly dealing with both static and dynamic typing in the same program, extensibility, etc. The challenge is to package advanced ideas from programming language research in such a way that you do not need a PhD in type theory to understand them.
Runtime and Libraries
Sometimes, and preferably, the compiler is able to map all language and type extensions to an existing runtime such as the JVM or the CLR. However, often this is not feasible and we need to extend the runtime infrastructure to accommodate these new features. A prime example is the support for generics in the CLR versus the JVM; other examples include efficient support for first-class continuations, query execution, etc. Obviously, a dynamic language is ultimately all about how dynamic the underlying runtime infrastructure is, unless you emulate the dynamic features of the language, which kills interoperability. In addition to runtime support, new language and type-system features often need extensive library and infrastructure support. For example, in the context of LINQ, the language extensions are just the tip of the iceberg, and in fact the bulk of the work is in the libraries such as XLinq and in particular the OR-mapping infrastructure.
Transactions Everywhere
To make programming web services accessible to the masses, we need to have a comprehensible way to deal with concurrency. In addition, on the desktop itself we need some way to harness the upcoming multi-core revolution. We believe that transactions are the most promising approach in this space. In fact transactions are the only way ordinary people can deal with concurrency, unless of course you are a sadomasochist who likes to wear black leather and play with locks.

As you can imagine, this is a lot of work and it will keep us language geeks off the streets for a long, long time! And in case you are currently wandering the streets looking for a job as a compiler writer, virtual machine hacker, tool smith, etc., drop me an email. We have several job openings available.

LINQ BOF at OOPSLA

On Wednesday October 19, 2005, Mads Torgersen, Amanda Silver, and yours truly will be presenting a BOF on LINQ in the Royal Palm Salon 1+2, 5:00 – 7:30pm during OOPSLA 2005 in the Town & Country Resort & Convention Center in San Diego, CA.

With three language geeks (one from VB land, one ex-academic from the Java side, and one ex-academic from the Haskell side) presenting, this should be a fun night.

Abstract
Modern applications operate on data in several different forms: Relational tables, XML documents, and in-memory objects. Each of these domains can have profound differences in semantics, data types, and capabilities, and much of the complexity in today's applications is the result of these mismatches. The future "Orcas" release of Visual Studio aims to unify the programming models through integrated query capabilities in C# and Visual Basic, a strongly typed data access framework, and an innovative API for manipulating and querying XML. This talk explains the ideas behind language integrated queries (LINQ) and discusses the language enhancements behind them.

Slides
The slides for the presentation are here.

XLinq: XML Programming Refactored (The Return Of The Monoids)

I just posted my XML 2005 submission about XLinq on my homepage.
It describes the XLinq API in some detail, and informally explains the relationship between LINQ and monads.

FLOPS 2006

The call for papers for FLOPS 2006 is now out.

FLOPS benefits from an eclectic mix of FP and LP papers,
one of the few venues where the two communities get
together. It should be a congenial meeting, situated under
Mt Fuji.

We have two excellent invited speakers.
Peter van Roy, on Mozart
Guy Steele, on Fortress (just confirmed)

Submission deadline is 11 November 2005,
the conference is 24--28 April 2006.
Do submit, do come!

Visual Basic 9 Interview on DDJ

For those interested in the new VB9 language, there is an interview on DDJ with me and my partners in crime Paul Vick and Amanda Silver, together with our more pointy-haired, but good, friends Alan Griver, Rob Copeland, and Jay Roxe.

The reason I got sold on software transactions, as opposed to joins, is this paper.

Visual Basic and LINQ

Over the last couple of months, both my existence and my judgments have been questioned several times on my favorite programming languages waterhole :-)

In the meantime, I was busily working with the SQL, XML, C# and Visual Basic teams on language integrated query, or as it is now called, project LINQ. In particular, since early this year I have been collaborating with Amanda Silver, Paul Vick, Rob Copeland, and Alan Griver on what has become my programming language of choice, Visual Basic.

If you look closely at the new features introduced to C# and Visual Basic in the context of LINQ, you will recognize many familiar concepts that are regularly discussed on LTU ranging from monads, to meta-programming, lambda expressions, XML programming, to the relationship between static and dynamic typing.

The LINQ project consists of a base pattern of query operators (compare to the monad primitives) such as Select (map), SelectMany (concatMap), Where (filter), OrderBy (sort), and GroupBy (groupBy) on top of which Visual Basic and C# define query comprehensions (compare to monad comprehensions) that facilitate querying objects, relational data and XML. The C# syntax for query comprehensions is similar to FLWOR expressions, while the Visual Basic syntax stays close to SQL including aggregation.
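The correspondence between the operator pattern and comprehensions can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the LINQ API: the helper functions `select`, `select_many`, `where`, `order_by`, and `group_by` are hypothetical stand-ins named after the LINQ operators, the sample data is invented, and a Python list comprehension plays the role of a query comprehension.

```python
from itertools import chain

def select(f, xs):        # Select ~ map
    return [f(x) for x in xs]

def select_many(f, xs):   # SelectMany ~ concatMap
    return list(chain.from_iterable(f(x) for x in xs))

def where(p, xs):         # Where ~ filter
    return [x for x in xs if p(x)]

def order_by(key, xs):    # OrderBy ~ sort
    return sorted(xs, key=key)

def group_by(key, xs):    # GroupBy ~ groupBy
    groups = {}
    for x in xs:
        groups.setdefault(key(x), []).append(x)
    return groups

# Invented sample data: customer name -> order amounts.
orders = {"ann": [5, 40], "bob": [25], "cy": []}

# The comprehension form of a query...
big = [(name, amount)
       for name in orders
       for amount in orders[name]
       if amount > 20]

# ...and the same query written directly against the operators.
pairs = select_many(
    lambda name: select(lambda amount: (name, amount), orders[name]),
    orders,
)
big_desugared = where(lambda pair: pair[1] > 20, pairs)

by_amount = order_by(lambda pair: pair[1], big)
by_name = group_by(lambda pair: pair[0], pairs)

print(big)            # [('ann', 40), ('bob', 25)]
print(big_desugared)  # [('ann', 40), ('bob', 25)]
print(by_amount)      # [('bob', 25), ('ann', 40)]
```

The nested generator in the comprehension becomes a SelectMany-style call and the condition becomes a Where-style call, which is the same shape of translation a compiler performs when it maps a query comprehension onto the operator pattern.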

In addition to the language extensions and base operators, LINQ provides two supplementary domain-specific APIs, namely DLinq (compare to HaskellDB) for SQL relational data access, and XLinq (compare to HaXml) for XML hierarchical data access. Besides query comprehensions, Visual Basic provides deep XML integration with XML literals and XML late binding on top of XLinq (compare to Haskell Server Pages, XMLambda, Comega).

Both Visual Basic and C# have added several additional language extensions in support of LINQ, including local type inference (the types of local variable declarations are inferred from their initializers), lambda expressions (with type inference), local functions, anonymous types, object initializers, extension methods (static methods that can be called using instance method syntax), and meta-programming via expression trees (compare to type-based quote and quasi-quote).

Visual Basic adds some further enhancements to leverage the fact that it allows static typing where possible and dynamic typing where necessary, in the form of relaxed delegates, improved nullable support, dynamic identifiers (which make writing meta-circular interpreters a breeze) and, last but not least, dynamic interfaces, or as I like to refer to them, strong duck typing (compare to simplified qualified types/type classes).

LINQ general website: http://msdn.microsoft.com/netframework/future/linq/
VB9 specific website: http://msdn.microsoft.com/vbasic/future
