DSL

COLA Brainfuck

From the Software Architecture Group at the Hasso Plattner Institut:

Our tutorial on COLA provides insight on how programming languages can be implemented using the combined abstractions and an implementation of parsing expression grammars in COLA. The "esoteric" programming language brainfuck was chosen for its simplicity, which allows for concentrating on COLA's features.

Previously: COLA and Open, extensible object models; via neuraxon77.

Chris Crawford's 9 Breakthroughs

Games industry curmudgeon and interactive storytelling proponent Chris Crawford spoke at the Game Developers Exchange conference here in Atlanta yesterday. As a part of the talk he explained the "Nine Breakthroughs" that were important to his work on Storytron. I recorded them here...

Surprisingly, many of the ideas are about languages - either the programming languages used to program the games, or the language interfaces provided to users.

Applied Metamodelling: A Foundation for Language Driven Development

Applied Metamodelling: A Foundation for Language Driven Development (2004)
by Tony Clark, Paul Sammut, James Willans

An excerpt:

Language-driven development is fundamentally based on the ability to rapidly design new languages and tools in a unified and interoperable manner. We argue that existing technologies do not provide this capability, but a language engineering approach based on metamodelling can. The detailed study of metamodelling and how it can realise the Language-Driven Development vision will form the focus for the remainder of this book.

In software engineering circles the term "language driven development" is synonymous with "language oriented programming", a term which LtU members are more familiar with (thanks to Martin Ward's article Language Oriented Programming which first appeared in 1994, and then Martin Fowler's essays on the topic). The book hasn't appeared on the radar here on LtU, despite 41 citations. I suspect this is due in part to only one citation at Citeseer, and the lack of cross-talk between computer scientists and software engineers.

There are a lot of similarities between the XMF language (discussion at LtU) and that of the Katahdin language (discussion at LtU). Other related discussions here at LtU, include Language Workbenches: The Killer App for DSLs - about the essay by Martin Fowler, Ralph Johnson: Language workbenches - a response to Fowler's essay, XActium - Lightweight Language Engineering? - which discusses an essay about a previous version of XMF, Generating Interpreters? , Language Oriented Programming - discusses an essay by Jetbrain's Sergey Dmitriev, "Language Oriented Programming" Meta Programming System - discussion of the Jetbrain MPS system, The DSL, MDA, UML thing again... - an older discussion on the relationship between DSLs and MDA.

(Disclaimer: Some may notice that I am mentioned on the XMF web site, but this is just because I subjected their XMF language to a number of grueling challenges which they passed with flying colors: see the language snippets in the documentation. I have no affiliation with their company.)

The little b language: shared models built from reusable parts

The little b project is an effort to provide an open source language which allows scientists to build mathematical models of complex systems. The initial focus is systems biology. The goal is to stimulate widespread sharing and reuse of models.

The little b language is designed to allow biologists to build models quickly and easily from shared parts, and to allow theorists to program new ways of describing complex systems. Currently, libraries have been developed for building ODE models of molecular networks in multi-compartment systems such as cellular epithelia.

Little b is based in Common Lisp and contains mechanisms for rule-based reasoning, symbolic mathematics and object-oriented definitions. The syntax is designed to be terse and human-readable to facilitate communication. The environment is both interactive and compilable.

Yet another biological DSL.

As usual, it is best to start by looking at some sample models.

CUFP write-up

A write-up of the Commercial Users of Functional Programming meeting held this October is available, for those of us who didn't attend. The write-up is well written and thought provoking (it was written by Jeremy Gibbons, so that's not a surprise).

The goal of the Commercial Users of Functional Programming series of workshops is to build a community for users of functional programming languages and technology. This, the fourth workshop in the series, drew 104 registered participants (and apparently more than a handful of casual observers).

It is often observed that functional programming is most widely used for language related projects (DSLs, static analysis tools and the like). Part of the reason is surely cultural. People working on projects of this type are more familiar with functional programming than people working in other domains. But it seems that it may be worthwhile to discuss the other reasons that make functional languages suitable for this type of work. There are plenty of reasons, many of them previously discussed here (e.g., Scheme's uniform syntax, hygiene, DSELs), but perhaps the issue is worth revisiting, seeing as this remains the killer application for functional programming, even taking into account the other types of project described in the workshop.

NEXCEL, a Deductive Spreadsheet

NEXCEL, a Deductive Spreadsheet, Iliano Cervesato. 2006.

Usability and usefulness have made the spreadsheet one of the most successful computing applications of all times: millions rely on it every day for anything from typing grocery lists to developing multimillion dollar budgets. One thing spreadsheets are not very good at is manipulating symbolic data and helping users make decisions based on them. By tapping into recent research in Logic Programming, Databases and Cognitive Psychology, we propose a deductive extension to the spreadsheet paradigm which addresses precisely this issue. The accompanying tool, which we call NEXCEL, is intended as an automated assistant for the daily reasoning and decision-making needs of computer users, in the same way as a spreadsheet application such as Microsoft Excel assists them every day with calculations simple and complex. Users without formal training in Logic or even Computer Science can interactively define logical rules in the same simple way as they define formulas in Excel. NEXCEL immediately evaluates these rules thereby returning lists of values that satisfy them, again just like with numerical formulas. The deductive component is seamlessly integrated into the traditional spreadsheet so that a user not only still has access to the usual functionalities, but is able to use them as part of the logical inference and, dually, to embed deductive steps in a numerical calculation.

This is a neat paper about using Datalog-style relations to extend spreadsheets with some deductive database features. It seems like Datalog represents a real sweet spot in the design space for logic programming -- I've seen a lot of people put it to effective use.

Technometria: Google Web Toolkit

Phil Windley Technometria podcast is dedicated to the Google Web Toolkit. The guest on the show is Bruce Johnson a Tech Lead of GWT.

The show is very good, and more technical than usual. Many themes that are near and dear to LtU are discussed. Here are some pointers:

Bruce talks at length about the advantages of compiling from Java to JS, many of which arise from Java's static typing. He mainly talks about optimizations, but also about how static typing helps with tools in general (IDEs etc.). This was a subject of long and stormy debates here in the past.

The advantages, from a software engineering standpoint, of building in Java vs. JS are discussed. This is directly related to the ongoing discusison here on the new programming-in-the-large features added to JS2. I wonder if someone will write a compiler from Java/GWT to JS2 at some point, which will enable projects to move to JS2 and jump ship on Java all together.

Bruce mentions that since JS isn't class-based, and thus doesn't directly support the OO style many people are used to, there are many ways of translating common OO idioms into JS. This is, of course, the same type of dilemma the Scheme community has about many high level features. Cast as a question on OOP support the questions becomes is it better to provide language constructs that allow various libraries to add OO support in different ways, or to provide language support for a specific style. The same can be asked about a variety of features and programming styles, of course.

Finally, Bruce mentions that as far as he knows no one thought about something like GWT before they did. Well, I for one, and I don't think I was the only one, talked many times (probably on LtU) about Javascript as a VM/assembly language of the browser, clearly thinking about JS as a target language. I admint I wasn't thinking aobut compiling Java... But then, I am not into writing Java, so why would I think about Java as the source language...

The End of an Architectural Era (It’s Time for a Complete Rewrite)

The End of an Architectural Era (It’s Time for a Complete Rewrite). Michael Stonebraker, Samuel Madden, Daniel J. Abadi, Stavros Harizopoulos, Nabil Hachem, Pat Helland. VLDB 2007.

A not directly PL-related paper about a new database architecture, but the authors provide some interesting and possibly controversial perspectives:

  • They split the application into per-core, single-threaded instances without any communication between them.
  • Instead of using SQL from an external (web app) process to communicate with the database, they envision embedding Ruby on Rails directly into the database.
  • They state that most database warehouse tasks rely on pre-canned queries only, so there is no need for ad-hoc querying.

The somewhat performance-focused abstract:

In two previous papers some of us predicted the end of "one size fits all" as a commercial relational DBMS paradigm. These papers presented reasons and experimental evidence that showed that the major RDBMS vendors can be outperformed by 1-2 orders of magnitude by specialized engines in the data warehouse, stream processing, text, and scientific data base markets.

Assuming that specialized engines dominate these markets over time, the current relational DBMS code lines will be left with the business data processing (OLTP) market and hybrid markets where more than one kind of capability is required. In this paper we show that current RDBMSs can be beaten by nearly two orders of magnitude in the OLTP market as well. The experimental evidence comes from comparing a new OLTP prototype, H-Store, which we have built at M.I.T. to one of the popular RDBMSs on the standard transactional benchmark, TPC-C.

We conclude that the current RDBMS code lines, while attempting to be a "one size fits all" solution, in fact, excel at nothing. Hence, they are 25 year old legacy code lines that should be retired in favor of a collection of "from scratch" specialized engines. The DBMS vendors (and the research community) should start with a clean sheet of paper and design systems for tomorrow's requirements, not continue to push code lines and architectures designed for the yesterday's requirements.

A critical comment by Amazon's CTO, Werner Vogels.

binpac: A yacc for Writing Application Protocol Parsers

binpac: A yacc for Writing Application Protocol Parsers.

R. Pang, V. Paxson, R. Sommer, and L. Peterson. ACM Internet Measurement Conference. October 2006.

A key step in the semantic analysis of network traffic is to parse the traffic stream according to the high-level protocols it contains. This process transforms raw bytes into structured, typed, and semantically meaningful data fields that provide a high-level representation of the traffic. However, constructing protocol parsers by hand is a tedious and error-prone affair due to the complexity and sheer number of application protocols. This paper presents binpac, a declarative language and compiler designed to simplify the task of constructing robust and efficient semantic analyzers for complex network protocols. We discuss the design of the binpac language and a range of issues in generating efficient parsers from high-level specifications. We have used binpac to build several protocol parsers for the "Bro" network intrusion detection system, replacing some of its existing analyzers (handcrafted in C++), and supplementing its operation with analyzers for new protocols. We can then use Bro's powerful scripting language to express application-level analysis of network traffic in high-level terms that are both concise and expressive. binpac is now part of the open-source Bro distribution.

Binpac nicely abstracts away issues such as large numbers of concurrent, asynchronous parsing processes and protocol specifics (such as HTTP's chunked encoding). A parser for a large part of HTTP is presented in the paper and fits on half a page. The authors have also written parsers for CIFS/SMB, DCE/RPC, DNS, NCP, and Sun/RPC.

Jon Udell on CoScripter

The web site description of CoScripter:

CoScripter is a system for recording, automating, and sharing processes performed in a web browser such as printing photos online, requesting a vacation hold for postal mail, or checking bank account information. Instructions for processes are recorded and stored in easy-to-read text here on the CoScripter web site, so anyone can make use of them. If you are having trouble with a web-based process, check to see if someone has written a CoScript for it!

Jon's discussion emphasizes the DSL perspective (and end-user programming).

The community dynamics enabled by exposing a DSL seem to me the interesting aspect of this discussion. Yet another example you can use when arguing in favor of textual, user accessible, DSLs.

XML feed