DSL

NEXCEL, a Deductive Spreadsheet

NEXCEL, a Deductive Spreadsheet, Iliano Cervesato. 2006.

Usability and usefulness have made the spreadsheet one of the most successful computing applications of all times: millions rely on it every day for anything from typing grocery lists to developing multimillion dollar budgets. One thing spreadsheets are not very good at is manipulating symbolic data and helping users make decisions based on them. By tapping into recent research in Logic Programming, Databases and Cognitive Psychology, we propose a deductive extension to the spreadsheet paradigm which addresses precisely this issue. The accompanying tool, which we call NEXCEL, is intended as an automated assistant for the daily reasoning and decision-making needs of computer users, in the same way as a spreadsheet application such as Microsoft Excel assists them every day with calculations simple and complex. Users without formal training in Logic or even Computer Science can interactively define logical rules in the same simple way as they define formulas in Excel. NEXCEL immediately evaluates these rules thereby returning lists of values that satisfy them, again just like with numerical formulas. The deductive component is seamlessly integrated into the traditional spreadsheet so that a user not only still has access to the usual functionalities, but is able to use them as part of the logical inference and, dually, to embed deductive steps in a numerical calculation.

This is a neat paper about using Datalog-style relations to extend spreadsheets with some deductive database features. It seems like Datalog represents a real sweet spot in the design space for logic programming -- I've seen a lot of people put it to effective use.
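To make that "sweet spot" concrete, here is a minimal sketch in Python (NEXCEL's actual surface syntax is spreadsheet formulas, not this) of the bottom-up evaluation a recursive Datalog program gets, using the classic ancestor example, which no ordinary spreadsheet formula can express:

    # Naive bottom-up evaluation of the classic recursive Datalog program:
    #   ancestor(X, Y) :- parent(X, Y).
    #   ancestor(X, Z) :- parent(X, Y), ancestor(Y, Z).
    parent = {("alice", "bob"), ("bob", "carol"), ("carol", "dave")}

    def ancestors(parent_facts):
        derived = set(parent_facts)              # first rule
        while True:                              # iterate to a fixpoint
            new = {(x, z)
                   for (x, y) in parent_facts    # second rule: join ...
                   for (y2, z) in derived        # ... parent with ancestor
                   if y == y2} - derived
            if not new:                          # nothing new: done
                return derived
            derived |= new

    print(sorted(ancestors(parent)))
    # ('alice', 'dave') appears even though no single parent fact connects them.

The point of the deductive spreadsheet is that users write just the two rules and get the fixpoint computation for free, the same way they get recalculation for free with numerical formulas.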

Technometria: Google Web Toolkit

This episode of Phil Windley's Technometria podcast is dedicated to the Google Web Toolkit. The guest on the show is Bruce Johnson, a tech lead on GWT.

The show is very good, and more technical than usual. Many themes that are near and dear to LtU are discussed. Here are some pointers:

Bruce talks at length about the advantages of compiling from Java to JS, many of which arise from Java's static typing. He mainly talks about optimizations, but also about how static typing helps with tools in general (IDEs etc.). This was a subject of long and stormy debates here in the past.

The advantages, from a software engineering standpoint, of building in Java vs. JS are discussed. This is directly related to the ongoing discussion here on the new programming-in-the-large features added to JS2. I wonder if someone will write a compiler from Java/GWT to JS2 at some point, which would enable projects to move to JS2 and jump ship on Java altogether.

Bruce mentions that since JS isn't class-based, and thus doesn't directly support the OO style many people are used to, there are many ways of translating common OO idioms into JS. This is, of course, the same type of dilemma the Scheme community has about many high-level features. Cast as a question about OOP support, it becomes: is it better to provide language constructs that let libraries add OO support in different ways, or to provide language support for one specific style? The same can be asked about a variety of features and programming styles, of course.

Finally, Bruce mentions that as far as he knows no one thought about something like GWT before they did. Well, I for one, and I don't think I was the only one, talked many times (probably on LtU) about JavaScript as a VM/assembly language of the browser, clearly thinking about JS as a target language. I admit I wasn't thinking about compiling Java... But then, I am not into writing Java, so why would I think about Java as the source language...

The End of an Architectural Era (It’s Time for a Complete Rewrite)

The End of an Architectural Era (It’s Time for a Complete Rewrite). Michael Stonebraker, Samuel Madden, Daniel J. Abadi, Stavros Harizopoulos, Nabil Hachem, Pat Helland. VLDB 2007.

A not directly PL-related paper about a new database architecture, but the authors provide some interesting and possibly controversial perspectives:

  • They split the application into per-core, single-threaded instances without any communication between them (see the sketch after this list).
  • Instead of using SQL from an external (web app) process to communicate with the database, they envision embedding Ruby on Rails directly into the database.
  • They state that most data warehouse tasks rely on pre-canned queries only, so there is no need for ad-hoc querying.
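On the first point, a minimal sketch of that execution style (assuming nothing about H-Store's actual code; all names here are made up): if every transaction is routed to a single partition, each partition can be owned by one thread, and no locks or latches are needed inside it.

    import queue, threading

    # Toy model of per-core, single-threaded execution: one "engine"
    # thread owns each partition, transactions are routed by key, and
    # no locking happens inside an engine because execution there is
    # serial. Illustrative only; nothing here is from H-Store.
    NUM_PARTITIONS = 4
    inboxes = [queue.Queue() for _ in range(NUM_PARTITIONS)]
    tables = [dict() for _ in range(NUM_PARTITIONS)]   # per-partition data

    def engine(pid):
        while True:
            txn = inboxes[pid].get()   # one txn at a time: serial = serializable
            if txn is None:            # shutdown sentinel
                break
            txn(tables[pid])           # runs lock-free on its own data

    threads = [threading.Thread(target=engine, args=(p,))
               for p in range(NUM_PARTITIONS)]
    for t in threads:
        t.start()

    def submit(key, txn):
        inboxes[hash(key) % NUM_PARTITIONS].put(txn)

    submit("acct-17", lambda table: table.update({"acct-17": 100}))
    for q in inboxes:
        q.put(None)
    for t in threads:
        t.join()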

The somewhat performance-focused abstract:

In two previous papers some of us predicted the end of "one size fits all" as a commercial relational DBMS paradigm. These papers presented reasons and experimental evidence that showed that the major RDBMS vendors can be outperformed by 1-2 orders of magnitude by specialized engines in the data warehouse, stream processing, text, and scientific data base markets.

Assuming that specialized engines dominate these markets over time, the current relational DBMS code lines will be left with the business data processing (OLTP) market and hybrid markets where more than one kind of capability is required. In this paper we show that current RDBMSs can be beaten by nearly two orders of magnitude in the OLTP market as well. The experimental evidence comes from comparing a new OLTP prototype, H-Store, which we have built at M.I.T. to one of the popular RDBMSs on the standard transactional benchmark, TPC-C.

We conclude that the current RDBMS code lines, while attempting to be a "one size fits all" solution, in fact, excel at nothing. Hence, they are 25-year-old legacy code lines that should be retired in favor of a collection of "from scratch" specialized engines. The DBMS vendors (and the research community) should start with a clean sheet of paper and design systems for tomorrow's requirements, not continue to push code lines and architectures designed for yesterday's requirements.

A critical comment by Amazon's CTO, Werner Vogels.

binpac: A yacc for Writing Application Protocol Parsers

binpac: A yacc for Writing Application Protocol Parsers.

R. Pang, V. Paxson, R. Sommer, and L. Peterson. ACM Internet Measurement Conference. October 2006.

A key step in the semantic analysis of network traffic is to parse the traffic stream according to the high-level protocols it contains. This process transforms raw bytes into structured, typed, and semantically meaningful data fields that provide a high-level representation of the traffic. However, constructing protocol parsers by hand is a tedious and error-prone affair due to the complexity and sheer number of application protocols. This paper presents binpac, a declarative language and compiler designed to simplify the task of constructing robust and efficient semantic analyzers for complex network protocols. We discuss the design of the binpac language and a range of issues in generating efficient parsers from high-level specifications. We have used binpac to build several protocol parsers for the "Bro" network intrusion detection system, replacing some of its existing analyzers (handcrafted in C++), and supplementing its operation with analyzers for new protocols. We can then use Bro's powerful scripting language to express application-level analysis of network traffic in high-level terms that are both concise and expressive. binpac is now part of the open-source Bro distribution.

Binpac nicely abstracts away issues such as large numbers of concurrent, asynchronous parsing processes and protocol specifics (such as HTTP's chunked encoding). A parser for a large part of HTTP is presented in the paper and fits on half a page. The authors have also written parsers for CIFS/SMB, DCE/RPC, DNS, NCP, and Sun/RPC.
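For flavor, here is a hypothetical Python analogue of the core idea (binpac's own syntax is different, and its generated parsers are C++): a declarative field list gets "compiled" into a parsing function, shown on the fixed DNS header.

    import struct

    # A hypothetical, binpac-flavored record description in Python: a
    # declarative list of (field, format) pairs is "compiled" into a
    # parsing function, instead of hand-writing the byte fiddling.
    def make_parser(fields):
        fmt = ">" + "".join(f for _, f in fields)   # ">" = network byte order
        size = struct.calcsize(fmt)
        names = [n for n, _ in fields]
        def parse(data):
            values = struct.unpack(fmt, data[:size])
            return dict(zip(names, values)), data[size:]   # record + leftover
        return parse

    # The fixed 12-byte DNS header, described declaratively.
    parse_dns_header = make_parser([
        ("id", "H"), ("flags", "H"), ("qdcount", "H"),
        ("ancount", "H"), ("nscount", "H"), ("arcount", "H"),
    ])

    header, rest = parse_dns_header(
        b"\x12\x34\x01\x00\x00\x01\x00\x00\x00\x00\x00\x00")
    print(hex(header["id"]), header["qdcount"])   # 0x1234 1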

Jon Udell on CoScripter

The web site description of CoScripter:

CoScripter is a system for recording, automating, and sharing processes performed in a web browser such as printing photos online, requesting a vacation hold for postal mail, or checking bank account information. Instructions for processes are recorded and stored in easy-to-read text here on the CoScripter web site, so anyone can make use of them. If you are having trouble with a web-based process, check to see if someone has written a CoScript for it!

Jon's discussion emphasizes the DSL perspective (and end-user programming).

The community dynamics enabled by exposing a DSL seem to me the interesting aspect of this discussion. Yet another example you can use when arguing in favor of textual, user-accessible DSLs.
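To see why the textual form matters, consider a toy interpreter for a CoScripter-like script (the verbs and steps below are invented for illustration, not actual CoScripter commands): the program is plain text that a non-programmer can read, edit, and share, and running it is just line-by-line dispatch.

    # A toy interpreter for a CoScripter-style script. The verbs and
    # the steps are made up; the point is that the DSL is plain text.
    script = [
        "go to http://example.com/login",
        "enter alice into the username box",
        "click the submit button",
    ]

    def run(steps, actions):
        for step in steps:
            for verb, handler in actions.items():
                if step.startswith(verb):
                    handler(step[len(verb):].strip())
                    break
            else:
                print("don't know how to:", step)

    run(script, {
        "go to": lambda arg: print("navigate ->", arg),
        "enter": lambda arg: print("type     ->", arg),
        "click": lambda arg: print("click    ->", arg),
    })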

Beyond Pretty-Printing: Galley Concepts in Document Formatting Combinators

Beyond Pretty-Printing: Galley Concepts in Document Formatting Combinators, Wolfram Kahl. 1999.

Galleys have been introduced by Jeff Kingston as one of the key concepts underlying his advanced document formatting system Lout. Although Lout is built on a lazy functional programming language, galley concepts are implemented as part of that language and defined only informally. In this paper we present a first formalization of document formatting combinators using galley concepts in the purely functional programming language Haskell.

We've talked about functional layout algorithms for mathematics before, so it seems like a good idea to link to a paper about typesetting all the text those formulas are surrounded by.
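For readers who haven't seen formatting combinators, here is a bare-bones sketch in Python rather than Haskell (it shows the combinator style only and captures none of Lout's galley machinery): a document is a value built from a few constructors and rendered by one function.

    import textwrap

    # Bare-bones formatting combinators: a document is a list of lines,
    # and a few functions compose documents. Kahl's galleys add, on top
    # of this, text that flows between receptive areas of a document.
    def text(s):       return [s]
    def above(*docs):  return [line for d in docs for line in d]   # vertical glue
    def indent(n, d):  return [" " * n + line for line in d]
    def fill(s, width=40):                     # break a paragraph to fit
        return textwrap.wrap(s, width)

    doc = above(
        text("1. Introduction"),
        indent(3, fill("Galleys let text flow from one receptive area "
                       "of a document to the next, the way this "
                       "paragraph flows within its own margin.")),
    )
    print("\n".join(doc))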

Taming the IXP network processor

Taming the IXP network processor, Lal George and Matthias Blume. PLDI 2003.

We compile Nova, a new language designed for writing network processing applications, using a back end based on integer-linear programming (ILP) for register allocation, optimal bank assignment, and spills. The compiler's optimizer employs CPS as its intermediate representation; some of the invariants that this IR guarantees are essential for the formulation of a practical ILP model.

Appel and George used a similar ILP-based technique for the IA32 to decide which variables reside in registers but deferred the actual assignment of colors to a later phase. We demonstrate how to carry over their idea to an architecture with many more banks, register aggregates, variables with multiple simultaneous register assignments, and, very importantly, one where bank- and register-assignment cannot be done in isolation from each other. Our approach performs well in practice--without causing an explosion in size or solve time of the generated integer linear programs.

This is a nice paper showing how to design and compile a small functional language in a high-performance domain with a very irregular underlying machine model.
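To give a feel for why bank assignment is hard: on the IXP, the two operands of an instruction must generally come from different register banks, so the assignment is a coloring-like problem. The paper encodes it (together with spills, aggregates, and multiple simultaneous assignments) as an ILP; the toy model below just brute-forces the core constraint for illustration.

    from itertools import product

    # Toy model of IXP-style bank assignment: every variable goes to
    # bank A or B, and the two operands of any instruction must sit in
    # different banks. The paper encodes this (plus spills, register
    # aggregates, etc.) as an ILP; brute force is shown for clarity.
    variables = ["x", "y", "z", "w"]
    operand_pairs = [("x", "y"), ("y", "z"), ("z", "w")]   # per instruction

    def feasible(assign):
        return all(assign[a] != assign[b] for a, b in operand_pairs)

    for banks in product("AB", repeat=len(variables)):
        assign = dict(zip(variables, banks))
        if feasible(assign):
            print(assign)   # {'x': 'A', 'y': 'B', 'z': 'A', 'w': 'B'}
            break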

User-level transactional programming in Haskell

User-level transactional programming in Haskell, Peter Thiemann. Haskell Workshop 2006.

Correct handling of concurrently accessed external resources is a demanding problem in programming. The standard approaches rely on database transactions or concurrency mechanisms like locks. The paper considers two such resources, global variables and databases, and defines transactional APIs for them in Haskell. The APIs provide a novel flavor of user-level transactions which are particularly suitable in the context of web-based systems. This suitability is demonstrated by providing a second implementation in the context of WASH, a Haskell-based Web programming system. The underlying implementation framework works for both kinds of resources and can serve as a blueprint for further implementations of user-level transactions. The Haskell type system provides an encapsulation of the transactional scope that avoids unintended breakage of the transactional guarantees.

I thought this was an interesting paper because it gives a concrete example of a case where you want transactions, but positively don't want the full suite of ACID properties. Maybe it's not so surprising to those hardcore software architects in our audience who go to sleep with Jim Gray and Andreas Reuter's book under their pillows, though. :)
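A minimal sketch of the flavor, in Python rather than Haskell (so none of the type-system encapsulation the paper relies on, and this is my rendering of the idea, not Thiemann's API): an optimistic user-level transaction reads a version, runs without locks, and commits only if nobody else committed in between.

    import threading

    # Optimistic user-level transactions over one global variable: read
    # a version, compute without locks, commit only if the version is
    # unchanged, retry otherwise. Deliberately not full ACID: nothing
    # is isolated while the body runs; conflicts just force a retry.
    class TVar:
        def __init__(self, value):
            self.value, self.version = value, 0
            self._commit_lock = threading.Lock()

        def atomically(self, body):
            while True:
                snapshot, seen = self.value, self.version
                result = body(snapshot)             # optimistic, lock-free
                with self._commit_lock:
                    if self.version == seen:        # no concurrent commit
                        self.value = result
                        self.version += 1
                        return result
                # lost a race: loop and retry on a fresh snapshot

    counter = TVar(0)
    counter.atomically(lambda v: v + 1)
    print(counter.value)   # 1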

Skipping C - SPE and synthetic programming in Python

Expression and Loop Libraries for High-Performance Code Synthesis. Christopher Mueller and Andrew Lumsdaine. LCPC 2006.

To simultaneously provide rapid application development and high performance, developers of scientific and multimedia applications often mix languages, using scripting languages to glue together high-performance components written in compiled languages. While this can be a useful development strategy, it distances application developers from the optimization process while adding complexity to the development tool chain. Recently, we introduced the Synthetic Programming Environment (SPE) for Python, a library for generating and executing machine instructions at run-time.

The authors haven't yet shown much interest in supporting the most widespread ISAs, those of Intel processors. Instead they focus on PowerPC, including the Cell. Have fun with Python and PS3 hacking.

Edit: You might also check out Christopher Mueller's site, which contains related material.
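To convey the flavor of synthetic programming (a from-scratch sketch, not the SPE API): machine instructions are just 32-bit integers, so ordinary Python functions make a runtime code generator; SPE adds the machinery to execute such a buffer in-process.

    # Synthetic programming from scratch (this is not the SPE API):
    # PowerPC instructions are 32-bit words, so one Python function
    # per mnemonic turns ordinary code into a runtime code generator.
    def addi(rt, ra, imm):          # addi rt, ra, imm  (primary opcode 14)
        return (14 << 26) | (rt << 21) | (ra << 16) | (imm & 0xFFFF)

    def blr():                      # branch to link register (return)
        return 0x4E800020

    # "li r3, 42; blr": a function returning 42, since r3 carries the
    # integer return value in the PowerPC ABI and li is addi with ra=0.
    code = [addi(3, 0, 42), blr()]
    print([hex(w) for w in code])   # ['0x3860002a', '0x4e800020']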

Domain-Specific Aspect Languages

Although the majority of work in the AOSD community focuses on general-purpose aspect languages (e.g. AspectJ), seminal work on AOSD proposed a number of domain-specific aspect languages, such as COOL for concurrency management, RIDL for serialization, RG, AML, and others. A growing trend of research in the AOSD community is returning to this seminal work.

Since it seems it is DSL week around here, and since Domain-Specific Aspect Languages have not, as far as I can remember, been discussed here before, I think now may be an appropriate time to discuss this notion.

To begin the tour, head out to the web page of the first DSAL workshop, DSAL'06, which "approached domain-specific aspect languages from a language implementation point of view, where advances in the field of domain-specific language engineering were investigated to answer the implementation challenges of aspect languages," and then move over to DSAL'07, which dealt with the design and implementation of new domain-specific aspect languages in more detail.
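To get a feel for what a domain-specific aspect buys you, here is a COOL-flavored sketch in Python (a decorator standing in for what COOL states declaratively; this is not COOL syntax): the concurrency-management policy lives in one place instead of being tangled through every method.

    import threading

    # A COOL-flavored concurrency "aspect" in Python: the mutual-
    # exclusion policy lives in one decorator instead of being tangled
    # through every method. Not COOL syntax, and simplified: one
    # class-wide lock, where COOL's selfex is per-object.
    def selfex(*method_names):
        def wrap_class(cls):
            lock = threading.Lock()
            for name in method_names:
                method = getattr(cls, name)
                def locked(self, *args, _m=method, **kwargs):
                    with lock:
                        return _m(self, *args, **kwargs)
                setattr(cls, name, locked)
            return cls
        return wrap_class

    @selfex("push", "pop")
    class Stack:                    # base code: no locking anywhere
        def __init__(self):
            self.items = []
        def push(self, x):
            self.items.append(x)
        def pop(self):
            return self.items.pop()

    s = Stack()
    s.push(1)
    print(s.pop())   # 1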
