DSL

Project Sikuli

Picture or screenshot driven programming from the MIT.

From the Sikuli project page:

Sikuli is a visual technology to search and automate graphical user interfaces (GUI) using images (screenshots). The first release of Sikuli contains Sikuli Script, a visual scripting API for Jython, and Sikuli IDE, an integrated development environment for writing visual scripts with screenshots easily.

ScalaModules: a DSL for bringing OSGi to Scala

ScalaModules is an open source project aimed at providing fluent support for OSGi to Scala developers. It takes advantage of Scala's infix operator notation, higher order functions, and implicit conversions. ScalaModules transparently uses the Scala compiler to wrap an OSGi BundleContext with its own RichBundleContext model.

This general technique is not unusual for creating DSLs in mainstream languages. Sean McDirmid uses similar tricks for his C# Bling library for WPF, except that Bling must overcome the lack of C# offering comparable extensions to Scala.

The AI Systems of Left 4 Dead

There's no PL content per se in this presentation, but a PL weenie will surely think of a DSL for almost every slide, so I hope posting this is warranted.

Tim Sweeney has written previously on the programming challenges raised by game development (The Next Mainstream Programming Languages: A Game Developer's Perspective), and I think this presentation is another showcase of the huge problems that need solving in game development.

Finally, I wonder why anyone would use a language that doesn't allow quick and simple syntax extension for driving a game engine? Seriously, the possibilities for ad-hoc DSLs seem endless.

Back to the Future: Lisp as a Base for a Statistical Computing System

Back to the Future: Lisp as a Base for a Statistical Computing System by Ross Ihaka and Duncan Temple Lang, and the accompanying slides.

This paper was previously discussed on comp.lang.lisp, but apparently not covered on LtU before.

The application of cutting-edge statistical methodology is limited by the capabilities of the systems in which it is implemented. In particular, the limitations of R mean that applications developed there do not scale to the larger problems of interest in practice. We identify some of the limitations of the computational model of the R language that reduces its effectiveness for dealing with large data efficiently in the modern era.

We propose developing an R-like language on top of a Lisp-based engine for statistical computing that provides a paradigm for modern challenges and which leverages the work of a wider community. At its simplest, this provides a convenient, high-level language with support for compiling code to machine instructions for very significant improvements in computational performance. But we also propose to provide a framework which supports more computationally intensive approaches for dealing with large datasets and position ourselves for dealing with future directions in high-performance computing.

We discuss some of the trade-offs and describe our efforts to realizing this approach. More abstractly, we feel that it is important that our community explore more ambitious, experimental and risky research to explore computational innovation for modern data analyses.

Foot note:
Ross Ihaka co-developed the R statistical programming language with Robert Gentleman. For those unaware, R is effectively an open source implementation of S-PLUS, which in turn was based on S. R is sort of the lingua franca of statistics, and you can usually find R code provided in the back of several Springer Verlag monographs.

Duncan Temple Lang is a core developer of R and has worked on the core engine for TIBCO's S-PLUS.

Thanks to LtU user bashyal for providing the links.

Why API Design Matters

Michi Henning, Why API Design Matters, Communications of the ACM, May 2009.

After more than 25 years as a software engineer, I still find myself underestimating the time it takes to complete a particular programming task. Sometimes, the resulting schedule slip is caused by my own shortcomings: as I dig into a problem, I simply discover it is a lot more difficult than I initially thought, so the problem takes longer to solve—such is life as a programmer. Just as often I know exactly what I want to achieve and how to achieve it, but it still takes far longer than anticipated. When that happens, it is usually because I am struggling with an application programming interface (API) that seems to do its level best to throw rocks in my path and make my life difficult. What I find telling is that, even after 25 years of progress in software engineering, this still happens. Worse, recent APIs implemented in modern programming languages make the same mistakes as their 20-year-old counterparts written in C. There seems to be something elusive about API design that, despite years of progress, we have yet to master.

This is a rather accessible look at the consequences of bad API design. Interestingly enough, the main example revolves around the inappropriate use of side effects. The last section concludes with cultural changes the author feels is necessary to improve the situation.

Ï€: a pattern language

π - not to be confused with the π-calculus - is a pattern-based language being developed by the Software Technology group at Technische Universität Darmstadt. Quoting from the project website:

There is only one language construct in π: the pattern. Patterns are, simply speaking, EBNF-expressions with an associated meaning; a pattern can be easiest understood as a function with a syntactically complex (context-free) "signature". The non-terminal symbols in the signature are then the parameters of the pattern. A π-program is a sequence of instruction symbols (technically, sentences), each being a sequence of (Unicode) characters. The sentences are then evaluated (executed) in the respective order.

The basic idea here seems similar to the OMeta language, previously mentioned on LtU here, but based on EBNF instead of Parsing Expression Grammars (PEGs).

Pattern definitions in π have the form

declare_pattern name ≔ syntax ⇒ type ➞ meaning;

Here's a trivial example of defining a pattern:

declare_pattern
   integer_potentiation ≔ integer:i %W- "^" %W- integer:j ⇒ integer ➞
   {
      int result = i;
      for (int k = 1; k <= j-1; k++)
         result *= i;
      return result;
   };

The resulting pattern can then be used directly in expressions, such as print(42^6);.

More information about the language, as well as the implementation, can be found at http://www.pi-programming.org. There's an OOPSLA09 paper on π as well, but I haven't been able to find an open access version of it yet.

[Update: the π team has made their OOPSLA article available here]

Causal Commutative Arrows and Their Optimization

Causal Commutative Arrows and Their Optimization, Hai Liu. Eric Cheng. Paul Hudak. ICFP 2009.

Arrows are a popular form of abstract computation. Being more general than monads, they are more broadly applicable, and in particular are a good abstraction for signal processing and dataflow computations. Most notably, arrows form the basis for a domain specific language called Yampa, which has been used in a variety of concrete applications, including animation, robotics, sound synthesis, control systems, and graphical user interfaces.

Our primary interest is in better understanding the class of abstract computations captured by Yampa. Unfortunately, arrows are not concrete enough to do this with precision. To remedy this situation we introduce the concept of commutative arrows that capture a kind of non-interference property of concurrent computations. We also add an init operator, and identify a crucial law that captures the causal nature of arrow effects. We call the resulting computational model causal commutative arrows.

To study this class of computations in more detail, we define an extension to the simply typed lambda calculus called causal commutative arrows (CCA), and study its properties. Our key contribution is the identification of a normal form for CCA called causal commutative normal form (CCNF). By defining a normalization procedure we have developed an optimization strategy that yields dramatic improvements in performance over conventional implementations of arrows. We have implemented this technique in Haskell, and conducted benchmarks that validate the effectiveness of our approach. When combined with stream fusion, the overall methodology can result in speed-ups of greater than two orders of magnitude.

One way of understanding what is going on in this paper is that in terms of dataflow programming, FRP programs correspond to programs with single-entry, single-exit dataflow graphs. This means that none of the internal dataflow nodes in an FRP program are actually necessary -- you can coalesce all those nodes into a single node while preserving the observable behavior. (They briefly touch on this point when they mention that synchronous languages try to compile to "single loop code".) What's very slick is that they have a nice normal form procedure that (a) is defined entirely in terms of their high-level language, and (b) always yields code corresponding to the the coalesced dataflow graph. It's an elegant demonstration of the power of equational reasoning.

L+C Modeling Language

Radoslaw Karwowski and Przemyslaw Prusinkiewicz. Design and implementation of the L+C modeling language. Electronic Notes in Theoretical Computer Science 86 (2), 19 pp.

L-systems are parallel grammars that provide a theoretical foundation for a class of programs used in the simulation of plant development and procedural image synthesis. In particular, the formalism of L-systems guides the construction of declarative languages for specifying input to these programs. We outline key factors that have motivated the development of L-system-based languages in the past, and introduce a new language, L+C, that addresses the shortcomings of its predecessors. We also describe a simulation program, lpfg, which makes it possible to execute models specified in L+C. To this end, L+C programs are translated into C++, compiled into a DLL, and linked with lpfg at runtime. The use of this strategy simplifies the implementation of the modeling system.

I somehow failed to notice L+C as a demarcated language before. I am not entirely sure what one needs to download in order to use L+C directly.

I am sure this is relevant to the great DSL debate, but I am not sure in favor of which side of the debate...

The page gives links to other papers about L+C. While you are there, don't miss The Algorithmic Beauty of Plants, which is great fun.

DSL goodness

The site for DSL'09, which took place two months ago in Oxford, is a treasure trove for DSL fans. While the blog format may be a bit confusing for a conference website, you should be able to find your way around to links to slides and paper versions of the presentations. Even better, the posts include various comments people made about the papers (each talk was followed by a comment from a discussant). Apparently one can even join in and post comments!

So tell us: What item caught your attention? Which paper should everyone here rush to read? Which DSL is downloadable and worth downloading?

Lifted inference: normalizing loops by evaluation

Lifted inference: normalizing loops by evaluation. Oleg Kiselyov and Chung-chieh Shan. 2009 Workshop on Normalization by Evaluation.

Many loops in probabilistic inference map almost every individual in their domain to the same result. Running such loops symbolically takes time sublinear in the domain size. Using normalization by evaluation with first-class delimited continuations, we lift inference procedures to reap this speed-up without interpretive overhead. To express nested loops, we use multiple control delimiters for metacircular interpretation. To express loops over a powerset domain, we convert nested loops over a subset to unnested loops.

The paper is a bit hard to follow, but there are enough little tricks here to merit attentive reading. Or better yet, read the code. The basic PLT idea might be summed as doing abstract interpretation on a shallowly embedded DSL using delimited continuations.

XML feed