DSL goodness

The site for DSL'09, which took place two months ago in Oxford, is a treasure trove for DSL fans. While the blog format may be a bit confusing for a conference website, you should be able to find your way around to links to slides and paper versions of the presentations. Even better, the posts include various comments people made about the papers (each talk was followed by a comment from a discussant). Apparently one can even join in and post comments!

So tell us: What item caught your attention? Which paper should everyone here rush to read? Which DSL is downloadable and worth downloading?

Conference program

The conference program is separately available as a table.

Panel

I think people here might want to say something after taking a look at the panel discussion...

Great, now I'll have to try

Great, now I'll have to try and figure out what Functional Hybrid Modelling is...

DSL's are...

DSL's are...

* ...what programmers create because the language they are forced to use (C/C++/Java/C#) is so broken, they need more.

* ...the Evil of Greenspun's Tenth Law oozing to the surface...

* ...the stunted malformed horrors config files grow into, because the language the program was written in lacked "eval".

* ...emasculated things that dumb control freaks use (or demand be created) because a Real Language scares them.

* ...what every programmer creates at least once in his life because the compiler construction course at Uni was so cool (even if totally impractical).

"Greenspun's Tenth Rule of Programming: any sufficiently complicated C or Fortran program contains an ad hoc informally-specified bug-ridden slow implementation of half of Common Lisp."

I haven't yet seen a DSL that hasn't on further analysis been a horrid limited malformed outgrowth of the brokenness of the underlying language.

(Yes, I have also proudly created DSL's in my life. In retrospect and hindsight, they were symptoms, not products.)

As part of a problem I've

As part of a problem I've been researching on-and-off-again for the past 2-3 years, I've come to a useful observation: a user interface is an application-specific language between the application and the user. If we interpret an application as a particular domain, we see that the designer's job is to make a language for the user to communicate needs to the underlying computation engine.

Under this perspective, DSLs are among the most important languages out there. It might be cost-effective to make a few engineers suffer, but not folks further down the user pyramid.

I'll agree with that...

But here is a small clue, a heads up then, about why I say many of these DSL's (or as you would say, User Interface) are malformed abortions in Programming Language Design.

When your user interface has some fields (in whatever form or format those fields may be presented) to enter a number, and the user cannot enter that number as an expression involving other numbers he has already entered, you have arbitrarily emasculated your user interface by your poor choice of implementation language.
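To make that complaint concrete, here is a minimal Python sketch of a numeric field that accepts an expression over fields the user has already filled in. The field names and the `eval_field` helper are hypothetical, and the set of allowed operators is an assumption; the point is only that this takes very little code when the implementation language gives you a parser.

```python
import ast
import operator

# Arithmetic operators the field language allows (an assumption for this sketch).
_BINARY = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def _is_number(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def eval_field(text, fields):
    """Evaluate a numeric field that may refer to other fields by name."""

    def walk(node):
        if isinstance(node, ast.Constant) and _is_number(node.value):
            return node.value
        if isinstance(node, ast.Name):
            if node.id not in fields:
                raise ValueError(f"unknown field: {node.id}")
            return fields[node.id]
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
            return _BINARY[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY:
            return _UNARY[type(node.op)](walk(node.operand))
        # Calls, attribute access and everything else are rejected.
        raise ValueError("unsupported expression")

    return walk(ast.parse(text, mode="eval").body)


# Hypothetical form: two plain entries, then two derived ones.
fields = {"width": 40, "height": 25}
fields["area"] = eval_field("width * height", fields)
fields["margin"] = eval_field("(width - 4) / 2", fields)
```

Anything that is not a number, a known field name, or one of the listed operators raises `ValueError`, so the field stays a calculator rather than a general `eval`.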

To an extent, I agree: the

To an extent, I agree: the benefit of limiting a language must outweigh the benefit of empowering it. In my work, I've been more interested in approaches that introduce additional abstractions with useful properties. Instead of restricting languages to gain properties, I do this as extensions over muck (e.g., Flapjax-related work) or as fuzzy analysis over muck (fuzzy-analysis-driven UI manipulation work).

However, this is not always true. E.g., we're finishing up a security project that required a heavily reduced language subset for policy writing (which can very usefully be thought of as a DSL) to avoid common attacks. In this case, the limits were well motivated: we found that other systems without these limitations had vulnerable code.

Heavily reduced language subset for policy writing... Noooo!

...ooo, my other hobby horse! Let me ride it...

Dumbing down your UI doesn't increase your security. It makes your system appear to your user as, surprise, surprise, Dumb!

Reducing language facilities is the _wrong_ mechanism for creating security. The Operating System has the Right Facilities.

ie. Give your user the full power of a well designed language with a powerful and appropriate framework living in an OS constrained user space. Let him loose. Let him do _anything_ the language and OS permit.

But then have your system, running as another OS level User ID / process perform (typical key/value) queries through a narrow, well designed, Plain Old Data protocol.
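A rough Python sketch of that split follows. The user's configuration is an ordinary script run in a child process, and the system accepts only flat key/value data back. The `load_settings` helper and the JSON-over-stdout protocol are assumptions made for illustration. Note that this sketch does not confine the child at all; in the arrangement described above, the child would run under a separate OS user ID with whatever restrictions the OS provides.

```python
import json
import subprocess
import sys

# Only plain scalar values may cross the boundary (an assumption for this sketch).
ALLOWED = (str, int, float, bool)


def load_settings(user_script, timeout=5):
    """Run the user's config script in a child process; accept only flat key/value data."""
    proc = subprocess.run(
        [sys.executable, "-c", user_script],
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,
    )
    data = json.loads(proc.stdout)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object of key/value pairs")
    for key, value in data.items():
        if not isinstance(value, ALLOWED):
            raise ValueError(f"non-scalar value for {key!r}")
    return data


# Hypothetical user config: a full language on the user's side of the boundary.
user_script = """
import json
workers = 2 * 4
print(json.dumps({"workers": workers, "name": "batch-" + str(workers)}))
"""

settings = load_settings(user_script)
```

The user side can compute its values however it likes; the system side never evaluates user code, it only parses and checks plain data.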

Too often I have seen this "We must restrict the config language for security reasons", but on close analysis it turns out we must restrict the config language because we are writing the system in a broken language, and deploying the system on a broken OS.

But that's OK because we assume our users are stupid and we are only trying to secure the system from our stupid users. Securing the system from smart people is in the "too hard basket" especially since the OS we are deploying on is fundamentally crackable.

So what I have seen out in the industry is this...

  • A huge proliferation of
    • very badly designed,
    • very poorly documented,
    • proprietary (ie. you have to invest the effort to learn it... but then never be able to reuse that knowledge),

    buggy Domain Specific Languages.

  • A BAD backdoor on this has been XML. Everybody and his brother has an XML config format.

    So sayeth the XML messiahs, everybody knows XML.

    NOOOO! Every different DTD or Schema is a _different_ grammar / language!

    All you are really doing is reusing the lexer / tokenizer / validator!

    Nobody knows the XML language you are using... and because it looks just like more XML, you probably have been even worse than usual about documenting it.

    And because it is taggy, badly formatted, verbose XML, it has worse than usual readability!

Ironically, this project was

Ironically, this project was prompted by 1) an audit revealing the inability of kernel-level folks to write secure (web) code and 2) the coarse nature of features expressible in MashupOS. So, based on empirical evaluation, I disagree ;-)

But.. but... why are kernel level folks writing web code at all?

But.. but... why are kernel level folks writing web code at all?

...sounds like you are agreeing most emphatically. In particular with the statement that most DSLs are a symptom of the brokenness of the underlying language and OS, rather than a feature.

In that case, everything is

In that case, everything is broken. I like DSLs, to me they are just another way of implementing good abstractions. You don't even have to create a new language to create a new DSL, expression-tree meta programming is enough (or similar in Haskell).
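For readers who haven't seen the expression-tree style, here is a small Python sketch of the idea, not any particular library's API. Operator overloading makes ordinary-looking host-language expressions build a tree, and the "DSL" is then whatever interpreters you write over that tree. All class and function names are made up for illustration.

```python
from dataclasses import dataclass


class Expr:
    """Base class: operators build tree nodes instead of computing values."""

    def __add__(self, other):
        return BinOp("+", self, lift(other))

    def __radd__(self, other):
        return BinOp("+", lift(other), self)

    def __mul__(self, other):
        return BinOp("*", self, lift(other))

    def __rmul__(self, other):
        return BinOp("*", lift(other), self)


@dataclass(frozen=True)
class Var(Expr):
    name: str


@dataclass(frozen=True)
class Const(Expr):
    value: float


@dataclass(frozen=True)
class BinOp(Expr):
    op: str
    left: Expr
    right: Expr


def lift(value):
    """Wrap plain numbers so they can sit in the tree next to Expr nodes."""
    return value if isinstance(value, Expr) else Const(value)


def evaluate(expr, env):
    """One interpretation of the tree: compute a number."""
    if isinstance(expr, Const):
        return expr.value
    if isinstance(expr, Var):
        return env[expr.name]
    left, right = evaluate(expr.left, env), evaluate(expr.right, env)
    return left + right if expr.op == "+" else left * right


def show(expr):
    """Another interpretation of the same tree: pretty-print it."""
    if isinstance(expr, Const):
        return str(expr.value)
    if isinstance(expr, Var):
        return expr.name
    return f"({show(expr.left)} {expr.op} {show(expr.right)})"


x, y = Var("x"), Var("y")
tree = 2 * x + y  # builds a tree; nothing is computed yet
```

`tree` can now be evaluated, printed, optimised, or translated to another target, all without writing a parser, because the host language supplies the syntax.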

The need for abstraction is a symptom of general systems underneath, which is not really a problem from my viewpoint.

Nice illustration.

You don't even have to create a new language to create a new DSL, expression-tree meta programming is enough (or similar in Haskell).

Thanks, you illustrate my point for me.

You don't even have to create a new language IF you have a sufficiently good implementation language (such as Haskell).

People who have grasped a language like Lisp/Scheme/Joy realize that syntax is just a dusting of sugar. It's the semantics underneath that matter.

So the choice then becomes whether to _add_ the domain vocabulary nouns and verbs to an existing powerful and expressive language... or start anew and come up with a tiny syntactically sweet language having essentially only the new nouns and verbs (and perhaps what small smattering of a real language you have time to add).

The latter solution is almost always the wrong one, and unfortunately it is almost always the choice implementers of DSL's in the real world have made.

Then I disagree on your

Then I disagree on your definition of DSL. To me, a DSL could be a library that presents some expression tree-like syntax, or a library that has its own semantics separate from the hosting language. Many DSLs actually are not new languages, they are embedded, and are not necessarily called DSLs by their authors. However, the principles are the same.

Playing with a small isolated (unhosted) DSL is also useful, but maybe more for experimental purposes. E.g., superglue is one such language (how much expressiveness can we squeeze out of a simple DSL?).

In this case, kernel code

In this case, kernel code corresponds to something like the application framework and libraries. It can be further silo'd such as with a Caja core.

I agree with a more general philosophy -- POLA -- but process isolation is too coarse. If you're sharing JS objects, OS support makes little sense as traditionally understood beyond initial isolation. Arguably the problem is JavaScript, but it really just serves as a flexible language level: different types of code in a large system have different reqs, and we found a need for verifying the security kernel.

Ah. A misunderstanding perhaps?

To me "kernel" means an operating system "kernel" such as the Linux or Mach kernel.

I think you are talking about something else... (I'm not quite sure what)

If "caja" means http://code.google.com/p/google-caja/ then http://en.wikipedia.org/wiki/Object-capability_model is pretty close to what I'm suggesting.

I'm partially conflating an

I'm partially conflating an application core library with a trusted computing base. In the case of code like Caja, it'd also partially correspond to an OS kernel: a common concern of capability languages is achieving guarantees common to operating systems.

Going back to the example... we are working on a problem similar to making sure the (trusted) guts of Caja are safe. That's hard to do if the developers are just writing it in JavaScript. Recognizing this, there's a push in the Caja project to write the libraries in sublanguages, leaving as little as possible in the more flexible but tricky JavaScript superset.

Tying the conversation back to the main topic.. a common related theme in web security right now is language subsetting and wrapping, playing into the DSL'09 discussions of homomorphisms and isomorphisms. A big drawback is deployment: rewriting or wrapping is expensive, and, generally, some sort seems to be needed. A useful question might be how a language might be made to match this need -- perhaps sort of like PLT Scheme's language levels, but with more control relevant to security concerns.

A good DSL: Sawzall

I would like to draw your attention to Sawzall, a language specifically designed for analyzing huge amounts of logs at Google. If anything, it is a DSL. It was designed with some very specific goals in mind, chief of which was a language-level guarantee that programs should not be able to communicate state across log records. This made it possible to analyze logs in arbitrary order (and on arbitrarily many machines using arbitrarily many threads), and still get the same answer. Of course, major languages were considered, but it seemed impossible to prevent users from communicating state across log records.

I have been using Sawzall for a while, and I can tell you that it's a neat, compact, efficient language (write-only variables are such a cool trick) that is perfect for the purpose it was designed for. In my mind, the strong theoretical properties crucial to its success could only come from it being a mini-language, designed for a well-defined particular purpose.
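For those who have not seen it, the shape of the idea can be sketched in Python. This is not Sawzall syntax, and every name below is made up. The per-record code sees exactly one record and can only emit into an aggregator, which it cannot read back. Python merely follows this by convention here, whereas the point made above is that Sawzall guarantees it at the language level.

```python
import random
from collections import Counter


class SumTable:
    """Write-only aggregator: per-record code can emit into it but not read it."""

    def __init__(self):
        self._totals = Counter()

    def emit(self, key, value):
        self._totals[key] += value

    def merge(self, other):
        # Summation is commutative and associative, so shard order is irrelevant.
        self._totals.update(other._totals)

    def result(self):
        return dict(self._totals)


def per_record(record, emit):
    """Analysis code: handles one record and keeps no state between calls."""
    host, nbytes = record.split()
    emit(host, int(nbytes))


def run(records, shards=3):
    """Spread records over independent shards, then merge the aggregators."""
    tables = [SumTable() for _ in range(shards)]
    for i, record in enumerate(records):
        per_record(record, tables[i % shards].emit)
    total = SumTable()
    for table in tables:
        total.merge(table)
    return total.result()


# Hypothetical log lines: "<host> <bytes>".
logs = ["a.example 120", "b.example 30", "a.example 5", "c.example 7", "b.example 1"]
shuffled = logs[:]
random.shuffle(shuffled)

# Same totals regardless of record order or number of shards.
assert run(logs) == run(shuffled, shards=2)
```

Because `per_record` cannot observe earlier records, the runtime is free to reorder and partition the input, which is exactly the property that is hard to enforce in a general-purpose language.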

It is also a brain-child of Rob Pike, with some interesting idiosyncrasies, but once you get the hang of it, it's quite fun to work with.

For those of you out there who want to cry "Haskell!" (pick your favorite lazy FP): it was considered, but it was too big, too functional (remember, most programmers, God save their souls, are still more comfortable in an imperative setting), and the task of writing a static analysis tool enforcing the required theoretical properties seemed too formidable.

DSL are... programming for non-programmers

"Greenspun's Tenth Rule of Programming: any sufficiently complicated C or Fortran program contains an ad hoc informally-specified bug-ridden slow implementation of half of Common Lisp."

This remark is rather acerbic!

My interest right now is writing a DSL for machinists to use to generate gcode for CNC machines. Realistically, you can't just hand these guys a copy of Common Lisp along with a library of gcode-related functions, and say: "Good luck!" The whole point of a DSL, IMHO, is to allow non-programmers to get into the game --- to make a step up from captive-user-interface software without having to become full-blown computer-programmers --- somewhat like the way that kids who can't swim will stay in the shallow end of the pool and not cross the big red rope into the end of the pool where they might drown.

I haven't started the project yet. I had been planning on using Forth (most likely one of the C-based implementations, such as FICL), but am currently toying with the idea of using Clojure instead.

I am writing a Factor program to generate gcode and LaTeX graphics for building slide-rules. Factor is not a good choice for the DSL though because it is way too big (in the same class as Common Lisp). Also, I need security --- a sandbox for the DSL to operate in --- so that the company owners will allow their employees to program on the company computers without worry that the employees will trash the computer system and/or spend their time writing computer games. To be accepted in the corporate environment it has to be a pretty limited and focused language.

As if G Code isn't enough...

As if G Code isn't enough...

G code is 'orrible!

Decades ago I got involved in helping my uncle try to do visualizations of what his gcode programs were going to do.

On every count of readability, generality, modularity, composability, safety, ... gcode stinks.

He told me this tale of a factory up the road from his...

Gcode (at that time, I hope they have fixed it!) had the silly convention that a length with a "." was in inches and a length without was thousandths of an inch. (I may misremember the exact figures, this was decades ago)

Thus "5." was five inches, which is a reasonable distance, and "5" was a minuscule distance.
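To see how small the gap between the two readings is, here is a Python sketch of the convention as recalled above. The figures are the ones from this anecdote and may not match any real controller, and both function names are made up.

```python
def parse_length_inches(token):
    """Interpret a length token under the convention recalled above."""
    if "." in token:
        return float(token)        # "5." is read as 5 inches
    return float(token) / 1000     # "5" is read as 5 thousandths of an inch


def parse_length_strict(token):
    """A stricter reading: refuse any length without an explicit decimal point."""
    if "." not in token:
        raise ValueError(f"ambiguous length {token!r}: write {token}. or 0.00{token}")
    return float(token)


with_point = parse_length_inches("5.")
without_point = parse_length_inches("5")
```

One missing character changes the distance by a factor of a thousand, and the lenient parser accepts both without complaint, while the strict one at least forces the programmer to say which was meant.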

Some poor sod of a programmer got it wrong and left the "." out.

The machine unfortunately was a numerically controlled lathe working on a 2m long by 1m in diameter piece of steel spinning at a humongous speed.

The tool zoomed into the chuck, sheared it off, the load started to wobble then flew off BANG through the operator shield, BANG up through the corrugated iron roof of the factory way up and BANG down through the roof of the factory AGAIN.

Amazingly no one was injured.

It struck me then and there that that piece of language design was unutterably crap. Just waiting to kill someone.

Conclusions...

  • gcode should be treated like computer machine code. No human should write it.
  • Machining specs should be written in a Good Clean, well designed general purpose language (like ruby / python/...) with a Good domain specific vocabulary of classes and methods.
  • Which should sanity check the hell out of itself both at the individual method invocation level and the end to end machine run level.
  • Provide a Good visualization of the tool path.
  • Generate gcode.
  • Run a gcode interpreter to verify that the gcode tool path matches desired path.
  • Feed the gcode to the hardware.
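A toy Python sketch of that pipeline, heavily simplified: it handles only straight `G1` moves in two axes and ignores feed rates, units, and modal state. The `Envelope` and `Program` classes and the `interpret` function are hypothetical names, not an existing library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Envelope:
    """Hypothetical safe working volume of the machine."""
    x_max: float
    z_max: float


class Program:
    """Domain vocabulary: a machining run described as checked moves."""

    def __init__(self, envelope):
        self.envelope = envelope
        self.path = [(0.0, 0.0)]

    def move_to(self, x, z):
        # Sanity check at the level of the individual method call.
        if not (0 <= x <= self.envelope.x_max and 0 <= z <= self.envelope.z_max):
            raise ValueError(f"move to ({x}, {z}) leaves the working envelope")
        self.path.append((float(x), float(z)))
        return self

    def to_gcode(self):
        # Always emit an explicit decimal point; no human writes this by hand.
        return [f"G1 X{x:.3f} Z{z:.3f}" for x, z in self.path[1:]]


def interpret(gcode, start=(0.0, 0.0)):
    """Tiny gcode interpreter used to re-derive the tool path for verification."""
    path = [start]
    for line in gcode:
        word, x_word, z_word = line.split()
        if word != "G1" or x_word[0] != "X" or z_word[0] != "Z":
            raise ValueError(f"unsupported gcode line: {line!r}")
        path.append((float(x_word[1:]), float(z_word[1:])))
    return path


def paths_match(a, b, tol=1e-3):
    """End-to-end check: generated gcode reproduces the intended path."""
    return len(a) == len(b) and all(
        abs(ax - bx) <= tol and abs(az - bz) <= tol
        for (ax, az), (bx, bz) in zip(a, b)
    )


program = Program(Envelope(x_max=100, z_max=200)).move_to(10, 0).move_to(10, 50.5)
gcode = program.to_gcode()
assert paths_match(interpret(gcode), program.path)
```

Visualisation and the hand-off to hardware are left out; the sketch only shows the checked vocabulary, generation, and the round-trip verification step.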

havoc etc.

add the step in there of running it through a physics simulation of the shop floor? :-)

Expedia vs. Galileo, comparing GUI vs. CLI to DSL vs. libraries

Expedia is a system that allows you to book tickets via the web. Easy to use, success from the 1st attempt. Target audience: everybody and his dog.

Galileo is what travel agents use. In one key stroke they find all flights. It takes them 30 seconds or less. Target audience: travel agents who book stuff all day long.

That query language is a proper DSL in a proper place. End users benefit from it tremendously. It is not Turing complete, it is not as powerful as SQL, but it is very well adapted to users' needs. A proper win!

I've seen a bunch of core banking systems that have a DSL to access some functionality, supplemented by a GUI-like system to make less common choices. Employees love that, as it saves them the time of clicking through 10 levels of drop-downs as banking systems get that complex.

A DSL is not always for programmers or implementors to write macros. Indeed, why invent a DSL when a security-aware library written in a popular or system language can do it better? But why restrict the user to a GUI?