How widespread are in-house DSLs?

A student asked me this question, and apart from saying that quite a large percentage of large organizations use in-house DSLs, I couldn't give any details, nor am I aware of any research.

So if anyone has come across a survey or research report that gives useful (hopefully current) information about DSL use, I'd be glad to hear about it.

In Maple and at Maplesoft

When I worked at Maplesoft, we created our own DSL for writing automated tests. It was a meta-language on top of Maple, implemented in parts in perl and mostly in Maple.

Inside Maple itself, Maplets are an embedded DSL for a (mostly) declarative language for GUIs spawned from Maple. There is another DSL for describing the context-sensitive menus, thus making them fully user-extensible. There are actually 3 DSLs for reflective use (i.e., they are supposed to describe Maple programs), with different designs. I designed the one used by the functions ToInert and FromInert; they are in heavy use for meta-programming at Maplesoft (including its CodeGeneration package as well as the forthcoming Compiler package).

I think there is actually one more, but I'll have to wait until the next version of Maple actually ships before I talk about it, NDA and all that jazz.

ATM (teller machine) software

I'm guilty of writing it. ;)

Our C version contained, AFAIK, two. One was used to describe receipt printing and the second for communication facility initialisation.

And our Tcl-based version contains many of them. There were DSELs for:
- customer dialog description (an FSM language with simple and easily verifiable semantics),
- the communication subsystem,
- easing the interface to the platform API (along the lines of SWIG, but in fewer lines of code).
I can't remember the others, but they were there.
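
As a rough illustration of why an FSM-style dialog description is easy to verify, here is a minimal Python sketch. It is not the Tcl DSEL described above: the states, the events, and the `check` and `run` helpers are all invented for the example. The point is only that when the dialog is plain data, simple checks (every transition targets a defined state, every state is reachable) become a few lines of code.

```python
# Hypothetical dialog description: state -> {event: next_state}.
# All state and event names are invented for illustration.
DIALOG = {
    "idle":       {"card_inserted": "ask_pin"},
    "ask_pin":    {"pin_ok": "menu", "pin_bad": "ask_pin", "cancel": "eject_card"},
    "menu":       {"withdraw": "dispense", "cancel": "eject_card"},
    "dispense":   {"done": "eject_card"},
    "eject_card": {"card_taken": "idle"},
}


def check(fsm, start):
    """Return a list of problems: undefined targets and unreachable states."""
    problems = []
    for state, transitions in fsm.items():
        for event, target in transitions.items():
            if target not in fsm:
                problems.append(f"{state} --{event}--> {target}: undefined state")
    # Depth-first search from the start state to collect reachable states.
    reachable, stack = set(), [start]
    while stack:
        state = stack.pop()
        if state in reachable or state not in fsm:
            continue
        reachable.add(state)
        stack.extend(fsm[state].values())
    for state in fsm:
        if state not in reachable:
            problems.append(f"{state}: unreachable from {start}")
    return problems


def run(fsm, start, events):
    """Feed a sequence of events through the machine and return the final state."""
    state = start
    for event in events:
        state = fsm[state][event]  # KeyError: the dialog does not allow this event here
    return state
```

Calling `check(DIALOG, "idle")` returns an empty list for the description above. Adding a state that nothing leads to, or a transition to a misspelled state name, makes it report the problem before the dialog ever runs.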

Many (3 of 9) of our developers just can't resist the opportunity to create a language to suit the expression of their ideas. It was a moderately error-proof, simple and fast process, and they used it.

Might be less than you think -- here's some from my experience

I've been in a lot of industries from financial services to aerospace and I've rarely seen either an EDSL or DSL. I think it really depends on the type of developers a company has. In the US here, it seems that developers who have a computer science degree forget almost all of it after graduating. Other developers without any formal CS training aren't aware of how to develop them. In over 20 years, I've seen only about 5.

Here's the roundup of the ones I've seen in my experience.

1. A DSL written in C for doing screen-scraping financial services against IBM mainframe 3270 systems. What was really amazing to me was this was written by an industrial engineer who had only had a 2-week C training course a few months before she wrote it (this all occurred before I joined the company). I later took it and rewrote it using a top-down parser, added a debugger to it, and added a foreign function interface to it.

2. Same financial services company, one of my teammates wrote a cron-like DSL for scheduling batch application runs on a cluster of Sun workstations (this was before OS solutions were widely available).

3. Same financial services company, this one was a DSL that allowed customer service reps to write templatized letters with markup (pre-HTML days, though), with mail-merge facilities that pulled the customer name, address and other specifics of the issue from the mainframe. Written in lex, yacc, and C++. The interpreter did the mail-merge with info from the mainframe and then generated troff output, which was then sent to the printing queue. Users were really happy with it, because the quality of the letters went way up, in content (letters could directly address the customer's issue), spell checking, speed, etc. This was a pretty big and complicated system, but in the end, it was just a mail-merge markup language. I remember when the web came out a few years later we were going, D'oh!. I advised on the yacc usage, but didn't directly participate. I was secretly hoping they'd use a custom TeX macro language to get better looking output, but they wanted to stay with troff.

4. At Motorola on the Iridium satellite project, a teammate wrote a DSL in SML for taking a persistent object schema and then generating Objectivity C++ database code and C++ interface code for it (along the lines described in the Pragmatic Programmer book).

5. Another one I wrote at my current company, where engineers write tons of Perl code to interface to LabVIEW to control and test various satellite hardware components. I got volunteered to do this for a series of battery tests. Instead of writing the scripts in Perl, I wrote a DSL that closely reflected the syntax and semantics of the test specification document, parsed that and interpreted the AST. The tricky thing about this was that one test script could actually take weeks to complete, and interruptions to the testing frequently took place (replacing / fixing faulty hardware, checking out a too-hot battery, etc). These tests then needed to resume where they left off, so I designed the DSL from the beginning to serialize its execution state (stack and global state), and then be able to resume it where it left off. It also had its own debugger. The DSL was successful enough that it was modified slightly for a different battery test project and the end users now write and maintain their own DSL scripts. The commercial battery test equipment that the company bought to perform some of the tests couldn't resume tests like mine could.
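
The resume-after-interruption idea can be sketched in a few lines of Python. This is only an illustration under loud assumptions, not the interpreter described above: the script format, the step names (`set`, `incr`, `charge`, `discharge`), and every function here are hypothetical, the "hardware" is just a log, and there is no call stack, only a program counter and global variables. What it does show is the design choice of keeping all execution state in one plain data structure so that it can be written out and read back.

```python
import json

# Hypothetical test script: a flat list of steps. Step names and fields are invented.
SCRIPT = [
    {"op": "set", "name": "cycles", "value": 0},
    {"op": "charge", "minutes": 90},
    {"op": "discharge", "minutes": 60},
    {"op": "incr", "name": "cycles"},
]


def new_state():
    """Execution state: program counter, global variables, and a log of actions."""
    return {"pc": 0, "globals": {}, "log": []}


def step(script, state):
    """Execute exactly one step and advance the program counter."""
    instr = script[state["pc"]]
    if instr["op"] == "set":
        state["globals"][instr["name"]] = instr["value"]
    elif instr["op"] == "incr":
        state["globals"][instr["name"]] += 1
    else:
        # Stand-in for driving real hardware: just record the action.
        state["log"].append(f'{instr["op"]} {instr["minutes"]} min')
    state["pc"] += 1


def run(script, state, max_steps=None):
    """Run to the end of the script, or stop after max_steps steps (a simulated interruption)."""
    executed = 0
    while state["pc"] < len(script) and (max_steps is None or executed < max_steps):
        step(script, state)
        executed += 1
    return state


def save(state):
    """Serialize the whole execution state to a JSON string (a checkpoint)."""
    return json.dumps(state)


def load(text):
    """Restore an execution state from a checkpoint."""
    return json.loads(text)
```

For example, `checkpoint = save(run(SCRIPT, new_state(), max_steps=2))` stops after the charge step, and `run(SCRIPT, load(checkpoint))` later picks up at the discharge step instead of starting over. A real system would write the checkpoint to disk after every step.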

How many lives have they saved!

We had two DSLs used to control our Magnetic Resonance Imagers. One described programs run by the real-time hardware controller. Another described the steps used in image reconstruction. Both languages were extensible by adding C-language modules, but not all users had access to the APIs for such modules and to C compilers for the systems.

All other manufacturers had similar systems. AFAIK ours was the only independent DSL. Other manufacturers were using embedded DSLs (in C++), although I doubt they see it that way. In an ideal world these languages would be embedded in Haskell. There are a huge number of characteristics of MR images, and composition rules for those types. The way they are programmed, using inheritance, makes it very difficult to verify those rules. It would be much more transparent to encode them with Haskell classes and algebraic data types.

So, we see that DSLs are busy saving lives in a hospital near you, but they are almost certainly not recognized as doing so.

abuse

I once found that a project to generate some sort of reports had a language designed specifically for 'end users.' Except the end user was never going to have access to the area which was going to contain source code for this language. The language was too complicated to be used even by developers (inconsistent and too constrained). Either the language was wrapping Java's EJBs or the EJBs were wrapping the language...it was a bit of a mess and frankly I think those guys (consultants) were milking the company for lots of money...while they built something which would impress the bosses...without actually being useful. Impressive actually :)

Strange

while they built something which would impress the bosses...without actually being useful

My experience is that to sell the idea of using an (end-user) DSL to management, it has to satisfy one of the following:

  1. Either it has a graphical syntax based on UML,
  2. or it comes with compelling reasons, with multiple buzzwords, why it's superior to UML; in this case it must look much more generic than UML, and be based on either a pseudo-natural language or an RDF-like super-universal description language.

Not that fulfilling these criteria is really useful to actual users, it's just needed to sell the idea. From what you've written, the project did not satisfy my criteria... Probably, they are domain-specific :-)

An atypical example

I guess my company may be a bit atypical, but we use several in-house DSLs regularly, in the course of everyday development. But we also like to invent new DSLs on the fly with the slightest excuse.

The regularly-used, long-lived DSLs are:
- A grammar language, SGML-based. We might have used Yacc or something similar, of course. Used to generate some C code and grammar documentation in HTML. There's an ambiguity checker and a tool to derive things like first sets.
- A language for describing coroutines, translated into unmaintainable C functions simulating coroutining behaviour.
- A language for describing state machines, translated into C functions with long switch statements and gotos.
- An extensible templating language with conditionals and loops, translated to OmniMark. It can also be run through an interpreter.
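
To make the "state machine description translated into C" idea concrete, here is a small Python sketch of such a translator. It is an assumption-laden illustration, not the tool described above: the input is a Python dict rather than a dedicated language, the `generate_c` function and the example machine are invented, the machine is assumed to have at least one transition, and the output uses a plain switch inside a loop with no gotos.

```python
# Hypothetical machine description: state -> {event: next_state}.
MACHINE = {
    "off": {"press": "on"},
    "on":  {"press": "off", "timeout": "off"},
}


def generate_c(name, start, machine):
    """Emit a C function `int <name>(const int *events, int n)` that runs the machine."""
    states = sorted(machine)
    events = sorted({e for transitions in machine.values() for e in transitions})
    lines = [
        "enum { " + ", ".join(f"ST_{s.upper()}" for s in states) + " };",
        "enum { " + ", ".join(f"EV_{e.upper()}" for e in events) + " };",
        "",
        f"int {name}(const int *events, int n)",
        "{",
        f"    int state = ST_{start.upper()};",
        "    for (int i = 0; i < n; i++) {",
        "        switch (state) {",
    ]
    for s in states:
        lines.append(f"        case ST_{s.upper()}:")
        for e, target in sorted(machine[s].items()):
            # The inner `break` leaves the switch; events with no transition are ignored.
            lines.append(
                f"            if (events[i] == EV_{e.upper()}) "
                f"{{ state = ST_{target.upper()}; break; }}"
            )
        lines.append("            break;")
    lines += ["        }", "    }", "    return state;", "}"]
    return "\n".join(lines) + "\n"
```

`print(generate_c("toggle", "off", MACHINE))` writes out the two enums followed by a `toggle` function whose switch has one case per state. Nobody would want to maintain the generated function by hand, which is the appeal of keeping the description as the source of truth.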

We also have a ton of "non-executable" DSLs that specify various formats. I don't know if the non-executable languages are considered DSLs. Some of them are translated to C, and most have SGML syntax. Some examples:
- A language for specifying precedence and associativity of built-in operators, SGML-based. Used to generate C code that populates the operator tables.
- An SGML-based language for specifying the format of bytecode files. Used to generate bytecode disassembler (in C) and compiler and interpreter header files (also in C).
- An SGML-based language for listing strings (i.e., error message templates). Used to generate C header files.

As for the other category containing the small ad-hoc languages developed to be used in a particular project, I believe they are mostly templating languages. I have personally developed and used at least three, in various service jobs. The development rarely takes more than a day, it's much more fun than the rest of the service job, and it helps to clarify the structure of the project.

There was one particularly interesting occasion, where the project went something like this:

1. I specified the language syntax (mostly by examples) and sent it to the customer.
2. The customer then wrote up their requirements using the language and sent them back.
3. In the meantime, I wrote up the requirements language compiler.
4. The compiler processed the customer's requirements and generated executable OmniMark code. I had to fix some small syntactic errors in the spec.
5. The customer came over. We went through the errors in the specification and repeated the edit-compile-run cycle several times.
6. After the spec was complete, I hand-edited the generated code to optimize it and add some bells and whistles.
7. The End. The whole process took about a week (or two man-weeks). I was told that the customer's in-house developer team had been working on the same project for four months, and they hadn't completed it yet.

A note on SGML: I know it's considered a dead language only dinosaurs would use. But for most of the uses I've described above, it's a nearly perfect match. Writing an ad-hoc parser from scratch would take much more effort than writing a DTD. And XML simply has too much noise to be a good DSL. What I'd consider a perfect match would be a SAX-compatible tool that could just take a BNF grammar and produce a parse tree of the input.