Everything old is new again: Quoted Domain Specific Languages, by Shayan Najd, Sam Lindley, Josef Svenningsson, Philip Wadler:
We describe a new approach to domain specific languages (DSLs), called Quoted DSLs (QDSLs), that resurrects two old ideas: quotation,
from McCarthy’s Lisp of 1960, and the subformula property, from Gentzen’s natural deduction of 1935. Quoted terms allow the DSL to share the syntax and type system of the host language. Normalising quoted terms ensures the subformula property, which guarantees that one can use higher-order types in the source while guaranteeing first-order types in the target, and enables using types to guide fusion. We test our ideas by re-implementing Feldspar, which was originally implemented as an Embedded DSL (EDSL), as a QDSL; and we compare the QDSL and EDSL variants.
Neat paper first posted here by Blaisorblade that formalizes the properties needed for using quotation to implement DSLs. Most embedded DSLs use custom operators and lifted conditionals to make embedded programs seem more natural, but quoting enables the use of native operators and conditions, and might lead to more natural expressions of lightweight (possibly heterogenous) staged computations.
Cyrus Omar, Darya Kurilova, Ligia Nistor, Benjamin Chung, Alex Potanin, and Jonathan Aldrich, "Safely Composable Type-Specific Languages", ECOOP14.
Programming languages often include specialized syntax for common datatypes (e.g. lists) and some also build in support for specific specialized datatypes (e.g. regular expressions), but user-defined types must use general-purpose syntax. Frustration with this causes developers to use strings, rather than structured data, with alarming frequency, leading to correctness, performance, security, and usability issues. Allowing library providers to modularly extend a language with new syntax could help address these issues. Unfortunately, prior mechanisms either limit expressiveness or are not safely composable: individually unambiguous extensions can still cause ambiguities when used together. We introduce type-specific languages (TSLs): logic associated with a type that determines how the bodies of generic literals, able to contain arbitrary syntax, are parsed and elaborated, hygienically. The TSL for a type is invoked only when a literal appears where a term of that type is expected, guaranteeing non-interference. We give evidence supporting the applicability of this approach and formally specify it with a bidirectionally typed elaboration semantics for the Wyvern programming language.
A Next Generation Smart Contract and Decentralized Application Platform, Vitalik Buterin.
When Satoshi Nakamoto first set the Bitcoin blockchain into motion in January 2009, he was simultaneously introducing two radical and untested concepts. The first is the "bitcoin", a decentralized peer-to-peer online currency that maintains a value without any backing, intrinsic value or central issuer. So far, the "bitcoin" as a currency unit has taken up the bulk of the public attention, both in terms of the political aspects of a currency without a central bank and its extreme upward and downward volatility in price. However, there is also another, equally important, part to Satoshi's grand experiment: the concept of a proof of work-based blockchain to allow for public agreement on the order of transactions. Bitcoin as an application can be described as a first-to-file system: if one entity has 50 BTC, and simultaneously sends the same 50 BTC to A and to B, only the transaction that gets confirmed first will process. There is no intrinsic way of determining from two transactions which came earlier, and for decades this stymied the development of decentralized digital currency. Satoshi's blockchain was the first credible decentralized solution. And now, attention is rapidly starting to shift toward this second part of Bitcoin's technology, and how the blockchain concept can be used for more than just money.
Commonly cited applications include using on-blockchain digital assets to represent custom currencies and financial instruments ("colored coins"), the ownership of an underlying physical device ("smart property"), non-fungible assets such as domain names ("Namecoin") as well as more advanced applications such as decentralized exchange, financial derivatives, peer-to-peer gambling and on-blockchain identity and reputation systems. Another important area of inquiry is "smart contracts" - systems which automatically move digital assets according to arbitrary pre-specified rules. For example, one might have a treasury contract of the form "A can withdraw up to X currency units per day, B can withdraw up to Y per day, A and B together can withdraw anything, and A can shut off B's ability to withdraw". The logical extension of this is decentralized autonomous organizations (DAOs) - long-term smart contracts that contain the assets and encode the bylaws of an entire organization. What Ethereum intends to provide is a blockchain with a built-in fully fledged Turing-complete programming language that can be used to create "contracts" that can be used to encode arbitrary state transition functions, allowing users to create any of the systems described above, as well as many others that we have not yet imagined, simply by writing up the logic in a few lines of code.
Includes code samples.
It is increasingly important for applications to protect user privacy. Unfortunately, it is often non-trivial for programmers to enforce privacy policies. We have developed Jeeves to make it easier for programmers to enforce information flow policies: policies that describe who can see what information flows through a program. Jeeves allows the programmer to write policy-agnostic programs, separately implementing policies on sensitive values from other functionality. Just like Wooster's clever valet Jeeves in Wodehouse's stories, the Jeeves runtime does the hard work, automatically enforcing the policies to show the appropriate output to each viewer.
From what I gather, Jeeves takes Aspect Oriented approach to privacy. This is of course not a new idea. I presume that many of the classic problems with AOP would apply to Jeeves. Likewise, using information flow analysis for handling privacy policies is not an new idea. Combining the two, however, seems like a smart move. Putting the enforcement at the run-time level makes this sound more practical than other ideas I have heard before. Still, I personally think that specifying privacy policies at the end-user level and clarifying the concept of privacy at the normative, legal and conceptual levels are more pressing concerns. Indeed, come to think of it: I don't really recall a privacy breach that was caused by a simple information flow bug. Privacy expectations are broken on purpose by many companies and major data breaches occur when big databases are shared (recall the Netflix Prize thing). Given this, I assume the major use-case is for Apps, maybe even as a technology that someone like Apple could use to enforce the compliance of third-party Apps to their privacy policies.
I haven't looked too closely, so comments from more informed people are welcome.
Jeeves is implemented as an embedded DSL in Scala and Python.
I was surprised to see that DYNAMO hasn't been mentioned here in the past. DYNAMO (DYNAmic MOdels) was the simulation language used to code the simulations that led to the famous 1972 book The Limits to Growth from The Club of Rome. The language was designed in the late 1950s. It is clear that the language was used in several other places and evolved through several iterations, though I am not sure how extensively it was used. When Stafford Beer was creating Cybersyn for Salvador Allende he used DYNAMO to save time suggesting it was somewhat of a standard tool (this is described in Andrew Pickering's important book The Cybernetic Brain).
The language itself is essentially what you'd expect. It is declarative, programs consisting of a set of equations. The equations are zero and first-order difference equations of two kinds: level equations (accumulations) and rate equations (flows). Computation is integration over time. Levels can depend on rates and vice versa with the language automatically handling dependencies and circularities. Code looks like code looked those days: fixed columns, all caps, eight characters identifiers.
Here are a few links:
- Section 3.7 of this history of discrete event simulation languages is a succinct description of the history of the language and its main features.
- Finally, a nice piece on Jay Forrester who prompted the creation of SIMPLE and DYNAMO, its offspring.
Validating LR(1) parsers
An LR(1) parser is a finite-state automaton, equipped with a stack, which uses a combination of its current state and one lookahead symbol in order to determine which action to perform next. We present a validator which, when applied to a context-free grammar G and an automaton A, checks that A and G agree. Validating the parser provides the correctness guarantees required by verified compilers and other high-assurance software that involves parsing. The validation process is independent of which technique was used to construct A. The validator is implemented and proved correct using the Coq proof assistant. As an application, we build a formally-verified parser for the C99 language.
I've always been somewhat frustrated, while studying verified compiler technology, that the scope of the effort has generally been limited to ensuring that the AST and the generated code mean the same thing, as important as that obviously is. Not enough attention has been paid, IMHO, to other compiler phases. Parsing: The Solved Problem That Isn't does a good job illuminating some of the conceptual issues that arise in attempting to take parsers seriously as functions that we would like to compose etc. while maintaining some set of properties that hold of the individuals. Perhaps this work can shed some light on possible solutions to some of those issues, in addition to being worthwhile in its own right. Note the pleasing presence of an actual implementation that's been used on the parser of a real-world language, C99.
Tool Demo: Scala-Virtualized
This paper describes Scala-Virtualized, which extends the Scala language and compiler with a small number of features that enable combining the beneï¬ts of shallow and deep embeddings of DSLs. We demonstrate our approach by showing how to embed three different domain-speciï¬c languages in Scala. Moreover, we summarize how others have been using our extended compiler in their own research and teaching. Supporting artifacts of our tool include web-based tutorials, nightly builds, and an Eclipse update site hosting an up-to-date version of the Scala IDE for Eclipse based on the Virtualized Scala compiler and standard library.
Scala has always had a quite good EDSL story thanks to implicits, dot- and paren-inference, and methods-as-operators. Lately there are proposals to provide it with both macros-in-the-camlp4-sense and support for multi-stage programming. This paper goes into some depth on the foundations of the latter subject.
Nick Benton and Neel Krishnaswami, ICFP'11, A Semantic Model for Graphical User Interfaces:
We give a denotational model for graphical user interface (GUI) programming using the Cartesian closed category of ultrametric spaces. [..] We capture the arbitrariness of user input [..] [by a nondeterminism] â€œpowerspaceâ€ monad.
Algebras for the powerspace monad yield a model of intuitionistic linear logic, which we exploit in the definition of a mixed linear/non-linear domain-specific language for writing GUI programs. The non-linear part of the language is used for writing reactive stream-processing functions whilst the linear sublanguage naturally captures the generativity and usage constraints on the various linear objects in GUIs, such as the elements of a DOM or scene graph.
We have implemented this DSL as an extension to OCaml, and give examples demonstrating that programs in this style can be short and readable.
This is an application of their (more squiggly) LICS'11 submission, Ultrametric Semantics of Reactive Programs. In both these cases, I find appealing the fact the semantic model led to a type system and a language that was tricky to find.
Phil Windley whose blog posts on his startup Kynetx I sometimes mention here, since the company's product is built around a DSL, posted a nice item on reasons for designing a DSL. While partly about why people should go ahead and learn KRL, the post discusses some of the business advantages for building a product around a DSL, and some of the reasons for using and building DSLs that we here take for granted but not everyone else is cognizant of.