It is increasingly important for applications to protect user privacy. Unfortunately, it is often non-trivial for programmers to enforce privacy policies. We have developed Jeeves to make it easier for programmers to enforce information flow policies: policies that describe who can see what information flows through a program. Jeeves allows the programmer to write policy-agnostic programs, separately implementing policies on sensitive values from other functionality. Just like Wooster's clever valet Jeeves in Wodehouse's stories, the Jeeves runtime does the hard work, automatically enforcing the policies to show the appropriate output to each viewer.
From what I gather, Jeeves takes an aspect-oriented approach to privacy. This is of course not a new idea, and I presume that many of the classic problems with AOP would apply to Jeeves. Likewise, using information flow analysis to handle privacy policies is not a new idea. Combining the two, however, seems like a smart move. Putting the enforcement at the run-time level makes this sound more practical than other ideas I have heard before. Still, I personally think that specifying privacy policies at the end-user level and clarifying the concept of privacy at the normative, legal and conceptual levels are more pressing concerns. Indeed, come to think of it, I don't really recall a privacy breach that was caused by a simple information flow bug. Privacy expectations are broken on purpose by many companies, and major data breaches occur when big databases are shared (recall the Netflix Prize thing). Given this, I assume the major use case is apps, maybe even as a technology that someone like Apple could use to enforce the compliance of third-party apps with their privacy policies.
I haven't looked too closely, so comments from more informed people are welcome.
Jeeves is implemented as an embedded DSL in Scala and Python.
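To make "policy-agnostic" a bit more concrete, here is a minimal Python sketch of the general idea. It is not the Jeeves API: the `Sensitive` class, the `show` function, and the viewer names are all hypothetical, and a real runtime would have to track far more than this. The point is only that the policy is stated once, the program logic contains no privacy checks, and the choice of what to reveal is deferred to the output point.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Sensitive:
    """Hypothetical sensitive value: two facets guarded by a viewer policy."""
    secret: Any                      # what authorized viewers see
    public: Any                      # what everyone else sees
    policy: Callable[[str], bool]    # True if the viewer may see `secret`

    def map(self, f):
        # Run policy-agnostic code on both facets; the policy travels along.
        return Sensitive(f(self.secret), f(self.public), self.policy)


def show(value, viewer):
    """Output point: resolve a sensitive value for one particular viewer."""
    if isinstance(value, Sensitive):
        return value.secret if value.policy(viewer) else value.public
    return value


# The policy is written once, separately from the functionality below.
location = Sensitive("Cambridge, MA", "somewhere",
                     lambda viewer: viewer in {"alice", "bob"})


def greeting(place):
    # Ordinary program logic with no privacy checks in it.
    return "Hello from " + place


message = location.map(greeting)
```

Calling `show(message, "alice")` and `show(message, "eve")` on the same `message` yields different strings, without `greeting` ever mentioning viewers.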
I was surprised to see that DYNAMO hasn't been mentioned here in the past. DYNAMO (DYNAmic MOdels) was the simulation language used to code the simulations that led to the famous 1972 book The Limits to Growth from The Club of Rome. The language was designed in the late 1950s. It is clear that the language was used in several other places and evolved through several iterations, though I am not sure how extensively it was used. When Stafford Beer was creating Cybersyn for Salvador Allende he used DYNAMO to save time, which suggests it was something of a standard tool (this is described in Andrew Pickering's important book The Cybernetic Brain).
The language itself is essentially what you'd expect. It is declarative, with programs consisting of a set of equations. The equations are zero- and first-order difference equations of two kinds: level equations (accumulations) and rate equations (flows). Computation is integration over time. Levels can depend on rates and vice versa, with the language automatically handling dependencies and circularities. Code looks like code looked in those days: fixed columns, all caps, eight-character identifiers.
Here are a few links:
- Section 3.7 of this history of discrete event simulation languages is a succinct description of the history of the language and its main features.
- Finally, a nice piece on Jay Forrester who prompted the creation of SIMPLE and DYNAMO, its offspring.
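The level/rate style described above is easy to imitate in a few lines of Python. This is only a sketch of the idea, not DYNAMO syntax or semantics: the one-level population model, the function name `simulate`, and its parameters are assumptions made up for illustration. Each step computes the rates from the current level, then accumulates them into the level over a time step `dt`.

```python
def simulate(pop0, birth_frac, death_frac, dt, steps):
    """Integrate one level (population) driven by two rates (births, deaths).

    Hypothetical model: both rates are proportional to the current level.
    """
    pop = pop0
    history = [pop]
    for _ in range(steps):
        # Rate equations: flows computed from the current level.
        births = birth_frac * pop
        deaths = death_frac * pop
        # Level equation: the accumulation, a first-order difference equation.
        pop = pop + dt * (births - deaths)
        history.append(pop)
    return history
```

The circularity mentioned above is visible even here: the level depends on the rates, and the rates depend on the previous value of the level.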
Validating LR(1) parsers
An LR(1) parser is a finite-state automaton, equipped with a stack, which uses a combination of its current state and one lookahead symbol in order to determine which action to perform next. We present a validator which, when applied to a context-free grammar G and an automaton A, checks that A and G agree. Validating the parser provides the correctness guarantees required by verified compilers and other high-assurance software that involves parsing. The validation process is independent of which technique was used to construct A. The validator is implemented and proved correct using the Coq proof assistant. As an application, we build a formally-verified parser for the C99 language.
I've always been somewhat frustrated, while studying verified compiler technology, that the scope of the effort has generally been limited to ensuring that the AST and the generated code mean the same thing, as important as that obviously is. Not enough attention has been paid, IMHO, to other compiler phases. Parsing: The Solved Problem That Isn't does a good job illuminating some of the conceptual issues that arise in attempting to take parsers seriously as functions that we would like to compose etc. while maintaining some set of properties that hold of the individuals. Perhaps this work can shed some light on possible solutions to some of those issues, in addition to being worthwhile in its own right. Note the pleasing presence of an actual implementation that's been used on the parser of a real-world language, C99.
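For readers who have not looked at the inside of a table-driven parser in a while, here is a small Python sketch of the kind of automaton the abstract describes: a stack of states, plus a decision based on the current state and one lookahead symbol. The toy grammar (`S -> ( S ) | x`) and the hand-built `ACTION` and `GOTO` tables are my own assumptions for illustration. This shows only the automaton side; it does not attempt the paper's validation of the automaton against the grammar.

```python
# Toy grammar (an assumption for illustration):
#   1: S -> ( S )
#   2: S -> x
PRODUCTIONS = {1: ("S", 3), 2: ("S", 1)}  # number -> (left side, right-side length)

# Hand-built tables: (state, lookahead) -> action.
ACTION = {
    (0, "("): ("shift", 2), (0, "x"): ("shift", 3),
    (1, "$"): ("accept", None),
    (2, "("): ("shift", 2), (2, "x"): ("shift", 3),
    (3, ")"): ("reduce", 2), (3, "$"): ("reduce", 2),
    (4, ")"): ("shift", 5),
    (5, ")"): ("reduce", 1), (5, "$"): ("reduce", 1),
}
GOTO = {(0, "S"): 1, (2, "S"): 4}


def parse(tokens):
    """Return True if `tokens` is a sentence of the toy grammar."""
    stream = list(tokens) + ["$"]   # "$" marks end of input
    stack = [0]                     # stack of automaton states
    pos = 0
    while True:
        action = ACTION.get((stack[-1], stream[pos]))
        if action is None:
            return False            # no action for this state/lookahead: reject
        kind, arg = action
        if kind == "shift":
            stack.append(arg)
            pos += 1
        elif kind == "reduce":
            lhs, rhs_len = PRODUCTIONS[arg]
            del stack[len(stack) - rhs_len:]        # pop one state per symbol
            stack.append(GOTO[(stack[-1], lhs)])
        else:
            # Accept only if the end marker we saw is the real end of input.
            return pos == len(stream) - 1
```

Nothing in the driver knows how the tables were built, which is the property the validator exploits: it can check the tables against the grammar regardless of which construction produced them.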
Tool Demo: Scala-Virtualized
This paper describes Scala-Virtualized, which extends the Scala language and compiler with a small number of features that enable combining the benefits of shallow and deep embeddings of DSLs. We demonstrate our approach by showing how to embed three different domain-specific languages in Scala. Moreover, we summarize how others have been using our extended compiler in their own research and teaching. Supporting artifacts of our tool include web-based tutorials, nightly builds, and an Eclipse update site hosting an up-to-date version of the Scala IDE for Eclipse based on the Virtualized Scala compiler and standard library.
Scala has always had a quite good EDSL story thanks to implicits, dot- and paren-inference, and methods-as-operators. Lately there are proposals to provide it with both macros-in-the-camlp4-sense and support for multi-stage programming. This paper goes into some depth on the foundations of the latter subject.
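In case the shallow/deep distinction is unfamiliar, here is a rough Python sketch of the two styles for a tiny arithmetic DSL. It says nothing about how Scala-Virtualized itself works; all the names (`lit`, `add`, `Lit`, `Add`, `evaluate`, `simplify`) are made up for illustration. In the shallow version DSL terms are just host-language values, so they are cheap to write but opaque. In the deep version they are data, so they can be analyzed and rewritten before being run.

```python
from dataclasses import dataclass


# Shallow embedding: a DSL term *is* its meaning (here, a Python int).
def lit(n):
    return n


def add(a, b):
    return a + b


# Deep embedding: a DSL term is a syntax tree that is interpreted later.
@dataclass(frozen=True)
class Lit:
    value: int


@dataclass(frozen=True)
class Add:
    left: object
    right: object


def evaluate(expr):
    """Interpret a deep term."""
    if isinstance(expr, Lit):
        return expr.value
    return evaluate(expr.left) + evaluate(expr.right)


def simplify(expr):
    """Rewrite e + 0 and 0 + e to e; only possible because terms are data."""
    if isinstance(expr, Add):
        left, right = simplify(expr.left), simplify(expr.right)
        if left == Lit(0):
            return right
        if right == Lit(0):
            return left
        return Add(left, right)
    return expr


shallow = add(lit(2), add(lit(0), lit(3)))
deep = Add(Lit(2), Add(Lit(0), Lit(3)))
```

Both `shallow` and `evaluate(deep)` come out to the same number, but only `deep` can be handed to something like `simplify` first.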
Nick Benton and Neel Krishnaswami, ICFP'11, A Semantic Model for Graphical User Interfaces:
We give a denotational model for graphical user interface (GUI) programming using the Cartesian closed category of ultrametric spaces. [..] We capture the arbitrariness of user input [..] [by a nondeterminism] “powerspace” monad.
Algebras for the powerspace monad yield a model of intuitionistic linear logic, which we exploit in the definition of a mixed linear/non-linear domain-specific language for writing GUI programs. The non-linear part of the language is used for writing reactive stream-processing functions whilst the linear sublanguage naturally captures the generativity and usage constraints on the various linear objects in GUIs, such as the elements of a DOM or scene graph.
We have implemented this DSL as an extension to OCaml, and give examples demonstrating that programs in this style can be short and readable.
This is an application of their (more squiggly) LICS'11 submission, Ultrametric Semantics of Reactive Programs. In both cases, I find it appealing that the semantic model led to a type system and a language that were tricky to find.
Phil Windley, whose blog posts on his startup Kynetx I sometimes mention here since the company's product is built around a DSL, posted a nice item on reasons for designing a DSL. While partly about why people should go ahead and learn KRL, the post discusses some of the business advantages of building a product around a DSL, and some of the reasons for using and building DSLs that we here take for granted but that not everyone else is aware of.
Macros that Work Together - Compile-Time Bindings, Partial Expansion, and Definition Contexts, Matthew Flatt, Ryan Culpepper, David Darais, and Robert Bruce Findler. Under consideration for publication in J. Functional Programming.
Racket (formerly PLT Scheme) is a large language that is built mostly within itself. Unlike the usual approach taken by non-Lisp languages, the self-hosting of Racket is not a matter of bootstrapping one implementation through a previous implementation, but instead a matter of building a tower of languages and libraries via macros. The upper layers of the tower include a class system, a component system, pedagogic variants of Scheme, a statically typed dialect of Scheme, and more. The demands of this language-construction effort require a macro system that is substantially more expressive than previous macro systems. In particular, while conventional Scheme macro systems handle stand-alone syntactic forms adequately, they provide weak support for macros that share information or macros that use existing syntactic forms in new contexts.
This paper describes and models novel features of the Racket macro system, including support for general compile-time bindings, sub-form expansion and analysis, and environment management. The presentation assumes a basic familiarity with Lisp-style macros, and it takes for granted the need for macros that respect lexical scope. The model, however, strips away the pattern and template system that is normally associated with Scheme macros, isolating a core that is simpler, that can support pattern and template forms themselves as macros, and that generalizes naturally to Racket’s other
A good description of Racket's rocket science tools for growing languages.
Oleg Kiselyov has just posted another amazing work: Semi-implicit batched remote code execution as staging.
Batching several remote-procedure or remote-object operations into one request decreases the number of network client/server round-trips, reduces the communication overhead and indeed significantly improves performance of distributed applications. The benefits are offset by the cost of restructuring the code to incite large batches; and by the increase in the difficulty of reasoning about the code, predicting its performance let alone establishing correctness. The overall research goal is to reduce the downside.
We describe a semi-implicit batching mechanism that makes the points of remote server communication explicit yet conceals the proxies, saving the trouble of writing them. The changes to the client code are minimal: mainly, adding calls to force. The type-checker will not let the programmer forget to call force. The remote batch server is simple and generic, with no need to optimize for specific clients.
Our mechanism batches both independent and data-dependent remote calls. Our mechanism is compositional, letting the programmer build nested applications and conditional (and, potentially, iterative) statements using composition, application and naming. Writing a remote program is exactly like writing a typed local program, which is type-checked locally, and can even be executed locally (for debugging).
The key insights are treating remote execution as a form of staging (meta-programming), generalizing mere remote function calls to remote applicative and conditional expressions, and introducing an embedded domain-specific language, Chourai, for such expressions. A batch of dependent remote function calls can then be regarded as a complex applicative expression in the A-normal form. Another key insight is that emulating call-by-value via call-by-need surprisingly makes sense.
Here's an example piece of Chourai code, for deleting albums whose rating is below 5 among the first n albums of an album database (called "large") hosted by the server.
next_album and similar functions constitute the "RPC" interface to the server.
let delete_low_rating n =
  let rec loop album i =
    let t = guard (app2 lt (app get_rating album) (int 5))
              (fun () -> app delete_album album) in
    if i >= n then force t else
    loop (app next_album album) (succ i)
  in loop (app get_album (string "large")) 0;;
delete_low_rating 4 requires just one round-trip to the server!
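To see why a single round-trip is enough, here is a rough Python simulation of the batching idea. It is not Chourai and does not reproduce its typing guarantees (Python has no type-checker to insist on `force`). The `Batch`, `Ref` and `FakeServer` classes and the three server operations are hypothetical stand-ins, and the remote conditional is folded into one `delete_if_below` operation instead of a separate `guard`. The client only records calls, later calls may refer to the results of earlier ones, and nothing is sent until `force`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ref:
    index: int  # position of an earlier call in the pending batch


class FakeServer:
    """In-process stand-in for the remote side; counts round-trips."""

    def __init__(self, ratings):
        self.albums = list(ratings)      # fixed album order (dict keys)
        self.ratings = dict(ratings)
        self.deleted = set()
        self.round_trips = 0

    def run(self, calls):
        self.round_trips += 1
        results = []
        for name, args in calls:
            # Replace references with the results of earlier calls.
            values = [results[a.index] if isinstance(a, Ref) else a for a in args]
            results.append(getattr(self, name)(*values))
        return results

    def first_album(self):
        return 0

    def next_album(self, i):
        return i + 1

    def delete_if_below(self, i, threshold):
        name = self.albums[i]
        if self.ratings[name] < threshold:
            self.deleted.add(name)
            return True
        return False


class Batch:
    def __init__(self, server):
        self.server = server
        self.calls = []

    def app(self, name, *args):
        # Record the call; nothing is sent yet.
        self.calls.append((name, args))
        return Ref(len(self.calls) - 1)

    def force(self, ref):
        # The only point of communication: ship the whole batch at once.
        results = self.server.run(self.calls)
        self.calls = []
        return results[ref.index]


def delete_low_rating(batch, n):
    """Delete low-rated albums among the first n (assumes 1 <= n <= album count)."""
    album = batch.app("first_album")
    last = None
    for _ in range(n):
        last = batch.app("delete_if_below", album, 5)
        album = batch.app("next_album", album)
    return batch.force(last)
```

However large `n` is, the loop only grows the list of recorded calls, so the server's `round_trips` counter goes up by exactly one per `force`.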
Pure and Declarative Syntax Definition: Paradise Lost and Regained by Lennart C. L. Kats, Eelco Visser, Guido Wachsmuth from Delft
Syntax definitions are pervasive in modern software systems, and serve as the basis for language processing tools like parsers and compilers. Mainstream parser generators pose restrictions on syntax definitions that follow from their implementation algorithm. They hamper evolution, maintainability, and compositionality of syntax definitions. The pureness and declarativity of syntax definitions is lost. We analyze how these problems arise for different aspects of syntax definitions, discuss their consequences for language engineers, and show how the pure and declarative nature of syntax definitions can be regained.
I haven't compared this version with the Onward 2010 version, but they look essentially the same. It seems timely to post this paper, considering the other recent story Yacc is dead. There is not a whole lot to argue against in this paper, since we all "know" the other approaches aren't as elegant and only resort to them for specific reasons such as efficiency. Yet, this is the first paper I know of that tries to state the argument to software engineers.
For example, the Dragon Book, in every single edition, effectively brushes these topics aside. In particular, the Dragon Book does not even mention scannerless parsing as a technique, and instead only explains the "advantages" of using a scanner. Unfortunately, the authors of this paper don't consider other design proposals, either, such as Van Wyk's context-aware scanners from GPCE 2007. It is examples like these that made me wish the paper were a bit more robust in its analysis; the examples seem focused on the authors' previous work.
If you are not familiar with the authors' previous work in this area, the paper covers it in the references. It includes Martin Bravenboer's work on modular Eclipse IDE support for AspectJ.