RustBelt: Securing the Foundations of the Rust Programming Language

RustBelt: Securing the Foundations of the Rust Programming Language by Ralf Jung, Jacques-Henri Jourdan, Robbert Krebbers, Derek Dreyer:

Rust is a new systems programming language that promises to overcome the seemingly fundamental tradeoff between high-level safety guarantees and low-level control over resource management. Unfortunately, none of Rust’s safety claims have been formally proven, and there is good reason to question whether they actually hold. Specifically, Rust employs a strong, ownership-based type system, but then extends the expressive power of this core type system through libraries that internally use unsafe features. In this paper, we give the first formal (and machine-checked) safety proof for a language representing a realistic subset of Rust. Our proof is extensible in the sense that, for each new Rust library that uses unsafe features, we can say what verification condition it must satisfy in order for it to be deemed a safe extension to the language. We have carried out this verification for some of the most important libraries that are used throughout the Rust ecosystem.

Rust is definitely pushing the envelope in a new direction, but there's always a little wariness around using libraries that make use of unsafe features, since "safety with performance" is a main reason people want to use Rust. So this is a great step in the right direction!
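To make the worry concrete, here is a minimal sketch of the pattern the paper is about: a function with a safe signature whose body relies on unsafe code. The name `split_halves` is my own illustration, modelled on the standard library's `split_at_mut`; it is not taken from the paper. Roughly speaking, the obligation for such a library is that no safe caller can trigger undefined behaviour through this interface.

```rust
/// Splits a mutable slice into two non-overlapping parts at index `mid`.
/// The signature is entirely safe; only the body uses `unsafe`.
/// (Hypothetical example, modelled on `<[T]>::split_at_mut`.)
pub fn split_halves<T>(slice: &mut [T], mid: usize) -> (&mut [T], &mut [T]) {
    let len = slice.len();
    // This runtime check is what makes the unsafe block below sound.
    assert!(mid <= len, "mid out of bounds");
    let ptr = slice.as_mut_ptr();
    // SAFETY: the ranges [0, mid) and [mid, len) are disjoint and both lie
    // inside the original slice, so the two mutable borrows never alias.
    unsafe {
        (
            std::slice::from_raw_parts_mut(ptr, mid),
            std::slice::from_raw_parts_mut(ptr.add(mid), len - mid),
        )
    }
}
```

The borrow checker alone would reject handing out two mutable references into the same slice, which is why a library like this has to step outside the core type system and then argue separately that it is still safe.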

Automating Ad hoc Data Representation Transformations

Automating Ad hoc Data Representation Transformations by Vlad Ureche, Aggelos Biboudis, Yannis Smaragdakis, and Martin Odersky:

To maximize run-time performance, programmers often specialize their code by hand, replacing library collections and containers by custom objects in which data is restructured for efficient access. However, changing the data representation is a tedious and error-prone process that makes it hard to test, maintain and evolve the source code.

We present an automated and composable mechanism that allows programmers to safely change the data representation in delimited scopes containing anything from expressions to entire class definitions. To achieve this, programmers define a transformation and our mechanism automatically and transparently applies it during compilation, eliminating the need to manually change the source code.

Our technique leverages the type system in order to offer correctness guarantees on the transformation and its interaction with object-oriented language features, such as dynamic dispatch, inheritance and generics.

We have embedded this technique in a Scala compiler plugin and used it in four very different transformations, ranging from improving the data layout and encoding, to retrofitting specialization and value class status, and all the way to collection deforestation. On our benchmarks, the technique obtained speedups between 1.8x and 24.5x.

This is a realization of an idea that has been briefly discussed here on LtU a few times, whereby a program is written using high-level representations, and the user has the option to provide a lowering to a more efficient representation after the fact.

This contrasts with the typical approach of providing efficient primitives, like primitive unboxed values, and leaving it to the programmer to compose them efficiently up front.
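The paper applies such lowerings automatically inside a Scala compiler plugin. The Rust sketch below only shows, by hand, the kind of change involved: a high-level collection of records and a lowered column-wise layout, with conversions in both directions. All names (`Point`, `PointColumns`, and the methods) are hypothetical and not taken from the paper.

```rust
/// High-level representation: a collection of individual records.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Point {
    pub x: f64,
    pub y: f64,
}

/// Lowered representation: one contiguous array per field.
#[derive(Debug, Default, PartialEq)]
pub struct PointColumns {
    pub xs: Vec<f64>,
    pub ys: Vec<f64>,
}

impl PointColumns {
    /// The lowering a programmer would supply: records to columns.
    pub fn from_points(points: &[Point]) -> Self {
        PointColumns {
            xs: points.iter().map(|p| p.x).collect(),
            ys: points.iter().map(|p| p.y).collect(),
        }
    }

    /// The inverse direction, needed wherever the two representations meet.
    pub fn to_points(&self) -> Vec<Point> {
        self.xs
            .iter()
            .zip(self.ys.iter())
            .map(|(&x, &y)| Point { x, y })
            .collect()
    }

    /// Reads only the `xs` column and never touches `ys`.
    pub fn sum_x(&self) -> f64 {
        self.xs.iter().sum()
    }
}
```

Written by hand, every use site of the collection has to be rewritten against the new layout, which is the tedious and error-prone step the paper aims to eliminate.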

Type Checking Modular Multiple Dispatch with Parametric Polymorphism and Multiple Inheritance

Type Checking Modular Multiple Dispatch with Parametric Polymorphism and Multiple Inheritance by Eric Allen, Justin Hilburn, Scott Kilpatrick, Victor Luchangco, Sukyoung Ryu, David Chase, Guy L. Steele Jr.:

In previous work, we presented rules for defining overloaded functions that ensure type safety under symmetric multiple dispatch in an object-oriented language with multiple inheritance, and we showed how to check these rules without requiring the entire type hierarchy to be known, thus supporting modularity and extensibility. In this work, we extend these rules to a language that supports parametric polymorphism on both classes and functions.

In a multiple-inheritance language in which any type may be extended by types in other modules, some overloaded functions that might seem valid are correctly rejected by our rules. We explain how these functions can be permitted in a language that additionally supports an exclusion relation among types, allowing programmers to declare “nominal exclusions” and also implicitly imposing exclusion among different instances of each polymorphic type. We give rules for computing the exclusion relation, deriving many type exclusions from declared and implicit ones.

We also show how to check our rules for ensuring the safety of overloaded functions. In particular, we reduce the problem of handling parametric polymorphism to one of determining subtyping relationships among universal and existential types. Our system has been implemented as part of the open-source Fortress compiler.

Fortress was briefly covered here a couple of times, as were multimethods and multiple dispatch, but this paper really generalizes and nicely summarizes previous work on statically typed modular multimethods, and does a good job explaining the typing rules in an accessible way. The integration with parametric polymorphism I think is key to applying multimethods in other domains which may want modular multimethods, but not multiple inheritance.

The formalization in Coq might also be of interest to some.

Also, another interesting point is Fortress' use of second-class intersection and union types to simplify type checking.

Xavier Leroy will receive the Royal Society's 2016 Milner Award

The Royal Society will award Xavier Leroy the Milner Award 2016

... in recognition of his research on the OCaml functional programming language and on the formal verification of compilers.

Xavier replied:

It is very moving to see how far we have come, from Milner's great ideas of the 1970s to tools as powerful and as widely used as OCaml and Coq.

Reagents: Expressing and Composing Fine-grained Concurrency

Reagents: Expressing and Composing Fine-grained Concurrency, by Aaron Turon:

Efficient communication and synchronization is crucial for fine-grained parallelism. Libraries providing such features, while indispensable, are difficult to write, and often cannot be tailored or composed to meet the needs of specific users. We introduce reagents, a set of combinators for concisely expressing concurrency algorithms. Reagents scale as well as their hand-coded counterparts, while providing the composability existing libraries lack.

This is a pretty neat approach to writing concurrent code, which lies somewhere between manually implementing low-level concurrent algorithms and STM. Concurrent algorithms are expressed and composed semi-naively, and reagents automatically retry in case of thread interference (transient failure of CAS updates), or block waiting for input from another thread (permanent failure, where no input is available).
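For comparison, this is roughly what the hand-written version of the transient-failure case looks like: a CAS loop that recomputes and retries whenever another thread got there first. It is a minimal Rust sketch using the standard `AtomicUsize`, not the Reagents API (which is a Scala library); the helper name `update_with_retry` is hypothetical.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Atomically replaces the value in `cell` with `f(value)`, retrying whenever
/// another thread changes the cell between our read and our CAS.
/// Returns the value that was replaced. (Hypothetical helper.)
pub fn update_with_retry(cell: &AtomicUsize, f: impl Fn(usize) -> usize) -> usize {
    let mut current = cell.load(Ordering::Acquire);
    loop {
        let next = f(current);
        match cell.compare_exchange_weak(current, next, Ordering::AcqRel, Ordering::Acquire) {
            // The CAS succeeded: nobody interfered, so we are done.
            Ok(previous) => return previous,
            // Transient failure: take the freshly observed value and retry.
            Err(observed) => current = observed,
        }
    }
}
```

The retry loop is the boilerplate that gets automated. The blocking case and the composition of several such updates into one atomic step are not shown here, and they are where hand-written code gets much harder.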

The core seems to be k-CAS with synchronous communication between threads to coordinate reactions on shared state. The properties seem rather nice, as Aaron describes:

When used in isolation, reagents are guaranteed to perform only the CASes that the hand-written algorithm would, so they introduce no overhead on shared-memory operations; by recoding an algorithm using reagents, you lose nothing. Yet unlike hand-written algorithms, reagents can be composed using choice, tailored with new blocking behavior, or combined into larger atomic blocks.

The benchmarks in section 6 look promising. This appears to be work towards Aaron's thesis which provides many more details.

Don Syme receives a medal for F#

Don Syme receives the Royal Academy of Engineering's Silver Medal for his work on F#. The citation reads:

F# is known for being a clear and more concise language that interoperates well with other systems, and is used in applications as diverse as analysing the UK energy market to tackling money laundering. It allows programmers to write code with fewer bugs than other languages, so users can get their programme delivered to market both rapidly and accurately. Used by major enterprises in the UK and worldwide, F# is both cross-platform and open source, and includes innovative features such as unit-of-measure inference, asynchronous programming and type providers, which have in turn influenced later editions of C# and other industry languages.


Pure Subtype Systems

Pure Subtype Systems, by DeLesley S. Hutchins:

This paper introduces a new approach to type theory called pure subtype systems. Pure subtype systems differ from traditional approaches to type theory (such as pure type systems) because the theory is based on subtyping, rather than typing. Proper types and typing are completely absent from the theory; the subtype relation is defined directly over objects. The traditional typing relation is shown to be a special case of subtyping, so the loss of types comes without any loss of generality.

Pure subtype systems provide a uniform framework which seamlessly integrates subtyping with dependent and singleton types. The framework was designed as a theoretical foundation for several problems of practical interest, including mixin modules, virtual classes, and feature-oriented programming.

The cost of using pure subtype systems is the complexity of the meta-theory. We formulate the subtype relation as an abstract reduction system, and show that the theory is sound if the underlying reductions commute. We are able to show that the reductions commute locally, but have thus far been unable to show that they commute globally. Although the proof is incomplete, it is “close enough” to rule out obvious counter-examples. We present it as an open problem in type theory.

A thought-provoking take on type theory using subtyping as the foundation for all relations. He collapses the type hierarchy and unifies types and terms via the subtyping relation. This also has the side-effect of combining type checking and partial evaluation. Functions can accept "types" and can also return "types".

Of course, it's not all sunshine and roses. As the abstract explains, the metatheory is quite complicated and soundness is still an open question. Not too surprising considering type checking Type:Type is undecidable.

Hutchins' thesis is also available for a more thorough treatment. This work is all in pursuit of Hutchins' goal of feature-oriented programming.

Types for Flexible Objects

Types for Flexible Objects, by Pottayil Harisanker Menon, Zachary Palmer, Alexander Rozenshteyn, Scott Smith:

Scripting languages are popular in part due to their extremely flexible objects. These languages support numerous object features, including dynamic extension, mixins, traits, and first-class messages. While some work has succeeded in typing these features individually, the solutions have limitations in some cases and no project has combined the results.

In this paper we define TinyBang, a small typed language containing only functions, labeled data, a data combinator, and pattern matching. We show how it can directly express all of the aforementioned flexible object features and still have sound typing. We use a subtype constraint type inference system with several novel extensions to ensure full type inference; our algorithm refines parametric polymorphism for both flexibility and efficiency. We also use TinyBang to solve an open problem in OO literature: objects can be extended after being messaged without loss of width or depth subtyping and without dedicated metatheory. A core subset of TinyBang is proven sound and a preliminary implementation has been constructed.

An interesting paper I stumbled across quite by accident. It claims an ambitious set of features: generalizing previous work on first-class cases while supporting subtyping, mutation, and polymorphism, all with full type inference, in an effort to match the flexibility of dynamically typed languages.

It does so by introducing a host of new concepts that are almost-but-not-quite generalizations of existing concepts, like "onions" which are kind of a type-indexed extensible record, and "scapes" which are sort of a generalization of pattern matching cases.

Instead of approaching objects via a record calculus, they approach them via its dual, variant matching. Matching functions then have degenerate dependent types, which I first saw in the paper Type Inference for First-Class Messages with Match-Functions. As an interesting aside, Scott Smith was a coauthor on this last paper too, but it isn't referenced in the "flexible objects" paper, despite the fact that "scapes" are "match-functions".

Overall, quite a dense and ambitious paper, but the resulting TinyBang language looks very promising and quite expressive. Future work includes making the system more modular, as it currently requires whole-program compilation, and adding first-class labels, which in past work have led to interesting results as well. Most work exploiting row polymorphism is particularly interesting because it supports efficient compilation to index-passing code for both records and variants. It's not clear if onions and scapes are also amenable to this sort of translation.

Edit: a previous paper was published in 2012, A Practical, Typed Variant Object Model -- Or, How to Stand On Your Head and Enjoy the View. BigBang is their language that provides syntactic sugar on top of TinyBang.

Edit 2: commas fixed, thanks!

Dependent Types for JavaScript

Dependent Types for JavaScript, by Ravi Chugh, David Herman, Ranjit Jhala:

We present Dependent JavaScript (DJS), a statically-typed dialect of the imperative, object-oriented, dynamic language. DJS supports the particularly challenging features such as run-time type-tests, higher-order functions, extensible objects, prototype inheritance, and arrays through a combination of nested refinement types, strong updates to the heap, and heap unrolling to precisely track prototype hierarchies. With our implementation of DJS, we demonstrate that the type system is expressive enough to reason about a variety of tricky idioms found in small examples drawn from several sources, including the popular book JavaScript: The Good Parts and the SunSpider benchmark suite.

Some good progress on inferring types for a very dynamic language. Explicit type declarations are placed in comments that start with "/*:".

/*: x∶Top → {ν ∣ ite Num(x) Num(ν) Bool(ν)} */
function negate(x) {
    if (typeof x == "number") { return 0 - x; }
    else { return !x; }
}

Tool Demo: Scala-Virtualized

Tool Demo: Scala-Virtualized

This paper describes Scala-Virtualized, which extends the Scala language and compiler with a small number of features that enable combining the benefits of shallow and deep embeddings of DSLs. We demonstrate our approach by showing how to embed three different domain-specific languages in Scala. Moreover, we summarize how others have been using our extended compiler in their own research and teaching. Supporting artifacts of our tool include web-based tutorials, nightly builds, and an Eclipse update site hosting an up-to-date version of the Scala IDE for Eclipse based on the Virtualized Scala compiler and standard library.

Scala has always had a quite good EDSL story thanks to implicits, dot- and paren-inference, and methods-as-operators. Lately there are proposals to provide it with both macros-in-the-camlp4-sense and support for multi-stage programming. This paper goes into some depth on the foundations of the latter subject.
