Type Theory

Safe Dynamic Memory Management in Ada and SPARK

Safe Dynamic Memory Management in Ada and SPARK by Maroua Maalej, Tucker Taft, Yannick Moy:

Handling memory in a correct and efficient way is a step toward safer, less complex, and higher performing software-intensive systems. However, languages used for critical software development such as Ada, which supports formal verification with its SPARK subset, face challenges regarding any use of pointers due to potential pointer aliasing. In this work, we introduce an extension to the Ada language, and to its SPARK subset, to provide pointer types (“access types” in Ada) that provide provably safe, automatic storage management without any asynchronous garbage collection, and without explicit deallocation by the user. Because the mechanism for these safe pointers relies on strict control of aliasing, it can be used in the SPARK subset for formal verification, including both information flow analysis and proof of safety and correctness properties. In this paper, we present this proposal (which has been submitted for inclusion in the next version of Ada), and explain how we are able to incorporate these pointers into formal analyses

For the systems programmers among you, you might be interested in some new developments in Ada where they propose to add ownership types to Ada's pointer/access types, to improve the flexibility of the programs that can be written and whose safety can be automatically verified. The automated satisfiability of these safety properties is a key goal of the SPARK Ada subset.

The Gentle Art of Levitation

The Gentle Art of Levitation

2010 by James Chapman, Pierre-Evariste Dagand, Conor McBride, Peter Morrisy

We present a closed dependent type theory whose inductive types are given not by a scheme for generative declarations, but by encoding in a universe. Each inductive datatype arises by interpreting its description—a first-class value in a datatype of descriptions. Moreover, the latter itself has a description. Datatype-generic programming thus becomes ordinary programming. We show some of the resulting generic operations and deploy them in particular, useful ways on the datatype of datatype descriptions itself. Surprisingly this apparently self-supporting setup is achievable without paradox or infinite regress.
It's datatype descriptions all the way down.

Comprehending Ringads

Comprehending Ringads

2016 by Jeremy Gibbons

Ringad comprehensions represent a convenient notation for expressing database queries. The ringad structure alone does not provide a good explanation or an efficient implementation of relational joins; but by allowing heterogeneous comprehensions, involving both bag and indexed table ringads, we show how to accommodate these too.
Indexed/parametric/graded monads are the key (read the paper to understand the pun).

Sequent Calculus as a Compiler Intermediate Language

Sequent Calculus as a Compiler Intermediate Language
2016 by Paul Downen, Luke Maurer, Zena M. Ariola, Simon Peyton Jones

The typed λ-calculus arises canonically as the term language for a logic called natural deduction, using the Curry-Howard isomorphism: the pervasive connection between logic and programming languages asserting that propositions are types and proofs are programs. Indeed, for many people, the λ-calculus is the living embodiment of Curry-Howard.

But natural deduction is not the only logic! Conspicuously, natural deduction has a twin, born in the very same paper, called the sequent calculus. Thanks to the Curry-Howard isomorphism, terms of the sequent calculus can also be seen as a programming language with an emphasis on control flow.

The Syntax and Semantics of Quantitative Type Theory

The Syntax and Semantics of Quantitative Type Theory by Robert Atkey:

Type Theory offers a tantalising promise: that we can program and reason within a single unified system. However, this promise slips away when we try to produce efficient programs. Type Theory offers little control over the intensional aspect of programs: how are computational resources used, and when can they be reused. Tracking resource usage via types has a long history, starting with Girard's Linear Logic and culminating with recent work in contextual effects, coeffects, and quantitative type theories. However, there is conflict with full dependent Type Theory when accounting for the difference between usages in types and terms. Recently, McBride has proposed a system that resolves this conflict by treating usage in types as a zero usage, so that it doesn't affect the usage in terms. This leads to a simple expressive system, which we have named Quantitative Type Theory (QTT).

McBride presented a syntax and typing rules for the system, as well as an erasure property that exploits the difference between “not used” and “used”, but does not do anything with the finer usage information. In this paper, we present present a semantic interpretation of a variant of McBride's system, where we fully exploit the usage information. We interpret terms simultaneously as having extensional (compile-time) content and intensional (runtime) content. In our example models, extensional content is set-theoretic functions, representing the compile-time or type-level content of a type-theoretic construction. Intensional content is given by realisers for the extensional content. We use Abramsky et al.'s Linear Combinatory Algebras as realisers, yield a large range of potential models from Geometry of Interaction, graph models, and syntactic models. Read constructively, our models provide a resource sensitive compilation method for QTT.

To rigorously define the structure required for models of QTT, we introduce the concept of a Quantitative Category with Families, a generalisation of the standard Category with Families class of models of Type Theory, and show that this class of models soundly interprets Quantitative Type Theory.

Resource-aware programming is a hot topic these days, with Rust exploiting affine and ownership types to scope and track resource usage, and with Ethereum requiring programs to spend "gas" to execute. Combining linear and dependent types has proven difficult though, so making it easier to track and reason about resource usage in dependent type theories would then be a huge benefit to making verification more practical in domains where resources are limited.

RustBelt: Securing the Foundations of the Rust Programming Language

RustBelt: Securing the Foundations of the Rust Programming Language by Ralf Jung, Jacques-Henri Jourdan, Robbert Krebbers, Derek Dreyer:

Rust is a new systems programming language that promises to overcome the seemingly fundamental tradeoff between high-level safety guarantees and low-level control over resource management. Unfortunately, none of Rust’s safety claims have been formally proven, and there is good reason to question whether they actually hold. Specifically, Rust employs a strong, ownership-based type system, but then extends the expressive power of this core type system through libraries that internally use unsafe features. In this paper, we give the first formal (and machine-checked) safety proof for a language representing a realistic subset of Rust. Our proof is extensible in the sense that, for each new Rust library that uses unsafe features, we can say what verification condition it must satisfy in order for it to be deemed a safe extension to the language. We have carried out this verification for some of the most important libraries that are used throughout the Rust ecosystem.

Rust is definitely pushing the envelope in a new direction, but there's always a little wariness around using libraries that make use of unsafe features, since "safety with performance" is a main reason people want to use Rust. So this is a great step in the right direction!

Type Systems as Macros

Type Systems as Macros, by Stephen Chang, Alex Knauth, Ben Greenman:

We present TURNSTILE, a metalanguage for creating typed embedded languages. To implement the type system, programmers write type checking rules resembling traditional judgment syntax. To implement the semantics, they incorporate elaborations into these rules. TURNSTILE critically depends on the idea of linguistic reuse. It exploits a macro system in a novel way to simultaneously type check and rewrite a surface program into a target language. Reusing a macro system also yields modular implementations whose rules may be mixed and matched to create other languages. Combined with typical compiler and runtime reuse, TURNSTILE produces performant typed embedded languages with little effort.

This looks pretty awesome considering it's not limited to simple typed languages, but extends all the way to System F and F-omega! Even better, they can reuse previous type systems to define new ones, thereby reducing the effort to implement more expressive type systems. All code and further details available here, and here's a blog post where Ben Greenman further discusses the related "type tailoring", and of course, these are both directly related to Active Libraries.

Taken to its extreme, why not have an assembler with a powerful macro system of this sort as your host language, and every high-level language would be built on this. I'm not sure if this approach would extend that far, but it's an interesting idea. You'd need a cpp-like highly portable macro tool, and porting to a new platform consists of writing architecture-specific macros for some core language, like System F.

This work may also conceptually dovetail with another thread discussing fexprs and compilation.

Contextual isomorphisms

Contextual Isomorphisms
Paul Blain Levy
2017

What is the right notion of "isomorphism" between types, in a simple type theory? The traditional answer is: a pair of terms that are inverse, up to a specified congruence. We firstly argue that, in the presence of effects, this answer is too liberal and needs to be restricted, using Führmann’s notion of thunkability in the case of value types (as in call-by-value), or using Munch-Maccagnoni’s notion of linearity in the case of computation types (as in call-by-name). Yet that leaves us with different notions of isomorphism for different kinds of type.

This situation is resolved by means of a new notion of “contextual” isomorphism (or morphism), analogous at the level of types to contextual equivalence of terms. A contextual morphism is a way of replacing one type with the other wherever it may occur in a judgement, in a way that is preserved by the action of any term with holes. For types of pure λ-calculus, we show that a contextual morphism corresponds to a traditional isomorphism. For value types, a contextual morphism corresponds to a thunkable isomorphism, and for computation types, to a linear isomorphism.

This paper is based on a very simple idea that everyone familiar with type-systems can enjoy. It then applies it in a technical setting in which it brings a useful contribution. I suspect that many readers will find that second part too technical, but they may still enjoy the idea itself. To facilitate this, I will rephrase the abstract above in a way that I hope makes it accessible to a larger audience.

The problem that the paper solves is: how do we know what it means for two types to be equivalent? For many languages they are reasonable definitions of equivalence (such that: there exists a bijection between these two types in the language), but for more advanced languages these definitions break down. For example, in presence of hidden mutable state, one can build a pair of functions from the one-element type unit to the two-element type bool and back that are the identity when composed together -- the usual definition of bijection -- while these two types should probably not be considered "the same". Those two functions share some hidden state, so they "cheat". Other, more complex notions of type equivalence have been given in the literature, but how do we know whether they are the "right" notions, or whether they may disappoint us in the same way?

To define what it means for two program fragments to be equivalent, we have a "gold standard", which is contextual equivalence: two program fragments are contextually equivalent if we can replace one for the other in any complete program without changing its behavior. This is simple to state, it is usually clear how to instantiate this definition for a new system, and it gives you a satisfying notion of equivalent. It may not be the most convenient one to work with, so people define others, more specific notions of equivalence (typically beta-eta-equivalence or logical relations); it is fine if they are more sophisticated, and their definiton harder to justify or understand, because they can always be compared to this simple definition to gain confidence.

The simple idea in the paper above is to use this exact same trick to define what it means for two types to be equivalent. Naively, one could say that two types are equivalent if, in any well-typed program, one can replace some occurrences of the first type by occurrences of the second type, all other things being unchanged. This does not quite work, as changing the types that appear in a program without changing its terms would create ill-typed terms. So instead, the paper proposes that two types are equivalent when we are told how to transform any program using the first type into a program using the second type, in a way that is bijective (invertible) and compositional -- see the paper for details.

Then, the author can validate this definition by showing that, when instantiated to languages (simple or complex) where existing notions of equivalence have been proposed, this new notion of equivalence corresponds to the previous notions.

(Readers may find that even the warmup part of the paper, namely the sections 1 to 4, pages 1 to 6, are rather dense, with a compactly exposed general idea and arguably a lack of concrete examples that would help understanding. Surely this terseness is in large part a consequence of strict page limits -- conference articles are the tweets of computer science research. A nice side-effect (no pun intended) is that you can observe a carefully chosen formal language at work, designed to expose the setting and perform relevant proofs in minimal space: category theory, and in particular the concept of naturality, is the killer space-saving measure here.)

Polymorphism, subtyping and type inference in MLsub

I am very enthusiastic about the following paper: it brings new ideas and solves a problem that I did not expect to be solvable, namely usable type inference when both polymorphism and subtyping are implicit. (By "usable" here I mean that the inferred types are both compact and principal, while previous work generally had only one of those properties.)

Polymorphism, Subtyping, and Type Inference in MLsub
Stephen Dolan and Alan Mycroft
2017

We present a type system combining subtyping and ML-style parametric polymorphism. Unlike previous work, our system supports type inference and has compact principal types. We demonstrate this system in the minimal language MLsub, which types a strict superset of core ML programs.

This is made possible by keeping a strict separation between the types used to describe inputs and those used to describe outputs, and extending the classical unification algorithm to handle subtyping constraints between these input and output types. Principal types are kept compact by type simplification, which exploits deep connections between subtyping and the algebra of regular languages. An implementation is available online.

The paper is full of interesting ideas. For example, one idea is that adding type variables to the base grammar of types -- instead of defining them by their substitution -- forces us to look at our type systems in ways that are more open to extension with new features. I would recommend looking at this paper even if you are interested in ML and type inference, but not subtyping, or in polymorphism and subtyping, but not type inference, or in subtyping and type inference, but not functional languages.

This paper is also a teaser for the first's author PhD thesis, Algebraic Subtyping. There is also an implementation available.

(If you are looking for interesting work on inference of polymorphism and subtyping in object-oriented languages, I would recommend Getting F-Bounded Polymorphism into Shape by Ben Greenman, Fabian Muehlboeck and Ross Tate, 2014.)

Proving Programs Correct Using Plain Old Java Types

Proving Programs Correct Using Plain Old Java Types, by Radha Jagadeesan, Alan Jeffrey, Corin Pitcher, James Riely:

Tools for constructing proofs of correctness of programs have a long history of development in the research community, but have often faced difficulty in being widely deployed in software development tools. In this paper, we demonstrate that the off-the-shelf Java type system is already powerful enough to encode non-trivial proofs of correctness using propositional Hoare preconditions and postconditions.

We illustrate the power of this method by adapting Fähndrich and Leino’s work on monotone typestates and Myers and Qi’s closely related work on object initialization. Our approach is expressive enough to address phased initialization protocols and the creation of cyclic data structures, thus allowing for the elimination of null and the special status of constructors. To our knowledge, our system is the first that is able to statically validate standard one-pass traversal algorithms for cyclic graphs, such as those that underlie object deserialization. Our proof of correctness is mechanized using the Java type system, without any extensions to the Java language.

Not a new paper, but it provides a lightweight verification technique for some program properties that you can use right now, without waiting for integrated theorem provers or SMT solvers. Properties that require only monotone typestates can be verified, ie. those that operations can only move the typestate "forwards".

In order to achieve this, they require programmers to follow a few simple rules to avoid Java's pervasive nulls. These are roughly: don't assign null explicitly, be sure to initialize all fields when constructing objects.

XML feed