Type Theory

The Simple Essence of Algebraic Subtyping: Principal Type Inference with Subtyping Made Easy

The Simple Essence of Algebraic Subtyping: Principal Type Inference with Subtyping Made Easy, Lionel Parreaux, ICFP 2020.

MLsub extends traditional Hindley-Milner type inference with subtyping while preserving compact principal types, an exciting new development. However, its specification in terms of biunification is difficult to understand, relying on the new concepts of bisubstitution and polar types, and making use of advanced notions from abstract algebra. In this paper, we show that these are in fact not essential to understanding the mechanisms at play in MLsub. We propose an alternative algorithm called Simple-sub, which can be implemented efficiently in under 500 lines of code (including parsing, simplification, and pretty-printing), looks more familiar, and is easier to understand.

There's also an introductory blog post and an online demo.

Stephen Dolan's Algebraic Subtyping (discussion) unexpectedly provided a solution to the problem of combining type inference and subtyping, but used somewhat heavy and unusual machinery. Now Lionel Parreaux shows that the system can be implemented in a very straightforward and pleasing way. Here's to hoping that it makes it into real languages!
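
To give a flavour of why it feels so much more familiar than biunification: the core of the algorithm keeps lower and upper bounds on each type variable and propagates subtyping constraints structurally, a bit like ordinary unification with a direction. Here is a minimal sketch of that idea in Haskell (my own toy rendering, not the paper's Scala code, which also handles recursive types, let-polymorphism, and caching of already-seen constraints):

    -- My own sketch of the Simple-sub idea, not code from the paper.
    import           Control.Monad       (forM_)
    import           Control.Monad.State (State, gets, modify)
    import qualified Data.Map.Strict     as Map

    type VarId = Int

    data Ty = TyVar VarId | Fun Ty Ty | IntTy
      deriving (Eq, Show)

    data Bounds = Bounds { lowerBounds :: [Ty], upperBounds :: [Ty] }

    type Infer = State (Map.Map VarId Bounds)

    boundsOf :: VarId -> Infer Bounds
    boundsOf v = gets (Map.findWithDefault (Bounds [] []) v)

    -- Enforce lhs <: rhs by decomposing structure or recording a bound
    -- on a type variable and propagating it against the existing ones.
    constrain :: Ty -> Ty -> Infer ()
    constrain IntTy IntTy = pure ()
    constrain (Fun a1 r1) (Fun a2 r2) = do
      constrain a2 a1                        -- parameters are contravariant
      constrain r1 r2                        -- results are covariant
    constrain (TyVar v) rhs = do
      Bounds lbs ubs <- boundsOf v
      modify (Map.insert v (Bounds lbs (rhs : ubs)))
      forM_ lbs (\lb -> constrain lb rhs)    -- new upper bound vs old lower bounds
    constrain lhs (TyVar v) = do
      Bounds lbs ubs <- boundsOf v
      modify (Map.insert v (Bounds (lhs : lbs) ubs))
      forM_ ubs (\ub -> constrain lhs ub)    -- new lower bound vs old upper bounds
    constrain lhs rhs =
      error ("cannot constrain " ++ show lhs ++ " <: " ++ show rhs)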

The Little Typer

A new introductory book about dependent types, involving some familiar names:

The Little Typer

by Daniel P. Friedman and David Thrane Christiansen.

Foreword by Robert Harper.

Afterword by Conor McBride.

An introduction to dependent types, demonstrating the most beautiful aspects, one step at a time.

A program's type describes its behavior. Dependent types are a first-class part of a language, and are much more powerful than other kinds of types; using just one language for types and programs allows program descriptions to be as powerful as the programs they describe. The Little Typer explains dependent types, beginning with a very small language that looks very much like Scheme and extending it to cover both programming with dependent types and using dependent types for mathematical reasoning. Readers should be familiar with the basics of a Lisp-like programming language, as presented in the first four chapters of The Little Schemer.

The first five chapters of The Little Typer provide the needed tools to understand dependent types; the remaining chapters use these tools to build a bridge between mathematics and programming. Readers will learn that tools they know from programming—pairs, lists, functions, and recursion—can also capture patterns of reasoning. The Little Typer does not attempt to teach either practical programming skills or a fully rigorous approach to types. Instead, it demonstrates the most beautiful aspects as simply as possible, one step at a time.
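
The book works in its own small language, Pie. Just as a taste of "a program's type describes its behavior" for readers coming from a more mainstream setting, here is the classic length-indexed vector example, sketched in Haskell with GADTs and type families rather than full dependent types (my own illustration, not taken from the book):

    {-# LANGUAGE DataKinds, GADTs, KindSignatures, TypeFamilies #-}
    -- My own illustration, not the book's Pie language.

    data Nat = Zero | Succ Nat

    -- A vector whose length is part of its type.
    data Vec (n :: Nat) a where
      Nil  :: Vec 'Zero a
      Cons :: a -> Vec n a -> Vec ('Succ n) a

    -- Type-level addition of lengths.
    type family Add (m :: Nat) (n :: Nat) :: Nat where
      Add 'Zero     n = n
      Add ('Succ m) n = 'Succ (Add m n)

    -- The type records that appending adds the lengths; forgetting the
    -- recursive call or dropping elements would not type check.
    append :: Vec m a -> Vec n a -> Vec (Add m n) a
    append Nil         ys = ys
    append (Cons x xs) ys = Cons x (append xs ys)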

Safe Dynamic Memory Management in Ada and SPARK

Safe Dynamic Memory Management in Ada and SPARK by Maroua Maalej, Tucker Taft, Yannick Moy:

Handling memory in a correct and efficient way is a step toward safer, less complex, and higher performing software-intensive systems. However, languages used for critical software development such as Ada, which supports formal verification with its SPARK subset, face challenges regarding any use of pointers due to potential pointer aliasing. In this work, we introduce an extension to the Ada language, and to its SPARK subset, to provide pointer types (“access types” in Ada) that provide provably safe, automatic storage management without any asynchronous garbage collection, and without explicit deallocation by the user. Because the mechanism for these safe pointers relies on strict control of aliasing, it can be used in the SPARK subset for formal verification, including both information flow analysis and proof of safety and correctness properties. In this paper, we present this proposal (which has been submitted for inclusion in the next version of Ada), and explain how we are able to incorporate these pointers into formal analyses.

For the systems programmers among you: you might be interested in some new developments in Ada, where they propose to add ownership types to Ada's pointer ("access") types, to improve the flexibility of the programs that can be written while keeping their safety automatically verifiable. Automatically verifying such safety properties is a key goal of the SPARK subset of Ada.

The Gentle Art of Levitation

The Gentle Art of Levitation

2010 by James Chapman, Pierre-Evariste Dagand, Conor McBride, Peter Morris

We present a closed dependent type theory whose inductive types are given not by a scheme for generative declarations, but by encoding in a universe. Each inductive datatype arises by interpreting its description—a first-class value in a datatype of descriptions. Moreover, the latter itself has a description. Datatype-generic programming thus becomes ordinary programming. We show some of the resulting generic operations and deploy them in particular, useful ways on the datatype of datatype descriptions itself. Surprisingly this apparently self-supporting setup is achievable without paradox or infinite regress.
It's datatype descriptions all the way down.
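
The full construction needs a dependently typed host, but the first half of the idea, datatype descriptions as first-class data together with an interpretation, can be sketched in a non-dependent language. Here is a tiny fragment in Haskell (my own sketch, far smaller than the paper's universe, and unable to express the punchline that the datatype of descriptions describes itself):

    {-# LANGUAGE DataKinds, GADTs, KindSignatures #-}
    -- My own sketch, not the paper's universe.

    -- Codes for (very simple) datatypes: descriptions as first-class data.
    data Desc = One | Rec | Sum Desc Desc | Prod Desc Desc

    -- Interpreting a code as a pattern functor, with r at recursive positions.
    data Interp (d :: Desc) r where
      U    :: Interp 'One r
      In   :: r -> Interp 'Rec r
      L    :: Interp a r -> Interp ('Sum a b) r
      R    :: Interp b r -> Interp ('Sum a b) r
      (:*) :: Interp a r -> Interp b r -> Interp ('Prod a b) r

    -- Tying the knot: the datatype that a description denotes.
    newtype Mu (d :: Desc) = Roll (Interp d (Mu d))

    -- Example: natural numbers are "either nothing, or a recursive position".
    type NatD = 'Sum 'One 'Rec

    zero :: Mu NatD
    zero = Roll (L U)

    suc :: Mu NatD -> Mu NatD
    suc n = Roll (R (In n))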

Comprehending Ringads

Comprehending Ringads

2016 by Jeremy Gibbons

Ringad comprehensions represent a convenient notation for expressing database queries. The ringad structure alone does not provide a good explanation or an efficient implementation of relational joins; but by allowing heterogeneous comprehensions, involving both bag and indexed table ringads, we show how to accommodate these too.
Indexed/parametric/graded monads are the key (read the paper to understand the pun).
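
To give a concrete (and deliberately un-categorical) flavour of the join problem in plain Haskell, with lists standing in for bags and Data.Map standing in for indexed tables (my own example, not the paper's ringad machinery): the naive comprehension rescans the second relation for every element of the first, while comprehending over an indexed table gives the usual efficient join.

    -- My own example, not code from the paper.
    import qualified Data.Map.Strict as Map

    -- Naive join: a bag comprehension that rescans ys for every row of xs.
    joinNaive :: Eq k => [(k, a)] -> [(k, b)] -> [(k, (a, b))]
    joinNaive xs ys =
      [ (k, (a, b)) | (k, a) <- xs, (k', b) <- ys, k == k' ]

    -- Indexed join: first turn ys into an indexed table, then comprehend
    -- over xs and the matching bucket only.
    joinIndexed :: Ord k => [(k, a)] -> [(k, b)] -> [(k, (a, b))]
    joinIndexed xs ys =
      let table = Map.fromListWith (++) [ (k, [b]) | (k, b) <- ys ]
      in  [ (k, (a, b)) | (k, a) <- xs, b <- Map.findWithDefault [] k table ]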

Sequent Calculus as a Compiler Intermediate Language

Sequent Calculus as a Compiler Intermediate Language
2016 by Paul Downen, Luke Maurer, Zena M. Ariola, Simon Peyton Jones

The typed λ-calculus arises canonically as the term language for a logic called natural deduction, using the Curry-Howard isomorphism: the pervasive connection between logic and programming languages asserting that propositions are types and proofs are programs. Indeed, for many people, the λ-calculus is the living embodiment of Curry-Howard.

But natural deduction is not the only logic! Conspicuously, natural deduction has a twin, born in the very same paper, called the sequent calculus. Thanks to the Curry-Howard isomorphism, terms of the sequent calculus can also be seen as a programming language with an emphasis on control flow.
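
For a concrete feel of "sequent calculus as a programming language", here is a Haskell rendering of the syntax of Curien and Herbelin's λμμ̃-calculus, which is close in spirit to the intermediate language studied in the paper (the paper's Sequent Core differs in its details): a program is a command in which a producer of a value meets a consumer of one, making control flow explicit.

    -- My own rendering of lambda-mu-mu~ syntax, not the paper's Sequent Core.
    type Var   = String   -- ordinary (value) variables
    type CoVar = String   -- continuation (covalue) variables

    data Term                        -- producers
      = Var Var
      | Lam Var Term                 -- \x. v
      | Mu  CoVar Command            -- mu alpha. c : capture the current consumer

    data Cont                        -- consumers
      = CoVar CoVar
      | Push Term Cont               -- v . e : an argument, then the rest of the stack
      | MuTilde Var Command          -- mu~ x. c : bind the incoming value, run c

    data Command = Cut Term Cont     -- <v || e> : a producer meets a consumer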

The Syntax and Semantics of Quantitative Type Theory

The Syntax and Semantics of Quantitative Type Theory by Robert Atkey:

Type Theory offers a tantalising promise: that we can program and reason within a single unified system. However, this promise slips away when we try to produce efficient programs. Type Theory offers little control over the intensional aspect of programs: how are computational resources used, and when can they be reused. Tracking resource usage via types has a long history, starting with Girard's Linear Logic and culminating with recent work in contextual effects, coeffects, and quantitative type theories. However, there is conflict with full dependent Type Theory when accounting for the difference between usages in types and terms. Recently, McBride has proposed a system that resolves this conflict by treating usage in types as a zero usage, so that it doesn't affect the usage in terms. This leads to a simple expressive system, which we have named Quantitative Type Theory (QTT).

McBride presented a syntax and typing rules for the system, as well as an erasure property that exploits the difference between “not used” and “used”, but does not do anything with the finer usage information. In this paper, we present a semantic interpretation of a variant of McBride's system, where we fully exploit the usage information. We interpret terms simultaneously as having extensional (compile-time) content and intensional (runtime) content. In our example models, extensional content is set-theoretic functions, representing the compile-time or type-level content of a type-theoretic construction. Intensional content is given by realisers for the extensional content. We use Abramsky et al.'s Linear Combinatory Algebras as realisers, yielding a large range of potential models from Geometry of Interaction, graph models, and syntactic models. Read constructively, our models provide a resource sensitive compilation method for QTT.

To rigorously define the structure required for models of QTT, we introduce the concept of a Quantitative Category with Families, a generalisation of the standard Category with Families class of models of Type Theory, and show that this class of models soundly interprets Quantitative Type Theory.
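
Roughly, and as my own rendering of the shape of the system rather than a quotation from the paper: a QTT judgement annotates every context entry with a usage drawn from a semiring R, and the subject of the judgement with a usage σ that is either 0 (erased, type-level) or 1 (runtime); premises that only concern types are checked in the 0 fragment, which is how usage in types is kept from being charged against usage in terms. Something like:

    % My own sketch of the judgement form and variable rule, not quoted from the paper.
    x_1 \overset{\rho_1}{:} S_1,\; \ldots,\; x_n \overset{\rho_n}{:} S_n
      \;\vdash\; M \overset{\sigma}{:} T
      \qquad \rho_i \in R \ \text{(a semiring)},\quad \sigma \in \{0, 1\}

    \frac{\phantom{X}}
         {0\Gamma_1,\; x \overset{\sigma}{:} S,\; 0\Gamma_2 \;\vdash\; x \overset{\sigma}{:} S}
      \;\text{(variable)}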

Resource-aware programming is a hot topic these days, with Rust exploiting affine and ownership types to scope and track resource usage, and with Ethereum requiring programs to spend "gas" to execute. Combining linear and dependent types has proven difficult, though, so making it easier to track and reason about resource usage in dependent type theories would be a huge benefit for making verification practical in domains where resources are limited.

RustBelt: Securing the Foundations of the Rust Programming Language

RustBelt: Securing the Foundations of the Rust Programming Language by Ralf Jung, Jacques-Henri Jourdan, Robbert Krebbers, Derek Dreyer:

Rust is a new systems programming language that promises to overcome the seemingly fundamental tradeoff between high-level safety guarantees and low-level control over resource management. Unfortunately, none of Rust’s safety claims have been formally proven, and there is good reason to question whether they actually hold. Specifically, Rust employs a strong, ownership-based type system, but then extends the expressive power of this core type system through libraries that internally use unsafe features. In this paper, we give the first formal (and machine-checked) safety proof for a language representing a realistic subset of Rust. Our proof is extensible in the sense that, for each new Rust library that uses unsafe features, we can say what verification condition it must satisfy in order for it to be deemed a safe extension to the language. We have carried out this verification for some of the most important libraries that are used throughout the Rust ecosystem.

Rust is definitely pushing the envelope in a new direction, but there's always a little wariness around using libraries that make use of unsafe features, since "safety with performance" is a main reason people want to use Rust. So this is a great step in the right direction!

Type Systems as Macros

Type Systems as Macros, by Stephen Chang, Alex Knauth, Ben Greenman:

We present TURNSTILE, a metalanguage for creating typed embedded languages. To implement the type system, programmers write type checking rules resembling traditional judgment syntax. To implement the semantics, they incorporate elaborations into these rules. TURNSTILE critically depends on the idea of linguistic reuse. It exploits a macro system in a novel way to simultaneously type check and rewrite a surface program into a target language. Reusing a macro system also yields modular implementations whose rules may be mixed and matched to create other languages. Combined with typical compiler and runtime reuse, TURNSTILE produces performant typed embedded languages with little effort.
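
Turnstile itself is built from Racket macros, so the following is only a loose analogy (my own sketch, nothing to do with Turnstile's actual API): the "type check and rewrite simultaneously" idea is the familiar shape of an elaborating type checker, where each rule both checks a surface term and emits its translation into the target language.

    -- My own analogy in Haskell, not Turnstile's Racket implementation.
    import qualified Data.Map as Map

    data Ty = IntTy | FunTy Ty Ty
      deriving (Eq, Show)

    data Surface
      = SVar String
      | SLam String Ty Surface       -- lambda with an annotated argument
      | SApp Surface Surface
      | SLit Int

    data Core
      = CVar String
      | CLam String Core             -- annotations are gone after elaboration
      | CApp Core Core
      | CLit Int

    type Ctx = Map.Map String Ty

    -- Each case both checks the surface term and produces target code,
    -- mirroring how a Turnstile rule pairs a typing judgement with an elaboration.
    infer :: Ctx -> Surface -> Either String (Ty, Core)
    infer ctx (SVar x) = case Map.lookup x ctx of
      Just t  -> Right (t, CVar x)
      Nothing -> Left ("unbound variable " ++ x)
    infer _   (SLit n) = Right (IntTy, CLit n)
    infer ctx (SLam x t b) = do
      (tb, cb) <- infer (Map.insert x t ctx) b
      Right (FunTy t tb, CLam x cb)
    infer ctx (SApp f a) = do
      (tf, cf) <- infer ctx f
      (ta, ca) <- infer ctx a
      case tf of
        FunTy t1 t2 | t1 == ta -> Right (t2, CApp cf ca)
        _                      -> Left "ill-typed application"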

This looks pretty awesome considering it's not limited to simply typed languages, but extends all the way to System F and F-omega! Even better, they can reuse previous type systems to define new ones, thereby reducing the effort needed to implement more expressive type systems. All code and further details are available here; there's also a blog post where Ben Greenman further discusses the related notion of "type tailoring", and of course both are directly related to Active Libraries.

Taken to its extreme: why not have an assembler with a powerful macro system of this sort as your host language, with every high-level language built on top of it? I'm not sure this approach would extend that far, but it's an interesting idea. You'd need a cpp-like, highly portable macro tool, and porting to a new platform would consist of writing architecture-specific macros for some core language, like System F.

This work may also conceptually dovetail with another thread discussing fexprs and compilation.

Contextual isomorphisms

Contextual Isomorphisms
Paul Blain Levy
2017

What is the right notion of "isomorphism" between types, in a simple type theory? The traditional answer is: a pair of terms that are inverse, up to a specified congruence. We firstly argue that, in the presence of effects, this answer is too liberal and needs to be restricted, using Führmann’s notion of thunkability in the case of value types (as in call-by-value), or using Munch-Maccagnoni’s notion of linearity in the case of computation types (as in call-by-name). Yet that leaves us with different notions of isomorphism for different kinds of type.

This situation is resolved by means of a new notion of “contextual” isomorphism (or morphism), analogous at the level of types to contextual equivalence of terms. A contextual morphism is a way of replacing one type with the other wherever it may occur in a judgement, in a way that is preserved by the action of any term with holes. For types of pure λ-calculus, we show that a contextual morphism corresponds to a traditional isomorphism. For value types, a contextual morphism corresponds to a thunkable isomorphism, and for computation types, to a linear isomorphism.

This paper is based on a very simple idea that everyone familiar with type systems can enjoy. It then applies it in a technical setting where it makes a useful contribution. I suspect that many readers will find that second part too technical, but they may still enjoy the idea itself. To facilitate this, I will rephrase the abstract above in a way that I hope makes it accessible to a larger audience.

The problem that the paper solves is: how do we know what it means for two types to be equivalent? For many languages there are reasonable definitions of equivalence (such as: there exists a bijection between the two types in the language), but for more advanced languages these definitions break down. For example, in the presence of hidden mutable state, one can build a pair of functions from the one-element type unit to the two-element type bool and back that are the identity when composed together -- the usual definition of bijection -- while these two types should probably not be considered "the same". The two functions share some hidden state, so they "cheat". Other, more complex notions of type equivalence have been given in the literature, but how do we know whether they are the "right" notions, or whether they may disappoint us in the same way?
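
Here is that "cheating bijection" spelled out in Haskell, with IO playing the role of the effectful language (my own rendering of the folklore counterexample, not code from the paper): composed one way, the pair is trivially the identity on the one-element type; composed the other way, it hands back exactly the boolean it was given; yet nobody would want to call bool and unit the same type.

    -- My own rendering of the counterexample, not code from the paper.
    import Control.Monad ((>=>))
    import Data.IORef
    import System.IO.Unsafe (unsafePerformIO)

    -- The hidden, shared mutable state the two functions "cheat" with.
    hidden :: IORef Bool
    hidden = unsafePerformIO (newIORef False)
    {-# NOINLINE hidden #-}

    toUnit :: Bool -> IO ()
    toUnit b = writeIORef hidden b      -- stash the argument

    toBool :: () -> IO Bool
    toBool () = readIORef hidden        -- read it back

    -- (toUnit >=> toBool) b  returns b for every b, and
    -- (toBool >=> toUnit) () returns (), the only value of unit,
    -- so under the naive definition Bool and () look "isomorphic".
    roundTrip :: Bool -> IO Bool
    roundTrip = toUnit >=> toBool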

To define what it means for two program fragments to be equivalent, we have a "gold standard", namely contextual equivalence: two program fragments are contextually equivalent if we can replace one with the other in any complete program without changing its behavior. This is simple to state, it is usually clear how to instantiate the definition for a new system, and it gives a satisfying notion of equivalence. It may not be the most convenient one to work with, so people define other, more specific notions of equivalence (typically beta-eta-equivalence or logical relations); it is fine if these are more sophisticated, and their definitions harder to justify or understand, because they can always be compared to the simple definition to gain confidence.

The simple idea in the paper above is to use this exact same trick to define what it means for two types to be equivalent. Naively, one could say that two types are equivalent if, in any well-typed program, one can replace some occurrences of the first type by occurrences of the second type, all other things being unchanged. This does not quite work, as changing the types that appear in a program without changing its terms would create ill-typed terms. So instead, the paper proposes that two types are equivalent when we are told how to transform any program using the first type into a program using the second type, in a way that is bijective (invertible) and compositional -- see the paper for details.

Then, the author can validate this definition by showing that, when instantiated to languages (simple or complex) where existing notions of equivalence have been proposed, this new notion of equivalence corresponds to the previous notions.

(Readers may find that even the warmup part of the paper, namely sections 1 to 4, pages 1 to 6, is rather dense, with a compactly exposed general idea and arguably a lack of concrete examples that would help understanding. Surely this terseness is in large part a consequence of strict page limits -- conference articles are the tweets of computer science research. A nice side-effect (no pun intended) is that you can observe a carefully chosen formal language at work, designed to expose the setting and perform the relevant proofs in minimal space: category theory, and in particular the concept of naturality, is the killer space-saving measure here.)
