Semantics

Applications of Blockchain to Programming Language Theory

Let's talk about Blockchain. Goal is to use this forum topic to highlight its usefulness to programming language theory and practice. If you're familiar with existing research efforts, please share them here. In addition, feel free to generate ideas for how Blockchain could improve languages and developer productivity.

As one tasty example: Blockchain helps to formalize thinking about mutual knowledge and common knowledge, and potentially think about sharing intergalactic computing power through vast distributed computing fabrics. If we can design contracts in such a way that maximizes the usage of mutual knowledge while minimizing common knowledge to situations where you have to "prove your collateral", third-party transactions could eliminate a lot of back office burden. But, there might be benefits in other areas of computer science from such research, as well.

Some language researchers, like Mark S. Miller, have always dreamed of Agoric and the Decades-Long Quest for Secure Smart Contracts.

Some may also be aware that verification of smart contracts is an important research area, because of the notorious theft of purse via logic bug in an Ethereum smart contract.

The Gentle Art of Levitation

The Gentle Art of Levitation

2010 by James Chapman, Pierre-Evariste Dagand, Conor McBride, Peter Morrisy

We present a closed dependent type theory whose inductive types are given not by a scheme for generative declarations, but by encoding in a universe. Each inductive datatype arises by interpreting its description—a first-class value in a datatype of descriptions. Moreover, the latter itself has a description. Datatype-generic programming thus becomes ordinary programming. We show some of the resulting generic operations and deploy them in particular, useful ways on the datatype of datatype descriptions itself. Surprisingly this apparently self-supporting setup is achievable without paradox or infinite regress.
It's datatype descriptions all the way down.

Comprehending Ringads

Comprehending Ringads

2016 by Jeremy Gibbons

Ringad comprehensions represent a convenient notation for expressing database queries. The ringad structure alone does not provide a good explanation or an efficient implementation of relational joins; but by allowing heterogeneous comprehensions, involving both bag and indexed table ringads, we show how to accommodate these too.
Indexed/parametric/graded monads are the key (read the paper to understand the pun).

Sequent Calculus as a Compiler Intermediate Language

Sequent Calculus as a Compiler Intermediate Language
2016 by Paul Downen, Luke Maurer, Zena M. Ariola, Simon Peyton Jones

The typed λ-calculus arises canonically as the term language for a logic called natural deduction, using the Curry-Howard isomorphism: the pervasive connection between logic and programming languages asserting that propositions are types and proofs are programs. Indeed, for many people, the λ-calculus is the living embodiment of Curry-Howard.

But natural deduction is not the only logic! Conspicuously, natural deduction has a twin, born in the very same paper, called the sequent calculus. Thanks to the Curry-Howard isomorphism, terms of the sequent calculus can also be seen as a programming language with an emphasis on control flow.

Implementing Algebraic Effects in C

Implementing Algebraic Effects in C by Daan Leijen:

We describe a full implementation of algebraic effects and handlers as a library in standard and portable C99, where effect operations can be used just like regular C functions. We use a formal operational semantics to guide the C implementation at every step where an evaluation context corresponds directly to a particular C execution context. Finally we show a novel extension to the formal semantics to describe optimized tail resumptions and prove that the extension is sound. This gives two orders of magnitude improvement to the performance of tail resumptive operations (up to about 150 million operations per second on a Core i7@2.6GHz)

Another great paper by Daan Leijen, this time on a C library with immediate practical applications at Microsoft. The applicability is much wider though, since it's an ordinary C library for defining and using arbitrary algebraic effects. It looks pretty usable and is faster and more general than most of the C coroutine libraries that already exist.

It's a nice addition to your toolbox for creating language runtimes in C, particularly since it provides a unified, structured way of creating and handling a variety of sophisticated language behaviours, like async/await, in ordinary C with good performance. There has been considerable discussion here of C and low-level languages with green threads, coroutines and so on, so hopefully others will find this useful!

The Syntax and Semantics of Quantitative Type Theory

The Syntax and Semantics of Quantitative Type Theory by Robert Atkey:

Type Theory offers a tantalising promise: that we can program and reason within a single unified system. However, this promise slips away when we try to produce efficient programs. Type Theory offers little control over the intensional aspect of programs: how are computational resources used, and when can they be reused. Tracking resource usage via types has a long history, starting with Girard's Linear Logic and culminating with recent work in contextual effects, coeffects, and quantitative type theories. However, there is conflict with full dependent Type Theory when accounting for the difference between usages in types and terms. Recently, McBride has proposed a system that resolves this conflict by treating usage in types as a zero usage, so that it doesn't affect the usage in terms. This leads to a simple expressive system, which we have named Quantitative Type Theory (QTT).

McBride presented a syntax and typing rules for the system, as well as an erasure property that exploits the difference between “not used” and “used”, but does not do anything with the finer usage information. In this paper, we present present a semantic interpretation of a variant of McBride's system, where we fully exploit the usage information. We interpret terms simultaneously as having extensional (compile-time) content and intensional (runtime) content. In our example models, extensional content is set-theoretic functions, representing the compile-time or type-level content of a type-theoretic construction. Intensional content is given by realisers for the extensional content. We use Abramsky et al.'s Linear Combinatory Algebras as realisers, yield a large range of potential models from Geometry of Interaction, graph models, and syntactic models. Read constructively, our models provide a resource sensitive compilation method for QTT.

To rigorously define the structure required for models of QTT, we introduce the concept of a Quantitative Category with Families, a generalisation of the standard Category with Families class of models of Type Theory, and show that this class of models soundly interprets Quantitative Type Theory.

Resource-aware programming is a hot topic these days, with Rust exploiting affine and ownership types to scope and track resource usage, and with Ethereum requiring programs to spend "gas" to execute. Combining linear and dependent types has proven difficult though, so making it easier to track and reason about resource usage in dependent type theories would then be a huge benefit to making verification more practical in domains where resources are limited.

Philip Wadler: Category Theory for the Working Hacker

Nothing you don't already know, if you are inteo this sort of thing (and many if not most LtU-ers are), but a quick way to get the basic idea if you are not. Wadler has papers that explain Curry-Howard better, and the category theory content here is very basic -- but it's an easy listen that will give you the fundamental points if you still wonder what this category thing is all about.

To make this a bit more fun for those already in the know: what is totally missing from the talk (understandable given time constraints) is why this should interest the "working hacker". So how about pointing out a few cool uses/ideas that discerning hackers will appreciate? Go for it!

Fully Abstract Compilation via Universal Embedding

Fully Abstract Compilation via Universal Embedding by Max S. New, William J. Bowman, and Amal Ahmed:

A fully abstract compiler guarantees that two source components are observationally equivalent in the source language if and only if their translations are observationally equivalent in the target. Full abstraction implies the translation is secure: target-language attackers can make no more observations of a compiled component than a source-language attacker interacting with the original source component. Proving full abstraction for realistic compilers is challenging because realistic target languages contain features (such as control effects) unavailable in the source, while proofs of full abstraction require showing that every target context to which a compiled component may be linked can be back-translated to a behaviorally equivalent source context.

We prove the first full abstraction result for a translation whose target language contains exceptions, but the source does not. Our translation—specifically, closure conversion of simply typed λ-calculus with recursive types—uses types at the target level to ensure that a compiled component is never linked with attackers that have more distinguishing power than source-level attackers. We present a new back-translation technique based on a deep embedding of the target language into the source language at a dynamic type. Then boundaries are inserted that mediate terms between the untyped embedding and the strongly-typed source. This technique allows back-translating non-terminating programs, target features that are untypeable in the source, and well-bracketed effects.

Potentially a promising step forward to secure multilanguage runtimes. We've previously discussed security vulnerabilities caused by full abstraction failures here and here. The paper also provides a comprehensive review of associated literature, like various means of protection, back translations, embeddings, etc.

Simon Peyton Jones elected into the Royal Society Fellowship

Simon Peyton Jones has been elected as a Fellow of the Royal Society. The Royal Society biography reads:


Simon's main research interest is in functional programming languages, their implementation, and their application. He was a key contributor to the design of the now-standard functional language Haskell, and is the lead designer of the widely-used Glasgow Haskell Compiler (GHC). He has written two textbooks about the implementation of functional languages.

More generally, Simon is interested in language design, rich type systems, compiler technology, code generation, runtime systems, virtual machines, and garbage collection. He is particularly motivated by direct use of principled theory to practical language design and implementation -- that is one reason he loves functional programming so much.

Simon is also chair of Computing at School, the grass-roots organisation that was at the epicentre of the 2014 reform of the English computing curriculum.

Congratulations SPJ!

Running Probabilistic Programs Backwards

I saw this work presented at ESOP 2015 by Neil Toronto, and the talk was excellent (slides).

Running Probabilistic Programs Backwards
Neil Toronto, Jay McCarthy, David Van Horn
2015

Many probabilistic programming languages allow programs to be run under constraints in order to carry out Bayesian inference. Running programs under constraints could enable other uses such as rare event simulation and probabilistic verification---except that all such probabilistic languages are necessarily limited because they are defined or implemented in terms of an impoverished theory of probability. Measure-theoretic probability provides a more general foundation, but its generality makes finding computational content difficult.

We develop a measure-theoretic semantics for a first-order probabilistic language with recursion, which interprets programs as functions that compute preimages. Preimage functions are generally uncomputable, so we derive an abstract semantics. We implement the abstract semantics and use the implementation to carry out Bayesian inference, stochastic ray tracing (a rare event simulation), and probabilistic verification of floating-point error bounds.

(also on SciRate)

The introduction sells the practical side of the work a bit better than the abstract.

Stochastic ray tracing [30] is one such rare-event simulation task. As illus- trated in Fig. 1, to carry out stochastic ray tracing, a probabilistic program simulates a light source emitting a single photon in a random direction, which is reflected or absorbed when it hits a wall. The program outputs the photon’s path, which is constrained to pass through an aperture. Millions of paths that meet the constraint are sampled, then projected onto a simulated sensor array.

The program’s main loop is a recursive function with two arguments: path, the photon’s path so far as a list of points, and dir, the photon’s current direction.

simulate-photon path dir :=
  case (find-hit (fst path) dir) of
    absorb pt −→ (pt, path)
    reflect pt norm −→ simulate-photon (pt, path) (random-half-dir norm)

Running simulate-photon (pt, ()) dir, where pt is the light source’s location and dir is a random emission direction, generates a photon path. The fst of the path (the last collision point) is constrained to be in the aperture. The remainder of the program is simple vector math that computes ray-plane intersections.

In contrast, hand-coded stochastic ray tracers, written in general-purpose languages, are much more complex and divorced from the physical processes they simulate, because they must interleave the advanced Monte Carlo algorithms that ensure the aperture constraint is met.

Unfortunately, while many probabilistic programming languages support random real numbers, none are capable of running a probabilistic program like simulate-photon under constraints to carry out stochastic ray tracing. The reason is not lack of engineering or weak algorithms, but is theoretical at its core: they are all either defined or implemented using [density functions]. [...] Programs whose outputs are deterministic functions of random values and programs with recursion generally cannot denote density functions. The program simulate-photon exhibits both characteristics.

Measure-theoretic probability is a more powerful alternative to this naive probability theory based on probability mass and density functions. It not only subsumes naive probability theory, but is capable of defining any computable probability distribution, and many uncomputable distributions. But while even the earliest work [15] on probabilistic languages is measure-theoretic, the theory’s generality has historically made finding useful computational content difficult. We show that measure-theoretic probability can be made computational by

  1. Using measure-theoretic probability to define a compositional, denotational semantics that gives a valid denotation to every program.
  2. Deriving an abstract semantics, which allows computing answers to questions about probabilistic programs to arbitrary accuracy.
  3. Implementing the abstract semantics and efficiently solving problems.

In fact, our primary implementation, Dr. Bayes, produced Fig. 1b by running a probabilistic program like simulate-photon under an aperture constraint.

XML feed