Applied Category Theory - The Emerging Science of Compositionality

An enjoyable 25-minute introductory talk: YOW! Lambda Jam 2019 - Ken Scambler - Applied Category Theory (slides)

What do programming, quantum physics, chemistry, neuroscience, systems biology, natural language parsing, causality, network theory, game theory, dynamical systems and database theory have in common?

As functional programmers, we know how useful category theory can be for our work - or perhaps how abstruse and distant it can seem. What is less well known is that applying category theory to the real world is an exciting field of study that has really taken off in just the last few years. It turns out that we share something big with other fields and industries - we want to make big things out of little things without everything going to hell! The key is compositionality, the central idea of category theory.

Previously: Seven Sketches in Compositionality: An Invitation to Applied Category Theory.

(via Brian McKenna)

Tensor Considered Harmful

Tensor Considered Harmful, by Alexander Rush

TL;DR: Despite its ubiquity in deep learning, Tensor is broken. It forces bad habits such as exposing private dimensions, broadcasting based on absolute position, and keeping type information in documentation. This post presents a proof-of-concept of an alternative approach, named tensors, with named dimensions. This change eliminates the need for indexing, dim arguments, einsum- style unpacking, and documentation-based coding. The prototype PyTorch library accompanying this blog post is available as namedtensor.

Thanks to Edward Z. Yang for pointing me to this "Considered Harmful" position paper.

Seven Sketches in Compositionality: An Invitation to Applied Category Theory

Seven Sketches in Compositionality: An Invitation to Applied Category Theory

2018 by Brendan Fong and David I. Spivak

Category theory is becoming a central hub for all of pure mathematics. It is unmatched in its ability to organize and layer abstractions, to find commonalities between structures of all sorts, and to facilitate communication between different mathematical communities. But it has also been branching out into science, informatics, and industry. We believe that it has the potential to be a major cohesive force in the world, building rigorous bridges between disparate worlds, both theoretical and practical. The motto at MIT is mens et manus, Latin for mind and hand. We believe that category theory—and pure math in general—has stayed in the realm of mind for too long; it is ripe to be brought to hand.
A very approachable but useful introduction to category theory. It avoids the Scylla and Charybdis of becoming incomprehensible after page 2 (as many academic texts do), and barely scratching the surface (as many popular texts do).

"Three Things I Wish I Knew When I Started Designing Languages"

The transcript of Three Things I Wish I Knew When I Started Designing Languages, a talk given by Peter Alvaro somewhere or other, is up at Info Q.

Peter Alavaro's main research interest is in taming distributed systems. He starts his talk with the provocative thesis, "In the future, all radical new languages will be domain-specific languages." He talks of the evolution of his ideas about dealing with distributed systems:

  1. Little interest by designers of programming-language designers in filling huge difficulty of debugging in context of distributed systems;
  2. PLs often make handling of data somewhat implicit, even with functional programming, which he says is dangerous in distributed programming;
  3. To talk about the flow of data properly, we need to talk about time;
  4. Two things that influenced him as a grad student: Jeff Ullman's claim that encapsulation and declarativity are in tension, and Fagin's theorem (the existential fragment of second-order logic characterises NP);
  5. Idea that distributed systems can be considered as protocols specified a bit like SQL or Datalog queries;
  6. Triviality with query languages of characterising the idea of place in distributive systems: they are just another relation parameter;
  7. Describing evolution of a system in time can be done with two other things: counters and negation, leading to Bertram Ludäscher's language Statelog. But this way of doing things leads to the kind of low-level overexpressive modelling he was trying to avoid;
  8. "What is it about...protocols that they seem to require negation to express?” Turns out that if you drop negation, you characterise the protocols that deliver messages deterministically.

He summarises by saying the only good reason to design a programming language (I assume he means a radically novel language) is to shape your understanding of the problem. No regrets of being the only user of his first language, Datalist, because the point is that it shaped all his later thought in his research.

Selective Functors

From Andrey Mokhov's twitter feed:

Is there any intermediate abstraction between applicative functors and monads? And if yes, what is it? In a new paper with @geo2A, @simonmar and @dimenix we explore "selective functors", which are essentially applicative functors with branching: https://www.staff.ncl.ac.uk/andrey.mokhov/selective-functors.pdf

We've implemented selective functors in Haskell: https://github.com/snowleopard/selective, OCaml: https://github.com/snowleopard/selective-ocaml, and even Coq: https://github.com/tuura/selective-theory-coq (the Coq repository contains some proofs of correctness that our selective instances are lawful). And there is also a PureScript fork!

The Little Typer

A new introductory book about dependent types, involving some familiar names:

The Little Typer

by Daniel P. Friedman and David Thrane Christiansen.

Foreword by Robert Harper.

Afterword by Conor McBride.

An introduction to dependent types, demonstrating the most beautiful aspects, one step at a time.

A program's type describes its behavior. Dependent types are a first-class part of a language, and are much more powerful than other kinds of types; using just one language for types and programs allows program descriptions to be as powerful as the programs they describe. The Little Typer explains dependent types, beginning with a very small language that looks very much like Scheme and extending it to cover both programming with dependent types and using dependent types for mathematical reasoning. Readers should be familiar with the basics of a Lisp-like programming language, as presented in the first four chapters of The Little Schemer.

The first five chapters of The Little Typer provide the needed tools to understand dependent types; the remaining chapters use these tools to build a bridge between mathematics and programming. Readers will learn that tools they know from programming—pairs, lists, functions, and recursion—can also capture patterns of reasoning. The Little Typer does not attempt to teach either practical programming skills or a fully rigorous approach to types. Instead, it demonstrates the most beautiful aspects as simply as possible, one step at a time.

On compositionality

Jules Hedges has written a thought-provoking blog post, On compositionality where he connects the familiar idea of compositionality to the idea of emergent effects in nature, where systems can be understood as either having compositional properties or emergent properties.

The key point about emergent systems is that they are hard to understand, and this is as true for engineering as it is for science. He goes on to say "As a final thought, I claim that compositionality is extremely delicate, and that it is so powerful that it is worth going to extreme lengths to achieve it", so that avoiding emergent effects is a characteristic of good programming language design.

Some thoughts:

  1. His examples of emergent systems are biology and game theory from an economic perspective. I would add to this list physics: of his co-authored paper showing that the spectral gap is undecidable, David Pérez-García said "For example, our results show that adding even a single particle to a lump of matter, however large, could in principle dramatically change its properties."
  2. Spolsky's famous characterisation of interfaces built on shaky foundations as Leaky abstractions to me makes the distinction between compositional and emergent systems a little less than sharp.
  3. We could talk endlessly about the list of what he regards as compositionality-breaking features of PLs. The evils of global state are well-documented, but I find dmbarbour's argument that Local state is poison a very interesting alternative way to look at what properties do we want from code; more generally, what kind of compositionalty PLs offer is very much paradigm dependent. Gotos are considered harmful, but the Linux kernel has little trouble with longjmp because of its mandated coding style: compositionality in engineering is a not just a matter of semantics but also of use. He targets OO and Haskell's type classes - I think he is quite correct - note that within these paradigms one can regain compositionality by restricting to LSP or algebraic classes, and supporting his thesis we see that these ideas have spawned ideas for the design of new, cleaner PLs.

History of Lisp

History of Lisp (The history of LISP according to McCarthy's memory in 1978, presented at the ACM SIGPLAN History of Programming Languages Conference.)

This is such a fun paper which I couldn't find on LtU. It's about the very early history of programming (1950s and '60s), back when things we take for granted today didn't exist yet.

On taking apart complex data structures with functions like CAR and CDR:

It was immediately apparent that arbitrary subexpressions of symbolic expressions could be obtained by composing the functions that extract immediate subexpressions, and this seemed reason enough to go to an algebraic language.

On creating new data, i.e. CONS:

At some point a cons(a,d,p,t) was defined, but it was regarded as a subroutine and not as a function with a value. ... Gelernter and Gerberich noticed that cons should be a function, not just a subroutine, and that its value should be the location of the word that had been taken from the free storage list. This permitted new expressions to be constructed out of subsubexpressions by composing occurrences of cons

On inventing IF:

This led to the invention of the true conditional expression which evaluates only one of N1 and N2 according to whether M is true or false and to a desire for a programming language that would allow its use.

On how supreme laziness led to the invention of garbage collection:

Once we decided on garbage collection, its actual implementation could be postponed, because only toy examples were being done.

You might have heard this before:

S.R. Russell noticed that eval could serve as an interpreter for LISP, promptly hand coded it, and we now had a programming language with an interpreter.

And the rest is history...

Notes on notation and thought

(via HN)

A nice collection of quotes on notation as a tool of thought. Mostly not programming related, which actually makes them more interesting, offering a richer diversity of examples. We used to have quite a few discussions of notation in the early days (at least in part because I never accepted the prevailing dogma that syntax is not that interesting or important), which is a good reminder for folks to check the archives.

Safe Dynamic Memory Management in Ada and SPARK

Safe Dynamic Memory Management in Ada and SPARK by Maroua Maalej, Tucker Taft, Yannick Moy:

Handling memory in a correct and efficient way is a step toward safer, less complex, and higher performing software-intensive systems. However, languages used for critical software development such as Ada, which supports formal verification with its SPARK subset, face challenges regarding any use of pointers due to potential pointer aliasing. In this work, we introduce an extension to the Ada language, and to its SPARK subset, to provide pointer types (“access types” in Ada) that provide provably safe, automatic storage management without any asynchronous garbage collection, and without explicit deallocation by the user. Because the mechanism for these safe pointers relies on strict control of aliasing, it can be used in the SPARK subset for formal verification, including both information flow analysis and proof of safety and correctness properties. In this paper, we present this proposal (which has been submitted for inclusion in the next version of Ada), and explain how we are able to incorporate these pointers into formal analyses

For the systems programmers among you, you might be interested in some new developments in Ada where they propose to add ownership types to Ada's pointer/access types, to improve the flexibility of the programs that can be written and whose safety can be automatically verified. The automated satisfiability of these safety properties is a key goal of the SPARK Ada subset.