Category Theory

Simplicial Databases

Simplicial Databases, David I. Spivak.

In this paper, we define a category DB, called the category of simplicial databases, whose objects are databases and whose morphisms are data-preserving maps. Along the way we give a precise formulation of the category of relational databases, and prove that it is a full subcategory of DB. We also prove that limits and colimits always exist in DB and that they correspond to queries such as select, join, union, etc. One feature of our construction is that the schema of a simplicial database has a natural geometric structure: an underlying simplicial set. The geometry of a schema is a way of keeping track of relationships between distinct tables, and can be thought of as a system of foreign keys. The shape of a schema is generally intuitive (e.g. the schema for round-trip flights is a circle consisting of an edge from $A$ to $B$ and an edge from $B$ to $A$), and as such, may be useful for analyzing data. We give several applications of our approach, as well as possible advantages it has over the relational model. We also indicate some directions for further research.

This is what happens when you try to take the existence of ORDER BY and COUNT in SQL seriously. :-)

If you're puzzled by how a geometric idea like simplexes could show up here, remember that the algebraic view of simplicial sets is as presheaves on the category of finite total orders and order-preserving maps. Every finite sequence gives rise to a total order on its set of positions, and tables have rows and columns as sequences!

An Innocent Model of Linear Logic

An Innocent Model of Linear Logic by Paul-André Melliès was referenced by Noam in a serendipitious subthread of the "Claiming Infinities" thread.

Here's the abstract:

Since its early days, deterministic sequential game semantics has been limited to linear or polarized fragments of linear logic. Every attempt to extend the semantics to full propositional linear logic has bumped against the so-called Blass problem, which indicates (misleadingly) that a category of sequential games cannot be self-dual and cartesian at the same time. We circumvent this problem by considering (1) that sequential games are inherently positional; (2) that they admit internal positions as well as external positions. We construct in this way a sequential game model of propositional linear logic, which incorporates two variants of the innocent arena game model: the well-bracketed and the non well-bracketed ones.

The introduction goes on to refer to to André Joyal's "Category Y with Conway games as objects, and winning strategies as morphisms, composed by sequential interaction," and points out that "it is a precursor of game semantics for proof theory and programming languages," and is "a self-dual category of sequential games." The foreword mentions that the paper goes on to give "a crash course on asynchronous games" and then "constructs a linear continuation monad equivalent to the identity functor, by allowing internal positions in our games, [which] circumvents the Blass problem and defines a model of linear logic."

Jacques Carette called this paper mind-blowing. My mind-blow warning light already exploded. I'm posting this paper because I know a number of LtUers are interested in these topics, and this way I can buttonhole one of them the next time I see them and ask them to explain it to me. ;)

Lawvere Theories and Monads

Martin Hyland and John Power (2007). The Category Theoretic Understanding of Universal Algebra: Lawvere Theories and Monads. ENTCS 172:437-458.

Both monads and Lawvere theories provide characterisations of algebraic structure, with monads providing the more general characterisation. The authors provide an introduction to Lawvere theories, discusses their relationship to sets, and why monads became the more popular treatment.

Then they tackle the application of the theory to the semantics of side effects, where they argue that the generality of monads allow them to characterise computational phenomena that are not to do with side effects such as partiality and continuations, and argue that Lawvere theories more cleanly characterise what side effects are.

This paper is a good introduction to an important line of recent research done by Hyland&Power; cf. also the LtU story Combining computational effects.

Parameterized Notions of Computation

Parameterized Notions of Computation, Robert Atkey, JFP 2008.

Moggi's Computational Monads and Power et al's equivalent notion of Freyd category have captured a large range of computational effects present in programming languages. Examples include non-termination, non-determinism, exceptions, continuations, side-effects and input/output. We present generalisations of both computational monads and Freyd categories, which we call parameterised monads and parameterised Freyd categories, that also capture computational effects with parameters.

Examples of such are composable continuations, side-effects where the type of the state varies and input/output where the range of inputs and outputs varies. By also considering structured parameterisation, we extend the range of effects to cover separated side-effects and multiple independent streams of I/O. We also present two typed λ-calculi that soundly and completely model our categorical definitions — with and without symmetric monoidal parameterisation — and act as prototypical languages with parameterised effects.

Once you've programmed with monads for a while, it's pretty common to start defining parameterized families of monads -- e.g., we might define a family of type constructors for IO, in which the program type additionally tracks which files the computation reads and writes from. This is a very convenient programming pattern, but the theory of it is honestly a little sketchy: on what basis do we conclude that the indices we define actually track what we intend them to? And furthermore, why can we believe that (say) the monadic equational laws still apply? That's the question Atkey lays out a nice solution to. He gives a nice categorical semantics for indexed, effectful computations, and then cooks up lambda calculi whose equational theory corresponds to the equations his semantics justifies.

The application to delimited continuations is quite nice, and the type theories can also give a little insight into the basics of how stuff like Hoare Type Theory works (which uses parameterized monads, with a very sophisticated language of parameters).

On a slightly tangential note, this also raises in my mind a methodological point. Over the last n years, we've seen many people identify certain type constructors, whose usage is pervasive, and greatly simplified with some syntactic extensions -- monads, comonads, applicative functors, arrows, and so on. It's incredible to suggest that we have exhausted the list of interesting types, and so together they constitute a good argument for some kind of language extension mechanism, such as macros. However, all these examples also raise the bar for when a macro is a good idea, because what makes them compelling is precisely that the right syntax yields an interesting and pretty equational theory in the extended language.

Computation and the Periodic Table

By now there is an extensive network of interlocking analogies between physics, topology, logic and computer science, which can be seen most easily by comparing the roles that symmetric monoidal closed categories play in each subject. However, symmetric monoidal categories are just the n = 1, k = 3 entry of a hypothesized “periodic table” of k-tuply monoidal n-categories. This raises the question of how these analogies extend. We present some thoughts on this question, focusing on how monoidal closed 2-categories might let us understand the lambda calculus more deeply.

Link to the talk

via The n-Category Café

Arrows generalise monads and idioms

Two fresh papers from the Edinburgh theory stable:
  • Lindley, Wadler & Yallop, 2008. The Arrow Calculus, (Functional Pearl) (submitted to ICFP).
  • Lindley, Wadler & Yallop, 2008. Idioms are oblivious, arrows are meticulous, monads are promiscuous (submitted to MSFP)
    We revisit the connection between three notions of computation: Moggi’s monads, Hughes’s arrows and McBride and Paterson’s idioms (also called applicative functors ). We show that idioms are equivalent to arrows that satisfy the type isomorphism A ∼> B ≅ 1 ∼> (A -> B) and that monads are equivalent to arrows that satisfy the type isomorphism A ∼> B ≅A → (1 ∼> B). Further, idioms embed into arrows and arrows embed into monads.
The first paper introduce a reformulation of the Power/Thielecke/Paterson/McBride axiomatisation of arrows, which the authors argue is more natural, and shows that arrows generalise both monads and idioms. The second paper studies the relationships between the three formalisations in more formal depth; in particular the results about applicative functors struck me as significant.

Species: making analytic functors practical for functional programming

Species: making analytic functors practical for functional programming, Jacques Carette and Gordon Uszkay. Submitted to MSFP 2008.

Inspired by Joyal's theory of species, we show how to add new type constructors and constructor combinators to the tool set of functional languages. We show that all the important properties of inductive types lift to this new setting. Species are analytic functors, representing a broader range of structures than regular functors. This includes structures such as bags, cycles and graphs. The theory is greatly inspired by combinatorics rather than type theory: this adds interesting new tools to bear, but also requires more work on our part to show that species form a good foundations for a theory of type constructors. The combinatorial tools provide a calculus for these structures which has strong links with classical analysis, via a functorial interpretation of generating series. Furthermore, we show how generic programming idioms also generalise to this richer setting. Once the theory is understood, various methods of implementation are relatively straightforward.

Learning more about the theory of species and working out its implications for programming has been on my to-do list for several years now. It really seems like the natural way to more tightly integrate combinatoric ideas into the functional programming style. (For an example of how fruitful this connection can be, consider how datatype differentiation explains zippers/functional pointers.)

However, there's been this whole "graduation" and "primary research focus" stuff that's kept getting in the way, so I'm happy to see that Carette and Uszkay are figuring it out so I can just crib from them. :)

Help John Baez and Mike Stay!

John Baez and Mike Stay are working on a book chapter titled "Categories in Physics, Topology, Logic and Computation: a Rosetta Stone." They previously asked for some help with the logic section, and now they're looking for help with the computation section:

But now I really need comments from anyone who likes categories and theoretical computer science!

In fact, the final 'computation' section of the paper is still very rough. It introduces combinators but doesn’t really explain them well yet... and perhaps worse, it doesn’t say anything about the lambda calculus!

I plan to fix some of these deficiencies in the next week. But, I figure it’s best to get comments and criticism now, while there’s still time to take it into account.

This is already a great introductory paper, but the computation section is indeed quite rough. Obviously comments are welcome, but even if you don't have anything to add, the first sections are sure to be enjoyable for many LtU readers. The paper does not assume any background in category theory, logic or physics and manages to be an excellent introduction to the surprising connections between these fields. If you have some background, it's a very quick and fun read, and if you can offer feedback, so much the better!

Parametric datatype-genericity

Parametric datatype-genericity. Jeremy Gibbons and Ross Paterson. Submitted for publication.

Datatype-generic programs are programs that are parametrized by a datatype or type functor. There are two main styles of datatype-generic programming: the Algebra of Programming approach, characterized by structured recursion operators parametrized by a shape functor, and the Generic Haskell approach, characterized by case analysis over the structure of a datatype. We show that the former enjoys a kind of parametricity, relating the behaviours of generic functions at different types; in contrast, the latter is more ad hoc, with no coherence required or provided between the various clauses of a definition.

How could we have not mentioned this before?

The main result of this paper is that fold is a higher-order natural transformation. This means, the authors explain, that fold is a rather special kind of datatype-generic operator, both enjoying and requiring coherence between its datatype-specific instances.

We had several long discussions about the uniqueness of fold, which may serve as an introduction for those new to this sort of discussion. The tutorial on the universality of fold is, of course, on the papers page.

For some reason I feel a craving for bananas...

A Logic for Parametric Polymorphism

A Logic for Parametric Polymorphism, Gordon Plotkin and Martín Abadi.

In this paper we introduce a logic for parametric polymorphism. Just as LCF is a logic for the simply-typed lambda-calculus with recursion and arithmetic, our logic is a logic for System F. The logic permits the formal presentation and use of relational parametricity. Parametricity yields --- for example --- encodings of initial algebras, final co-algebras and abstract datatypes, with corresponding proof principles of induction, co-induction and simulation.

XML feed