Theoretical Foundations for Practical Totally Functional Programming, by Colin Kemp

My doctoral dissertation is now publicly accessible at

I believe it may be of interest to many here.

For those with an interest in the lambda-calculus there's lots of lambda-goodness, too much to summarise here, but includes:
- Functional representations of data and operators on them, beyond Church encodings (Chpt 5): See where Characteristic Predicates and Combinator Parsers 'come from', and how they and many like them can be derived
- Typing functions on Church Numerals (Chpt 4, and appendix)

There's also some rather deep foundational discussions
- What actually is 'interpretation' (Chpt 7)
- Turing-Complete computation without using tests
- Language-extension done via definitions, vs via interpretation (Chpt 6)
- That the bit (ie binary) isn't the smallest unit of information



What problems does this solve? Why would I be motivated to read this? You must have answered this question in your thesis and viva, so it shouldn't be much work to repeat here. Your post above doesn't give me sufficient information.


Thanks for making this available. I'm about half-way into chapter 2 and finding it very educational and readable so far at least.
It appears to me to be a foundational work in much the same line as many of the other papers in the 'Getting Started' thread, so I might even suggest that it be mentioned there? I'm still getting to grips with some of the fundamentals of PL theory, so I'll leave this to someone wiser.


Hi Noel. OK, I'll give it my best TFP sales-pitch / tease :)

Historically there have been two views of how to compute: the 'Turing' view, where one has machines which process inert symbols, and what we can call the 'Church' view, where entities-with-semantics (ie functions) interact. Simplistically speaking, in the Turing scheme one's program is composed of inert symbols which are then interpreted to make them appear to have semantics, while in the Church scheme one's program is composed of entities which already have the desired semantics. One can argue that the Church scheme is preferable: in that scheme program-components have their semantics from the start, without needing to be interpreted at runtime with all the costs that entails. Logically it would seem that interpretation should be avoided, yet it is the Turing view of computation (ie use of interpreted data) that is the popular one, more so than (pure) functional programming! Historically, the Lambda-Calculus lost out to the Turing Machine.

The costs of interpretation extend down from entire programs to the use of interpreters within programs. For example, say in your program you wanted to represent sets of integers, including infinite sets. You could define a set data-type containing eg a symbol for 'empty', a symbol for 'evens', one for 'odds'.. a symbol for 'theprimes'.. but then how about one for 'theprimeslessthan1000'? For maximum flexibility one would end up needing symbols such as '
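The alternative the paragraph gestures at is the characteristic-predicate representation: a set simply *is* its membership function, so no symbols need interpreting and 'theprimeslessthan1000' needs no new symbol at all. A minimal Haskell sketch of this idea (my own illustration; the names `IntSet`, `lessThan` etc. are assumptions, not from the thesis):

```haskell
-- A set of integers represented by its characteristic predicate:
-- the set *is* its membership semantics, nothing to interpret.
type IntSet = Integer -> Bool

empty, evens, odds, primes :: IntSet
empty _ = False
evens   = even
odds    = odd
primes n = n > 1 && all (\d -> n `mod` d /= 0) [2 .. n - 1]

-- Infinite sets pose no problem, and combinators yield new sets directly.
union, intersect :: IntSet -> IntSet -> IntSet
union     s t n = s n || t n
intersect s t n = s n && t n

lessThan :: Integer -> IntSet
lessThan k n = n < k

-- "The primes less than 1000", with no fresh symbol required:
primesLessThan1000 :: IntSet
primesLessThan1000 = primes `intersect` lessThan 1000
```

Membership queries are then just function application, eg `primesLessThan1000 997`.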

They are also everywhere. Use of data is prevalent in programming, as is use of code that processes data (ie interpreters) - it's the common way of doing things in non-functional programming. For example, note that even humble 'if-then' expressions interpret boolean data!
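To make the 'if-then' point concrete: the Church encoding of booleans removes the interpretation step, because a boolean *is* the selection behaviour rather than a symbol to be tested. A small sketch under my own naming (not taken from the thesis):

```haskell
-- Church booleans: the value is the selection itself,
-- so no 'if' is needed to interpret it.
true, false :: a -> a -> a
true  t _ = t
false _ f = f

-- 'if b then x else y' becomes plain application: b x y.
select :: (a -> a -> a) -> a -> a -> a
select b x y = b x y
```

Here `select true "yes" "no"` reduces to `"yes"` by application alone, with no branch test anywhere.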

Given its costs, and its prevalence, interpretation can be argued to be a problem. Can we do without all data and without all interpreters, and as a result achieve improved programs? Has Turing led us astray all these years, seducing us with the promise of easy Turing-Completeness; and is the harder road not taken, expressive pure functional programming, the way of the future?

If the above interests you and you'd like to learn more, then you might want to have a read. I think I've managed to untangle the issues and sort a fair bit of it all out.

If it doesn't, but you're familiar with the lambda-calculus (ie pure functional programming), you might still be interested in how one can systematically derive functions representing abstract data types (eg naturals, tuples, lists, sets, grammars), and operators on them (eg add/mul, projection, reduce, is-in, parse/print, respectively). Or in the transformations presented which let one eg actually derive, step by step, that mysterious function, the power function on Church numerals, 'pow a b = b a'.
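For readers who haven't met it, the 'mysterious' power function can be checked directly. A minimal Haskell sketch (my own, and deliberately signature-free, since fully general Church numerals need rank-n types):

```haskell
-- Church numerals: the numeral n *is* "apply a function n times".
zero  f x = x
suc n f x = f (n f x)
two   f x = f (f x)
three f x = f (f (f x))

-- The power function on Church numerals: a^b is just b applied to a.
pow a b = b a

-- Convert back to an ordinary number for inspection.
toInt n = n (+ 1) 0
```

So `toInt (pow two three)` evaluates `three two`, ie `two` composed with itself three times, giving the Church numeral for 8.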


I imagine this document was written in LaTeX; perhaps you can make the sources available?

You mean the pdf?

Good stuff!

I'm definitely a firm believer in total functions, though I had not heard of TFP, at least in this form.

Anyway, I'm reading through, so far very interesting! There seems to be a minor typo on pg 14, though. The representation of 3 only has 2 omegas, heh!

Not Practical Yet

To be clear, Totally Functional Programming is a completely different concept than Total Functional Programming as described by David Turner. The latter describes Functional Programming without divergence, error, or side-effects - which has many nice properties (such as eliminating semantic distinctions between strict, lenient, and lazy computation).

Totally Functional Programming (TFP) seems to fundamentally be about avoiding 'interpretation', which is to give meaning to a symbol or value that is not directly encoded by that symbol or value. Colin Kemp identifies TFP by its impact on code: the use of conditionals and branch tests. Much of Chapter 1 is dedicated to justifying why TFP is simpler than interpretation, but that hypothesis was by no means proven conclusively.

TFP as described by Colin is largely incompatible with 'representation' used in input/output, even eliding the side-effects of I/O. I can't imagine how, say, Unicode symbol processing or transforming a JPEG to perform edge detection could possibly be performed within TFP. One could use lists of integers and such, but remaining within the spirit of TFP is beyond me. There's a bit of hand-waving towards the end of Chapter 2 about possibly modifying hardware to represent I/O in terms of first-class functions and relationships ('wiring' between devices - maybe some sort of reactive programming?) but I don't expect to see much happen in that vein outside of hardware specification.

Real numbers are skimmed over with a reference to another document. But the challenge of interpreting those numbers was not tackled. For example, not much is offered to aid with, say, units tracking (i.e. kg*m/s is the unit of momentum, kg*m*m/(s*s) is the unit of energy) along with the real numbers, be it via types or dynamically within the functions.

I'm curious how to manage sets and graphs - issues dear to my heart. Sets described by Colin in Chapter 2 allow one to test membership (which is useful for infinite sets), but I can't imagine how one would go about, say, performing an inner join of one set to another. That said, state-of-the-art functional programming also does poorly with sets: even if it makes them first-class to properly handle uniqueness and disorder, this introduces challenges of observational equality and deterministic aggregation - i.e. need to prove associative and commutative and identity properties for aggregators. State-of-art functional programming does even worse with graphs, for which it is difficult to achieve isomorphic identity for artificial 'labels' of points without yet another first-class construct.

I'd also like to see support for termination analysis and a flexible type discipline, so that I can achieve Total Functional Programming alongside Totally Functional Programming. Mr. Kemp does touch on the latter, in Chapter 4.

Ultimately, I don't feel I could readily take Kemp's work and write a totally functional 'core' for my language, and I'm not 'totally' convinced of the argument justifying an effort to do so - not for simplicity or productivity for the programmer, and certainly not for correctness analysis, safety, security, performance, or any other property.

I think a paper proving or very strongly demonstrating the utility of TFP to a variety of purposes, with less effort spent defining it, might be of greater interest to the community.

I stopped reading when it

I stopped reading when it suggested that, to reduce complexity, we should avoid using data and Church encode everything. o O

Aside: Real numbers

I agree with much of what you've said here. I, too, am far from convinced (from reading the first few chapters so far) that this "style" is useful or desirable or even well-defined.

But, on a smaller point, I mentioned a long while back a data type that (when fused with the fold function) might serve well as a "TFP" notion of "real number," at least as much as characteristic functions serve for sets. The discussion for that post is useless, as the topic of the paper was immediately confused for interval arithmetic and then a long and irrelevant thread on interval arithmetic began.

How's "totally functional" relate to "everything's an object"?

Having just skimmed (sorry), I'm reminded of the view of Smalltalk and the early Actors work (and of Joule and E) that everything is an object, i.e., a function of the message it is applied to (or "sent"), and that the semantics of all values are defined only by their reaction to being applied to (sent) a message.

In this sense, Smalltalk, early Actors, Joule, and E are closer to Church's original untyped lambda calculus, in which all values were functions. Later Actors starting as far back as Clinger's thesis, Scheme, and the concurrent logic languages are in this sense closer to the typed lambda calculus, even though they are not statically typed, in that they distinguish between functions and primitive values. This has always seemed like a step backwards to me, in that it prevents the introduction by the programmer of new forms of values which seem as primitive as the built-in values.

Does the totally functional view make a similar criticism, in that such primitive values are defined by how they are acted upon (interpreted) rather than how they act?

Does the totally functional

Does the totally functional view make a similar criticism, in that such primitive values are defined by how they are acted upon (interpreted) rather than how they act?

That seems like a good way to describe the issue.

This has always seemed like a step backwards to me, in that it prevents the introduction by the programmer of new forms of values which seem as primitive as the built-in values.

Two steps forward, one step back...

Many problem-domains - pretty much every problem domain involving I/O with external resources (including HCI) - fundamentally involve pattern-matching and interpretation. This is a rather difficult task to achieve without easily determined observational equality.

In a system where elements are defined by behavior, equality is often a feature the elements can lie about, which makes for interesting puzzles (there are three objects, two of them are equal, one of them always lies about equality, one of them always tells the truth, and the other decides based on the current time...).

One might note that term rewrite systems (like OBJ and Maude) achieve both "how they act, and how they're interpreted", simultaneously: you can easily define new objects and operations that act as primitives. The cost the term rewrite system approach pays: loss of sharp definition - no 'edges' for types or functions that could readily be used for I/O without shipping the whole program.

Behavioral Equality vs Primitive Equality

> In a system where elements are defined by behavior, equality
> is often a feature the elements can lie about, which makes for
> interesting puzzles [...]

Indeed. The capability community has wrestled with precisely this issue in the guise of the Grant Matcher Puzzle. E. Dean Tribble and Jim Larson have each made progress on this puzzle since that web page but I'm not sure what to point at.