New paper: Theory of Programs

A simple mathematical framework, based on elementary set theory, to cover programming concepts.


-- Bertrand Meyer


Active link


From the link:

"One of the principal ideas is that a program is simply the description of a mathematical relation. The program text is a rendering of that relation. As a consequence, one may construct programming languages simply as notations to express certain kinds of mathematics. This approach is the reverse of the usual one, where the program text and its programming languages are the starting point and the center of attention: theoreticians develop techniques to relate them to mathematical concepts. It is more effective to start from the mathematics (“unparsing” rather than parsing)."
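The "program as relation" idea is easy to make concrete. Here is a toy sketch of my own (not from the paper): programs modeled as sets of before/after state pairs over a finite state space, with sequential composition as relational composition and refinement as plain subset.

```python
# Toy sketch (mine, not the paper's formalism): a "program" is a
# binary relation on a finite set of states, here small integers.
STATES = range(10)

# A deterministic program: increment (partial where it would overflow).
incr = {(s, s + 1) for s in STATES if s + 1 in STATES}

# A nondeterministic specification: "end in any strictly larger state".
grow = {(s, t) for s in STATES for t in STATES if t > s}

def compose(r1, r2):
    """Sequential composition: run r1, then r2."""
    return {(a, c) for (a, b) in r1 for (b2, c) in r2 if b == b2}

# Refinement is just subset: every behavior of incr is allowed by grow.
print(incr <= grow)                    # True
print((0, 2) in compose(incr, incr))   # True: 0 -> 1 -> 2
```

On this view a programming language is just notation for writing such relations down, which is the "unparsing" direction the quote describes.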


From a cursory examination, this seems to bear some similarities to Hoare and He's "Unifying Theories of Programming", although it sticks a little more closely to standard set theory notation than UTP does.

Specification vs. Program

We already saw that the first attempt, stating that specifications are abstract and programs concrete, does not make the cut, since “level of abstraction” is a relative notion (the example was an assignment instruction, abstract for some and concrete for others). A seemingly more promising intuition is that programs are executable while specifications are purely descriptive. But that is also not right, even if we ignore the case frequently made for “executable specification” formalisms and stick to more traditional forms of the concepts. “Executable” cannot mean “directly appropriate for execution on a computer”, since in that case the notion would depend on hardware details. It has to mean “expressible in a programming language”. A staple example is that Result^2 = input is a specification whereas a particular square root computation, using for example Newton’s algorithm, is a program. But such examples also fail, since there are many programming languages today in which you can just write Result^2 = input and let the compiler figure out the implementation.

I didn't find this argument very compelling. 'Executable' is exactly what distinguishes a program from mere specification the way I use the terms. In some languages, x^2=y defines an effective procedure for computing x, and in other languages it doesn't. In the former, it's a program. In the latter, it's a specification.
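The square-root example is easy to spell out. Here is a sketch of my own contrasting the two artifacts: the specification as a predicate that merely checks a result, and a program (Newton's algorithm) that computes one satisfying it; tolerances are my choice, since the equation is exact only over the reals.

```python
def spec_holds(inp, result, eps=1e-9):
    """Specification: Result^2 = input (up to floating-point tolerance)."""
    return abs(result * result - inp) < eps

def newton_sqrt(inp, eps=1e-12):
    """A program satisfying the spec, via Newton's algorithm."""
    x = inp if inp > 1 else 1.0
    while abs(x * x - inp) > eps:
        x = (x + inp / x) / 2
    return x

print(spec_holds(2.0, newton_sqrt(2.0)))  # True
```

In a language with a constraint solver, writing `spec_holds` alone would suffice and the distinction collapses, which is the point both the paper and the comment above are circling.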

It was an interesting read, though. Thanks for posting.

Rant on axiomatic methods

I haven't finished reading the paper, but the mathematics indeed seemed sound and much more elementary than in other works. I'm not sure it's more elementary than Hoare logic; if anything, it's more standard. In any case, I'd like to understand what the simplifying insight is. I'm not an expert, but "other stuff was just unnecessarily complicated" sounds like an unlikely explanation. "Other stuff also achieved X, Y and Z, but we consciously give them up because $reasons" would be more credible.

I've also read the discussion on axiomatic methods in Sec. 6.1, and it doesn't seem accurate. Let me quote:

[Talking about Hoare logic using axioms] admirable work of Hoare and colleagues [7][8]. A notable property of these efforts is that they postulate their laws [...] The justification for this method — postulate your ideal laws, the model will follow — is that it has, in Russell’s words cited in [9], “the advantages of theft over honest toil”.

That quote from Russell seems too strong: right after admiring Hoare's work, it compares it to theft.
First of all, this section seems to miss that people have proved Hoare logic sound with respect to other semantics.
More generally, a large body of literature defines axioms and proposes different models for those axioms. The axioms serve as an interface to the different models. Similarly, you only ever run actual implementations, yet interfaces and abstract methods still have a role.

However, sometimes interfaces make a system harder to understand. Removing indirection can make a system easier to understand, and maybe that's what this paper is doing as well.
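The interface analogy can be made concrete with a sketch (the names and the monoid example are mine): the axioms play the role of an abstract base class, distinct models are implementations satisfying it, and clients reason only from the laws.

```python
from abc import ABC, abstractmethod

class Monoid(ABC):
    """'Axioms' as an interface: implementations must supply an identity
    and an associative operation; clients rely only on these laws."""
    @abstractmethod
    def unit(self): ...
    @abstractmethod
    def op(self, a, b): ...

class IntSum(Monoid):        # one model of the axioms
    def unit(self): return 0
    def op(self, a, b): return a + b

class Concat(Monoid):        # a second, quite different model
    def unit(self): return ""
    def op(self, a, b): return a + b

def fold(m, xs):
    """Written once against the axioms; works in every model."""
    acc = m.unit()
    for x in xs:
        acc = m.op(acc, x)
    return acc

print(fold(IntSum(), [1, 2, 3]))    # 6
print(fold(Concat(), ["a", "b"]))   # ab
```

Postulating laws and later exhibiting a model is the standard move here, not theft; the soundness proofs mentioned above play the role of checking that each implementation really satisfies the interface.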

Why so simple?

I think the main reason is that this formalization doesn't really capture any notion of computation. It provides a framework for building specifications as abstract relations on sets. It doesn't handle concepts of objects, first class functions, allocation, etc. but instead just provides a framework for them to be filled in.

No notion of computation?

1. My main worry right now is that I don't yet understand how this theory handles infinite loops, and loops without a termination proof.
2. The second worry is that I can't see a concrete evaluation. If this is aimed at programmers wanting to write correct programs, can we see one example of that?

1. In more detail, I'm not sure what this theory does for non-terminating loops: do they get a real meaning? Sec. 2.6 briefly discusses termination proofs, but we know very well that termination checking is hard. Non-terminating programs are not feasible (in the paper's sense); can we still build bigger programs out of them? I'm not sure what happens then. Can we do anything with a program without proving that all its loops terminate?

2. I'm missing an evaluation criterion, but maybe for some goal (an extension of) this theory is sufficient and simpler, even if it lacks something. Since the paper lacks even artificial examples, it's hard to compare this theory to more common ones: one needs to look both at the theory and at its application, and if the latter is much more tedious, the gain is questionable.

* I find "To capture any notion of computation" vague, but if you mean "define a Turing-complete language", you might be right because of point 1 above. Yet, what it does define might suffice.

* It doesn't handle objects, but it does get to control constructs in a WHILE language while seeming simpler than some other semantics for that. It seems adding state would also be possible by fixing a state space.

* It seems you could turn this into a denotational semantics, if you turned Sec. 2's "Notation"s from metalanguage sugar into syntax (the same way you turn shallowly embedded DSLs into deeply embedded ones). You'd just need to discuss syntax, which is not part of high-school math.

* Also, I'm not sure this is simpler than small-step semantics: it seems it just relies on concepts that happen to be taught earlier since this doesn't use syntax and derivations.
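The point about fixing a state space can be sketched in the relational style (my own sketch, not the paper's construction): states are variable tuples, assignment and guards are relations, and a while loop is approximated by iterating guard-then-body, a bounded stand-in for the loop's least fixpoint.

```python
# Sketch (mine): a WHILE-language fragment over a fixed state space.
# States are (x, y) pairs over a small range.
from itertools import product

STATES = list(product(range(8), repeat=2))

def assign_x_plus_1():
    """x := x + 1, as a relation (partial where x + 1 overflows)."""
    return {((x, y), (x + 1, y)) for (x, y) in STATES if x + 1 < 8}

def seq(r1, r2):
    """Sequential composition of two relations."""
    return {(a, c) for (a, b) in r1 for (b2, c) in r2 if b == b2}

def guard(p):
    """A condition, as a subset of the identity relation."""
    return {(s, s) for s in STATES if p(s)}

def while_loop(p, body, fuel=20):
    """Bounded approximation of the loop's least fixpoint:
    collect all runs of (guard; body)^n followed by the exit guard."""
    step = seq(guard(p), body)
    done = guard(lambda s: not p(s))
    result, reach = set(done), guard(lambda s: True)
    for _ in range(fuel):
        reach = seq(reach, step)
        result |= seq(reach, done)
    return result

# while x < 3 do x := x + 1
loop = while_loop(lambda s: s[0] < 3, assign_x_plus_1())
print(((0, 5), (3, 5)) in loop)  # True: three iterations, y untouched
```

States where the loop never exits simply get no output pair, which is one way (mine, not necessarily the paper's) to read non-termination relationally, and it connects to worry 1 above.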

To me, it looks roughly

To me, it looks roughly analogous to deciding to program in first order logic (or even second order logic), which I've investigated in the past. It's a very simple way of building specifications for programs, but it doesn't have a built-in theory of constructions to go alongside the theory of specifications.

I don't mean Turing completeness. I mean a way of reasoning about whether a specification has an implementation.

Non-terminating loops generally have two different pathological behaviors when interpreted as specifications. They can be underspecified, like a function definition f(x) = f(x). As a specification, this tells you nothing about f. Or they can be unsolvable, e.g. f(x) = f(x)+1.
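On a finite domain the distinction is checkable by brute force. A sketch of my own: enumerate every function on a three-element set and count which ones satisfy each "loop" equation.

```python
# Sketch (mine): count the solutions of each pathological equation
# over all functions f : DOM -> DOM on a tiny finite domain.
from itertools import product

DOM = [0, 1, 2]
all_functions = [dict(zip(DOM, values))
                 for values in product(DOM, repeat=len(DOM))]

# f(x) = f(x): underspecified -- every function is a solution.
under = [f for f in all_functions if all(f[x] == f[x] for x in DOM)]

# f(x) = f(x) + 1: unsolvable -- no function is a solution.
unsol = [f for f in all_functions if all(f[x] == f[x] + 1 for x in DOM)]

print(len(all_functions), len(under), len(unsol))  # 27 27 0
```

The first equation admits all 27 functions (it constrains nothing), the second admits none, which is exactly the underspecified/unsolvable split described above.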