Call-by-Name, Call-by-Value and the Lambda Calculus

Gordon Plotkin's Call-by-Name, Call-by-Value and the Lambda Calculus (Theoretical Computer Science, Vol. 1, pp. 125-159, 1975) is available online.

The fundamental point made in the paper should seem natural to most LtU readers: In order to reason about programming language semantics one should look for programming language/calculus pairs.

The paper contrasts CBN and CBV, and shows the differences between the Lambda Calculi appropriate for describing each of them.
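
To make the contrast concrete, here is a minimal sketch (my own Haskell naming, not Plotkin's notation) of weak call-by-name and call-by-value evaluators for closed lambda terms, together with a term on which the two strategies disagree:

    -- Evaluation is weak (we never reduce under a lambda) and we only apply it
    -- to closed terms, so this naive substitution never captures a variable.
    data Term = Var String
              | Lam String Term
              | App Term Term
      deriving Show

    -- substitute t for x (adequate here because t is always closed)
    subst :: String -> Term -> Term -> Term
    subst x t (Var y)   = if x == y then t else Var y
    subst x t (Lam y b) = if x == y then Lam y b else Lam y (subst x t b)
    subst x t (App f a) = App (subst x t f) (subst x t a)

    -- call-by-name: pass the argument unevaluated
    evalCBN :: Term -> Term
    evalCBN (App f a) = case evalCBN f of
      Lam x b -> evalCBN (subst x a b)
      f'      -> App f' a
    evalCBN t = t

    -- call-by-value: evaluate the argument first
    evalCBV :: Term -> Term
    evalCBV (App f a) = case evalCBV f of
      Lam x b -> evalCBV (subst x (evalCBV a) b)
      f'      -> App f' (evalCBV a)
    evalCBV t = t

    -- (\x. \y. y) omega: CBN discards the argument and returns \y. y,
    -- while CBV loops forever trying to evaluate omega first.
    omega, example :: Term
    omega   = App (Lam "x" (App (Var "x") (Var "x"))) (Lam "x" (App (Var "x") (Var "x")))
    example = App (Lam "x" (Lam "y" (Var "y"))) omega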

Now if only...

Now if only Milner's "A Theory of Type Polymorphism in Programming" were made available online, I would be a happy man.

Doing it Without Church Rosser

Dear All,

When reading the above paper my head starts spinning because of the burden of establishing the Church-Rosser property.

I find, for example, Paul Blain Levy's Call-By-Push-Value dissertation from 2001 (*) much more accessible. The evaluation order is fixed, and deductive systems rather than rewriting systems are the means of presentation. You can also have diverging derivations.

But if you think the dissertation is too long a read, I can also recommend Olivier Danvy and John Hatcliff's paper from 1991 (**) titled Thunks (continued).

Best Regards

(*)
http://www.cs.bham.ac.uk/~pbl/papers/thesisqmwphd.pdf

(**)
http://citeseer.ist.psu.edu/viewdoc/download?doi=10.1.1.44.9660&rep=rep1&type=pdf

The first work to break ground

The first work to break ground in a whole new class of proofs (Plotkin's paper being the best example I know of) is likely to be very difficult to read in retrospect, for a couple of good reasons:

  • Perhaps most obviously, when the first proofs in a class are written, people simply don't yet know techniques to make that sort of proof work cleanly. But also,

  • Because the work is breaking new ground, it has to be excessively careful; once a general line of reasoning is established as valid and becomes familiar, researchers can write less meticulous proofs with some confidence about which details require extreme scrutiny and which are not to be sweated over.

Church and Rosser's proof of the Church-Rosser theorem was something of a horror, as I recall, with beautifully simple proofs only becoming available more than three decades later.

Not clear why Church Rosser

It is still not clear to me why, within a certain window of time, normalization theorems were proved again and again. I found a paper that sketches a bit of the history of the lambda calculus:

The Logic of Curry and Church
Jonathan P. Seldin, March, 2008
http://people.uleth.ca/~jonathan.seldin/CCL.pdf

There it says:

The consistency result depended on some results that Church obtained with Rosser and which were published soon after in [Church and Rosser, 1936]. This latter paper proved what is now known as the “Church-Rosser Theorem” or “confluence theorem,” which is now considered basic in term rewriting. (page 6, CCL.pdf)

So maybe the motivation for the normalization results was an old fear of inconsistency. But when we use, for example, a deductive system instead of a rewriting system, we could nowadays use cut elimination to show consistency.

Seldin, on the other hand, gives a more modern motivation for the Church-Rosser property:

Since reduction represents the process of calculation, it is important to have confidence that the order in which calculation steps are performed does not make a difference. (page 52, CCL.pdf)

But we can dispense with it when we have a fixed order in our calculus.

Bye

If one can't get CR-ness,

If one can't get CR-ness, I suppose one might as well decide it's not all that important. But what CR-ness comes down to is that if A can be rewritten as B, then A and B are completely interchangeable in all possible contexts, which is hugely convenient to be able to assume. It's not, as I see it, fundamentally a property of computation; it's a property of reasoning about computation, which explains its historical origin in logical consistency.
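
For a concrete instance of why that interchangeability is safe to assume (a standard textbook-style example, not one taken from this thread or from Plotkin's paper): whichever redex of the term below is contracted first, the two reduction sequences still meet at the same result.

    % two reduction orders for (\x. x x)((\y. y) z), both ending at z z
    \[
    \begin{array}{l}
    (\lambda x.\, x\, x)\,((\lambda y.\, y)\, z)
      \;\longrightarrow\; (\lambda x.\, x\, x)\, z
      \;\longrightarrow\; z\, z \\[4pt]
    (\lambda x.\, x\, x)\,((\lambda y.\, y)\, z)
      \;\longrightarrow\; ((\lambda y.\, y)\, z)\,((\lambda y.\, y)\, z)
      \;\longrightarrow^{*}\; z\, z
    \end{array}
    \]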

For my dissertation I wanted to show that a calculus with fexprs would be fully as well-behaved as the lambda calculus, including CR-ness of course. Along the way I observed that impure variant calculi can be set up to be fully CR too (fexprs themselves, it turns out, aren't even impure). At that point, one might wonder whether it's ever really necessary to give up CR-ness.

Non-Congruence vs CR-ness

It seems that at one point in your thesis you mention that the lack of "compatibility" prevents CR-ness. I guess that in mathematics "compatibility" corresponds to "congruence": so T = S should imply QT = QS, Q meaning the quote function in LISP.

We can also ask for "congruence" of propositions. This would mean that if A <-> B then C(A) <-> C(B). This breaks easily when, for example, C refers to A (respectively B) intensionally, i.e. to its syntactic form:

    C(X): The length of the formula X is greater than 10
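
A small sketch of how such an intensional context behaves (the datatype and names here are mine, purely for illustration): the two propositions below are logically equivalent, yet the length-based context distinguishes them.

    data Prop = Atom String
              | Not Prop
      deriving Show

    -- a crude size measure standing in for "the length of the formula"
    size :: Prop -> Int
    size (Atom _) = 1
    size (Not p)  = 1 + size p

    -- the intensional context  C(X): the length of X is greater than 10
    c :: Prop -> Bool
    c p = size p > 10

    -- a and b are logically equivalent (an even number of negations),
    -- but c a == False while c b == True, so A <-> B does not give C(A) <-> C(B)
    a, b :: Prop
    a = Atom "p"
    b = iterate Not (Atom "p") !! 10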

I once asked for further examples where the above "congruence" breaks, and didn't get many answers. In modal logic, for example, we do have "congruence", because of the necessitation rule and the distribution rule.

I always thought that CR-ness is not related to non-congruence, and that it is rather a problem of not using different equality signs and then not trusting only the derived substitution theorems.

For example, in FOL the object-level congruence for formulas:

   s = t -> (A(s) -> A(t))

can be shown from congruence for predicate and function symbols in the first place:

   s = t, ... -> (P(s, ...) -> P(t, ...))

   s = t, ... -> f(s, ...) = f(t, ...)

But I am happy to dive deeper into your thesis, if I find time for it, to look for more counterexamples to congruence and to read about the CR-ness happy ending!

Bye

So T = S should imply QT = QS

So T = S should imply QT = QS, Q meaning the quote function in LISP.

Correct. Which is, in itself, entirely consistent with nontrivial congruence, compatibility, and CR-ness. The problem (explored by Wand's paper, which really isn't about fexprs, it's about reflection) arises if you expect Q to suppress calculus rewriting of its operand. There's no reason Q should do that: in order for Q to behave correctly in the object language, it must prevent a data-structure operand from being evaluated, but evaluation is an object-language process. Rewriting in the calculus exists to model evaluation, but needn't be in lock step with it. A pure data structure should be irreducible in the calculus, but there may be reducible terms that reduce to data structures. If T and S are reducible, then of course QT and QS should also be reducible.

The key to making that happen is that instead of implicit evaluation, where all data structures are evaluated unless their context says otherwise (which rather obviously ruins congruence), one uses explicit evaluation, where a data structure is to be evaluated only if its context says so. Explicit evaluation works just fine for a calculus to correctly model Lisp with quotation (or fexprs) while maintaining a nontrivial equational theory, compatibility, and CR-ness.
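
For what it's worth, the flavour of explicit evaluation can be sketched in a toy rewrite system (the datatype and step function below are my own simplification, not the calculus from the thesis): reduction is compatible, so it may proceed anywhere, including under Q, while evaluation only happens where an explicit eval node requests it.

    data T = Sym String   -- a symbol: pure data, irreducible
           | Quote T      -- quotation: here just a data constructor
           | Eval T       -- an explicit request to evaluate a term
      deriving Show

    -- one compatible reduction step, if any applies
    step :: T -> Maybe T
    step (Eval (Quote d)) = Just d            -- evaluating a quoted datum yields the datum
    step (Eval t)         = Eval  <$> step t  -- otherwise keep reducing inside Eval
    step (Quote t)        = Quote <$> step t  -- reduction is allowed under Quote: if T
                                              -- reduces to S, then Quote T reduces to Quote S
    step (Sym _)          = Nothing           -- pure data does not reduce

    -- the operand of a Quote may itself be reducible, and reducing it is harmless:
    -- step (Quote (Eval (Quote (Sym "x"))))  ==  Just (Quote (Sym "x"))

In a sketch like this, an implicit-evaluation treatment would instead have Quote block rewriting of its operand, which is exactly the choice that destroys compatibility.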

A good summary?

I'm studying CPS transforms (let's say by following "Compiling with Continuations, Continued" and other works, mostly by Danvy), and every time a paper cites and summarizes Plotkin's work, I learn something new. Compared to modern papers, it lacks an introduction summarizing the theorem statements in the body and their significance, and I haven't yet found a satisfactory one elsewhere.

For instance, this Wikipedia page (which looks like it's written by somebody from the Felleisen school, but what do I know) claims that Plotkin showed that evaluation and reduction strategies are only superficially similar, yet it fails to explain (to me) what the difference is: https://en.wikipedia.org/wiki/Reduction_strategy_(lambda_calculus)

I'd appreciate any enlightenment. I understand I should probably just skim the paper more carefully, but I've been reading other papers first.

That Plotkin paper defines

That Plotkin paper defines evaluation strategy in terms of machine semantics (the SECD machine, with various tweaks considered) and considers reduction strategy directly in terms of the terms themselves. Once you have the former, you can prove the correctness of the reductions used by the latter. I don't think the Wikipedia page is worded well. The Plotkin paper starts with the two as distinct concepts and then shows how variants of each are related. But I'm very possibly missing the point they're making.
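
Concretely (my own toy illustration, not Plotkin's definitions or the SECD machine): a term can be a value, so any evaluator, call-by-name or call-by-value, stops at it, while as a lambda term it still contains a redex under the binder that a reduction strategy is free to contract.

    % the term on the left is a value, so evaluation stops there, yet
    % beta-reduction can still rewrite it under the binder:
    \[
    \lambda x.\,(\lambda y.\, y)\, x
    \qquad\longrightarrow_{\beta}\qquad
    \lambda x.\, x
    \]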

(Not that it's strong evidence of your theory, but I have a hard copy of that paper somewhere that I received from Felleisen as part of some class he was teaching a dozen or so years ago.)

Thanks for the

Thanks for the feedback.
(Apart from the Felleisen citations, I seem to only ever hear of Morris's PhD thesis from Felleisen's school.)