What does focusing tell us about language design?

A blog post about Call-By-Push-Value by Rob Simmons: What does focusing tell us about language design?

I think that one of the key observations of focusing/CBPV is that programs are dealing with two different things - data and computation - and that we tend to get the most tripped up when we confuse the two.

  • Data is classified by data types (a.k.a. positive types). Data is defined by how it is constructed, and the way you use data is by pattern-matching against it.
  • Computation is classified by computation types (a.k.a. negative types). Computations are defined by their eliminations - that is, by how they respond to signals/messages/pokes/arguments.
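
To make the split concrete, here is a small Haskell sketch (my own example, not CBPV syntax): data is defined by how it is built and is consumed by pattern matching, while a computation such as a function has no constructors at all and is characterised only by how it responds when you poke it with an argument.

    -- Data: a closed set of constructors, used by pattern matching.
    data Colour = Red | Green | Blue

    describe :: Colour -> String
    describe Red   = "warm"
    describe Green = "fresh"
    describe Blue  = "cool"

    -- Computation: no constructors to match on; defined entirely by its
    -- response to being applied to an argument.
    double :: Int -> Int
    double n = n + n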

There are two things I want to talk about, and they're both recursive types: call-by-push-value has positive recursive types (which have the feel of inductive types and/or algebras and/or what we're used to as datatypes in functional languages) and negative recursive types (which have the feel of recursive, lazy records and/or "codata" whatever that is and/or coalgebras and/or what William Cook calls objects). Both positive and negative recursive types are treated by Paul Blain Levy in his thesis (section 5.3.2) and in the Call-By-Push-Value book (section 4.2.2).
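
As a rough illustration of the two flavours (a Haskell sketch of my own, not Levy's notation): the positive recursive type is an ordinary inductive datatype, while the negative recursive type is a recursive, lazy record whose meaning is given by the observations it answers.

    -- Positive: finite lists, built from constructors and taken apart
    -- by pattern matching.
    data List a = Nil | Cons a (List a)

    -- Negative: a stream presented as a recursive, lazy record of
    -- observations ("codata"), in the spirit of Cook-style objects.
    data Stream a = Stream
      { shead :: a
      , stail :: Stream a
      }

    -- Defined by how it responds to the head/tail "messages".
    ones :: Stream Int
    ones = Stream { shead = 1, stail = ones }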

In particular, I want to claim that Call-By-Push-Value and focusing suggest two fundamental features that should be present in modern programming languages, but generally aren't (at least not simultaneously):

  • Support for structured data with rich case analysis facilities (up to and beyond what are called views)
  • Support for recursive records and negative recursive types.

Previously on Rob's blog: Embracing and extending the Levy language; on LtU: Call by push-value, Levy: a Toy Call-by-Push-Value Language.

And let me also repeat CBPV's slogan, which is one of the finest in PL advocacy: Once the fine structure has been exposed, why ignore it?

Very interesting indeed.

You beat me to it, I was considering posting this on LtU.

I have the (unoriginal) intuition that the study of polarisation/focusing will bring a much better understanding of evaluation order and possibly other parts of programming language semantics, which should have an important impact on "mainstream academic programming languages" in the future. CBPV is probably not the last step; I am aware of Dan Licata and Robert Harper's work on binding, Noam Zeilberger's and Guillaume Munch-Maccagnoni's work at maybe a deeper level, and our own neelk's work on focusing and pattern matching. Andrej Bauer's work on algebraic-effect languages also seems related (in the separation of computations and values, at least).

I would be very interested in a survey of the area for non-experts, an analogue of François Pottier's linear bestiary. There are a lot of different things going on, and it is difficult, without knowledge of the technical details, to understand how to evaluate and compare those different approaches.

Misnomer?

Recently I've been discussing CBPV with colleagues. After I explained CBPV to them, they reported that the name Call-by-Push-Value was misleading: it gave them the impression that the semantics is somehow tied to some kind of stack-machine semantics.

It's true that Paul Levy gives a stack-machine semantics to CBPV. But so did Gordon Plotkin for the lambda-v calculus. I am not at all sure that CBPV has more to do with stacks than Call-by-Value does.

CBPV does have a very well-behaved equational theory. The Focusing school of thought finds the distinction between positive and negative types appealing. Effect handlers are cleanly formulated in CBPV. I've found the separation of computation suspension (thunks) from variable abstraction (lambdas) particularly clean for equational reasoning about effects. Some colleagues found being closer to A-normal form particularly appealing.
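
For what it's worth, here is a hedged Haskell sketch (my own encoding, not Levy's calculus) of that thunk/lambda separation: computations are modelled in some monad m, and a thunk is an explicit suspension that has to be forced before it can be applied.

    -- U B: a suspended computation, kept separate from lambda abstraction.
    newtype Thunk b = Thunk { force :: b }

    -- CBV-style application written in a CBPV style: run the function term,
    -- run the argument, then force the resulting thunk and apply it.
    appCBV :: Monad m => m (Thunk (a -> m b)) -> m a -> m b
    appCBV mf mx = do
      f <- mf
      x <- mx
      force f x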

I am sure science will overcome marketing and CBPV will eventually be recognised for what it is. If I could rename it, I would probably call it the "effectful lambda calculus".

Data and computation are not distinct

Or phrased less provocatively and more truthfully, you can certainly work in a setting where they are, but I don't think it's desirable. All values, data included, can be defined by their eliminations. Data has an induction principle and codata has a co-induction principle. But there's no reason why some values can't have both (e.g. finite lists can be viewed as both data and codata).
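
One way to read the finite-list remark, as a Haskell sketch of my own (taking "defined by its eliminations" in the Church-encoding sense): the same lists can be presented by their constructors or, equivalently, by their fold.

    {-# LANGUAGE RankNTypes #-}

    -- The constructor view ("data").
    data List a = Nil | Cons a (List a)

    -- The eliminator view: a list is whatever can answer a fold request.
    newtype FoldList a = FoldList
      { fold :: forall r. r -> (a -> r -> r) -> r }

    toFold :: List a -> FoldList a
    toFold Nil         = FoldList (\n _ -> n)
    toFold (Cons x xs) = FoldList (\n c -> c x (fold (toFold xs) n c))

    fromFold :: FoldList a -> List a
    fromFold l = fold l Nil Cons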

I'm also recently of the opinion that a particular calculus for semantics of a programming language (e.g. CBPV) isn't a very good idea, since that amounts to specifying a particular evaluation order too early. Lately I've been happy with letting types define values rather than using them to describe values that are defined by terms in some calculus. In this approach, strict functions are logically the simplest: if the argument is of the required type, then the function output is of the desired type. Lazy semantics are more complicated, and this is why I think strictness is the right default.

Now where were we? Oh yeah: the important thing was I had an onion on my belt, which was the style at the time. They didn't have white onions because of the war. The only thing you could get was those big yellow ones...

Similar Doubt

I'm not deeply involved in programming languages, but from a simple mathematical-logic angle I had the same doubt about negative polarity implying co-induction.

First, how do we arrive at a notion of "type-" at all? The initial motivation was negative formulas, I guess, and thus inference rules of the following form:

    G, [A], [B] |- C
    ----------------- f2
    G, A & B |- C 

If a connective (here &) lands before the |- sign in the course of a proof, then it had a negative usage (right?). But the inference rule is logically equivalent to the following pair of inference rules (plus cut):

    G |- A & B      G |- A & B
    ---------- p1   ---------- p2
    G |- A          G |- B

The latter two inference rules look to me like data-type destructors (*), for example as used in pattern matching. So I guess data types already have a negative side.

But maybe the author had something in mind with polarity inside induction/recursion. So perhaps there is a stronger link between "type-" and polarity.

Bye

(*)
Proofs-as-types: p1 ~ f2 (\x.\y.x), p2 ~ f2 (\x.\y.y), f2 ~ \h.\z.(h (p1 z) (p2 z)).

Origins of Focusing?

Dear All,

Just had a look at the paper:

Focusing on Binding and Computation
Daniel R. Licata, Noam Zeilberger, Robert Harper
Carnegie Mellon University


They don't have the f2 rule and no & connective. But, similarly to linear logic, they have positive and negative connectives, so I guess my & would split up into the (x) and the /\ of their paper. The /\ is a kind of ordered multiplicative disjunction (sic!) in linear-logic terms when looking at a single-sided sequent, which corresponds to the & on the left-hand side of a two-sided sequent.

The /\ is ordered, since we have roughly the following inference rule:

    G, A |- B > L
    ---------------
    G |- A /\ B > L

So there is a preference to look first at the right-hand side of the connective /\. (Right?)

Where did I see this already? Well, in the context of classical logic there is a smart SAT prover by John McCarthy (recently via Marco Piedra on G+):

The Wang Algorithm for the Propositional Calculus Programmed in LISP (*)
J. McCarthy
Massachusetts Institute of Technology, Cambridge, MA, USA, 1959


If one follows the LISP code, one sees that it is focused in the sense that when it has found (and A B) it pushes A (!) and then B onto the top of the left list, so that it will solely try B next. (Right?)

Otherwise, the paper "Focusing on Binding and Computation" looks dense and is probably quite a trove, especially since they offer (new?) proof objects, although not yet dependent-type based.

Bye

(*)
http://www.softwarepreservation.org/projects/LISP/book/LISP%201.5%20Programmers%20Manual.pdf
Page 44 ff

I think what you're reading

I think what you're reading as a conjunction in their paper (\curlywedge) is more of a binding or nominal operator - it's not going to be helpful or accurate to read it as akin to logical conjunction.

What is /\

Hi,

They call it "representational conjunction" (page 4, top right). The authors say it behaves differently from ordinary conjunction, and they refer to section 2.5, where we find the "shocking equalities".

But these equalities need not destroy the functioning of the connective as a conjunction (not a classical one, rather a minimal or possibly linear one). But I don't find a start rule inside the constructor/destructor group, only stop rules, basically:

   ----------
   G |- A > A

But a start rule is found in the focusing-rules group; basically we have, in the lower left half of the group:

  G |- A > L
  ----------   
  G, A |- L

The latter is very much the same as the (decide) rule of Dale Miller (I am referring to the system shown in the SEP article). But Dale Miller's & is an additive disjunction and not a multiplicative disjunction, which is what I still believe /\ can be identified with.

But I am referring to the single-sided calculus of linear logic. In the two-sided calculus of linear logic it could also be (x) on the left-hand side of the sequent bar, or par (an upside-down &) on the right-hand side of the sequent bar. But (x) is also multiplicative.

I am not sure whether they have some element in their calculus that would make it classical or intuitionistic. I don't see an absurd element, for example.

But they have fix (fixpoints, right?) and val (function-definition lookup, right?). So you are probably right: the aim is not really a logical system, but more something in the domain of functional programming. But maybe both are possible: logical reasoning (i.e. propositions) and functional reasoning (i.e. recursive functions or some such).

Bye

P.S.: I am not sure whether Schroeder-Heister would approve of the notation for function-definition lookup (val). One often finds something written along the lines of:

   forall (D => P)         G, D |- C
  ----------------------------------
            G, P |- C

But logically it would be that (the arrow goes in the other direction):

    P -> D

would license:

    G, D |- C
    ---------
    G, P |- C

In the present calculus this confusion has been taken to new heights, since it is no longer written forall(D => P) but, worse, something along the lines of forall(D |- P). So there is an implicit completion step in the calculus, embodied by the "forall". Is definitional reflection precise enough in the calculus?

P.P.S.: A clever forall(D |- P) could also be useful for logical reasoning, i.e. findall-free looping. Just an idea I had recently; I am not yet sure how to do it. I am also not sure how val could be implemented without some absurd element!

Total semantics

If you stick to total functions, then there is no semantic distinction between strict and lazy evaluation. This would be my preference. We can always use monads or similar to model effects, and push a model of time to the edges if we wish to model incremental computations.

Total pain

I agree with your first sentence, but I think we disagree about how to stick to total functions. Modeling nontermination with monads or similar is a boondoggle. My preferred viewpoint: non-termination isn't an effect -- it's an error. One nice thing about defining values via types is that you can take a recursive definition which fails to terminate as an empty type. That is, a recursive definition should be viewed as a specification for a value which you are asserting to exist, and a language that requires you to supply proof of this assertion can be fairly described as a bondage and discipline language.

Getting away from calculi actually enables this point of view. With a calculus defining your semantics, the non-terminating expression is sitting there in your semantics unless you restrict yourself to a bondage and discipline calculus. Neither alternative is particularly attractive, IMO.

non-termination isn't an

non-termination isn't an effect -- it's an error

I think that depends on what you're modeling. I like termination up to any given instant in time. But if World of Warcraft terminated on me, I'd be a little upset.

I think you misread me. Where I said, "use monads or similar to model effects, and push a model of time to the edges if we wish to model incremental computations," I meant that as two distinct techniques. Monads, after all, aren't necessarily at the edges.

One nice thing about defining values via types is that you can take a recursive definition which fails to terminate as an empty type.

Interesting, but I'm not entirely clear on your meaning. Let's take the typical list type:

type List = λa.μL.(nil|cons(a,L))

Now this isn't a type you can guarantee would terminate, but it is occupied by many finite values. Would you allow this type? Or would you need to somehow annotate a list with its length?

What do you gain by shifting computation into the type domain? I suppose if a type is occupied by more than one value, you could get a sort of implicit constraint programming, but whether you can leverage that would depend on your `type calculus`... which seems to defeat your purpose.

the non-terminating expression is sitting there in your semantics unless you restrict to a bondage and discipline calculus

I really hate semantics that let me express something then tell me "No, Dave. What are you doing, Dave? You cannot do that, Dave."

I much prefer a physical or syntactic realization of my constraints. It lets me think about the problems and solve them more concretely.

Maybe I did misread you. I

Maybe I did misread you. I thought you intended monads to model non-termination.

Of course I allow lists. A better example of what I mean is:

fact 0 = 1
fact n = n * fact (n-1)

I'm saying that we can view such declarations as defining a dependent type for fact. The type of fact (-1) is not provably inhabited. You're right that this is sort of constraint programming, but where the constraints have simple local solutions (no searching).
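
A hedged illustration in plain Haskell (reading the equations literally as a program, which is my framing rather than Matt's dependent-type reading): fact 3 computes to 6, while fact (-1) never produces a result, which is the case read above as "not provably inhabited" rather than as a value _|_.

    fact :: Integer -> Integer
    fact 0 = 1
    fact n = n * fact (n - 1)

    -- fact 3     == 6
    -- fact (-1)  -- recurses forever: no result is ever produced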

What do you gain by

What do you gain by interpreting this as a dependent type? What's the difference between this and just reasoning with ordinary values in an appropriate formalism? Haven't you just trivially lifted everything to the type level?

What's the appropriate formalism?

What ordinary value does fact have? What is the value of (fact -1) above? _|_? Is that an Integer?

fact -1 is a computation

fact -1 is a computation that doesn't terminate. You can call such a computation _|_ if you want. What's the advantage of saying "fact -1 is not inhabited" instead?

I'm not sure whether this answers your question, but ...

First, I'm not saying fact is a type. I'm saying it's defined as being an element of a certain type.

Second, you didn't answer my question about whether _|_ is an integer. A bottom value causes all kinds of problems like not having real products or sums. My approach is closer to what a type theory like Coq does (where all functions are total). Compared to a calculus based approach, it's like having type directed evaluation order. Whether you can be strict or need to be lazy depends on the type.

I'm not sure whether this addresses your question or not.

I can answer the question

I can answer the question whether _|_ is an integer with the same answer that you give the question whether fact -1 is an integer. My confusion is about what this gives you beyond a change of terminology. Perhaps you can give an example that shows the difference beyond terminology.

Either you say that fact -1 is an integer, in which case you allow it to be passed to a position that expects an integer and you have all the problems of bottom; or you say that fact -1 is not an integer, and then you either need to prove that your expressions terminate or you ignore the problem, as in strict programming languages like ML.

Example

I find the approach simplifying compared to adjoining _|_ et al. to the universe of values. But in terms of what features this gets you in your language, maybe something like this:

We can define the type of a conditional triple as a 3-tuple of type (Bool, T, T) where the first slot contains a Bool (not _|_), and one of the second or third slots contains a T, depending on the predicate in the first slot. In a lazy language you'd have to worry about keeping _|_ out of the first slot, and in a strict language you couldn't do the latter part. With CBPV you can probably encode this particular example easily, but directly defining the semantics seems cleaner to me than encoding it in a calculus.

The initial example I gave of treating a recursive definition like 'fact' as a specification, rather than as a correct-by-construction inductive definition, was intended to contrast more with the approach of Coq or type theory than with a CBV or CBN calculus.

So you're saying that you

So you're saying that you want to specify per type whether it contains bottom or not? So it's like an explicit Lazy[T] type?

I'm saying I want values to

I'm saying I want values to denote elements of the logical theories they satisfy and not expressions in a term calculus. A side effect is that this makes allowable evaluation order issues implicit.

Evaluation order and CBPV

I'm also recently of the opinion that a particular calculus for semantics of a programming language (e.g. CBPV) isn't a very good idea, since that amounts to specifying a particular evaluation order too early.

It's interesting that you say that. The original goal behind CBPV was to find the semantic primitives behind both Call-by-Value and Call-by-Name.

More technically: Paul Levy gives two translations, one from CBV into CBPV and another from CBN into CBPV. The thesis and book substantiate the aforesaid claim by showing that many (all?) known semantics of CBN and CBV are preserved by these translations. Paul considered basic control flow, recursion, big-step operational and CK-machine semantics, side effects (printing, global store, exceptions, control effects, local state), pointer game semantics, and categorical semantics.
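
For concreteness, this is roughly how the two translations act on function types (my paraphrase of Levy's standard presentation; the ^v and ^n superscripts are just notation here for the CBV and CBN translations):

    (A -> B)^v  =  U (A^v -> F B^v)   -- CBV: a function value is a thunk of a
                                      --      computation from values to returners
    (A -> B)^n  =  U A^n -> B^n       -- CBN: arguments are passed as thunks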

So, if you don't want to commit to evaluation order but still want to rigorously analyse a complete language, CBPV (or one of its extensions) might be the language for you.

I should have said "even CPBV"

I'm aware of some of the results you cite for CBPV. When you say, "if you don't want to commit to evaluation order," you're referring to a language designer not committing to a choice for the language, whereas I was referring to a programmer not committing at specific call sites.

I don't understand.

Do you want your language to not define evaluation order? Why?

Type directed evaluation order

Or "type compatible evaluation order" (even for a lazy language like Haskell, GHC does strictness analysis to decide when it can use a different evaluation order so long as it keeps the same result).

Let me work through another example. Let's say you have this function:

foo : (Int, Int) -> (Int, Int)
foo p = (0, 1 + fst p)

I'm saying that I think we usually want the theory of this foo to say that, if you have a pair p of Ints, then (foo p) is a pair of Ints satisfying the equation. This allows a strict implementation of foo. If someone comes along and writes:

bar : (Int, Int)
bar = foo bar

It doesn't work. You can't establish the type (Int,Int) for that bar (and an implementation is free to diverge). If you actually want to allow foo to support this kind of craziness, then you'd maybe write something like this:

foo : Any -> (Int, Any) 
foo : (Int, Any) -> (Int, Int) -- made up syntax -- alternate signature
foo p = (0, 1 + fst p)

Note that I'm talking about semantics, not syntax. There could be a syntax that lets you annotate parameters or functions as "lazy" and then infers weakest preconditions for those. I'm also in favor of equational-style constraints on the parameters rather than just type-style constraints. But the point is that the function is defined by its theory, and in selecting an implementation the compiler just looks at that theory to find a suitable evaluation order.

Purity

I note that Matt is assuming a pure language, in which case the only semantic distinction between strict and lazy is where the bottom is showing.

IIRC, Levy's CBPV was developed in the context of effectful systems. In such cases, evaluation order has a much larger impact on observable state and hence on program behavior, so being explicit about it is more important.