## A lambda calculus with arbitrary set primitives

For various reasons, I am looking for (or looking to develop) a lambda calculus and a semantics that I can use to structure set-theoretic calculations. In other words, I want to be able to write things like (set-filter p A), which returns the subset of A for which p is true, even when A is uncountable. I don't need to write set-filter in this lambda calculus, but I want to include it as a primitive operation (along with membership, cartesian product, etc.), and well-founded sets as primitive values.

My main question: am I going to run into foundational problems, specifically with the ZF axioms? Has someone already tried this and written about it?

I'm not concerned with whether the sets or set operations are computable. I'm not concerned with mathematical foundations, except for the possibility that they might pose difficulties. Having types isn't necessary, and neither is having every lambda denote a set-theoretic function. It doesn't matter whether any given set can be expressed in a finite or even countable number of symbols. I don't need to use only a minimal collection of primitives. (In fact, I'd be nice if I could import them on a whim.)

I want to use the syntax and structure of the lambda calculus, especially lambda abstraction, to express calculations that a particular group of applied mathematicians is interested in. After that, I can consider what subclass of these calculations is computable - but limiting the language to computable calculations right away feels like premature optimization.

It seems easy to do. I'm asking because I found a subtlety in a related lambda calculus + semantics, and there might be more. If someone has found them already, I'd like to know.

(The subtlety, which I half expected, came up when trying to interpret (lambda x in A . e) as a set-theoretic function; i.e. as the set of points in its graph. The simple rule I wrote requires unrestricted comprehension when an enclosing lambda is applied to itself. Denotational semantics provides a solution, but I don't want that kind of complexity. I also decided I didn't need to interpret certain lambda forms as set-theoretic functions, either, if it would make things simpler.)

## Comment viewing options

### David McAllester's Ontic

You might want to look at David McAllester's Ontic.

I worked on Ontic as a grad student in David's group, so I might be able to answer questions about it; but that was 15 years ago, so I've forgotten many of the details...

### Axiom schema of specification

Axiom schema of specification fulfills the 'set filter' role. That is, the resulting set exists in ZF.

That said, you can't ask many interesting questions of the resulting set, such as whether it is identical to the empty set, or for a sample element. I suppose you could ask whether a proposed element is a member (assuming your filter function is decidable).

I'm curious as to your motivation. What complexity does this help you avoid?

### Avoiding complexity

That said, you can't ask many interesting questions of the resulting set, such as whether it is identical to the empty set, or for a sample element.

I should be able to do equality, subset, union, intersection, and even countable (or arbitrary) versions of the latter two as long as I include the right primitives. They won't be computable, but that's not what I'm after.

I'm curious as to your motivation. What complexity does this help you avoid?

I'm glad you brought this up early. I didn't want to say much about the subject area up top, because it risks bringing on a bunch of posts pointing me to the standard papers on the subject, which I have already read.

The topic is measure-theoretic probability. It defines objects that a lambda calculus with only computable primitives could never hope to build, and operations on them. Also, I can't think of any topic that is so closely tied to ZF set theory aside from topology.

So to directly answer your question: I'm not trying to avoid complexity, I'm trying to import the super-computational power necessary to represent routine measure-theoretic calculations.

Another way to get the notational power of lambdas and the super-computational power of arbitrary set operations might be to use lambdas in standard mathematical notation. But lambda terms seem to have no natural semantics (which is why we're always creating them), so it's not clear what the resulting formulas should mean.

### Express vs. Do

I should be able to do equality, subset, union, intersection, and even countable (or arbitrary) versions of the latter two as long as I include the right primitives. They won't be computable, but that's not what I'm after.

Cognitive dissonance just rattles my brain with statements like that. What does it mean to do equality if not to compute or validate it?

### What does it mean to do

What does it mean to do equality if not to compute or validate it?

To express, I suppose. Does "do" generally connote something algorithmic? Can I say that a hypothetical machine that determines whether another halts on all inputs "does" something? Can "do" connote semicomputable or semidecidable? Can it connote any kind of computable approximation?

Usually I use "compute" when describing something algorithmic, and "do" when I'm being sloppy. I might have to revise. I didn't expect a two-letter word to mean something so precise. :)

### Just Do It

"Do" strongly connotes something operational in nature. (Not necessarily algorithmic, either.)

### Term Rewriting

Before you pursue lambda expressions very far, I would (very strongly) suggest studying Term Rewriting systems. CafeOBJ, Maude, BOBJ, and Pure are the four I've studied, though I've only written programs in the first two.

These systems have expressive semantics that seem far more suited to your goals than anything lambdas will offer - i.e. you work at the level of algebraic properties (equivalence, conditional equivalence, behavioral equivalence, associativity, commutativity, identity, in addition to transitions). It is easy to introduce new primitive operations into a model, and to declare their algebraic properties. The languages will also help you perform proofs, and even reuse those proofs to simplify and optimize further modeling.

If you have goals to eventually turn these into programs, you can develop modules especially for that purpose, and the semantics of those modules won't interfere with your set descriptions and semantics descriptions.

### Cool. I'll check those

Cool. I'll check those out.

In the end, I'll want to encode my stuff in Racket and have it compute approximations when computable approximations exist. In other words - like you guessed at the end - I'm looking for an intermediate language between measure theory and an implementation language.

I'm currently looking at lambda calculus because anything written in it can (usually, depending on semantics) be encoded pretty directly. But I haven't yet looked for intermediate languages that are closer to the math. Thanks.

### Markov

Have you taken a look at Markov?

Alternatively, have you considered encoding New Axioms for Rigorous Bayesian Probability in the logic of your choice?

### I have looked at Markov, but

I have looked at Markov, but my Coq-fu is very weak, so I'm afraid most of it reads like token salad. :/ The most I can do is recognize a few lemmas. On the plus side, it's good evidence that the operations I want can be expressed in Coq (at least with Borel sets). Thanks for reminding me.

New Axioms etc. looks like an interesting piece of philosophical work. Cox, Jaynes, and presumably the authors, are mostly interested in showing that probability, and specifically how Bayesian statisticians use it, is a desirable way to reason about uncertainty. (They also tend to aim for uniqueness, but I didn't see that in the paper.) So they come up with lists of desiderata and show that the probability axioms follow.

The major weakness of all the philosophical approaches I've seen is that they don't tend to treat infinities, or they gloss over them. (This seems to do the former; Jaynes does the latter.) It's understandable because they're working from desiderata: it's difficult to come up with a statement that everyone can agree is intuitively true if the statement has to be about something infinite.

As such, any nontrivial proposition about something infinite would have to be encoded as a limit, zero-probability conditions would need to be expressed as limits, and countable additivity would need to be proven as needed. Advantage: all calculations are constructive and approximately computable. Disadvantage: the calculations would be heinously inefficient. Using a different calculation would require moving to a more general theory in which you can prove both give the same answers... and that's measure theory.

(TBH, I can't really tell whether this paper ignores infinites. Their "propositions" seem to come from R, but they only define finite additivity. It seems a little weird, but they use lingo that I don't know, so they may have covered it.)

### Forget Measures...

...the whole point of the Cox-Jaynes-Dupre-Tipler approach is to make probability axiomatic in a way that is strictly more expressive than Kolmogorov's measure-theoretic probability. See What About Quantum Theory? Bayes and the Born Interpretation, where we find:

I show that the group transformation argument yields probabilities in the Bayesian sense, and that in the limit of an inï¬nite number of measurements, the relative frequencies must approach these probabilities. Which is why physicists have been able to get away with assuming, for two generations, that probabilities are relative frequencies.

That is to say, the measure-theoretic approach to probability covers a proper subset of the cases the Cox approach covers.

It might still not be obvious how to get there from here, though, so let me also recommend Dupre and Tipler's earlier paper, The Cox Theorem: Unknowns and Plausible Values, which delves into general additivity more deeply than their later paper.

### How I Learned to Stop Worrying and Love the Paradox

Do you really need to exclude non well-founded terms? If someone writes Russell's Paradox in your language, compiles and runs it, and the program crashes, is that a major problem?

Besides this there are intrinsic reasons for including such [terms], or at least some of them. The notions ruled out as meaningless by certain systems of philosophy [...] are not actually senseless to our intuition. We can understand them as concepts. Even those which lead to contradictions do so only when one ascribes certain properties to them. If one is going to explain the paradoxes, rather than simply run away from them, we need to have [...] reasons why those properties fail.

-- Haskell Curry, Combinatory Logic II

The ability to construct nonsensical expressions may be unappealing in formal logic or set theory, but it has a certain utility for programming languages...

### Well-founded sets

Do you really need to exclude non well-founded terms? If someone writes Russell's Paradox in your language, compiles and runs it, and the program crashes, is that a major problem?

Heh.

It's not just that the program crashes, but that it ceases to mean anything to the people I'm making the language for. Their stuff is already very well defined in terms of well-founded sets. I'm just trying to provide a machine-interpretable formal syntax.

By the way, the unrestricted comprehension I wrote about in the last paragraph of my post turned into an infinite loop when I encoded the reduction of (lambda x in A . e) in Racket as a macro that builds a hashtable. And that was with a finite domain. (And I would need something trickier than laziness to fix it.) So it seems that non-well-foundedness can easily turn into undesirable behavior in implementation, even in the absence of Curry's "certain properties".

I agree with Curry in principle, though. I like my general recursion.

In any case, you've given me a good idea: I can try to encode the well-known paradoxes that arise in naive set theory.

### Coq, Agda, etc.?

It's not just that the program crashes, but that it ceases to mean anything to the people I'm making the language for. Their stuff is already very well defined in terms of well-founded sets. I'm just trying to provide a machine-interpretable formal syntax.

So what's wrong with Coq/Gallina or Agda or something like that?

### In the interest of ecumenicism...

...let's also not forget Isabelle. :-)

### Map theory

You might want to check out this page for references to map theory . Map theory is a proposed foundation to mathematics which allows for the use of (arbitrary) lambda terms as a basic object, with strong connections to set theory. I admit that my knowledge of the theory is quite limited, but I believe that it addresses your questions about paradoxes in a satisfactory manner.

### I've seen a paper on map

I've seen a paper on map theory, and it looked interesting. Thanks for the link.

My main question: am I going to run into foundational problems, specifically with the ZF axioms? Has someone already tried this and written about it?

The answers are: no, and yes.

The reason it makes sense is that the category of sets is cartesian closed, which means that all the notation of the lambda-calculus has a perfectly sensible interpretation in terms of sets. So products can be interpreted as cartesian products of sets, functions as set-theoretic function spaces, and coproducts as disjoint unions. Likewise the type-theoretic syntax of dependent products and sums corresponds naturally to indexed families of sets and indexed sums. Doing mathematics in this way looks just like plain old mathematics, only with the unusual feature that it's actually clear how scoping works and what you're defining. :)

An explicit account of this is given in Paul Taylor's Practical Foundations of Mathematics -- he calls his presentation "Zermelo type theory", and it's essentially a presentation of ordinary ZF set theory as a type-theoretic language.

### Well, not quite.

Well, not quite. Lambda-calculus only interprets the categorical structure of sets, so it only lets you define sets up to canonical natural isomorphism. In particular, it is impossible to define subsets except up to an equivalence class of monomorphisms (which is what a subobject is); you can only take intersections of sets that are subsets of the same set; and you can't specify any ur-elements (individuals), meaning you must effectively code everything as a set, including for example naturals.

And then there is the fact (which the OP says he doesn't care about, but anyway he will) that you can only represent recursively enumerable sets, so in particular complement is restricted, and moreover there is no way to represent uncountable sets; the only infinite set is isomorphic to the naturals.

Not that I'm complaining; personally I don't care much about the non-categorical properties of sets and neither should you.

Of course, this does not rule out adding primitives to deal with the concrete properties of sets.

### Yes...

...that's exactly what Taylor does in his book -- he adds explicit primitives for membership and powersets.
)

### Conservative extension

Do you mean what he does in section 2.8? He does add them as primitives, but they are already definable, so they don't change the theory.

Bool is definable. His powerset(X) is just X -> Bool. And his membership is just application: (\$) = \ x p -> p x : X -> (X -> Bool) -> Bool.

### No...

...I was thinking of the stuff in section 2.2, where he introduces the powerset type. I had simply forgotten that the later section with the higher-order-logic stuff even existed.

### I still don't think that is

I still don't think that is enough. His powerset rules define an adjunction between 2^X and Sub(X), and thus only preserve X up to (canonical) isomorphism. You would not, for example, be able to distinguish between a subset of X which is the natural numbers, or the integers or the rationals, which all have the same cardinality and are consequently isomorphic in Set.

I think you would still need to add some more rules to lock down the elements up to equality.

### Fascinating!

Fascinating!

Also, I didn't expect it to be online. Looks like good lazy Sunday reading material. :D

Unfortunately, if it really doesn't allow uncountable sets, it's a non-starter. It's still fascinating, though.

### Don't try to read the online

Don't try to read the online version; it's a horrible TeX-to-HTML conversion that's missing all the diagrams and most of the equations.

It does allow uncountable sets. You've got natural numbers and powersets, so Bob's your uncle. :)

### Uncountables unavailable.

Unless you treat it as an object language, you simply cannot encode uncountable sets. The reason is that a program in any computationally complete language is encodeable as a string, and therefore a natural number, and natural numbers are countable. At most you can encode types of order type omega, i.e., the same as the set of natural numbers.

Powerset will not help you construct larger sets ultimately because of the reason above; the long explanation involves the fact that intuitionistic ordinals do not behave like ordinals in a classical theory, even without Choice.

In other words, if you need to treat uncountable sets, you need to use a metalanguage for set theory, something like Coq or Isabelle.

### Not encoding

I too want to reason about uncountable sets.

More precisely: I want the meta-theory of my system to have uncountable sets (like the reals, and bigger sets like all real-to-real functions) and THEN to have my object language have connotations for (certain) objects of my meta-theory. In other words, I want object-language terms to have denotations which are 'classical reals'.

Of course I only have access to a countable set of object-language terms, so I cannot reflect the full richness of my meta-theory. But that's OK! A "sufficiently random" real-to-real function can only be distinguished from another such via infinitary statements, even in higher-order logic; so basically we don't care. It is still powerful enough to reason about such things as piecewise functions with symbolic breakpoints.

Such a setting is more than rich enough to do every day mathematics, and yet it does not force you into the fully-constructive straight-jacket which most popular theorem-proving system do. [I know, via double-negation, I can do this in constructive logic too, but then I give up on almost all the automation the systems provide. That is not an acceptable compromise to me.]

### Neat.

That looks kind of interesting. Paging through the paper the trick seems to be to define "pseudo-functions" which are lazy in the sense that they don't depend on their values on parts of the domain which are undefined. (At first I put "lazy" in scare quotes, but after re-reading what I wrote it does indeed seem to be laziness, or at least non-strictness.)

Is there a more general treatment of this technique? For example, I don't think your paper mentions what you said about being invariant under finitary observations.

(And, BTW, just out of curiosity, do you know what the meaning of Blizard's "negative membership" is in your reference 1, mentioned in the para under Definition 10 in section 2.2? Is it similarly used to cancel with positive members for non-strict evaluation? I think I would like to learn more about this.)

### Lazy

You are absolutely correct - these "pseudo functions" are non-strict. I did not use that terminology because the paper was meant for a non-PL community unfamiliar with a lot of PL issues and more comfortable with terminology which seems more mathematical. The piecewise functions of [5] were also non-strict, because that reflects actual mathematical usage.

I don't understand what you are asking in your second paragraph. Which technique in particular, and what do you mean by 'invariant under finitary observations' in this context?

You mean reference [2], because in [1] he deals strictly with multisets (aka non-negative multiplicity). The very first sentence of his paper Negative Membership is

By negative membership we mean the fact of belonging to a collection of objects a negative number of times.

As far as we know, a lot of people have previously worked with generalized sets, but generalized partitions, non-strict pseudo-functions and formal treatment of piecewise-defined functions with 'symbolic' breakpoints are all introduced in our paper for the first time.