## Total Self Compiler via Superstitious Logics

As we have another thread on avoiding the practical consequences of Godel incompleteness theorem, which seems like a worthwhile goal, I thought I'd post this fun little construction.

To what extent can we implement a provably correct compiler for our language? If we start with axioms of the machine (physical or virtual), can we prove that a compiler correctly implements itself on this target machine? There's a simple construction that works for all sorts of systems that effectively allows you to prove self-consistency without running afoul of Godel's theorem. Here's the idea:

Let's start with your favorite system (set theory, type theory) to which we want to add self-evaluation (to, among other things, prove self-consistency). As a first step, let's add a big set/type on top that's large enough to model that system. For example, in ZFC set theory, you can add an inaccessible cardinal. You're just looking at the original system from the outside and saying "ok, take all of the stuff in that system, and add the ability to quantify over that, getting a new system."

Now this new system that you end up with can model and prove the consistency of the original system, but still cannot prove the consistency of itself. This leads to a simple "practical self-compilation": perform this jump to bigger and bigger systems 2^64 times and start with a version of the compiler at level N=2^64. Whenever you want to do a new compiler, you prove the consistency of the axioms at level N-1. Each version of the compiler has N as part of its axioms and can prove the consistency of "itself" (but for versions with smaller N). No compiler ever proves its own consistency, but, as a practical matter, if the only thing you use these levels compile new versions of the compiler, then you won't ever run out of levels.

But there is an inelegance in having to keep track of levels in such a scheme. The number of levels itself clearly isn't important, and even if it's only a version number that's changing, we clearly haven't really created a "self" compiler. Do we really need that number? What's important is just that the scheme prevents infinite descent of the levels. So here's the proposed fix:

Step 1. Start with some finite number N (leave this N a parameter of the construction; later we'll notice that we can pretend it's infinity) of these inaccessible sets/nested universes and index them so that the biggest one is index 0, the next smallest contained inside it is index 1, etc. Index N is the smallest universe that's just big enough to serve as a model for the original theory.

Step 2. We write a semantic eval function that maps terms in the language into our biggest universe and use it to establish the soundness of the logic/type system. Terms reference 'universe 0' gets mapped to universe 1, 'universe 1' gets mapped to universe 2, etc. Our model of the term for 'N' is N-1 in the model. This step works just like the "first practical solution" above, except now we're counting down and we don't use actual numbers for N in our logic.

Step 3. How do we establish that the new 'N' (N-1) is positive? We make our compiler superstitious as follows: we pick an unused no-op of the target machine that should never occur in normal generated code and add an axiom that says that running this no-op actually bumps a counter that exists in the ether, and, if this unseen counter exceeds N, then it halts execution of the machine. Since we could have started with an arbitrary N, we will never reach a contradiction if we take the bottom out of the level descent, so long as we don't let the logic know we took the bottom out.

Then we can have an inference rule in the logic that allows us to conclude that N > M for any particular M:Nat, without any way to infer that (forall M:Nat. N > M), which would lead to contradiction. The implementation of this inference rule invokes the magic no-op until it's confident that enough such levels exist. Rather than a direct statement that the compiler works, we will have a clause that says "if there are enough levels, then the compiler terminates and works." Any human reading this clause can safely this clause (we know there are always sufficiently many levels), and just as importantly, we can instruct the compiler to allow execution of functions that terminate if there are enough levels and that will make things total. i.e. We can have a total self-compiler.

I have a logic that I'm working with that I'd like to try building this scheme with, but I have quite a bit of work left on the machinery I'd need to complete it. Please let me know if you know of someone who does something similar.

Apologies for the long post. I've argued here that this is possible, but with fewer details. Let me know if I need more details. Hopefully someone finds this interesting.

## Comment viewing options

Due to overwhelming community interest, I have decided to follow up with a note about self interpreters, since a frequently stated result is that a total language cannot implement its own interpreter. Obviously I don't dispute the formal result, but I do think some of the ways it is informally stated are not correct or at least misleading. In particular, we can indeed have a total language with a self evaluator.

The idea is basically the same as the above compiler, but rather than having a magic no-op instruction, we write our programs monadically in a special monad with a single effect (call it Bump) that serves this purpose. The type of this monad keeps track of a lower bound M on N. When Bump is invoked in an environment with N > M, the continuation may only be resumed in an environment that has N > M + 1.

We can then implement our type/proof checker as a monadic computation and reason formally that if the proof checker completes, then its judgement is correct. We can further reason that any program requires only some finite number of levels for the proof checker to terminate. And finally, for any particular program, we can reason that there in fact are enough levels and the proof checker will terminate. This reasoning step is is why the proof checker needs to be monadic. When it gets to that step in a proof, the checker issues a Bump effect an appropriate number of times.

I think it's fair to call this a total self-interpreter. The type checker will recognize and let you evaluate exactly the same language you started with. Of course, your type/proof system doesn't know that it's exactly the same language. It thinks it's a smaller language and that prevents your programs from using the results of eval in ways that would lead to trouble.

Any thoughts?

### How would you express this

Then we can have an inference rule in the logic that allows us to conclude that N > M for any particular M:Nat, without any way to infer that (forall M:Nat. N > M), which would lead to contradiction.

How would you express this symbolically? It seems we shouldn't be able to choose N as the value of M, because then we get N > N. Perhaps there'd be a special "IsSmall M" predicate, but then I don't see how (or if) you plan to avoid (forall M:Nat. IsSmall M).

Any human reading this clause can safely this clause (we know there are always sufficiently many levels), and just as importantly, we can instruct the compiler to allow execution of functions that terminate if there are enough levels and that will make things total.

What if someone tries to do this recursively, with one compiler executing a function that runs another compiler, and so on?

For the self-interpreter, when you say "The type of this monad keeps track of a lower bound M on N," that makes sense... but the self-compiler sounds like it might be a different story.

### How would you express this

How would you express this symbolically?

I think the best way is to make the discovery of additional Big Universes (by increasing M) static. So for example, in a sequent logic, you could have an inference rule of the form:

   |- M:BigUniverse
----------------------       ---------------------
|- succU M:BigUniverse       |- firstU:BigUniverse


The important thing here is to not allow a context on the left of the turnstile. If you were allowed to assume M:BigUniverse and then use the above left rule to prove that succU M:BigUniverse, you could then quantify and get into trouble.

I'm sure there are lots of ways to accomplish this. The main thing is to keep the discovery of big universes as part of the static fragment of the system that can't be quantified over.

What if someone tries to do this recursively, with one compiler executing a function that runs another compiler, and so on?

Then this recursive function won't be typeable (should that be 'typable' in the states?) under the rules. That wouldn't require sufficiently many levels - it would require infinitely many.

For the self-interpreter, when you say "The type of this monad keeps track of a lower bound M on N," that makes sense... but the self-compiler sounds like it might be a different story.

So you're saying that the self-compiler step didn't make sense? Or you didn't believe it? Or ...?

Edit: One thing I can say is that my presentation here probably wasn't pedagogically optimal. I would recommend trying to understand the interpreter first, and then try to think about how the scheme could translate to a compiler that targets existing hardware. The obstruction you hit will be trying to prove the correctness of the monadic Bump when the proof system can only itself learn of the existence of some finite (though arbitrary) number of universe levels. The trick presented here is to axiomatize the idea of a hardware instruction that does the Bump, so that we can reason statically, with only a small number of levels known, that there will be enough levels later if Bump succeeds. I wanted to get to this trick, so I jumped right in to self-compilation.

### Self-weakening quine

The main thing is to keep the discovery of big universes as part of the static fragment of the system that can't be quantified over.

The obstruction you hit will be trying to prove the correctness of the monadic Bump when the proof system can only itself learn of the existence of some finite (though arbitrary) number of universe levels. The trick presented here is to axiomatize the idea of a hardware instruction that does the Bump, so that we can reason statically, with only a small number of levels known, that there will be enough levels later if Bump succeeds.

Thanks very much for walking me through both of those cases! They make more sense to me now.

should that be 'typable' in the states?

I'd like to know this too....

Then this recursive function won't be typeable (should that be 'typable' in the states?) under the rules. That wouldn't require sufficiently many levels - it would require infinitely many.

Oh, so if a compiler is verified at a certain universe, it can run some of the code of the program it's compiling, but only if that code is typed such that it fits within that universe.

I'm still skeptical and/or confused, but my lack of understanding is beginning to revolve around a rather specific counterexample....

Suppose the program being compiled is a quine that compiles itself at compile time (e.g. by invoking itself at the type layer, so the type-correctness of the program is held at ransom). It must use some kind of bundled compiler function of its own, whose universe index must be weaker than the original compiler. To keep this going in a loop, the quine will need to be slightly self-mutating, so that the source code at each level produces a compiler weaker than the last. If this quine exists, it can loop for as long as a weaker universe is available.

Does this sound plausible to you? Here's a methodical illustration of the quine:

The next program is a (SourceCode) value:
...quine with M+1 in place of M...

This program's bundled compiler is a total function
(SourceCode -> CompiledCode) at weakness level M:
...definition of self-compiler...

The (Uncheckable) type is a level-M type defined as follows:
Run this program's bundled compiler on the next program.
Branch (trivially) based on the compiled output:
No matter what...
The (Uncheckable) type is the unit type.

The typechecking conundrum is an (Uncheckable) value:
It's the unit value.

The main entrypoint is an (Application) value:
...application which does nothing...

### Another try

I think the answer to your question is that the Uncheckable type doesn't type check. Let me give a slightly more detailed presentation of the idea to explain why.

First, let me highlight the key ideas. Consider the following two systems:

System 1) There is an N > 0 such that there are universes U 0 .. U N such that U (i+1) is a subset of U i when it exists.

System 2) There is an N > 0 such that there are universes V 0 .. V (N-1) such that V (i+1) is a subset of V i when it exists.

System 1 can prove the consistency of System 2 by modeling all of its terms as belonging to U 0, and in particular mapping V i to U (i+1).

These systems are tantalizingly close to isomorphic, but the second system would need to know that N > 1 to be as strong as the first system. I say 'tantalizingly close' because we know how to construct models for this situation in which N is arbitrarily large. Just knowing that N could be one larger shouldn't be a problem in the model.

Because system 1 can prove the consistency of system 2, we know from Godel's incompleteness theorem that this isn't something we can just fix by adding more axioms to the logic. We cannot have a situation where the logic 'understands' its own correctness proof. The idea then, is to move the ability to discover than N > M, for any particular M, into the 'subconscious' of the logic, so that we can do it statically but can't reason about it.

Let's first think about what happens if we're in a theorem prover with this ability to bump M and we want to reason about the correctness of a self proof checker. Interpreting most of the elements of a proof will be a straightforward (if tedious) mapping of terms into universes and we will be able to establish the soundness of the proof checker except for the step of bumping M (the known lower bound on N). We will not be able to prove that this step is correct in general. But for any particular proof, we can know that there are enough levels to make that proof go through. And we can prove something like this for the general case:

forall prf:ProofTerm.  (N > levels prf --> check prf --> conclusion prf)

As human readers, we know that the (N > levels prf) can be true for every proof, and so we can read this statement as a self-consistency proof. But that's from the point of view of a theorem prover. What about from the point of view of a total language? Well, we can build an evaluator that has a type like this:

eval : (t:typedTerm) -> { N > levelOfTerm t } -> typeOfTerm t

where {P} is the type of proofs of the proposition P. That's sort of a self evaluator, but requires a manual per-term check of the level, so we might get objections if we try to call it a self-evaluator. (FYI: We allow that 'typeOfTerm t' might be Nonsense for the doesn't-typecheck case). But what if we introduce a type like this:

-- The bumper "monad" (ok, not exactly a monad because of the M index)
data Bumper M:Nat a:* = Return ({M > N} -> a) |  Bump (Bumper (M+1) a)

The 'data' here is proper inductive (finite depth) data. Now we declare that our language only accepts programs of the form:

main: Show a => Bumper 0 a

Such a program is run by unwrapping Returns until a value is obtained and then showing it. I claim that in an otherwise total language this will not introduce divergence. And further, it allows us to write:

bumpEval: (t:typedTerm) -> Bumper 0 (typeOfTerm t)

This bumpEval is a proper self-evaluator. It will accept exactly the same set of terms as the language in which it is defined. (So you can, for example, build a bumpEval' function that lives inside embedded term language, and run an evalutor inside an evaluator. But if you recur to divergence with this scheme, you won't be able to type such a function.)

Of course, this is a simple calculator type model of programs. Real programs don't simply calculate a value, but also interact with the user. We can also adapt this scheme for that setting by insisting that only finitely many levels are needed to make progress. So we allow a procedure to run even though it has no bound on the number of levels it can descend if it only descends finitely many per step.

Note that I'm writing things in pseudo-code and leaving the details fuzzy because I haven't worked through the details with a system like Coq. Coq with its universe magic might be hard to bolt this on to. I think things are considerably easier if you think about self-implementation ahead of time to allow a shallower encoding.

### Number lazy buggy bumpers

As human readers, we know that the (N > levels prf) can be true for every proof[...]

Just to be clear, I'm not convinced of that yet.

My remaining issues were with the compiler, but now I think they can reduce back to issues with the interpreter: Can you really calculate (typeOfTerm t) without unsafely running a Bumper?

(My quine example, if it works, makes (the compiler's) typechecking diverge; "you won't be able to type such a function" is the point. So I wonder if the same issue exists for the interpreter's (typeOfTerm t). If you can tell me it doesn't, that might help me understand the compiler too.)

### Just to be clear, I'm not

Just to be clear, I'm not convinced of that yet.

But that is trivial: a proof can only ask for finite M and we know that we can make N arbitrarily large.

I think you're bringing intuitions from a dependent type checker. You're thinking that through unsoundness in my logic I might inadvertently allow divergent computation in the type checker. Whereas what I have in mind there is that a proof is a tree of inferences and proof checking just involves checking inferences one-by-one to make sure that the hypotheses and conclusions match the patterns of the axioms.

And type checking can be thought of in a similar manner. So when I say that your program "doesn't type check", I mean that there does not exist a proof that you could supply to the type checker that would convince it of the correctness of the computational fragment of your program. If you supply some bogus proof, it will fail to type check, meaning that it will detect "this proof is not a valid inference tree under the axioms", but it will certainly terminate.

If you'd like, you can think of this as requiring that all normalization steps be elaborated manually by the programmer. If that normalization doesn't terminate, then you don't ever have a well typed program. We can add the ability to run proven-to-terminate computations in the type checker later once we're happy that proven-to-terminate implies termination :).

### Ah, terminating only in practice, not in theory :)

Whereas what I have in mind there is that a proof is a tree of inferences and proof checking just involves checking inferences one-by-one to make sure that the hypotheses and conclusions match the patterns of the axioms. [...]there does not exist a proof that you could supply to the type checker that would convince it of the correctness of the computational fragment of your program.

Nice!

At one point, you say "I claim that in an otherwise total language this will not introduce divergence." Are you talking about using this when you want to make programs that are guaranteed to terminate, but without actually expressing that guarantee as a theorem in the logic?

Then once you've made this Bump of faith for a while, you can Return to tamer computations, now equipped with the knowledge that the theory has a certain level of strength. Awesome.

### At one point, you say "I

At one point, you say "I claim that in an otherwise total language this will not introduce divergence." Are you talking about using this when you want to make programs that are guaranteed to terminate, but without actually expressing that guarantee as a theorem in the logic?

The main use I have in mind for this is self-compilation and modified-self compilation (adding/changing features without lowering consistency strength). Situations where you're embedding an evaluator could be candidate applications, but usually there it will be acceptable to accept that the evaluator language is of lower strength. There is a downside to using Bump in that it becomes visible in your type signatures.

### Self-evaluation

Coming across some papers on this topic that I've bookmarked:

* Typed self-representation (2009): defines a dependent, inconsistent kind system to support a type:type rule, if I'm reading correctly.
* Typed Self-Interpretation by Pattern Matching: introduces a special pattern matching construct which I believe is based on Jay et al's earlier work on the SF-calculus.

### The latest I've found in the

The latest I've found in the sequence of papers about typed metacircularity:

It also reviews the two papers I linked above. The factorization approach used in the second paper is certainly interesting. This latest paper from Jacques Carrette and Aaron Stump notes that while it successfully achieves its goals, reflecting on programs is done at a lower-level of combinators than the higher level of binders and lambda abstraction.

However, I just came across another of Jay's earlier papers on the pattern calculus where he extends the pure pattern calculus to cover case matching on pattern matching functions:

Static typing obviously remains the challenge.