A refutation of Gödel's first incompleteness theorem

Contradictions in Gödel's First Incompleteness Theorem.

Notes for the second edition. I have edited this post a bit; sorry if that is against the etiquette. The original post can be found here. I have removed a couple of peripheral mistakes. However, I also want to warn that the argument is a lot clearer when adapted to a modern rendition of Gödel's theorem, based on Turing machines, as suggested in this comment. That adaptation is carried out in the comments that hang from the previously referenced comment, from this one onwards.

Intro

This refers to Kurt Gödel's "On Formally Undecidable Propositions of Principia Mathematica and Related Systems". The notation used here will be the same as that used by Gödel in that paper.

In that work, Gödel starts with a description of the formal system P, which, according to himself, "is essentially the system obtained by superimposing on the Peano axioms the logic of PM".

Then he goes on to define a map Phi, which is an arithmetization of system P. Phi is a one-to-one correspondence that assigns a natural number, not only to every basic sign in P, but also to every finite series of such signs.

One

There are alternative arithmetizations of system P. I will later delve into how many.

This is obvious from simply considering a different order in the basic signs when they are assigned numbers. For example, if we assign the number 13 to "(" and the number 11 to ")", we obtain a different Phi.
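As a minimal sketch, here are two such arithmetizations in Python, using Gödel's own assignment of numbers to the basic signs and a variant that only swaps the codes for the parentheses (the dictionary and function names are mine, for illustration):

```python
# Gödel's assignment of numbers to the basic signs of P.
phi_0 = {'0': 1, 'f': 3, '~': 5, 'v': 7, 'A': 9, '(': 11, ')': 13}

# A different, equally valid arithmetization: swap the parenthesis codes.
phi_1 = dict(phi_0, **{'(': 13, ')': 11})

def encode(phi, signs):
    """Map a finite series of basic signs to the series of their code numbers."""
    return [phi[s] for s in signs]

print(encode(phi_0, ['(', '0', ')']))  # [11, 1, 13]
print(encode(phi_1, ['(', '0', ')']))  # [13, 1, 11]
```

The same series of signs receives different number series under the two maps, which is the point: each valid Phi^i is a distinct arithmetization of the same system P.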

If we want Gödel's proof to be well founded, it should obviously be independent of which arithmetization is chosen to carry it out. The procedure should be correct for any valid Phi chosen; otherwise it would **not** apply to system P, but to system P **and** some particular Phi.

To take care of this, in Gödel's proof we may use a symbol for Phi that represents abstractly any possible valid choice of Phi, and that we can later substitute for a particular Phi when we want to actually get to numbers. This is so that we can show that substituting for any random Phi will produce the same result.

The common way to do this is to add an index i to Phi, coming from some set I with the same cardinality as the set of all possible valid Phi's, so we can establish a bijection among them - an index. Thus Phi becomes here Phi^i.

Two

Later on, Gödel proceeds to spell out Phi, his Phi, which we might call Phi^0, with his correspondence of signs and numbers and his rules to combine them.

And then Gödel proceeds to define a number of metamathematical concepts about system P that are arithmetizable with Phi^0, in 45 consecutive definitions, culminating with the definition of provable formula.

The definition of provable formula means, in this context, the definition of a subset of the natural numbers, such that each number in this set corresponds one-to-one with a provable formula in P.

Let's now stop at his definition number (10):


  E(x) === R(11) * x * R(13)

Here Gödel defines "bracketing" of an expression x, and this is the first time Gödel makes use of Phi^0, since:


  Phi^0( '(' ) = 11

  Phi^0( ')' ) = 13

If we want to remain general, we may rather do:


   E^i(x) === R(Phi^i( '(' )) * x * R(Phi^i( ')' ))

Two little modifications are made in this definition. First, we replace 11 and 13 with Phi^i applied to "(" and ")"; 11 and 13 are what we obtain if we instantiate the definition with Phi^0.

And second, E inherits an index i; obviously, different Phi^i will define different E^i. And so do most definitions afterwards.
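The generalized definition can be sketched concretely. The following Python toy assumes Gödel's prime-power encoding of series (a series n1,...,nk becomes 2^n1 * 3^n2 * ... * p_k^nk, R(a) is the series consisting of a alone, and * concatenates series); the function names are mine:

```python
def primes():
    """Yield 2, 3, 5, ... by trial division (fine for a demo)."""
    n, found = 2, []
    while True:
        if all(n % p for p in found):
            found.append(n)
            yield n
        n += 1

def to_number(seq):
    """Encode a series of numbers as a product of consecutive prime powers."""
    result = 1
    for p, n in zip(primes(), seq):
        result *= p ** n
    return result

def factor_seq(num):
    """Recover the series from its encoding (valid when all exponents > 0,
    which holds here because every sign code is positive)."""
    seq = []
    for p in primes():
        if num == 1:
            return seq
        e = 0
        while num % p == 0:
            num //= p
            e += 1
        seq.append(e)

def concat(x, y):
    """Gödel's x * y: the encoding of the concatenated series."""
    return to_number(factor_seq(x) + factor_seq(y))

def R(a):
    """The encoding of the series consisting of a alone."""
    return 2 ** a

def E(x, phi):
    """Generalized definition (10): bracket x using arithmetization phi."""
    return concat(concat(R(phi['(']), x), R(phi[')']))
```

Instantiating with Gödel's own codes, E(x, {'(': 11, ')': 13}) reproduces his E(x) = R(11) * x * R(13); swapping the parenthesis codes yields a different number for the same bracketed expression.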

Since, for the moment, in the RHS of definitions from (10) onwards, we are not encoding in Phi^i the index i, which has sprouted on top of all defined symbols, we cease to have an actual number there (in the RHS); we now have an expression that, given a particular Phi^i, will produce a number.

So far, none of this means that any of Gödel's 45 definitions are in any way inconsistent; we are just building a generalization of his argument.

Three

There is something to be said of the propositions Gödel labels as (3) and (4), immediately after his 1-45 definitions. With them, he establishes that, in his own words, "every recursive relation [among natural numbers] is definable in the [just arithmetized] system P", i.e., with Phi^0.

So in the LHS of these two propositions we have a relation among natural numbers, and in the RHS we have a "number", constructed from Phi^0 and his 45 definitions. Between them, we have an arrow from LHS to RHS. It is not clear to me from the text what Gödel was meaning that arrow to be. But it clearly contains an implicit Phi^0.

If we make it explicit and generalized, we must add indexes to all the mathematical and metamathematical symbols he uses: All Bew, Sb, r, Z, u1... must be generalized with an index i.

Then, if we instantiate with some particular Phi^i, it must somehow be added in both sides: in the RHS to reduce the given expression to an actual number, and in the LHS to indicate that the arrow now goes from the relation in the LHS **and** the particular Phi^i chosen, to that actual number.

Obviously, if we want to produce valid statements about system P, we must use indexes, otherwise the resulting numbers are just talking about P and some chosen Phi^i, together.

Only after we have reached some statement about system P that we want to corroborate, should we instantiate some random Phi^i and see whether it still holds, irrespective of any particularity of that map.

These considerations still do not introduce contradiction in Gödel's reasoning.

Four

So we need to keep the indexes in Gödel's proof. And having indexes causes much trouble in (8.1).

In (8.1), Gödel introduces a trick that plays a central role in his proof. He uses the arithmetization of a formula y to substitute free variables in that same formula, thereby creating a self-reference within the resulting expression.

However, given all the previous considerations, we must now have an index on y; we need y^i, and so it ceases to be a number. But Z^i is some function that takes a natural number and produces its representation in Phi^i. It needs a number.

Therefore, to be able to do the trick of expressing y^i with itself within itself, we need to convert y^i to a number, and so, we must also encode the index i with our 45 definitions.

The question is that if we choose some Phi^i, and remove the indexes in the RHS to obtain a number, we should also add Phi^i to the LHS, for it is now the arithmetic relation **plus** some arithmetization Phi^i that determine the number in the RHS, and this is not wanted.

Five

But to encode the index, we ultimately need to encode the actual Phi^i. In (3) and (4), if in the RHS we are to have a number, in the LHS we need the actual Phi^i to determine that number. If we use a reference to the arithmetization as index, we'll also need the "reference map" providing the concrete arithmetization that corresponds to each index. Otherwise we won't be able to reach the number in the RHS.

Thus, if we want definitions 1-45 to serve for Gödel's proof, we need an arithmetization of Phi^i itself -with itself.

This may seem simple enough since, after all, the Phi^i are just maps. But it leads to all sorts of problems.

Five one

Now, suppose that we can actually arithmetize any Phi^i with itself, and that we pick some random Phi^i, let's call it Phi^0: we can define Phi^0 with Phi^0, and we can use that definition to further define 10-45.

But since Phi^0 is just a random arithmetization of system P, if it suffices to arithmetize Phi^0, then it must also suffice to arithmetize any other Phi^i equally well. However, with Phi^0, we can only use the arithmetization of Phi^0 as index to build definitions 10-45.

This means that, as arithmetizations of system P, the different Phi^i are not identical among themselves, because each one treats the arithmetization of itself differently from the arithmetization of the other Phi^i.

Exactly identical arithmetical statements, such as definition (10) instantiated with some particular Phi^i, acquire different meaning and truth value when expressed in one or another Phi^i.

Among those statements, Gödel's theorem.

Five two [This argument does not hold if we have to consider recursively enumerable arithmetizations.]

A further argument that shows inconsistency in Gödel's theorem comes from considering that, if we are going to somehow encode the index with Phi^i, we should first consider what entropy that index must have, since it will correspond to the size of the numbers we will need to encode it. And that entropy corresponds to the logarithm of the cardinality of I, i.e., of the number of valid Phi^i.

To get an idea of the magnitude of this entropy, it may suffice to think that variables have two degrees of freedom, both with countably many choices. Gödel very conveniently establishes a natural correspondence between the indexes of the variables and the indexes of primes and of their consecutive exponentiations, but in fact any correspondence between both (indexes of vars, and indexes of primes and exponents) should do. For example, we can clearly have a Phi^i that maps the first variable of the first order to the 1000th prime number raised to the 29th power.

This gives us all permutations of pairs of independent natural numbers, and so uncountably many choices for Phi^i; so I must have at least the cardinality of the real line. Therefore y^i doesn't correspond to a natural number, since it needs more entropy than a natural number can contain, and cannot be fed into Z^i, terminating the proof.


Probably not

Godel's theorem has been checked thousands of times, including mechanically, so the inconsistency you've found is probably one of your contributions. I don't follow it, but what you've written about needing to check every encoding sounds wrong. You only need one encoding. The Godel statement you produce will depend on the encoding you choose, but one Godel statement is enough.

mechanized proof

A reference for a mechanized proof would be the Coq proof by Russell O'Connor. The proof is not completely identical to Gödel's: it contains simplifications in places, and additional developments related to the question of variable representation, but it should be able to lift most doubts that the result actually holds.

re: mechanized proof

Having a mechanized proof about something that is essentially metamathematical, and where missing a metamathematical step may produce meaningless but coherent mathematics wrongly interpreted as metamathematically meaningful, doesn't help here.

The Book

Nicely put. I might add that there are no mechanized proofs in The Book. Claimed refutations of Gödel's Theorem all, in my experience, treat Gödel's proof as esoteric manipulation of symbols. The original proof was a mess, of course; but by now it's down to a straightforward exercise in diagonalization, and not vulnerable to a complicated argument. It might conceivably be bypassed, by a simple and (once pointed out) obvious bit of reasoning that allows us to simply breeze past it; but I'd expect that to take just a few sentences to explain.

Perhaps

I repeat what was nicely put, and add: perhaps it's fun. If I'm going to answer an argument that was put forward by Gödel, and that answer can be maintained against the modern incarnations of his words, I should not be so disrespectful as to not answer him directly... that was my thought, at least :)

My take

My own treatment of Gödel is yonder.

The first proof of something like Gödel's Theorem typically is a mess for, more or less, two reasons; or perhaps one reason with two halves. At first, nobody knows how to prove it cleanly. Iirc, the first simple, clear proof of the Church-Rosser Theorem started circulating about 30–35 years after Church and Rosser's paper. So (1) the originator of the proof doesn't know how to write a really neat proof, and (2) the very awkwardness of the proof makes everyone that much more suspicious and therefore forces the prover to go into inordinate detail to try to make sure they're not overlooking something. Of course, they might be overlooking something anyway, not because they're "wrong" in the essence of their result but because their inordinately complicated method of proof is itself error-prone. Later on, once the essence is better understood and everything looks much simpler, if there was any truly fundamental problem with the result it will become more and more obvious; and by that point, you're not going to convince people there's a problem with the basic result by pointing to the original proof. As the result can be explained simply, any discussion of it must be related to the simple explanation rather than to the unnecessarily difficult original proof; if a new objection can't be related to the simple new treatment of the result, the natural suspicion will be that the lack of clarity is hiding a flaw in the objection.

Ok

I see your point, and I concede it. I will try to migrate my argument to the modern incarnations of the theorem. Thanks!

My take on your take

In your take you speak about a Turing machine L and another, A. The Turing machine L takes a Turing machine f and an input i and works out the output. And A uses L to produce the contrary result, so that, when applied to itself, it produces a contradiction.

the machine A must obviously contain L.
No problem with that. Let's note it with A(L).

L is about implementation, in the sense that to be able to, given some machine f, determine that it produces some output, either you run the machine, or you already have within L all possible results of all possible machines, which is crazy.

So L corresponds with a given way to encode Turing machines and run them.

There are many, many ways of doing this (encoding and running Turing machines). Let's note this by adding an index to L, L^i. It's totally harmless, and we can still have L^1, that is exactly and simply L, by definition here and now.

However now we also have different A(L^i), and also A(L^1).

In your text, we may say that you encode A(L^1) in L^1 and feed it to A(L^1).
And this produces a contradiction.

But using indexes allows us to wonder:

What happens if we encode the A(L^1) in L^2 and feed it to the A(L^2) machine?
It will find no trouble at all, no contradiction. And feeding it to A(L^2) amounts to nothing but running it. So, running the Turing machine A(L^1) in different environments leads to different results. The results are implementation-dependent.

Expressing the same Turing machine, A(L^1), in 2 different encodings, L^1 and L^2, leads to different results.

Take 3

And here we go :-D ...

> The Turing machine L takes a Turing machine f and an input i and works out the output.

Actually, no. All I assumed L would do was decide whether or not f could have input/output pair (i,j). That is, given triple (f,i,j), L would decide whether or not f on input i would produce output j. We don't know how L does its thing. I didn't want a proof that only applies to machines L that work in a certain way, you see; so I made no assumption about how it works (except, of course, the assumption that it's a Turing machine). If there is some way to do what L is supposed to do without L having to work out f's output on i, the proof should cover that.

To bring out the point a bit more: I noted that if L never makes a mistake, there must be some (f,i,j) for which L answers neither "yes" nor "no", because if L never makes a mistake and always answers, we could solve the halting problem. Build a machine L' such that L' takes input (f,i); L' automatically constructs another machine f' such that f' runs f on i and then outputs "confirm"; and L' sends (f',i,confirm) to L. So L' will take (f,i) and tell us "yes" or "no" whether or not f halts on input i. But L' does not necessarily run f on input i. It constructs an f' that would run f on i, but then instead of running f', it passes (f',i,confirm) to L; and we don't know for sure that L runs f' on input i, either.
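The construction of L' above can be sketched schematically in Python, treating machines as plain callables. L itself is the hypothetical decider, so the demo only exercises the plumbing with a mock that is sound merely for halting machines; all names here (make_f_prime, L_prime, mock_L) are mine, for illustration:

```python
# L is the hypothetical decider: L(f, i, j) answers "yes"/"no" to whether
# machine f on input i outputs j. We never assume how L works internally.

def make_f_prime(f, i):
    """Build f': run f on i (discarding its output), then output 'confirm'."""
    def f_prime(_unused_input):
        f(i)                 # loops forever exactly when f doesn't halt on i
        return "confirm"
    return f_prime

def L_prime(L, f, i):
    """Answer whether f halts on input i, consulting only L.
    Note that f is never run here; we merely hand (f', i, 'confirm') to L."""
    return L(make_f_prime(f, i), i, "confirm")

# Demo with a mock L that cheats by running its argument; the mock is only
# sound for machines that do halt, which is all we can demonstrate concretely.
def mock_L(f, i, j):
    return "yes" if f(i) == j else "no"

print(L_prime(mock_L, lambda n: n + 1, 5))  # yes
```

The essential point survives the toy setting: L_prime itself never runs f, it only builds f' and defers to whatever L is supposed to do.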

> the machine A must obviously contain L.
> No problem with that. Let's note it with A(L)

This is doable, but unnecessary. I do describe how to build A, given L; therefore one could view the construction as a function of L. However, we have only one L (although we don't know which one that is); so I don't see why we should depict A as a function; I'm only interested in the A constructed from the particular L we're looking at. Granted, viewing A as a function of L is doable.

However, at this point you proceed with a bunch of reasoning that depends on assumptions about how L must work. Again: I deliberately made no assumptions about how L works, beyond requiring L to be a Turing machine. If I had made assumptions about how L works, my proof by contradiction would only show that there could not be an L that works the way I'd assumed; and I don't want my proof to be so limited, so I made no such assumptions.

Thanks for the patience

> And here we go :-D ...

Thanks for the patience. Really.

> I didn't want a proof that only applies to machines L that work in a certain way, you see; so I made no assumption about how it works

Nor do I say any assumption has to be made.

But the fact that there are alternative L's is also true. And that fact means that L + A expressed in L behaves differently than when expressed in L'. The same Turing machine (A + L in some combination, be L part of A or part of the input), with the same input, run in different environments - L and L', which are supposedly equivalent as regards running Turing machines - behaves differently.

on-going

> Thanks for the patience. Really.

Pushing the envelope with extra-paradigm scientific ideas is risky and difficult. There are a wide variety of things that can go wrong, both in implementation and presentation. I have some respect and sympathy for someone rational who's willing to try it.

> Nor do I say any assumption has to be made.

I did see you making some assumptions, though:

> L is about implementation, in the sense that to be able to, given some machine f, determine that it produces some output, either you run the machine, or you already have within L all possible results of all possible machines, which is crazy.
>
> So L corresponds with a given way to encode Turing machines and run them.

That's an unnecessary assumption. You're assuming L has to run the machine. In fact, when someone manages to prove that something surprising can be done, often they've found a surprising way to do it — hence my particular caution not to assume that L has to actually run f in order to deduce things about its behavior.

It is true that L has to use some encoding of f.

> There are many, many ways of doing this (encoding and running Turing machines). Let's note this by adding an index to L, L^i. It's totally harmless, and we can still have L^1, that is exactly and simply L, by definition here and now.

It isn't possible for L to take f unencoded, because the input to a Turing machine has to be a string of symbols, and a Turing machine is not itself a string of symbols. This seems an important point to explore, as you remark (in another comment, below),

> > However, we have only one L
>
> The point is that we may easily have more than one. And then things get weird.

It doesn't matter what other Turing machines exist. We suppose that a machine exists with such-and-such properties, and show that this supposition leads to a contradiction. So there wasn't such a machine in the first place.

My point about encodings is that a Turing machine is actually a graph; a set of states, with directed transitions between them labeled with symbols and actions. It's about as un-string-like as you can get. When I speak of L taking as input (f,i,j), what I really mean is that we stipulate some manner of encoding Turing machines as strings, and L takes as input a triple of strings where the first is an encoding of f and the second and third are i and j. There's another encoding there: we have some way to represent a triple of strings; but there are various ways to do that, and the details of how we choose to do it shouldn't matter.

Machine A takes as its input a string i; it figures out what the column number m is for that i; it figures out what the encoding-of-a-Turing-machine is that labels row m; it uses L to decide whether that encoded machine on input i would output "confirm"; and if L produces an answer "no" then A outputs "confirm", if L produces "yes" then A loops forever, and of course if L doesn't produce an answer then neither does A. I have just described a Turing machine, which I'm calling A. We can encode A, producing a string that must appear as the label on some row of the table.
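The behavior just described can be sketched as follows, with the table abstracted into two assumed helpers; row_machine, col_number and make_A are hypothetical names of mine, and L is again the supposed decider:

```python
# A sketch of the diagonal machine A. col_number(i) gives the column number
# m that input i labels; row_machine(m) gives the (encoded) machine labelling
# row m; L(f, i, j) answers "yes"/"no"/nothing. All three are assumptions.

def make_A(L, row_machine, col_number):
    def A(i):
        m = col_number(i)            # which column does input i label?
        f = row_machine(m)           # the machine labelling row m
        answer = L(f, i, "confirm")  # would f on input i output "confirm"?
        if answer == "no":
            return "confirm"         # do the opposite of the table entry
        if answer == "yes":
            while True:              # loop forever: again the opposite
                pass
        # if L gives no answer, A gives none either
    return A

# With a toy L that always answers "no", A outputs "confirm" on anything.
A = make_A(lambda f, i, j: "no", lambda m: None, lambda i: 0)
print(A("some input"))  # confirm
```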

That is, if our initial assumption was true in the first place — that there exists an L with such-and-such properties. But the whole point is that our initial assumption could not have been true, because it leads to a contradiction.

For the later bits of your reasoning, it seems pretty clear to me (tbh) you've gotten yourself tangled up somehow, and I'm not 100% sure what's happened but I have a suggestion. You say

> In your text, we may say that you encode A(L^1) in L^1 and feed it to A(L^1). And this produces a contradiction.
>
> But using indexes allows us to wonder:
>
> What happens if we encode the A(L^1) in L^2 and feed it to the A(L^2) machine?

If I'm reading you rightly, L^1 and L^2 are the same logic except that they use different encodings of machines as strings. A(L^1) and A(L^2) would then be the corresponding machines "A" for these two different versions of "L", although it's a bit dicey to put it this way because, although constructing an "A" for each of two different logics might be fairly described as a function A(L), when you change the encoding used by L you may have to change the way you do the construction of your "A" in ways that are unclear. Glossing over that, though, there's a more basic problem (if I'm reading you rightly): we do not get to choose which machine we encode in which encoding and send to which machine. I'll walk through that carefully.

The machine A(L^1) is doing its own internal processing, based on an input string i. Because A(L^1) is (by supposition) a Turing machine, its encoding (using encoding number "1" that both machines L^1 and A(L^1) use) does in fact occur as the label on some row of the table; since we have A(L^1) in hand, and we know how encoding number "1" works, we could in principle figure out which row m it is, and what input i labels column m, and then we could give input i to A(L^1). It would be fair to think of i as itself an encoding of A(L^1), since we have a unique mapping from Turing machines to input strings i; it probably isn't encoding number "1" (nor encoding number "2"), but technically it is an encoding. And we reason about what would happen if we gave input i to A(L^1), as a consequence of the behavior we've built in to machine A(L^1).

If we wanted to base our reasoning off of encoding number "2", we'd have machine L^2, machine A(L^2), and a whole new table (the ordering of rows could be very different, due to the different encoding). We could apply encoding number "2" to A(L^2), figure out what row the encoding occurs on (call it m2), figure out the input on the same-numbered column (call it i2), and reason about what would happen if we give that input to A(L^2). There isn't anything useful we can say, in general, about what would happen if we gave input i to A(L^2), or input i2 to A(L^1); as far as A(L^2) is concerned, i is just some arbitrary number, likewise i2 has no significance for A(L^1).

Mostly agree with you

> I have some respect and sympathy for someone rational who's willing to try it.

Excellent :)

> There isn't anything useful we can say, in general,
> about what would happen if we gave input i to A(L^2), or input i2 to A(L^1);
> as far as A(L^2) is concerned, i is just some arbitrary number,
> likewise i2 has no significance for A(L^1).

I agree with everything you say, except for one point, that might be recognized on the above quoted words, and which I try to clarify below.

Both L^1 and L^2 embody a way of encoding **any** Turing machine, and working out whether, given some input and output (also properly encoded), the encoded machine would stop with the given output.

Suppose we have a Turing machine f, and an input i and output o appropriate for that machine. Encoding them all to L^1 and giving them as input to Turing machine L^1, should provide the same result as if we encode it in L^2 and work it out with L^2. As an example, we may imagine a Turing machine f1 that determines whether a number n is bigger than 10. If we encode the machine in L^1, encode the number 11 in L^1, and provide both encoded objects as input for L^1, it should stop at "true", exactly the same as if we repeat the procedure with L^2.

I think we can agree that if we have some arbitrary L^1 and L^2, and also some particular Turing machine f and input i, and we get a different result when we provide f and i (appropriately encoded) as input to L^1 than when we do the same with L^2, then we have a problem. We have an implementation-dependent result.
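That agreement requirement can be sketched with a toy pair of encodings and interpreters. Everything below is invented purely for illustration (no claim is made about the actual L^1 and L^2): the same abstract machine f1 ("is n bigger than 10?") is serialized under two different encodings, and both interpreters must return the same verdict.

```python
from ast import literal_eval

f1_spec = ("gt", 10)  # abstract spec: output True iff input > threshold

def encode_1(spec):
    return "1:" + repr(spec)          # encoding number 1: plain repr

def encode_2(spec):
    return "2:" + repr(spec)[::-1]    # encoding number 2: reversed repr

def run_1(code, n):
    op, k = literal_eval(code[2:])    # decode under encoding 1, then run
    return n > k if op == "gt" else None

def run_2(code, n):
    op, k = literal_eval(code[2:][::-1])  # decode under encoding 2, then run
    return n > k if op == "gt" else None

# Both implementations agree on every input, as they must:
print(run_1(encode_1(f1_spec), 11), run_2(encode_2(f1_spec), 11))  # True True
print(run_1(encode_1(f1_spec), 9), run_2(encode_2(f1_spec), 9))    # False False
```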

Now, the combination of A and L^1 (or any other L) is just pure and simply a combination of a Turing machine and an input for it. So encoding them in L^1 and providing them as input for the L^1 Turing machine should provide the same result as encoding them in L^2 and using it as input for L^2.

And my point is that they do not provide the same result.

(edited for clarity)

More engaged response

> If I'm reading you rightly, L^1 and L^2 are the same logic except that they use different encodings of machines as strings

Yes. And I agree with most of the previous, although you must admit there is a bit of a drift.

> A(L^1) and A(L^2) would then be the corresponding machines "A" for these two different versions of "L", although it's a bit dicey to put it this way because, although constructing an "A" for each of two different logics might be fairly described as a function A(L), when you change the encoding used by L you may have to change the way you do the construction of your "A" in ways that are unclear.

If you change the encoding used by L, you have to change A(L) so that it uses the new encoding; it will just be a new L. I think there is no difference whether you assume that L is a part of A or an input to it. In both cases, if you want to "perform" A, with A encoded in some L, L itself will have to be encoded in that L.

And of course the question is what happens if the L you use to encode and the encoded L are different.

> The machine A(L^1) is doing its own internal processing, based on an input string i. Because A(L^1) is (by supposition) a Turing machine, its encoding (using encoding number "1" that both machines L^1 and A(L^1) use) does in fact occur as the label on some row of the table; since we have A(L^1) in hand, and we know how encoding number "1" works, we could in principle figure out which row m it is, and what input i labels column m, and then we could give input i to A(L^1)

Sorry if I misunderstand you, but encoding number "1" is totally out of scope. L^1 is a Turing machine, and as such, it can be encoded in L^1. No need to encode a reference to it.

> It would be fair to think of i as itself an encoding of A(L^1), since we have a unique mapping from Turing machines to input strings i

Exactly: an encoding of A(L^1) (i.e., of the very simple A combined, as a part or by currying, with the quite complex L^1) in L^1.

> as far as A(L^2) is concerned, i is just some arbitrary number, likewise i2 has no significance for A(L^1).

I totally agree with this. i corresponds to the encoding of A(L^1) in L^1, and to L^2 it is just a number. However, there is also the encoding of A(L^1) in L^2. And that is not just a number to L^2: it corresponds to a particular Turing machine, the same one that corresponds to i in L^1.

And working out the output of that Turing machine - A(L^1) - in L^1 or in L^2 gives different results.

Final take

I will go step by step through the diagonalization method in your post, taking into account the possibility of different encodings for Turing machines, different L^i.

Each L^i must correspond to some table, T^i. Both the rows (Turing machines) and the columns (inputs) must be in some encoding that can be fed to L^i, so it can decide whether the Turing machines halt with some given output.

To the A(L^i) Turing machines, the tables are oracles, and we'll take as solved the problem of providing the same oracle to all encodings of each A(L^i).

So now suppose we have a physical implementation of A(L^1), connected to an oracle with the table T^1.

Now we must feed the nth input (encoded in L^1) to the unencoded A(L^1); that input corresponds, in the nth column, to an L^1-encoded copy of A(L^1) itself, which must therefore have a connection with T^1 itself. Now, if we check the n×n cell, there is nothing but chaos. No doubt about it.

Let's now imagine, in the same setup, that there is a row corresponding to the encoding into L^1 of A(L^2), where L^2 is another machine complying with the specification of L. To A(L^1) we now provide the ith input, the one that corresponds to the encoding of A(L^2). But that input, in the T^2 table to which A(L^2) is connected, doesn't correspond to the A(L^2) encoded in L^2. So there is no trouble in the concerned cells of any of the tables.

This means that different L^i have tables with different entries in some cell that corresponds to the same particular specification of a Turing machine.

The row that corresponds to A(L^1) in the T^1 oracle has some problematic cell, whereas in the T^2 oracle it has none.

Imagine that two T^i had different entries in the cells that corresponded to some Turing machine that checked whether some number is bigger than 10.

The whole edifice around L seems to have a flaw, in which some propositions, some Turing machines, have a certain meaning or behaviour in every one of the possible L^i except in one: the one that corresponds to its own encoding. So if this context is correct, Gödel's proof might survive in it. However, it would be very diminished, since exactly the same propositions that are problematic in some implementation have a clear meaning in all the others. It stops being about Truth, and becomes about implementing Turing machines, or arithmetizations, etc.

More trouble comes from the fact that, if L^1 and L^2 are both correct implementations of L, it should be impossible for a Turing machine to distinguish whether it is being encoded and run in one or the other. However, couldn't some Turing machine connected to the oracle search for the rarity in the table, and work out exactly its own implementation by reverse engineering?

It is not difficult to add the concept of "compiler" and "decompiler" Turing machines, which take formal specifications of a Turing machine and encode them into some particular L^i, and take their outputs and produce formal specs, and to use these machines to clarify the previous algebra.

Completeness

This appears to be a criticism of Godel's method, but it is also not a proof of completeness. Where does that leave us? Intuitively, do you think a proof of completeness is possible in some systems, or do you think Godel was right but the proof is flawed?

re completeness

At the beginning of this discussion, I thought I could see an invalidation of Gödel's method. However, at the moment I'm not so sure about that; what I see, if my assumptions are correct, is that his method, his proof, is not incorrect, but his conclusions are; his constructions cease to speak about the completeness of system P.

And, in my view, if there is no support for such a crazy idea as the incompleteness that Gödel was deriving from his proof, we had better dismiss it.

But even crazier is the idea that I have really found a flaw in Gödel's argumentation! So I'm not yet thinking too much about consequences; I am basically going through my argument over and over looking for flaws...

We may have more

> However, we have only one L

The point is that we may easily have more than one. And then things get weird.

Last off

> > The Turing machine L takes a Turing machine f and an input i and works out the output.

> Actually, no. All I assumed L would do was decide whether or not f could have input/output pair (i,j).

To me this amounts to the same thing. The easiest way to take that decision is to run the machine; any other way just adds complexity. Perhaps I'm wrong. But the point is simply that there are alternative L's, not what form they might adopt.

John, I'm not sure about this

Looking at the treatment of Godel on your blog, we find this:

> To pin this down, we first need to work out what "sufficiently powerful" means. Gödel wanted a system powerful enough to reason about arithmetic: we can boil this down to, for an arithmetic function f and integer i, does f(i)=j or doesn't it? The functions of interest are, of course, general recursive functions, which are equi-powerful with Turing machines and with λ-calculus; so we can equivalently say we want to reason about whether a given Turing machine with input i will or will not produce output j. But a formal system is itself essentially a Turing machine; so in effect we're talking about a Turing machine L (the formal system; L for logic) that determines whether or not a Turing machine (the function f) on given input produces given output. The system would be consistent and complete if it confirms every true statement about whether or not f on given input produces given output, and doesn't confirm any false such statements.

I think you may have boiled it down too far, losing a couple of important aspects of the incompleteness theorem in the process. One of these losses makes the problem easier, and the other makes it harder, and they seem to sort of cancel each other out and you're left with what is essentially the halting problem. The missing pieces are:

1. Your version of the problem requires only checking a single input to a function, whereas arithmetic has quantifiers. This makes your version easier.

2. Your version doesn't allow for a proof term to accompany the problem. This makes the problem harder. Checking a function at a single input becomes easy if you're allowed a proof annotation -- just choose the number of steps needed for the Turing machine to terminate as your proof term. Affirming that f(i)=j in N steps just requires a universal machine to run f.

Because of these differences, I'm worried that your treatment in the blog doesn't really capture Godel's incompleteness theorem. Thoughts?
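Point 2 can be made concrete with a sketch (the machine format here is invented for illustration): given a step count N as the proof annotation, affirming that f(i)=j reduces to bounded simulation by a universal machine.

```python
# A sketch of point 2: with a step count as "proof term", checking that a
# Turing machine produces a given output on a given input reduces to
# bounded simulation. The machine format is invented for illustration.

def runs_to(transitions, tape, steps, expected):
    """Simulate for at most `steps` steps; return True iff the machine halts
    within that bound with `expected` left on the tape."""
    tape = dict(enumerate(tape))
    state, head = "q0", 0
    for _ in range(steps):
        if state == "halt":
            break
        sym = tape.get(head, "_")
        state, write, move = transitions[(state, sym)]
        tape[head] = write
        head += {"R": 1, "L": -1, "N": 0}[move]
    if state != "halt":
        return False  # no verdict within the annotated bound
    out = "".join(tape[k] for k in sorted(tape)).strip("_")
    return out == expected

# A machine that overwrites a single 1 with 0 and halts.
trans = {("q0", "1"): ("halt", "0", "N")}
assert runs_to(trans, "1", steps=5, expected="0")
```

Without the annotation, no bound on the loop is available, and the check becomes the halting problem.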

Some thoughts

First, a general observation about these sorts of profound principles named after their discoverer. In such cases, the name honors the person credited with discovering the profound principle, but we're likely to have neatified and generalized the principle since, so that what we end up crediting to them is broader than their original work. I'm after the general principle, the thing that's a deep feature of the fabric of reality. With that in mind...

Working from my belief that the principle involved is very profound and general and should afford a simple proof, I viewed entanglement with quantifiers as a defect; a distraction, making it harder to see the principle and giving an impression that the principle is far more dependent on specifics of the system than it actually is.

The part about proof annotations... is a bit more fraught, perhaps. I suspect it should be considered an adjunct to the introduction of quantifiers.

Concerning the halting problem — well, yes. I have always conceptualized Gödel as another form of the same primordial feature of reality as the halting problem. Gödel being the more elaborated form.

Sure

Generalization is what mathematics is about. My question is whether your treatment actually generalizes Godel. I don't think it does at the moment.

Two roads diverged in a wood

I think there are two ways to tackle this. On one hand, there's the matter of what general principle one perceives Gödel to be part of. On the other hand, there's the question of technical reducibility of one problem to another. The first question is subjective. The second seems a technical challenge that I, at least, would probably have to launch a major effort to sort out (and perhaps eventually I will; I've no clue how far in the future that might be, though).

It's been a pleasure

It's been the best conversation I've had in a long time :)

:-)

I also have found it quite pleasant.

Universal quantifier on f

> we're talking about a Turing machine L (the formal system; L for logic) that determines whether or not a Turing machine (the function f) on given input produces given output.

I would say that there is an implicit closure here.

I mean, we wouldn't want "a

I mean, we wouldn't want "a Turing machine L (the formal system; L for logic)" that can only check the output of some given f. So I think there is a universal quantifier hidden in that sentence.

Yeah, probably not - but

> needing to check every encoding

You don't need to check every encoding, but you need a proof that is independent of the encoding, i.e., that can be carried out with any valid encoding.

> The Godel statement you produce will depend on the encoding you choose, but one Godel statement is enough.

Only if you can prove that it does not depend on your choice of encoding.

One choice is enough

I don't agree that you need a proof that is independent of the encoding. The encoding is used to make precise, in terms of arithmetic, the mechanics of the proof system. Once you believe that this encoding faithfully captures those mechanics, proving the theorem in that setting is enough.

By analogy, suppose I want to prove the triangle inequality in the plane. I introduce coordinates for the plane and do the proof in coordinates. Do I need to worry that my proof works in any coordinate system? No. I can choose coordinates that are convenient to me.
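For what it's worth, the coordinate proof alluded to can be written out once, in any orthonormal coordinates: with u and v the edge vectors of the triangle, the Cauchy-Schwarz inequality \(u \cdot v \le \|u\|\,\|v\|\) gives

```latex
\|u+v\|^2 = \|u\|^2 + 2\,u\cdot v + \|v\|^2
          \le \|u\|^2 + 2\,\|u\|\,\|v\| + \|v\|^2
          = \bigl(\|u\|+\|v\|\bigr)^2 ,
```

and taking square roots yields the triangle inequality. The result is coordinate-free even though the computation was done in one convenient coordinate system.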

One choice may not be enough

> Do I need to worry that my proof works in any coordinate system?

No; you need to worry if your proof doesn't work in some valid coordinate system. So you have to protect against it.

> I can choose coordinates that are convenient to me.

Only once you have a description that is independent of the coordinates. Engineers choose coordinates.

the question is

> I don't agree that you need a proof that is independent of the encoding.

In any case, the question is not whether we **need** such a proof. Such a proof should be obvious and immediate if his proof were right. But making it explicit, that we may try a more general proof that is independent of the encoding, helps to uncover how much is understated in Gödel's propositions (3) and (4), where he hides behind an arrow the whole structure of his arithmetization, while making it seem a simple logical implication.

> The encoding is used to

> The encoding is used to make precise, in terms of arithmetic, the mechanics of the proof system. Once you believe that this encoding faithfully captures those mechanics, proving the theorem in that setting is enough.

If there are alternative encodings that also faithfully capture those mechanics, what harm can be found in seeing whether the theorem also works in them?

But you're not just

But you're not just considering alternate encodings. It sounds like you're requiring these alternative encodings to be "first class":

> Therefore, to be able to do the trick of expressing y^i with itself within itself, we need to convert y^i to a number, and so, we must also encode the index i with our 45 definitions.

That is, you're not just generalizing the proof to handle one arbitrary encoding, but are insisting that the proof handle all of them at once. I don't see how you're thinking that's necessary.

> are insisting that the

> are insisting that the proof handle all of them at once

I really do not think that is correct. The index that would be encoded would be that corresponding to a single arithmetization. The problem is the entropy that is needed to discriminate it from the alternatives.

Well you're trying to make

Well you're trying to make the encoding aware of the entire family of encodings. Otherwise you wouldn't need to encode the encoding index i. No?

I think that I don't try to

I think that I don't try to make the encoding aware of anything. What I find is that if you try to make the proof independent of the encoding, you end up having to encode the actual encoding - or the index, which ultimately accounts for the same thing.

Equality

Sometimes I think I understand Godel, then I don't think about it for a while, and I find I lose any clarity I might have had.

To me there seem to be some implicit assumptions that divide mathematics and computation. Equality in mathematics seems to be intensional: things are equal for all time, proofs are true for all time.

Computation, on the other hand, includes time; equality is extensional, and the halting problem is a classic example of something we can only know by 'executing' the code. As far as I know this was all introduced by Turing, and the Turing machine.

My understanding of the kind of mathematics Godel was talking about was the former, where consistency is a property: if something is true now, we mean it is true for all time. A Turing machine, on the other hand, does not have to do so; it has a memory, and a computation may oscillate, being true one time and false the next. As such, a Turing machine is not part of the mathematical universe described by Godel. A Turing machine is more a model of a mathematician than it is a model of mathematics.

In the world of the Turing machine, just because 2 + 2 = 4 today, does not mean it must tomorrow. A Turing machine exists in a larger universe than mathematics because it includes time. In effect you cannot prove a Turing machine consistent from the outside, because it may change the rules with time (the Y2K bug is an example here).

Edit: a little more clarity on the oscillating computation. The Russell paradox is a problem for classical set theory, because it creates a contradiction, but only because we require all proofs to be true for all time. However, if we take the universe of one-element sets, and populate it in a computer (a set that contains itself, and a set that contains the empty set), and then flag them if they are not members of themselves, there is no contradiction. The contradiction literally cannot exist because a computer cannot create an infinite set; it cannot perform a literal inductive proof. The computer can iteratively deepen the proof but it can never reach infinity. It appears to me that a computation can never have completeness, and we cannot prove the consistency of its calculations (without looking inside the box to see how it is implemented).

If you drop your keys into a river of molten lava, let 'em go

> In the world of the Turing machine, just because 2 + 2 = 4 today, does not mean it must tomorrow.

What does it mean for something to be true in the world of a Turing machine?

Nothing much

I am not sure the question makes much sense. To bring a little Wittgenstein into it, we have one language game called mathematics, where truth has a meaning defined in the context of mathematical proofs, and we have another different language game called computation where a Boolean value can be true or false.

So in the world of a Turing machine, true is one of the two states of a Boolean variable, it has no relation to mathematical truth, except that we can model mathematics in computation using certain algorithms. But I think it's a mistake to conflate the use of the truth value to represent mathematical truth with it actually being "truth".

Not convinced.

> If we want Gödel's proof to be well founded, it should obviously be independent of which arithmetization is chosen to carry it out. The procedure should be correct for any valid Phi chosen. otherwise it would **not** apply to system P, but to system P **and** some particular Phi.

> To take care of this, in Gödel's proof we must use a symbol for Phi that represents abstractly any possible valid choice of Phi, and that we can later substitute for a particular Phi when we want to actually get to numbers. This is so that we can show that substituting for any random Phi will produce the same result, and thus that the result is well founded.

I am not convinced by your line of argument. I've highlighted what I believe to be the problematic word in your reasoning. The word "must" indicates that there is no valid alternative approach, and the rest of your arguments seem to be derived from here.

One alternative approach is to pick an arbitrary phi and show that it works (as Gödel did). If there exists another phi that is a valid encoding, then there must be a bijection between the new phi and the old. If there is no bijection, then the two encodings disagree about the set of valid statements.

As long as the bijection holds, we can wrap the original phi in a function that produces the new phi. Thus we do not need to show the result for every phi - only for one phi that is a canonical rendering of the set of statements, plus an argument that no other valid phi exists that lacks a bijection to it. This is generally understood as phi being a canonical representation of the set.
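The wrapping can be illustrated with a toy sketch (the sign list and number assignments are invented, loosely after Gödel's assignment of numbers to basic signs): two arithmetizations of the same signs differ only by a bijection tau on codes, so one is always the other composed with tau.

```python
# A sketch of the wrapping argument (assignments invented): two
# arithmetizations phi1, phi2 of the same basic signs induce a bijection
# tau on codes, so phi2 = tau o phi1.

signs = ["0", "s", "(", ")", "~", "v"]
phi1 = {s: n for n, s in enumerate(signs, start=1)}
phi2 = {s: n for n, s in enumerate(reversed(signs), start=1)}

# tau translates phi1-codes into phi2-codes.
tau = {phi1[s]: phi2[s] for s in signs}

def encode(phi, formula):
    """Map a formula (list of basic signs) to its code sequence under phi."""
    return [phi[s] for s in formula]

# Translating a phi1-encoding through tau yields the phi2-encoding.
f = ["~", "(", "0", ")"]
assert [tau[c] for c in encode(phi1, f)] == encode(phi2, f)
```

A result proved for the canonical phi then transfers to any phi reachable through such a tau.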

Agreed

The original rendition of the argument has some flaws, and the one you point out is one of them. The adaptation of the argument to the expression of Gödel's theorem through Turing machines, developed in the discussion with John Shutt in the comments, follows your line of reasoning, and is in my opinion more robust than the original post, although it also has weaker - but if correct, still significant - results.

Encodings

I don't really see the connection, but then I find your description to be quite informal so it is certainly possible that I've filled in some of the dots in the wrong way.

In your take you speak about a Turing machine L and another, A. The Turing machine L takes a Turing machine f and an input i and works out the output. And A uses L to produce the contrary result, so that when applied to itself, it produces a contradiction.

Here is an example of where I got lost along the way. How can the machine L take the machine f as input? A machine executes on a tape that encodes its input. So where you say \(f\) in the above, I can understand what you would mean if you said \(f_L\), to indicate the machine \(f\) encoded into the language accepted by the machine \(L\). Or, as it is normally phrased, that L is a simulator (interpreter) of the machine \(f\) (of the language that \(f\) is implemented in).

Any input to a machine that is not a tape (in the form used in the definition of the machine) requires an encoding. Some of these encodings seem to be missing in your description.

I see mention of the final encoding (outermost wrapper) in each case, but no mention of the inner encodings. The claim that feeding the Gödel-string of \(L_1\) into \(L_2\) would have no effect seems to skip over this encoding issue. I would claim (without checking) that if you plug in the extra encodings and keep track of the bijections between them that the issue of translating Gödel-strings from one machine/encoding to another will disappear. Probably.

Encodings are there

In the paragraph you quote, I was trying to link with the terminology used for the common diagonalization method as described in John Shutt's rendering of the theorem, where there is no talk of encodings. Then, in the rest of that comment, and in the following comments in that subthread, I take into account all the necessary encodings, exactly as you suggest, and it is this taking them into account that surfaces the alleged problem I am seeing with the proof. Take this comment and the 2 that follow.

Yes

Yes - thanks for the link to the right comment, I see that you did put the description of those encodings in there. Then I went back over John Shutt's post again and in fact I'm just paraphrasing a point that he has already made, but I'm paraphrasing it quite poorly so I'll just stop there.

Meta things

This discussion seems finished. Thanks to all involved, and to the host, and sorry if someone has felt spammed. (Of course, if anyone else has something more to say, I'm all ears).

I've spent a few days sleeplessly agonizing over all that has been put forward here, especially in the conversation with John Shutt. Most of my original arguments are gone; even the one still left as Five two, about cardinality, which was the first I saw, seems not to hold with recursively enumerable Turing machines.

However, the originating idea of considering and following the alternative encodings, or arithmetizations, still seems to me to lead to a significant result about the metamathematical implications of Gödel's exercise, explicit in the comments around this one, that was buried in a lot of trash in my original post.

So now I think I will try to write a paper about it and see how it holds up in the academic community, if it is ever allowed entry at all. I will be doing it here at GitHub. Anyone interested in contributing, either in the writing or in any other way, will be most welcome.

First, be careful

My advice would be to go over your reasoning even more slowly than you have here. Also, if you can distill a 10,000-foot version that conveys the intuition of what you think goes wrong, that would also help. I had trouble figuring out which of your steps was supposed to be the important idea. When the proof has vague or imprecise steps and I don't get the big picture, it's really hard to help.

Thanks

Heartfelt thanks for the advice. I will follow it and calm down a bit; I was getting carried away.

The problem in understanding the argument is, I think, that it is necessary to understand a very weird Turing machine called "A" that is very well described in John Shutt's post; it takes a while. Once that is grasped, I think that my argument is easy. It boils down to the fact that Gödel's statement is made in some encoding, and it contains an encoding, and exactly that statement, expressed in a different encoding from the one used within it, ceases to be problematic. So Gödel's proposition is just an arbitrary arithmetic proposition in almost any possible encoding. Which in my view is akin to saying that "9 > 8" has a different truth value in one or another encoding.
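That last point can be sketched with a toy example (alphabets and number assignments invented): the same code sequence, read under two different encodings, decodes to statements with different truth values.

```python
# A toy illustration (assignments invented): a string's code under one
# arithmetization decodes to a *different* string under another, so any
# behaviour tied to the code itself is encoding-relative.

phi1 = {"9": 1, ">": 2, "8": 3}
phi2 = {"8": 1, ">": 2, "9": 3}

def encode(phi, s):
    return [phi[c] for c in s]

def decode(phi, codes):
    inv = {v: k for k, v in phi.items()}
    return "".join(inv[c] for c in codes)

codes = encode(phi1, "9>8")          # [1, 2, 3]
assert decode(phi1, codes) == "9>8"  # under its own encoding: a true statement
assert decode(phi2, codes) == "8>9"  # under the other: a false statement
```

The arithmetic facts themselves do not change, of course; only the reading of the code sequence does, which is the distinction at issue in this thread.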

Ah, thanks for explaining

Ah, thanks for explaining the connection to John's blog post. I didn't follow that link and ended up confused when I tried to follow along with your exchange a few days ago.

I believe the situation is that different encodings lead to different Godel sentences. All of them are true but unprovable in the logic (regardless of how you encode them). Any statement which asserts "I am unprovable" -- under any encoding -- is true but unprovable (a fact which holds without regard to any encoding). Of course this isn't going to be very many statements, because it will need to assert that there exists a "proof" which is a very particular mathematical structure under any encoding.

I believe the situation is

> I believe the situation is that different encodings lead to different Godel sentences.

Yes, agreed. And each Gödel sentence is a perfectly defined arithmetic recursive function. So what we have here is just a bunch of plain old recursive functions. Each and every one of them can be expressed in all possible encodings.

> All of them are true but unprovable in the logic (regardless of how you encode them)

All of them lead to problems when used for Gödel's argument. But also, none of them give any problem when expressed in a different encoding than the one they use internally.

Supposing we are given 2 different arithmetizations of system P, or of some equivalent system, and are asked for a proof that at least one of them is invalid as regards being interpreted in system P (or equivalent), a perfectly sensible strategy would be to find some plain old recursive expression in P that exhibits different behaviour in one or the other arithmetization.

Which is exactly what obtains if you follow Gödel's argument taking into account alternative encodings.

> Any statement which asserts "I am unprovable" [...]

If, in different encodings, the exact same statement asserts different things, then you can't say "... unprovable in system P", you have to say "... unprovable in this arithmetization of system P". Which doesn't make sense: depending on a particular implementation to check some general (i.e., independent of the implementation) truth doesn't make sense. So my conclusion is that we have to think of some other interpretation for the weirdness obtained by expressing Gödel's statement in the encoding that it uses internally.

Nope

> If, in different encodings, the exact same statement asserts different things, then you can't say "... unprovable in system P", you have to say "... unprovable in this arithmetization of system P".

That's not true. The statement can be interpreted as "unprovable in system P". "Unprovable in an arithmetization" looks like a category error to me.

I tried the analogy of coordinates before, but let me give another analogy. Suppose I see an endgame chess position and want to prove that white can force a win, and decide to write a computer program to check all of the possibilities. I assume you're a programmer, and so you understand how this would work. The first thing I do is pick an encoding of the chess board and implement the rules of chess in terms of that encoding, showing that whatever move black makes, white can counter it in a way that leads to a win.

Do you agree that if I do this correctly, I can prove something about the original chess position and not just about my encoding of the chess position?

"Unprovable in an

> "Unprovable in an arithmetization" looks like a category error to me.

That's what I meant with "makes no sense".

> if I do this correctly, I can prove something about the original chess position and not just about my encoding of the chess position

Yes. And if you obtain a different result if you pick a different encoding and implement the same rules in terms of that other encoding, you should think that you are not doing it correctly.

You're right

My analogy wasn't very good because the thing we want to prove about our logic depends upon the choice of encoding.

If you take a Godel sentence that was generated by an unknown encoding, it will just look like a sentence about numbers. To demonstrate that the sentence is true but unprovable, you need to know that the sentence can be understood as referring to its own provability under some encoding. But it remains true & unprovable even if you don't see how to demonstrate that it is. What is problematic about this?

... as referring to its own

> ... as referring to its own provability under some encoding

...

> What is problematic about this?

That we have agreed that "provability under some encoding" is a category error.

Wrong parse

My sentence was meant to be parsed as if it was written:

> ... you need to know that under some encoding, the sentence can be understood as referring to its own provability.

So "provability under some encoding" is a category error but I didn't use it as such.

If Gi is a Godel statement (which I would assume is a statement -- a symbolic encoding of a proposition -- not a number), constructed on the basis of encoding i, then Gi is either provable or not in *arithmetic*. It doesn't make sense to ask whether it's provable in a different encoding. That's the category error.

You can encode Gi into another encoding, but nothing "goes wrong". Doing so just isn't helpful.

sorry [edited to mark it as off-topic]

[edited: this comment is a bit philosophical, I would steer away from it.]

> Gi is either provable or not in *arithmetic*. It doesn't make sense to ask whether it's provable in a different encoding

Yes, but with some caution. I think that there is some important nuance missing in your comment. You don't have arithmetic, you have encodings, models of it. You prove formulae in those models. And to know whether those models are good, you compare the proofs in different models, among them your intuitive model of arithmetic. If all models coincide in a proof, you consider the formula provable in arithmetic.

Once you have a model you trust, you prove in that model, and say that the proof is "in arithmetic". But that is only because you implicitly trust the encoding of arithmetic in which you work. If you were offered another model that made perfect sense as a model for arithmetic, and that allowed for different proofs, different truths, you would rightly suspect that one or the other model was not really a good model for arithmetic.

So effectively you prove things in encodings, in models, but the things are provable in arithmetic, and so proofs of provable formulae can be constructed in valid encodings of arithmetic. [added: The difference I meant here was between the verb for models and the noun for arithmetic, very lame.]

The problem comes when you have formulae that seem to have proofs in some models but not in others.

I think that that means that you cannot reach arithmetics from your proofs in your models.

By Arithmetic I mean a

By Arithmetic I mean a formal system. When you speak of a model I think you mean an encoding of the formal system. I don't think checking that an encoding faithfully captures a formal system is as interesting as you imply. That's easy math. Also that use of 'model' is going to confuse logicians.

I'm going to have to break for a while but I still don't see any problem. Good luck.

Thanks

> I'm going to have to break for a while but I still don't see any problem. Good luck.

Thank you, for all :)

My problem is that I still see a problem...

> My problem is that I still see a problem...

Heh. It's really easy to get these things wrong, but doesn't have to be so, and regardless of which, it's truly valuable to have someone who will take you seriously enough to listen. Once upon a time, early in graduate school, I had a suspicion of a possible approach to solving the P=NP problem, that I wanted to think about — a few hours later I'd figured out why that possible approach didn't work — and meanwhile I was wondering, what on Earth would I do if I actually hit on a solution? Why should anyone take me seriously? So I asked my thesis advisor. His answer ran something like this (I don't guarantee I've got this exactly, as it was about 25 years ago, but the principle is there): Well, I know you and if you told me you'd solved that, I would take you seriously and study your proof. And if I was convinced you were right, I would take it to the person who supervised my dissertation. He knows me, so when I said I had a student who had this thing, and it looked right to me, he would take me seriously and study the proof. And if he became convinced, he'd take it to his dissertation supervisor. And if his dissertation supervisor was convinced, he'd announce to the world that the problem had been solved and everyone would take him seriously because he's Juris Hartmanis.

[Btw, I know for sure at least some details have gotten skewed in my memory, because I see from the Mathematics Genealogy Project that, strictly by line of dissertation supervisors, my master's thesis advisor isn't a descendant of Juris Hartmanis, although I'm sure that was the name the trail ended with in his answer to me; instead, he's academically descended from Stephen Kleene. :-P]

I've had the feeling for

I've had the feeling for some time that you were making some subtle conceptual mistake that is difficult for us to identify because it causes you to say things that we interpret differently than you do. It made me uncomfortable that you would speak of "encoding in L^i", but L^i is not an encoding and doesn't do any encoding. Encoding is something the human meta-mathematician does; encoding is taking our meta-mathematical meaning and casting it into formal mathematics. The formal mathematics has no meaning except what we give it. I was daunted by the challenge of how to make this aspect of the situation lucid for you, and of course all the effort I might put into that challenge might fail to produce apparent results and yet I wouldn't know whether the failure was because I didn't explain it well enough or because I was wrong about the misunderstanding I was trying to clear up. But in the above comment I think I see a related confusion.

> each Gödel sentence is a perfectly defined arithmetic recursive function.

But no, a Gödel sentence isn't any kind of function. It's just an assemblage of symbols. This is part of why I preferred, in my blog post, to use Turing machines: the input to a Turing machine is explicitly a string of symbols (which hopefully is easy to think of as very concrete), and the machine either halts with an output string of symbols, or doesn't. I suspect you're reading too much meaning into the input strings, possibly because you're not sufficiently separating an input string from the machine/function/whatever that, under some particular meta-mathematical encoding convention, we take it to "mean".

[wait, I think I said that wrong; isn't a Gödel sentence actually an integer? all the more lacking in inherent meaning, of course; doesn't change my point.]

That was not well put

> each Gödel sentence is a perfectly defined arithmetic recursive function.
> a Gödel sentence isn't any kind of function. It's just an assemblage of symbols
> ...
> isn't a Gödel sentence actually an integer?...

What I meant is that each Gödel sentence is an integer that, under the encoding in which it is a Gödel sentence, corresponds to a perfectly defined arithmetic relation. Take into account that the arithmetic properties of a Gödel number relate it not only to a string of symbols, but also to the syntactic properties of that string of symbols under system P, which, if system P is sound, should correspond to its meaning. [added later: what follows in this paragraph can be dismissed; see next comment. In other words, we cannot have a Gödel number for "3 > 4", even though we can have that string of symbols. [added later: Yes, you can have a Gödel number for "3 > 4". Perhaps what I should have said is that you cannot obtain "Bew( 4 < 3 )", or something, but it now seems slightly off-topic ... I'm tripping here, hold on.]]

So, if you decode a Gödel sentence under the encoding in which it is a Gödel sentence, you obtain an arithmetic recursive relation; and if you decode another Gödel sentence under its own encoding, you obtain a different arithmetic recursive relation. Those two different plain old arithmetic relations can both be encoded into one or the other encoding and, depending on which, they will have different "behaviour".

I also totally agree that Turing machines are a much better framework to discuss this than logic.

[sorry about the bracketed

[sorry about the bracketed editing of the previous comment, I will continue here.]

> we cannot have a Gödel number for "3 > 4"

What I was thinking about here is that the numbers we use to reach a Gödel sentence are not just Gödel numbers, but Gödel numbers that belong to certain sets, that are defined metamathematically, and so, for example, you have the set of Gödel numbers that correspond to provable formulae. Ultimately, those metamathematical concepts are defined as recursive functions on the integers, and so, conversely, the arithmetic properties of a Gödel sentence carry, not only the formula, but its relation with system P, since it is built with concepts (Bew, etc) that carry the axioms of system P in their definitions.
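As a toy sketch of this reduction (the formal system below is invented and trivial, nothing like system P): a "provability" predicate like Bew becomes an ordinary recursive function on integers once the axioms and inference rules are arithmetized.

```python
# A sketch of the point above: metamathematical notions like Bew reduce to
# recursive functions on integers. Toy system (invented): axiom 0, and one
# rule deriving n+2 from n, so the "provable" numbers are exactly the evens.

def is_axiom(n):
    return n == 0

def follows(premise, conclusion):
    return conclusion == premise + 2

def bew(n, bound):
    """Recursively check: is n derivable, searching premises below `bound`?"""
    if is_axiom(n):
        return True
    return any(follows(m, n) and bew(m, bound) for m in range(bound))

assert bew(6, 10)       # 0 -> 2 -> 4 -> 6
assert not bew(7, 10)   # odd numbers are unprovable in the toy system
```

The real Bew is built the same way, only with Gödel's 45 definitions doing the arithmetization of formulas, axioms, and inference; that is the sense in which a Gödel number carries its relation to system P.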

State of conversation

Honestly, I only meant to slow down my comments here, not end them. It was taking more of my time than I could afford to let it take, so I decided to step back for a few days; I may still have something to say here.

Ok!

I'll wait then, I also need a rest :)

A bit more detail

I will try here to describe the sketch of a conceptual framework where the argument put forward can be examined in more detail.

First we have an informal definition of Turing machine. Infinite tape, internal state, transitions, etc.

Then we have different formal models for that informal definition of Turing machine. One may be a septuple of sets and functions etc, as specified in the wikipedia.
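For concreteness, here is a minimal Python rendering of that septuple model (the field names, the example machine, and the little stepper are my own illustrative choices, not part of any standard):

```python
# Sketch of the septuple model M = (Q, Gamma, b, Sigma, delta, q0, F).
from typing import NamedTuple

class TuringMachine(NamedTuple):
    states: frozenset          # Q: finite set of states
    tape_alphabet: frozenset   # Gamma: tape symbols
    blank: str                 # b in Gamma: the blank symbol
    input_alphabet: frozenset  # Sigma, a subset of Gamma minus the blank
    delta: dict                # (state, read) -> (state, write, head move)
    start: str                 # q0 in Q
    accept: frozenset          # F, a subset of Q

# Example machine: flip every bit, halt at the first blank.
flip = TuringMachine(
    states=frozenset({"run", "halt"}),
    tape_alphabet=frozenset({"0", "1", "_"}),
    blank="_",
    input_alphabet=frozenset({"0", "1"}),
    delta={("run", "0"): ("run", "1", 1),
           ("run", "1"): ("run", "0", 1),
           ("run", "_"): ("halt", "_", 0)},
    start="run",
    accept=frozenset({"halt"}),
)

def run(tm, tape, max_steps=1000):
    """Step the machine until it accepts (or give up after max_steps)."""
    tape = list(tape) + [tm.blank]
    state, head = tm.start, 0
    for _ in range(max_steps):
        if state in tm.accept:
            return "".join(tape).rstrip(tm.blank)
        state, sym, move = tm.delta[(state, tape[head])]
        tape[head] = sym
        head += move
        if head == len(tape):
            tape.append(tm.blank)
    return None  # did not halt within the step budget

print(run(flip, "0110"))
```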

We also have informal specifications for particular Turing machines. To each informally specified Turing machine, we provide a name.

Each of those formal models allows us to produce a formal specification for any informally specified Turing machine. Otherwise, either our informal definition of Turing machine, or our informal specification of the particular Turing machine, or the formal model, is invalid.

So we might define "informal specification of a Turing machine" as that which a human needs to test whether a Turing machine is valid - according to specifications.

Both formal and informal specifications of Turing machines are made by humans, and we may assume that the alphabet used to make formal specifications can be printed to a tape and used as input for Turing machines.

Now we informally specify a particular Turing machine, L for logic. This machine takes as input a Turing machine f and an input i, and determines what the output of f would be if run with input i - or hangs.

To take a Turing machine f as input, f (and i) must be encoded in some particular format, specific to the machine L. So we may think of different actual formal specifications for L, each requiring a possibly different encoding of the machines it uses as input. Let's call the different formal specifications of L the L^i.

So if we want to feed machine f to L^i, then from our formal, human encoding of f, we must obtain an encoding appropriate for L^i. For this we define a new Turing machine, which we call C for compiler. To each L^i (and each formal model for Turing machines, but for simplicity we'll assume here one single formal model) there corresponds a C^i compiler, which can take a formal specification of a Turing machine, and produce an encoding of it appropriate to be used as input for L^i.

It's clear that the formal specification of some machine f, and the encoding produced from it by some C^i, are equivalent, and, for example, some L^i machine might require the encoding directly produced by humans here. So our compilers might better be called transpilers. But the fact is that there may be alternative formal encodings, be they made by humans or by compiler machines; and that to take them into account, in this discussion, it's clearer to account for them with compiler Turing machines than with human incantations. So, without loss of generality, we assume a single encoding produced by humans, and all alternative encodings produced by machines.

One final Turing machine we need to define is a decompiler, D^i, that can take the output of machine L^i and encode it to the language of human formal specifications for Turing machines, so humans can check the output.

Now, with this terminology, I think that we can agree that a requirement for all this to be valid, a sanity check, could be as follows:

  • Let's say we have a formal specification for a Turing machine f, and for an input x for f. We encode both with any C^i, and feed them to the corresponding L^i. If there is output, we take it and encode it with the corresponding D^i, to obtain a human-understandable o. For example, if f is "input is bigger than 10", and x is "9", o should be "false".
  • Repeat the procedure, starting with exactly the same formal definitions for f and x, but now using different machines C^j, L^j, and D^j, to obtain o'.
  • The requirement is obviously that o and o' should be identical.

An ad hoc formula for this might be (assuming that m(n) stands for the output of machine m when given n as input - or for hanging) :

For all i, j: Di( Li ( Ci (f), Ci (x) ) ) === Dj( Lj ( Cj (f), Cj (x) ) )
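A runnable toy version of this requirement (illustrative assumptions throughout: the "encodings" are mere tags that the C machines attach and the L machines check, the D machines are the identity, and inputs need no re-encoding):

```python
# Sanity check: the same f and x, pushed through two different
# compiler/logic/decompiler triples, must yield identical output.
def C1(f): return ("enc1", f)   # C^1 produces the encoding L^1 expects
def C2(f): return ("enc2", f)   # C^2 likewise for L^2

def L1(ef, x):
    tag, f = ef
    assert tag == "enc1"        # L^1 only understands encoding 1
    return f(x)

def L2(ef, x):
    tag, f = ef
    assert tag == "enc2"
    return f(x)

D1 = D2 = lambda o: o           # decompilers as the identity

f = lambda n: n > 10            # "input is bigger than 10"
print(all(D1(L1(C1(f), x)) == D2(L2(C2(f), x)) for x in [9, 10, 11]))
```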

But if you do the above with the A machine, the requirement is broken. Substitute A(L^1) for f, and there will be some input for which the equality doesn't hold.

Compiler

If I'm following you (important qualification, that), your compiler C^i is what I'm saying isn't real. You're talking (I think) about mapping a formal Turing machine — such as a seven-tuple of sets and functions — to an encoded string. But that mapping literally cannot be performed by a Turing machine, because the input has to already be encoded if it's to be processed by a Turing machine. You might write a Turing machine C^{i,j} that converts an i-encoding of a Turing machine to a j-encoding, but the input to C^{i,j} has to already be encoded. Indeed, the encoding is all in our heads; these actual Turing machines just do symbol manipulations, and only we as thinking observers perceive a correspondence of these symbol manipulations to any sort of "reasoning" about the behavior of described Turing machines.

I'm reminded of the term "meta-circular evaluator". John C. Reynolds coined this name for a technique of describing the semantics of a programming language. The idea (as a reminder) is that you specify the behavior of a programming language by writing an interpreter for it in another programming language. This is a technique used to good effect in the Wizard Book. He called it "meta-circular" because you aren't (in general) using a programming language to define itself, which would be a circular definition; but, because you are using one programming language to define another, you can't use this technique to define programming languages as a class: you have to already know at least one programming language before you can use the technique. (I'm not sure that's quite what I would ordinarily call "meta", but admittedly the name "meta-circular evaluator" sounds very cool and has turned out to be a very good meme.)

Btw, Alan Turing in his dissertation explored the idea of a hierarchy of formal logics, where you take a logic L that "ought to" be able to prove some proposition P(L) about itself but can't, and you form a new logic L' that's just like L except that P(L) has been added as an axiom. For example P(L) might be the proposition "L is consistent", which L can't itself prove (unless we've mistakenly started with an L that isn't consistent). L' can prove that L is consistent, because it has an axiom saying so; but L' can't prove P(L'). So we have L'' that adds axiom P(L'). L'' can prove P(L'), but cannot prove P(L''). And so on. Those primes get awkward (L''', L''''), so we might as well use a nonnegative integer index (I'll use subscripts, to avoid clashing with your use of superscripts): L_0, L_1, etc. We can also have a logic L_omega, where omega is the number of integers: that is, L_omega is L augmented by axioms P(L_n) for all finite nonnegative integers n. And we can do this for even larger ordinals, noting that L_omega can prove P(L_n) for every finite n but cannot prove P(L_omega). So there is some interesting stuff involving the way different logics relate to each other. (pdf)

We agree on this

You're talking (I think) about mapping a formal Turing machine — such as a seven-tuple of sets and functions — to an encoded string. But that mapping literally cannot be performed by a Turing machine, because the input has to already be encoded

That's clear, that's the reason I said:

we may assume that the alphabet used to make formal specifications can be printed to a tape and used as input for Turing machines.

What I meant is basically what you mean, I think. The model of a septuple of objects is treated in my sketch as the formal framework under which we can produce formal specifications for specific Turing machines. A septuple of sets and functions is, effectively, a set of axioms and theorems that provide the terminology to describe Turing machines, and this terminology can very easily be formatted on a tape fit to be used as input for a Turing machine.

You might write a Turing machine C^{i,j} that converts an i-encoding of a Turing machine to a j-encoding, but the input to C^{i,j} has to already be encoded

Again, I think it's not difficult to see that we are saying the same thing:

So our compilers might better be called transpilers

I am not assuming any special property of encodings of Turing machines made by humans as compared to those made by compilers, other than the fact that they can be understood by humans.

I use compilers only to cover the fact that different L^i may require different encodings. We might have assumed that we already have our machines and inputs appropriately encoded for our L machines, but the whole point of the exercise is to make those encodings explicit. So that's the reason to have one canonical, humanly produced, humanly understandable encoding, and compiler (transpiler) machines that go from this canonical encoding to specific encodings for specific L machines: to be able to explicitly give different labels to different encodings of the same machine.

The sketched framework would work equally well (or equally badly) if I had chosen some other non-humanly understandable encoding as canonical encoding, or even no canonical encoding at all; it would just be more difficult to reason about it. The canonical encoding just provides us with a ground where we can compare the outputs of different machines in different encodings.

BTW

Thanks for the link, really interesting, looking forward to finding a while to read it in more depth

So

So if your objections to my proposed framework are already covered, perhaps we might agree on it; that should help focus the discussion on the real point of contention we have here, which is, I think, the level at which the two encodings that are involved in the A machine happen: Are there two encodings, or one? If there are two, are they independent? Does it make sense to consider two that are different? What is the dependency between them, if there are two? If we denote by Ai the A machine connected to the oracle/table made with Li, does this make sense?:

 D1( L1 ( C1 (A2) ) )

Notation

I think I need to say what I mean with

For all i, j: Di( Li ( Ci (f), Ci (x) ) ) === Dj( Lj ( Cj (f), Cj (x) ) )

We take the formal human encoding to be formally equivalent to any other encoding that an L machine might need as input; we only distinguish it as canonical as a convenience.

All letters except superscripts correspond to formal specifications of Turing machines and inputs for them, expressed in the canonical encoding. (Obviously, if a Turing machine can be encoded in some encoding, that encoding must be capable of encoding its inputs, since a specification of a Turing machine must speak about its inputs.)

And, as said, we understand "m(n)" to mean the output of turing machine m when given input n.

Thus,

f

is the canonical encoding of a Turing machine,

Ci

is the canonical specification of a compiler Turing machine that takes as input anything encoded in the canonical encoding and produces as output that same thing encoded in the encoding that Li needs (let's call it the i encoding),

Ci (f)

is the Turing machine f encoded in the i encoding,

Li ( Ci (f), Ci (x) )

is the output of Turing machine f when given x as input, expressed in the i encoding, and finally

Di

takes the previous as input and produces the output of f on x, expressed in the canonical encoding.

C^i(x)

Wanting to be very sure of each detail, here. C^i is a "compiler" that translates a canonical representation of a machine into its representation in encoding i. I have some qualm over this because it's not clear to me that this translation-between-encodings has to be Turing-computable. However, it's easy enough to move past this, at least tentatively, by stipulating that the encodings are limited to ones whose translation-from-canonical is Turing-computable. I'm stuck, though, when you apply C^i to x. That's an arbitrary input string. It might or might not be a canonical representation of a Turing machine. So it doesn't make sense to me to make it input to C^i.

You are right

So it doesn't make sense to me to make it input to C^i

So this also applies to the whole action of D^i, which takes the output of L^i as input.

And I agree that it is unneeded. I was thinking of the possibility of different alphabets for different machines, that might be encoded in other alphabets, but that would mean having different models for specifying Turing machines, and I have already stipulated agreement on a single one.

As long as there is no talk of types, or while there is only one type that corresponds to the alphabet to be printed on the tape, there is no need to compile inputs or decompile outputs.

So sorry, please simplify and assume C^i(x) and D^i to be noops, the identity function.

it's not clear to me that

it's not clear to me that this translation-between-encodings has to be Turing-computable

I would think it is. An encoded Turing machine is just a big number, or sequence of symbols, that can be decomposed with the aid of the encoding into pieces that correspond to concepts such as internal state, transitions, etc. This decomposition must be Turing-computable, since it is the decomposition that an L machine must somehow do to be able to work out the output of the encoded machine. And all encodings must coincide in their underlying high level concepts, otherwise they wouldn't be equivalent - they are equivalent as regards precisely those high level concepts.
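A toy sketch of that claim (the two formats are illustrative): translating between encodings is just decomposing one format into its transition tuples and re-emitting them in the other, and both steps are mechanical, hence Turing-computable.

```python
# Transpile encoding 1 ("state,read,state,write,move;...") into
# encoding 2 (pipe-separated fields, newline-separated rows).
def transpile_1_to_2(s):
    rows = [row.split(",") for row in s.split(";")]   # decompose into 5-tuples
    return "\n".join("|".join(row) for row in rows)   # re-emit in format 2

enc1 = "run,0,run,1,R;run,_,halt,_,N"
print(transpile_1_to_2(enc1))
```

The decomposition step is exactly the parsing that an L machine must perform anyway to work out the behaviour of the encoded machine.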

-

Wrong place (really is an ergonomic problem with this interface).

To recap

I will try to round up and point towards a conclusion.

It is commonly agreed that

The Gödel sentence asserts of itself: “I am not provable in S,” where “S” is the relevant formal system

(taken from here)

My point is that it may be that what the Gödel sentence asserts of itself is: "I have some arithmetization of system S within myself (or I have some kind of reference to an arithmetization of S, or any other way to link the Gödel sentence with a particular arithmetization - it is clear that there is a link), and, in that particular arithmetization, it is impossible to construct a proof of myself.".

Would this be significant? In my view, yes. If you have a proof in some arithmetization of system S, and want to lift it to the category of "provable in S", you had better make sure that your proof can be carried out in any other arithmetization. Otherwise something is wrong.

So, what might be the test to decide which of both interpretations of Gödel's procedure is correct?

In my view, and sorry for changing from Logic parlance to Turing machines, a valid test would be to decide if this (interpreted as given in the "Notation" comment) is a correct expression:

 D1( L1 ( C1 (A2) ) )

If it is, then it must have a different value than:

 D2( L2 ( C2 (A2) ) )

So I think that "the A2 machine behaves differently in different L", clearly implies that "Gödel's proof cannot be generalized to system S".

I think that to understand my position, all that is needed is this comment, the one about notation, and the one describing the framework used in the notation; and John Shutt's post.

The inequality is: For all

The inequality is:

For all i,j: There exists x: Li ( Ci ( Ai ), x ) =/= Lj ( Cj ( Ai ), x )

i.e., the x that corresponds to the diagonal column of Ai in the table built with Li.

[sorry, added: with i =/= j]

I think I understand the

I think I understand the formal proposition, but I don't see why it would be true. Let's see.

Ai is a Turing machine that takes an input string and does stuff internally that uses encoding i of Turing machines.

Cj takes as input a canonical encoding of a Turing machine, and produces as output the same machine encoded via encoding j. So Cj(Ai) would be a j-encoding of Ai.

Lj takes two arguments, the first of which is a j-encoding of a Turing machine, and the second of which is an arbitrary input string that is supposed to be sent to the Turing machine whose j-encoding was the first argument. You say, I think, that Lj decides whether or not its first argument given its second argument would halt? Okay. The first argument does need to be j-encoded, though, so given a canonical encoding of a Turing machine f, we can j-encode it by sending its canonical encoding through Cj. Thus, to determine whether or not machine f would halt on input x, we would run Lj(Cj(f),x).

But, we're assuming machines Cj and Lj are correctly implemented. So for every machine f, every input string x, and every encoding i and j, we have

Li(Ci(f),x) = Lj(Cj(f),x)

And this should be true no matter what f is. The encoding needed by L must be provided by C, but the machine f is the same machine no matter which encoding is used to represent it. In particular, it doesn't matter whether or not f uses some particular encoding internally (or even several different ones); the encoding used by L and C is a matter of how f is represented, not what it does. For example, f could be Ai, and then there's a particular choice of x that we're interested in for the logic; but that's all a matter of what Ai does internally, having no effect on the way our Lj and Cj "reason" about its behavior.

So I don't see where your inequality would come from; my understanding of what the various machines are suggests that those two things would always be equal.
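The invariance being described can be exercised in a toy setting (parsers, formats, and stepper are all illustrative): two parsers read two different encodings of the same machine into one and the same transition table, and a single stepper then yields the same output either way.

```python
# Two parsers for two encodings of the same machine feed one shared stepper;
# the output depends only on the machine and the input, not on the encoding.
def parse_1(s):  # encoding 1: "state,read,state,write,move;..."
    return {tuple(r.split(",")[:2]): tuple(r.split(",")[2:])
            for r in s.split(";")}

def parse_2(s):  # encoding 2: pipe/newline variant of the same data
    return {tuple(r.split("|")[:2]): tuple(r.split("|")[2:])
            for r in s.split("\n")}

def step_run(delta, tape):
    """Run the parsed machine from state 'run' until it reaches 'halt'."""
    tape, state, head = list(tape) + ["_"], "run", 0
    while state != "halt":
        nstate, write, move = delta[(state, tape[head])]
        tape[head] = write
        state, head = nstate, head + {"R": 1, "N": 0}[move]
        if head == len(tape):
            tape.append("_")
    return "".join(tape).rstrip("_")

enc1 = "run,0,run,1,R;run,1,run,0,R;run,_,halt,_,N"   # bit-flipper, encoding 1
enc2 = enc1.replace(",", "|").replace(";", "\n")       # same machine, encoding 2
print(step_run(parse_1(enc1), "0110"), step_run(parse_2(enc2), "0110"))
```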

Let Li be an L Turing

Let Li be an L Turing machine, and Ti the corresponding table. In Ti you have rows for both Ci(Ai) and for some arbitrary Ci(Ak).

Let x be the input that corresponds to the diagonal cell in the row for Ci(Ai) in the Ti table, and z the input that corresponds to the diagonal cell in the row for Ck(Ak) in the Tk table.

Now, in the ( Ci(Ai), x ) cell in the Ti table, the value must be the output of Li( Ci(Ai), x ), but that is undecidable.

Exactly the same happens in the cell ( Ck(Ak), z ) of the Tk table.

Now, what happens in the ( Ci(Ak), z ) cell (not necessarily diagonal, or most probably not diagonal) of the Ti table? Its value must be Li( Ci(Ak), z ). Li has access to the Tk oracle, through Ci(Ak); it is needed to work out the output of running Ak with input z. There has been trouble building the Tk table in the diagonal cell ( Ck(Ak), z ), undecidability; but, whatever the problem, "undecidable" is not a valid entry in the cell; either it is blank, or it has a value. So Li will not have any trouble filling that ( Ci(Ak), z ) cell. Remember that to run Li( Ci(Ak), z ) we do not have to build table Tk; it is given.

Now, of course the same is true in the first case, in the case where we evaluate Li( Ci(Ai), x ) for the Ti table. Li will have access to the Ti oracle, and whatever the trouble there was when building it in the ( Ci(Ai), x ) cell, the value cannot be "undecidable". However, in this case, it is Li that cannot decide on Ai, whatever the value in the cell.

So it seems clear to me that there is a difference executing (Ai, x) in Li or in Lk. It doesn't seem to me now that it is captured in the inequality above, though.

detail

I believe there is a subtle error in your post, in the sixth paragraph under the "Gödel" title. There, first you fill a table

Enumerate the machines f and make them the rows of a table. Enumerate the input/output pairs and make them the columns. In the entry for row n and column m, put a "yes" if L confirms that the nth machine has the mth input/output pair, a "no" if L confirms that it doesn't.

...

let's restrict our yes/no table by looking only at columns [...]

Then, you define machine A, and say:

For given input, go through the column labels till you find the one that matches it (we were careful there'd be only one); call that column number m. Use L to confirm a "no" in table entry m,m

But you don't need to use L to confirm, you have the table already filled; it is an oracle. And not only don't you need to: you can't use L. Because if you use L to confirm a "no" in table entry m,m, and m happens to be A, you enter a loop.

And you don't even need it to find undecidability:

Now, in the ( Ci(Ai), x ) cell in the Ti table, the value must be the output of
Li( Ci(Ai), x ), but that is undecidable.

The value in that cell has to be the contrary of what is found in the cell of the table of the encoded machine, but it actually is the same table and cell, so it must be the same.

However,

Li will not have any trouble filling that ( Ci(Ak), z ) cell.

Li takes the encoded Ak, the table Tk with it, and the input z, goes to the z column in the Tk table, goes to the diagonal cell, takes the value there, and takes the contrary; then, Li will have no problem using that contrary to fill in the totally different cell ( Ci(Ak), z ) in the totally different table Ti.

It doesn't seem to me now that it is captured in the inequality above, though.

The LHS is undecidable, and the RHS isn't. However, since undecidable is not an output, and the RHS's output depends on an oracle that depends on an undecidable computation for its response, the actual equality may hold.

Um, no

The table is not an oracle. The table exists only in the mind of the meta-mathematician; it's our way of conceptualizing the range of different computations.

Unloop

Here's imho an interesting point.

if you use L to confirm a "no" in table entry m,m, and m happens to be A, you enter a loop.

That's an unnecessary assumption. We don't know how L works. L is making deductions about the behavior of machines, but there's no reason to suppose L must run the machine in order to deduce its behavior. In fact, running the machine would be quite a weak way to make deductions about it, and in particular one would never deduce that way that a machine was not going to halt.

[Okay, "quite a weak way to make deductions" is a little harsh; one can after all make any correct deduction that way about something that the machine will do. Nevertheless, you can't make deductions about non-halting that way, so it makes sense that L probably wouldn't be limited to only things that can be deduced by simulation.]

Nevertheless, you can't make

Nevertheless, you can't make deductions about non-halting that way, so it makes sense that L probably wouldn't be limited to only things that can be deduced by simulation.

Haven't been following too closely, but would abstract interpretation be considered simulation? Because you can infer termination properties using AI.

The abstraction is normally

The abstraction is normally designed so that the abstract machine will terminate (with a loss of precision in the output being calculated). So it is a form of simulation, but of a less powerful class of machine. It would not execute the simulation of the more powerful machine.

e.g. \(L^i\) would not execute the TM encoded in language \(i\) - but it would instead abstract it to an encoding on a less powerful machine. So \(L^i\) would not simulate a Turing Machine as such, but rather a less powerful type of machine that approximates the execution of the machine that \(L^i\) should infer something about.

Okay

I can concede that it is not needed that the table is an oracle. The looping argument is invalid.

However, it is also not needed that it isn't. We can assume it is an oracle, even though we don't need to assume that. But doing so helps isolate the problem I am trying to pinpoint.

It seems clear to me that undecidability happens in different steps for Li( Ci(Ai), x ) than for Lk( Ck(Ai), x). In the first case, undecidability is immediate; in the second case, undecidability comes from the need to compute Li( Ci(Ai), x).

Confining the table to an oracle helps to isolate the different steps at which undecidability happens. It surfaces the difference between Li( Ci(Ai), x ) and Lk( Ck(Ai), x).

So, is there any reason the table cannot be in an oracle? In your post, you say: "...and we've conspicuously placed no constraints at all on O, not even computability".

It seems clear to me that

It seems clear to me that undecidability happens in different steps for Li( Ci(Ai), x ) than for Lk( Ck(Ai), x). In the first case, undecidability is immediate; in the second case, undecidability comes from the need to compute Li( Ci(Ai), x).

No. The choice of encoding k has no effect on the behavior of the encoded machine; it doesn't matter that, in this particular case, the encoded machine happens to be one that, internally, uses encoding i.

Lj(Cj(f),x) uses a certain logic, L, to deduce the behavior of a machine f on an input x. The deduction depends on three things: the algorithm of L, the algorithm of f, and the input string x. It does not depend on the choice of encoding, j. In the particular case you've been looking at, f=Ai, the superscript i denotes what encoding this A uses internally for its computation. That is, Ai and Aj are algorithmically different. But Ci(Ai) and Ck(Ai) are just two different encodings of the same machine, which happens to be Ai. The k-encoding of Ai is an encoding of a machine that internally uses encoding i. And as long as logic engine Lk is given an encoding of Ai in the encoding it knows how to read, k, it uses the rules of L to make deductions about the behavior of the encoded machine, Ai.

There is no leakage of the encoding choice j into the behavior of the encoded machine Ai.

Of course you are right

Of course you are right. And in my "detail" comment there was something missing: L machines don't build tables, some other builder agent does, using L machines.

I was trying to focus on the difference between the ( Ci(Ak), z ) cell in the Ti table (not a diagonal cell) and the ( Ck(Ak), z ) cell in the Tk table (a diagonal cell). Of course the content should be the same, as you have clearly shown.

When you focus on the n,n cell in your post, that corresponds to ( Ck(Ak), z ) here, and you find undecidability, it is the builder agent for the table, using Lk, Ck(Ak), and z, that finds it, right? Of course, if the table is in the head of the meta-mathematician, the builder agent will be the actual A machine; if the table can be in an oracle, the agent would be something else.

But in the non-diagonal case, the builder agent (supposing some arbitrary machine that needs to access that cell - not an A machine, obviously), with access to an oracle, might take a shortcut and not work out the output of Ak(z), and be decidable. Simply follow the steps you prescribe in the rest of the paragraph after focusing on the n,n cell, with a non-diagonal ( Ci(Ak), z ), and it doesn't work: just ask the oracle for the diagonal z cell, without bothering with the corresponding machine, and do what A has to do with it. We are filling a totally different cell in a different table, so no problem.

In the diagonal case, the builder agent (a Turing machine?) would be unable to avoid undecidability. Is this significant? If correct, does this make the tables (and so the A machines) dependent on the encoding?

diagonal

I suspect there's a bottomless pit of technicalities one could get tangled up in with all these conversions between encodings; but, we seem to be getting past that to the essential concepts involved, which is where we want to be.

The diagonal in the table is defined by the way we enumerate the input strings (the column indices) and the way we enumerate the machine encodings (the row indices). Those choices are built in to machine A; machine A defines for itself what is the diagonal. We need the diagonal to cross-cut every row of the table, which is to say that we need A to cross-cut every row of the table, and behave in a way that demonstrates a problem with our supposition when it cross-cuts its own row. We don't have to know just when it cross-cuts its own row, which is to say, we don't need to know which row it occupies, exactly because it cross-cuts every row; that's the essence of the diagonalization method. And A doesn't need to know which row it occupies either. My head hurts just thinking about whether or not it's possible to guarantee that A knows its own row number; if it is possible to show that this cannot be guaranteed, there's probably a proof of it using... diagonalization. I recall hearing a math talk once in which the speaker made a pretty good case that mathematicians who research infinities end up in therapy.

I suspect — suspect — part of your intuition of a problem here is that you're asking, what if we consider multiple encodings, so that what is on the diagonal for the permutation in one encoding is not on the diagonal for another encoding? But the diagonal is actually defined by its hardcoding into the algorithm of A — even more specifically, it's defined by its hardcoding into the algorithm of Ai. So each Ai has its own diagonal that it implements.

[edit: tried to straighten out somewhat muddled phrasing re 'what is on the diagonal' for one encoding versus another]

Almost seeking therapy

I really think that I see and understand what you mean. The only thing I'm not sure about is your suspicion of my intuition of the problem. I think it is not correct, but I'm not 100% sure, and I'd hate to repeat understood things when you already have shown me wrong. But since I'm not sure, I'll risk it and try to detail how I see the problem in light of what I think your suspicion is.

And A doesn't need to know which row it occupies either.

Agreed

the diagonal is actually defined by its hardcoding into the algorithm of A — even more specifically, it's defined by its hardcoding into the algorithm of Ai. So each Ai has its own diagonal that it implements.

Also agreed.

Let's suppose we have two different encodings, and an enumeration of Turing machines and inputs for each encoding, so that together they determine two different tables *, j and k. We can use them to build two A machines, Aj and Ak. Let's also suppose that the machine Aj, in its diagonal cell in the j table, corresponds to the x input, and that the Ak machine in the k table corresponds to input z in the diagonal.

Let's now limit our attention to the j table. What to put in the Aj,x cell? This cell corresponds to the n,n cell in your post, and the procedure will result in the undecidability you describe.

Now, in the same j table, there is also a row for Ak, and also a column for the z input. That row and column will not correspond -most probably- to a diagonal cell. But we can check anyway what should be in the cell - and it should be exactly the same as in the diagonal cell in the k table. This we have already agreed, I think.

But now, with the non diagonal cell Ak,z in the j table, your undecidable procedure will not work - immediately. It will have to descend to the k table to meet undecidability. Your explanation leading to undecidability would not be identical to the one for n,n - it would be a little bit longer.

Is it a difference, or do I definitely seek therapy? ;) **

What an agent -machine or otherwise- must (or may) do to fill in the Aj,x cell in the j-table and find undecidability is different than what it must do for the Ak,z cell in the same table. Your undecidable process in both cells differs.

If the difference is real, I am despairing of being able to isolate it. Oracles don't seem such a good idea, since using an oracle to build itself seems too circular. Perhaps debugging, tracing the steps?

* [edited to add: I am not supposing that the tables are filled, finished tables or anything. Just that an L machine, and an enumeration of Turing machines in the encoding required by the L machine, determines a table. Any machine that has those hardcoded has access to the table - can work out the content of any cell.]

** [edited to add: Honest question. (Not the one about therapy; that's what the ;) is for.) From the clarity and cockiness of the first days of this discussion, I've come, for the last couple of days, to continuously switching between: There is a difference. But it is insignificant. No, it is significant. Woah. No, but there is no difference. Oops, there is a difference, isn't there?]

Everyone's mad except me and thee...

I'm going to make a suggestion. I'm not 100% sure of it; I'd need to go back to my blog post and reread that part with this suggestion in mind, to see whether it fits. But I'm going to put the suggestion out here now, rather than wait to check it first and risk losing track of it before getting to the check.

I suggest that the undecidability reasoning using Ak is inherently reasoning about the place of Ak in its own table, the table that Ak defines. That is, when you say

> But now, with the non-diagonal cell Ak,z in the j table, your undecidability procedure will not work immediately. It will have to descend to the k table to meet undecidability.

I'm suggesting that this descent isn't anything new, but is how the procedure always worked; it just wasn't obvious because we weren't considering multiple encodings.

Living in an encoded world

> I suggest that the undecidability reasoning using Ak is inherently reasoning about the place of Ak in its own table

Agreed that the argument is inherently about A's own table. But that is not incompatible with there being other cells in the table that correspond to other A machines: cells that must be undecidable, but where undecidability will only be met once we get to reasoning about them in their own tables.

And what is in those cells is, basically, Lj(Aj, x) for Aj in its own table, and Lk(Aj, x) for the same machine in other tables.

> But now, with the non-diagonal cell Ak,z in the j table, your undecidability procedure will not work immediately. It will have to descend to the k table to meet undecidability.
>
> I'm suggesting that this descent isn't anything new, but is how the procedure always worked;

Ok, so you agree that there may be a difference there. [edited to add: you haven't really agreed, have you ;)]

The problem is that, if there is such a difference, it may leak the encoding to the encoded.

Suppose we are encoded and live in an encoded world. Let's call this encoding w. Can we work out w from within? Of course not; we shouldn't be able to do that. The encoding shouldn't leak to the encoded objects. But now imagine the following procedure.

First, note that in this w world, running a Turing machine f is effectively just computing Lw(f).

So we can concoct some enumeration of encodings, or of L machines, and use it to build an A machine for each encoding, and run them. Since we are within w, what we are actually doing is making the world run Lw(Ak, z) for all possible k*. Now, should we be able to notice any difference between running Lw(Aw, z) and running any other Lw(Ak, x)? Because these are the two things we have agreed may be different.
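The "run them all" step needs dovetailing, since the index k (and, per the footnote, the input z) is unbounded. A sketch, with `run` as a stand-in name for the world computing Lw(Ak, z):

```python
# Dovetail over all (k, z) pairs so every encoding index k and every
# input z is eventually reached, even though both are unbounded.
from itertools import count

def pairs():
    # Walk the anti-diagonals: (0,0), (0,1), (1,0), (0,2), (1,1), (2,0), ...
    for s in count():
        for k in range(s + 1):
            yield k, s - k

def run(k, z):
    # Placeholder for the world computing Lw(Ak, z).
    return f"Lw(A{k}, {z})"

first_six = [run(k, z) for (k, z), _ in zip(pairs(), range(6))]
print(first_six)
```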

If the above difference is correct and noticeable, we might be able to, among all A machines, identify Aw as the one that contains the w encoding, the encoding of our world.

Note that I'm not saying that it must be noticeable and thus we must be able to find w. All I say is that Lw(Aw, z) contains all the information needed to "fail" earlier than Lw(Ak, x). So whether it is noticeable, and whether we do notice, may depend on the algorithm Lw follows, which is outside of our control. However, it may also depend on A, which is under our control, and we might be able to tweak A to expose the difference.

* [edited to add: and z, see next comment]

For completeness

> making the world run Lw(Ak, z) for all possible k

For all possible k and z. In this scheme, we should obtain undecidability (inconsistencies in the outputs) with one input z for each A: the input that corresponds to the diagonal cell in its own table.

And now, once we have the appropriate undecidable pairs (Ak, z), we can ask whether we can trace the A machines and distinguish Aw.

Now with encodings

The question, I think, has become whether there is a difference between running Lw(Cw(Aw), x) and running Lw(Cw(Ak), z). But that's not exact, because within the world w we do not know what is happening outside, how those objects are being used. What we can be certain of is that those objects must somehow exist; that, whatever happens outside, it will have all the information that is assumed within Lw, Cw, and Aw or Ak at its disposal.

So now the question becomes that of finding a path to undecidability with Lw, Cw, Aw, and x that is unavailable to Lw, Cw, Ak, and z, and that is shorter than any path available to the latter.

So let's examine the following path.

We have Lw, Cw, Cw(Aw), and x. We analyze Cw(Aw) and see that it contains Lw and an enumeration of Cw-encoded machines and inputs. We learn that we must use those enumerations to find the next necessary operation, which is specified in the cell where the diagonal row meets the x column. We work out what is in that cell, and find Cw(Aw). We already have Cw(Aw), so we can recognize it. And now we learn that we have to output the contrary; but we already know that the contrary must be the same, since we have already identified both machines as the same. Here we reach an undecidable state.

We have Lw, Cw, Cw(Ak), and z. We analyze Cw(Ak) and see that it contains Lk and an enumeration of Ck-encoded machines and inputs. We learn that we must use those enumerations to find the next necessary operation, which is specified in the cell where the diagonal row meets the z column. We work out what is in that cell, and find Ck(Ak). We already have Cw(Ak), **but** we cannot recognize Ck(Ak) as the same machine; as of now, we do not understand the k encoding. Now we have to use Lk to analyze Ak and understand what to do. We are already past the state where we found undecidability in the previous case.
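The difference between the two paths can be caricatured with string encodings. Everything here is a toy assumption: Cw as the identity, Ck as string reversal, and "recognizing" an encoded machine as plain string equality; the point is only that the w-path matches on the first comparison while the k-path needs an extra decode step first:

```python
# Caricature of the two paths: encodings are string transforms, and
# "recognizing" an encoded machine is plain string equality.
Cw = lambda s: s          # our world's encoding (toy: identity)
Ck = lambda s: s[::-1]    # some other encoding (toy: reversal)

A_source = "A: output the contrary of the diagonal cell"

def steps_to_recognize(held, found, decode=None):
    steps = 1                           # step 1: compare as-is
    if held == found:
        return steps
    steps += 1                          # step 2: decode first, then compare
    if decode is not None and held == decode(found):
        return steps
    return steps

# w-path: we hold Cw(A); the diagonal cell of the w table contains Cw(A).
w_steps = steps_to_recognize(Cw(A_source), Cw(A_source))
# k-path: we hold Cw(A); the diagonal cell of the k table contains Ck(A).
k_steps = steps_to_recognize(Cw(A_source), Ck(A_source), decode=lambda s: s[::-1])

print(w_steps, k_steps)  # the k-path is strictly longer
```

Whether a real Lw would expose this step-count difference to the encoded machines is exactly the open question above.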

So at the moment it seems to me that there is such a path; the possibility of leaking seems real.

[edited a bit to remove redundancy]